lblaoke/opt-350m-hh-rlhf-rm-trl-v5 • 0.3B • 1
lblaoke/opt-350m-hh-rlhf-dpo-trl-v5 • 0.3B • 1
lblaoke/opt-350m-hh-rlhf-chosen-sft-trl-v5 • 0.3B • 1
lblaoke/opt-125m-hh-rlhf-rm-trl-v5 • 0.1B
lblaoke/opt-125m-hh-rlhf-dpo-trl-v5 • 0.1B
lblaoke/opt-125m-hh-rlhf-chosen-sft-trl-v5 • 0.1B • 1
lblaoke/qwama-0.5b-hh-rlhf-sft-chosen-trl-v4 • 0.5B • 1
lblaoke/qwama-0.5b-skywork-pref-sft-chosen-dpo-trl-v3 • 0.5B • 1
lblaoke/qwama-0.5b-skywork-pref-sft-rejected-chosen-trl-v3 • 0.5B • 1
lblaoke/qwama-0.5b-skywork-pref-sft-chosen-trl-v3 • 0.5B • 1
lblaoke/qwama-0.5b-skywork-pref-sft-rejected-trl-v3 • 0.5B • 2
lblaoke/qwama-0.5b-skywork-pref-dpo-trl-v2 • 0.5B • 1
lblaoke/qwama-0.5b-skywork-pref-dpo-llama-factory-v1 • 0.5B • 1
lblaoke/qwama-0.5b-skywork-pref-dpo-trl-v1 • 0.5B
lblaoke/mistral-v0.3-7b-ppo-self-human
lblaoke/mistral-v0.3-7b-ppo-self • 7B • 1
lblaoke/mistral-v0.3-7b-ppo-human • 7B • 1
lblaoke/qwen2.5-7b-ppo-self • 8B • 2
lblaoke/qwen2.5-7b-ppo-self-human • 8B • 1
lblaoke/qwen2.5-7b-ppo-human • 8B • 1
lblaoke/llama-3.1-8b-ppo-self-human • 8B • 2
lblaoke/qwen2.5-7b-rm-human • 7B • 1
lblaoke/qwen2.5-7b-rm-self-human • 7B • 1
lblaoke/qwen2.5-7b-rm-self • 7B
lblaoke/llama-3.1-8b-ppo-self • 8B • 1
lblaoke/llama-3.1-8b-ppo-human • 8B
lblaoke/llama-3.1-8b-rm-self-human • 8B • 1
lblaoke/llama-3.1-8b-rm-self • 8B • 1
lblaoke/llama-3.1-8b-rm-human • 8B • 2
lblaoke/mistral-v0.1-7b-ppo-self-human • 7B • 1