Collection related to the paper "Training a Generally Curious Agent" (project page: https://paprika-llm.github.io/)
Fahim Tajwar
ftajwar
AI & ML interests: LLMs, RLHF
Models (4)
ftajwar/paprika_Meta-Llama-3.1-8B-Instruct_SFT_only • Text Generation • 8B • 1
ftajwar/llama_3.1_8B_baseline_DPO • Text Generation • 8B • 2
ftajwar/llama_3.1_8B_baseline_SFT • Text Generation • 8B • 3
ftajwar/paprika_Meta-Llama-3.1-8B-Instruct • Text Generation • 8B • 1 • 2
Datasets (22)
ftajwar/evaluation_bitwise_arithmetic-2 • Viewer • 110 • 11
ftajwar/training_bitwise_arithmetic-2 • Viewer • 20k • 27
ftajwar/evaluation_family_relationships_5 • Viewer • 100 • 9
ftajwar/evaluation_family_relationships_4 • Viewer • 100 • 5
ftajwar/training_family_relationships_5 • Viewer • 20k • 11
ftajwar/training_family_relationships_4 • Viewer • 20k • 23
ftajwar/evaluation_bitwise_arithmetic-4 • Viewer • 110 • 7
ftajwar/evaluation_bitwise_arithmetic-3 • Viewer • 110 • 17
ftajwar/training_bitwise_arithmetic-4 • Viewer • 20k • 7
ftajwar/training_bitwise_arithmetic-3 • Viewer • 20k • 8