infinitylogesh/qwen3_1_7b_base_grpo_math_12k_fullfinetuning_baseline_50 • 2B • Updated 1 day ago • 12
infinitylogesh/qwen3_1_7b_base_grpo_math_12k_fullfinetuning_baseline_100 • 2B • Updated 1 day ago • 7
infinitylogesh/qwen3_1_7b_base_srt_grpo_math_12k_single_stage_fullfinetuning_ckpt50 • 2B • Updated 21 days ago • 8
infinitylogesh/qwen3_1_7b_base_srt_grpo_math_12k_single_stage_fullfinetuning_ckpt100 • 2B • Updated 21 days ago • 10
infinitylogesh/qwen3_1_7b_base_srt_grpo_math_12k_single_stage_rollout_16_fullfinetuning_merged • 2B • Updated 21 days ago • 8
infinitylogesh/Qwen3-1.7B-GRPO-SRT-Math-12k-Single-Stage-Rollout-16-Full-Finetuning • 2B • Updated 23 days ago • 8
infinitylogesh/Qwen3-1.7B-GRPO-SRT-Math-12k-Stage-2 • Text Generation • 2B • Updated 25 days ago • 29
infinitylogesh/Qwen3-1.7B-GRPO-SRT-Math-12k-Stage-1 • Text Generation • 2B • Updated 29 days ago • 11.7k
infinitylogesh/Qwen2.5-1.5B-GRPO-SRT-Math-12k-Stage-0 • Text Generation • 2B • Updated 30 days ago • 6
infinitylogesh/Qwen3-1.7B-GRPO-SRT-Math-12k-Stage-0 • Text Generation • 2B • Updated 30 days ago • 10.1k