Golden Goose: A Simple Trick to Synthesize Unlimited RLVR Tasks from Unverifiable Internet Text Paper • 2601.22975 • Published 13 days ago • 96
Article Unlocking Agentic RL Training for GPT-OSS: A Practical Retrospective 16 days ago • 51
SmallThinker: A Family of Efficient Large Language Models Natively Trained for Local Deployment Paper • 2507.20984 • Published Jul 28, 2025 • 58
Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought Paper • 2501.04682 • Published Jan 8, 2025 • 99
Reinforcement Learning with Verifiable Rewards Implicitly Incentivizes Correct Reasoning in Base LLMs Paper • 2506.14245 • Published Jun 17, 2025 • 45
Article AI Policy @🤗: Response to the 2025 National AI R&D Strategic Plan Jun 2, 2025 • 14