ProAct: Agentic Lookahead in Interactive Environments • Paper • 2602.05327 • Published 5 days ago • 23
EntroPIC: Towards Stable Long-Term Training of LLMs via Entropy Stabilization with Proportional-Integral Control • Paper • 2511.15248 • Published Nov 19, 2025 • 7
Using Human Feedback to Fine-tune Diffusion Models without Any Reward Model • Paper • 2311.13231 • Published Nov 22, 2023 • 28