TheOneTrueNiz's Collections
Papers
R-4B: Incentivizing General-Purpose Auto-Thinking Capability in MLLMs via Bi-Mode Annealing and Reinforce Learning
Paper • 2508.21113 • Published • 110
Breaking the Exploration Bottleneck: Rubric-Scaffolded Reinforcement Learning for General LLM Reasoning
Paper • 2508.16949 • Published • 23
EmbodiedOneVision: Interleaved Vision-Text-Action Pretraining for General Robot Control
Paper • 2508.21112 • Published • 77
UItron: Foundational GUI Agent with Advanced Perception and Planning
Paper • 2508.21767 • Published • 12
SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning
Paper • 2509.02479 • Published • 83
K2-Think: A Parameter-Efficient Reasoning System
Paper • 2509.07604 • Published • 13
THOR: Tool-Integrated Hierarchical Optimization via RL for Mathematical Reasoning
Paper • 2509.13761 • Published • 16
FlowRL: Matching Reward Distributions for LLM Reasoning
Paper • 2509.15207 • Published • 114
WebSailor-V2: Bridging the Chasm to Proprietary Agents via Synthetic Data and Scalable Reinforcement Learning
Paper • 2509.13305 • Published • 91
MANZANO: A Simple and Scalable Unified Multimodal Model with a Hybrid Vision Tokenizer
Paper • 2509.16197 • Published • 56
ReSum: Unlocking Long-Horizon Search Intelligence via Context Summarization
Paper • 2509.13313 • Published • 80
Understanding the Thinking Process of Reasoning Models: A Perspective from Schoenfeld's Episode Theory
Paper • 2509.14662 • Published • 13
Meta-R1: Empowering Large Reasoning Models with Metacognition
Paper • 2508.17291 • Published
TruthRL: Incentivizing Truthful LLMs via Reinforcement Learning
Paper • 2509.25760 • Published • 55
Diffusion Transformers with Representation Autoencoders
Paper • 2510.11690 • Published • 165