Papers - a HariharaIII Collection

Papers

updated Apr 10, 2025

VAPO: Efficient and Reliable Reinforcement Learning for Advanced Reasoning Tasks

Paper • 2504.05118 • Published Apr 7, 2025 • 26
T1: Tool-integrated Self-verification for Test-time Compute Scaling in Small Language Models

Paper • 2504.04718 • Published Apr 7, 2025 • 42
SynWorld: Virtual Scenario Synthesis for Agentic Action Knowledge Refinement

Paper • 2504.03561 • Published Apr 4, 2025 • 18
Concept Lancet: Image Editing with Compositional Representation Transplant

Paper • 2504.02828 • Published Apr 3, 2025 • 16
ShieldAgent: Shielding Agents via Verifiable Safety Policy Reasoning

Paper • 2503.22738 • Published Mar 26, 2025 • 17
APIGen-MT: Agentic Pipeline for Multi-Turn Data Generation via Simulated Agent-Human Interplay

Paper • 2504.03601 • Published Apr 4, 2025 • 17
LiveVQA: Live Visual Knowledge Seeking

Paper • 2504.05288 • Published Apr 7, 2025 • 15
Why Reasoning Matters? A Survey of Advancements in Multimodal Reasoning (v1)

Paper • 2504.03151 • Published Apr 4, 2025 • 15
Generative Evaluation of Complex Reasoning in Large Language Models

Paper • 2504.02810 • Published Apr 3, 2025 • 14
Tuning-Free Image Editing with Fidelity and Editability via Unified Latent Diffusion Model

Paper • 2504.05594 • Published Apr 8, 2025 • 11
MedSAM2: Segment Anything in 3D Medical Images and Videos

Paper • 2504.03600 • Published Apr 4, 2025 • 10
DiaTool-DPO: Multi-Turn Direct Preference Optimization for Tool-Augmented Large Language Models

Paper • 2504.02882 • Published Apr 2, 2025 • 7
Efficient Reinforcement Finetuning via Adaptive Curriculum Learning

Paper • 2504.05520 • Published Apr 7, 2025 • 11
3D Scene Understanding Through Local Random Access Sequence Modeling

Paper • 2504.03875 • Published Apr 4, 2025 • 5
Distillation and Refinement of Reasoning in Small Language Models for Document Re-ranking

Paper • 2504.03947 • Published Apr 4, 2025 • 4
JailDAM: Jailbreak Detection with Adaptive Memory for Vision-Language Model

Paper • 2504.03770 • Published Apr 3, 2025 • 2
SkillWeaver: Web Agents can Self-Improve by Discovering and Honing Skills

Paper • 2504.07079 • Published Apr 9, 2025 • 12
Rethinking Reflection in Pre-Training

Paper • 2504.04022 • Published Apr 5, 2025 • 80
Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model

Paper • 2503.24290 • Published Mar 31, 2025 • 62
Understanding R1-Zero-Like Training: A Critical Perspective

Paper • 2503.20783 • Published Mar 26, 2025 • 59
RIG: Synergizing Reasoning and Imagination in End-to-End Generalist Policy

Paper • 2503.24388 • Published Mar 31, 2025 • 29
Agentic Knowledgeable Self-awareness

Paper • 2504.03553 • Published Apr 4, 2025 • 27
Landscape of Thoughts: Visualizing the Reasoning Process of Large Language Models

Paper • 2503.22165 • Published Mar 28, 2025 • 28
Agent S2: A Compositional Generalist-Specialist Framework for Computer Use Agents

Paper • 2504.00906 • Published Apr 1, 2025 • 27
Effectively Controlling Reasoning Models through Thinking Intervention

Paper • 2503.24370 • Published Mar 31, 2025 • 19
Expanding RL with Verifiable Rewards Across Diverse Domains

Paper • 2503.23829 • Published Mar 31, 2025 • 23
Harnessing the Reasoning Economy: A Survey of Efficient Reasoning for Large Language Models

Paper • 2503.24377 • Published Mar 31, 2025 • 18
ActionStudio: A Lightweight Framework for Data and Training of Large Action Models

Paper • 2503.22673 • Published Mar 28, 2025 • 12
MegaTTS 3: Sparse Alignment Enhanced Latent Diffusion Transformer for Zero-Shot Speech Synthesis

Paper • 2502.18924 • Published Feb 26, 2025 • 16
Interpreting Emergent Planning in Model-Free Reinforcement Learning

Paper • 2504.01871 • Published Apr 2, 2025 • 12
START: Self-taught Reasoner with Tools

Paper • 2503.04625 • Published Mar 6, 2025 • 113
Token-Efficient Long Video Understanding for Multimodal LLMs

Paper • 2503.04130 • Published Mar 6, 2025 • 96
SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion

Paper • 2503.11576 • Published Mar 14, 2025 • 127
SmolVLM: Redefining small and efficient multimodal models

Paper • 2504.05299 • Published Apr 7, 2025 • 202
UniF^2ace: Fine-grained Face Understanding and Generation with Unified Multimodal Models

Paper • 2503.08120 • Published Mar 11, 2025 • 31
Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey

Paper • 2503.12605 • Published Mar 16, 2025 • 35
CoRe^2: Collect, Reflect and Refine to Generate Better and Faster

Paper • 2503.09662 • Published Mar 12, 2025 • 33
VisualPRM: An Effective Process Reward Model for Multimodal Reasoning

Paper • 2503.10291 • Published Mar 13, 2025 • 36
InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity

Paper • 2503.16418 • Published Mar 20, 2025 • 36
Modifying Large Language Model Post-Training for Diverse Creative Writing

Paper • 2503.17126 • Published Mar 21, 2025 • 36
Think Before Recommend: Unleashing the Latent Reasoning Power for Sequential Recommendation

Paper • 2503.22675 • Published Mar 28, 2025 • 36
MagicInfinite: Generating Infinite Talking Videos with Your Words and Voice

Paper • 2503.05978 • Published Mar 7, 2025 • 36
API Agents vs. GUI Agents: Divergence and Convergence

Paper • 2503.11069 • Published Mar 14, 2025 • 36
Cognitive Behaviors that Enable Self-Improving Reasoners, or, Four Habits of Highly Effective STaRs

Paper • 2503.01307 • Published Mar 3, 2025 • 38
JARVIS-VLA: Post-Training Large-Scale Vision Language Models to Play Visual Games with Keyboards and Mouse

Paper • 2503.16365 • Published Mar 20, 2025 • 40
DeepSolution: Boosting Complex Engineering Solution Design via Tree-based Exploration and Bi-point Thinking

Paper • 2502.20730 • Published Feb 28, 2025 • 38
Process-based Self-Rewarding Language Models

Paper • 2503.03746 • Published Mar 5, 2025 • 39
A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond

Paper • 2503.21614 • Published Mar 27, 2025 • 42
EgoLife: Towards Egocentric Life Assistant

Paper • 2503.03803 • Published Mar 5, 2025 • 46