20250903 - a ShiqiangWoo Collection

20250903

Updated Sep 4, 2025

The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

Paper • 2509.02547 • Published Sep 2, 2025 • 228
SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning

Paper • 2509.02479 • Published Sep 2, 2025 • 83
POINTS-Reader: Distillation-Free Adaptation of Vision-Language Models for Document Conversion

Paper • 2509.01215 • Published Sep 1, 2025 • 50
LLaVA-Critic-R1: Your Critic Model is Secretly a Strong Policy Model

Paper • 2509.00676 • Published Aug 31, 2025 • 84
UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning

Paper • 2509.02544 • Published Sep 2, 2025 • 124
VerlTool: Towards Holistic Agentic Reinforcement Learning with Tool Use

Paper • 2509.01055 • Published Sep 1, 2025 • 76
Baichuan-M2: Scaling Medical Capability with Large Verifier System

Paper • 2509.02208 • Published Sep 2, 2025 • 42
Implicit Actor Critic Coupling via a Supervised Learning Framework for RLVR

Paper • 2509.02522 • Published Sep 2, 2025 • 25
Kwai Keye-VL 1.5 Technical Report

Paper • 2509.01563 • Published Sep 1, 2025 • 37
Reasoning Vectors: Transferring Chain-of-Thought Capabilities via Task Arithmetic

Paper • 2509.01363 • Published Sep 1, 2025 • 58
Jointly Reinforcing Diversity and Quality in Language Model Generations

Paper • 2509.02534 • Published Sep 2, 2025 • 24
GenCompositor: Generative Video Compositing with Diffusion Transformer

Paper • 2509.02460 • Published Sep 2, 2025 • 25
OpenVision 2: A Family of Generative Pretrained Visual Encoders for Multimodal Learning

Paper • 2509.01644 • Published Sep 1, 2025 • 33
Attributes as Textual Genes: Leveraging LLMs as Genetic Algorithm Simulators for Conditional Synthetic Data Generation

Paper • 2509.02040 • Published Sep 2, 2025 • 14
M3Ret: Unleashing Zero-shot Multimodal Medical Image Retrieval via Self-Supervision

Paper • 2509.01360 • Published Sep 1, 2025 • 11
FlashAdventure: A Benchmark for GUI Agents Solving Full Story Arcs in Diverse Adventure Games

Paper • 2509.01052 • Published Sep 1, 2025 • 21
Universal Deep Research: Bring Your Own Model and Strategy

Paper • 2509.00244 • Published Aug 29, 2025 • 13
Discrete Noise Inversion for Next-scale Autoregressive Text-based Image Editing

Paper • 2509.01984 • Published Sep 2, 2025 • 6
Fantastic Pretraining Optimizers and Where to Find Them

Paper • 2509.02046 • Published Sep 2, 2025 • 13
MedDINOv3: How to adapt vision foundation models for medical image segmentation?

Paper • 2509.02379 • Published Sep 2, 2025 • 1
Improving Large Vision and Language Models by Learning from a Panel of Peers

Paper • 2509.01610 • Published Sep 1, 2025 • 2
Towards More Diverse and Challenging Pre-training for Point Cloud Learning: Self-Supervised Cross Reconstruction with Decoupled Views

Paper • 2509.01250 • Published Sep 1, 2025 • 2
SQL-of-Thought: Multi-agentic Text-to-SQL with Guided Error Correction

Paper • 2509.00581 • Published Aug 30, 2025 • 10
C-DiffDet+: Fusing Global Scene Context with Generative Denoising for High-Fidelity Object Detection

Paper • 2509.00578 • Published Aug 30, 2025 • 1
Metis: Training Large Language Models with Advanced Low-Bit Quantization

Paper • 2509.00404 • Published Aug 30, 2025 • 6
FastFit: Accelerating Multi-Reference Virtual Try-On via Cacheable Diffusion Models

Paper • 2508.20586 • Published Aug 28, 2025 • 3