- MANZANO: A Simple and Scalable Unified Multimodal Model with a Hybrid Vision Tokenizer
  Paper • 2509.16197 • Published • 56
- InternRobotics/VLAC
  Robotics • 2B • Updated • 42 • 37
- LazyDrag: Enabling Stable Drag-Based Editing on Multi-Modal Diffusion Transformers via Explicit Correspondence
  Paper • 2509.12203 • Published • 19
- A Vision-Language-Action-Critic Model for Robotic Real-World Reinforcement Learning
  Paper • 2509.15937 • Published • 20
fysp
AI & ML interests
tech, ai, climate, social, disrupt
Recent Activity
liked a dataset 15 days ago: google/mobile-actions
updated a collection 16 days ago: Faves
updated a collection 3 months ago: Faves