Less is More: Recursive Reasoning with Tiny Networks
Paper
•
2510.04871
•
Published
•
501
Cache-to-Cache: Direct Semantic Communication Between Large Language Models
Paper
•
2510.03215
•
Published
•
97
When Thoughts Meet Facts: Reusable Reasoning for Long-Context LMs
Paper
•
2510.07499
•
Published
•
48
StreamingVLM: Real-Time Understanding for Infinite Video Streams
Paper
•
2510.09608
•
Published
•
50
LiteStage: Latency-aware Layer Skipping for Multi-stage Reasoning
Paper
•
2510.14211
•
Published
•
7
Every Attention Matters: An Efficient Hybrid Architecture for Long-Context Reasoning
Paper
•
2510.19338
•
Published
•
114
LightMem: Lightweight and Efficient Memory-Augmented Generation
Paper
•
2510.18866
•
Published
•
111
Glyph: Scaling Context Windows via Visual-Text Compression
Paper
•
2510.17800
•
Published
•
67
DeepSeek-OCR: Contexts Optical Compression
Paper
•
2510.18234
•
Published
•
86
Deep Self-Evolving Reasoning
Paper
•
2510.17498
•
Published
•
11
Latent Sketchpad: Sketching Visual Thoughts to Elicit Multimodal Reasoning in MLLMs
Paper
•
2510.24514
•
Published
•
21
The End of Manual Decoding: Towards Truly End-to-End Language Models
Paper
•
2510.26697
•
Published
•
116
Exploring Conditions for Diffusion models in Robotic Control
Paper
•
2510.15510
•
Published
•
39
Kimi Linear: An Expressive, Efficient Attention Architecture
Paper
•
2510.26692
•
Published
•
119
Delta Attention: Fast and Accurate Sparse Attention Inference by Delta Correction
Paper
•
2505.11254
•
Published
•
48