Tokenization in Transformers v5: Simpler, Clearer, and More Modular Article • Published 23 days ago • 109
💧 LFM2.5 Collection • Instruct, Base, and Japanese LFM2.5-1.2B models • 19 items • Updated 4 days ago • 58
DEER: Draft with Diffusion, Verify with Autoregressive Models Paper • 2512.15176 • Published 23 days ago • 42
Beyond Real: Imaginary Extension of Rotary Position Embeddings for Long-Context LLMs Paper • 2512.07525 • Published Dec 8, 2025 • 57
Coupling Experts and Routers in Mixture-of-Experts via an Auxiliary Loss Paper • 2512.23447 • Published 11 days ago • 93
Probing Scientific General Intelligence of LLMs with Scientist-Aligned Workflows Paper • 2512.16969 • Published 22 days ago • 111
DataFlow: An LLM-Driven Framework for Unified Data Preparation and Workflow Automation in the Era of Data-Centric AI Paper • 2512.16676 • Published 22 days ago • 203
Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer Paper • 2511.22699 • Published Nov 27, 2025 • 225
artificial-hivemind Collection • Datasets for the Artificial Hiveminds paper • 4 items • Updated May 16, 2025 • 12
WUSH: Near-Optimal Adaptive Transforms for LLM Quantization Paper • 2512.00956 • Published Nov 30, 2025 • 20
CaptionQA: Is Your Caption as Useful as the Image Itself? Paper • 2511.21025 • Published Nov 26, 2025 • 27
Nemotron-Flash: Towards Latency-Optimal Hybrid Small Language Models Paper • 2511.18890 • Published Nov 24, 2025 • 32
Guided Self-Evolving LLMs with Minimal Human Supervision Paper • 2512.02472 • Published Dec 2, 2025 • 51
ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration Paper • 2511.21689 • Published Nov 26, 2025 • 114
Stabilizing Reinforcement Learning with LLMs: Formulation and Practices Paper • 2512.01374 • Published Dec 1, 2025 • 96