DiffusionVL: Translating Any Autoregressive Models into Diffusion Vision Language Models Paper • 2512.15713 • Published 16 days ago • 16
TurboDiffusion: Accelerating Video Diffusion Models by 100-200 Times Paper • 2512.16093 • Published 16 days ago • 90
LLaDA2.0: Scaling Up Diffusion Language Models to 100B Paper • 2512.15745 • Published 24 days ago • 78
Scaling Latent Reasoning via Looped Language Models Paper • 2510.25741 • Published Oct 29, 2025 • 221
MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts Paper • 2401.04081 • Published Jan 8, 2024 • 73
MMLU Pro benchmark for GGUFs (1 shot) Collection "Not all quantized model perform good", serving framework ollama uses NVIDIA gpu, llama.cpp uses CPU with AVX & AMX • 13 items • Updated Aug 15, 2025 • 9
LangSplatV2: High-dimensional 3D Language Gaussian Splatting with 450+ FPS Paper • 2507.07136 • Published Jul 9, 2025 • 39
view article Article Fine-tuning LLMs to 1.58bit: extreme quantization made easy +4 Sep 18, 2024 • 272
view article Article Introducing AutoRound: Intel’s Advanced Quantization for LLMs and VLMs +7 Apr 29, 2025 • 43