AI & ML interests

None defined yet.

Recent Activity

AdinaY 
posted an update about 18 hours ago
view post
Post
263
LLaDA 2.1 is out 🔥 A new series of MoE diffusion language model released by AntGroup

inclusionAI/LLaDA2.1-mini
inclusionAI/LLaDA2.1-flash

✨LLaDA2.1-mini: 16B - Apache2.0
✨LLaDA2.1-flash: 100B - Apache2.0
✨Both delivers editable generation, RL-trained diffusion reasoning and fast inference
  • 2 replies
·
sergiopaniego 
posted an update 1 day ago
view post
Post
221
if you're looking for a good first issue to get your open-source journey started, you could contribute to this TRL issue by documenting one impactful paper in the docs

we have a broad list to cover!! 🧐

https://github.com/huggingface/trl/issues/4407
AdinaY 
posted an update 6 days ago
view post
Post
2406
AI for science is moving fast🚀

Intern-S1-Pro 🔬 a MoE multimodal scientific reasoning model from Shanghai AI Lab

internlm/Intern-S1-Pro

✨ 1T total / 22B active
✨ Apache 2.0
✨ SoTA scientific reasoning performance
✨ FoPE enables scalable modeling of long physical time series (10⁰–10⁶)
  • 2 replies
·
AdinaY 
posted an update 7 days ago
view post
Post
1252
✨ China’s open source AI ecosystem has entered a new phase

https://huggingface.co/blog/huggingface/one-year-since-the-deepseek-moment-blog-3

One year after the “DeepSeek Moment,” open source has become the default. Models, research, infrastructure, and deployment are increasingly shared to support large-scale, system-level integration.

This final blog examines how leading Chinese AI organizations are evolving ,and what this implies for the future of open source.
AdinaY 
posted an update 7 days ago
view post
Post
300
GLM just entered the OCR field🔥

zai-org/GLM-OCR

✨ 0.9B
✨ MIT licensed
✨ Multimodal GLM-V architecture
✨ #1 on OmniDocBench v1.5 (94.62)
AdinaY 
posted an update 8 days ago
AdinaY 
posted an update 12 days ago
view post
Post
1074
What a week 🤯

Following DeepSeek, Kimi, Qwen, Baidu, and Ant Group, Unitree Robotics
has now released a VLA model on the hub too!

unitreerobotics/UnifoLM-VLA-Base
sergiopaniego 
posted an update 12 days ago
view post
Post
424
Meet the Post-Training Toolkit (PTT), which easily integrates with TRL via a single callback, by Aditya Challapally ( @microsoft ):

🔍 Detects training issues early
🛠 Lets you intervene safely
📊 Keeps long training runs stable, auditable & efficient

Microsoft blog: https://devblogs.microsoft.com/engineering-at-microsoft/diagnosing-instability-in-production-scale-agent-rl/

Integration guide: https://huggingface.co/docs/trl/main/en/ptt_integration

Code: https://github.com/microsoft/post-training-toolkit
victor 
posted an update 12 days ago
view post
Post
511
Interesting article: use Claude Code to help open models write CUDA kernels (for eg) by turning CC traces into Skills. They made a library out of it 👀

https://huggingface.co/blog/upskill
AdinaY 
posted an update 13 days ago
view post
Post
267
LongCat-Flash-Lite🔥 a non-thinking MoE model released by Meituan LongCat team.

meituan-longcat/LongCat-Flash-Lite

✨ Total 68.5B / 3B active - MIT license
✨ 256k context
✨ Faster inference with N-gram embeddings
sergiopaniego 
posted an update 13 days ago
AdinaY 
posted an update 13 days ago
view post
Post
250
Ant Group is going big on robotics 🤖

They just dropped their first VLA and depth perception foundation model on huggingface.

✨ LingBot-VLA :
- Trained on 20k hours of real-world robot data
- 9 robot embodiments
- Clear no-saturation scaling laws
- Apache 2.0

Model: https://huggingface.co/collections/robbyant/lingbot-vla
Paper:
A Pragmatic VLA Foundation Model (2601.18692)

✨ LingBot-Depth:
- Metric-accurate 3D from noisy, incomplete depth
- Masked Depth Modeling (self-supervised)
- RGB–depth alignment, works with <5% sparse depth
- Apache 2.0

Model: https://huggingface.co/collections/robbyant/lingbot-depth
Paper:
Masked Depth Modeling for Spatial Perception (2601.17895)
AdinaY 
posted an update 14 days ago
AdinaY 
posted an update 14 days ago