ML Foundations

non-profit

Recent Activity

jieyuz2

submitted a paper to Daily Papers 2 days ago

Theory of Space: Can Foundation Models Construct Spatial Beliefs through Active Exploration?

Paper • 2602.07055 • Published 7 days ago • 20

yuhuizhang

authored 8 papers 8 days ago

CellFlux: Simulating Cellular Morphology Changes via Flow Matching

Paper • 2502.09775 • Published Feb 13, 2025

Closing the Modality Gap for Mixed Modality Search

Paper • 2507.19054 • Published Jul 25, 2025

MuSLR: Multimodal Symbolic Logical Reasoning

Paper • 2509.25851 • Published Sep 30, 2025 • 12

MoCa: Measuring Human-Language Model Alignment on Causal and Moral Judgment Tasks

Paper • 2310.19677 • Published Oct 30, 2023

No Tokens Wasted: Leveraging Long Context in Biomedical Vision-Language Models

Paper • 2510.03978 • Published Oct 4, 2025 • 4

Stanza: A Python Natural Language Processing Toolkit for Many Human Languages

Paper • 2003.07082 • Published Mar 16, 2020

Transductive Visual Programming: Evolving Tool Libraries from Experience for Spatial Reasoning

Paper • 2512.20934 • Published Dec 24, 2025

PaperSearchQA: Learning to Search and Reason over Scientific Papers with RLVR

Paper • 2601.18207 • Published 17 days ago • 19

Sunny111

posted an update 26 days ago

Are you familiar with reverse residual connections or looping in language models?

Excited to share my Looped-GPT blog post and codebase 🚀
https://github.com/sanyalsunny111/Looped-GPT

TL;DR: looping during pre-training improves generalization.

The plot shows GPT-2 LMs pre-trained on 15.73B OpenWebText (OWT) tokens.

P.S. This is my first post here. I have ~4 followers and zero expectations for reach 😄

sedrickkeh

authored a paper 2 months ago

SkillFactory: Self-Distillation For Learning Cognitive Behaviors

Paper • 2512.04072 • Published Dec 3, 2025 • 5

anas-awadalla

updated a model 3 months ago

mlfoundations/Gelato-30B-A3B

Image-Text-to-Text • 31B • Updated Nov 15, 2025 • 299 • 31

anas-awadalla

updated a dataset 3 months ago

mlfoundations/Click-100k

Viewer • Updated Nov 11, 2025 • 101k • 719 • 15

djghosh

updated a collection 3 months ago

🍨 Gelato

From Data Curation to Reinforcement Learning: Building a Strong Grounding Model for Computer-Use Agents • 5 items • Updated Nov 15, 2025 • 1

djghosh

updated a dataset 3 months ago

mlfoundations/gelato-osworld-agent-trajectories

Viewer • Updated Nov 6, 2025 • 13.5k • 46 • 1

djghosh

published a dataset 3 months ago

mlfoundations/gelato-osworld-agent-trajectories

Viewer • Updated Nov 6, 2025 • 13.5k • 46 • 1

anas-awadalla

updated a collection 4 months ago

🍨 Gelato

From Data Curation to Reinforcement Learning: Building a Strong Grounding Model for Computer-Use Agents • 5 items • Updated Nov 15, 2025 • 1