ML Foundations

non-profit

Recent Activity

yuhuizhang

authored 8 papers 3 days ago

CellFlux: Simulating Cellular Morphology Changes via Flow Matching

Paper • 2502.09775 • Published Feb 13, 2025

Closing the Modality Gap for Mixed Modality Search

Paper • 2507.19054 • Published Jul 25, 2025

MuSLR: Multimodal Symbolic Logical Reasoning

Paper • 2509.25851 • Published Sep 30, 2025 • 12

MoCa: Measuring Human-Language Model Alignment on Causal and Moral Judgment Tasks

Paper • 2310.19677 • Published Oct 30, 2023

No Tokens Wasted: Leveraging Long Context in Biomedical Vision-Language Models

Paper • 2510.03978 • Published Oct 4, 2025 • 4

Stanza: A Python Natural Language Processing Toolkit for Many Human Languages

Paper • 2003.07082 • Published Mar 16, 2020

Transductive Visual Programming: Evolving Tool Libraries from Experience for Spatial Reasoning

Paper • 2512.20934 • Published Dec 24, 2025

PaperSearchQA: Learning to Search and Reason over Scientific Papers with RLVR

Paper • 2601.18207 • Published 12 days ago • 19

Sunny111

posted an update 21 days ago

Are you familiar with reverse residual connections or looping in language models?

Excited to share my Looped-GPT blog post and codebase 🚀
https://github.com/sanyalsunny111/Looped-GPT

TL;DR: looping during pre-training improves generalization.

Plot shows GPT2 LMs pre-trained with 15.73B OWT tokens

P.S. This is my first post here — I have ~4 followers and zero expectations for reach 😄

sedrickkeh

authored a paper 2 months ago

SkillFactory: Self-Distillation For Learning Cognitive Behaviors

Paper • 2512.04072 • Published Dec 3, 2025 • 5

anas-awadalla

updated a model 3 months ago

mlfoundations/Gelato-30B-A3B

Image-Text-to-Text • 31B • Updated Nov 15, 2025 • 283 • 30

anas-awadalla

updated a dataset 3 months ago

mlfoundations/Click-100k

Viewer • Updated Nov 11, 2025 • 101k • 740 • 15

djghosh

updated a collection 3 months ago

🍨 Gelato

From Data Curation to Reinforcement Learning: Building a Strong Grounding Model for Computer-Use Agents • 5 items • Updated Nov 15, 2025 • 1

djghosh

updated a dataset 3 months ago

mlfoundations/gelato-osworld-agent-trajectories

Viewer • Updated Nov 6, 2025 • 13.5k • 41 • 1

djghosh

published a dataset 3 months ago

mlfoundations/gelato-osworld-agent-trajectories

Viewer • Updated Nov 6, 2025 • 13.5k • 41 • 1

anas-awadalla

updated a collection 3 months ago

🍨 Gelato

From Data Curation to Reinforcement Learning: Building a Strong Grounding Model for Computer-Use Agents • 5 items • Updated Nov 15, 2025 • 1

anas-awadalla

updated a collection 4 months ago

🍃 MINT-1T

Data for "MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens" • 14 items • Updated Oct 22, 2025 • 65