Sunny Sanyal
Sunny111
10 followers · 5 following
https://sites.google.com/view/sunnysanyal/home
SunnySanyal9
sanyalsunny111
AI & ML interests
Efficient Training Recipes of Large Models (mostly LLMs)
Recent Activity
Replied to their post about 16 hours ago:
Are you familiar with reverse residual connections or looping in language models? Excited to share my Looped-GPT blog post and codebase: https://github.com/sanyalsunny111/Looped-GPT TL;DR: looping during pre-training improves generalization. Plot shows GPT2 LMs pre-trained with 15.73B OWT tokens. P.S. This is my first post here; I have ~4 followers and zero expectations for reach.
Posted an update 3 days ago:
Are you familiar with reverse residual connections or looping in language models? Excited to share my Looped-GPT blog post and codebase: https://github.com/sanyalsunny111/Looped-GPT TL;DR: looping during pre-training improves generalization. Plot shows GPT2 LMs pre-trained with 15.73B OWT tokens. P.S. This is my first post here; I have ~4 followers and zero expectations for reach.
Upvoted a paper about 1 month ago:
Pre-training Small Base LMs with Fewer Tokens
Sunny111's activity
Authored a paper almost 2 years ago:
Pre-training Small Base LMs with Fewer Tokens
Paper • 2404.08634 • Published Apr 12, 2024 • 36