- Self-Rewarding Language Models
  Paper • 2401.10020 • Published • 152
- Orion-14B: Open-source Multilingual Large Language Models
  Paper • 2401.12246 • Published • 14
- MambaByte: Token-free Selective State Space Model
  Paper • 2401.13660 • Published • 60
- MM-LLMs: Recent Advances in MultiModal Large Language Models
  Paper • 2401.13601 • Published • 48
Collections including paper arxiv:2402.16840
- Exponentially Faster Language Modelling
  Paper • 2311.10770 • Published • 119
- stabilityai/stable-video-diffusion-img2vid-xt
  Image-to-Video • Updated • 193k • 3.25k
- LucidDreamer: Domain-free Generation of 3D Gaussian Splatting Scenes
  Paper • 2311.13384 • Published • 53
- HierSpeech++: Bridging the Gap between Semantic and Acoustic Representation of Speech by Hierarchical Variational Inference for Zero-shot Speech Synthesis
  Paper • 2311.12454 • Published • 30
- Attention Is All You Need
  Paper • 1706.03762 • Published • 115
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
  Paper • 1810.04805 • Published • 26
- RoBERTa: A Robustly Optimized BERT Pretraining Approach
  Paper • 1907.11692 • Published • 10
- DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
  Paper • 1910.01108 • Published • 22
- MART: Improving LLM Safety with Multi-round Automatic Red-Teaming
  Paper • 2311.07689 • Published • 9
- DiLoCo: Distributed Low-Communication Training of Language Models
  Paper • 2311.08105 • Published • 16
- SparQ Attention: Bandwidth-Efficient LLM Inference
  Paper • 2312.04985 • Published • 40
- Aligning Large Language Models with Counterfactual DPO
  Paper • 2401.09566 • Published • 2