AutoCLIP: Auto-tuning Zero-Shot Classifiers for Vision-Language Models Paper • 2309.16414 • Published Sep 28, 2023 • 19
Dynamic ASR Pathways: An Adaptive Masking Approach Towards Efficient Pruning of A Multilingual ASR Model Paper • 2309.13018 • Published Sep 22, 2023 • 9
Robust Speech Recognition via Large-Scale Weak Supervision Paper • 2212.04356 • Published Dec 6, 2022 • 51
GPT-Fathom: Benchmarking Large Language Models to Decipher the Evolutionary Path towards GPT-4 and Beyond Paper • 2309.16583 • Published Sep 28, 2023 • 13
Evaluating Cognitive Maps and Planning in Large Language Models with CogEval Paper • 2309.15129 • Published Sep 25, 2023 • 7
QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models Paper • 2309.14717 • Published Sep 26, 2023 • 46
Attention Satisfies: A Constraint-Satisfaction Lens on Factual Errors of Language Models Paper • 2309.15098 • Published Sep 26, 2023 • 7
Exploring Large Language Models' Cognitive Moral Development through Defining Issues Test Paper • 2309.13356 • Published Sep 23, 2023 • 38
ReConcile: Round-Table Conference Improves Reasoning via Consensus among Diverse LLMs Paper • 2309.13007 • Published Sep 22, 2023 • 1
SCREWS: A Modular Framework for Reasoning with Revisions Paper • 2309.13075 • Published Sep 20, 2023 • 18
Large Language Models for Code: Security Hardening and Adversarial Testing Paper • 2302.05319 • Published Feb 10, 2023 • 2
CodePlan: Repository-level Coding using LLMs and Planning Paper • 2309.12499 • Published Sep 21, 2023 • 80
InstructERC: Reforming Emotion Recognition in Conversation with a Retrieval Multi-task LLMs Framework Paper • 2309.11911 • Published Sep 21, 2023 • 3
Repository-Level Prompt Generation for Large Language Models of Code Paper • 2206.12839 • Published Jun 26, 2022 • 3
GPT4Tools: Teaching Large Language Model to Use Tools via Self-instruction Paper • 2305.18752 • Published May 30, 2023 • 5
Boolformer: Symbolic Regression of Logic Functions with Transformers Paper • 2309.12207 • Published Sep 21, 2023 • 11
Chain-of-Verification Reduces Hallucination in Large Language Models Paper • 2309.11495 • Published Sep 20, 2023 • 40
MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models Paper • 2309.12284 • Published Sep 21, 2023 • 19
LMDX: Language Model-based Document Information Extraction and Localization Paper • 2309.10952 • Published Sep 19, 2023 • 67
Stabilizing RLHF through Advantage Model and Selective Rehearsal Paper • 2309.10202 • Published Sep 18, 2023 • 11
SlimPajama-DC: Understanding Data Combinations for LLM Training Paper • 2309.10818 • Published Sep 19, 2023 • 11
The Rise and Potential of Large Language Model Based Agents: A Survey Paper • 2309.07864 • Published Sep 14, 2023 • 8
Sparse Autoencoders Find Highly Interpretable Features in Language Models Paper • 2309.08600 • Published Sep 15, 2023 • 15
Clinical Text Summarization: Adapting Large Language Models Can Outperform Human Experts Paper • 2309.07430 • Published Sep 14, 2023 • 28
Compositional Foundation Models for Hierarchical Planning Paper • 2309.08587 • Published Sep 15, 2023 • 11
CCEdit: Creative and Controllable Video Editing via Diffusion Models Paper • 2309.16496 • Published Sep 28, 2023 • 9
ReMoDiffuse: Retrieval-Augmented Motion Diffusion Model Paper • 2304.01116 • Published Apr 3, 2023 • 2
NeuRBF: A Neural Fields Representation with Adaptive Radial Basis Functions Paper • 2309.15426 • Published Sep 27, 2023 • 15
DECO: Dense Estimation of 3D Human-Scene Contact In The Wild Paper • 2309.15273 • Published Sep 26, 2023 • 7
ProPainter: Improving Propagation and Transformer for Video Inpainting Paper • 2309.03897 • Published Sep 7, 2023 • 28
Spatially Guiding Unsupervised Semantic Segmentation Through Depth-Informed Feature Distillation and Sampling Paper • 2309.12378 • Published Sep 21, 2023 • 4
MosaicFusion: Diffusion Models as Data Augmenters for Large Vocabulary Instance Segmentation Paper • 2309.13042 • Published Sep 22, 2023 • 9
DualToken-ViT: Position-aware Efficient Vision Transformer with Dual Token Fusion Paper • 2309.12424 • Published Sep 21, 2023 • 11
High-Resolution Image Synthesis with Latent Diffusion Models Paper • 2112.10752 • Published Dec 20, 2021 • 15
Learning Transferable Visual Models From Natural Language Supervision Paper • 2103.00020 • Published Feb 26, 2021 • 19
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale Paper • 2010.11929 • Published Oct 22, 2020 • 15
U-Net: Convolutional Networks for Biomedical Image Segmentation Paper • 1505.04597 • Published May 18, 2015 • 18
Adding Conditional Control to Text-to-Image Diffusion Models Paper • 2302.05543 • Published Feb 10, 2023 • 58
Controllable Dynamic Appearance for Neural 3D Portraits Paper • 2309.11009 • Published Sep 20, 2023 • 3
SyncDreamer: Generating Multiview-consistent Images from a Single-view Image Paper • 2309.03453 • Published Sep 7, 2023 • 13
DreamGaussian: Generative Gaussian Splatting for Efficient 3D Content Creation Paper • 2309.16653 • Published Sep 28, 2023 • 48
Diverse and Aligned Audio-to-Video Generation via Text-to-Video Model Adaptation Paper • 2309.16429 • Published Sep 28, 2023 • 11
AnyMAL: An Efficient and Scalable Any-Modality Augmented Language Model Paper • 2309.16058 • Published Sep 27, 2023 • 57
RealFill: Reference-Driven Generation for Authentic Image Completion Paper • 2309.16668 • Published Sep 28, 2023 • 15
ConceptGraphs: Open-Vocabulary 3D Scene Graphs for Perception and Planning Paper • 2309.16650 • Published Sep 28, 2023 • 10
MotionGPT: Finetuned LLMs are General-Purpose Motion Generators Paper • 2306.10900 • Published Jun 19, 2023 • 19
Jointly Training Large Autoregressive Multimodal Models Paper • 2309.15564 • Published Sep 27, 2023 • 8
Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation Paper • 2309.15818 • Published Sep 27, 2023 • 19
Emu: Enhancing Image Generation Models Using Photogenic Needles in a Haystack Paper • 2309.15807 • Published Sep 27, 2023 • 34
Low-rank Adaptation of Large Language Model Rescoring for Parameter-Efficient Speech Recognition Paper • 2309.15223 • Published Sep 26, 2023 • 23
Efficient Post-training Quantization with FP8 Formats Paper • 2309.14592 • Published Sep 26, 2023 • 11
DeepSpeed Ulysses: System Optimizations for Enabling Training of Extreme Long Sequence Transformer Models Paper • 2309.14509 • Published Sep 25, 2023 • 20
Breathing New Life into 3D Assets with Generative Repainting Paper • 2309.08523 • Published Sep 15, 2023 • 5
InstaFlow: One Step is Enough for High-Quality Diffusion-Based Text-to-Image Generation Paper • 2309.06380 • Published Sep 12, 2023 • 33
VideoDirectorGPT: Consistent Multi-scene Video Generation via LLM-Guided Planning Paper • 2309.15091 • Published Sep 26, 2023 • 35
Aligning Large Multimodal Models with Factually Augmented RLHF Paper • 2309.14525 • Published Sep 25, 2023 • 32
LAVIE: High-Quality Video Generation with Cascaded Latent Diffusion Models Paper • 2309.15103 • Published Sep 26, 2023 • 43
DeepSpeed-VisualChat: Multi-Round Multi-Image Interleave Chat via Multi-Modal Causal Attention Paper • 2309.14327 • Published Sep 25, 2023 • 23
Small-scale proxies for large-scale Transformer training instabilities Paper • 2309.14322 • Published Sep 25, 2023 • 22
Robotic Offline RL from Internet Videos via Value-Function Pre-Training Paper • 2309.13041 • Published Sep 22, 2023 • 9
MMICL: Empowering Vision-language Model with Multi-Modal In-Context Learning Paper • 2309.07915 • Published Sep 14, 2023 • 4
DriveDreamer: Towards Real-world-driven World Models for Autonomous Driving Paper • 2309.09777 • Published Sep 18, 2023 • 2
MEGABYTE: Predicting Million-byte Sequences with Multiscale Transformers Paper • 2305.07185 • Published May 12, 2023 • 10
Q-Transformer: Scalable Offline Reinforcement Learning via Autoregressive Q-Functions Paper • 2309.10150 • Published Sep 18, 2023 • 26
Multimodal Foundation Models: From Specialists to General-Purpose Assistants Paper • 2309.10020 • Published Sep 18, 2023 • 41