Perplexed by Perplexity: Perplexity-Based Data Pruning With Small Reference Models Paper • 2405.20541 • Published May 30, 2024
RevBiFPN: The Fully Reversible Bidirectional Feature Pyramid Network Paper • 2206.14098 • Published Jun 28, 2022
SPDF: Sparse Pre-training and Dense Fine-tuning for Large Language Models Paper • 2303.10464 • Published Mar 18, 2023
Sparse Iso-FLOP Transformations for Maximizing Training Efficiency Paper • 2303.11525 • Published Mar 21, 2023
Enabling High-Sparsity Foundational Llama Models with Efficient Pretraining and Deployment Paper • 2405.03594 • Published May 6, 2024
Towards Characterizing Domain Counterfactuals For Invertible Latent Causal Models Paper • 2306.11281 • Published Jun 20, 2023
StarCraftImage: A Dataset For Prototyping Spatial Reasoning Methods For Multi-Agent Environments Paper • 2401.04290 • Published Jan 9, 2024
Feature Shift Detection: Localizing Which Features Have Shifted via Conditional Distribution Tests Paper • 2107.06929 • Published Jul 14, 2021