Multilingual Multi-Label Emotion Classification at Scale with Synthetic Data Paper • 2604.12633 • Published 13 days ago • 1
Evaluating Arabic Large Language Models: A Survey of Benchmarks, Methods, and Gaps Paper • 2510.13430 • Published Oct 15, 2025 • 1
3LM: Bridging Arabic, STEM, and Code through Benchmarking Paper • 2507.15850 • Published Jul 21, 2025 • 6
NeurIPS 2025 E2LM Competition : Early Training Evaluation of Language Models Paper • 2506.07731 • Published Jun 9, 2025 • 2
Are Arabic Benchmarks Reliable? QIMMA's Quality-First Approach to LLM Evaluation Paper • 2604.03395 • Published 24 days ago • 2
Contrastive Representation Learning: A Framework and Review Paper • 2010.05113 • Published Oct 10, 2020 • 1
Falcon-H1: A Family of Hybrid-Head Language Models Redefining Efficiency and Performance Paper • 2507.22448 • Published Jul 30, 2025 • 71
Learning to Explore with Parameter-Space Noise: A Deep Dive into Parameter-Space Noise for Reinforcement Learning with Verifiable Rewards Paper • 2602.02555 • Published Jan 30 • 1
Learnable Multipliers: Freeing the Scale of Language Model Matrix Layers Paper • 2601.04890 • Published Jan 8 • 44
Robust and Calibrated Detection of Authentic Multimedia Content Paper • 2512.15182 • Published Dec 17, 2025 • 17