MMTEB: Massive Multilingual Text Embedding Benchmark Paper • 2502.13595 • Published Feb 19, 2025 • 43
From Tools to Teammates: Evaluating LLMs in Multi-Session Coding Interactions Paper • 2502.13791 • Published Feb 19, 2025 • 5
Bridging the Data Provenance Gap Across Text, Speech and Video Paper • 2412.17847 • Published Dec 19, 2024 • 10
Surveying the Effects of Quality, Diversity, and Complexity in Synthetic Data From Large Language Models Paper • 2412.02980 • Published Dec 4, 2024 • 15
Consent in Crisis: The Rapid Decline of the AI Data Commons Paper • 2407.14933 • Published Jul 20, 2024 • 14