Can We Predict Before Executing Machine Learning Agents? Paper • 2601.05930 • Published 5 days ago • 23
Illusions of Confidence? Diagnosing LLM Truthfulness via Neighborhood Consistency Paper • 2601.05905 • Published 5 days ago • 16
InnoGym: Benchmarking the Innovation Potential of AI Agents Paper • 2512.01822 • Published Dec 1, 2025 • 35
Executable Knowledge Graphs for Replicating AI Research Paper • 2510.17795 • Published Oct 20, 2025 • 14
LightMem: Lightweight and Efficient Memory-Augmented Generation Paper • 2510.18866 • Published Oct 21, 2025 • 111
OceanGym: A Benchmark Environment for Underwater Embodied Agents Paper • 2509.26536 • Published Sep 30, 2025 • 35
Towards Personalized Deep Research: Benchmarks and Evaluations Paper • 2509.25106 • Published Sep 29, 2025 • 29
A Survey of LLM-Driven AI Agent Communication: Protocols, Security Risks, and Defense Countermeasures Paper • 2506.19676 • Published Jun 24, 2025
ReCode: Updating Code API Knowledge with Reinforcement Learning Paper • 2506.20495 • Published Jun 25, 2025 • 9
Why Do Open-Source LLMs Struggle with Data Analysis? A Systematic Empirical Study Paper • 2506.19794 • Published Jun 24, 2025 • 8
KnowRL: Exploring Knowledgeable Reinforcement Learning for Factuality Paper • 2506.19807 • Published Jun 24, 2025 • 7