Models
Datasets
Spaces
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2512.07461

Llm Augmentation

Native Parallel Reasoner: Reasoning in Parallelism via Self-Distilled Reinforcement Learning

Paper • 2512.07461 • Published Dec 8, 2025 • 78

Guided Self-Evolving LLMs with Minimal Human Supervision

Paper • 2512.02472 • Published Dec 2, 2025 • 55
DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search

Paper • 2509.25454 • Published Sep 29, 2025 • 146
Video Reasoning without Training

Paper • 2510.17045 • Published Oct 19, 2025 • 8
Agent Learning via Early Experience

Paper • 2510.08558 • Published Oct 9, 2025 • 273

What Characterizes Effective Reasoning? Revisiting Length, Review, and Structure of CoT

Paper • 2509.19284 • Published Sep 23, 2025 • 23
Soft Tokens, Hard Truths

Paper • 2509.19170 • Published Sep 23, 2025 • 16
CompLLM: Compression for Long Context Q&A

Paper • 2509.19228 • Published Sep 23, 2025 • 10
Test-Time Scaling in Reasoning Models Is Not Effective for Knowledge-Intensive Tasks Yet

Paper • 2509.06861 • Published Sep 8, 2025 • 9

about 19 hours ago

lusxvr/nanoVLM-222M

Image-Text-to-Text • 0.2B • Updated May 8, 2025 • 86 • 98
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning

Paper • 2503.09516 • Published Mar 12, 2025 • 38
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time

Paper • 2505.24863 • Published May 30, 2025 • 97
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning

Paper • 2505.17667 • Published May 23, 2025 • 88

Contrastive Decoding Improves Reasoning in Large Language Models

Paper • 2309.09117 • Published Sep 17, 2023 • 40
Prometheus: Inducing Fine-grained Evaluation Capability in Language Models

Paper • 2310.08491 • Published Oct 12, 2023 • 57
Language Models are Hidden Reasoners: Unlocking Latent Reasoning Capabilities via Self-Rewarding

Paper • 2411.04282 • Published Nov 6, 2024 • 37
Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models

Paper • 2411.14432 • Published Nov 21, 2024 • 25

Native Parallel Reasoner: Reasoning in Parallelism via Self-Distilled Reinforcement Learning

Paper • 2512.07461 • Published Dec 8, 2025 • 78
RuleReasoner: Reinforced Rule-based Reasoning via Domain-aware Dynamic Sampling

Paper • 2506.08672 • Published Jun 10, 2025 • 30
ReflectEvo: Improving Meta Introspection of Small LLMs by Learning Self-Reflection

Paper • 2505.16475 • Published May 22, 2025 • 3

A Survey of Data Agents: Emerging Paradigm or Overstated Hype?

Paper • 2510.23587 • Published Oct 27, 2025 • 67
D2E: Scaling Vision-Action Pretraining on Desktop Data for Transfer to Embodied AI

Paper • 2510.05684 • Published Oct 7, 2025 • 143
Thinking with Camera: A Unified Multimodal Model for Camera-Centric Understanding and Generation

Paper • 2510.08673 • Published Oct 9, 2025 • 126
Lumine: An Open Recipe for Building Generalist Agents in 3D Open Worlds

Paper • 2511.08892 • Published Nov 12, 2025 • 210

MLLM Reasoning, Rewarding, and Understanding

Papers on the reasoning, rewarding, and understanding of the MLLMs and LLMs

Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning

Paper • 2505.24726 • Published May 30, 2025 • 277
SynthRL: Scaling Visual Reasoning with Verifiable Data Synthesis

Paper • 2506.02096 • Published Jun 2, 2025 • 52
OThink-R1: Intrinsic Fast/Slow Thinking Mode Switching for Over-Reasoning Mitigation

Paper • 2506.02397 • Published Jun 3, 2025 • 36
ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models

Paper • 2505.24864 • Published May 30, 2025 • 143

Interesting Papers

These papers are interesting (to me)

Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models

Paper • 2410.02740 • Published Oct 3, 2024 • 54
From Code to Correctness: Closing the Last Mile of Code Generation with Hierarchical Debugging

Paper • 2410.01215 • Published Oct 2, 2024 • 39
Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models

Paper • 2409.17146 • Published Sep 25, 2024 • 121
EuroLLM: Multilingual Language Models for Europe

Paper • 2409.16235 • Published Sep 24, 2024 • 29

Llm Augmentation

Native Parallel Reasoner: Reasoning in Parallelism via Self-Distilled Reinforcement Learning

Paper • 2512.07461 • Published Dec 8, 2025 • 78

Native Parallel Reasoner: Reasoning in Parallelism via Self-Distilled Reinforcement Learning

Paper • 2512.07461 • Published Dec 8, 2025 • 78
RuleReasoner: Reinforced Rule-based Reasoning via Domain-aware Dynamic Sampling

Paper • 2506.08672 • Published Jun 10, 2025 • 30
ReflectEvo: Improving Meta Introspection of Small LLMs by Learning Self-Reflection

Paper • 2505.16475 • Published May 22, 2025 • 3

Guided Self-Evolving LLMs with Minimal Human Supervision

Paper • 2512.02472 • Published Dec 2, 2025 • 55
DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search

Paper • 2509.25454 • Published Sep 29, 2025 • 146
Video Reasoning without Training

Paper • 2510.17045 • Published Oct 19, 2025 • 8
Agent Learning via Early Experience

Paper • 2510.08558 • Published Oct 9, 2025 • 273

A Survey of Data Agents: Emerging Paradigm or Overstated Hype?

Paper • 2510.23587 • Published Oct 27, 2025 • 67
D2E: Scaling Vision-Action Pretraining on Desktop Data for Transfer to Embodied AI

Paper • 2510.05684 • Published Oct 7, 2025 • 143
Thinking with Camera: A Unified Multimodal Model for Camera-Centric Understanding and Generation

Paper • 2510.08673 • Published Oct 9, 2025 • 126
Lumine: An Open Recipe for Building Generalist Agents in 3D Open Worlds

Paper • 2511.08892 • Published Nov 12, 2025 • 210

What Characterizes Effective Reasoning? Revisiting Length, Review, and Structure of CoT

Paper • 2509.19284 • Published Sep 23, 2025 • 23
Soft Tokens, Hard Truths

Paper • 2509.19170 • Published Sep 23, 2025 • 16
CompLLM: Compression for Long Context Q&A

Paper • 2509.19228 • Published Sep 23, 2025 • 10
Test-Time Scaling in Reasoning Models Is Not Effective for Knowledge-Intensive Tasks Yet

Paper • 2509.06861 • Published Sep 8, 2025 • 9

MLLM Reasoning, Rewarding, and Understanding

Papers on the reasoning, rewarding, and understanding of the MLLMs and LLMs

Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning

Paper • 2505.24726 • Published May 30, 2025 • 277
SynthRL: Scaling Visual Reasoning with Verifiable Data Synthesis

Paper • 2506.02096 • Published Jun 2, 2025 • 52
OThink-R1: Intrinsic Fast/Slow Thinking Mode Switching for Over-Reasoning Mitigation

Paper • 2506.02397 • Published Jun 3, 2025 • 36
ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models

Paper • 2505.24864 • Published May 30, 2025 • 143

about 19 hours ago

lusxvr/nanoVLM-222M

Image-Text-to-Text • 0.2B • Updated May 8, 2025 • 86 • 98
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning

Paper • 2503.09516 • Published Mar 12, 2025 • 38
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time

Paper • 2505.24863 • Published May 30, 2025 • 97
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning

Paper • 2505.17667 • Published May 23, 2025 • 88

Interesting Papers

These papers are interesting (to me)

Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models

Paper • 2410.02740 • Published Oct 3, 2024 • 54
From Code to Correctness: Closing the Last Mile of Code Generation with Hierarchical Debugging

Paper • 2410.01215 • Published Oct 2, 2024 • 39
Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models

Paper • 2409.17146 • Published Sep 25, 2024 • 121
EuroLLM: Multilingual Language Models for Europe

Paper • 2409.16235 • Published Sep 24, 2024 • 29

Contrastive Decoding Improves Reasoning in Large Language Models

Paper • 2309.09117 • Published Sep 17, 2023 • 40
Prometheus: Inducing Fine-grained Evaluation Capability in Language Models

Paper • 2310.08491 • Published Oct 12, 2023 • 57
Language Models are Hidden Reasoners: Unlocking Latent Reasoning Capabilities via Self-Rewarding

Paper • 2411.04282 • Published Nov 6, 2024 • 37
Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models

Paper • 2411.14432 • Published Nov 21, 2024 • 25

Company

TOS Privacy About Careers

Website

Models Datasets Spaces Pricing Docs