
Collections

Discover the best community collections!

Collections including paper arxiv:2603.13398

UCFE: A User-Centric Financial Expertise Benchmark for Large Language Models

Paper • 2410.14059 • Published Oct 17, 2024 • 63
Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching

Paper • 2503.05179 • Published Mar 7, 2025 • 46
Token-Efficient Long Video Understanding for Multimodal LLMs

Paper • 2503.04130 • Published Mar 6, 2025 • 96
GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing

Paper • 2503.10639 • Published Mar 13, 2025 • 53

Qianfan-OCR: A Unified End-to-End Model for Document Intelligence

Paper • 2603.13398 • Published Mar 11 • 153
Free Unlimited Google Veo 3

Running • 379 • Free Unlimited Google Veo-3 NSFW Uncensored

dLLM: Simple Diffusion Language Modeling

Paper • 2602.22661 • Published Feb 26 • 153
OpenSeeker: Democratizing Frontier Search Agents by Fully Open-Sourcing Training Data

Paper • 2603.15594 • Published Mar 16 • 149
Qianfan-OCR: A Unified End-to-End Model for Document Intelligence

Paper • 2603.13398 • Published Mar 11 • 153
Penguin-VL: Exploring the Efficiency Limits of VLM with LLM-based Vision Encoders

Paper • 2603.06569 • Published Mar 6 • 119

My notification

OpenVision 3: A Family of Unified Visual Encoder for Both Understanding and Generation

Paper • 2601.15369 • Published Jan 21 • 21
Stable-DiffCoder: Pushing the Frontier of Code Diffusion Large Language Model

Paper • 2601.15892 • Published Jan 22 • 53
Scaling Text-to-Image Diffusion Transformers with Representation Autoencoders

Paper • 2601.16208 • Published Jan 22 • 55
NAACL: Noise-AwAre Verbal Confidence Calibration for LLMs in RAG Systems

Paper • 2601.11004 • Published Jan 16 • 30

PaddlePaddle/PaddleOCR-VL

Image-Text-to-Text • 1.0B • Updated 27 days ago • 6.98k • 1.59k
nanonets/Nanonets-OCR2-3B

Image-Text-to-Text • 4B • Updated Oct 16, 2025 • 678k • 500
deepseek-ai/DeepSeek-OCR

Image-Text-to-Text • 3B • Updated Nov 4, 2025 • 2.08M • 3.22k
MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing

Paper • 2509.22186 • Published Sep 26, 2025 • 160

Read Later 📚

Interesting papers on AI, LLMs, etc. to add to reading list

Monitored Markov Decision Processes

Paper • 2402.06819 • Published Feb 9, 2024
Generalization in Monitored Markov Decision Processes (Mon-MDPs)

Paper • 2505.08988 • Published May 13, 2025
Bayesian Risk Markov Decision Processes

Paper • 2106.02558 • Published Jun 4, 2021
Sotopia-RL: Reward Design for Social Intelligence

Paper • 2508.03905 • Published Aug 5, 2025 • 23

Qianfan-OCR: A Unified End-to-End Model for Document Intelligence

Paper • 2603.13398 • Published Mar 11 • 153

MASS: Motion-Aware Spatial-Temporal Grounding for Physics Reasoning and Comprehension in Vision-Language Models

Paper • 2511.18373 • Published Nov 23, 2025 • 7
Multi-Agent Deep Research: Training Multi-Agent Systems with M-GRPO

Paper • 2511.13288 • Published Nov 17, 2025 • 19
Chain-of-Visual-Thought: Teaching VLMs to See and Think Better with Continuous Visual Tokens

Paper • 2511.19418 • Published Nov 24, 2025 • 29
SAM 3: Segment Anything with Concepts

Paper • 2511.16719 • Published Nov 20, 2025 • 135

StockBench: Can LLM Agents Trade Stocks Profitably In Real-world Markets?

Paper • 2510.02209 • Published Oct 2, 2025 • 57
MM-DREX: Multimodal-Driven Dynamic Routing of LLM Experts for Financial Trading

Paper • 2509.05080 • Published Sep 5, 2025
TradingGroup: A Multi-Agent Trading System with Self-Reflection and Data-Synthesis

Paper • 2508.17565 • Published Aug 25, 2025 • 1
QTMRL: An Agent for Quantitative Trading Decision-Making Based on Multi-Indicator Guided Reinforcement Learning

Paper • 2508.20467 • Published Aug 28, 2025

