- Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models
  Paper • 2310.04406 • Published • 10
- Chain-of-Thought Reasoning Without Prompting
  Paper • 2402.10200 • Published • 109
- ICDPO: Effectively Borrowing Alignment Capability of Others via In-context Direct Preference Optimization
  Paper • 2402.09320 • Published • 6
- Self-Discover: Large Language Models Self-Compose Reasoning Structures
  Paper • 2402.03620 • Published • 117
Collections
Collections including paper arxiv:2402.06155
- Latent Reasoning in LLMs as a Vocabulary-Space Superposition
  Paper • 2510.15522 • Published • 3
- Language Models are Injective and Hence Invertible
  Paper • 2510.15511 • Published • 69
- Eliciting Secret Knowledge from Language Models
  Paper • 2510.01070 • Published • 6
- Interpreting Language Models Through Concept Descriptions: A Survey
  Paper • 2510.01048 • Published • 2

- DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search
  Paper • 2408.08152 • Published • 61
- ChunkAttention: Efficient Self-Attention with Prefix-Aware KV Cache and Two-Phase Partition
  Paper • 2402.15220 • Published • 20
- Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models
  Paper • 2402.19427 • Published • 56
- Simple linear attention language models balance the recall-throughput tradeoff
  Paper • 2402.18668 • Published • 20

- Orca 2: Teaching Small Language Models How to Reason
  Paper • 2311.11045 • Published • 77
- ToolTalk: Evaluating Tool-Usage in a Conversational Setting
  Paper • 2311.10775 • Published • 9
- Adapters: A Unified Library for Parameter-Efficient and Modular Transfer Learning
  Paper • 2311.11077 • Published • 29
- MultiLoRA: Democratizing LoRA for Better Multi-Task Learning
  Paper • 2311.11501 • Published • 37