LongAct: Harnessing Intrinsic Activation Patterns for Long-Context Reinforcement Learning
Abstract
LongAct improves long-context reasoning in LLMs via saliency-guided sparse updates, selecting which weights to train based on high-magnitude activation patterns in the query and key vectors.
Reinforcement Learning (RL) has emerged as a critical driver for enhancing the reasoning capabilities of Large Language Models (LLMs). While recent advances have focused on reward engineering or data synthesis, few studies exploit the model's intrinsic representation characteristics to guide the training process. In this paper, we first observe high-magnitude activations within the query and key vectors when processing long contexts. Drawing inspiration from model quantization -- which establishes the criticality of such high-magnitude activations -- and from the insight that long-context reasoning is inherently sparse in structure, we hypothesize that the weights associated with these salient activations are the pivotal drivers of effective model optimization. Based on this insight, we propose LongAct, a strategy that shifts from uniform to saliency-guided sparse updates. By selectively updating only the weights associated with these significant activations, LongAct achieves an improvement of roughly 8% on LongBench v2 and enhances generalization on the RULER benchmark. Furthermore, our method exhibits strong universality, consistently boosting performance across diverse RL algorithms such as GRPO and DAPO. Extensive ablation studies suggest that focusing on these salient features is key to unlocking long-context potential.
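The abstract's core mechanism -- restricting updates to weights tied to high-magnitude query/key activations -- can be sketched as a gradient mask applied before the optimizer step. The following is a minimal illustration, not the paper's actual implementation: the function names, the mean-magnitude saliency criterion, and the `keep_ratio` parameter are all assumptions for exposition.

```python
import torch

def saliency_mask(activations: torch.Tensor, keep_ratio: float = 0.1) -> torch.Tensor:
    """Flag the highest-magnitude activation channels as salient.

    activations: [num_tokens, hidden] query or key activations collected
    from a forward pass over long-context inputs (hypothetical setup).
    Returns a boolean mask over hidden channels (True = salient).
    """
    channel_mag = activations.abs().mean(dim=0)           # per-channel mean magnitude
    k = max(1, int(keep_ratio * channel_mag.numel()))     # number of channels to keep
    top = torch.topk(channel_mag, k).indices
    mask = torch.zeros_like(channel_mag, dtype=torch.bool)
    mask[top] = True
    return mask

def apply_sparse_update(weight: torch.nn.Parameter, mask: torch.Tensor) -> None:
    """Zero the gradient rows of non-salient output channels so the
    optimizer only updates the salient slice of the projection weight."""
    if weight.grad is not None:
        weight.grad[~mask] = 0.0
```

In a training loop this would run between `loss.backward()` and `optimizer.step()`, so non-salient rows of the query/key projection weights receive no update. How the paper actually scores saliency (per-channel, per-head, or per-weight) is not specified in the abstract.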
ACL 2026