Auron: Depth-Efficient Language Models via Hybrid Recurrent-Attention Weight Sharing
nyxia • 25 days ago