Green-VLA: Staged Vision-Language-Action Model for Generalist Robots Paper • 2602.00919 • Published 5 days ago • 199
view article Article Qwen-Image-i2L: Training Strategies for Image-to-LoRA Generation Dec 16, 2025 • 52
Emerging Properties in Unified Multimodal Pretraining Paper • 2505.14683 • Published May 20, 2025 • 133
LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders Paper • 2404.05961 • Published Apr 9, 2024 • 66
ObjectDrop: Bootstrapping Counterfactuals for Photorealistic Object Removal and Insertion Paper • 2403.18818 • Published Mar 27, 2024 • 28
Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model Paper • 2401.09417 • Published Jan 17, 2024 • 62