SONIC-O1: A Real-World Benchmark for Evaluating Multimodal Large Language Models on Audio-Video Understanding
Paper • 2601.21666
SONIC-O1: A Real-World Benchmark for Evaluating Multimodal Large Language Models on Audio-Video Understanding
FastHMR: Accelerating Human Mesh Recovery via Token and Layer Merging with Diffusion Decoding