OmniEvalKit

community

Recent Activity

Cuiunbo authored a paper 17 days ago

UltraEval-Audio: A Unified Framework for Comprehensive Evaluation of Audio Foundation Models

Cuiunbo authored a paper 17 days ago

MiniCPM-o 4.5: Towards Real-Time Full-Duplex Omni-Modal Interaction

xiaofff updated a dataset 2 months ago

OmniEvalKit/omnievalkit-dataset

authored 2 papers 17 days ago

UltraEval-Audio: A Unified Framework for Comprehensive Evaluation of Audio Foundation Models

Paper • 2601.01373 • Published Jan 4 • 1

MiniCPM-o 4.5: Towards Real-Time Full-Duplex Omni-Modal Interaction

Paper • 2604.27393 • Published 27 days ago • 76

updated a dataset 2 months ago

OmniEvalKit/omnievalkit-dataset

Viewer • Updated Mar 27 • 318k • 3.33k

published a dataset 2 months ago

OmniEvalKit/omnievalkit-dataset

Viewer • Updated Mar 27 • 318k • 3.33k

authored a paper 7 months ago

MiniCPM-V 4.5: Cooking Efficient MLLMs via Architecture, Data, and Training Recipe

Paper • 2509.18154 • Published Sep 16, 2025 • 61

authored a paper over 1 year ago

VisRAG: Vision-based Retrieval-augmented Generation on Multi-modality Documents

Paper • 2410.10594 • Published Oct 14, 2024 • 29

authored 2 papers almost 2 years ago

MiniCPM-V: A GPT-4V Level MLLM on Your Phone

Paper • 2408.01800 • Published Aug 3, 2024 • 96

GUICourse: From General Vision Language Models to Versatile GUI Agents

Paper • 2406.11317 • Published Jun 17, 2024 • 2

posted an update almost 2 years ago

Introducing GUICourse! 🎉
By leveraging extensive OCR pretraining with grounding ability, we unlock the potential of parsing-free methods for GUI agents.
📄 Paper: GUICourse: From General Vision Language Models to Versatile GUI Agents (2406.11317)
🌐 GitHub Repo: https://github.com/yiye3/GUICourse
📖 Dataset: yiye2023/GUIAct / yiye2023/GUIChat / yiye2023/GUIEnv
🎯 Model: RhapsodyAI/minicpm-guidance / RhapsodyAI/qwen_vl_guidance
