Lei Wang

demolei

https://demoleiwang.github.io/HomePage/

AI & ML interests

LLMs

Recent Activity

upvoted 2 papers 3 days ago

OpenComputer: Verifiable Software Worlds for Computer-Use Agents

Paper • 2605.19769 • Published 7 days ago • 57

Anti-Self-Distillation for Reasoning RL via Pointwise Mutual Information

Paper • 2605.11609 • Published 14 days ago • 190

upvoted a paper 5 days ago

AutoResearchClaw: Self-Reinforcing Autonomous Research with Human-AI Collaboration

Paper • 2605.20025 • Published 7 days ago • 179

upvoted a paper 6 days ago

AI for Auto-Research: Roadmap & User Guide

Paper • 2605.18661 • Published 8 days ago • 64

upvoted a paper 7 days ago

WildClawBench: A Benchmark for Real-World, Long-Horizon Agent Evaluation

Paper • 2605.10912 • Published 15 days ago • 45

upvoted 2 papers 11 days ago

Learning to Build the Environment: Self-Evolving Reasoning RL via Verifiable Environment Synthesis

Paper • 2605.14392 • Published 12 days ago • 8

Self-Distilled Agentic Reinforcement Learning

Paper • 2605.15155 • Published 12 days ago • 109

upvoted 2 papers 13 days ago

RubricEM: Meta-RL with Rubric-guided Policy Decomposition beyond Verifiable Rewards

Paper • 2605.10899 • Published 15 days ago • 75

Dynamic Skill Lifecycle Management for Agentic Reinforcement Learning

Paper • 2605.10923 • Published 15 days ago • 13

upvoted a paper 19 days ago

ARIS: Autonomous Research via Adversarial Multi-Agent Collaboration

Paper • 2605.03042 • Published 22 days ago • 120

upvoted 10 papers about 1 month ago

Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability

Paper • 2604.06628 • Published Apr 8 • 326

GrandCode: Achieving Grandmaster Level in Competitive Programming via Agentic Reinforcement Learning

Paper • 2604.02721 • Published Apr 3 • 630

DataFlex: A Unified Framework for Data-Centric Dynamic Training of Large Language Models

Paper • 2603.26164 • Published Mar 27 • 364

DR-Venus: Towards Frontier Edge-Scale Deep Research Agents with Only 10K Open Data

Paper • 2604.19859 • Published Apr 21 • 53

SWE-chat: Coding Agent Interactions From Real Users in the Wild

Paper • 2604.20779 • Published Apr 22 • 15

LLaDA2.0-Uni: Unifying Multimodal Understanding and Generation with Diffusion Large Language Model

Paper • 2604.20796 • Published Apr 22 • 242

Mind DeepResearch Technical Report

Paper • 2604.14518 • Published Apr 17 • 23

Agent-World: Scaling Real-World Environment Synthesis for Evolving General Agent Intelligence

Paper • 2604.18292 • Published Apr 20 • 85

WebCompass: Towards Multimodal Web Coding Evaluation for Code Language Models

Paper • 2604.18224 • Published Apr 20 • 22

ClawEnvKit: Automatic Environment Generation for Claw-Like Agents

Paper • 2604.18543 • Published Apr 20 • 30