Models used in CHARM: Calibrating Reward Models With Chatbot Arena Scores.
shawnxzhu
AI & ML interests
None yet
Recent Activity
authored a paper 12 days ago
EnvFactory: Scaling Tool-Use Agents via Executable Environments Synthesis and Robust RL
submitted a paper 13 days ago
EnvFactory: Scaling Tool-Use Agents via Executable Environments Synthesis and Robust RL
upvoted a paper 13 days ago
EnvFactory: Scaling Tool-Use Agents via Executable Environments Synthesis and Robust RL