Wittgensite Leaderboard 🥇: Prompt consistency benchmark for AI coding agents