rvienne/layton-eval
All layton-eval-related datasets
Note: Dataset containing the layton-eval riddles.
Note: Dataset containing everything needed to compute the PPI-based benchmark score.
Note: Final benchmark results for several frontier models.