PKU-Alignment/ProgressGym-HistLlama3-8B-C014-pretrain-v0.2 • Text Generation • 8B • Updated Aug 10, 2024 • 6
PKU-Alignment/ProgressGym-HistLlama3-8B-C013-instruct-v0.2 • Text Generation • 8B • Updated Aug 10, 2024 • 7
PKU-Alignment/ProgressGym-HistLlama3-8B-C013-pretrain-v0.2 • Text Generation • 8B • Updated Aug 10, 2024 • 10
PKU-Alignment/beaver-7b-unified-reward • Reinforcement Learning • 7B • Updated Apr 20, 2024 • 856
PKU-Alignment/beaver-7b-unified-cost • Reinforcement Learning • 7B • Updated Apr 20, 2024 • 729 • 2
PKU-Alignment/beaver-7b-v1.0-reward • Reinforcement Learning • 7B • Updated Apr 20, 2024 • 5.85k • 17
PKU-Alignment/beaver-7b-v1.0-cost • Reinforcement Learning • 7B • Updated Apr 20, 2024 • 6.06k • 10