Gerolamo
C2: Scalable Rubric-Augmented Reward Modeling from Binary Preferences