Gerolamo
$π$-Play: Multi-Agent Self-Play via Privileged Self-Distillation without External Data