Gerolamo
MemPO: Self-Memory Policy Optimization for Long-Horizon Agents