Gerolamo
DR$^{3}$-Eval: Towards Realistic and Reproducible Deep Research Evaluation