Gerolamo
Lang2Act: Fine-Grained Visual Reasoning through Self-Emergent Linguistic Toolchains