Gerolamo
RiTTA: Modeling Event Relations in Text-to-Audio Generation