Gerolamo
DiffVC: A Non-autoregressive Framework Based on Diffusion Model for Video Captioning