OpenVid-1M: A High-Quality Dataset for Text-to-Video Generation

OpenVid-1M addresses two critical challenges in text-to-video (T2V) generation: the scarcity of high-quality datasets and the underutilization of text data. The dataset comprises more than a million text-video pairs, of which 433K are high-definition videos. Alongside the dataset, a new model, the Multi-modal Video Diffusion Transformer (MVDiT), improves video generation by integrating text and visual data more effectively. Experimental results show improvements over prior methods.
