Defensive Publications Series

TEXT-TO-VIDEO GENERATION USING NEURAL NETWORKS

Abstract

Systems and methods for text-to-video conversion are provided. The system may receive text representing a natural language description for generating a video. The system may also provide the text as input to a neural network. The neural network may include a first component comprising a text-to-image model trained to generate one or more images based on an input text. The neural network may further include a second component comprising one or more spatiotemporal layers trained to generate a video based on the one or more images generated by the first component. The neural network may further include a third component comprising a frame interpolation network trained to increase the number of video frames of the video generated by the second component. The neural network may further include a fourth component configured to perform super-resolution across spatial and temporal dimensions. The system may also execute the neural network to generate the video representing the text.
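The data flow through the four components can be illustrated with a minimal sketch. This is not the disclosed implementation: every function below is a hypothetical stand-in that replaces a trained network with a trivial operation (seeded noise, linear blending, nearest-neighbour upsampling), and all names, default sizes, and the choice to run super-resolution last are assumptions made only to show how the stages might chain together.

```python
import random
from typing import List

Frame = List[List[float]]   # one grayscale image: H rows of W pixel values
Video = List[Frame]         # a sequence of T frames

def text_to_image(text: str, num_images: int = 2, size: int = 4) -> List[Frame]:
    """Stand-in for component 1 (text-to-image model): seeded random images."""
    rng = random.Random(sum(ord(c) for c in text))  # deterministic per prompt
    return [[[rng.random() for _ in range(size)] for _ in range(size)]
            for _ in range(num_images)]

def blend(a: Frame, b: Frame, w: float) -> Frame:
    """Pixel-wise linear blend: (1 - w) * a + w * b."""
    return [[(1 - w) * pa + w * pb for pa, pb in zip(ra, rb)]
            for ra, rb in zip(a, b)]

def spatiotemporal_layers(images: List[Frame], num_frames: int = 4) -> Video:
    """Stand-in for component 2: turns the images into a short, sparse clip."""
    first, last = images[0], images[-1]
    return [blend(first, last, t / (num_frames - 1)) for t in range(num_frames)]

def interpolate_frames(video: Video, factor: int = 2) -> Video:
    """Stand-in for component 3: inserts factor - 1 blended frames per gap."""
    out: Video = []
    for a, b in zip(video, video[1:]):
        out.extend(blend(a, b, k / factor) for k in range(factor))
    out.append(video[-1])
    return out

def super_resolve(video: Video, spatial: int = 2, temporal: int = 2) -> Video:
    """Stand-in for component 4: upsamples in space (nearest neighbour) and time."""
    def upscale(frame: Frame) -> Frame:
        return [[p for p in row for _ in range(spatial)]
                for row in frame for _ in range(spatial)]
    return interpolate_frames([upscale(f) for f in video], temporal)

def generate_video(text: str) -> Video:
    """Chains the four stand-in components in the assumed order."""
    images = text_to_image(text)            # component 1
    clip = spatiotemporal_layers(images)    # component 2
    clip = interpolate_frames(clip)         # component 3
    return super_resolve(clip)              # component 4

video = generate_video("a dog running on a beach")
print(len(video), len(video[0]), len(video[0][0]))  # frames, height, width
```

With the default sizes assumed here, the clip grows from 4 frames to 7 after frame interpolation, and to 13 frames of 8 x 8 pixels after the combined spatial and temporal upsampling step.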

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.

Recommended Citation

"TEXT-TO-VIDEO GENERATION USING NEURAL NETWORKS", Technical Disclosure Commons, (February 11, 2024)
https://www.tdcommons.org/dpubs_series/6679
