Imagen Video: High-Definition AI Text-to-Video Generator
What is Imagen Video?
Features of Imagen Video
- The output content is highly consistent with the text description
- Retains the strengths of the original Imagen system, such as the ability to spell text accurately
- Produces high-fidelity video: 1280 x 768 pixel resolution at 24 frames per second
- A high degree of controllability and world knowledge: it can generate videos and text animations in a variety of art styles, and it shows an understanding of 3D objects
How to use Imagen Video?
At present, Imagen Video is not publicly available. To learn more, click “Research Paper” on the homepage of the official Imagen Video website.
Imagen AI Paper
Imagen Video Technical Principle
The Imagen Video model consists of a frozen T5 text encoder, a base video diffusion model, and interleaved spatial and temporal super-resolution diffusion models. You can also find more information in the related paper.
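To make the cascade idea concrete, here is a minimal sketch that only tracks how the video's shape changes as it passes through interleaved temporal (TSR) and spatial (SSR) super-resolution stages. It runs no diffusion model. The base resolution, the base frame rate, and the per-stage factors are assumptions chosen so that the result matches the 1280 x 768, 24 frames per second, roughly 5 second output described on this page; the exact schedule should be checked against the paper. `VideoShape`, `BASE`, `STAGES`, and `run_cascade` are hypothetical names used for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VideoShape:
    frames: int
    width: int
    height: int
    fps: int

    @property
    def seconds(self) -> float:
        # Clip duration implied by the frame count and frame rate.
        return self.frames / self.fps


def temporal_sr(shape: VideoShape, factor: int) -> VideoShape:
    # Temporal super-resolution: more frames at a higher frame rate,
    # so the clip duration stays the same.
    return VideoShape(shape.frames * factor, shape.width, shape.height,
                      shape.fps * factor)


def spatial_sr(shape: VideoShape, factor: int) -> VideoShape:
    # Spatial super-resolution: same frames, larger width and height.
    return VideoShape(shape.frames, shape.width * factor,
                      shape.height * factor, shape.fps)


# Assumed (illustrative) starting point and stage schedule.
BASE = VideoShape(frames=16, width=40, height=24, fps=3)
STAGES = [("TSR", 2), ("SSR", 2), ("SSR", 4), ("TSR", 2), ("TSR", 2), ("SSR", 4)]


def run_cascade(base: VideoShape, stages) -> list:
    # Return the shape after the base model and after every stage.
    shapes = [base]
    for kind, factor in stages:
        step = temporal_sr if kind == "TSR" else spatial_sr
        shapes.append(step(shapes[-1], factor))
    return shapes


if __name__ == "__main__":
    for shape in run_cascade(BASE, STAGES):
        print(f"{shape.frames} frames, {shape.width}x{shape.height}, "
              f"{shape.fps} fps, {shape.seconds:.2f} s")
```

Under these assumed factors, the temporal stages raise the frame rate without changing the clip length, which is why the final video stays at about 5 seconds even though the frame count grows.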
Imagen Video Pricing
There is no pricing at present, because Imagen Video is not currently available. The Google development team has said that, while its internal tests show that most explicit and violent content can be filtered out, social biases and stereotypes remain that are difficult to detect and filter, and that it has decided not to release the Imagen Video model or its source code until these issues are mitigated.
Video length is also limited: the maximum output is currently about 5 seconds. Google has pointed to a possible solution, however. Phenaki, another Google project, can generate videos about two minutes long, and Google plans as a next step to combine the image quality of Imagen Video with the coherence and video length of Phenaki.