How to Create an AI Text-to-Video Clip in Seconds

admin April 28, 2023


LLMs like ChatGPT generate text on almost any topic, and image generators like Stable Diffusion create images from prompts, but text-to-video AI is still a new area. Earlier this week, we reported on an AI pizza commercial that was made with a text-to-video tool called Runway Gen-2. However, Runway Gen-2 is currently in an invite-only beta, so you cannot try it unless you are invited.

Luckily, Hugging Face (the leading AI developer portal) hosts a completely free and easy-to-use tool called NeuralInternet Text-to-Video Playground. It is limited to two-second clips, which is just about enough for an animated GIF. You don’t even need a Hugging Face account to use it. Here’s how.

How to generate a 2-second AI text-to-video clip

1. Go to the Text-to-Video Playground in your browser.

2. Enter a prompt in the prompt box, or try one of the sample prompts at the bottom of the page (e.g. “Astronaut on horseback”).

(Image credit: Tom’s Hardware)

3. Enter a seed number. The seed is a number (between -1 and 1,000,000) that the AI uses as a starting point for generating its output. This means that a fixed seed such as 1 should give you the same output every time for the same prompt. A seed of -1 is recommended, because it picks a random seed each time.


(Image credit: Tom’s Hardware)

4. Click Run.


(Image credit: Tom’s Hardware)

It takes a few minutes for the Text-to-Video Playground to generate results. You can check the progress by looking at the results window. It may take longer depending on the amount of server traffic.


(Image credit: Tom’s Hardware)

5. Click the play button to play your video.


(Image credit: Tom’s Hardware)

6. Right-click the video and choose Save video as to download the video (as an MP4) to your PC.


(Image credit: Tom’s Hardware)
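The seed behaviour described in step 3 can be illustrated with a small Python sketch. This is not the Playground's code: `resolve_seed` and `fake_generate` are hypothetical names, and the "generator" is only a stand-in that returns a few pseudo-random numbers. It assumes, as the step describes, that -1 means "pick a random seed" and that valid seeds run up to 1,000,000.

```python
import random

MAX_SEED = 1_000_000  # upper bound quoted in step 3


def resolve_seed(seed):
    """Return the seed to use; -1 means 'pick a random one' (assumed behaviour)."""
    if seed == -1:
        return random.randint(0, MAX_SEED)
    if not 0 <= seed <= MAX_SEED:
        raise ValueError("seed must be -1 or between 0 and 1,000,000")
    return seed


def fake_generate(prompt, seed, n=4):
    """Hypothetical stand-in for a generator: same prompt and seed give the same numbers."""
    # Seeding a private Random instance with a string is deterministic across runs.
    rng = random.Random(f"{prompt}|{resolve_seed(seed)}")
    return [rng.random() for _ in range(n)]


# Usage: a fixed seed repeats itself, a different seed does not.
first = fake_generate("Astronaut on horseback", 1)
second = fake_generate("Astronaut on horseback", 1)
other = fake_generate("Astronaut on horseback", 2)
print(first == second, first == other)
```

The point of the sketch is only that a fixed seed makes a pseudo-random process repeatable, which is why a seed of 1 should reproduce the same clip, while -1 gives a fresh result on each run.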

Models used and results

The Text-to-Video Playground uses a text-to-video model from a Chinese company called ModelScope, which says the model has 1.7 billion parameters. Like many AI models that work with images, the ModelScope model has some limitations beyond the two-second running time.

First of all, it is clear that the training data set was drawn from a wide variety of web images, including copyrighted and watermarked ones. In some examples, part of a Shutterstock watermark appears on objects in the video. Shutterstock is a leading royalty-free image provider that requires a paid membership, and it appears that the training data includes its images, taken without permission.

Shutterstock watermark. The circle is mine. (Image credit: Tom’s Hardware)

Also, not everything looks as it should. For example, keen kaiju fans will notice that my Godzilla eating pizza video below shows a monster that is a giant green lizard but has none of the distinctive characteristics of everyone’s favorite Japanese monster.

Godzilla eating pizza, 2 second AI video

This video was created in Text-to-Video Playground and converted to a GIF for easy viewing here. (Image credit: Future)

Lastly, perhaps this goes without saying, but these videos have no audio. The best use for them is probably converting them into animated GIFs that you can send to your friends. The image above is an animated GIF I made from one of my two-second videos of Godzilla eating pizza.
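The article does not say which tool was used for the GIF conversion. One common option is the ffmpeg command-line program, and the sketch below shows how the downloaded MP4 might be converted with it from Python. It assumes ffmpeg is installed and on your PATH; the function names, file names, frame rate, and width are illustrative choices, not settings taken from the article.

```python
import shutil
import subprocess


def build_gif_command(src, dst, fps=10, width=320):
    """Build an ffmpeg command that turns a short MP4 into an animated GIF."""
    # fps sets the GIF frame rate; scale resizes to the given width and
    # keeps the aspect ratio (-1); lanczos is a resampling filter.
    video_filter = f"fps={fps},scale={width}:-1:flags=lanczos"
    return ["ffmpeg", "-y", "-i", src, "-vf", video_filter, dst]


def convert_to_gif(src, dst):
    """Run the conversion; raises if ffmpeg is missing or the command fails."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    subprocess.run(build_gif_command(src, dst), check=True)


# Example (hypothetical file names):
# convert_to_gif("godzilla.mp4", "godzilla.gif")
```

A lower frame rate and a smaller width keep the GIF file small, which matters if the goal is to send it to friends.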

If you want to learn more about creating with AI, see “How to create autonomous agents using Auto-GPT” or “How to use BabyAGI.”
