Generative AI video is improving rather quickly (just like everything else around AI). In the last two days, there have been two big announcements: one from Runway ML and another from Pika Labs, both related to the ability to control the movement of the virtual AI camera.
If you are not familiar with those services, they are generative AI tools that allow you to create videos based on text prompts. Both actually offer other AI tools as well, but for the sake of my experimenting and this article, I'm sticking with the text-to-video functionality of each.
One of the drawbacks of AI-generated video has always been the lack of control over the camera movement in the scene. This is slowly changing. I say slowly because even though both companies released the ability to control the camera, it's still in its infancy and we have very little actual control over it. But we are starting to get some control, which is good news.
You can join Pika Labs at Pika.art. Pika actually runs in Discord, so you need to have a Discord account as well. You can animate images using the /animate command or create new videos based on a text prompt using the /create command.
To manipulate the camera, you pass the command a parameter called -camera. You can pass different values for that parameter, including:
zoom in / out
pan up / down / left / right (you can also combine these such as pan up right)
rotate clockwise (cw) / counterclockwise (ccw) / anticlockwise (acw)
You can also pass it a -motion attribute with a value between 0 and 4. The default is 1. 0 is slow while 4 is fast.
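Putting those pieces together, a full text-to-video prompt might look something like the example below. The prompt text itself is my own made-up example, and the exact field layout may differ slightly in Discord, but the -camera and -motion parameters follow the options described above:

```
/create prompt: waves rolling onto a sandy beach at sunset -camera pan up right -motion 2
```

Here the camera pans up and to the right at a slightly faster-than-default speed (2 instead of the default 1).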
So, for example, I animated a beach image three times.
For the first prompt, I passed it /animate [image] (where [image] was an actual image I uploaded to their system) and it returned the following:
For the second prompt, I passed it /animate [image] -camera zoom in -motion 3 and it returned the following:
For the last prompt, I passed it /animate [image] -camera pan right -motion 3 and got this:
The results are pretty impressive. Now let's move on to Runway.
You can play with Runway at Runwayml.com. The interface for Runway is definitely more refined, as their product is more developed. Having said that, the number of parameters you can configure is pretty much the same as what Pika provides.
You basically select which motion you want, for example Horizontal Left, Vertical Up, Zoom Out, or Roll Right. Then you select the speed, and it will try to build your video with that motion. I say try because, as of now, it doesn't always hit it perfectly. I think Pika does a bit better at hitting the movement you want. The drawback to Pika is that I can't Pan Up Right and Zoom In at the same time (at least not that I could figure out how to do). You can do that in Runway.
One of the things I appreciate about Runway is that it can generate previews for free before I commit to generating the actual video, which costs credits, and credits cost money. You can see above, in the interface, there is a "PREVIEW" button. You click on that and you'll get 4 previews.
If you don't like those previews, you can click on "PREVIEW" again and get 4 more.
As you can see from the above previews, there were some versions that were a bit "funky" and would have been a waste of "credits". Below is the video I created in Runway ML after selecting one of the previews from above.
Even though it didn't move exactly how I told it to move, the results are still pretty good.
So there you have it, camera motion is now available in generative AI. Though it's limited today, just wait until tomorrow. As I always say, this is THE WORST that AI will EVER get. It will only get better from here!
Other things I worked on
Worked on Fight Scene Edit for my UCLA Class. You can read more about that here.
Until the next entry!