This started as an experiment, but I think it turned out pretty well, so I am sharing it here. A few tips that can help:
Square output (480x480 or 720x720) seems to be more consistent and to turn out better in general. I had some fun successes with other formats, but they are less consistent.
For I2V, if your starting image is a tight close-up on the face, the man will already be behind as the camera zooms out. If the starting frame is less zoomed in, the man usually appears during the camera motion.




