Computational Photography and Capture, Spring 2011

Labs

Lab 7: Video Textures

The topic of today's lab is Video Textures. What you do during this session will help you with next week's lab, which will be assessed.

A Video Texture can be defined as a continuous, infinitely varying stream of images. In other words, it is a video that plays indefinitely without the visible cuts or jumps you would see if you simply played a normal video in a loop. The original paper has its own web page, and you are invited to play a few of the videos there to get a grasp of what a video texture is supposed to look like.

The most important concept used in this paper is the distance matrix. To be able to jump from one frame to another in the video, we naturally want those two frames to be similar, so that the transition is as smooth as possible. To achieve that, we have to calculate the distance matrix of the video, which is a matrix containing the distances between all pairs of frames. For example, the entry Mij of the matrix corresponds to the distance between frame i and frame j. The distance metric used is simply L2 (the Euclidean norm of the pixel-wise difference). If our input video is a swinging pendulum, its distance matrix should look like this:

[Figure: distance matrix of the pendulum video]

Dark entries correspond to low distances, bright entries to high distances. Naturally, the diagonal is zero, since the distance between a frame and itself is zero. We also expect consecutive frames of a video to be quite similar, so entries close to the diagonal should be dark as well. The distance matrix is then turned into a probability matrix, which describes the probability of transitioning from frame i to frame j. It looks like this:

[Figure: probability matrix of the pendulum video]
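To make the two matrices concrete, here is a minimal sketch in Python with NumPy (not Matlab, but the array operations translate almost one to one). The function names are hypothetical. The exponential mapping from distance to probability, the use of row i + 1 to score a jump from frame i, and the `sigma_factor` parameter are assumptions about one reasonable formulation; check them against the paper before relying on them.

```python
import numpy as np


def distance_matrix(frames):
    """L2 distance between every pair of frames.

    frames: array of shape (n_frames, ...), one image per entry.
    Returns D of shape (n_frames, n_frames), D[i, j] = ||frame_i - frame_j||_2.
    """
    flat = np.asarray(frames, dtype=np.float64).reshape(len(frames), -1)
    n = flat.shape[0]
    dist = np.empty((n, n))
    for i in range(n):
        # Distance from frame i to every frame, one row at a time.
        dist[i] = np.linalg.norm(flat - flat[i], axis=1)
    return dist


def probability_matrix(dist, sigma_factor=0.1):
    """Turn distances into transition probabilities (assumed formulation).

    P[i, j] is proportional to exp(-D[i + 1, j] / sigma): jumping from i to j
    is likely when j resembles the frame that normally follows i.
    The last frame has no successor, so P has one row fewer than D.
    sigma_factor is a hypothetical tuning knob, not a value from the paper.
    """
    nonzero = dist[dist > 0]
    sigma = sigma_factor * nonzero.mean() if nonzero.size else 1.0
    prob = np.exp(-dist[1:, :] / sigma)
    # Normalise each row so that it is a probability distribution.
    return prob / prob.sum(axis=1, keepdims=True)


# Toy "video": 8 frames of 4x4 pixels whose brightness oscillates.
frames = np.stack([np.full((4, 4), np.sin(2 * np.pi * t / 8)) for t in range(8)])
D = distance_matrix(frames)
P = probability_matrix(D)
```

A smaller `sigma_factor` concentrates the probability on the very best transitions, while a larger one allows more variety at the price of more visible jumps.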

The next step is to preserve dynamics. To illustrate this, let's think about our swinging pendulum. Suppose we are in a left-to-right swinging motion, at the frame where the pendulum is perfectly vertical, and we want to jump somewhere else in the video. We want to stay in a left-to-right motion rather than create an abrupt change of direction. However, looking only at the similarity of the current image with the other images of the video is not enough to take the motion into account: there are some very similar frames with the opposite motion. To overcome this problem, we require the temporally adjacent frames within some weighted window to be similar as well. In other words, we have to match subsequences of frames instead of individual frames. That can be achieved by filtering the distance matrix with a diagonal kernel. This is explained in section 3.1 of the paper. It turns our probability matrix into the following:

[Figure: probability matrix of the pendulum video after preserving dynamics]
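The diagonal filtering can be sketched as follows, again in Python with NumPy. This is an illustration under assumptions rather than the paper's exact recipe: the function name, the half-width parameter `m`, and the choice of binomial weights are mine, so compare them with section 3.1 before using them.

```python
from math import comb

import numpy as np


def preserve_dynamics(dist, m=2):
    """Filter a distance matrix along its diagonals.

    Computes filtered[a, b] = sum over k in [-m, m - 1] of
    w[k] * dist[i + k, j + k], with i = a + m and j = b + m.
    Only frames with a full window are kept, so the result has
    2 * m - 1 fewer rows and columns than the input.
    """
    dist = np.asarray(dist, dtype=np.float64)
    taps = 2 * m
    size = dist.shape[0] - taps + 1
    if size <= 0:
        raise ValueError("video too short for this window size")
    # Binomial weights, e.g. [1, 3, 3, 1] / 8 for m = 2 (assumed choice).
    weights = np.array([comb(taps - 1, t) for t in range(taps)], dtype=np.float64)
    weights /= weights.sum()
    filtered = np.zeros((size, size))
    for t, w in enumerate(weights):
        # Tap t corresponds to the offset k = t - m along the diagonal.
        filtered += w * dist[t:t + size, t:t + size]
    return filtered


# Toy pendulum: one number per frame, the horizontal position of the bob.
positions = np.array([0, 1, 2, 1, 0, 1, 2, 1, 0], dtype=np.float64)
D = np.abs(positions[:, None] - positions[None, :])
D_dyn = preserve_dynamics(D, m=1)

# Frames 1 and 3 look identical (D[1, 3] is 0) but move in opposite
# directions, so their filtered distance D_dyn[0, 2] becomes positive.
# Frames 1 and 5 look identical and move the same way, so D_dyn[0, 4] stays 0.
```

Note that row `a` of the filtered matrix corresponds to frame `a + m` of the video, which you need to keep in mind when mapping a transition back to frame numbers.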

Finally, an important part of the process is to avoid dead ends and anticipate the future. In other words, we don't want to jump to a place in the video that has no good transition available for the next jump. We have to calculate the anticipated "future cost" of choosing a given transition. The technique is explained in section 3.2 of the paper.
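One way to compute such a future cost is an iterative, Q-learning-style update: the cost of a transition is its immediate cost plus a discounted share of the cheapest transition available afterwards. The sketch below is written under my own indexing assumptions (`cost[i, j]` is the cost of going from frame i to frame j, and every frame that can be reached also has a row), and the names `p`, `alpha`, `tol` and `max_iter` as well as their defaults are hypothetical. Compare it with the equations in section 3.2 and adapt the indices to your own matrices.

```python
import numpy as np


def anticipate_future_cost(cost, p=2.0, alpha=0.99, tol=1e-9, max_iter=10000):
    """Iterate future[i, j] = cost[i, j] ** p + alpha * min_k future[j, k].

    cost: square, non-negative matrix of transition costs.
    Returns the matrix of anticipated future costs.
    """
    base = np.asarray(cost, dtype=np.float64) ** p
    future = base.copy()
    for _ in range(max_iter):
        # Cheapest transition available once we have landed on frame j.
        best_next = future.min(axis=1)
        updated = base + alpha * best_next[None, :]
        if np.max(np.abs(updated - future)) < tol:
            return updated
        future = updated
    return future


# Three frames. Frame 2 is a dead end: every transition out of it is expensive.
cost = np.array([
    [1.0, 0.0, 1.0],
    [1.0, 1.0, 0.0],
    [3.0, 3.0, 3.0],
])
future = anticipate_future_cost(cost, p=1.0, alpha=0.5)

# Immediate cost prefers 1 -> 2 (0 versus 1), but the anticipated cost
# prefers 1 -> 0 (about 1.33 versus about 1.67) because frame 2 is a trap.
```

The resulting matrix can then be turned into probabilities in the same way as the basic distance matrix.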

After doing all those steps, you should get a probability matrix that you can use to make jumps in the video in order to create a video texture.

Here is a summary of what you have to do today:

Read the paper up to and including section 3.2.

Download a video to work with. I have uploaded three videos: a pendulum (the simplest one), a candle flame, and a smaller version of the same candle flame for those who run into memory problems with Matlab. You can use the function in load_sequence.m to load the frames into Matlab.

Calculate the basic distance matrix (and probability matrix).

Refine that result by considering dynamics.

Refine that result by considering dead ends.

Create a video texture from your input video. For example, start with a given number of jumps that you want to achieve, and use your probability matrix to decide the right moment to jump.
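For the last step, one simple strategy is a random walk over the frames: at each frame, draw the next frame from the corresponding row of the probability matrix, and count a jump whenever the result is not the natural successor. The sketch below (Python with NumPy) assumes a square probability matrix in which every frame has a row that sums to 1. The function name and the parameters `start`, `seed` and `max_frames` are hypothetical.

```python
import numpy as np


def synthesize_sequence(prob, n_jumps, start=0, seed=None, max_frames=100000):
    """Random walk over frame indices until n_jumps jumps have been made.

    prob: square matrix, prob[i, j] = probability of showing j after i.
    Returns the list of frame indices to display, in order.
    """
    prob = np.asarray(prob, dtype=np.float64)
    rng = np.random.default_rng(seed)
    n = prob.shape[0]
    order = [start]
    jumps = 0
    # max_frames guards against matrices that almost never jump.
    while jumps < n_jumps and len(order) < max_frames:
        current = order[-1]
        nxt = int(rng.choice(n, p=prob[current]))
        if nxt != current + 1:
            # Anything other than the natural successor counts as a jump.
            jumps += 1
        order.append(nxt)
    return order


# Deterministic toy matrix: play 0, 1, 2 and then jump back to 0.
loop = np.array([
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 0.0, 0.0],
])
order = synthesize_sequence(loop, n_jumps=2)
```

The returned list of indices can then be used to pick frames from the loaded sequence and write them out as the new video.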

The next assessed coursework (next week) will make use of the distance matrix that you created. In principle only the basic one is needed, but I recommend trying to create them all.