Movie Barcode Part 2: Square Averaging

Intro

I started my exploration of movie barcodes in my last post by making a simple barcode of Wu-Tang’s classic “Triumph” music video:

I used FFMPEG to load the YUV video file into Python as an RGB NumPy array. I then took the mean of the red, green, and blue values of each frame and laid the results out in sequence, from the first frame on the left to the last frame on the right of the output image.

This worked to create a simple barcode, but the barcode came out a little dark when compared to the video. This is because my code didn’t take into account the nonlinear nature of how the video file is stored. Over the next few posts, I’m going to investigate how video files are stored and look at more accurate ways to calculate average color.

Square Averaging

Let’s start investigating how to take a better color average by reviewing this article by Sighack and the video it references from Minute Physics:

These sources explain that a pixel value stored in an image file is roughly the square root of the actual light level. Because of this nonlinear encoding, averaging the stored values directly gives a result that is darker than the average of the light itself.

To average correctly, you need to square each value before taking the average so that the averaging is done on the light levels rather than on the stored values. To get the average back into a form that can be stored in the image file, we do the inverse operation and take the square root after averaging. We will do this for each of the red, green, and blue channels in an RGB image.

Let’s write that in equation form. Assume each frame is an RGB image with \(n\) total pixels where each pixel \(i\) has the value of \( (R_i, G_i, B_i ) \). The average pixel value \( (R_{ave}, G_{ave}, B_{ave} ) \) of a frame is calculated for each color independently:

\[ \begin{aligned} R_{ave} &= \sqrt{\frac{1}{n}\sum_{i=1}^n R_i^2} \\ G_{ave} &= \sqrt{\frac{1}{n}\sum_{i=1}^n G_i^2} \\ B_{ave} &= \sqrt{\frac{1}{n}\sum_{i=1}^n B_i^2} \end{aligned} \]
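To get a feel for how much the two averages can differ, here is a minimal sketch for a single channel. The two pixel values are made up for illustration (one black pixel, one full-brightness pixel) and are not taken from the video.

```python
import numpy as np

# Two hypothetical pixel values for one channel: one black, one full brightness
channel = np.array([0, 255], dtype=np.uint8)

# Plain mean of the stored values
plain_mean = np.mean(channel)

# Squared mean: convert to float first so squaring does not overflow uint8,
# then square, average, and take the square root
squared_mean = np.sqrt(np.mean(np.square(channel.astype(np.float64))))

print(plain_mean)    # 127.5
print(squared_mean)  # about 180.3
```

The squared mean lands noticeably higher than the plain mean, which matches the expectation that plain averaging of stored values comes out too dark.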

The output barcode image will be num_frames pixels wide, one column per frame. Each column of the barcode image is filled with the average color of the corresponding frame.

Code Outline

The meat of today’s code is very similar to the simple code. I’ll just square the values of every frame, take the average, and then take the sqrt before converting back to an image to save. I’d like to compare the timing of the simple mean to the squared mean so I’m going to run both sets of code today.

I’m also going to use the handy tqdm library to show the progress of my processing loops. You can install it with conda:

conda install tqdm

And here’s the outline of the code:

Import packages
Get video info using ffmpeg’s probe
Convert video to NumPy RGB array using ffmpeg
Process mean barcode:
- Initialize Mean barcode image as NumPy array of all zeros. Width = number of frames. Height is calculated for 16:9 aspect ratio.
- Go frame by frame, calculate the average value of each color with NumPy’s “mean” method
- Set the column of the barcode image to that color
- Convert the barcode image to a Pillow Image object and save to file
- Display processing time
Process squared mean barcode:
- Initialize Squared Mean barcode image. Same dimensions as Mean barcode image.
- Go frame by frame, take square of frame value, then take the average value of each color with NumPy’s “mean” method
- Set the column of the barcode image to that color
- Take square root of completed image, then convert to Pillow Image object and save to file
- Display processing time

Code

Here’s the full code:

import time

import ffmpeg
import numpy as np
from PIL import Image
from tqdm import tqdm


# Start Time
start_time = time.time()

# I renamed the downloaded video file to be simpler to work with
filename = 'triumph.webm'

# Get Video info
probe = ffmpeg.probe(filename)
video_info = next(stream for stream in probe['streams'] if stream['codec_type'] == 'video')
width = int(video_info['width'])
height = int(video_info['height'])

# Use ffmpeg to convert video file to NumPy array
# From https://github.com/kkroening/ffmpeg-python/blob/master/examples/README.md#convert-video-to-numpy-array
out, _ = (
    ffmpeg
    .input(filename)
    .output('pipe:', format='rawvideo', pix_fmt='rgb24')
    .run(capture_stdout=True)
)
video = (
    np
    .frombuffer(out, np.uint8)
    .reshape([-1, height, width, 3])
)

# Read End Time
read_end_time = time.time()
read_time = read_end_time - start_time
print(f'Total read time: {read_time} seconds')


# Initialize barcode image as NP array of proper dimensions initialized to all zeros
# video.shape: [num_frames, height, width, colors]
num_frames = video.shape[0]

# Barcode to have 16x9 aspect ratio
output_height = round(num_frames * 9/16)


# Calculate Mean Image

# np array defined as rows (height), columns (width), colors (3 for Red, Green, Blue)
mean = np.zeros((output_height, num_frames, 3))

# Go frame by frame, calculating the mean value of each color independently
for i_frame in tqdm(range(num_frames)):
    frame = video[i_frame, :, :, :]

    mean[:, i_frame, 0] = np.mean(frame[:, :, 0]) # red
    mean[:, i_frame, 1] = np.mean(frame[:, :, 1]) # green
    mean[:, i_frame, 2] = np.mean(frame[:, :, 2]) # blue

# Convert to Pillow Image and save to file
mean = np.array(mean, dtype=np.uint8)
im_mean = Image.fromarray(mean)
im_mean.save('triumph_mean.png')

# Display Mean Processing time
mean_end_time = time.time()
mean_processing_time = mean_end_time - read_end_time
print(f'Total mean processing time: {mean_processing_time} seconds')


# Calculate Squared Mean Image

# np array defined as rows (height), columns (width), colors (3 for Red, Green, Blue)
mean_sq = np.zeros((output_height, num_frames, 3))

# Go frame by frame, take the squared value, and then calculate the mean value of each color independently
for i_frame in tqdm(range(num_frames)):
    frame = video[i_frame, :, :, :]
    frame_sq = np.array(frame, dtype=np.float64)
    frame_sq = np.square(frame_sq)

    mean_sq[:, i_frame, 0] = np.mean(frame_sq[:, :, 0]) # red
    mean_sq[:, i_frame, 1] = np.mean(frame_sq[:, :, 1]) # green
    mean_sq[:, i_frame, 2] = np.mean(frame_sq[:, :, 2]) # blue

# Take square root, convert to Pillow Image and save to file
mean_sq = np.sqrt(mean_sq)
mean_sq = np.array(mean_sq, dtype=np.uint8)
im_mean_sq = Image.fromarray(mean_sq)
im_mean_sq.save('triumph_mean_sq.png')


# Display Squared Mean Processing time
sq_end_time = time.time()
sq_processing_time = sq_end_time - mean_end_time
print(f'Total Squared Mean processing time: {sq_processing_time} seconds')

# Display total time
total_time = sq_end_time - start_time
print(f'Total time: {total_time} seconds')

Results

Here’s the Squared Mean barcode:

And here’s the Mean barcode to compare:

The Squared Mean barcode is definitely brighter. It’s easier to distinguish changes in color during the darker scenes (take note of the darker scenes about a quarter of the way through the video). The brighter scenes also appear brighter and there appears to be better contrast between the bright and dark scenes. I find this a much more appealing image.

Here’s a screenshot of the console output showing processing time and the tqdm progress bar:

It took about 256 seconds to process the Squared Mean image vs. about 152 seconds for the Mean image. That’s about 1.7 times longer. I’ll definitely need to do some code optimization in the future when moving on to longer videos. It’s also worth noting that FFMPEG takes a not insignificant ~24 seconds to load the video and convert it from YUV to RGB.
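One possible direction for that optimization is to compute a single color per frame with NumPy's axis argument and only stretch the result to the full barcode height at the end. The sketch below is untested against the real video and I have not timed it, so any speedup is an assumption. The function names and the chunk size are hypothetical, and the tiny synthetic array stands in for the real frames.

```python
import numpy as np


def frame_colors(video, squared=False, chunk=64):
    """Return an array shaped [num_frames, 3] with one average color per frame.

    video: uint8 array shaped [num_frames, height, width, 3].
    chunk: hypothetical batch size that limits the size of the float64 copy.
    """
    colors = np.empty((video.shape[0], 3), dtype=np.float64)
    for start in range(0, video.shape[0], chunk):
        block = video[start:start + chunk].astype(np.float64)
        if squared:
            # Square, average over height and width, then undo the square
            colors[start:start + chunk] = np.sqrt(np.mean(np.square(block), axis=(1, 2)))
        else:
            # Plain average over height and width
            colors[start:start + chunk] = np.mean(block, axis=(1, 2))
    return colors


def colors_to_barcode(colors, output_height):
    """Stretch per-frame colors into an image shaped [output_height, num_frames, 3]."""
    row = colors.astype(np.uint8)[np.newaxis, :, :]
    return np.repeat(row, output_height, axis=0)


# Tiny synthetic "video": 4 frames of 2x2 pixels, not the real Triumph data
video = np.zeros((4, 2, 2, 3), dtype=np.uint8)
video[1] = 100
video[2, 0, 0] = 255  # one full-brightness pixel out of four in frame 2

barcode = colors_to_barcode(frame_colors(video, squared=True), output_height=2)
print(barcode.shape)  # (2, 4, 3)
```

The design choice here is to keep the per-frame math on a small [num_frames, 3] array and to build the tall image only once, instead of writing a full column of the output image inside the loop.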

Next Steps

I think this methodology created a more pleasing barcode image of Triumph but there’s more to learn. Pay attention to this note in the video at 2:01: “The actual root used in this process can range between 1.8 and 2.2 and is called the ‘gamma’ value”. Let’s look at our video file and determine what ‘gamma’ value to use next time.
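As a rough preview of how a gamma value could slot into the same calculation, the square and square root can be swapped for a general power and its inverse. This is only a sketch of the idea under that assumption, not necessarily the method the follow-up will use. The function name gamma_mean and the sample pixel values are hypothetical.

```python
import numpy as np


def gamma_mean(channel, gamma=2.0):
    """Raise values to the power gamma, average, then take the gamma-th root."""
    values = np.asarray(channel, dtype=np.float64)
    return np.mean(values ** gamma) ** (1.0 / gamma)


# Same two hypothetical pixel values as a quick check
channel = np.array([0, 255], dtype=np.uint8)
for gamma in (1.0, 1.8, 2.0, 2.2):
    print(gamma, gamma_mean(channel, gamma))
```

With gamma set to 1.0 this reduces to the plain mean, and with gamma set to 2.0 it matches the squared mean used in this post.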

– Teamster Sub
