How libvpx Performs Entropy Coding with Boolean Coder

This article explores how the libvpx codec library—the reference implementation for the VP8 and VP9 video formats—utilizes a Boolean arithmetic coder for entropy coding. We will examine the mechanics of this binary coder, including how it represents probabilities, divides the coding interval, performs renormalization, and processes non-binary syntax elements through binarization.

The Role of the Boolean Coder in libvpx

Entropy coding is the final stage in the video compression pipeline, responsible for losslessly compressing syntax elements such as motion vectors, transform coefficients, and macroblock modes. Instead of using traditional Huffman coding, libvpx uses a customized binary arithmetic coder known as the Boolean coder.

This coder operates exclusively on binary decisions (bits, or “bins”) rather than multi-symbol alphabets. By restricting the input to boolean values (0 or 1), the coder significantly simplifies the mathematical calculations required for arithmetic coding, replacing complex divisions with fast bitwise shifts and additions.

Probability Representation and Contexts

The Boolean coder in libvpx relies on state-based probability models to predict the likelihood of a 0 or a 1.

8-bit Probabilities: Probabilities are represented as 8-bit unsigned integers ranging from 1 to 255. The value represents the probability of the symbol being 0, scaled to a denominator of 256. For example, a probability value of 128 represents a 50/50 chance (\(128/256\)).
Context Modeling: To maximize compression, the coder uses context-adaptive modeling. The probability of a bin is determined by the values of previously coded, neighboring syntax elements.
Static vs. Dynamic Updates: Unlike some arithmetic coders that update probability states after every single coded bit, libvpx (particularly in VP8) often keeps probabilities static throughout a frame or updates them at designated intervals. This design choice prevents serialization bottlenecks and allows for highly parallelized decoding.

The Encoding Algorithm

The Boolean coder maintains two primary registers during the encoding process: 1. range: Represents the width of the current coding interval (initialized to 255). 2. low (or value in the decoder): Represents the bottom of the current interval.

For every bin processed, the coder performs the following steps:

1. Interval Subdivision

The coder splits the current range into two sub-intervals based on the probability (\(P\)) of the current bin being 0. The split point is calculated using integer math:

\[\text{split} = 1 + \left( \frac{(\text{range} - 1) \times P}{256} \right)\]

In practice, this is implemented using a fast bitwise shift: split = 1 + (((range - 1) * P) >> 8).

2. Range and Low Update

Depending on whether the actual bin being encoded is 0 or 1, the registers are updated: * If the bin is 0: The new range becomes split. The value of low remains unchanged. * If the bin is 1: The new range becomes range - split, and low is incremented by split.

3. Renormalization

As symbols are encoded, the range progressively shrinks. To prevent numerical underflow and maintain precision, the registers must be renormalized whenever the range falls below 128 (the MSB of the 8-bit register becomes 0).

During renormalization: * The range is shifted left by 1 bit (range <<= 1). * The low register is shifted left by 1 bit (low <<= 1). * As low shifts, bits are output to the final compressed bitstream.

Binarization of Non-Binary Syntax Elements

Because the Boolean coder only processes binary inputs, any syntax element that can take on more than two values must undergo binarization before entropy coding.

libvpx converts multi-symbol values into a series of binary decisions using tree structures (such as Huffman-like binary trees) or unary coding. The encoder then passes each branch of the tree as a individual bin to the Boolean coder, using a unique probability context for each node of the decision tree. This allows the coder to maintain high compression efficiency even when dealing with complex, multi-valued data like transform coefficients.