libvpx Color Space Conversion Before Encoding

This article explains how the libvpx codec library, which handles VP8 and VP9 video encoding, manages color space conversion before the compression process begins. It covers the library’s reliance on external conversion tools, its supported input pixel formats, and how it handles color space metadata to ensure accurate color reproduction during decoding.

External Pre-Processing Requirement

The libvpx library itself does not perform raw color space conversions, such as transforming RGB data from a camera or graphics engine into the YUV format required for compression. Instead, it expects the input frames to already be in a compatible YUV pixel format.

Developers must use external libraries to handle the initial color space conversion before passing the video frames to the encoder. The most common libraries used for this purpose are:

libyuv: A Google-maintained library specifically optimized for scaling and color space conversion (e.g., RGB to YUV).
libswscale (FFmpeg): A widely used software scaling and conversion library that converts various RGB and YUV formats into the exact planar format libvpx requires.

Supported Input Pixel Formats

Once the video data is converted to YUV, it is wrapped in the vpx_image_t structure and passed to the encoder. The libvpx library natively supports several planar YUV formats, which vary depending on whether you are using VP8 or VP9:

VP8: Primarily supports 8-bit YUV 4:2:0 (VPX_IMG_FMT_I420).
VP9: Offers broader support, including 8-bit, 10-bit, and 12-bit depths, as well as various chroma subsampling formats such as YUV 4:2:0, YUV 4:2:2 (VPX_IMG_FMT_I422), and YUV 4:4:4 (VPX_IMG_FMT_I444).

Color Space Metadata Signaling

While libvpx does not alter the pixel values to change color spaces, it is responsible for signaling the correct color space metadata in the encoded bitstream. This metadata tells the downstream decoder which color coefficients (such as BT.601, BT.709, or BT.2020) to use when converting the YUV video back to RGB for display.

In the libvpx API, this is managed via the vpx_codec_enc_cfg_t configuration structure or through frame-level extra data.

VP8 has limited color space signaling, defaulting almost exclusively to the BT.601 color standard.
VP9 supports the vpx_color_space_t enum, allowing developers to explicitly tag the video with color profiles including BT.601, BT.709 (standard HD), BT.2020 (Wide Color Gamut/HDR), and SMPTE-170.

By ensuring the input YUV format matches the encoder’s expectations and correctly configuring the color space metadata, developers can achieve accurate color representation throughout the encoding pipeline.