libvpx Color Space Conversion Before Encoding
This article explains how the libvpx codec library,
which handles VP8 and VP9 video encoding, manages color space conversion
before the compression process begins. It covers the library’s reliance
on external conversion tools, its supported input pixel formats, and how
it handles color space metadata to ensure accurate color reproduction
during decoding.
External Pre-Processing Requirement
The libvpx library itself does not perform raw color
space conversions, such as transforming RGB data from a camera or
graphics engine into the YUV format required for compression. Instead,
it expects the input frames to already be in a compatible YUV pixel
format.
Developers must use external libraries to handle the initial color space conversion before passing the video frames to the encoder. The most common libraries used for this purpose are:
- libyuv: A Google-maintained library specifically optimized for scaling and color space conversion (e.g., RGB to YUV).
- libswscale (FFmpeg): A widely used software scaling and conversion library that converts various RGB and YUV formats into the exact planar format libvpx requires.
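To make the pre-processing step concrete, here is a minimal sketch of the kind of work these libraries do. It is not how libyuv or libswscale are implemented; it assumes packed 24-bit RGB input, even frame dimensions, tightly packed output planes, and the widely used fixed-point approximation of the BT.601 limited-range matrix. The helper names clamp_u8 and rgb24_to_i420_bt601 are hypothetical. In production code, an optimized routine such as libyuv's ARGBToI420 or libswscale's sws_scale would normally be the better choice.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp an intermediate value to the 8-bit range. */
static uint8_t clamp_u8(int v) {
    return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Hypothetical helper: convert packed 24-bit RGB to planar I420 using a
 * fixed-point approximation of the BT.601 limited-range matrix.
 * Assumes even width/height and tightly packed planes (stride == width). */
static void rgb24_to_i420_bt601(const uint8_t *rgb, int width, int height,
                                uint8_t *y_plane, uint8_t *u_plane,
                                uint8_t *v_plane) {
    assert(width % 2 == 0 && height % 2 == 0);
    const int chroma_w = width / 2;
    for (int row = 0; row < height; row += 2) {
        for (int col = 0; col < width; col += 2) {
            int sum_r = 0, sum_g = 0, sum_b = 0;
            /* Visit the 2x2 block that shares one U and one V sample. */
            for (int dy = 0; dy < 2; ++dy) {
                for (int dx = 0; dx < 2; ++dx) {
                    const int x = col + dx, y = row + dy;
                    const uint8_t *p = rgb + 3 * (y * width + x);
                    const int r = p[0], g = p[1], b = p[2];
                    y_plane[y * width + x] = clamp_u8(
                        ((66 * r + 129 * g + 25 * b + 128) >> 8) + 16);
                    sum_r += r;
                    sum_g += g;
                    sum_b += b;
                }
            }
            /* Average the block (with rounding) before computing chroma. */
            const int r = (sum_r + 2) / 4;
            const int g = (sum_g + 2) / 4;
            const int b = (sum_b + 2) / 4;
            const int c = (row / 2) * chroma_w + col / 2;
            /* The +32768 folds the 128 chroma offset into the sum so the
             * value being shifted is never negative. */
            u_plane[c] = clamp_u8((-38 * r - 74 * g + 112 * b + 128 + 32768) >> 8);
            v_plane[c] = clamp_u8((112 * r - 94 * g - 18 * b + 128 + 32768) >> 8);
        }
    }
}
```

The three output planes are what an encoder expecting 8-bit 4:2:0 input consumes. Averaging each 2x2 block is only one possible chroma downsampling filter; real converters may use different filters and chroma sample positions.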
Supported Input Pixel Formats
Once the video data is converted to YUV, it is wrapped in the
vpx_image_t structure and passed to the encoder. The
libvpx library natively supports several planar YUV
formats, which vary depending on whether you are using VP8 or VP9:
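- VP8: Primarily supports 8-bit YUV 4:2:0 (VPX_IMG_FMT_I420).
- VP9: Offers broader support, including 8-bit, 10-bit, and 12-bit depths, as well as various chroma subsampling formats such as YUV 4:2:0, YUV 4:2:2 (VPX_IMG_FMT_I422), and YUV 4:4:4 (VPX_IMG_FMT_I444). The 10-bit and 12-bit depths use the 16-bit-per-sample format variants (for example, VPX_IMG_FMT_I42016) and require a libvpx build with high bit depth support enabled.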
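The practical difference between these formats is how much chroma data accompanies each luma plane. The sketch below estimates the buffer size for each layout. It assumes tightly packed planes with no stride padding and two bytes per sample for depths above 8 bits. The chroma_layout enum and planar_yuv_bytes function are hypothetical names for illustration, not libvpx identifiers; in real code, vpx_img_alloc or vpx_img_wrap computes the actual plane layout, including any alignment padding.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout tags; these are NOT libvpx identifiers. */
typedef enum { LAYOUT_420, LAYOUT_422, LAYOUT_444 } chroma_layout;

/* Bytes needed for tightly packed Y, U and V planes (no stride padding).
 * Samples deeper than 8 bits are assumed to occupy two bytes each. */
static size_t planar_yuv_bytes(chroma_layout layout, unsigned width,
                               unsigned height, unsigned bit_depth) {
    assert(bit_depth == 8 || bit_depth == 10 || bit_depth == 12);
    const size_t bytes_per_sample = bit_depth > 8 ? 2 : 1;
    /* Chroma is halved horizontally for 4:2:0 and 4:2:2, and vertically
     * for 4:2:0 only; odd dimensions round up. */
    const size_t chroma_w =
        layout == LAYOUT_444 ? width : ((size_t)width + 1) / 2;
    const size_t chroma_h =
        layout == LAYOUT_420 ? ((size_t)height + 1) / 2 : height;
    const size_t luma = (size_t)width * height;
    return (luma + 2 * chroma_w * chroma_h) * bytes_per_sample;
}
```

Under these assumptions, a 4:2:0 frame needs 1.5 samples per pixel, a 4:2:2 frame needs 2, and a 4:4:4 frame needs 3, so a 10-bit 4:2:0 frame occupies the same number of bytes as an 8-bit 4:4:4 frame of the same dimensions.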
Color Space Metadata Signaling
While libvpx does not alter the pixel values to change
color spaces, it is responsible for signaling the correct color space
metadata in the encoded bitstream. This metadata tells the downstream
decoder which color coefficients (such as BT.601, BT.709, or BT.2020) to
use when converting the YUV video back to RGB for display.
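To illustrate why this signaling matters, the following sketch decodes one limited-range 8-bit YUV sample back to RGB with a selectable set of luma coefficients. It is an illustrative model of the decoder-side math, not code taken from libvpx or any particular player. The luma_coeffs type and ycbcr_to_rgb function are hypothetical names; the Kr and Kb values are the ones published in the respective ITU-R recommendations.

```c
#include <assert.h>

/* Hypothetical type holding the red and blue luma coefficients. */
typedef struct { double kr, kb; } luma_coeffs;

static const luma_coeffs BT601  = { 0.299,  0.114  };
static const luma_coeffs BT709  = { 0.2126, 0.0722 };
static const luma_coeffs BT2020 = { 0.2627, 0.0593 };

/* Convert one limited-range 8-bit Y'CbCr sample to R'G'B' in nominal
 * [0, 1]. Results are not clamped, so out-of-gamut values stay visible. */
static void ycbcr_to_rgb(luma_coeffs c, int y, int cb, int cr,
                         double rgb[3]) {
    assert(y >= 0 && y <= 255 && cb >= 0 && cb <= 255 && cr >= 0 && cr <= 255);
    const double kg = 1.0 - c.kr - c.kb;
    const double yn = (y - 16) / 219.0;    /* luma scaled to 0..1 */
    const double pb = (cb - 128) / 224.0;  /* chroma scaled to -0.5..0.5 */
    const double pr = (cr - 128) / 224.0;
    rgb[0] = yn + 2.0 * (1.0 - c.kr) * pr;              /* R' */
    rgb[2] = yn + 2.0 * (1.0 - c.kb) * pb;              /* B' */
    rgb[1] = (yn - c.kr * rgb[0] - c.kb * rgb[2]) / kg; /* G' */
}
```

For example, the sample (Y=82, U=90, V=240), which is approximately pure red when encoded with BT.601, decodes to nearly (1, 0, 0) with the BT.601 coefficients but picks up roughly a tenth of full-scale green when decoded with the BT.709 coefficients. Neutral grays are unaffected, which is why a mismatched tag tends to show up as a subtle hue and saturation shift rather than an obvious failure.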
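In the libvpx API, this is set on the VP9 encoder through vpx_codec_control() using the VP9E_SET_COLOR_SPACE codec control, with VP9E_SET_COLOR_RANGE selecting limited or full range; it is not a field of the vpx_codec_enc_cfg_t configuration structure. On the decoding side, the signaled values are reported in the cs and range fields of each vpx_image_t.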
- VP8 has very limited color space signaling: its key frame header carries a single color space bit, and the only defined value indicates YUV similar to the BT.601 standard.
- VP9 supports the vpx_color_space_t enum, allowing developers to explicitly tag the video with color spaces including BT.601, BT.709 (standard HD), BT.2020 (wide color gamut, used for HDR), SMPTE 170M, SMPTE 240M, and sRGB.
By ensuring the input YUV format matches the encoder’s expectations and correctly configuring the color space metadata, developers can achieve accurate color representation throughout the encoding pipeline.