What is the auto-alt-ref Parameter in libvpx?
This article explains the function of the auto-alt-ref
parameter in the libvpx video encoder, which is used for VP8 and VP9
video compression. You will learn how this setting enables “alternate
reference frames” to improve video quality and compression efficiency,
how it works under the hood, and the performance trade-offs associated
with enabling it.
In libvpx, the auto-alt-ref parameter controls the
creation of alternate reference (alt-ref) frames. When enabled, this
parameter allows the encoder to generate hidden frames that are never
displayed to the viewer and serve purely as reference images from which
other, visible frames are predicted. By using these
high-quality, non-displayed reference frames, libvpx can often
reduce the bitrate needed for a given level of visual quality.
To create an alt-ref frame, the encoder uses a process called
temporal filtering. It analyzes a sequence of future video frames (which
requires enabling the lookahead buffer through the
lag-in-frames parameter) and blends them together to remove
temporal noise and grain. The resulting “clean” frame is stored in one of
the encoder’s reference buffers. Subsequent visible frames can then
reference this clean image to encode static backgrounds and slow-moving
objects more efficiently than they could by predicting only from
previously displayed frames. VP8 and VP9 do not have the B-frames found
in codecs such as H.264; the alt-ref frame fills a similar role by
supplying a reference built from future content.
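The blending idea can be illustrated with a minimal sketch. It is not libvpx's actual filter, which is motion-compensated and weights each pixel by how well it matches. It simply averages co-located pixels across a window of frames to show why the result is cleaner than any single input. The helper names (`temporal_filter`, `make_noisy_frames`, `mean_abs_error`) and the synthetic 8x8 test frames are assumptions made for illustration.

```python
import random


def temporal_filter(frames):
    """Average co-located pixels across frames (no motion compensation)."""
    height = len(frames[0])
    width = len(frames[0][0])
    count = len(frames)
    return [
        [sum(frame[y][x] for frame in frames) / count for x in range(width)]
        for y in range(height)
    ]


def mean_abs_error(frame, reference):
    """Mean absolute per-pixel difference between two equally sized frames."""
    total = 0.0
    pixels = 0
    for row_f, row_r in zip(frame, reference):
        for a, b in zip(row_f, row_r):
            total += abs(a - b)
            pixels += 1
    return total / pixels


def make_noisy_frames(clean, count, noise, seed=0):
    """Return `count` copies of `clean`, each with independent uniform noise."""
    rng = random.Random(seed)
    return [
        [[p + rng.uniform(-noise, noise) for p in row] for row in clean]
        for _ in range(count)
    ]


# A static grey scene observed through seven noisy "future" frames.
clean = [[100.0] * 8 for _ in range(8)]
noisy = make_noisy_frames(clean, count=7, noise=10.0)
filtered = temporal_filter(noisy)

print("single frame error:", mean_abs_error(noisy[0], clean))
print("filtered frame error:", mean_abs_error(filtered, clean))
```

Because the noise in each frame is independent while the scene is static, averaging cancels much of it, so the filtered frame sits closer to the clean scene than any one input does. With real footage, plain averaging would blur moving objects, which is why a production encoder aligns blocks before blending.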
The auto-alt-ref parameter typically accepts the
following values:

* 0 (Disabled): The encoder will not generate alternate reference frames. This reduces encoding time and CPU usage but results in lower compression efficiency and larger file sizes at a given quality.
* 1 (Enabled): The encoder automatically determines when to insert alt-ref frames. This is the standard setting for high-quality, efficient encodes.
* 2 or higher (VP9 only): Enables a multi-layered alt-ref frame structure, which can further improve compression efficiency at the cost of additional encoding complexity.
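As a rough illustration of how these values are passed in practice, the sketch below builds the argument lists for a two-pass VP9 encode through ffmpeg's libvpx-vp9 wrapper, which exposes the setting as `-auto-alt-ref` alongside `-lag-in-frames`. The function name `two_pass_vp9_commands`, the file names, the 1M bitrate, and the lookahead of 25 frames are illustrative assumptions rather than recommended values. The check that rejects alt-ref without lookahead is this sketch's own guard, reflecting the dependency described above. The commands are only printed, not executed.

```python
import shlex


def two_pass_vp9_commands(src, dst, bitrate="1M", auto_alt_ref=1, lag_in_frames=25):
    """Build ffmpeg argument lists for a two-pass libvpx-vp9 encode."""
    if auto_alt_ref > 0 and lag_in_frames == 0:
        # Alt-ref frames are built from future frames, so lookahead is needed.
        raise ValueError("auto-alt-ref needs lag-in-frames greater than 0")

    common = [
        "ffmpeg", "-y", "-i", src,
        "-c:v", "libvpx-vp9",
        "-b:v", bitrate,
        "-auto-alt-ref", str(auto_alt_ref),
        "-lag-in-frames", str(lag_in_frames),
    ]
    # Pass 1 only gathers statistics; audio and the output file are discarded.
    # "/dev/null" assumes a Unix-like system.
    first = common + ["-pass", "1", "-an", "-f", "null", "/dev/null"]
    second = common + ["-pass", "2", dst]
    return first, second


for command in two_pass_vp9_commands("input.mp4", "output.webm"):
    print(shlex.join(command))
```

Keeping the commands as lists rather than a single string avoids shell-quoting problems if they are later handed to `subprocess.run`.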
While enabling auto-alt-ref is highly recommended for
standard 2-pass encoding, it does have some trade-offs. It increases CPU
utilization and encoding times due to the complex temporal filtering and
lookahead analysis required. Additionally, because it relies on
analyzing future frames, it introduces latency, making it unsuitable for
real-time streaming or low-latency video conferencing, where
lag-in-frames must be set to zero.
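The latency cost can be estimated with simple arithmetic: the encoder cannot emit a frame until it has buffered the lookahead window, so the added delay is at least the lookahead length divided by the frame rate. The sketch below works through that under the assumption that lookahead buffering is the only source of delay; actual encoder latency also includes processing time. The function name `lookahead_delay_seconds` and the example figures are hypothetical.

```python
def lookahead_delay_seconds(lag_in_frames, fps):
    """Minimum buffering delay implied by the lookahead window."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return lag_in_frames / fps


# A 25-frame lookahead at 30 fps versus a low-latency configuration.
offline_delay = lookahead_delay_seconds(25, 30)
realtime_delay = lookahead_delay_seconds(0, 30)

print(f"lag-in-frames=25 at 30 fps: at least {offline_delay:.2f} s of delay")
print(f"lag-in-frames=0 at 30 fps: {realtime_delay:.2f} s of lookahead delay")
```

A 25-frame window at 30 fps already costs roughly 0.83 seconds before any encoding work is counted, which is far more than an interactive call can tolerate.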