How libvpx Handles Multi-Threading in Video Encoding
This article explains how the libvpx library manages multi-threading to accelerate VP8 and VP9 video encoding. It covers the core parallel processing techniques used by the library—such as tile-based threading, row-based multi-threading (row-mt), and token partitioning—and provides guidance on how to configure these settings to maximize CPU utilization and encoding speed.
Multi-Threading in VP8 vs. VP9
The libvpx library supports both the VP8 and VP9 video codecs, but handles multi-threading differently for each due to architectural advancements in the VP9 standard.
In VP8, multi-threading is relatively limited. It primarily relies on token partitioning, which splits the entropy coding (the final stage of encoding) into up to eight parallel partitions. While this offers some speedup, it does not allow the heavy motion estimation and intra-prediction steps to be fully parallelized across many CPU cores.
In VP9, libvpx introduces highly efficient multi-threading mechanisms designed for modern multi-core processors. These mechanisms include tile-based parallel processing and row-based multi-threading.
Tile-Based Multi-Threading (VP9)
VP9 allows video frames to be divided into a grid of independent rectangular regions called tiles.
- Independent Encoding: Because the pixels and motion vectors in one tile do not depend on neighboring tiles for prediction, each tile can be encoded completely independently.
- Thread Allocation: libvpx can assign a separate CPU thread to process each tile simultaneously.
- Control Parameters: The number of tiles is controlled by the --tile-columns and --tile-rows parameters. For example, setting --tile-columns=2 creates \(2^2 = 4\) vertical tile columns, allowing up to four threads to run in parallel.
While tiling significantly improves encoding speed, it comes with a minor trade-off: because prediction is disabled across tile boundaries, compression efficiency (video quality per bitrate) can slightly decrease as the number of tiles increases.
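The relationship between the tile-columns value and the resulting column count can be sketched in a few lines of Python. The helpers below are hypothetical and not part of libvpx. They assume the VP9 constraint that a tile column must be at least 256 pixels (four 64x64 superblocks) wide, so a requested value is clamped on narrow frames. The sketch ignores the separate upper bound on tile width.

```python
SUPERBLOCK_SIZE = 64    # VP9 superblock width in pixels
MIN_TILE_WIDTH_SB = 4   # assumed minimum tile width: 4 superblocks = 256 px


def max_log2_tile_columns(frame_width: int) -> int:
    """Largest log2 value that keeps every tile column at least 256 px wide."""
    # Number of 64-pixel superblock columns, rounded up.
    sb_cols = (frame_width + SUPERBLOCK_SIZE - 1) // SUPERBLOCK_SIZE
    max_log2 = 1
    while (sb_cols >> max_log2) >= MIN_TILE_WIDTH_SB:
        max_log2 += 1
    return max_log2 - 1


def effective_tile_columns(frame_width: int, tile_columns_log2: int) -> int:
    """Tile-column count after clamping the requested log2 value."""
    clamped = min(tile_columns_log2, max_log2_tile_columns(frame_width))
    return 1 << clamped  # 2 ** clamped
```

Under these assumptions, effective_tile_columns(1920, 2) returns 4, and asking for more columns on the same 1920-pixel-wide frame still yields 4, because narrower tiles would fall below the minimum width.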
Row-Based Multi-Threading (row-mt)
To overcome the quality loss associated with high tile counts, libvpx introduced Row-Based Multi-Threading (--row-mt=1). This is the most efficient way to utilize modern processors with high core counts in VP9.
- Pipelined Execution: Instead of waiting for an entire tile to finish, row-mt allows different threads to work on different rows of blocks (superblocks) within a single tile or frame at the same time.
- Dependency Management: A thread encoding row \(N\) can begin as soon as the thread encoding row \(N-1\) is a few blocks ahead. This maintains the spatial dependencies required for high-quality intra-prediction while still distributing the workload across multiple CPU cores.
- Benefits: Enabling row-mt dramatically increases CPU utilization and encoding speed with virtually no impact on compression efficiency or visual quality.
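The row dependency described above can be illustrated with a toy wavefront model. This is an idealized sketch, not libvpx's actual scheduler: it assumes every superblock costs one time unit, that one thread is available per row, and that a block may start once the row above is `lag` blocks ahead (the `lag` parameter and function name are hypothetical).

```python
from typing import List


def wavefront_schedule(rows: int, cols: int, lag: int = 1) -> List[List[int]]:
    """Return finish[r][c], the time step at which block (r, c) completes."""
    finish = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # A block waits for its left neighbor in the same row...
            left = finish[r][c - 1] if c > 0 else 0
            # ...and for the row above to be `lag` blocks ahead
            # (clamped at the last column of the row).
            above = finish[r - 1][min(c + lag, cols - 1)] if r > 0 else 0
            finish[r][c] = max(left, above) + 1
    return finish


schedule = wavefront_schedule(rows=4, cols=8, lag=1)
parallel_time = schedule[-1][-1]  # completion time of the last block
serial_time = 4 * 8               # one thread processing every block in order
```

In this model the 4x8 grid finishes in 14 time units instead of 32, because each row starts only two units after the row above it rather than waiting for that row to complete.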
Key libvpx Threading Parameters
To optimize libvpx multi-threading via command-line tools like FFmpeg, three primary parameters are used:
- threads: Sets the maximum number of threads the encoder is allowed to use.
- tile-columns: Specifies the log2 of the number of tile columns (e.g., a value of 2 results in 4 columns).
- row-mt: A boolean switch (0 or 1) that enables or disables row-based multi-threading.
For optimal performance on a modern multi-core system, it is recommended to enable row-mt and set the thread count to match your CPU’s logical core count.
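As a minimal sketch of this recommendation, the following Python helper assembles an FFmpeg argument list for the libvpx-vp9 encoder. The function name, file names, and default tile-columns value are placeholders chosen for illustration. The sketch assumes an FFmpeg build with libvpx-vp9 support and only builds the command without running it.

```python
import os
from typing import List, Optional


def build_vp9_ffmpeg_args(src: str, dst: str, tile_columns_log2: int = 2,
                          threads: Optional[int] = None) -> List[str]:
    """Build an FFmpeg command that enables row-mt and sets the thread count."""
    if threads is None:
        # Default to the logical core count; fall back to 1 if it is unknown.
        threads = os.cpu_count() or 1
    return [
        "ffmpeg", "-i", src,
        "-c:v", "libvpx-vp9",
        "-threads", str(threads),
        "-tile-columns", str(tile_columns_log2),  # log2 of the column count
        "-row-mt", "1",
        dst,
    ]


args = build_vp9_ffmpeg_args("input.mp4", "output.webm", threads=8)
```

The resulting list can be passed to subprocess.run(args, check=True) to start the encode.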