libvpx vs libaom: Key API Design Differences

This article compares the API design of libvpx (the reference library for the VP8 and VP9 video codecs) with that of libaom (the reference library for the AV1 video codec). Both libraries come from the same development lineage and share the same basic structures, but libaom adds newer API patterns, more configuration parameters, and finer-grained controls to handle the increased complexity of AV1. Understanding these differences is essential for developers migrating video processing pipelines from VP9 to AV1.

Ancestry and Structural Similarity

Because libaom was forked from the libvpx codebase, the two libraries share the same structural foundation. Both are built around a unified codec interface centered on a context structure: vpx_codec_ctx in libvpx and aom_codec_ctx in libaom.

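In both libraries, the basic workflow follows these steps:
1. Populate a configuration structure (vpx_codec_enc_cfg_t vs. aom_codec_enc_cfg_t), typically starting from the defaults returned by vpx_codec_enc_config_default or aom_codec_enc_config_default.
2. Initialize a codec context with that configuration (vpx_codec_enc_init vs. aom_codec_enc_init).
3. Send raw frames for encoding (vpx_codec_encode vs. aom_codec_encode).
4. Retrieve compressed packets via an iterator loop (vpx_codec_get_cx_data vs. aom_codec_get_cx_data).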

Despite this shared foundation, the APIs diverge significantly regarding configuration depth, control enumeration, and modern feature integration.

1. Prefixing and Namespace Separation

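The most immediate difference is the naming convention for functions, data types, and macros. libvpx uses the vpx_ and VPX_ prefixes (vpx_codec_encode, vpx_image_t, VPX_CODEC_OK), while libaom uses aom_ and AOM_ (aom_codec_encode, aom_image_t, AOM_CODEC_OK). This separation allows both libraries to be linked into the same application without symbol collisions.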

2. Expanded Control Enums (Ctrl IDs)

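Both libraries use a generic control call (vpx_codec_control and aom_codec_control) to set or get individual encoder settings, each identified by a control ID. libvpx prefixes its encoder IDs with VP8E_ or VP9E_ (for example VP8E_SET_CPUUSED), and libaom uses AOME_ or AV1E_ (for example AOME_SET_CPUUSED). Because AV1 has many more coding tools than VP9, libaom defines a much larger set of IDs, including switches for individual tools such as AV1E_SET_ENABLE_CDEF.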

3. Configuration Profiles and Key-Value API

In libvpx, settings are primarily configured by direct manipulation of the vpx_codec_enc_cfg_t struct, followed by vpx_codec_control calls for fine-tuning.

libaom retains this struct-based configuration but adds a string-based key-value helper, aom_codec_set_option, which takes an option name and its value as strings. This makes it easier to pass command-line-style options to the library programmatically, without mapping every option to a hardcoded struct member or control ID in the host application.
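As a minimal sketch of how a host application might feed command-line-style options into that key-value API, the helper below splits a "name=value" token into its two halves. The name split_option is hypothetical and not part of libaom. The library call it would feed, aom_codec_set_option, appears only in a comment so that the snippet compiles without libaom installed, and the option name shown there is assumed to match the corresponding aomenc flag.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Split "name=value" in place (hypothetical helper, not a libaom function).
 * On success, *name and *value point into arg and 0 is returned.
 * Returns -1 if there is no '=' or the name is empty.
 */
int split_option(char *arg, const char **name, const char **value) {
    char *eq;

    assert(arg != NULL && name != NULL && value != NULL);

    eq = strchr(arg, '=');
    if (eq == NULL || eq == arg) {
        return -1;
    }
    *eq = '\0';        /* terminate the name half */
    *name = arg;
    *value = eq + 1;   /* may be an empty string */
    return 0;

    /*
     * With libaom headers available, the caller could then forward the pair:
     *   aom_codec_set_option(&ctx, name, value);   e.g. "cpu-used", "6"
     * and check the returned aom_codec_err_t for unknown names or bad values.
     */
}
```

Splitting in place avoids allocation, at the cost of modifying the caller's buffer, so the input must be writable (not a string literal).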

4. Multi-Threading and Tiling Controls

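While both codecs support multi-threading, the APIs expose parallel processing differently because of the architectural differences between VP9 and AV1. In both libraries the thread count is set through the g_threads field of the encoder configuration struct, while tile layout is set through control IDs: VP9E_SET_TILE_COLUMNS and VP9E_SET_TILE_ROWS in libvpx, and AV1E_SET_TILE_COLUMNS and AV1E_SET_TILE_ROWS in libaom.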
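One detail that is easy to miss when configuring tiles is that the tile column controls in both libraries (VP9E_SET_TILE_COLUMNS and AV1E_SET_TILE_COLUMNS) take the tile count in log2 units rather than as a plain count. The sketch below shows one way to do that conversion. The helper name tile_count_to_log2 is hypothetical, the upper bound is left to the caller and should be taken from the documentation of the specific control, and the control call itself appears only in a comment so the snippet compiles without either library.

```c
#include <assert.h>

/*
 * Return the smallest k such that (1 << k) >= tiles, clamped to max_log2.
 * Hypothetical helper, not part of libvpx or libaom.
 */
unsigned tile_count_to_log2(unsigned tiles, unsigned max_log2) {
    unsigned k = 0;

    assert(tiles >= 1);

    while (k < max_log2 && (1u << k) < tiles) {
        k++;
    }
    return k;

    /*
     * With libaom headers available, the result could be passed on as:
     *   aom_codec_control(&ctx, AV1E_SET_TILE_COLUMNS, tile_count_to_log2(4, 6));
     * which requests 4 tile columns (log2 value 2).
     */
}
```

Requests that are not a power of two round up, so asking for 5 tiles yields a log2 value of 3 (8 tiles). The encoder may still use fewer tiles than requested if the frame is too small.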

5. Metadata and High Dynamic Range (HDR) Handling

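Handling of modern video metadata is more tightly integrated into the libaom API than into libvpx. For example, libaom lets an application attach metadata such as HDR content light level or mastering display information to an individual image through aom_img_add_metadata.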
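To make the metadata path more concrete, the sketch below builds the payload for HDR content light level metadata, which could then be attached to a frame with libaom's aom_img_add_metadata. The helper name pack_hdr_cll is hypothetical. The byte layout (MaxCLL followed by MaxFALL, each a 16-bit big-endian value) is an assumption based on the metadata_hdr_cll syntax in the AV1 specification, and the library call appears only in a comment so the snippet compiles without libaom.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Pack MaxCLL and MaxFALL (both in cd/m^2) into a 4-byte big-endian payload.
 * Hypothetical helper, not part of libaom. Returns the payload size in bytes.
 */
size_t pack_hdr_cll(uint16_t max_cll, uint16_t max_fall, uint8_t out[4]) {
    assert(out != NULL);

    out[0] = (uint8_t)(max_cll >> 8);     /* MaxCLL, high byte  */
    out[1] = (uint8_t)(max_cll & 0xFF);   /* MaxCLL, low byte   */
    out[2] = (uint8_t)(max_fall >> 8);    /* MaxFALL, high byte */
    out[3] = (uint8_t)(max_fall & 0xFF);  /* MaxFALL, low byte  */
    return 4;

    /*
     * With libaom headers available, the payload could be attached as:
     *   aom_img_add_metadata(img, OBU_METADATA_TYPE_HDR_CLL,
     *                        out, 4, AOM_MIF_ANY_FRAME);
     * where img is the aom_image_t about to be passed to aom_codec_encode.
     */
}
```

For MaxCLL = 1000 and MaxFALL = 400, the resulting bytes are 0x03 0xE8 0x01 0x90.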