MPEG-H Contribution & Distribution Audio Encoding for TV Broadcast & OTT Streaming
MPEG-H Audio is a Next Generation Audio (NGA) codec enabling OTT streaming providers and TV broadcasters to make use of Fraunhofer's industry-leading MPEG-H 3D Audio technology in their product, workflow or service. It delivers personalized immersive sound that offers an unprecedented user experience for both live and VOD use-cases.

Compared to traditional audio formats with rigid stereo or surround mixes, the immersive MPEG-H codec handles the individual sound elements (e.g. dialog, commentary, music, sound effects) as separate audio objects. Each object is accompanied by metadata that defines what the audio element represents, its location, when it is active, how it should be rendered on different playback devices, what user interactivity is permitted, and its loudness characteristics.
MPEG-H technology provides more realism through sound from above as well as around the listener. With its unique personalization features, MPEG-H Audio offers consumers greater flexibility to actively engage with content and adapt it to their own preferences.
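As a rough illustration of the per-object metadata described above, the sketch below models an audio object and the interactivity limits set by the author. It is not the MPEG-H metadata syntax or any MainConcept API: the `AudioObject` class, its field names and the `apply_user_gain` helper are hypothetical and chosen only to mirror the properties listed in the text.

```python
from dataclasses import dataclass


@dataclass
class AudioObject:
    """Hypothetical per-object metadata record (field names are illustrative)."""
    label: str                # what the element represents, e.g. "dialog"
    azimuth_deg: float        # horizontal position around the listener
    elevation_deg: float      # vertical position (sound from above)
    active: bool              # whether the object is currently active
    loudness_lufs: float      # loudness of the element
    min_gain_db: float = 0.0  # lowest gain change the author permits
    max_gain_db: float = 0.0  # highest gain change; 0/0 means not adjustable


def apply_user_gain(obj: AudioObject, requested_db: float) -> float:
    """Clamp a listener's gain request to the range the author permitted."""
    return max(obj.min_gain_db, min(obj.max_gain_db, requested_db))


# Dialog may be adjusted between -6 dB and +9 dB; music is not adjustable.
dialog = AudioObject("dialog", 0.0, 0.0, True, -23.0, min_gain_db=-6.0, max_gain_db=9.0)
music = AudioObject("music", 30.0, 0.0, True, -27.0)

print(apply_user_gain(dialog, 12.0))  # 9.0
print(apply_user_gain(music, 3.0))    # 0.0
```

The point of the sketch is that the listener's request is always bounded by limits authored alongside the audio, so personalization never goes beyond what the content provider allows.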
Sample on-prem workflow
The standard MPEG-H production workflow using on-premises servers is widely used today. However, streamers and broadcasters are increasingly shifting parts of their production workflow to the cloud for greater scalability and flexibility. In that case, the source video and audio, including metadata, must be transferred and processed differently so that the content still reaches the consumer correctly as a fully immersive and interactive experience.
Sample cloud/hybrid workflow
Product Highlights
- MPEG-H Contribution and Emission modes
- Input as PCM with Control Track (e.g. over SDI), Production Metadata (PMD) or Serialized Audio Definition Model (S-ADM)
- Full compliance with MPEG-H 3D Audio Baseline and Low-Complexity Profiles as specified in ISO/IEC 23008-3
- Automatic Fallback mode switch in case of Control Track loss or interruption
- Support for input and output MHAS byte stream
- RAP (Random Access Point) on demand support
- Audio sample rate 48 kHz
- Support for many CICP (Coding Independent Code Points) defined loudspeaker & channel configurations
- Deployment on on-premise hardware or in the cloud
- Suitable for live or VOD workflows
Related Products
SDKs: AVC/H.264, HEVC/H.265, VVC/H.266, OTT Content Creation
Applications: FFmpeg Plugins
Contribution & Emission encoding
The MainConcept MPEG-H Encoder can be used in either Contribution or Emission mode for distribution.
| Feature | Contribution Encoder | Emission Encoder |
| --- | --- | --- |
| Supported sample rates (Hz) | 48000 | 48000 |
| Input data endianness | Little endian | Little endian |
| Input data formats | 8-bit unsigned, 16-bit signed and 24-bit signed integers; 32-bit floating point | 8-bit unsigned, 16-bit signed and 24-bit signed integers; 32-bit floating point |
| Maximum channel count | 16 (including metadata track) | Up to 64 (including metadata tracks) |
| Metadata format | Control Track (interleaved with audio) | Control Track, PMD, S-ADM |
| Random Access Points (RAP) | The RAP interval sets the spacing of independently decodable points in the MHAS stream, i.e., points that require no prior information. Additional RAPs can be requested on demand. | Same as Contribution Encoder |
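The limits in the table can be checked before audio is handed to an encoder. The following is a minimal sketch of such a pre-flight check, assuming that the rows with a single value (sample rate, endianness, input formats) apply to both modes. The `InputConfig` class, the `validate` function and the format labels (`u8`, `s16le`, `s24le`, `f32le`) are hypothetical names for illustration and are not part of any MainConcept SDK.

```python
from dataclasses import dataclass

# Values taken from the feature table; the format labels are illustrative only.
SUPPORTED_SAMPLE_RATE_HZ = 48000
SUPPORTED_SAMPLE_FORMATS = {"u8", "s16le", "s24le", "f32le"}
LIMITS = {
    "contribution": {"max_channels": 16, "metadata": {"control_track"}},
    "emission": {"max_channels": 64, "metadata": {"control_track", "pmd", "s-adm"}},
}


@dataclass
class InputConfig:
    """Hypothetical description of the PCM input offered to the encoder."""
    mode: str            # "contribution" or "emission"
    sample_rate_hz: int
    sample_format: str
    channels: int        # total count, including metadata track(s)
    metadata: str        # "control_track", "pmd" or "s-adm"


def validate(cfg: InputConfig) -> list[str]:
    """Return a list of problems; an empty list means the input fits the table."""
    limits = LIMITS.get(cfg.mode)
    if limits is None:
        return [f"unknown mode: {cfg.mode}"]
    problems = []
    if cfg.sample_rate_hz != SUPPORTED_SAMPLE_RATE_HZ:
        problems.append(f"sample rate {cfg.sample_rate_hz} Hz is not 48000 Hz")
    if cfg.sample_format not in SUPPORTED_SAMPLE_FORMATS:
        problems.append(f"unsupported sample format: {cfg.sample_format}")
    if not 1 <= cfg.channels <= limits["max_channels"]:
        problems.append(f"{cfg.channels} channels exceeds {limits['max_channels']}")
    if cfg.metadata not in limits["metadata"]:
        problems.append(f"metadata format {cfg.metadata} not accepted in {cfg.mode} mode")
    return problems


ok = InputConfig("emission", 48000, "s24le", 24, "s-adm")
bad = InputConfig("contribution", 44100, "s16le", 24, "pmd")
print(validate(ok))        # []
print(len(validate(bad)))  # 3
```

Collecting all problems in a list, rather than failing on the first one, lets an operator correct a misconfigured feed in a single pass.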

The NGA Codec for TV Broadcast
MPEG-H is part of the ATSC (North America), DVB (Europe), TTA (South Korea) and SBTVD (Brazil) TV standards. It is used in South Korea’s terrestrial UHD TV service, and Brazil has selected it as the mandatory audio format for its DTV+ broadcast service.
Comprehensive User Interactivity
Lets users control audio objects, for example by selecting a language, enhancing dialogue or choosing an alternative commentary track, along with other personalization features. The degree of interactivity is defined by the broadcaster or OTT service provider during authoring and metadata creation.

Immersive Sound Reproduction
Support for a wide range of 3D audio configurations with height channels, including common immersive mixes such as 5.1+4H and 7.1+4H.
Adaptive and Device-Aware Playback
Automatic audio rendering for an optimized playback experience on many CE devices, including high-end home theater systems, soundbars, TVs, game consoles as well as mobile devices.
Advanced Loudness and Dynamic Range Processing
Allows consistent loudness across programs and offers dynamic range adaptation tailored to different audience use-cases and settings.
Products
MPEG-H contribution and distribution encoding for real-time and file-based workflows to enable object-based and personalized NGA production on-prem, in the cloud or in hybrid environments.


