In a chronically low-bandwidth region like the Middle East, cloud-based encoding technologies offer the best solution for streaming service providers looking to deliver content in HD, 4K, 8K and beyond, says Gerald Zankl.
The shift from traditional broadcast methods to cloud-based over-the-top (OTT) streaming has been a game changer for the Middle East TV industry. The region is home to more than 182m internet users and 304.5m mobile subscriptions (We Are Social, January 2019), which means mobile is leapfrogging TV to become the first screen, simply because it is the most accessible device for consumers.
Yet although viewing habits are leaning towards streaming content on mobile devices, limited bandwidth in the Middle East is restricting access to high-quality streaming, slowing delivery and degrading picture quality on arrival. With such a huge online consumer base now looking to access OTT services, it is essential that users be able to stream the highest-quality content from any device and location.
There has been much promise of 5G revolutionising streaming by minimising buffering and delays, and while the Middle East is gearing up for the arrival of 5G, it is unlikely to have an impact on the average consumer in the short to medium term. Indeed, the earliest we can expect to see 5G networks is in 2020, and even then only a minority of devices will support it.
For streaming service providers, the most readily available solution for delivering high-definition TV, film, sport and documentary content in a region with chronically low bandwidth is the deployment of next-generation cloud technologies. Cloud-based encoding technologies have the potential to dramatically improve quality of service and have already been deployed by forward-thinking broadcasters like OSN. As a result, they are driving major changes in the region and underpin a number of trends that are facilitating the cutting-edge experiences consumers expect and deserve.
The move towards a multi-codec world
The launch of AV1 has been one of the industry’s hottest topics, due to its promise of delivering UHD far more efficiently than other codecs on the market. It’s also royalty-free, effectively levelling the playing field for innovation and allowing more agile players to compete with industry giants. AV1 is likely to be rolled out for premium VoD services first, enabling providers to spread the cost of compute resources while the wider market adopts the codec over time. We expect that 2019 will accelerate the trend towards a multi-codec world as content providers select the best solution for different scenarios.
A multi-codec approach saves bandwidth and will benefit TV service providers by enabling a player to detect the most efficient codec supported on any browser or platform. AV1, H.264, HEVC, VP9 and VVC have each been designed to suit a specific environment or type of content. By encoding videos into multiple codecs and configuring a player to make informed decisions about which files to stream and to whom, operators can deliver the highest picture quality on existing bandwidth.
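As a rough illustration of that player-side decision, the Python sketch below picks the most efficient codec that a device can decode and that the operator has actually encoded. The preference order, the lowercase codec labels and the function name `pick_codec` are all assumptions made for this example; a real player would query the platform's decoding capabilities rather than receive them as a list.

```python
from typing import Iterable, Optional, Sequence

# Hypothetical preference order, most bandwidth-efficient first.
# An operator would tune this to its own content and devices.
CODEC_PREFERENCE = ("av1", "hevc", "vp9", "h264")


def pick_codec(
    device_supports: Iterable[str],
    encoded_as: Iterable[str],
    preference: Sequence[str] = CODEC_PREFERENCE,
) -> Optional[str]:
    """Return the most preferred codec that is both decodable and available."""
    # Only codecs the device can decode AND the operator has encoded qualify.
    playable = set(device_supports) & set(encoded_as)
    for codec in preference:
        if codec in playable:
            return codec
    return None  # nothing in common: the title cannot be played


if __name__ == "__main__":
    library = ["av1", "vp9", "h264"]
    print(pick_codec(["h264", "vp9"], library))  # an older browser
    print(pick_codec(["h264", "vp9", "av1"], library))  # a newer browser
```

Keeping H.264 at the end of the list reflects the usual fallback role of the most widely supported codec.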
Leveraging the power of machine learning
Technologies such as per-title encoding will benefit TV service providers in the region by adjusting an encoding configuration to optimise a specific video asset. This leverages the fact that some videos are far less complex than others, and therefore can be encoded at lower bitrates. Cartoons are a classic example – they contain scenes with large areas of solid colour at a low complexity level, which can be compressed much more efficiently than more detailed scenes in blockbuster movies or high-quality nature documentaries.
Per-title encoding relies on performing a complexity analysis on each title before the encoding process begins. That complexity score is used to adjust the bitrate ladder by sending a new encoding profile to the encoder. The result: each video in a library is encoded in a way that best suits the content – minimising the bitrate where appropriate and greatly reducing bandwidth usage.
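The step from complexity score to bitrate ladder can be sketched in a few lines of Python. This is a simplified illustration, not any vendor's algorithm: the base ladder, the 0-to-1 complexity scale and the linear scaling between `min_scale` and `max_scale` are assumptions chosen for clarity.

```python
from typing import List, Sequence, Tuple

Rung = Tuple[int, int, int]  # (width, height, bitrate in kbps)

# Hypothetical one-size-fits-all ladder used as the starting point.
BASE_LADDER: List[Rung] = [
    (1920, 1080, 5000),
    (1280, 720, 3000),
    (854, 480, 1500),
    (640, 360, 800),
]


def per_title_ladder(
    complexity: float,
    base: Sequence[Rung] = BASE_LADDER,
    min_scale: float = 0.5,
    max_scale: float = 1.5,
) -> List[Rung]:
    """Scale every rung's bitrate by a factor derived from the complexity score.

    complexity is assumed to lie in [0, 1]: 0 for the simplest content
    (flat-colour cartoons), 1 for the most detailed. With the default
    scales, a score of 0.5 leaves the base ladder unchanged.
    """
    if not 0.0 <= complexity <= 1.0:
        raise ValueError("complexity must be between 0 and 1")
    scale = min_scale + (max_scale - min_scale) * complexity
    return [(w, h, round(kbps * scale)) for w, h, kbps in base]


if __name__ == "__main__":
    print(per_title_ladder(0.0))  # cartoon-like title: bitrates halved
    print(per_title_ladder(1.0))  # highly detailed title: bitrates raised
```

In practice the resulting ladder would be sent to the encoder as the new encoding profile for that title.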
Quality as high as the eye can see
One of the biggest challenges facing the Middle East is constrained bandwidth. This can be overcome by delivering streams at the highest quality the human eye can perceive, and no higher. Technologies like per-scene adaptation leverage the fact that the human eye is unable to register a lot of the information delivered in a video stream. In most videos, many scenes can be streamed at a lower bitrate without the viewer noticing. Per-scene adaptation involves supplementing the adaptation logic that governs player behaviour with an additional stream of quality metadata containing information about the visual complexity of a particular segment in the video.
In an adaptive streaming scenario with a standard configuration, the player attempts to download a video file that fits the screen on which it is playing. A player configured for per-scene adaptation is alerted to the fact that an upcoming segment can be played at a lower bitrate without any noticeable loss of quality. The player then switches to that lower bitrate, reducing bandwidth consumption – in some cases by 30% or more.
The quality metadata required to control this process is generated by running an analysis on each video as it is encoded, using a variety of perceptual quality measurements, meaning the encoding process is optimised for the human eye. This metadata is included in the adaptive package and streams to the player in a similar way to subtitles and closed captions.
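The player-side logic described above can be approximated as follows. This Python sketch assumes a hypothetical metadata format in which each segment carries a single hint, `sufficient_kbps`, naming the bitrate at which the segment already looks visually lossless. The baseline choice is simplified here to a throughput-based pick, and the rendition list is illustrative.

```python
from typing import Optional, Sequence

# Hypothetical renditions available for a title, in kbps.
RENDITIONS_KBPS = (800, 1500, 3000, 5000)


def choose_bitrate(
    throughput_kbps: float,
    sufficient_kbps: Optional[int] = None,
    renditions: Sequence[int] = RENDITIONS_KBPS,
) -> int:
    """Pick the rendition for the next segment.

    Baseline: the highest rendition that fits the measured throughput
    (or the lowest one if none fits). If the quality metadata supplies a
    hint, cap the choice at the lowest rendition that meets the hint.
    """
    ladder = sorted(renditions)
    affordable = [r for r in ladder if r <= throughput_kbps]
    choice = affordable[-1] if affordable else ladder[0]

    if sufficient_kbps is not None:
        enough = [r for r in ladder if r >= sufficient_kbps]
        if enough:
            # Never go above what the network allows, only below it.
            choice = min(choice, enough[0])
    return choice


if __name__ == "__main__":
    # Per-segment hints; None means no saving is possible for that segment.
    hints = [None, 1500, 3000, None]
    picks = [choose_bitrate(6000, hint) for hint in hints]
    print(picks, sum(picks))
```

The hint can only lower the selected bitrate, never raise it, so a constrained connection still falls back to standard adaptive behaviour.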
Future-proofing OTT
Eventually, the Middle East will have democratised access to much faster mobile networks, and I welcome that when it comes. But by then more people will be consuming even more content at higher resolutions, and we will find ourselves in the same position if content is not being managed and distributed in the most efficient way possible.
At CES and NAB this year, 8K cameras, devices and complementary technologies were increasingly prevalent. When they become more mainstream, the need for multi-codec streaming, per-title encoding and per-scene adaptation will become even more critical, even on 5G networks that come close to delivering the speed and bandwidth availability being promised right now.