Here at Globo, our mission is to build resilient systems that provide an optimal experience for our domestic and international customers, allowing them to watch their favorite international or local TV show, a soap opera, or any of our live streaming channels on globoplay.
The architecture supporting globoplay's media platform relies heavily on open-source software such as GNU/Linux, nginx, tsuru, Kubernetes, Clappr, on-the-fly packaging, and much more. Another exciting aspect of Globo is that we build and manage our own CDN. Such a task pushes us to broaden our knowledge and understanding of Brazil's networking infrastructure.
Nowadays, people expect to stream on any device they own, which means we need to support most of them. There are hundreds of smartphones, dozens of smart TVs, and many tablets, browsers, casters, etc. Providing a steady and successful playback experience in this scenario inevitably adds a lot of complexity to the architecture.
We live-stream the famous program Big Brother over the Internet and broadcast it over the air as well. On one of those days, we reached 2 million simultaneous users watching the BBB live stream. To serve this many users, we must be present on the majority of stream-capable platforms.
Before we discuss the challenges we face dealing with this plethora of devices, here's the breakdown of play hits and watch-session percentage grouped by major platform.
The data is from one week in January 2021. The play metric is the percentage of plays on a specific device. Playtime represents the fraction (%) of the whole time watched on a given platform. Big screen accommodates all casters, and HTML also counts mobile browsers.
In the US, the market share seems to gravitate around casters (also known as OTT devices, like Fire TV, Roku, ATV, etc.). In contrast, here in Brazil, mobile devices and connected TVs share a lot of home presence. We have a diverse range of connected-TV brands as well as a significant number of old TVs running legacy and outdated firmware/OS versions.
In the Big Brother case, the big screen accounted for fewer plays, but people kept watching on it for longer sessions. That could be due to the convenience these devices offer, but it's hard to draw a conclusion.
This data may differ for other players in Brazil. However, here are some observations for the context of Brazil:
- Netflix, due to its premium, VOD-only nature;
- YouTube, because it's natively present on most platforms;
- Twitter/Facebook, given their massive presence, mostly on mobile.
That said, it wouldn't be unexpected if the data presented here aligned with these players.
While having people watching everywhere sounds good, it also poses a hard technical challenge. To support the majority of devices, we need to adjust almost every component of our platform's workflow.
Here we list some of the problems we faced and how we fixed them:
- HTTP cookies
- problem: some devices can't manage cookies
- solution: persist the data in the URI or in HTTP headers
- HTTP CORS
- problem: some native players don't follow the full CORS specification
- solution: adapt to their usage
- chunked transfer
- problem: a specific device fails silently when dealing with chunked transfer encoding
- solution: buffer the intermediate response and serve the full file
- video frame rate
- problem: mixing different frame rates (e.g. 30fps and 60fps) causes glitches on rendition swaps in some players
- solution: offer a fixed-frame-rate ladder for them
- audio sampling
- problem: 48 kHz sampling causes audio glitches/sync issues on some devices
- solution: stick to 44.1 kHz
- AES HLS encryption
- problem: some devices handle AES HLS keys as if they were always at the same level as the master playlist (if they aren't, the result is a 404)
- solution: use the full URI for each key, or always keep your variant playlists at the same level as your master
- syncing by HLS media sequence
- problem: some players use the HLS media sequence attribute to sync among renditions
- solution: always keep your media sequences in sync
- high resolutions
- problem: some devices can't play high-resolution renditions (even if they advertise otherwise)
- solution: filter out resolutions based on observed capability
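The last fix can be illustrated with a minimal sketch (this is not Globo's production code; the playlist, function name, and resolution cap are all hypothetical): given an HLS master playlist, drop any variant whose resolution exceeds what a device has been observed to handle.

```python
import re

# Hypothetical master playlist, for illustration only.
MASTER = """#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=800000,RESOLUTION=640x360
360.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=2000000,RESOLUTION=1280x720
720.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=5000000,RESOLUTION=1920x1080
1080.m3u8
"""

def filter_master(playlist: str, max_height: int) -> str:
    """Drop EXT-X-STREAM-INF entries whose RESOLUTION height exceeds max_height."""
    out = []
    skip_next = False
    for line in playlist.splitlines():
        if skip_next:            # this is the variant URI of a dropped rendition
            skip_next = False
            continue
        m = re.search(r"RESOLUTION=(\d+)x(\d+)", line)
        if m and int(m.group(2)) > max_height:
            skip_next = True     # drop this tag and the URI on the next line
            continue
        out.append(line)
    return "\n".join(out) + "\n"

# A device observed to struggle above 720p gets a manifest without 1080p.
print(filter_master(MASTER, max_height=720))
```

In practice this kind of rewriting would happen at the edge or packager, keyed by the device's user agent or a capability database rather than a hard-coded cap.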
Brazil is a continental country with sub-optimal Internet infrastructure. One way to offer a smooth experience to media streaming users is to bring the content closer to them. We are continually expanding our footprint with new edges, open caches, ISPs, IXs, etc.
We improved our streaming QoE by adding a 144p rendition to our current ladders. In a live stream, such a rendition may require around 200 kbps, which consumes circa 77 MB per hour.
Such a complex scenario led us to introduce an extra level of congestion control. When we notice that a group of users behind an ISP has started to fill their link to our CDN, we eagerly adjust the rendition suggestions for these users. Once the link's saturation starts to decrease, we re-adjust the suggested bitrates.
With this server-side "ABR", we act much more preventively than the player's ABR algorithms or the (mostly reactive) TCP mechanisms.
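A toy sketch of that server-side "ABR" idea (the thresholds, ladder, and function name are hypothetical, not Globo's production logic): as the link from an ISP to the CDN fills up, cap the bitrate ladder suggested to its users, and lift the cap as saturation eases.

```python
# Hypothetical ladder, e.g. 144p, 360p, 720p, 1080p (kbps).
LADDER_KBPS = [200, 800, 2000, 5000]

def suggested_ladder(link_utilization: float) -> list[int]:
    """Return the bitrate ladder to advertise for a given link utilization (0..1)."""
    if link_utilization > 0.9:       # link nearly saturated: only low bitrates
        cap = 800
    elif link_utilization > 0.75:    # getting full: trim the top rendition
        cap = 2000
    else:                            # healthy link: full ladder
        cap = max(LADDER_KBPS)
    return [b for b in LADDER_KBPS if b <= cap]

print(suggested_ladder(0.5))   # full ladder
print(suggested_ladder(0.95))  # only the lower renditions
```

The point is that the server sees aggregate link state the player cannot, so it can shed load before individual players' ABR loops (or TCP) ever detect congestion.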
It's all entertainment
Dealing with outdated devices; varying Internet speed, latency, and bandwidth; plus the already complex media-streaming environment requires teams to be well integrated. People working together help to alleviate the hard task of debugging in this environment.
Brazilians love to spend time on the Internet. The variety of devices (brands, OSes, build years, etc.) and the complex Internet infrastructure create opportunities to be creative while designing systems for Brazil's context.