Recent advances in sensors, internet connectivity, streaming, and decoding technologies have made it possible to attend a wider variety of events remotely using a smartphone, tablet, or laptop. From small-scale family gatherings to professional sports, the number of events that are streamed and attended remotely is steadily increasing. However, due to restrictions concerning access and privacy, many events are still not streamed. Such events usually cannot be streamed in the traditional way, yet there is interest in a service where a limited number of cameras are moved by a single operator. At small family or private events, hiring a professional crew to stream can be seen as too service-intensive or costly. With hybrid live streaming, a limited content stream is provided, and one person both moves the cameras and operates the stream. Automatically adjusted shot composition aims to assist the operator in this fundamental task of keeping the main subjects in the frame during the shots.
In a hybrid live stream, an operator uses basic equipment to limit the resource intensity of the stream while still providing acceptable shot compositions for most of the action. As camera technology becomes more affordable thanks to the increasing performance of smartphones, hybrid live streaming can be turned into a viable service. When smartphones are used to capture feeds from events, little operator expertise is needed, because smartphones come equipped with built-in Wi-Fi, IP mobility, streaming software, and highly adaptive encoding, and can serve as observation cameras. Panorama shots can also be produced automatically by a simple app. Experimentation is needed to determine which video formats work with the current intermediate recording and streaming setup, and to allow multiple streams of the same video at different quality levels.
1.1. Definition and Overview
Hybrid live streaming refers to a television broadcasting method that delivers both broadcast and internet content to viewers. This report examines a system, known as hybrid live streaming on a cable network, in which conventional broadcasting and the internet function in tandem, each expanding the domain of the other. Broadcasting can be integrated seamlessly with the internet in ways that were not possible before. With this promising technique, a new class of broadcasters can emerge and the viewing experience can be enhanced.
The two approaches, broadcast and internet, each have their own advantages and challenges. Broadcast TV is free of charge, relies on fixed viewing equipment already present in households, and reaches a wide audience. Nonetheless, it has limitations: it cannot target specific audiences, and interactivity is difficult, often limited to sending an SMS. Conversely, internet streaming can target specific audiences, allows for high viewer interaction, and incorporates interactivity within the program, but it has its own drawbacks: it requires a connected PC with sufficient bandwidth, and in many cases it necessitates viewing on a PC screen, which is undesirable for programs traditionally watched on a TV screen.
Hybrid live streaming serves as an ingenious way to address the conventional bounds of broadcasters. It presents a concept in which broadcasting and the internet are directly coupled to produce a single stream: the hybrid stream. Since the broadcast stream is modified, processing and re-encapsulation are handled on the broadcaster’s side. The coupling occurs at a stream server deployed at the broadcaster’s site, which multiplexes the broadcast stream into a hybrid stream, and at a hybrid client deployed on the viewer’s side, which demultiplexes the hybrid stream, separating it back into the broadcast and internet streams and directing each to the proper receiver: TV set or PC. Each receiving device is enhanced to handle this specially prepared stream. A TV receiver coupled with appropriate set-top equipment receives the hybrid stream; conventional broadcast video streaming can be employed for TV sets, and internet multicast or unicast can be used for PCs.
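To make the demultiplexing step concrete, the following is a minimal sketch, assuming a hypothetical hybrid packet format in which the stream server tags each packet with its origin; the type names and sinks below are illustrative, not part of any real broadcast API.

```typescript
// Minimal sketch of the hybrid-client demultiplexer described above.
// The packet tag, field names, and sinks are illustrative assumptions.

type Origin = "broadcast" | "internet";

interface HybridPacket {
  origin: Origin;      // set by the stream server when the hybrid stream is multiplexed
  payload: Uint8Array; // opaque media data, e.g. an MPEG-TS or RTP packet
}

interface Sink {
  write(payload: Uint8Array): void;
}

// Routes each packet of the single hybrid stream to the proper receiver:
// broadcast packets to the TV set-top equipment, internet packets to the PC player.
function demultiplex(hybrid: Iterable<HybridPacket>, tvSink: Sink, pcSink: Sink): void {
  for (const packet of hybrid) {
    (packet.origin === "broadcast" ? tvSink : pcSink).write(packet.payload);
  }
}
```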
1.2. Importance and Benefits
Technology is expanding incredibly fast, and with it many aspects of life in the rapidly evolving digital age. Communications and information are exchanged easily around the world, and the entertainment industry is no exception. Across fields, the way of operating has transformed: where people once relied on media channels such as television or radio for updates about events taking place around the world, they have turned to online systems offered by various digital platforms. Instant access to information is now possible through these platforms, and little has remained the same since social media and ubiquitous connectivity came into existence.
The advent of digital platforms has revolutionized the delivery of live events to audiences around the world. Given the great interest in sports, the industry has seen an immense surge of investment and revenue from multiple countries. Broadcasting rights for various games have been sold for significant amounts of money; as a result, several sports channels have emerged, and a quest began among media houses for the telecasting rights of various live events. However, large-scale operations now require truckloads of equipment and machines that are often too heavy and wide for easy transportation. Moreover, the connection between field production and the corporate broadcaster is typically made through a fiber link, which can be very challenging to establish in critical situations. A traffic accident carries a high risk of losing an expensive truck along with its equipment, as well as disrupting the telecast.
Hybrid live streaming technology alleviates the above difficulties. It allows the use of the same infrastructure and the same fiber link as the traditional approach, which eases the transmission of live events. The technology uses a combination of cameras and mobile devices to stream live from the field to the studio without an expensive truck and a full-fledged crew, which is often not feasible for some ground events. A few notebook computers equipped with software and a mobile camera can do the work of a giant truck filled with equipment and people, and having no such truck removes the danger of losing a huge investment in an accident. Moreover, an event such as a cricket match can be streamed live with this technology alongside the traditional telecast, combining both methods.
Taking everything into account, hybrid live streaming is a useful and robust technology that can serve various applications as a very productive tool. At present, a pilot project based on this technology is under research in both the Department of Computer Science and Engineering and the Industrial Research and Consultancy Centre, aimed at adding further features and benefits, with the hope that it will flourish across many applications.
2. Technologies Behind Hybrid Live Streaming
Live streaming technology has evolved rapidly in recent years, particularly against the backdrop of the pandemic. Consequently, the avenues for live-streamed events, be they sporting, musical, cultural, or corporate, have burgeoned and diversified in approach. This increase in live-streamed events has amplified the challenges of ensuring high quality, reliability, and interactivity when streaming such events. This report aims to illustrate how hybrid live streaming addresses some of these challenges. Of the various technologies available to create a hybrid experience, a shortlist of the most beneficial is highlighted, focusing specifically on WebRTC and Adaptive Bitrate Streaming. This report also covers a simple setup for a hybrid live stream.
WebRTC Technology: Since its inception, WebRTC has enabled web browsers and mobile applications to stream audio and video in real time, peer-to-peer, through JavaScript APIs. It has since matured into a collection of open-source projects, libraries, and protocols supporting many live-streaming scenarios. Because WebRTC can stream audio and video with very low latency, it is a better fit for hybrid live streams than HTTP-based protocols. Its main downside is the limit on concurrent users of one stream: the industry-standard maximum is 20, although live streams with just one camera can reach up to 50 concurrent users.
Adaptive Bitrate Streaming: Adaptive Bitrate Streaming (ABR) is a methodology for delivering a consistently high-quality streaming experience across varying network conditions. ABR technologies continuously match the stream to the bandwidth actually available, ensuring smooth playback regardless of fluctuating conditions. With this technology, a stream is encoded into several files with differing bandwidth requirements. As the stream plays, the viewer’s player starts from the lowest-quality file and progressively switches to higher-quality files as bandwidth allows, ensuring that a stream of satisfactory quality is always played. Currently, the most widely used ABR streaming technologies are HLS and MPEG-DASH. HLS was developed by Apple and is designed to be served over HTTP, while MPEG-DASH is an open standard. For a hybrid live stream, it is preferable to use one of these formats, since this avoids the need for transcoding when distributing the stream on other platforms.
2.1. WebRTC Technology
WebRTC is an open web standard that enables peer-to-peer audio, video, and data sharing between web browsers and mobile applications. Compared to traditional media streaming protocols, WebRTC stands out due to its ultra-low latency of less than 1 second. Designed to work seamlessly in browsers, it eliminates the need for installation of plugins or third-party applications, making it particularly appealing for mobile platforms.
WebRTC relies on three components: getUserMedia for media capture, peerConnection for configuring sessions, and dataChannel for peer-to-peer data transmission. The versatile dataChannel supports different communication types such as text, binary, or file transfer. Other media protocols with ultra-low latency typically rely on commercial solutions.
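As a minimal browser-side sketch of how the three components fit together (signaling, i.e. delivering the offer to the remote peer, is elided, and the STUN server URL is just a common public example):

```typescript
// Sketch only: capture media, configure a session, open a data channel.
async function startPeer(): Promise<RTCPeerConnection> {
  // 1. getUserMedia: capture local audio and video.
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true, video: true });

  // 2. RTCPeerConnection: configure the peer-to-peer session.
  const pc = new RTCPeerConnection({
    iceServers: [{ urls: "stun:stun.l.google.com:19302" }], // example STUN server
  });
  for (const track of stream.getTracks()) {
    pc.addTrack(track, stream); // attach the captured media to the session
  }

  // 3. dataChannel: peer-to-peer channel for text, binary, or file transfer.
  const channel = pc.createDataChannel("chat");
  channel.onopen = () => channel.send("hello");

  // Create the SDP offer; exchanging it with the remote peer is application-specific.
  await pc.setLocalDescription(await pc.createOffer());
  return pc;
}
```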
However, WebRTC was not initially designed for large-scale broadcasting applications. Traditional WebRTC communication, known as “mesh,” directly connects all peers in a conference-room fashion. As a result, adding participants visibly deteriorates the quality of each ongoing connection. In commercial solutions, load balancing relies on extra infrastructure. To sidestep these issues, several architecture designs have been implemented, such as the Selective Forwarding Unit (SFU), which receives all senders’ streams and lets each end user subscribe to just the streams it needs, and the transcoding gateway, where all streams converge at a media server for transcoding. These designs have enabled distribution to a larger audience; however, they sacrifice performance and do not support all media protocols defined by WebRTC.
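The SFU idea can be illustrated with a small conceptual sketch; the class and method names below are illustrative assumptions, not a real SFU implementation:

```typescript
// Conceptual SFU: receive every sender's stream and forward each packet only
// to the peers that subscribed to that stream, with no transcoding involved.

type PeerId = string;
type StreamId = string;

class SelectiveForwardingUnit {
  private subscriptions = new Map<PeerId, Set<StreamId>>();

  subscribe(peer: PeerId, stream: StreamId): void {
    const subs = this.subscriptions.get(peer) ?? new Set<StreamId>();
    subs.add(stream);
    this.subscriptions.set(peer, subs);
  }

  // Called for every incoming packet from a sender. Unlike a mesh, peers do
  // not connect to each other directly; the unit fans packets out selectively.
  onPacket(stream: StreamId, packet: Uint8Array, send: (peer: PeerId, p: Uint8Array) => void): void {
    for (const [peer, subs] of this.subscriptions) {
      if (subs.has(stream)) {
        send(peer, packet);
      }
    }
  }
}
```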
End-to-end encryption is one major area of focus when addressing security issues. Unlike traditional streaming protocols, where data encryption is variable and customizable, WebRTC builds security in: DTLS-SRTP encryption is fundamental to all WebRTC communications. Attacks on WebRTC technology have focused on privacy violations, using the getUserMedia API to retrieve audio and video streams from the computer without the user’s knowledge.
As a media protocol, WebRTC has its advantages and disadvantages. Its low latency makes it the best peer-to-peer solution for applications with direct interaction, such as video calls or auctions via an interactive dashboard. For broadcasting applications, however, WebRTC is not the best choice, as it can hardly scale distribution above 1,000 participants unless backed by SFU or media server infrastructure.
2.2. Adaptive Bitrate Streaming
Adaptive bitrate streaming (ABR) technology is a crucial component of hybrid live streaming, ensuring a seamless and high-quality viewing experience across various network conditions on the client side. This technology enables the adaptation of video playback based on the viewer’s current bandwidth, device performance, and other factors. ABR streaming has been integrated into major video delivery standards to promote interoperability and compatibility with different player implementations.
In providing an overview of adaptive streaming technology, the typical streaming workflow is explained. The implementation of the technology using the HLS standard is also described, along with operational considerations for content providers. Finally, a Mobile Edge Computing (MEC)-based architecture for hybrid live streaming is proposed, capitalizing on its potential to provide low-latency, cost-effective streaming to clients with different service requirements.
For both live and on-demand video, adaptive bitrate streaming is a widely adopted technology that serves diverse video resolutions and multi-bitrate representations at the same time. The video is encoded into segments at different qualities, which are listed in an adaptive manifest file together with the corresponding segment delivery URLs. This file is hosted on streaming servers as the entry point for initiating streams on the client side. Based on the current client-side buffer level, estimated bandwidth, video segment playback duration, and other factors, the player intelligently selects a delivery URL and an initial segment bitrate.
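A simplified sketch of this client-side selection step follows; the rendition ladder, URLs, and the 20% safety margin are illustrative assumptions, and production HLS or MPEG-DASH players use more elaborate heuristics:

```typescript
// One entry per quality level listed in the adaptive manifest.
interface Rendition {
  bitrateKbps: number; // bandwidth required to play this representation
  url: string;         // segment delivery URL from the manifest
}

// A toy manifest: the same content encoded at three qualities (ascending order).
const manifest: Rendition[] = [
  { bitrateKbps: 800,  url: "https://example.com/video_480p.m3u8"  },
  { bitrateKbps: 2500, url: "https://example.com/video_720p.m3u8"  },
  { bitrateKbps: 5000, url: "https://example.com/video_1080p.m3u8" },
];

// Pick the highest rendition that fits the estimated bandwidth, keeping
// headroom so a small fluctuation does not immediately stall the buffer.
function selectRendition(estimatedKbps: number, ladder: Rendition[]): Rendition {
  const budget = estimatedKbps * 0.8; // 20% safety margin (assumed)
  const affordable = ladder.filter(r => r.bitrateKbps <= budget);
  // Fall back to the lowest quality when nothing fits the budget.
  return affordable.length > 0 ? affordable[affordable.length - 1] : ladder[0];
}

console.log(selectRendition(4000, manifest).url); // 720p: 4000 kbps leaves a 3200 kbps budget
```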
Typically, low-bitrate segments are delivered under slow connections, while high-bitrate segments are targeted for high-bandwidth connections, ensuring a graceful playback experience for clients. In the event of bandwidth fluctuation, clients adjust video resolutions accordingly, either up or down, to accommodate changes in available resources. When the bandwidth is once again sufficient, clients seamlessly resume playback at a relatively higher resolution.
3. Applications of Hybrid Live Streaming
With the advancement of new technologies, user demands on hybrid live streaming systems are also increasing dramatically. Demand for high-quality hybrid live streaming has risen sharply in several different fields, which are covered broadly in this section.
E-learning is one of the first applications to utilize hybrid live streaming, and hybrid live streaming has been well researched in this area. The pandemic accelerated the near-ubiquitous adoption of online education at all levels across the globe, from primary and secondary schools to universities, training centers, and corporate learning. This massive and rapid shift raised several challenges to ensuring the effectiveness and quality of the education delivered, encompassing academic integrity, data privacy, accessibility, and the need to better engage and interact with online learners. Educators have shifted to a new pedagogical approach, blending face-to-face classes with online education. Several synthesis studies have pointed out shortcomings of such blends, including fidelity concerns with technology-enabled learning environments.
The pandemic also created a sudden need for virtual conferences. Early in the pandemic, many organizations postponed or canceled their conferences. Some quickly launched series of online conferences, during which many challenges emerged on these exchange platforms for researchers worldwide, and many different conferencing systems were used.
Later in 2020, the concept of hybrid conferences, applying online platforms alongside in-person venues, gained attention. The hybrid format was considered not only a solution to the challenges brought by the pandemic but also an opportunity to better engage audiences remotely. A hybrid conferencing system has to transfer audiovisual signals between different venues and process the signals in a way that delivers a unified experience to both communities. Several researchers have examined and developed hybrid conferencing systems and operations from different angles, including feasibility studies, postmortems of conferences run in hybrid mode, and technical perspectives on microphone placement, camera design, and system architectures.
3.1. E-Learning and Online Education
Over the last two decades, both higher education institutions and private education providers have increased their investment in the delivery of educational materials via the internet, and the term e-learning has become widely accepted. Initially, e-learning was largely based on old teaching paradigms, such as text and static images with low levels of interactivity. It has since passed through significant development phases, including multimedia-based materials and the introduction of high-speed internet and more capable computer technologies. A number of innovative technology-based approaches have been developed, ranging from fully virtual universities to other experimental models, but the most successful have incorporated blended learning approaches, in which traditional education is enriched with online components.
With the democratization of knowledge, hurdles preventing individuals from educational attainment outside classroom walls have been reduced. As a result, networks of peer learners have emerged outside educational institutions, creating various forms of online communities where individuals can share their knowledge and expertise with one another. Today, there is a plethora of sites providing free lectures, course materials, and forums for online debates, thus challenging the monopoly of formal educational institutions over knowledge.
Educational institutions of all sizes are increasingly likely to include live streaming and/or video-on-demand services among their offerings. Networked audio and visual technology has improved quickly in recent decades, enabling high-quality services at low overhead cost. Integrating online, broadcast, or IPTV delivery with current or legacy lecturing and classroom recording systems has become relatively straightforward. Enterprise-wide interest in live production and video-on-demand services, integrating desktop, ceiling-mounted, or on-camera codecs, is paired with similar interest in wider dissemination and higher production values.
3.2. Virtual Events and Conferences
As the world embraces new technologies for hybrid live broadcasting, various platforms have emerged to conduct online conferences of every kind. Simple applications allow one-to-many communication, emulating the usual format of a conference in the pre-pandemic world. However, heavy use of such simple tools could encourage the creation of new systems and more complex transformations of business operations and models in the future. New platforms resembling social networks, with controlled playback of recorded conferences and the possibility of interactive Q&A, are under development. Audio-only services, going “back to the roots,” might persist as a retro trend.
There are also risks. Aggressive promotion of products disguised as studies or presentations could harm the production of quality content, while keyword filtering and session moderation offer ways to protect it and remove unwanted advertisements. Manipulative events driven by big-data generation mechanisms could emerge, layered onto social network models of reputational measures. Entirely new kinds of generated audiovisual content might require fees, much like any other audiovisual work.
Virtual events emerged as a channel for real-time communication with global outreach. Some applications serve small-group communication, while broader one-to-many applications serve global conferences. Some platforms offer a fully integrated solution: online broadcast, chat room services, publication of materials and video recordings, and in-house conference management. In-house proprietary software allows a design shaped by feedback from broadcasters, suppliers, and organizers, with core staff trained to assist and monitor speakers, keep panels ready online, and foster “natural” video interoperability. Applications that support a hybrid model of attending e-learning events enjoy wider media coverage. Solutions provided by external contractors offer a full service with planned costs, which sometimes increase when difficulties arise.
4. Challenges and Solutions in Hybrid Live Streaming
Hybrid live streaming faces a number of challenges that are critical to its successful deployment in professional environments. This chapter identifies the most relevant challenges related to the distribution of signals from the venue to the cloud and the real-time reception of streams in hybrid setups via the cloud. The two most prevalent challenges are latency and streaming bandwidth constraints. Both are defined, and their consequences in hybrid environments are illustrated. Furthermore, proposed and state-of-the-art solutions are discussed, showing how they fulfill key use-case requirements.
With the establishment of low-latency hybrid technology, the primary objectives of physical venue operators, event organizers, and cloud providers do not change: reducing latency as much as possible and delivering high-quality media experiences. However, unlike cloud-first technology, where the issues lay on the reception and processing side of hybrid media experiences, cloud-to-venue transmission configurations are more multifaceted in hybrid setups. Due to limitations of venue on-premises infrastructure, onsite operational challenges, and heavy equipment costs, implementing venue networks for cloud distribution architectures is not straightforward. As a consequence, technology vendors with meeting room products typically also provide their own public cloud for offloading media, which significantly reduces the flexibility of hybrid system designs. Furthermore, such products often bundle cloud processing services with on-premises filtering capabilities, adding to the challenge of content customization. As emerging hybrid technology aims at large productions with diverse, multi-camera media experiences, integrating the existing, usually high-value on-premises equipment at the venue is an important consideration.
Given the above, it is essential for hybrid content providers to have cloud provider options beyond the meeting room solution vendor. A common implementation scenario is for a Cloud Production Provider to deliver uplink feeds from multiple sources to one or more Media Processing Providers for transcoding and production, and then to one or more Content Distribution Networks for distribution. More often than not, the finished output of such setups is low-latency EdgeStream, where only the event initiator is allowed to publish a stream; the caveats of EdgeStream setups must then be strictly observed.
4.1. Latency Issues
Live streaming combines the real-time broadcast of live content with streaming media technology. It is now possible to transmit events to thousands, sometimes millions, of remote viewers in real time using the internet, telecommunication networks, and off-the-shelf hardware and software components. For a broadcaster, the workflow of hybrid live streaming typically involves video capture, encoding and stream formatting, distribution, cloud processing, and playout. In a typical interaction, the cast, audience, or production team connect their devices to the platform and perform video actions.
Live streaming faces two challenges when it comes to audience interactivity: video delay, or latency, and response delay, or reactivity. Video latency can further be split into several components: encoding delay, contribution delay, distribution delay, playout delay, and buffering delay. From a viewer’s perspective, one major source of delay is the buffering time that a player needs to pre-fill the buffer before starting playback, and latency can be defined as the time delay between the start of the live event and the first moment when it is possible to watch it.
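Since overall video latency is the sum of these components, it can be budgeted as in the sketch below; the millisecond figures are illustrative assumptions, not measurements:

```typescript
// Decomposition of video latency into the components named above.
interface LatencyBudget {
  encodingMs: number;     // encoding delay
  contributionMs: number; // contribution (venue-to-platform) delay
  distributionMs: number; // distribution delay
  playoutMs: number;      // playout delay
  bufferingMs: number;    // player pre-fill before playback starts
}

const totalLatencyMs = (b: LatencyBudget): number =>
  b.encodingMs + b.contributionMs + b.distributionMs + b.playoutMs + b.bufferingMs;

// Illustrative numbers; buffering often dominates from the viewer's perspective.
const example: LatencyBudget = {
  encodingMs: 300, contributionMs: 200, distributionMs: 500,
  playoutMs: 100, bufferingMs: 4000,
};
console.log(totalLatencyMs(example)); // 5100 ms from live event to playback
```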
There are two types of reactivity in audience interactions: one-to-many, like chat and audience polls, where a message sent by one audience member can be delivered to everyone through the chat, and many-to-one, like audience questions where selected questions from the chat are posed to the cast and answered. A one-to-many interaction is considered faster and has a very low response time, typically a few seconds. On the other hand, a many-to-one interaction with Q&A is typically much slower and has a time delay of around 30 seconds or more.
Live question and answer (Q&A) opens the possibility that remote viewers can pose questions to the cast. In contrast with chat, it is a so-called many-to-one reactivity where numerous questions are asked by the audience, but a limited number of them can be answered. To avoid time zone confusion, it has to be defined whether the time delay refers to an absolute internal time frame or to an external time frame indicating real clock times.
4.2. Bandwidth Constraints
In hybrid live streaming, a combination of local users and remote users takes part in the same live event. Local users receive the high-definition video stream from the local camera, while remote users receive the live stream from the server via a CDN. Because local and remote users connect to the server from different places and under different network conditions, differences between the two streams are inevitable, which can become a challenge. One result of these differences is differing video quality, mainly video resolution. Unfortunately, the high-definition stream sent to local users is degraded to the same resolution as the stream sent to remote users, 720p. The reason is that, to reduce bandwidth usage, only one medium-resolution video stream, at 720p, may be sent to the server for further distribution.
Consequently, in addition to viewers receiving a lower-resolution stream with insufficient detail, the high-definition stream is discarded. As with the challenge above, the sustained bandwidth is likewise reduced to the minimum sustainable bandwidth of 10 Mbps. Ironically, the opencast intended as a solution to this challenge can itself become a bottleneck. During the opencast, a large influx of remote users connects to the server and shares its limited bandwidth. Each remote user requires 1.5 Mbps of bandwidth to receive the 720p video stream, yet the server can provide at most 20 Mbps. Hence, the maximum number of remote users is 13: if more than 13 remote users connect to the server at the same time, some of them will have less than 1.5 Mbps of bandwidth and will be unable to receive the video stream. Those remote users are effectively disconnected from the server, a crisis in situations where a large number of remote users is lost during the opening ceremony. The arithmetic is checked in the sketch below.
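The figure of 13 remote users follows directly from the numbers in the text, as this short check shows:

```typescript
// Bandwidth figures taken from the scenario described above.
const serverCapacityMbps = 20; // maximum bandwidth the server can provide
const perUserMbps = 1.5;       // bandwidth one remote user needs for the 720p stream

// Any additional user would push someone below 1.5 Mbps and break playback.
const maxRemoteUsers = Math.floor(serverCapacityMbps / perUserMbps);
console.log(maxRemoteUsers); // 13
```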
To cope with these challenges and control the number of users who fail to connect to the server, it is essential to analyze the bandwidth usage of both local and remote users. On the one hand, local users take part in the event by connecting to the local camera through a network switch; consequently, the bandwidth used at the server is invariant in the number of local users and equals the sustained bandwidth of the incoming stream from the local camera to the server, which is 20 Mbps. On the other hand, without a network switch, all remote users take part in the event by connecting to the server over the WAN, so each remote user receives a fraction of the sustained bandwidth of the incoming streams. The bandwidth consumed by remote users therefore varies with the number of users connected. Combining the two, the total bandwidth at the server for both local and remote users can be established, as sketched below.
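A minimal sketch of this accounting, under the assumption that the local camera ingest is fixed and the remote egress is simply capped at the server capacity:

```typescript
// Figures from the scenario above; the cap-at-capacity model is an assumption.
const localIngestMbps = 20;     // camera-to-server stream, invariant in local users
const remoteCapacityMbps = 20;  // server bandwidth shared by all remote users
const perRemoteUserMbps = 1.5;  // 720p stream per remote user

// Total bandwidth at the server as a function of the number of remote users.
function totalServerBandwidthMbps(remoteUsers: number): number {
  const remoteEgress = Math.min(remoteUsers * perRemoteUserMbps, remoteCapacityMbps);
  return localIngestMbps + remoteEgress;
}

console.log(totalServerBandwidthMbps(10)); // 35: 20 ingest + 15 egress
console.log(totalServerBandwidthMbps(20)); // 40: egress saturated at the 20 Mbps cap
```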