An Introduction to WebRTC

The future of real-time apps

This afternoon I got to attend an event organized by Kranky Geek at 91Springboard, Bangalore. The topic of the day was WebRTC, an API and set of protocols that enable real-time communications without the pain of installing plugins, configuring devices or worrying about security. The event was nicely structured: an overview, a detailed explanation of a typical call setup and its API calls, a live coding demo, a look at multiparty complexities, and finally an insight into how a startup, PiOctave, is using WebRTC in its IoT product.

Real-time communications have always been a challenge on the web, although I must point out that the birth of VoIP owes a lot to the Network Voice Protocol (NVP) of the 1970s, perhaps the first real-time protocol on the Internet. The problem back then was that we were handicapped by low bandwidths and modem speeds. That's not much of a problem these days, but true many-to-many media streaming is still hampered by the architecture of the web. WebRTC is trying to overcome that.

The Internet is structured around the client-server model. This causes a problem when, say, three friends are on a video call: three streams of media must go to a server that has to coordinate all the streams and relay them to the three clients. WebRTC avoids this by adopting a peer-to-peer model. A server is involved only in setting up the logical connections between the peers; after that, media is streamed peer to peer without involving the server. This allows applications to scale more easily with respect to the number of peers.

Another beauty of WebRTC is that it does NOT define a signalling protocol. This allows developers to choose whatever signalling protocol they wish: SIP, HTTP, WebSocket, Matrix or something proprietary. If there's no signalling protocol, how is negotiation done? WebRTC does use the Session Description Protocol (SDP) format to describe a session: the media streams on offer, the codecs supported, and so on. So signalling information is part of WebRTC, but how that information is exchanged between peers is left open.
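To make the SDP format less abstract, here is a minimal sketch. The SDP blob below is a trimmed, hand-written sample for illustration, and codecsFromSdp is a hypothetical helper (not part of the WebRTC API) that pulls the codec names out of the rtpmap attribute lines:

```javascript
// A trimmed, hand-written SDP sample for illustration only.
const sampleSdp = [
  'v=0',
  'o=- 46117317 2 IN IP4 127.0.0.1',
  's=-',
  'm=audio 9 UDP/TLS/RTP/SAVPF 111',
  'a=rtpmap:111 opus/48000/2',
  'm=video 9 UDP/TLS/RTP/SAVPF 96',
  'a=rtpmap:96 VP8/90000',
].join('\r\n');

// Each "a=rtpmap:<payload> <codec>/<clock rate>" line names one codec.
function codecsFromSdp(sdp) {
  return sdp
    .split(/\r?\n/)
    .filter((line) => line.startsWith('a=rtpmap:'))
    .map((line) => line.split(' ')[1].split('/')[0]);
}

console.log(codecsFromSdp(sampleSdp)); // [ 'opus', 'VP8' ]
```

In a real call, SDP blobs like this are generated by the browser and carried over whichever signalling channel the developer has chosen.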


Since HTML/CSS/JS are the operational engines of the web, wouldn't it be nice to leverage them? This is exactly what WebRTC does. The API is written in JavaScript. Calls are asynchronous and fulfilled via promises. Media is rendered within HTML5 video and audio tags. Major browsers support WebRTC, which means that users don't need to install plugins, particularly third-party ones that could pose security risks. A video chat app based on WebRTC can run straight off a web browser. Developers also like the fact that debugging can be done within the browser using standard tools (such as the console), in addition to the specialized WebRTC debugging that browsers offer.
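As a sketch of how little code the promise-based API needs, the helper below acquires a local camera/mic stream and attaches it to a video tag. The mediaDevices object and video element are passed in as parameters purely so the wiring can be exercised outside a browser; in a real page they would be navigator.mediaDevices and a DOM element:

```javascript
// Acquire the user's camera and microphone and render the live stream
// in an HTML5 video tag. getUserMedia() returns a promise that resolves
// to a MediaStream once the user grants permission.
async function startLocalVideo(mediaDevices, videoElement) {
  const stream = await mediaDevices.getUserMedia({ audio: true, video: true });
  videoElement.srcObject = stream; // the video tag now renders the stream
  return stream;
}

// In a browser, this would be called as:
//   startLocalVideo(navigator.mediaDevices, document.querySelector('video'));
```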

If we thought that audio and video are the only aspects of a WebRTC app, we are quite wrong. Why not have a digital whiteboard on which anyone can write? Why not pass control of your desktop to a remote user? Why not have an accompanying text chat app? Why not forward a file? WebRTC enables arbitrary exchange of data via its Data Channel. In fact, it's also possible to use the Data Channel to do the signalling!
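A text chat over the Data Channel can be sketched as below. The peer connection is passed in as a parameter so the wiring can be exercised with a stand-in object; in a browser you would pass a real RTCPeerConnection, and the channel label 'chat' is just this example's choice:

```javascript
// Wire up a simple text chat over a WebRTC Data Channel.
function setupChat(peerConnection, onMessage) {
  // Create a channel; the label is chosen by the application.
  const channel = peerConnection.createDataChannel('chat');
  channel.onmessage = (event) => onMessage(event.data); // incoming text
  return {
    send: (text) => channel.send(text), // outgoing text
  };
}
```

The same channel could just as well carry whiteboard strokes, file chunks, or even the signalling messages themselves.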

What this means is that WebRTC is quite versatile but that's only because it's a technological enabler. By itself, it means nothing. It's what you can build with it that's set to revolutionize real-time apps on the web. So it's all about creativity. It's also about implementation that can scale. Here are a few examples:

  • A multi-player game in which audiences anywhere can watch the game in progress.
  • A live interview with a player at a stadium, where a viewer can come in to ask a question.
  • A classroom setting in which any student can ask questions and students can engage in discussions.

One of the challenges that WebRTC seems to have solved is traversing Network Address Translation (NAT). NAT maps private IP addresses to a public one and vice versa. This is fine for the client-server model but does not work for peer-to-peer connections, since neither peer may be directly reachable. WebRTC solves this with a STUN server, which lets a peer discover its public address, falling back to a TURN server that relays the media when no direct path can be established.
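In practice the STUN and TURN servers are handed to the browser in the configuration used to create a peer connection. The sketch below builds such a configuration; the STUN URL is Google's well-known public server, while the TURN URL, username and credential are placeholders a real app would replace with its own:

```javascript
// Build an RTCConfiguration naming the servers used for NAT traversal.
// The TURN parameters here are placeholders, not a real deployment.
function buildRtcConfig(turnUrl, turnUser, turnPass) {
  return {
    iceServers: [
      { urls: 'stun:stun.l.google.com:19302' }, // discover our public address
      { urls: turnUrl, username: turnUser, credential: turnPass }, // relay fallback
    ],
  };
}

// In a browser:
//   const pc = new RTCPeerConnection(
//     buildRtcConfig('turn:turn.example.org:3478', 'user', 'secret'));
```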

Implementations will have their challenges:

  • In a many-to-many app, managing multiple streams can make inefficient use of bandwidth or CPU. Architectural models such as the Multipoint Conferencing Unit (MCU) or the Selective Forwarding Unit (SFU) are worth investigating.
  • There are design challenges to be solved for running WebRTC on a mobile device.
  • WebRTC supports media recording, but at times recording may need to be done centrally due to regulations.
  • Log parameters in real time: if call quality suffers, logs can be invaluable to developers.
  • One good practice is to enumerate the devices on a peer (e.g. a computer could have multiple cameras) and let the user select the right one.
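The last practice above can be sketched as follows. The device list is passed in as a parameter so the filter is easy to exercise; in a browser it would come from the promise returned by navigator.mediaDevices.enumerateDevices():

```javascript
// Given the enumerated devices, return the labels of the cameras so the
// user can pick one. Labels may be empty before permission is granted.
function cameraLabels(devices) {
  return devices
    .filter((d) => d.kind === 'videoinput') // keep cameras only
    .map((d) => d.label || 'Unnamed camera');
}

// In a browser:
//   navigator.mediaDevices.enumerateDevices()
//     .then((list) => console.log(cameraLabels(list)));
```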

All in all, the event was excellent: a broad overview followed by the details of WebRTC, and a good opportunity to network with other WebRTC enthusiasts.

About the Author

Arvind Padmanabhan

Arvind Padmanabhan graduated from the National University of Singapore with a master’s degree in electrical engineering. With fifteen years of experience, he has worked extensively on various wireless technologies including DECT, WCDMA, HSPA, WiMAX and LTE. He is passionate about training and is keen to build an R&D ecosystem and culture in India. He recently published a book on the history of digital technology.
