Over the last two years, I've had the privilege of developing a web application that displays real-time hardware and software information. The application uses middleware to communicate with drivers on different hardware platforms. When the middleware detects a change in a driver's values, it fires a server-sent event. The server-sent events are aggregated and then sent from the aggregation server up to the front-end client, where JavaScript parses the messages and applies the updates to the graphical user interface. On average, 29 messages are sent every second, and the average message size is 3 KB, so within a minute the system generates roughly 5 MB worth of messages.
At 29 messages per second, the front-end can face 29 DOM updates per second, which will bring any web application to its knees if the JavaScript isn't implemented correctly. Below are five things I learned about updating and manipulating the DOM.
Each JavaScript function that updates or manipulates the DOM triggers a reflow and repaint, which is work for the browser (and the less work the browser has to do, the better).
When you can, update the DOM all at once in a single function. This limits the reflow work because the browser waits for the function to finish and then performs only one reflow and repaint.
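As a rough sketch of that batching (the element id and data shape here are placeholders), build a fragment off-screen and touch the live DOM once:

    // Hypothetical sketch: batch many row updates and write to the live DOM once.
    function renderSensorRows(rows) {
      const fragment = document.createDocumentFragment();

      rows.forEach((row) => {
        const item = document.createElement('li');
        item.textContent = `${row.name}: ${row.value}`;
        fragment.appendChild(item);
      });

      // One insertion into the live DOM means one reflow and repaint,
      // instead of one per row.
      const list = document.getElementById('sensor-list'); // placeholder element
      list.innerHTML = '';
      list.appendChild(fragment);
    }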
Be specific about the element that you want to edit in the HTML. This is where good design comes into play. Wrap values that will change based on events in their own elements, such as a <span> with an id. For example:
<h1>Welcome, <span id="welcome-user-name">user</span></h1>
Doing it this way allows you to target the span by its id and update just that element instead of the entire header.
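With that structure in place, an update only needs to touch the span, for example:

    // Update only the piece of text that changes, not the whole <h1>.
    document.getElementById('welcome-user-name').textContent = 'Jane';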
Only update DOM elements that the user can see. The application I worked on was a single-page website, with hidden div layers for the individual pages. There is no reason to update hidden elements in real time. We worked around this by calling 'GET' on the RESTful endpoints when a user clicked on a previously hidden page to retrieve the most current information, and then updated the content based on events after that initial pull.
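A rough sketch of that pattern, assuming a placeholder REST endpoint and render function:

    // Hypothetical sketch: refresh a previously hidden page when the user opens it,
    // then let the event-driven updates take over from there.
    async function showPage(pageId) {
      const page = document.getElementById(pageId);

      // Pull the current state once, since no updates were applied while hidden.
      const response = await fetch(`/api/pages/${pageId}`); // placeholder endpoint
      const data = await response.json();
      renderPage(page, data); // placeholder render function

      page.hidden = false; // only now does it receive event-driven updates
    }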
Do NOT rely heavily on the front-end for logical reasoning. The less work the browser has to do, the better. All data should be parsed and managed in the model and then forwarded to the view. For example, hardware can be in a bad state, say a bad fan on a server. This is something a user will want to know, but it shouldn't be the front-end's responsibility to compare values to figure out whether something is out of range. That should already be worked out by the time the data is sent to the view; the view only needs to know whether a bad state is present and handle it accordingly.
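For example, rather than shipping raw readings and thresholds to the browser, the message can carry a precomputed status that the view simply displays. A hypothetical payload and handler:

    // Hypothetical message shape: the backend has already decided the fan is bad.
    // { "component": "fan-3", "status": "critical", "message": "Fan speed out of range" }
    function handleStatusMessage(event) {
      const update = JSON.parse(event.data);
      const element = document.getElementById(update.component); // placeholder id scheme
      element.textContent = update.message;
      element.className = `status-${update.status}`; // e.g. status-ok, status-critical
    }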
The event source caused some issues early on in the application. Below are five lessons that I learned.
Google Chrome limits the number of open sockets per domain to six, and each event source counts as an open socket. Each refresh also counts as an open socket, so if you attempt to have six event sources open and refresh, you'll run into a "waiting for available socket" error. The fewer event sources open at one time, the better.
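One general way to stay under that limit (not specific to this application) is to multiplex several message types over a single event source using named events, which the server-sent events protocol supports:

    // A single EventSource can carry several named event streams,
    // so separate features don't each need their own socket.
    const source = new EventSource('/events'); // placeholder endpoint

    source.addEventListener('fan-status', (event) => {
      console.log('fan update', JSON.parse(event.data));
    });

    source.addEventListener('temperature', (event) => {
      console.log('temperature update', JSON.parse(event.data));
    });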
Opening a new window in Google Chrome doesn't necessarily mean the new window runs under a new process, so new windows with event sources are still limited to six sockets in total across both windows. To work around this, we register new windows with the main window and pass the server-sent events from the main window to the new window without opening additional event sources.
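A minimal sketch of that relay, assuming the main window holds the only event source and forwards messages with postMessage (the page name and endpoint are placeholders):

    // Main window: hold the one shared EventSource and relay messages to the child.
    const child = window.open('/detail.html', 'detail'); // placeholder page
    const source = new EventSource('/events');           // the one shared socket

    source.onmessage = (event) => {
      // In practice, wait for the child to signal that it has loaded before relaying.
      child.postMessage(event.data, window.location.origin);
    };

    // Child window: listen for relayed messages instead of opening its own EventSource.
    window.addEventListener('message', (event) => {
      if (event.origin !== window.location.origin) return; // trust only our own origin
      const update = JSON.parse(event.data);
      // ...apply `update` to the DOM...
    });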
The browser needs to close the socket and open a new one occasionally; the right interval depends on the amount of data coming through the pipe. We observed that a browser left open for over six hours became sluggish, with all the symptoms of a classic memory leak. Using Chrome's developer tools, we noticed that the event source continued to grow in size with every message. For the browser to garbage-collect the event source, it has to be closed. Opening a new event source and closing the old one fixed the issue.
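A rough sketch of that recycling, with the interval left as a tunable placeholder:

    // Periodically close the EventSource so the browser can garbage-collect it,
    // then open a fresh connection. The interval is workload-dependent.
    const RECYCLE_INTERVAL_MS = 60 * 60 * 1000; // placeholder: tune to your data rate

    let source = new EventSource('/events'); // placeholder endpoint
    source.onmessage = handleMessage;        // placeholder handler

    setInterval(() => {
      source.close(); // the closed connection can now be collected
      source = new EventSource('/events');
      source.onmessage = handleMessage;
    }, RECYCLE_INTERVAL_MS);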
Server-side applications can break the front-end client. We had an issue where a browser would fail to keep up with the messages coming from the server and gradually become sluggish, to the point of hitting the "aw snap" page in Chrome. To overcome this, we made a queue for each client in a thread on the backend that aggregates the server-sent events and passes them on to the listener. If the queue to a client's listener begins to back up, the backend closes the connection and attempts to establish a new one. This prevents one slow client from causing a domino effect on the backend and crashing the other clients.
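As a sketch of the idea (in Node.js for illustration; the details of our backend differ), each client gets a bounded queue, and a client that falls behind is disconnected rather than allowed to back up the aggregator:

    // Sketch: a bounded per-client queue in front of the SSE response stream.
    const MAX_QUEUE_LENGTH = 1000; // placeholder threshold

    function createClient(res) {
      const queue = [];
      let waitingForDrain = false;

      res.on('drain', () => {
        waitingForDrain = false;
        flush();
      });

      function flush() {
        while (queue.length > 0 && !waitingForDrain) {
          // write() returns false when the socket's buffer is full.
          waitingForDrain = !res.write(`data: ${queue.shift()}\n\n`);
        }
      }

      return {
        send(message) {
          if (queue.length >= MAX_QUEUE_LENGTH) {
            res.end(); // drop the slow client; its EventSource will reconnect
            return false;
          }
          queue.push(message);
          flush();
          return true;
        },
      };
    }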
Because this application generated so many messages, we needed a way to filter for the messages we cared about at any given time. When we first wrote the application, we listened to every message, regardless of relevance. As the project matured, we added filtering by passing variables to the backend describing which messages we cared about. We ended up with two event sources running in parallel on the client: the first listens for critical status messages, while the other listens for the data relevant to the user's current page.
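A sketch of that split, with the query parameters standing in for the real filter attributes:

    // One stream for critical status, one scoped to whatever page the user is on.
    // The query parameters and handlers are placeholders.
    const statusSource = new EventSource('/events?filter=critical');
    const pageSource = new EventSource('/events?filter=page&page=storage');

    statusSource.onmessage = (event) => showCriticalBanner(JSON.parse(event.data)); // placeholder
    pageSource.onmessage = (event) => updateCurrentPage(JSON.parse(event.data));    // placeholder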
This was the most challenging and most fun project that I've worked on to date. I'm excited about the future of web applications and all of the utility that new web languages bring to the craft.