Introduction to Load Balancing Using Node.js - Part 2

by Ross Johnson
2013/11/11

Introduction

In my previous post on load balancing, I introduced the concept of horizontal scaling using multiple instances of a pi-estimating application sitting behind an HTTP proxy. While it serves well as an academic example, the solution proposed in the previous post is far from ideal, and far from production ready.

In this post, I will improve our pi example by using substack's seaport module to give the main application better information about available pi workers.

The Problem

Most of the problems lie in the previous version of the main balancer. The balancer is hard-coded with the location of each worker, which means that if a worker crashes, the balancer will continue to send it requests. This problem could be mitigated by using the forever node module, or a more powerful utility like Supervisord, to restart crashed workers. Alternatively (or additionally), the main balancer could try to detect workers that aren't responding and remove them from its worker list.
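For instance, a process manager like forever will relaunch a worker if it exits unexpectedly. Something along these lines would do it (shown only to illustrate the idea, not part of the original setup):

$ sudo npm install -g forever
$ forever start pi-server.js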

There is also the problem of how to add new workers to the service. To add a worker, the previous balancer would have to be edited and reloaded. This could be lessened by using a configuration file that the load balancer reloads every so often, but that still requires manual tracking and updating of the configuration file. If we really wanted to, we could try to share the configuration file across multiple servers and have the workers update it when they start up, but at that point we might as well just build another web service to manage the workers.

Luckily, as with most problems in Node, there is a module that handles most of this for us.

The Solution

The seaport module is a service manager that maintains a replicated list of running services. When a worker starts up, it registers itself with seaport (which can even assign the worker a port), and the location and purpose of that worker then get replicated across all other seaport instances.

To get started with seaport, you must first install it and then start it running on a port.

$ sudo npm install -g seaport
$ seaport listen 9090 &

Now, we can edit our pi worker to register with seaport.

Listing 1: Modified pi-server.js

var http = require('http');

var seaport = require('seaport');
var ports = seaport.connect('localhost', 9090);

/**
 * This function estimates pi using Monte-Carlo integration
 * https://en.wikipedia.org/wiki/Monte_Carlo_integration
 * @returns {number}
 */
function estimatePi() {
    var n = 10000000, inside = 0, i, x, y;

    for ( i = 0; i < n; i++ ) {
        x = Math.random();
        y = Math.random();
        if ( Math.sqrt(x * x + y * y) <= 1 )
            inside++;
    }

    return 4 * inside / n;
}

// Create a basic server that responds to any request with the pi estimation
var server = http.createServer(function (req, res) {
    res.writeHead(200, {'Content-Type' : 'text/plain'});
    res.end('Pi: ' + estimatePi());
});

// Register with seaport and listen on whatever port it assigns
server.listen(ports.register('pi-server'));

As for the load balancer, it no longer has a fixed list of workers to cycle through; instead, it cycles through whatever pi-servers seaport sees as available.

Listing 2: Modified load-balancer.js

var args = process.argv.slice(2);
var httpProxy = require('http-proxy');

var seaport = require('seaport');
var ports = seaport.connect('localhost', 9090);

//
// Round-robin index into whatever pi-servers seaport currently reports
//
var i = -1;
httpProxy.createServer(function (req, res, proxy) {
    var addresses = ports.query('pi-server');

    // if there are no workers, respond with an error
    if (!addresses.length) {
        res.writeHead(503, {'Content-Type' : 'text/plain'});
        res.end('Service unavailable');
        return;
    }

    i = (i + 1) % addresses.length;
    proxy.proxyRequest(req, res, addresses[i]);
}).listen(args[0] || 8000);

Running It

To start off, let's fire up the load balancer without any workers.

$ node load-balancer.js 8000

Now, browsing to our web service produces a "Service unavailable" error, since there are no workers available.
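We can confirm this from the command line; the balancer responds with the 503 body from Listing 2:

$ curl http://localhost:8000/
Service unavailable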

Without restarting the balancer, we can fire up as many pi workers as we want and start to use them.

$ node pi-server.js &
$ node pi-server.js &
$ node pi-server.js &

At this point, your balancer on port 8000 should be distributing requests among the three discovered workers. The response time graph should look very similar to the final one from the previous blog post.

At this point we have added service discovery to our pi web service. With a simple command we can fire off a new worker and gracefully scale our web service. While I have been running these examples on my computer, they could just as well be run across multiple physical servers connected via a VPN.

Further Improvements

At this point, if I kill a worker node, it will not finish handling any current requests, and it will take a short while for seaport to detect the change and update the load balancer. What I should do is have the worker listen for SIGTERM, unregister from seaport, close its connection listener, and exit gracefully on its own time. This would allow for smoother updates of workers.
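A minimal sketch of that handler, appended to pi-server.js, might look like the following. It assumes that closing the seaport connection with ports.close() is enough for the hub to drop this worker's registration:

// Sketch: graceful shutdown for pi-server.js (not from the original post)
process.on('SIGTERM', function () {
    // Disconnect from seaport; the hub drops a worker's services
    // once its connection goes away
    ports.close();

    // Stop accepting new connections, let in-flight requests finish, then exit
    server.close(function () {
        process.exit(0);
    });
});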

As some have mentioned, there are faster and more robust proxies than node-http-proxy, such as nginx or HAProxy. Using either of these would very likely improve performance, but then there is an additional problem of keeping them updated with the available seaport services.
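One way to bridge that gap would be a small watcher script that polls seaport and rewrites an nginx upstream block whenever the worker list changes. The sketch below is illustrative only: the upstream name and include path are made up, it assumes nginx is configured to include the generated file, and it relies on each seaport record exposing host and port fields, which is what Listing 2's proxyRequest call already depends on.

var fs = require('fs');
var exec = require('child_process').exec;
var seaport = require('seaport');

var ports = seaport.connect('localhost', 9090);
var last = '';

// Poll seaport and regenerate the upstream file whenever the worker list changes
setInterval(function () {
    var workers = ports.query('pi-server');
    if (!workers.length) return; // nginx rejects an empty upstream block

    var body = 'upstream pi_servers {\n' + workers.map(function (w) {
        return '    server ' + w.host + ':' + w.port + ';';
    }).join('\n') + '\n}\n';

    if (body === last) return;
    last = body;

    // Hypothetical include path; adjust to wherever your nginx config lives
    fs.writeFile('/etc/nginx/conf.d/pi-upstream.conf', body, function (err) {
        if (err) return console.error(err);
        exec('nginx -s reload');
    });
}, 2000);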

Finally, up to this point I have not mentioned Node's cluster support, which would allow a single pi-server.js instance to take advantage of multiple cores and fully utilize a machine on its own.
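As a rough sketch of the idea (not code from this post), a master process could fork one worker per core, with each fork running the existing pi-server code and registering its own port with seaport. Requiring './pi-server' assumes the worker file sits alongside this script:

var cluster = require('cluster');
var os = require('os');

if (cluster.isMaster) {
    // Fork one pi worker per CPU core
    os.cpus().forEach(function () {
        cluster.fork();
    });

    // Replace any worker that dies
    cluster.on('exit', function () {
        cluster.fork();
    });
} else {
    // Each fork runs the existing worker, registering its own port with seaport
    require('./pi-server');
}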

Copyright © 2018 Mazira, LLC
All rights reserved.