19 Nov 2010

Taking Baby Steps with Node.js – Threads vs. Events


In a previous blog post, I provided a shallow introduction to Node.js. I also mentioned where you can find more information on how to get it installed on Windows, as well as how to install npm, a seemingly popular package manager in the JavaScript community.

In the meantime, I’ve started to get a clearer view of the general concepts on which Node.js is based, as well as the kind of applications that can be built using this server-side platform. The more I read and learn about Node.js, the more I come to the conclusion that it is very much targeted towards building real-time applications. Google Wave, FriendFeed and most recently Facebook are popular examples of such real-time applications. You can also read this article to learn more about other examples of real-time web applications.

As I briefly mentioned in the previous blog post, Node.js makes heavy use of JavaScript’s event-based style of programming, which lies at the heart of its capabilities for building real-time applications. This event-based model is a completely different way of thinking compared to the thread-based model that we’ve been so accustomed to over the past couple of years. ASP.NET Web Services, or WCF services for that matter, are excellent examples of the thread-based model. Every time a message comes in, these frameworks spawn a new thread or take one from the thread pool in order to handle the request. There’s nothing wrong with this approach. In fact, the thread-based model makes perfect sense for many of the scenarios out there, but generally not for real-time applications, which usually require long-lived connections.

In the thread-based model, most of the threads spend a lot of their time being blocked, waiting for I/O operations like executing queries against a database, calling another service or writing to a file on disk. These are expensive operations that usually take much longer to complete than in-memory operations. With large amounts of traffic, you can’t afford to have threads blocking for long periods of time. Otherwise you’ll hit the maximum number of available threads sooner rather than later.
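To put some rough numbers on this, here is a back-of-the-envelope sketch. It relies on Little's law: the average number of threads that are busy equals the request rate multiplied by the time each request holds on to its thread. The function name, the pool size of 100 and the traffic figures are all made up for illustration.

```javascript
// Estimates how many threads are busy on average when every request
// keeps its thread blocked for `blockedMs` milliseconds (Little's law).
function threadsNeeded(requestsPerSecond, blockedMs) {
    return Math.ceil(requestsPerSecond * blockedMs / 1000);
}

var poolSize = 100; // assumed maximum number of threads, purely illustrative

// 200 requests per second that each block for 50 ms need about 10 threads.
console.log(threadsNeeded(200, 50) <= poolSize);   // true: the pool copes

// The same traffic blocking for 2 seconds per request needs about 400.
console.log(threadsNeeded(200, 2000) <= poolSize); // false: the pool is exhausted
```

The traffic is identical in both cases. Only the time spent waiting for I/O changes, and that alone is enough to run out of threads.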

Node.js solves this by putting the event-based model at its core, using an event loop instead of threads. All the expensive I/O operations that we just talked about are executed asynchronously, with a callback that gets executed when the initiated operation completes. The net result is that while the I/O operation is busy performing its duties, Node.js is able to accept other incoming requests and start doing the work required to handle them. When the I/O operation completes, the specified callback is executed and the earlier request is processed further. The event loop switches between these requests very quickly, picking up where it previously left off. This event-based model provides the means for building highly scalable real-time applications.
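A minimal sketch of that ordering, assuming nothing more than the built-in fs module. The listDirectory helper and its step labels are hypothetical names used only for this illustration. It starts an asynchronous directory listing and records what happens in which order.

```javascript
var fs = require('fs');

// Starts an asynchronous directory listing and records the order of events.
function listDirectory(path, done) {
    var steps = [];

    steps.push('request issued');
    fs.readdir(path, function(error, entries) {
        // Runs later, once the operating system has produced the listing.
        steps.push('callback ran');
        done(error, entries, steps);
    });

    // Reached immediately: readdir does not wait for the result.
    steps.push('readdir returned');
}

listDirectory('.', function(error, entries, steps) {
    if (error) throw error;
    console.log(steps.join(' -> '));
});
```

The call to readdir returns before the callback runs, so 'readdir returned' is recorded ahead of 'callback ran'. In that gap the event loop is free to handle other work.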

Let me show you a naive example of this concept so you can get a feel for how this looks in code.

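var http = require('http');
var url = require('url');

http.createServer(function(request, response) {
    var feedUrl = 'http://feeds.feedburner.com/astronomycast.rss';
    var parsedUrl = url.parse(feedUrl);

    // Fire off the outgoing HTTP request; this call does not block.
    var feedRequest = http.request({
        hostname: parsedUrl.hostname,
        port: 80,
        path: parsedUrl.pathname
    });
    feedRequest.addListener('response', handle);
    feedRequest.addListener('error', function(error) {
        console.log('Fetching the feed failed: ' + error.message);
    });
    feedRequest.end();

    // Reply right away, without waiting for the feed to arrive.
    response.writeHead(200, { 'Content-Type': 'text/html' });
    response.end('Done processing the request.');
}).listen(8124);

function handle(response) {
    if (response.statusCode !== 200) {
        response.resume();  // discard the body so the connection is released
        return;
    }

    var responseBody = '';
    response.setEncoding('utf8');  // receive chunks as strings rather than Buffers

    // Even the body arrives asynchronously, one chunk at a time.
    response.addListener('data', function(chunk) {
        responseBody += chunk;
    });

    response.addListener('end', function() {
        console.log('All data has been read.');
        console.log(responseBody);
    });
}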

Our server implementation just reads the content of a particular RSS feed every time a request comes in. This code doesn’t do anything useful except illustrate the fact that when we make an HTTP request for an external resource, this HTTP request is fired off asynchronously. We need to subscribe an event listener that is notified when the response arrives, and then further listeners in order to read the requested data from the HTTP response. In the meantime, Node.js takes on other requests, firing new HTTP requests and going its merry way. Notice that even reading in the chunks of data from an HTTP response is done asynchronously!
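To see that overlap without depending on an external feed, here is a small self-contained sketch. The measureOverlap helper is a hypothetical name, and setTimeout merely stands in for a slow I/O operation. It starts a local server, sends it several requests at once and measures how long it takes until all of them are answered.

```javascript
var http = require('http');

// Starts a server whose handler takes `delayMs` to answer, fires `count`
// requests at it at the same time and reports the total elapsed time.
function measureOverlap(count, delayMs, done) {
    var server = http.createServer(function(request, response) {
        // setTimeout simulates waiting on a slow database or service call.
        setTimeout(function() {
            response.writeHead(200, { 'Content-Type': 'text/plain' });
            response.end('done');
        }, delayMs);
    });

    // Port 0 asks the operating system for any free port.
    server.listen(0, '127.0.0.1', function() {
        var options = { host: '127.0.0.1', port: server.address().port, path: '/', agent: false };
        var started = Date.now();
        var pending = count;

        for (var i = 0; i < count; i++) {
            http.get(options, function(res) {
                res.resume(); // the body is not needed, just let it drain
                res.addListener('end', function() {
                    pending -= 1;
                    if (pending === 0) {
                        var elapsed = Date.now() - started;
                        server.close(function() { done(elapsed); });
                    }
                });
            });
        }
    });
}

measureOverlap(5, 200, function(elapsedMs) {
    console.log('5 requests of 200 ms each finished in ' + elapsedMs + ' ms');
});
```

If the requests were handled one after the other, five of them would need about a second. Because the waiting overlaps, the total should land close to the duration of a single request.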

This isn’t very different from performing Ajax requests in a browser, now is it? Take a look at the following jQuery snippet and notice how similar it looks to the code for performing an HTTP request in our server-side example.

$.getJSON('http://myfancywebsite.com/something', function(data, status) {
    // Handles the data from the response
});

Earlier this week, I was listening to this excellent episode of Herding Code on Manos de Mono. This is a high-performance web application framework that targets the .NET platform, Mono in particular. I know there are new web frameworks popping out of the ground like mushrooms every day. But what particularly excites me about this one is that it’s based on the same high-performance event loop library as Node.js, which is called libev. I have to admit that I hadn’t heard of it before the Herding Code episode, but I’m definitely looking forward to spending some time on it as well.

As I mentioned before, I’m just learning about this stuff, so I’m happy to get your feedback, thoughts and so on. Till next time.

12 thoughts on “Taking Baby Steps with Node.js – Threads vs. Events”

  1. Sounds like the event loop used in monotorrent, which can be seen as a C# equivalent. For the record, WCF uses IO threads, so the blocking behavior is only in YOUR code, not theirs. There’s some documentation out there showing which .NET APIs use IO threads. You could always build your own event loop using IO threads and a message pump. That said, native support for these kinds of constructs is probably a big plus for getting up and running fast.

  2. Dude, your blog’s comment system is broken from a usability POV. It reports an error about my comment, and then it decides to keep/allow it anyway.

  3. @seagile
    Indeed, the blocking of IO threads is not caused by WCF but by the application programmer. The point I was trying to make is the difference between the thread and event models. As you mentioned, it’s possible to write your own event pump, but I haven’t seen anything like that on top of existing .NET frameworks. I’m definitely going to read the article you pointed out. Thanks for the feedback.

  4. Quite good baby steps. You’ll be able to walk soon! :P Have you read about async in C# 5.0? It seems that the language designers want your blocking operations to be tasked up as much as possible.

  5. @Scooletz
    I haven’t dived into the new async stuff in C# 5.0, but I have watched the session at this year’s PDC. I guess I need to learn more about that, although at first glance it just looks like the TPL tasks incorporated into the language. Not sure if you can use these new language features to build an event pump just like the one used for Node.js.

  6. Yes, you’re right, it’s a wrapper around the TPL that sets the rest of the method as a continuation of a started task. What I meant was that you’ll be able to produce more powerful applications in terms of the time usually wasted on I/O operations.

  7. Hi, great article. One question, though.
    You say:
    “The net result here is that while the I/O operation is busy performing its duties, Node.js is able to accept other incoming requests and start doing the work required to handle these tasks. When the I/O operation completes, the specified callback is executed and the earlier request is further processed.”

    The ‘I/O operation’ is doing its work on another thread? While Node.js is able to accept incoming requests, ‘that other work’ needs to be done on another thread… right?

    Thanks.

    1. Most likely on another thread, yes. The point is that you as a developer are completely shielded from this. It basically depends on how smart Node.js wants to be about it. Internally, Node.js makes use of a thread pool, so it decides whether to use a separate thread for this or not.
