Chapter 1. A Very Brief Introduction to Node.js

Node.js is many things, but mostly it’s a way of running JavaScript outside the web browser. This book will cover why that’s important and the benefits that Node.js provides. This introduction attempts to sum up that explanation in a few paragraphs, rather than a few hundred pages.

Many people use the JavaScript programming language extensively for programming the interfaces of websites. Node.js allows this popular programming language to be applied in many more contexts, in particular on web servers. Several notable features of Node.js make it worthy of interest.

Node is a wrapper around the high-performance V8 JavaScript runtime from the Google Chrome browser. Node tunes V8 to work better in contexts other than the browser, mostly by providing additional APIs that are optimized for specific use cases. For example, in a server context, manipulation of binary data is often necessary. This is poorly supported by the JavaScript language and, as a result, V8. Node’s Buffer class provides easy manipulation of binary data. Thus, Node doesn’t just provide direct access to the V8 JavaScript runtime. It also makes JavaScript more useful for the contexts in which people use Node.
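As a rough illustration of the binary-data point, the sketch below uses Buffer to build and read a handful of bytes. It assumes a current Node release, where Buffer.from and Buffer.alloc are the supported ways to create buffers (releases from the era of this chapter used new Buffer instead). The variable names and byte values are arbitrary.

```javascript
// Build a buffer from a UTF-8 string and inspect its raw bytes.
const greeting = Buffer.from('Node', 'utf8');
console.log(greeting.length);          // 4 (bytes, not characters)
console.log(greeting.toString('hex')); // 4e6f6465

// Allocate a zero-filled 4-byte buffer and write a 32-bit integer into it.
const packet = Buffer.alloc(4);
packet.writeUInt32BE(0xdeadbeef, 0);   // big-endian, starting at offset 0
console.log(packet[0].toString(16));   // de (the most significant byte)
console.log(packet.readUInt32BE(0) === 0xdeadbeef); // true
```

Reading and writing fixed-width integers at byte offsets is the kind of work that network protocols and file formats tend to require on a server.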

V8 itself uses some of the newest techniques in compiler technology. This often allows code written in a high-level language such as JavaScript to perform similarly to code written in a lower-level language, such as C, with a fraction of the development cost. This focus on performance is a key aspect of Node.

JavaScript is an event-driven language, and Node uses this to its advantage to produce highly scalable servers. Using an architecture called an event loop, Node makes programming such servers both easy and safe. There are various strategies for making servers performant. Node has chosen an architecture that performs very well but also reduces the complexity for the application developer. This is an extremely important feature. Programming concurrency is hard and fraught with danger. Node sidesteps this challenge while still offering impressive performance. As always, any approach has trade-offs, and these are discussed in detail later in the book.

To support the event-loop approach, Node supplies a set of “nonblocking” libraries. In essence, these are interfaces to things such as the filesystem or databases, which operate in an event-driven way. When you make a request to the filesystem, rather than requiring Node to wait for the hard drive to spin up and retrieve the file, the nonblocking interface simply notifies Node when it has access, in the same way that web browsers notify your code about an onclick event. This model simplifies access to slow resources in a scalable way that is intuitive to JavaScript programmers and easy to learn for everyone else.
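To make the idea concrete, here is a minimal sketch of a nonblocking filesystem read using fs.readFile from Node's standard library. The scratch file in the temporary directory and the order array are assumptions made purely for illustration.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical scratch file so the example has something to read.
const samplePath = path.join(os.tmpdir(), 'nonblocking-sample-' + process.pid + '.txt');
fs.writeFileSync(samplePath, 'hello from disk\n');

const order = [];

// Ask for the file; Node calls the function back once the data is ready.
fs.readFile(samplePath, 'utf8', function (err, contents) {
    if (err) throw err;
    order.push('file contents arrived');
    console.log(order.join(' -> '));
    fs.unlinkSync(samplePath); // clean up the scratch file
});

// This line runs before the callback above, because readFile
// returns immediately instead of waiting for the disk.
order.push('request made, moving on');
```

The callback passed to readFile plays the same role as an onclick handler in the browser: it is registered now and run later, when the thing it is waiting for has happened.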

Although not unique to Node, supporting JavaScript on the server is also a powerful feature. Whether we like it or not, the browser environment gives us little choice of programming languages. Certainly, JavaScript is the only choice if we would like our code to work in any reasonable percentage of browsers. To achieve any aspirations of sharing code between the server and the browser, we must use JavaScript. Due to the increasing complexity of client applications that we are building in the browser using JavaScript (such as Gmail), the more code we can share between the browser and the server, the more we can reduce the cost of creating rich web applications. Because we must rely on JavaScript in the browser, having a server-side environment that uses JavaScript opens the door to code sharing in a way that is not possible with other server-side languages, such as PHP, Java, Ruby, or Python. Although there are other platforms that support programming web servers with JavaScript, Node is quickly becoming the dominant platform in the space.
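One way such sharing might look is sketched below: a small validation helper written once and loadable in both environments. The function name isValidUsername and its rules are hypothetical, chosen only to illustrate the pattern.

```javascript
// A hypothetical helper that both the browser and the server could use.
function isValidUsername(name) {
    // 3 to 16 characters: letters, digits, or underscores.
    return typeof name === 'string' && /^[A-Za-z0-9_]{3,16}$/.test(name);
}

// Export for Node (CommonJS). In a browser <script> tag there is no
// module object, and the function declaration above is simply a global.
if (typeof module !== 'undefined' && module.exports) {
    module.exports = { isValidUsername: isValidUsername };
}

console.log(isValidUsername('sh1mmer'));           // true
console.log(isValidUsername('no spaces allowed')); // false
```

The browser can use the check to give immediate feedback on a form, and the server can run the identical check before trusting the submitted value.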

Aside from what you can build with Node, one extremely pleasing aspect is how much you can build for Node. Node is extremely extensible, with a large volume of community modules that have been built in the relatively short time since the project’s release. Many of these are drivers to connect with databases or other software, but many are also useful software applications in their own right.

The last reason to celebrate Node, but certainly not the least important, is its community. The Node project is still very young, and yet rarely have we seen such fervor around a project. Both novices and experts have coalesced around the project to use and contribute to Node, making it both a pleasure to explore and a supportive place to share and get advice.

Installing Node.js

Installing Node.js is extremely simple. Node runs on Windows, Linux, Mac, and other POSIX OSes (such as Solaris and BSD). Node.js is available from two primary locations: the project’s website or the GitHub repository. You’re probably better off with the Node website because it contains the stable releases. The latest cutting-edge features are hosted on GitHub for the core development team and anyone else who wants a copy. Although these features are new and often intriguing, they are also less reliable than those in a stable release.

Let’s get started by installing Node.js. The first thing to do is download Node.js from the website, so let’s go there and find the latest release. From the Node home page, find the download link. The current release at the time of print is 0.6.13, which is a stable release. The Node website provides installers for Windows and Mac as well as the stable source code. If you are on Linux, you can either do a source install or use your usual package manager (apt-get, yum, etc.).

Note

Node.js version numbers follow the C convention of major.minor.patch. Stable versions of Node.js have an even minor version number, and development versions have an odd minor version number. It’s unclear when Node will become version 1, but it’s a fair assumption that it will only be when the Windows and Unix combined release is considered mature.
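The even/odd rule can be expressed in a few lines. The helper below, describeVersion, is a hypothetical name and only models the scheme as this note describes it. It makes no claim about how other Node releases are numbered.

```javascript
// Classify a version string under the even/odd scheme described above.
function describeVersion(version) {
    const parts = version.split('.').map(Number);
    return {
        major: parts[0],
        minor: parts[1],
        patch: parts[2],
        stable: parts[1] % 2 === 0 // even minor number means stable
    };
}

console.log(describeVersion('0.6.13').stable); // true: 6 is even
console.log(describeVersion('0.7.5').stable);  // false: 7 is odd
```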

If you used an installer, you can skip to First Steps in Code. Otherwise (i.e., if you are doing a source install), once you have the code, you’ll need to unpack it. The tar command does this using the flags xzf. The x stands for extract (rather than compress), z tells tar to also decompress using the GZIP algorithm, and f indicates we are unpacking the filename given as the final argument (see Example 1-1).

Example 1-1. Unpacking the code

enki:Downloads $ tar xzf node-v0.6.6.tar.gz 
enki:Downloads $ cd node-v0.6.6
enki:node-v0.6.6 $ ls
AUTHORS       Makefile       common.gypi     doc        test
BSDmakefile   Makefile-gyp   configure       lib        tools
ChangeLog     README.md      configure-gyp   node.gyp   vcbuild.bat
LICENSE       benchmark      deps            src        wscript
enki:node-v0.6.6 $

The next step is to configure the code for your system. Node.js uses the configure/make system for its installation. The configure script looks at your system and finds the paths to the dependencies Node needs. Node generally has very few dependencies. The installer requires Python 2.4 or greater, and if you wish to use TLS or cryptography (such as SHA1), Node needs the OpenSSL development libraries. Running configure will let you know whether any of these dependencies are missing (see Example 1-2).

Example 1-2. Configuring the Node install

enki:node-v0.6.6 $ ./configure
Checking for program g++ or c++          : /usr/bin/g++ 
Checking for program cpp                 : /usr/bin/cpp 
Checking for program ar                  : /usr/bin/ar 
Checking for program ranlib              : /usr/bin/ranlib 
Checking for g++                         : ok  
Checking for program gcc or cc           : /usr/bin/gcc 
Checking for gcc                         : ok  
Checking for library dl                  : yes 
Checking for openssl                     : not found 
Checking for function SSL_library_init   : yes 
Checking for header openssl/crypto.h     : yes 
Checking for library util                : yes 
Checking for library rt                  : not found 
Checking for fdatasync(2) with c++       : no 
'configure' finished successfully (0.991s)
enki:node-v0.6.6 $

The next installation step is to make the project (Example 1-3). This compiles Node and places the resulting binary in an output subdirectory of the source directory we’ve been using (out/Release in the listing that follows). Node numbers each of the build steps it needs to complete so you can follow the progress it makes during the compile.

Example 1-3. Compiling Node with the make command

enki:node-v0.6.6 $ make
Waf: Entering directory `/Users/sh1mmer/Downloads/node-v0.6.6/out'
DEST_OS: darwin
DEST_CPU: x64
Parallel Jobs: 1
Product type: program
[ 1/35] copy: src/node_config.h.in -> out/Release/src/node_config.h
[ 2/35] cc: deps/http_parser/http_parser.c -> out/Release/deps/http_parser/http_parser_3.o
/usr/bin/gcc -rdynamic -pthread -arch x86_64 -g -O3 -DHAVE_OPENSSL=1 -D_LARGEFILE_SOURCE ...
[ 3/35] src/node_natives.h: src/node.js lib/dgram.js lib/console.js lib/buffer.js ...
[ 4/35] uv: deps/uv/include/uv.h -> out/Release/deps/uv/uv.a

...

Waf: Leaving directory `/Users/sh1mmer/Downloads/node-v0.6.6/out'
'build' finished successfully (2m53.573s)
-rwxr-xr-x  1 sh1mmer  staff   6.8M Jan  3 21:56 out/Release/node
enki:node-v0.6.6 $

The final step is to use make to install Node. First, Example 1-4 shows how to install Node globally for the whole system. This requires you to have either access to the root user or sudo privileges that let you act as root.

Example 1-4. Installing Node for the whole system

enki:node-v0.6.6 $ sudo make install
Password:
Waf: Entering directory `/Users/sh1mmer/Downloads/node-v0.6.6/out'
DEST_OS: darwin
DEST_CPU: x64
Parallel Jobs: 1
Product type: program
* installing deps/uv/include/ares.h as /usr/local/include/node/ares.h
* installing deps/uv/include/ares_version.h as /usr/local/include/node/ares_version.h
* installing deps/uv/include/uv.h as /usr/local/include/node/uv.h

...

* installing out/Release/src/node_config.h as /usr/local/include/node/node_config.h
Waf: Leaving directory `/Users/sh1mmer/Downloads/node-v0.6.6/out'
'install' finished successfully (0.915s)
enki:node-v0.6.6 $

If you want to install only for the local user and avoid using the sudo command, you need to run the configure script with the --prefix argument to tell Node to install somewhere other than the default (Example 1-5).

Example 1-5. Installing Node for a local user

enki:node-v0.6.6 $ mkdir ~/local
enki:node-v0.6.6 $ ./configure --prefix=$HOME/local
Checking for program g++ or c++          : /usr/bin/g++ 
Checking for program cpp                 : /usr/bin/cpp 

...

'configure' finished successfully (0.501s)
enki:node-v0.6.6 $ make && make install
Waf: Entering directory `/Users/sh1mmer/Downloads/node-v0.6.6/out'
DEST_OS: darwin
DEST_CPU: x64

...

* installing out/Release/node as /Users/sh1mmer/local/bin/node
* installing out/Release/src/node_config.h as /Users/sh1mmer/local/include/node/...
Waf: Leaving directory `/Users/sh1mmer/Downloads/node-v0.6.6/out'
'install' finished successfully (0.747s)
enki:node-v0.6.6 $

First Steps in Code

This section will take you through a basic Node program before we move on to more in-depth programs.

Node REPL

One of the things that’s often hard to understand about Node.js is that, in addition to being a server, it’s also a runtime environment in the same way that Perl, Python, and Ruby are. So, even though we often refer to Node.js as “server-side JavaScript,” that doesn’t fully describe what Node.js does. One of the best ways to come to grips with Node.js is to use Node REPL (“Read-Evaluate-Print-Loop”), an interactive Node.js programming environment. It’s great for testing out and learning about Node.js, and you can try out any of the snippets in this book using it. In addition, because Node is a wrapper around V8, Node REPL is an ideal place to try out JavaScript. When you want to run a Node program, however, write it in your favorite text editor, save it in a file, and run node filename.js. REPL is a great learning and exploration tool, but we don’t use it for production code.

Let’s launch Node REPL and try out a few bits of JavaScript to warm up (Example 1-6). Open up a console on your system. I’m using a Mac with a custom command prompt, so your system might look a little different, but the commands should be the same.

Example 1-6. Starting Node REPL and trying some JavaScript

Enki:~ $ node
> 3 > 2 > 1
false
> true == 1
true
> true === 1
false

Note

The first line, which evaluates to false, is from http://wtfjs.com, a collection of weird and amusing things about JavaScript.

Having a live programming environment is a really great learning tool, but you should know a few helpful features of Node REPL to make the most of it. It offers meta-commands, which all start with a period (.). Thus, .help shows the help menu, .clear clears the current context, and .exit quits Node REPL (see Example 1-7). The most useful command is .clear, which wipes out any variables or closures you have in memory without the need to restart REPL.

Example 1-7. Using the metafeatures in Node REPL

> console.log('Hello World');
Hello World
> .help
.clear  Break, and also clear the local context.
.exit   Exit the prompt
.help   Show repl options
> .clear
Clearing context...
> .exit
Enki:~ $

When using REPL, simply typing the name of a variable will enumerate it in the shell. Node tries to do this intelligently so a complex object won’t just be represented as a simple Object, but through a description that reflects what’s in the object (Example 1-8). The main exception to this involves functions. It’s not that REPL doesn’t have a way to enumerate functions; it’s that functions have the tendency to be very large. If REPL enumerated functions, a lot of output could scroll by.

Example 1-8. Setting and enumerating objects with REPL

Enki:~ $ node
> myObj = {};
{}
> myObj.list = ["a", "b", "c"];
[ 'a', 'b', 'c' ]
> myObj.doThat = function(first, second, third) { console.log(first); };
[Function]
> myObj
{ list: [ 'a', 'b', 'c' ]
, doThat: [Function]
}
>
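The same kind of description is available outside the REPL. The sketch below uses util.inspect from Node's standard library, which, as far as the current documentation describes, is what the REPL relies on to format values. The exact text of the output varies between Node versions, so the example checks for a fragment rather than a full string.

```javascript
const util = require('util');

const myObj = {};
myObj.list = ['a', 'b', 'c'];
myObj.doThat = function (first, second, third) { console.log(first); };

// util.inspect returns the description as a string instead of printing it.
const description = util.inspect(myObj);
console.log(description);

// The function is summarized rather than printed in full.
console.log(description.includes('[Function'));

// To see the source of a function, ask for it explicitly.
console.log(myObj.doThat.toString());
```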

A First Server

REPL gives us a great tool for learning and experimentation, but the main application of Node.js is as a server. One of the specific design goals of Node.js is to provide a highly scalable server environment. This is an area where Node differs from V8, which was described at the beginning of this chapter. Although the V8 runtime is used in Node.js to interpret the JavaScript, Node.js also uses a number of highly optimized libraries to make the server efficient. In particular, the HTTP module was written from scratch in C to provide a very fast nonblocking implementation of HTTP. Let’s take a look at the canonical Node “Hello World” example using an HTTP server (Example 1-9).

Example 1-9. A Hello World Node.js web server

var http = require('http'); 
http.createServer(function (req, res) { 
    res.writeHead(200, {'Content-Type': 'text/plain'}); 
    res.end('Hello World\n'); 
}).listen(8124, "127.0.0.1"); 
console.log('Server running at http://127.0.0.1:8124/');

The first thing that this code does is use require to include the HTTP library into the program. This concept is used in many languages, but Node uses the CommonJS module format, which we’ll talk about more in Chapter 8. The main thing to know at this point is that the functionality in the HTTP library is now assigned to the http object.

Next, we need an HTTP server. Unlike some languages, such as PHP, that run inside a server such as Apache, Node itself acts as the web server. However, that also means we have to create it. The next line calls a factory method from the HTTP module that creates new HTTP servers. The new HTTP server isn’t assigned to a variable. Instead, we use chaining to call listen directly on the object that createServer returns, telling it to listen on port 8124.
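For comparison, the unchained form of the same code might look like the sketch below. The variable name server is our own choice, and the callback passed to listen is an optional extra that Example 1-9 does not use.

```javascript
const http = require('http');

// Same server as Example 1-9, but kept in a variable instead of chained.
const server = http.createServer(function (req, res) {
    res.writeHead(200, {'Content-Type': 'text/plain'});
    res.end('Hello World\n');
});

// listen() returns immediately; the callback runs once the port is bound.
server.listen(8124, '127.0.0.1', function () {
    console.log('Server running at http://127.0.0.1:8124/');
});
```

Keeping a reference to the server is useful when you later need to call other methods on it, such as server.close().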

When calling createServer, we passed an anonymous function as an argument. This function is attached to the new server’s event listener for the request event. Events are central to both JavaScript and Node. In this case, whenever there is a new request to the web server, it will call the method we’ve passed to deal with the request. We call these kinds of methods callbacks because whenever an event happens, we “call back” all the methods listening for that event.

Perhaps a good analogy would be ordering a book from a bookshop. When your book is in stock, they call back to let you know you can come and collect it. This specific callback takes the arguments req for the request object and res for the response object.
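The bookshop analogy can be written down directly with the EventEmitter class from Node's events module, the same mechanism that HTTP servers use for their request event. The bookshop object, the in-stock event name, and the book title are all hypothetical.

```javascript
const EventEmitter = require('events');

// A hypothetical bookshop that announces when an ordered title arrives.
const bookshop = new EventEmitter();
const calls = [];

// Two customers register callbacks for the same event.
bookshop.on('in-stock', function (title) {
    calls.push('first customer called back about ' + title);
});
bookshop.on('in-stock', function (title) {
    calls.push('second customer called back about ' + title);
});

// When the event happens, every listener is called in registration order.
bookshop.emit('in-stock', 'Example Title');
console.log(calls.join('\n'));
```

Passing a function to createServer is shorthand for the same registration: it is equivalent to creating the server and then calling on('request', ...) with that function.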

Inside the function we created for the callback, we call a couple of methods on the res object. These calls modify the response. Example 1-9 doesn’t use the request, but typically you would use both the request and response objects.
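As a small sketch of what using the request might look like, the variation below echoes the HTTP method and URL back to the client. The req.method and req.url properties are part of Node's HTTP API. The names describeRequest and echoServer are our own.

```javascript
const http = require('http');

// Build a plain-text reply from details found on the request object.
function describeRequest(req) {
    return req.method + ' ' + req.url + '\n';
}

const echoServer = http.createServer(function (req, res) {
    res.writeHead(200, {'Content-Type': 'text/plain'});
    res.end(describeRequest(req));
});

echoServer.listen(8124, '127.0.0.1');
```

With this server running, a browser visiting http://127.0.0.1:8124/books would see the path it asked for echoed back in the reply.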

The first thing we must do is set the HTTP response header. We can’t send any actual response to the client without it. The res.writeHead method does this. We set the status code to 200 (“200 OK”) and pass a list of HTTP headers. In this case, the only header we specify is Content-Type.

After we’ve written the HTTP header to the client, we can write the HTTP body. In this case, we use a single method to both write the body and close the connection. The end method closes the HTTP connection, but since we also passed it a string, it will send that to the client before it closes the connection.
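When the body is produced in several pieces, the writing and the closing can be separated. The sketch below uses res.write, which is part of the same response API, before finishing with res.end. Splitting “Hello World” into pieces is only for illustration.

```javascript
const http = require('http');

const server = http.createServer(function (req, res) {
    // Status code and headers go first.
    res.writeHead(200, {'Content-Type': 'text/plain'});
    // write() sends part of the body and leaves the response open.
    res.write('Hello ');
    res.write('World');
    // end() sends an optional final piece and finishes the response.
    res.end('\n');
});

server.listen(8124, '127.0.0.1');
```

The client receives the same body as in Example 1-9. The difference is only in how the server assembles it.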

Finally, the last line of our example uses console.log. This simply prints information to stdout, much like the browser counterpart supported by Firebug and Web Inspector.

Let’s run this with Node.js on the console and see what we get (Example 1-10).

Example 1-10. Running the Hello World example

Enki:~ $ node
> var http = require('http'); 
> http.createServer(function (req, res) { 
...     res.writeHead(200, {'Content-Type': 'text/plain'}); 
...     res.end('Hello World\n'); 
...   }).listen(8124, "127.0.0.1"); 
> console.log('Server running at http://127.0.0.1:8124/'); 
Server running at http://127.0.0.1:8124/ 
>

Here we start a Node REPL and type in the code from the sample (we’ll forgive you for pasting it from the website). Node REPL accepts the code, using ... to indicate that you haven’t completed the statement and should continue entering it. When we run the console.log line, Node REPL prints out Server running at http://127.0.0.1:8124/. Now we are ready to call our Hello World example in a web browser (Figure 1-1).

Figure 1-1. Viewing the Hello World web server from a browser
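The same check can also be scripted from Node itself. The sketch below starts the Hello World server, requests the page with http.get, prints what comes back, and then shuts the server down. The helper fetchText is a hypothetical name, and the agent: false option is an assumption made so that the connection is not kept open after the reply.

```javascript
const http = require('http');

const helloServer = http.createServer(function (req, res) {
    res.writeHead(200, {'Content-Type': 'text/plain'});
    res.end('Hello World\n');
});

// Fetch a URL and hand the status code and body to a callback.
function fetchText(url, callback) {
    http.get(url, { agent: false }, function (res) {
        let body = '';
        res.setEncoding('utf8');
        res.on('data', function (chunk) { body += chunk; });
        res.on('end', function () { callback(null, res.statusCode, body); });
    }).on('error', callback);
}

helloServer.listen(8124, '127.0.0.1', function () {
    fetchText('http://127.0.0.1:8124/', function (err, status, body) {
        if (err) throw err;
        console.log(status);  // the status code set by writeHead
        console.log(body);    // the text passed to end
        helloServer.close();  // stop listening so the script can exit
    });
});
```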

It works! Although this isn’t exactly a stunning demo, it is notable that we got Hello World working in six lines of code. Not that we would recommend that style of coding, but we are starting to get somewhere. In the next chapter, we’ll look at a lot more code, but first let’s think about why Node is how it is.

Why Node?

In writing this book, we’ve been acutely aware of how new Node.js is. Many platforms take years to find adoption, and yet there’s a level of excitement around Node.js that we’ve never seen before in such a young platform. We hope that by looking at the reasons other people are getting so excited about Node.js, you will find features that also resonate with you. By looking at Node.js’s strengths, we can find the places where it is most applicable. This section looks at the factors that have come together to create a space for Node.js and discusses the reasons why it’s become so popular in such a short time.

High-Performance Web Servers

When we first started writing web applications more than 10 years ago, the Web was much smaller. Sure, we had the dot-com bubble, but the sheer volume of people on the Internet was considerably lower, and the sites we made were much less ambitious. Fast-forward to today, and we have the advent of Web 2.0 and widely available Internet connections on cell phones. So much more is expected of us as developers. Not only are the features we need to deliver more complex, more interactive, and more real, but there are also many more people using them more often and from more devices than ever before. This is a pretty steep challenge. While hardware continues to improve, we also need to make improvements to our software development practices to support such demands. If we kept just buying hardware to support ever-increasing features or users, it wouldn’t be very cost-effective.

Node is an attempt to solve this problem by introducing the architecture called event-driven computing to the programming space for web servers. As it turns out, Node isn’t the first platform to do this, but it is by far the most successful, and we would argue that it is the easiest to use. We are going to talk about event-driven programming in a lot more detail later in this book, but let’s go through a quick intro here. Imagine you connect to a web server to get a web page. The time to reach that web server is probably 100ms or so over a reasonable DSL connection. When you connect to a typical web server, it creates a new instance of a program on the server that represents your request. That program runs from the top to the bottom (following all of the function calls) to provide your web page. This means that the server has to allocate a fixed amount of memory to that request until it is totally finished, including the 100ms+ to send the data back to you. Node doesn’t work that way. Instead, Node keeps all users in a single program. Whenever Node has to do something slow, like wait for a confirmation that you got your data (so it can mark your request as finished), it simply moves on to another user. We’re glossing over the details a bit, but this means Node is much more efficient with memory than traditional servers and can keep providing a very fast response time with lots and lots of concurrent users. This is a huge win. This approach is one of the main reasons people like Node.
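The “moves on to another user” behavior can be imitated in a few lines. In the sketch below, setTimeout stands in for any slow operation (disk, database, or network), and the user names and delays are made up. It is a toy model of the idea rather than a picture of Node's internals.

```javascript
// Simulate three users whose requests each wait on something slow.
const log = [];

function handleUser(name, delayMs, done) {
    log.push(name + ' started');
    // Instead of waiting here, register a callback and return at once.
    setTimeout(function () {
        log.push(name + ' finished');
        done();
    }, delayMs);
}

let remaining = 3;
function finished() {
    remaining -= 1;
    if (remaining === 0) {
        console.log(log.join('\n'));
    }
}

handleUser('alice', 30, finished);
handleUser('bob', 10, finished);
handleUser('carol', 20, finished);
```

All three requests are started before any of them finishes, and a single program keeps track of them all. With these delays, the user with the shortest wait would be expected to finish first, regardless of who arrived first.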

Professionalism in JavaScript

Another reason people like Node is JavaScript. JavaScript was created by Brendan Eich in 1995 to be a simple scripting language for use in web pages on the Netscape browser platform. Surprisingly, almost since its inception JavaScript has been used in nonbrowser settings. Some of the early Netscape server products supported JavaScript (known then as LiveScript) as a server-side scripting language. Although server-side JavaScript didn’t really catch on then, that certainly wasn’t true for the exploding browser market. On the Web, JavaScript competed with Microsoft’s VBScript to provide programming functionality in web pages. It’s hard to say why JavaScript won, but perhaps Microsoft allowing JavaScript in Internet Explorer did it,^[1] or perhaps it was the JavaScript language itself, but win it did. This meant by the early 2000s, JavaScript had emerged as the web language—not just the first choice, but the only choice for programming with HTML in browsers.

What does this have to do with Node.js? Well, the important thing to remember is that when the AJAX revolution happened and the Web became big business (think Yahoo!, Amazon, Google, etc.), the only choice for the “J” in AJAX was JavaScript. There simply wasn’t an alternative. As a result, a whole industry needed an awful lot of JavaScript programmers, and really good ones at that, rather fast. The emergence of the Web as a serious platform and JavaScript as its programming language meant that we, as JavaScript programmers, needed to shape up. As JavaScript moved from being a programmer’s second or third language to a primary one, the perception of its importance changed with it. We started to get emerging experts who led the charge in making JavaScript respectable.

Arguably at the head of this movement was Douglas Crockford. His popular articles and videos on JavaScript have helped many programmers discover that inside this much-maligned language there is a lot of beauty. Most programmers working with JavaScript spent the majority of their time working with the browser implementation of the W3C DOM API for manipulating HTML or XML documents. Unfortunately, the DOM is probably not the prettiest API ever conceived, but worse, its various implementations in the browsers are inconsistent and incomplete. No wonder that for a decade after its release JavaScript was not thought of as a “proper” language by so many programmers. More recently, Douglas’s work on “the good parts” of JavaScript has helped create a movement of advocates of the language who recognize that it has a lot going for it, despite the warts.

In 2012, we now have a proliferation of JavaScript experts advocating well-written, performant, maintainable JavaScript code. People such as Douglas Crockford, Dion Almaer, Peter Paul Koch (PPK), John Resig, Alex Russell, Thomas Fuchs, and many more have provided research, advice, tools, and primarily libraries that have allowed thousands of professional JavaScript programmers worldwide to practice their trade with a spirit of excellence. Libraries such as jQuery, YUI, Dojo, Prototype, MooTools, Sencha, and many others are now used daily by thousands of people and deployed on millions of websites. It is in this environment, where JavaScript is not only accepted but widely used and celebrated, that a platform larger than the Web makes sense. When so many programmers know JavaScript, its ubiquity is a distinct advantage.

When a roomful of web programmers is asked what languages they use, Java and PHP are very popular, Ruby is probably the next most popular these days (or at least closely tied with Python), and Perl still has a huge following. However, almost without exception, anyone who does any programming for the Web has programmed in JavaScript. Although backend languages are fractured, in-browser programming is united by the necessities of deployment. Various browsers and browser plug-ins allow the use of other languages, but they simply aren’t universal enough for the Web. So here we are with a single, universal web language. How can we get it on the server?

Browser Wars 2.0

Fairly early in the days of the Web, we had the infamous browser wars. Internet Explorer and Netscape competed viciously on web features, adding various incompatible programmatic features to their own browser and not supporting the features in the other browser. For those of us who programmed for the Web, this was the cause of much anguish because it made web programming really tiresome. Internet Explorer more or less emerged as the winner of that round and became the dominant browser. Fast-forward a few years, and Internet Explorer has been languishing at version 6, and a new contender, Firefox, emerges from the remnants of Netscape. Firefox kicks off a resurgence in browsers, followed by WebKit (Safari) and then Chrome. Most interesting about this current trend is the resurgence of competition in the browser market.

Unlike the first iteration of the browser wars, today’s browsers compete on two fronts: adhering to the standards that emerged after the previous browser war, and performance. As websites have become more complex, users want the fastest experience possible. This has meant that browsers not only need to support the web standards well, allowing developers to optimize, but also to do a little optimization of their own. With JavaScript as a core component of Web 2.0 and AJAX websites, JavaScript performance has become part of the battleground.

Each browser has its own JavaScript runtime: SpiderMonkey for Firefox, SquirrelFish Extreme for Safari, Carakan for Opera, and finally V8 for Chrome. As these runtimes compete on performance, they create an environment of innovation for JavaScript. And in order to differentiate their browsers, vendors are going to great lengths to make them as fast as possible.

^[1] Internet Explorer doesn’t actually support JavaScript or ECMAScript; it supports a language variety called JScript. In recent years, JScript has fully supported the ECMAScript 3 standard and has some ECMAScript 5 support. However, JScript also implements proprietary extensions in the same way that Mozilla JavaScript does and has features that ECMAScript does not.