Node.js is many things, but mostly it’s a way of running JavaScript outside the web browser. This book will cover why that’s important and the benefits that Node.js provides. This introduction attempts to sum up that explanation in a few paragraphs, rather than a few hundred pages.
Many people use the JavaScript programming language extensively for programming the interfaces of websites. Node.js allows this popular programming language to be applied in many more contexts, in particular on web servers. There are several notable features about Node.js that make it worthy of interest.
Node is a wrapper around the high-performance
V8 JavaScript runtime from the Google Chrome browser. Node
tunes V8 to work better in contexts other than the browser, mostly by
providing additional APIs that are optimized for specific use cases. For
example, in a server context, manipulation of binary data is often
necessary. This is poorly supported by the JavaScript language and, as a
result, V8. Node’s Buffer class
provides easy manipulation of binary data. Thus, Node doesn’t
just provide direct access to the V8 JavaScript runtime. It also makes
JavaScript more useful for the contexts in which people use Node.
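As a brief illustration of the kind of binary manipulation described above, here is a minimal sketch using Buffer. It assumes a current Node.js release, where Buffer.from() is the supported way to create a buffer; the variable names and byte values are made up for the example.

```javascript
// Assumes a current Node.js release, where Buffer.from() is the supported
// constructor (older releases used `new Buffer(...)` instead).
var greeting = Buffer.from('Hello', 'utf8');

// Each element is one byte, readable and writable by index.
console.log(greeting.length);           // number of bytes: 5
console.log(greeting[0]);               // 72, the byte value of "H"
console.log(greeting.toString('hex'));  // 48656c6c6f

// Overwrite the first byte in place with the code for "J".
greeting[0] = 0x4a;
console.log(greeting.toString('utf8')); // Jello

// Read two bytes as a single big-endian 16-bit integer.
var header = Buffer.from([0x01, 0x02]);
console.log(header.readUInt16BE(0));    // 258 (0x0102)
```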
V8 itself uses some of the newest techniques in compiler technology. This often allows code written in a high-level language such as JavaScript to perform similarly to code written in a lower-level language, such as C, with a fraction of the development cost. This focus on performance is a key aspect of Node.
JavaScript is an event-driven language, and Node uses this to its advantage to produce highly scalable servers. Using an architecture called an event loop, Node makes programming highly scalable servers both easy and safe. There are various strategies that are used to make servers performant. Node has chosen an architecture that performs very well but also reduces the complexity for the application developer. This is an extremely important feature. Programming concurrency is hard and fraught with danger. Node sidesteps this challenge while still offering impressive performance. As always, any approach still has trade-offs, and these are discussed in detail later in the book.
To support the event-loop approach, Node
supplies a set of “nonblocking” libraries. In essence, these are interfaces
to things such as the filesystem or databases, which operate in an
event-driven way. When you make a request to the filesystem, rather than
requiring Node to wait for the hard drive to spin up and retrieve the file,
the nonblocking interface simply notifies Node when it has access, in the
same way that web browsers notify your code about an onclick event. This
model simplifies access to slow resources in a scalable way that is
intuitive to JavaScript programmers and easy to learn for everyone
else.
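The nonblocking style described above can be sketched with the filesystem module. This is only an illustration: the scratch file and its name are hypothetical, created so the example has something to read.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// Hypothetical scratch file, created only so the example has something to read.
var file = path.join(os.tmpdir(), 'nonblocking-demo-' + process.pid + '.txt');
fs.writeFileSync(file, 'file contents');

var order = [];

// readFile returns immediately; the callback runs later, once the data is ready.
fs.readFile(file, 'utf8', function (err, data) {
  if (err) throw err;
  order.push('callback received: ' + data);
  console.log(order);
  fs.unlinkSync(file); // clean up the scratch file
});

// This line runs before the callback above, because nothing waited on the disk.
order.push('readFile has returned');
```

The order array ends up with the "readFile has returned" entry first, which is the point: the program kept going while the file was being fetched.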
Although not unique to Node, supporting JavaScript on the server is also a powerful feature. Whether we like it or not, the browser environment gives us little choice of programming languages. Certainly, JavaScript is the only choice if we would like our code to work in any reasonable percentage of browsers. To achieve any aspirations of sharing code between the server and the browser, we must use JavaScript. Due to the increasing complexity of client applications that we are building in the browser using JavaScript (such as Gmail), the more code we can share between the browser and the server, the more we can reduce the cost of creating rich web applications. Because we must rely on JavaScript in the browser, having a server-side environment that uses JavaScript opens the door to code sharing in a way that is not possible with other server-side languages, such as PHP, Java, Ruby, or Python. Although there are other platforms that support programming web servers with JavaScript, Node is quickly becoming the dominant platform in the space.
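To make the idea of shared code concrete, here is a small sketch of a file that could be loaded both by Node and by a browser page. The isValidUsername helper and its rules are hypothetical, chosen only for illustration.

```javascript
// A hypothetical validation rule we would like to enforce in both places:
// 3 to 16 characters, lowercase letters, digits, or underscores.
function isValidUsername(name) {
  return typeof name === 'string' && /^[a-z0-9_]{3,16}$/.test(name);
}

// Under Node, expose the function as a CommonJS export. In a browser the
// top-level declaration above is already visible to other scripts on the page.
if (typeof module !== 'undefined' && module.exports) {
  module.exports.isValidUsername = isValidUsername;
}

console.log(isValidUsername('enki_01')); // true
console.log(isValidUsername('no'));      // false (too short)
```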
Aside from what you can build with Node, one extremely pleasing aspect is how much you can build for Node. Node is extremely extensible, with a large volume of community modules that have been built in the relatively short time since the project’s release. Many of these are drivers to connect with databases or other software, but many are also useful software applications in their own right.
The last reason to celebrate Node, but certainly not the least important, is its community. The Node project is still very young, and yet rarely have we seen such fervor around a project. Both novices and experts have coalesced around the project to use and contribute to Node, making it both a pleasure to explore and a supportive place to share and get advice.
Installing Node.js is extremely simple. Node runs on Windows, Linux, Mac, and other POSIX OSes (such as Solaris and BSD). Node.js is available from two primary locations: the project’s website or the GitHub repository. You’re probably better off with the Node website because it contains the stable releases. The latest cutting-edge features are hosted on GitHub for the core development team and anyone else who wants a copy. Although these features are new and often intriguing, they are also less reliable than those in a stable release.
Let’s get started by installing Node.js. The
first thing to do is download Node.js from the website, so let’s go there
and find the latest release. From the Node home page, find the download
link. The current release at the time of print is 0.6.13, which is a
stable release. The Node website provides installers for Windows and Mac
as well as the stable source code. If you are on Linux, you can either do
a source install or use your usual package manager (apt-get,
yum, etc.).
Node.js version
numbers follow the C convention of
major.minor.patch. Stable versions of Node.js
have an even minor version number, and development versions have an odd
minor version number. It’s unclear when Node will become version 1, but
it’s a fair assumption that it will only be when the Windows and Unix
combined release is considered mature.
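The even/odd rule is easy to express in code. The following sketch uses a hypothetical helper, isStableVersion, which is not part of Node; it simply applies the convention described above to a version string.

```javascript
// Classify a version string under the even/odd minor-number convention
// described above. isStableVersion is a hypothetical helper, not a Node API.
function isStableVersion(version) {
  var parts = version.split('.');     // [major, minor, patch]
  var minor = parseInt(parts[1], 10);
  return minor % 2 === 0;             // even minor => stable
}

console.log(isStableVersion('0.6.13')); // true
console.log(isStableVersion('0.7.5'));  // false
```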
If you used an installer, you can skip to
First Steps in Code. Otherwise (i.e., if you are doing a
source install), once you have the code, you’ll need to unpack it. The
tar command does this using the flags xzf. The x
stands for extract (rather than compress), z tells tar to also
decompress using the GZIP algorithm, and f
indicates we are unpacking the filename given as the final argument (see
Example 1-1).
The next step is to configure the code for
your system. Node.js uses the configure/make system for its installation.
The configure script
looks at your system and finds the paths Node needs to use for the
dependencies it needs. Node generally has very few dependencies. The
installer requires Python 2.4 or greater, and if you wish to use TLS or
cryptography (such as SHA1 hashing), Node needs the OpenSSL development
libraries. Running configure will let you know
whether any of these dependencies are missing (see Example 1-2).
Example 1-2. Configuring the Node install
enki:node-v0.6.6 $ ./configure
Checking for program g++ or c++ : /usr/bin/g++
Checking for program cpp : /usr/bin/cpp
Checking for program ar : /usr/bin/ar
Checking for program ranlib : /usr/bin/ranlib
Checking for g++ : ok
Checking for program gcc or cc : /usr/bin/gcc
Checking for gcc : ok
Checking for library dl : yes
Checking for openssl : not found
Checking for function SSL_library_init : yes
Checking for header openssl/crypto.h : yes
Checking for library util : yes
Checking for library rt : not found
Checking for fdatasync(2) with c++ : no
'configure' finished successfully (0.991s)
enki:node-v0.6.6 $
The next installation step is to make the project (Example 1-3). This compiles Node and builds the binary
version that you will use into a build subdirectory of the source
directory we’ve been using. Node numbers each of the build steps it needs
to complete so you can follow the progress it makes during the
compile.
Example 1-3. Compiling Node with the make command
enki:node-v0.6.6 $ make
Waf: Entering directory `/Users/sh1mmer/Downloads/node-v0.6.6/out'
DEST_OS: darwin
DEST_CPU: x64
Parallel Jobs: 1
Product type: program
[ 1/35] copy: src/node_config.h.in -> out/Release/src/node_config.h
[ 2/35] cc: deps/http_parser/http_parser.c -> out/Release/deps/http_parser/http_parser_3.o
/usr/bin/gcc -rdynamic -pthread -arch x86_64 -g -O3 -DHAVE_OPENSSL=1 -D_LARGEFILE_SOURCE ...
[ 3/35] src/node_natives.h: src/node.js lib/dgram.js lib/console.js lib/buffer.js ...
[ 4/35] uv: deps/uv/include/uv.h -> out/Release/deps/uv/uv.a
...
Waf: Leaving directory `/Users/sh1mmer/Downloads/node-v0.6.6/out'
'build' finished successfully (2m53.573s)
-rwxr-xr-x 1 sh1mmer staff 6.8M Jan 3 21:56 out/Release/node
enki:node-v0.6.6 $
The final step is to use make to install Node. First, Example 1-4 shows how to install Node globally for the
whole system. This requires you to have either access to the root user or sudo privileges that let you act as root.
Example 1-4. Installing Node for the whole system
enki:node-v0.6.6 $ sudo make install
Password:
Waf: Entering directory `/Users/sh1mmer/Downloads/node-v0.6.6/out'
DEST_OS: darwin
DEST_CPU: x64
Parallel Jobs: 1
Product type: program
* installing deps/uv/include/ares.h as /usr/local/include/node/ares.h
* installing deps/uv/include/ares_version.h as /usr/local/include/node/ares_version.h
* installing deps/uv/include/uv.h as /usr/local/include/node/uv.h
...
* installing out/Release/src/node_config.h as /usr/local/include/node/node_config.h
Waf: Leaving directory `/Users/sh1mmer/Downloads/node-v0.6.6/out'
'install' finished successfully (0.915s)
enki:node-v0.6.6 $
If you want to install only for the local user
and avoid using the sudo command, you
need to run the configure script with
the --prefix argument to tell Node to
install somewhere other than the default (Example 1-5).
Example 1-5. Installing Node for a local user
enki:node-v0.6.6 $ mkdir ~/local
enki:node-v0.6.6 $ ./configure --prefix=~/local
Checking for program g++ or c++ : /usr/bin/g++
Checking for program cpp : /usr/bin/cpp
...
'configure' finished successfully (0.501s)
enki:node-v0.6.6 $ make && make install
Waf: Entering directory `/Users/sh1mmer/Downloads/node-v0.6.6/out'
DEST_OS: darwin
DEST_CPU: x64
...
* installing out/Release/node as /Users/sh1mmer/local/bin/node
* installing out/Release/src/node_config.h as /Users/sh1mmer/local/include/node/...
Waf: Leaving directory `/Users/sh1mmer/Downloads/node-v0.6.6/out'
'install' finished successfully (0.747s)
enki:node-v0.6.6 $
This section will take you through a basic Node program before we move on to more in-depth programs.
One of the things that’s often hard to understand about Node.js is that, in
addition to being a server, it’s also a runtime environment in the same
way that Perl, Python, and Ruby are. So, even though we often refer to
Node.js as “server-side JavaScript,” that doesn’t really accurately
describe what Node.js does. One of the best ways to come to grips with
Node.js is to use Node REPL (“Read-Evaluate-Print-Loop”), an interactive
Node.js programming environment. It’s great for testing out and learning
about Node.js. You can try out any of the snippets in this book using
Node REPL. In addition, because Node is a wrapper around V8, Node REPL is an ideal place to easily try out
JavaScript. However, when you want to run a Node program, you can use
your favorite text editor, save it in a file, and simply run node filename.js. REPL is a great learning and
exploration tool, but we don’t use it for production code.
Let’s launch Node REPL and try out a few bits of JavaScript to warm up (Example 1-6). Open up a console on your system. I’m using a Mac with a custom command prompt, so your system might look a little different, but the commands should be the same.
The first line, which evaluates to
false, is from http://wtfjs.com, a collection of weird and amusing
things about JavaScript.
Having a live programming environment is a really great learning tool, but you
should know a few helpful features of Node REPL to make the most of it.
It offers meta-commands, which all start with a period (.). Thus,
.help shows the help menu, .clear
clears the current context, and .exit quits Node REPL (see Example 1-7).
The most useful command is .clear,
which wipes out any variables or closures you have in memory without the
need to restart REPL.
When using REPL, simply typing the name of a
variable will enumerate it in the shell. Node tries to do this
intelligently so a complex object won’t just be represented as a
simple Object, but
through a description that reflects what’s in the object (Example 1-8). The main exception to this involves
functions. It’s not that REPL doesn’t have a way to
enumerate functions; it’s that functions have the tendency to be very
large. If REPL enumerated functions, a lot of output could scroll by.
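The same kind of description can be produced outside REPL. The sketch below assumes REPL formats values the way util.inspect does, which is the default in current Node releases; the user object is made up for illustration.

```javascript
var util = require('util');

// A made-up object with plain data and one function property.
var user = {
  name: 'enki',
  tags: ['admin', 'dev'],
  greet: function () { return 'hello'; }
};

// Data properties are written out in full; the function is shown only as a
// short placeholder rather than as its source code.
var description = util.inspect(user);
console.log(description);
```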
REPL gives us a great tool for learning and experimentation, but the main application of Node.js is as a server. One of the specific design goals of Node.js is to provide a highly scalable server environment. This is an area where Node differs from V8, which was described at the beginning of this chapter. Although the V8 runtime is used in Node.js to interpret the JavaScript, Node.js also uses a number of highly optimized libraries to make the server efficient. In particular, the HTTP module was written from scratch in C to provide a very fast nonblocking implementation of HTTP. Let’s take a look at the canonical Node “Hello World” example using an HTTP server (Example 1-9).
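Example 1-9 as a standalone file, reconstructed here from the REPL session in Example 1-10. Saved under a name such as hello.js (a hypothetical filename), it can be run with node hello.js.

```javascript
var http = require('http');
http.createServer(function (req, res) {
  res.writeHead(200, {'Content-Type': 'text/plain'});
  res.end('Hello World\n');
}).listen(8124, "127.0.0.1");
console.log('Server running at http://127.0.0.1:8124/');
```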
The first thing that this code does is
use require to include the
HTTP library into the program. This concept is used in
many languages, but Node uses the CommonJS module format, which we’ll talk about more in
Chapter 8. The main thing to know at this point is
that the functionality in the HTTP library is now assigned to the
http object.
Next, we need an HTTP server. Unlike some
languages, such as PHP, that run inside a server such as Apache, Node
itself acts as the web server. However, that also means we have to
create it. The next line calls a factory method from the HTTP module
that creates new HTTP servers. The new HTTP server isn’t assigned to a
variable; it’s simply going to be an anonymous object in the global
scope. Instead, we use chaining to initialize the server and tell it to
listen on port 8124.
When calling createServer, we passed an anonymous function as an argument. This function
is attached to the new server’s event listener for the request event. Events
are central to both JavaScript and Node. In this case, whenever there is
a new request to the web server, it will call the method we’ve passed to
deal with the request. We call these kinds of methods
callbacks because whenever an event happens, we “call back” all the
methods listening for that event.
Perhaps a good analogy would be ordering a
book from a bookshop. When your book is in stock, they call
back to let you know you can come and collect it. This
specific callback takes the arguments req for the request object and res for the response object.
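The bookshop analogy can be sketched directly with Node’s EventEmitter. The bookshop object, the inStock event name, and the customers are all hypothetical; the point is only that every callback registered for an event is called when that event happens.

```javascript
var EventEmitter = require('events').EventEmitter;

// A hypothetical bookshop that announces when an ordered title arrives.
var bookshop = new EventEmitter();
var calls = [];

// Two customers each leave a callback for the same event.
bookshop.on('inStock', function (title) {
  calls.push('Customer A collects ' + title);
});
bookshop.on('inStock', function (title) {
  calls.push('Customer B collects ' + title);
});

// When the event happens, every registered callback is called in turn,
// in the order the callbacks were registered.
bookshop.emit('inStock', 'the ordered book');
console.log(calls);
```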
Inside the function we created for the
callback, we call a couple of methods on the res object. These calls modify the response.
Example 1-9 doesn’t use the request, but
typically you would use both the request and response objects.
The first thing we must
do is set the HTTP response header. We can’t send any actual response to the client without
it. The res.writeHead method does
this. We set the value to 200 (for
the HTTP status code “200 OK”) and pass a list of HTTP headers. In this
case, the only header we specify is Content-Type.
After we’ve written the HTTP header to the
client, we can write the HTTP body. In this case, we use a single method
to both write the body and close the connection. The end method closes the HTTP connection, but since we also passed it a
string, it will send that to the client before it closes the
connection.
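For comparison, here is a sketch of the same response with the body and the close written as two separate calls, using res.write followed by res.end with no arguments. Listening on port 0, which asks the operating system for any free port, is our assumption here so that the sketch does not collide with a server already running on 8124.

```javascript
var http = require('http');

// Same response as the Hello World server, with the body and the close
// written as separate steps.
var server = http.createServer(function (req, res) {
  res.writeHead(200, {'Content-Type': 'text/plain'});
  res.write('Hello World\n'); // send the body...
  res.end();                  // ...then finish the response
});

// Port 0 asks the operating system for any free port (an assumption made so
// the sketch does not collide with a server already on 8124).
server.listen(0, '127.0.0.1', function () {
  console.log('Listening on port ' + server.address().port);
});
```

Passing the string to end, as the Hello World example does, is simply the shorter way to write the last two calls.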
Finally, the last line of our example uses
console.log. This simply prints information to stdout, much like the browser counterpart
supported by Firebug and Web Inspector.
Let’s run this with Node.js on the console and see what we get (Example 1-10).
Example 1-10. Running the Hello World example
Enki:~ $ node
> var http = require('http');
> http.createServer(function (req, res) {
... res.writeHead(200, {'Content-Type': 'text/plain'});
... res.end('Hello World\n');
... }).listen(8124, "127.0.0.1");
> console.log('Server running at http://127.0.0.1:8124/');
Server running at http://127.0.0.1:8124/
node>
Here we start a Node REPL and type in the
code from the sample (we’ll forgive you for pasting it from the
website). Node REPL accepts the code, using ... to indicate that you haven’t completed the
statement and should continue entering it. When we run the console.log line, Node REPL prints
out Server running at
http://127.0.0.1:8124/. Now we are ready to call our Hello
World example in a web browser (Figure 1-1).
It works! Although this isn’t exactly a stunning demo, it is notable that we got Hello World working in six lines of code. Not that we would recommend that style of coding, but we are starting to get somewhere. In the next chapter, we’ll look at a lot more code, but first let’s think about why Node is how it is.
In writing this book, we’ve been acutely aware of how new Node.js is. Many platforms take years to find adoption, and yet there’s a level of excitement around Node.js that we’ve never seen before in such a young platform. We hope that by looking at the reasons other people are getting so excited about Node.js, you will find features that also resonate with you. By looking at Node.js’s strengths, we can find the places where it is most applicable. This section looks at the factors that have come together to create a space for Node.js and discusses the reasons why it’s become so popular in such a short time.
When we first started writing web applications more than 10 years ago, the Web was much smaller. Sure, we had the dot-com bubble, but the sheer volume of people on the Internet was considerably lower, and the sites we made were much less ambitious. Fast-forward to today, and we have the advent of Web 2.0 and widely available Internet connections on cell phones. So much more is expected of us as developers. Not only are the features we need to deliver more complex, more interactive, and more real, but there are also many more people using them more often and from more devices than ever before. This is a pretty steep challenge. While hardware continues to improve, we also need to make improvements to our software development practices to support such demands. If we kept just buying hardware to support ever-increasing features or users, it wouldn’t be very cost-effective.
Node is an attempt to solve this problem by introducing the architecture called event-driven computing to the programming space for web servers. As it turns out, Node isn’t the first platform to do this, but it is by far the most successful, and we would argue that it is the easiest to use. We are going to talk about event-driven programming in a lot more detail later in this book, but let’s go through a quick intro here. Imagine you connect to a web server to get a web page. The time to reach that web server is probably 100ms or so over a reasonable DSL connection. When you connect to a typical web server, it creates a new instance of a program on the server that represents your request. That program runs from the top to the bottom (following all of the function calls) to provide your web page. This means that the server has to allocate a fixed amount of memory to that request until it is totally finished, including the 100ms+ to send the data back to you. Node doesn’t work that way. Instead, Node keeps all users in a single program. Whenever Node has to do something slow, like wait for a confirmation that you got your data (so it can mark your request as finished), it simply moves on to another user. We’re glossing over the details a bit, but this means Node is much more efficient with memory than traditional servers and can keep providing a very fast response time with lots and lots of concurrent users. This is a huge win. This approach is one of the main reasons people like Node.
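A rough sketch of “moving on to another user” follows. It is not how Node is implemented internally; setTimeout merely stands in for a slow operation such as waiting on the network, and the serveAll function, user names, and delays are made up for illustration.

```javascript
// One program serving several users. setTimeout stands in for a slow
// operation such as waiting on the network; names and delays are made up.
function serveAll(users, done) {
  var log = [];
  var pending = users.length;

  users.forEach(function (user) {
    log.push('start ' + user.name);
    // Hand the slow part to the event loop and move straight on to the
    // next user instead of waiting here.
    setTimeout(function () {
      log.push('finish ' + user.name);
      pending -= 1;
      if (pending === 0) done(log);
    }, user.delayMs);
  });
}

serveAll(
  [{name: 'ann', delayMs: 50}, {name: 'bob', delayMs: 5}],
  function (log) { console.log(log); }
);
```

Both users are started before either finishes, and the user with the shorter wait finishes first, even though that request arrived second.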
Another reason people like Node is JavaScript. JavaScript was created by Brendan Eich in 1995 to be a simple scripting language for use in web pages on the Netscape browser platform. Surprisingly, almost since its inception JavaScript has been used in nonbrowser settings. Some of the early Netscape server products supported JavaScript (known then as LiveScript) as a server-side scripting language. Although server-side JavaScript didn’t really catch on then, that certainly wasn’t true for the exploding browser market. On the Web, JavaScript competed with Microsoft’s VBScript to provide programming functionality in web pages. It’s hard to say why JavaScript won, but perhaps Microsoft allowing JavaScript in Internet Explorer did it,[1] or perhaps it was the JavaScript language itself, but win it did. This meant by the early 2000s, JavaScript had emerged as the web language—not just the first choice, but the only choice for programming with HTML in browsers.
What does this have to do with Node.js? Well, the important thing to remember is that when the AJAX revolution happened and the Web became big business (think Yahoo!, Amazon, Google, etc.), the only choice for the “J” in AJAX was JavaScript. There simply wasn’t an alternative. As a result, a whole industry needed an awful lot of JavaScript programmers, and really good ones at that, rather fast. The emergence of the Web as a serious platform and JavaScript as its programming language meant that we, as JavaScript programmers, needed to shape up. As JavaScript went from being a programmer’s second or third language to a primary one, the perception of its importance changed with it. We started to get emerging experts who led the charge in making JavaScript respectable.
Arguably at the head of this movement was Douglas Crockford. His popular articles and videos on JavaScript have helped many programmers discover that inside this much-maligned language there is a lot of beauty. Most programmers working with JavaScript spent the majority of their time working with the browser implementation of the W3C DOM API for manipulating HTML or XML documents. Unfortunately, the DOM is probably not the prettiest API ever conceived, but worse, its various implementations in the browsers are inconsistent and incomplete. No wonder that for a decade after its release JavaScript was not thought of as a “proper” language by so many programmers. More recently, Douglas’s work on “the good parts” of JavaScript has helped create a movement of advocates of the language who recognize that it has a lot going for it, despite the warts.
In 2012, we now have a proliferation of JavaScript experts advocating well-written, performant, maintainable JavaScript code. People such as Douglas Crockford, Dion Almaer, Peter Paul Koch (PPK), John Resig, Alex Russell, Thomas Fuchs, and many more have provided research, advice, tools, and primarily libraries that have allowed thousands of professional JavaScript programmers worldwide to practice their trade with a spirit of excellence. Libraries such as jQuery, YUI, Dojo, Prototype, Mootools, Sencha, and many others are now used daily by thousands of people and deployed on millions of websites. It is in this environment—where JavaScript is not only accepted, but widely used and celebrated—that a platform larger than the Web makes sense. When so many programmers know JavaScript, its ubiquity is a distinct advantage.
When a roomful of web programmers is asked what languages they use, Java and PHP are very popular, Ruby is probably the next most popular these days (or at least closely tied with Python), and Perl still has a huge following. However, almost without exception, anyone who does any programming for the Web has programmed in JavaScript. Although backend languages are fractured, in-browser programming is united by the necessities of deployment. Various browsers and browser plug-ins allow the use of other languages, but they simply aren’t universal enough for the Web. So here we are with a single, universal web language. How can we get it on the server?
Fairly early in the days of the Web, we had the infamous browser wars. Internet Explorer and Netscape competed viciously on web features, adding various incompatible programmatic features to their own browser and not supporting the features in the other browser. For those of us who programmed for the Web, this was the cause of much anguish because it made web programming really tiresome. Internet Explorer more or less emerged as the winner of that round and became the dominant browser. Fast-forward a few years, and Internet Explorer has been languishing at version 6, and a new contender, Firefox, emerges from the remnants of Netscape. Firefox kicks off a resurgence in browsers, followed by WebKit (Safari) and then Chrome. Most interesting about this current trend is the resurgence of competition in the browser market.
Unlike the first iteration of the browser wars, today’s browsers compete on two fronts: adhering to the standards that emerged after the previous browser war, and performance. As websites have become more complex, users want the fastest experience possible. This has meant that browsers not only need to support the web standards well, allowing developers to optimize, but also to do a little optimization of their own. With JavaScript as a core component of Web 2.0, AJAX websites have become part of the battleground.
Each browser has its own JavaScript runtime: SpiderMonkey for Firefox, SquirrelFish Extreme for Safari, Carakan for Opera, and finally V8 for Chrome. As these runtimes compete on performance, they create an environment of innovation for JavaScript. And in order to differentiate their browsers, vendors are going to great lengths to make them as fast as possible.
[1] Internet Explorer doesn’t actually support JavaScript or ECMAScript; it supports a language variety called JScript. In recent years, JScript has fully supported the ECMAScript 3 standard and has some ECMAScript 5 support. However, JScript also implements proprietary extensions in the same way that Mozilla JavaScript does and has features that ECMAScript does not.