# WebWorker Threads

[![Build Status](https://travis-ci.org/audreyt/node-webworker-threads.svg)](https://travis-ci.org/audreyt/node-webworker-threads)

This is based on @xk (jorgechamorro)'s [Threads A GoGo for Node.js](https://github.com/audreyt/node-threads-a-gogo), but with an API conforming to the [Web Worker standard](http://www.w3.org/TR/workers/).

This module provides an asynchronous, evented and/or continuation passing style API for moving blocking/longish CPU-bound tasks out of Node's event loop to JavaScript threads that run in parallel in the background and that use all the available CPU cores automatically; all from within a single Node process.

**Note**: If you would like to `require()` native modules in a worker, please consider using the process-based [tiny-worker](https://www.npmjs.com/package/tiny-worker) instead. It does not use threads, but the WebWorker API is compatible.

This module requires Node.js 0.10.0+ and a working [node-gyp toolchain](https://github.com/nodejs/node-gyp#installation).

## Illustrated Writeup

There is an [illustrated writeup](http://aosabook.org/en/posa/from-socialcalc-to-ethercalc.html#multi-core-scaling) for the original use case of this module:

<img src="http://aosabook.org/en/posa/ethercalc-images/scaling-threads.png" alt="Event Threaded Server (multi-core)" width="100%">

## Installing the module

With [npm](http://npmjs.org/):

    npm install webworker-threads

Sample usage (adapted from [MDN](https://developer.mozilla.org/en-US/docs/DOM/Using_web_workers#Passing_data)):

```js
var Worker = require('webworker-threads').Worker;
// var w = new Worker('worker.js'); // Standard API

// You may also pass in a function:
var worker = new Worker(function(){
  postMessage("I'm working before postMessage('ali').");
  this.onmessage = function(event) {
    postMessage('Hi ' + event.data);
    self.close();
  };
});
worker.onmessage = function(event) {
  console.log("Worker said : " + event.data);
};
worker.postMessage('ali');
```

A more involved example in [LiveScript](http://livescript.net/) syntax, with five threads:

```coffee
{ Worker } = require \webworker-threads

for til 5 => (new Worker ->
  fibo = (n) -> if n > 1 then fibo(n - 1) + fibo(n - 2) else 1
  @onmessage = ({ data }) -> postMessage fibo data
)
  ..onmessage = ({ data }) ->
    console.log "[#{ @thread.id }] #data"
    @postMessage Math.ceil Math.random! * 30
  ..postMessage Math.ceil Math.random! * 30

do spin = -> setImmediate spin
```

## Introduction

After the initialization phase of a Node program, whose purpose is to set up listeners and callbacks to be executed in response to events, comes the next phase: the proper execution of the program. It is orchestrated by the event loop, whose duty is to [juggle events, listeners and callbacks quickly and without any hiccups or interruptions that would ruin its performance](https://youtu.be/D0uA_NOb0PE).

Both the event loop and said listeners and callbacks run sequentially in a single thread of execution, Node's main thread. If any of them ever blocks, nothing else will happen for the duration of the block: no more events will be handled, and no callbacks, listeners, timeouts or setImmediate()ed functions will have the chance to run and do their job, because the blocked event loop won't call them. The program will turn sluggish at best, or appear to be frozen and dead at worst.
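The effect is easy to observe without any extra modules. The sketch below uses plain Node only; `measureTimerDelay` is a hypothetical helper name, and `fibo(30)` is just an arbitrary CPU-bound workload. It schedules a 0 ms timer and then blocks the main thread, so the timer cannot fire until the blocking call returns:

``` javascript
function fibo (n) {
  return n > 1 ? fibo(n - 1) + fibo(n - 2) : 1;
}

// Hypothetical helper: schedules a 0 ms timer, then blocks the main thread
// with fibo(n). Calls done(delayMs) when the timer finally fires and returns
// the number of milliseconds spent blocking.
function measureTimerDelay (n, done) {
  var scheduledAt = Date.now();
  setTimeout(function () {
    done(Date.now() - scheduledAt);
  }, 0);

  var startedAt = Date.now();
  fibo(n); // Nothing else can run on the main thread until this returns.
  return Date.now() - startedAt;
}

var blockedFor = measureTimerDelay(30, function (delay) {
  console.log('Timer asked for 0 ms, fired after ' + delay + ' ms');
});
console.log('fibo(30) blocked the event loop for ' + blockedFor + ' ms');
```

The reported timer delay is at least as long as the time spent inside `fibo()`, since the timer callback can only run once the main thread is free again.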

### What is WebWorker-Threads

`webworker-threads` provides an asynchronous API for CPU-bound tasks that's missing in Node.js:

``` javascript
var Worker = require('webworker-threads').Worker;
require('http').createServer(function (req,res) {
  var fibo = new Worker(function() {
    function fibo (n) {
      return n > 1 ? fibo(n - 1) + fibo(n - 2) : 1;
    }
    this.onmessage = function (event) {
      postMessage(fibo(event.data));
    }
  });
  fibo.onmessage = function (event) {
    res.end('fib(40) = ' + event.data);
  };
  fibo.postMessage(40);
}).listen(port);
```

And it won't block the event loop because for each request, the `fibo` worker will run in parallel in a separate background thread.

## API

### Module API

``` javascript
var Threads= require('webworker-threads');
```

##### .Worker

`new Threads.Worker( [ file | function ] )` returns a Worker object.

##### .create()

`Threads.create( /* no arguments */ )` returns a thread object.

##### .createPool( numThreads )

`Threads.createPool( numberOfThreads )` returns a threadPool object.

---
### Web Worker API

``` javascript
var worker= new Threads.Worker('worker.js');
var worker= new Threads.Worker(function(){ ... });
var worker= new Threads.Worker();
```

##### .postMessage( data )

`worker.postMessage({ x: 1, y: 2 })` sends a data structure into the worker. The worker can receive it using the `onmessage` handler.

##### .onmessage

`worker.onmessage = function (event) { console.log(event.data) };` receives data from the worker's `postMessage` calls.

##### .terminate()

`worker.terminate()` terminates the worker thread.

##### .addEventListener( type, cb )

`worker.addEventListener('message', callback)` is equivalent to setting `worker.onmessage = callback`.

##### .dispatchEvent( event )

Currently unimplemented.

##### .removeEventListener( type )

Currently unimplemented.

##### .thread

Returns the underlying `thread` object; see the next section for details.
Note that this attribute is implementation-specific, and not part of the W3C Web Worker API.

---

### Thread API

``` javascript
var thread= Threads.create();
```

##### .id

`thread.id` is a sequential thread serial number.

##### .load( absolutePath [, cb] )

`thread.load( absolutePath [, cb] )` reads the file at `absolutePath` and calls `thread.eval(fileContents, cb)` with its contents.

##### .eval( program [, cb])

`thread.eval( program [, cb])` converts `program` to a string with `.toString()`, eval()s it in the thread's global context, and (if `cb` is provided) passes the completion value to `cb(err, completionValue)`.
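To make the "stringify, eval in a separate global context, report the completion value" idea concrete, here is a rough analogy built on Node's built-in `vm` module. It is not how `webworker-threads` is implemented: `evalInContext` and `context` are hypothetical names, and this stand-in runs synchronously on the main thread, whereas `thread.eval()` runs in a background thread and calls back later.

``` javascript
var vm = require('vm');

function fibo (n) {
  return n > 1 ? fibo(n - 1) + fibo(n - 2) : 1;
}

// Hypothetical stand-in for a thread: an isolated global context.
var context = vm.createContext({});

// Hypothetical helper with the same shape as thread.eval( program [, cb] ).
function evalInContext (program, cb) {
  var completionValue;
  try {
    // String(program) turns a function into its source text.
    completionValue = vm.runInContext(String(program), context);
  } catch (err) {
    if (cb) cb(err);
    return;
  }
  if (cb) cb(null, completionValue);
}

// Evaluating the function's source defines fibo() inside the context.
evalInContext(fibo);

// Evaluating an expression reports its completion value to the callback.
evalInContext('fibo(10)', function (err, value) {
  console.log(err, value); // null 89
});
```

This is why the examples further down can pass the `fibo` function itself to `.eval()` first and then evaluate the string `'fibo(35)'`: the first call only defines the function inside the other context.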

##### .on( eventType, listener )

`thread.on( eventType, listener )` registers the listener `listener(data)` for any events of `eventType` that the thread `thread` may emit.

##### .once( eventType, listener )

`thread.once( eventType, listener )` is like `thread.on()`, but the listener will only be called once.

##### .removeAllListeners( [eventType] )

`thread.removeAllListeners( [eventType] )` deletes all listeners for all eventTypes. If `eventType` is provided, deletes all listeners only for the event type `eventType`.
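These three methods have the same names and shape as the listener methods of Node's own `events.EventEmitter`. The sketch below uses a plain `EventEmitter` as a stand-in to show the `on` / `once` / `removeAllListeners` semantics described above. `fakeThread` and the `'tick'` event are hypothetical, and unlike a real thread object, an `EventEmitter` delivers events locally rather than across threads.

``` javascript
var EventEmitter = require('events').EventEmitter;

// Hypothetical stand-in for a thread object; NOT a webworker-threads thread.
var fakeThread = new EventEmitter();
var calls = { on: 0, once: 0 };

// Called for every 'tick' event.
fakeThread.on('tick', function (data) { calls.on += 1; });

// Called for the first 'tick' event only.
fakeThread.once('tick', function (data) { calls.once += 1; });

fakeThread.emit('tick', 'a');
fakeThread.emit('tick', 'b');
console.log(calls); // { on: 2, once: 1 }

// Drop every listener registered for 'tick'.
fakeThread.removeAllListeners('tick');
fakeThread.emit('tick', 'c');
console.log(calls); // { on: 2, once: 1 }
```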

##### .emit( eventType, eventData [, eventData ... ] )

`thread.emit( eventType, eventData [, eventData ... ] )` emits an event of `eventType` with `eventData` inside the thread `thread`. All its arguments are .toString()ed.
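Because the arguments are stringified, listeners receive strings rather than the original values. The following plain-JavaScript sketch shows the practical consequence; `asReceived` is a hypothetical helper that only mimics the stringification step and does not touch the library.

``` javascript
// Hypothetical helper: mimics what a listener would see after stringification.
function asReceived (value) {
  return String(value);
}

// Numbers arrive as strings, so convert them back with unary plus.
console.log(typeof asReceived(35)); // string
console.log(+asReceived(35) + 1);   // 36

// Plain objects lose their contents when stringified directly...
console.log(asReceived({ n: 35 })); // [object Object]

// ...so serialize structured data yourself before emitting it.
var payload = JSON.stringify({ n: 35 });
var decoded = JSON.parse(asReceived(payload));
console.log(decoded.n + 1);         // 36
```

This is also why the evented examples below call `fibo(+data)` in their listeners.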

##### .destroy( /* no arguments */ )

`thread.destroy( /* no arguments */ )` destroys the thread.

---
### Thread pool API

``` javascript
threadPool= Threads.createPool( numberOfThreads );
```

##### .load( absolutePath [, cb] )

`threadPool.load( absolutePath [, cb] )` runs `thread.load( absolutePath [, cb] )` in all the pool's threads.

##### .any.eval( program, cb )

`threadPool.any.eval( program, cb )` is like `thread.eval()`, but in any of the pool's threads.

##### .any.emit( eventType, eventData [, eventData ... ] )

`threadPool.any.emit( eventType, eventData [, eventData ... ] )` is like `thread.emit()`, but in any of the pool's threads.

##### .all.eval( program, cb )

`threadPool.all.eval( program, cb )` is like `thread.eval()`, but in all the pool's threads.

##### .all.emit( eventType, eventData [, eventData ... ] )

`threadPool.all.emit( eventType, eventData [, eventData ... ] )` is like `thread.emit()`, but in all the pool's threads.

##### .on( eventType, listener )

`threadPool.on( eventType, listener )` is like `thread.on()`, but in all of the pool's threads.

##### .totalThreads()

`threadPool.totalThreads()` returns the number of threads in this pool, as supplied in `.createPool( number )`.

##### .idleThreads()

`threadPool.idleThreads()` returns the number of threads in this pool that are currently idle (sleeping).

##### .pendingJobs()

`threadPool.pendingJobs()` returns the number of jobs pending.

##### .destroy( [ rudely ] )

`threadPool.destroy( [ rudely ] )` waits until `pendingJobs()` is zero and then destroys the pool. If `rudely` is truthy, it destroys the pool immediately, without waiting for `pendingJobs() === 0`.

---
### Global Web Worker API

Inside every Worker instance from webworker-threads, there's a global `self` object with these properties:

##### .postMessage( data )

`postMessage({ x: 1, y: 2 })` sends a data structure back to the main thread.

##### .onmessage

`onmessage = function (event) { ... }` receives data from the main thread's `.postMessage` calls.

##### .close()

`close()` stops the current thread.

##### .addEventListener( type, cb )

`addEventListener('message', callback)` is equivalent to setting `self.onmessage = callback`.

##### .dispatchEvent( event )

`dispatchEvent({ type: 'message', data: data })` is the same as `self.postMessage(data)`.

##### .removeEventListener( type )

Currently unimplemented.

##### .importScripts( file [, file...] )

`importScripts('a.js', 'b.js')` loads one or more files from the disk and `eval()`s them in the worker's instance scope.

##### .thread

The underlying `thread` object; see the next section for details.
Note that this attribute is implementation-specific, and not part of the W3C Web Worker API.

---
### Global Thread API

Inside every thread .create()d by webworker-threads, there's a global `thread` object with these properties:

##### .id

`thread.id` is the serial number of this thread.

##### .on( eventType, listener )

`thread.on( eventType, listener )` is just like `thread.on()` above.

##### .once( eventType, listener )

`thread.once( eventType, listener )` is just like `thread.once()` above.

##### .emit( eventType, eventData [, eventData ... ] )

`thread.emit( eventType, eventData [, eventData ... ] )` is just like `thread.emit()` above.

##### .removeAllListeners( [eventType] )

`thread.removeAllListeners( [eventType] )` is just like `thread.removeAllListeners()` above.

##### .nextTick( function )

`thread.nextTick( function )` is like `process.nextTick()`, but much faster.

---
### Global Helper API

Inside every thread .create()d by webworker-threads, there are some helpers:

##### console.log(arg1 [, arg2 ...])

Same as `console.log` on the main process.

##### console.error(arg1 [, arg2 ...])

Same as `console.log`, except it prints to stderr.

##### puts(arg1 [, arg2 ...])

`puts(arg1 [, arg2 ...])` calls `.toString()` on each of its arguments and prints them to stdout.

## Developer guide

See the [developer guide](https://github.com/audreyt/node-webworker-threads/wiki) if you want to contribute.

-----------
WIP WIP WIP
-----------
Note that everything below this line is under construction and subject to change.
-----------

## Examples

**A.-** Here's a program that makes Node's event loop spin freely and as fast as possible: it simply prints a dot to the console in each turn:

    cat examples/quickIntro_loop.js

``` javascript
(function spinForever () {
  process.stdout.write(".");
  setImmediate(spinForever);
})();
```

**B.-** Here's another program that adds to the one above a fibonacci(35) call in each turn, a CPU-bound task that takes quite a while to complete and that blocks the event loop, making it spin slowly and clumsily. The point is simply to show that you can't put a job like that in the event loop, because Node will stop performing properly when its event loop can't spin fast and freely due to a callback/listener/setImmediate()ed function that's blocking.

    cat examples/quickIntro_blocking.js

``` javascript
function fibo (n) {
  return n > 1 ? fibo(n - 1) + fibo(n - 2) : 1;
}

(function fiboLoop () {
  process.stdout.write(fibo(35).toString());
  setImmediate(fiboLoop);
})();

(function spinForever () {
  setImmediate(spinForever);
})();
```

**C.-** The program below uses `webworker-threads` to run the fibonacci(35) calls in a background thread, so Node's event loop isn't blocked at all and can spin freely again at full speed:

    cat examples/quickIntro_oneThread.js

``` javascript
function fibo (n) {
  return n > 1 ? fibo(n - 1) + fibo(n - 2) : 1;
}

function cb (err, data) {
  process.stdout.write(data);
  this.eval('fibo(35)', cb);
}

var thread= require('webworker-threads').create();

thread.eval(fibo).eval('fibo(35)', cb);

(function spinForever () {
  process.stdout.write(".");
  setImmediate(spinForever);
})();
```

**D.-** This example is almost identical to the one above, except that it creates 5 threads instead of one. Each thread runs fibonacci(35) in parallel with the others and with Node's event loop, which keeps spinning happily at full speed in its own thread:

    cat examples/quickIntro_fiveThreads.js

``` javascript
function fibo (n) {
  return n > 1 ? fibo(n - 1) + fibo(n - 2) : 1;
}

function cb (err, data) {
  process.stdout.write(" ["+ this.id+ "]"+ data);
  this.eval('fibo(35)', cb);
}

var Threads= require('webworker-threads');

Threads.create().eval(fibo).eval('fibo(35)', cb);
Threads.create().eval(fibo).eval('fibo(35)', cb);
Threads.create().eval(fibo).eval('fibo(35)', cb);
Threads.create().eval(fibo).eval('fibo(35)', cb);
Threads.create().eval(fibo).eval('fibo(35)', cb);

(function spinForever () {
  setImmediate(spinForever);
})();
```

**E.-** The next one asks `webworker-threads` to create a pool of 10 background threads, instead of creating them manually one by one:

    cat examples/multiThread.js

``` javascript
function fibo (n) {
  return n > 1 ? fibo(n - 1) + fibo(n - 2) : 1;
}

var numThreads= 10;
var threadPool= require('webworker-threads').createPool(numThreads).all.eval(fibo);

threadPool.all.eval('fibo(35)', function cb (err, data) {
  process.stdout.write(" ["+ this.id+ "]"+ data);
  this.eval('fibo(35)', cb);
});

(function spinForever () {
  setImmediate(spinForever);
})();
```

**F.-** This is a demo of the `webworker-threads` eventEmitter API, using one thread:

    cat examples/quickIntro_oneThreadEvented.js

``` javascript
var thread= require('webworker-threads').create();
thread.load(__dirname + '/quickIntro_evented_childThreadCode.js');

/*
  This is the code that's .load()ed into the child/background thread:

  function fibo (n) {
    return n > 1 ? fibo(n - 1) + fibo(n - 2) : 1;
  }

  thread.on('giveMeTheFibo', function onGiveMeTheFibo (data) {
    this.emit('theFiboIs', fibo(+data)); //Emits 'theFiboIs' in the parent/main thread.
  });

*/

//Emit 'giveMeTheFibo' in the child/background thread.
thread.emit('giveMeTheFibo', 35);

//Listener for the 'theFiboIs' events emitted by the child/background thread.
thread.on('theFiboIs', function cb (data) {
  process.stdout.write(data);
  this.emit('giveMeTheFibo', 35);
});

(function spinForever () {
  setImmediate(spinForever);
})();
```

**G.-** This is a demo of the `webworker-threads` eventEmitter API, using a pool of threads:

    cat examples/quickIntro_multiThreadEvented.js

``` javascript
var numThreads= 10;
var threadPool= require('webworker-threads').createPool(numThreads);
threadPool.load(__dirname + '/quickIntro_evented_childThreadCode.js');

/*
  This is the code that's .load()ed into the child/background threads:

  function fibo (n) {
    return n > 1 ? fibo(n - 1) + fibo(n - 2) : 1;
  }

  thread.on('giveMeTheFibo', function onGiveMeTheFibo (data) {
    this.emit('theFiboIs', fibo(+data)); //Emits 'theFiboIs' in the parent/main thread.
  });

*/

//Emit 'giveMeTheFibo' in all the child/background threads.
threadPool.all.emit('giveMeTheFibo', 35);

//Listener for the 'theFiboIs' events emitted by the child/background threads.
threadPool.on('theFiboIs', function cb (data) {
  process.stdout.write(" ["+ this.id+ "]"+ data);
  this.emit('giveMeTheFibo', 35);
});

(function spinForever () {
  setImmediate(spinForever);
})();
```

## More examples

The `examples` directory contains a few more examples:

* [ex01_basic](https://github.com/xk/node-threads-a-gogo/blob/master/examples/ex01_basic.md): Running a simple function in a thread.
* [ex02_events](https://github.com/xk/node-threads-a-gogo/blob/master/examples/ex02_events.md): Sending events from a worker thread.
* [ex03_ping_pong](https://github.com/xk/node-threads-a-gogo/blob/master/examples/ex03_ping_pong.md): Sending events both ways between the main thread and a worker thread.
* [ex04_main](https://github.com/xk/node-threads-a-gogo/blob/master/examples/ex04_main.md): Loading the worker code from a file.
* [ex05_pool](https://github.com/xk/node-threads-a-gogo/blob/master/examples/ex05_pool.md): Using the thread pool.
* [ex06_jason](https://github.com/xk/node-threads-a-gogo/blob/master/examples/ex06_jason.md): Passing complex objects to threads.

## Rationale

[Node.js](http://nodejs.org) is the most awesome, cute and super-sexy piece of free, open source software.

Its event loop can spin as fast and smooth as a turbo, and roughly speaking, **the faster it spins, the more power it delivers**. That's why [@ryah](http://twitter.com/ryah) took great care to ensure that no -possibly slow- I/O operations could ever block it: a pool of background threads (thanks to [Marc Lehmann's libeio library](http://software.schmorp.de/pkg/libeio.html)) handles any blocking I/O calls in the background, in parallel.

In Node it's verboten to write a server like this:

``` javascript
http.createServer(function (req,res) {
  res.end( fs.readFileSync(path) );
}).listen(port);
```
Because synchronous I/O calls **block the turbo**, and without proper boost, Node.js begins to stutter and behaves clumsily. To avoid it there's the asynchronous version, `.readFile()`, in continuation passing style, that takes a callback:

``` javascript
fs.readFile(path, function cb (err, data) { /* ... */ });
```

It's cool, we love it (*), and there are hundreds of ad hoc built-in functions like this in Node to help us deal with almost any variety of possibly slow, blocking I/O.

### But what about longish, CPU-bound tasks?

How do you avoid blocking the event loop, when the task at hand isn't I/O bound, and lasts more than a few fractions of a millisecond?

``` javascript
http.createServer(function cb (req,res) {
  res.end( fibonacci(40) );
}).listen(port);
```

You simply can't, because there's no way... well, there wasn't before `webworker-threads`.

### Why Threads

Threads (kernel threads) are very interesting creatures. They provide:

1.- Parallelism: All the threads run in parallel. On a single-core processor, the CPU is switched rapidly back and forth among the threads, providing the illusion that the threads are running in parallel, albeit on a slower CPU than the real one. With 10 compute-bound threads in a process, the threads would appear to be running in parallel, each one on a CPU with 1/10th the speed of the real CPU. On a multi-core processor, threads are truly running in parallel, and get time-sliced when the number of threads exceeds the number of cores. So with 12 compute-bound threads on a quad-core processor, each thread will appear to run at 1/3rd of the nominal core speed.

2.- Fairness: No thread is more important than another; cores and CPU slices are fairly distributed among threads by the OS scheduler.

3.- Threads fully exploit all the available CPU resources in your system. On a loaded system running many tasks in many threads, the more cores there are, the faster the threads will complete. Automatically.
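The arithmetic in the parallelism item can be written down as a tiny helper. This is an idealized model only: `apparentSpeed` is a hypothetical function that assumes a perfectly fair scheduler, purely compute-bound threads and nothing else running on the machine.

``` javascript
// Hypothetical helper: fraction of one core's nominal speed that each
// compute-bound thread appears to get, under ideal fair scheduling.
function apparentSpeed (threads, cores) {
  return Math.min(1, cores / threads);
}

console.log(apparentSpeed(10, 1)); // 0.1  (10 threads sharing a single core)
console.log(apparentSpeed(12, 4)); // 0.3333333333333333  (12 threads, quad-core)
console.log(apparentSpeed(2, 4));  // 1  (fewer threads than cores: full speed)
```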

### Why not multiple processes.

The "can't block the event loop" problem is inherent to Node's evented model. No matter how many Node processes you have running as a [Node-cluster](http://blog.nodejs.org/2011/10/04/an-easy-way-to-build-scalable-network-programs/), it won't solve its issues with CPU-bound tasks.

Launch a cluster of N Nodes running the example B (`quickIntro_blocking.js`) above, and all you'll get is N -instead of one- Nodes with their event loops blocked and showing a sluggish performance.