## Understanding the analysis

JavaScript is a single-threaded, event-driven, non-blocking language.

In Node.js, I/O tasks are delegated to the operating system, and JavaScript functions (callbacks)
are invoked once the related I/O operation is complete. At a rudimentary level, the process of
queueing events and later handling their results in-thread is conceptually achieved with the
"Event Loop" abstraction.

At a (very) basic level, the following pseudo-code demonstrates the Event Loop:
`while (event) handle(event)`
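
The pseudo-code can be expanded into a small runnable toy. This is only a sketch of the idea: the real Event Loop lives in libuv and has several phases, and the names `queue`, `emit` and `handle` are made up for illustration.

```javascript
// Toy model of an event loop. All names here are hypothetical.
const queue = [];    // pending events, oldest first
const handled = [];  // record of the order in which events were handled

function emit(event) {
  queue.push(event);
}

function handle(event) {
  handled.push(event.name);
  // A handler may queue follow-up work; it runs after everything already queued.
  if (event.name === 'request') emit({ name: 'response' });
}

emit({ name: 'request' });
emit({ name: 'timer' });

// The loop itself: take the next event, handle it, repeat until the queue is empty.
let event;
while ((event = queue.shift()) !== undefined) handle(event);

console.log(handled); // [ 'request', 'timer', 'response' ]
```

Note that `response` is handled last even though it was caused by the first event: handlers run to completion, one at a time, in queue order.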

The Event Loop paradigm leads to an ergonomic development experience for high-concurrency programming
(relative to the multi-threaded paradigm).

However, since the Event Loop operates on a single thread, it is essentially a shared
execution environment for every potentially concurrent action. This means that if the
execution time of any line of code exceeds an acceptable threshold, it interferes with
the processing of future events (for instance, an incoming HTTP request): new events cannot
be processed because the thread that would process them is currently
blocked by a long-running synchronous operation.
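
A minimal sketch of this effect, assuming a busy-wait helper (`blockFor`, a hypothetical name) as a stand-in for any long-running synchronous operation:

```javascript
// Busy-wait for roughly `ms` milliseconds, keeping the thread occupied.
function blockFor(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) { /* spin */ }
}

const scheduledAt = Date.now();

// Ask for the callback to run after 10 ms.
setTimeout(() => {
  // The callback cannot run until the synchronous work below has returned,
  // so the observed delay is at least as long as the blocking call.
  console.log(`timer fired after ~${Date.now() - scheduledAt} ms`);
}, 10);

// Block the only thread for about 100 ms.
blockFor(100);
```

The timer was requested for 10 ms, but it fires only after the blocking call finishes. An incoming HTTP request arriving during that window would be delayed in the same way.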

Asynchronous operations are those which queue an event for later handling. They tend to be
identified by an API that requires a callback or uses promises (or async/await).

Synchronous operations, by contrast, simply return a value. Long-running synchronous operations are either
functions that perform blocking I/O (such as `fs.readFileSync`) or potentially resource-intensive
algorithms (such as `JSON.stringify` or `ReactDOMServer.renderToString`).
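
To make the two API shapes concrete, here is a small sketch using Node's built-in `crypto` module, which offers the same key derivation in both forms. The password, salt and iteration count are arbitrary illustration values.

```javascript
const crypto = require('crypto');

// Synchronous: the value is returned directly, and the thread is
// blocked until the computation has finished.
const syncKey = crypto.pbkdf2Sync('secret', 'salt', 100000, 32, 'sha256');

// Asynchronous: the work is done off the main thread and the callback
// is queued as an event once the result is ready.
crypto.pbkdf2('secret', 'salt', 100000, 32, 'sha256', (err, asyncKey) => {
  if (err) throw err;
  console.log('same key:', asyncKey.equals(syncKey));
});

console.log('this line runs before the callback above');
```

Both calls produce the same key. The difference is that only the synchronous one prevents other events from being handled while it runs.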

To solve the Event Loop issue, we need to find out where the synchronous bottleneck is.
Commonly, this is a single long-running synchronous function; alternatively,
the bottleneck may be distributed across many functions, which takes rather more detective work.

## Next Steps
- If the system is already deployed, mitigate the issue immediately by implementing
  HTTP 503 Service Unavailable functionality (see *Load Shedding* in **Reference**)
  + This should allow the deployment's load balancer to route traffic to a different service instance
  + In the worst case the user receives the 503, in which case they must retry (this is still preferable to waiting for a timeout)
- Use `clinic flame` to generate a flamegraph
  + Run <code class='snippet'>clinic flame --help</code> to get started
  + See the "Understanding Flamegraphs and how to use [0x](https://www.npmjs.com/package/0x)" article in the **Reference** section for more information
- Look for "hot" blocks: these are functions that are observed (at a higher relative frequency) to be at the top of the stack per CPU sample. In other words, such functions are blocking the event loop
  - (In the case of a distributed bottleneck, start by looking for lots of wide tips at the top of the flamegraph)
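
As a rough illustration of the 503 mitigation in the first step, the sketch below uses Node's built-in `perf_hooks.monitorEventLoopDelay` and `http` modules. It is not how the packages listed under *Load Shedding* are implemented. `createHandler`, `startServer` and the 50 ms threshold are hypothetical and would need tuning for a real service.

```javascript
const http = require('http');
const { monitorEventLoopDelay } = require('perf_hooks');

const MAX_DELAY_MS = 50; // hypothetical threshold, tune per service

// Build a request handler that sheds load while the measured delay is too high.
// `getDelayMs` is any function returning the current event loop delay in ms.
function createHandler(getDelayMs, maxDelayMs = MAX_DELAY_MS) {
  return (req, res) => {
    if (getDelayMs() > maxDelayMs) {
      // Fail fast so a load balancer or client can retry elsewhere.
      res.statusCode = 503;
      res.setHeader('Retry-After', '1');
      res.end('Service Unavailable');
      return;
    }
    res.statusCode = 200;
    res.end('ok');
  };
}

// Wire the handler to a real delay histogram and start listening.
function startServer(port) {
  const histogram = monitorEventLoopDelay({ resolution: 10 });
  histogram.enable();
  // histogram.mean is reported in nanoseconds.
  const getDelayMs = () => histogram.mean / 1e6;
  // Reset periodically so old samples do not mask a recovery.
  setInterval(() => histogram.reset(), 1000).unref();
  return http.createServer(createHandler(getDelayMs)).listen(port);
}
```

Calling `startServer(3000)` would start the sketch on port 3000. Keeping the delay source as a parameter of `createHandler` makes the shedding decision easy to exercise without a live server.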

## Reference

- Load Shedding
  + Express, Koa, Restify, `http`: [overload-protection](https://www.npmjs.com/package/overload-protection)
  + Hapi: [Server load sampleInterval option](https://hapi.dev/api/#-serveroptionsload) & [Server connections load maxEventLoopDelay](https://hapijs.com/api#-serveroptionsload)
  + Fastify: [under-pressure](https://www.npmjs.com/package/under-pressure)
  + General: [loopbench](https://www.npmjs.com/package/loopbench)
- [Concurrency model and Event Loop](https://developer.mozilla.org/en-US/docs/Web/JavaScript/EventLoop)
- [Overview of Blocking vs Non-Blocking](https://nodejs.org/en/docs/guides/blocking-vs-non-blocking/)
- [Don't Block the Event Loop (or the Worker Pool)](https://nodejs.org/en/docs/guides/dont-block-the-event-loop/)
- Understanding Flamegraphs and how to use 0x: [Tuning Node.js app performance with autocannon and 0x](https://www.nearform.com/blog/tuning-node-js-app-performance-with-autocannon-and-0x/)