# ES Module Lexer

[![Build Status][travis-image]][travis-url]

A JS module syntax lexer used in [es-module-shims](https://github.com/guybedford/es-module-shims).

Outputs the list of exports and the locations of import specifiers, including dynamic import and import meta handling.

A very small single JS file (4KiB gzipped) that includes inlined WebAssembly for very fast source analysis of ECMAScript module syntax only.

As an example of the performance, Angular 1 (720KiB) is fully parsed in 5ms, whereas Acorn, the fastest JS parser, takes over 100ms.

_Comprehensively handles the JS language grammar while remaining small and fast: ~10ms per MB of JS cold and ~5ms per MB of JS warm. [See benchmarks](#benchmarks) for more info._

### Usage

```
npm install es-module-lexer
```

For use in CommonJS:

```js
const { init, parse } = require('es-module-lexer');

(async () => {
  // either await init, or call parse asynchronously
  // this is necessary for the WebAssembly boot
  await init;

  const [imports, exports] = parse('export var p = 5');
  exports[0] === 'p';
})();
```

An ES module version is also available from `dist/lexer.js`. This version is picked up automatically by Rollup, es-dev-server and Node.js (in ES module projects):

```js
import { init, parse } from 'es-module-lexer';

(async () => {
  await init;

  const source = `
    import { a } from 'asdf';
    export var p = 5;
    export function q () {

    };

    // Comments provided to demonstrate edge cases
    import /*comment!*/ ('asdf');
    import /*comment!*/.meta.asdf;
  `;

  const [imports, exports] = parse(source, 'optional-sourcename');

  // Returns "asdf"
  source.substring(imports[0].s, imports[0].e);

  // Returns "import { a } from 'asdf';"
  source.substring(imports[0].ss, imports[0].se);

  // Returns "p,q"
  exports.toString();

  // Dynamic imports are indicated by imports[1].d > -1
  // In this case the "d" index is the start of the dynamic import
  // Returns true
  imports[1].d > -1;

  // Returns "'asdf'"
  source.substring(imports[1].s, imports[1].e);
  // Returns "import /*comment!*/ ("
  source.substring(imports[1].d, imports[1].s);

  // import.meta is indicated by imports[2].d === -2
  // Returns true
  imports[2].d === -2;
  // Returns "import /*comment!*/.meta"
  source.substring(imports[2].s, imports[2].e);
})();
```

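The `d` conventions shown in the example above can be wrapped in a small helper. This is a minimal sketch and not part of es-module-lexer: `describeImports` is a hypothetical name, and it assumes import records carrying the `s`, `e` and `d` fields used in the example.

```js
// Hypothetical helper, not part of es-module-lexer.
// Assumes each record has the `s`, `e` and `d` fields shown in the usage example.
function describeImports (source, imports) {
  return imports.map(({ s, e, d }) => {
    // d === -2 marks import.meta, d > -1 marks a dynamic import,
    // anything else is treated here as a static import
    const kind = d === -2 ? 'meta' : d > -1 ? 'dynamic' : 'static';
    return { kind, text: source.substring(s, e) };
  });
}
```

Called with the `source` and `imports` values from the usage example, the `kind` values should come out as static, dynamic and meta, in that order.
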
### Environment Support

Node.js 10+, and [all browsers with WebAssembly support](https://caniuse.com/#feat=wasm).

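Where the target environment is uncertain, support can be checked before loading the lexer. The check below is a sketch that relies only on the standard `WebAssembly` global; `hasWebAssembly` is a hypothetical name, not something the package exports.

```js
// Hypothetical helper: true when the host exposes the standard WebAssembly global
function hasWebAssembly () {
  return typeof WebAssembly === 'object' &&
    typeof WebAssembly.instantiate === 'function';
}
```
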
### Grammar Support

* Token state parses all line comments, block comments, strings, template strings, blocks, parens and punctuators.
* Division operator / regex token ambiguity is handled via backtracking checks against punctuator prefixes, including closing brace or paren backtracking.
* Always correctly parses valid JS source, but may parse invalid JS source without errors.

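To illustrate the idea of backtracking from a `/` to decide between division and a regular expression, here is a deliberately simplified sketch. It is not the lexer's actual implementation: `slashStartsRegex` and the keyword set are assumptions made for illustration, and the sketch ignores comments, postfix operators, and the `)` / `}` cases that need paren and brace tracking.

```js
// Simplified illustration only; not the logic used by es-module-lexer.
// Keywords after which a `/` begins a regular expression rather than a division.
const KEYWORDS_BEFORE_REGEX = new Set([
  'return', 'typeof', 'case', 'void', 'delete', 'in', 'of', 'throw',
  'new', 'do', 'else', 'yield', 'await', 'instanceof'
]);

function slashStartsRegex (source, slashIndex) {
  // Backtrack over whitespace to the last significant character
  let i = slashIndex - 1;
  while (i >= 0 && /\s/.test(source[i])) i--;
  if (i < 0) return true; // start of input: nothing to divide

  const ch = source[i];
  // After `)`, `]` or a closing quote this sketch assumes an expression just ended
  if (ch === ')' || ch === ']' || ch === '\'' || ch === '"' || ch === '`') return false;

  if (/[\w$]/.test(ch)) {
    // Backtrack over the whole word to tell keywords from identifiers and numbers
    let start = i;
    while (start > 0 && /[\w$]/.test(source[start - 1])) start--;
    return KEYWORDS_BEFORE_REGEX.has(source.slice(start, i + 1));
  }

  // Any other punctuator prefix (`=`, `(`, `,`, `;` ...) is treated as a regex start
  return true;
}
```

For instance, the sketch treats the slash in `a / b` as division and the first slash in `x = /re/` as the start of a regular expression.
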
### Limitations

The lexing approach is designed to deal with the full language grammar, including RegEx / division operator ambiguity, through backtracking and paren / brace tracking.

The only limitation of the reduced parser is that the "exports" list may not gather all export identifiers in the following edge cases:

```js
// Only "a" is detected as an export, "q" isn't
export var a = 'asdf', q = z;

// "b" is not detected as an export
export var { a: b } = asdf;
```

These cases are handled gracefully: the lexer continues without error, but it does not detect the export names noted above.

### Benchmarks

Benchmarks can be run with `npm run bench`.

Current results:

```
Cold Run, All Samples
test/samples/*.js (3057 KiB)
> 24ms

Warm Runs (average of 25 runs)
test/samples/angular.js (719 KiB)
> 5.12ms
test/samples/angular.min.js (188 KiB)
> 3.04ms
test/samples/d3.js (491 KiB)
> 4.08ms
test/samples/d3.min.js (274 KiB)
> 2.04ms
test/samples/magic-string.js (34 KiB)
> 0ms
test/samples/magic-string.min.js (20 KiB)
> 0ms
test/samples/rollup.js (902 KiB)
> 5.92ms
test/samples/rollup.min.js (429 KiB)
> 3.08ms

Warm Runs, All Samples (average of 25 runs)
test/samples/*.js (3057 KiB)
> 17.4ms
```

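The "All Samples" results above can be converted to a per-size rate with a little arithmetic. The sketch below assumes a rate in milliseconds per MiB (1024 KiB); `msPerMiB` is a hypothetical helper and not part of the benchmark script.

```js
// Hypothetical helper: converts a benchmark result to milliseconds per MiB of source
function msPerMiB (ms, kib) {
  return ms / (kib / 1024);
}

// Figures taken from the "All Samples" results above
const cold = msPerMiB(24, 3057);   // roughly 8.0
const warm = msPerMiB(17.4, 3057); // roughly 5.8
```

Both values are in line with the approximate cold and warm rates quoted at the top of this README.
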
### Building

To build, download the WASI SDK from https://github.com/CraneStation/wasi-sdk/releases.

The Makefile assumes that the `clang` in PATH corresponds to LLVM 8 (provided by the WASI SDK, or a standard clang 8 install can be used), and that `../wasi-sdk-6` contains the SDK as extracted above, which is needed to locate the WASI sysroot.

The build through the Makefile is then run via `make lib/lexer.wasm`, which can also be triggered via `npm run build-wasm` to create `dist/lexer.js`.

On Windows it may be preferable to use the Linux subsystem.

After the WebAssembly build, the CJS build can be triggered via `npm run build`.

Optimization passes are run with [Binaryen](https://github.com/WebAssembly/binaryen) prior to publish to reduce the WebAssembly footprint.

### License

MIT

[travis-url]: https://travis-ci.org/guybedford/es-module-lexer
[travis-image]: https://travis-ci.org/guybedford/es-module-lexer.svg?branch=master