# Acorn

A tiny, fast JavaScript parser written in JavaScript.

## Community

Acorn is open source software released under an
[MIT license](https://github.com/acornjs/acorn/blob/master/acorn/LICENSE).

You are welcome to
[report bugs](https://github.com/acornjs/acorn/issues) or create pull
requests on [GitHub](https://github.com/acornjs/acorn).

## Installation

The easiest way to install acorn is from [`npm`](https://www.npmjs.com/):

```sh
npm install acorn
```

Alternatively, you can download the source and build acorn yourself:

```sh
git clone https://github.com/acornjs/acorn.git
cd acorn
npm install
```

## Interface

**parse**`(input, options)` is the main interface to the library. The
`input` parameter is a string; `options` must be an object setting
some of the options listed below. The return value will be an abstract
syntax tree object as specified by the [ESTree
spec](https://github.com/estree/estree).

```javascript
let acorn = require("acorn");
console.log(acorn.parse("1 + 1", {ecmaVersion: 2020}));
```

When encountering a syntax error, the parser will raise a
`SyntaxError` object with a meaningful message. The error object will
have a `pos` property that indicates the string offset at which the
error occurred, and a `loc` object that contains a `{line, column}`
object referring to that same position.

Options are provided in a second argument, which should be an
object containing any of these fields (only `ecmaVersion` is
required):

- **ecmaVersion**: Indicates the ECMAScript version to parse. Can be a
  number, either in year (`2022`) or plain version number (`6`) form,
  or `"latest"` (the latest the library supports). This influences
  support for strict mode, the set of reserved words, and support for
  new syntax features.

  **NOTE**: Only 'stage 4' (finalized) ECMAScript features are being
  implemented by Acorn. Other proposed new features must be
  implemented through plugins.

- **sourceType**: Indicates the mode the code should be parsed in. Can be
  either `"script"` or `"module"`. This influences global strict mode
  and parsing of `import` and `export` declarations.

  **NOTE**: If set to `"module"`, then static `import` / `export` syntax
  will be valid, even if `ecmaVersion` is less than 6.

- **onInsertedSemicolon**: If given a callback, that callback will be
  called whenever a missing semicolon is inserted by the parser. The
  callback will be given the character offset of the point where the
  semicolon is inserted as argument, and if `locations` is on, also a
  `{line, column}` object representing this position.

- **onTrailingComma**: Like `onInsertedSemicolon`, but for trailing
  commas.

- **allowReserved**: If `false`, using a reserved word will generate
  an error. Defaults to `true` for `ecmaVersion` 3, `false` for higher
  versions. When given the value `"never"`, reserved words and
  keywords can also not be used as property names (as in Internet
  Explorer's old parser).

- **allowReturnOutsideFunction**: By default, a return statement at
  the top level raises an error. Set this to `true` to accept such
  code.

- **allowImportExportEverywhere**: By default, `import` and `export`
  declarations can only appear at a program's top level. Setting this
  option to `true` allows them anywhere a statement is allowed,
  and also allows `import.meta` expressions to appear in scripts
  (when `sourceType` is not `"module"`).

- **allowAwaitOutsideFunction**: If `false`, `await` expressions can
  only appear inside `async` functions. Defaults to `true` in modules
  for `ecmaVersion` 2022 and later, `false` for lower versions.
  Setting this option to `true` allows top-level `await`
  expressions. They are still not allowed in non-`async` functions,
  though.

- **allowSuperOutsideMethod**: By default, `super` outside a method
  raises an error. Set this to `true` to accept such code.

- **allowHashBang**: When this is enabled, if the code starts with the
  characters `#!` (as in a shell script), the first line will be
  treated as a comment. Defaults to `true` when `ecmaVersion` >= 2023.

- **checkPrivateFields**: By default, the parser will verify that
  private properties are only used in places where they are valid and
  have been declared. Set this to `false` to turn such checks off.

- **locations**: When `true`, each node has a `loc` object attached
  with `start` and `end` subobjects, each of which contains the
  one-based line and zero-based column numbers in `{line, column}`
  form. Default is `false`.

- **onToken**: If a function is passed for this option, each token
  found will be passed to it in the same format as tokens returned from
  `tokenizer().getToken()`.

  If an array is passed, each token found is pushed to it.

  Note that you are not allowed to call the parser from the
  callback, as that will corrupt its internal state.

- **onComment**: If a function is passed for this option, whenever a
  comment is encountered the function will be called with the
  following parameters:

  - `block`: `true` if the comment is a block comment, `false` if it
    is a line comment.
  - `text`: The content of the comment.
  - `start`: Character offset of the start of the comment.
  - `end`: Character offset of the end of the comment.

  When the `locations` option is on, the `{line, column}` locations
  of the comment's start and end are passed as two additional
  parameters.

  If an array is passed for this option, each comment found is pushed
  to it as an object in Esprima format:

  ```javascript
  {
    "type": "Line" | "Block",
    "value": "comment text",
    "start": Number,
    "end": Number,
    // If `locations` option is on:
    "loc": {
      "start": {line: Number, column: Number},
      "end": {line: Number, column: Number}
    },
    // If `ranges` option is on:
    "range": [Number, Number]
  }
  ```

  Note that you are not allowed to call the parser from the
  callback, as that will corrupt its internal state.

- **ranges**: Nodes have their start and end character offsets
  recorded in `start` and `end` properties (directly on the node,
  rather than the `loc` object, which holds line/column data). To also
  add a
  [semi-standardized](https://bugzilla.mozilla.org/show_bug.cgi?id=745678)
  `range` property holding a `[start, end]` array with the same
  numbers, set the `ranges` option to `true`.

- **program**: It is possible to parse multiple files into a single
  AST by passing the tree produced by parsing the first file as the
  `program` option in subsequent parses. This will add the toplevel
  forms of the parsed file to the "Program" (top) node of an existing
  parse tree.

- **sourceFile**: When the `locations` option is `true`, you can pass
  this option to add a `source` attribute in every node's `loc`
  object. Note that the contents of this option are not examined or
  processed in any way; you are free to use whatever format you
  choose.

- **directSourceFile**: Like `sourceFile`, but a `sourceFile` property
  will be added (regardless of the `locations` option) directly to the
  nodes, rather than the `loc` object.

- **preserveParens**: If this option is `true`, parenthesized expressions
  are represented by (non-standard) `ParenthesizedExpression` nodes
  that have a single `expression` property containing the expression
  inside parentheses.

**parseExpressionAt**`(input, offset, options)` will parse a single
expression in a string, and return its AST. It will not complain if
there is more of the string left after the expression.

**tokenizer**`(input, options)` returns an object with a `getToken`
method that can be called repeatedly to get the next token, a `{start,
end, type, value}` object (with added `loc` property when the
`locations` option is enabled and `range` property when the `ranges`
option is enabled). When the token's type is `tokTypes.eof`, you
should stop calling the method, since it will keep returning that same
token forever.

Note that tokenizing JavaScript without parsing it is, in modern
versions of the language, not really possible due to the way syntax is
overloaded in ways that can only be disambiguated by the parse
context. This package applies a bunch of heuristics to try and do a
reasonable job, but you are advised to use `parse` with the `onToken`
option instead of this.

In an ES6 environment, the returned result can be used like any other
protocol-compliant iterable:

```javascript
for (let token of acorn.tokenizer(str, {ecmaVersion: 2020})) {
  // iterate over the tokens
}

// transform code to array of tokens:
var tokens = [...acorn.tokenizer(str, {ecmaVersion: 2020})];
```

**tokTypes** holds an object mapping names to the token type objects
that end up in the `type` properties of tokens.

**getLineInfo**`(input, offset)` can be used to get a `{line,
column}` object for a given program string and offset.

### The `Parser` class

Instances of the **`Parser`** class contain all the state and logic
that drives a parse. It has static methods `parse`,
`parseExpressionAt`, and `tokenizer` that match the top-level
functions of the same name.

When extending the parser with plugins, you need to call these methods
on the extended version of the class. To extend a parser with plugins,
you can use its static `extend` method.

```javascript
var acorn = require("acorn");
var jsx = require("acorn-jsx");
var JSXParser = acorn.Parser.extend(jsx());
JSXParser.parse("foo(<bar/>)", {ecmaVersion: 2020});
```

The `extend` method takes any number of plugin values, and returns a
new `Parser` class that includes the extra parser logic provided by
the plugins.

## Command line interface

The `bin/acorn` utility can be used to parse a file from the command
line. It accepts as arguments its input file and the following
options:

- `--ecma3|--ecma5|--ecma6|--ecma7|--ecma8|--ecma9|--ecma10`: Sets the ECMAScript version
  to parse. Default is version 9.

- `--module`: Sets the parsing mode to `"module"`. The mode is `"script"` otherwise.

- `--locations`: Attaches a `loc` object to each node with `start` and
  `end` subobjects, each of which contains the one-based line and
  zero-based column numbers in `{line, column}` form.

- `--allow-hash-bang`: If the code starts with the characters `#!` (as
  in a shell script), the first line will be treated as a comment.

- `--allow-await-outside-function`: Allows top-level `await` expressions.
  See the `allowAwaitOutsideFunction` option for more information.

- `--compact`: No whitespace is used in the AST output.

- `--silent`: Do not output the AST, just return the exit status.

- `--help`: Print the usage information and quit.

The utility outputs the syntax tree as JSON data.

## Existing plugins

- [`acorn-jsx`](https://github.com/RReverser/acorn-jsx): Parse [Facebook JSX syntax extensions](https://github.com/facebook/jsx)