# character-parser

Parse JavaScript one character at a time to find snippets embedded in templates. This is not a validator; it is designed to let you robustly extract sections of JavaScript that are delimited by brackets.

[![Build Status](https://img.shields.io/github/workflow/status/ForbesLindesay/character-parser/Publish%20Canary/master?style=for-the-badge)](https://github.com/ForbesLindesay/character-parser/actions)
[![Rolling Versions](https://img.shields.io/badge/Rolling%20Versions-Enabled-brightgreen?style=for-the-badge)](https://rollingversions.com/ForbesLindesay/character-parser)
[![NPM version](https://img.shields.io/npm/v/character-parser?style=for-the-badge)](https://www.npmjs.com/package/character-parser)

## Installation

    npm install character-parser

## Usage

### Parsing

Track how bracket depth changes as a string is parsed:

```js
var assert = require('assert');
var parser = require('character-parser');

var state = parser.parse('foo(arg1, arg2, {\n foo: [a, b\n');
assert.deepEqual(state.stack, [')', '}', ']']);

parser.parse(' c, d]\n })', state);
assert.deepEqual(state.stack, []);
```

### Custom Delimited Expressions

Find code up to a custom delimiter:

```js
// EJS-style
var section = parser.parseUntil('foo.bar("%>").baz%> bing bong', '%>');
assert(section.start === 0);
assert(section.end === 17); // exclusive end of string
assert(section.src === 'foo.bar("%>").baz');

var section = parser.parseUntil('<%foo.bar("%>").baz%> bing bong', '%>', {start: 2});
assert(section.start === 2);
assert(section.end === 19); // exclusive end of string
assert(section.src === 'foo.bar("%>").baz');

// Jade-style
var section = parser.parseUntil('#[p= [1, 2][i]]', ']', {start: 2});
assert(section.start === 2);
assert(section.end === 14); // exclusive end of string
assert(section.src === 'p= [1, 2][i]');

// Dumb parsing
// Stop at the first delimiter encountered, whether or not it is nested.
// This is the character-parser@1 default behavior.
var section = parser.parseUntil('#[p= [1, 2][i]]', ']', {start: 2, ignoreNesting: true});
assert(section.start === 2);
assert(section.end === 10); // exclusive end of string
assert(section.src === 'p= [1, 2');
```

Delimiters are ignored if they are inside strings or comments.
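To make the nesting rule concrete, here is a minimal, self-contained sketch of a nesting-aware delimiter search. It is a simplified stand-in written for illustration, not character-parser's implementation: the name `findDelimiter` is hypothetical, and it only tracks `()`, `[]`, `{}` and single- or double-quoted strings (no comments, regular expressions, or template literals).

```javascript
// Simplified illustration only; NOT how character-parser is implemented.
var CLOSING_FOR = {'(': ')', '[': ']', '{': '}'};

function findDelimiter(src, delimiter, options) {
  options = options || {};
  var start = options.start || 0;
  var stack = [];      // closing brackets we still expect, innermost last
  var quote = null;    // the quote character of the string we are in, if any
  var escaped = false; // true if the previous character was a backslash

  for (var i = start; i < src.length; i++) {
    var ch = src[i];
    if (quote) {
      // Inside a string: delimiters and brackets are ignored.
      if (escaped) escaped = false;
      else if (ch === '\\') escaped = true;
      else if (ch === quote) quote = null;
      continue;
    }
    var atDelimiter = src.startsWith(delimiter, i);
    if (atDelimiter && (options.ignoreNesting || stack.length === 0)) {
      return {start: start, end: i, src: src.slice(start, i)};
    }
    if (ch === '"' || ch === "'") quote = ch;
    else if (CLOSING_FOR[ch]) stack.push(CLOSING_FOR[ch]);
    else if (ch === stack[stack.length - 1]) stack.pop();
  }
  throw new Error('Delimiter not found');
}

var nested = findDelimiter('#[p= [1, 2][i]]', ']', {start: 2});
var dumb = findDelimiter('#[p= [1, 2][i]]', ']', {start: 2, ignoreNesting: true});
// nested.src is 'p= [1, 2][i]' and dumb.src is 'p= [1, 2'
```

The only difference between the two calls is whether a delimiter seen while the bracket stack is non-empty is allowed to end the section.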

## API

All methods may throw an exception when they encounter a syntax error. The exception has an additional `code` property that is unique to the error and always starts with `CHARACTER_PARSER:`.

### parse(src, state = defaultState(), options = {start: 0, end: src.length})

Parses `src` starting at index `options.start` and returns the state after parsing that string.

To parse one string in multiple sections, pass the state returned by each call into the next `parse` call.

Returns a `State` object.
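The idea of threading a state through successive calls can be sketched without the library. The helper `feed` below is hypothetical and tracks brackets only; it assumes nothing about character-parser's internals and just shows why carrying the stack forward lets you tell when a multi-chunk expression is complete.

```javascript
// Hypothetical helper for illustration; brackets only, no strings or comments.
var OPEN_TO_CLOSE = {'(': ')', '[': ']', '{': '}'};

function feed(chunk, state) {
  state = state || {stack: [], src: ''};
  for (var i = 0; i < chunk.length; i++) {
    var ch = chunk[i];
    if (OPEN_TO_CLOSE[ch]) {
      state.stack.push(OPEN_TO_CLOSE[ch]); // remember the closer we expect
    } else if (ch === ')' || ch === ']' || ch === '}') {
      if (state.stack.pop() !== ch) {
        throw new Error('Unexpected "' + ch + '"');
      }
    }
  }
  state.src += chunk; // keep the concatenated source, as a State does
  return state;
}

var state = feed('foo(arg1, arg2, {\n foo: [a, b\n');
var pendingAfterFirst = state.stack.slice(); // three closers still expected
state = feed(' c, d]\n })', state);
var complete = state.stack.length === 0; // every bracket has been closed
```

Starting the second call from a fresh state instead would throw on the first `]`, because the matching `[` was seen in the earlier chunk.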

### parseUntil(src, delimiter, options = {start: 0, ignoreLineComment: false, ignoreNesting: false})

Parses the source until the first occurrence of `delimiter` that is not inside a string or a comment.

If `ignoreLineComment` is `true`, a delimiter that occurs inside a line comment still counts.

If `ignoreNesting` is `true`, parsing stops at the first occurrence of a bracket delimiter, regardless of whether that bracket is nested. See the "Dumb parsing" example above.

It returns an object with the structure:

```js
{
  start: 0, // index of first character of string
  end: 13, // index of first character after the end of string
  src: 'source string'
}
```
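Because `end` is exclusive, the three fields relate to the input through plain string slicing. The sketch below reuses the EJS-style values from the usage section rather than calling the library, so the `section` object is copied by hand and is an assumption about what a real call returns only to that extent.

```javascript
var input = '<%foo.bar("%>").baz%> bing bong';
var delimiter = '%>';

// Result copied from the EJS-style usage example (not computed here).
var section = {start: 2, end: 19, src: 'foo.bar("%>").baz'};

// `end` is exclusive, so slicing from `start` to `end` reproduces `src`.
var sameAsSrc = input.slice(section.start, section.end) === section.src;

// The delimiter itself begins at `end`; scanning can resume just past it.
var rest = input.slice(section.end + delimiter.length);
```

Here `sameAsSrc` is `true` and `rest` is the template text that follows the closing `%>`.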

### parseChar(character, state = defaultState())

Parses a single character and returns the state. See `parse` for the structure of the returned state object. N.B. `character` must be a single character, not a multi-character string.

### defaultState()

Get a default starting state.

### isPunctuator(character)

Returns `true` if `character` represents punctuation in JavaScript.

### isKeyword(name)

Returns `true` if `name` is a keyword in JavaScript.

### TOKEN_TYPES & BRACKETS

Objects whose values can be a frame in the `stack` property of a State (documented below).

## State

A state is an object with the following structure:

```js
{
  stack: [], // stack of detected brackets; the outermost is [0]
  regexpStart: false, // true if a slash is just encountered and a REGEXP state has just been added to the stack

  escaped: false, // true if in a string and the last character was an escape character
  hasDollar: false, // true if in a template string and the last character was a dollar sign

  src: '', // the concatenated source string
  history: '', // reversed `src`
  lastChar: '' // last parsed character
}
```

The `stack` property can contain any of the following:

- Any of the property values of `characterParser.TOKEN_TYPES`
- Any of the property values of `characterParser.BRACKETS` (the end bracket, not the starting bracket)

It also has the following useful methods:

- `.current()` returns the innermost bracket (i.e. the last stack frame).
- `.isString()` returns `true` if the current location is inside a string.
- `.isComment()` returns `true` if the current location is inside a comment.
- `.isNesting([opts])` returns `true` if the current location is not at the top level, i.e. if the stack is not empty. If `opts.ignoreLineComment` is `true`, line comments are not counted as a level, so for `// a` it will still return `false`.

### Errors

All errors thrown by character-parser have a `code` property that identifies the kind of error. Errors thrown from `parse` and `parseUntil` also have an `index` property.
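A caller can use the documented `CHARACTER_PARSER:` prefix to tell these errors apart from unrelated exceptions. The helper name `describeParserError` below is hypothetical, and the error it is exercised with is a hand-built stand-in whose code suffix is made up; only the prefix and the optional `index` property come from the description above.

```javascript
var CODE_PREFIX = 'CHARACTER_PARSER:';

// Returns null for errors that do not look like character-parser errors.
function describeParserError(err) {
  if (!err || typeof err.code !== 'string' || err.code.indexOf(CODE_PREFIX) !== 0) {
    return null;
  }
  return {
    kind: err.code.slice(CODE_PREFIX.length),
    // `index` is only documented for errors from `parse` and `parseUntil`.
    index: typeof err.index === 'number' ? err.index : null
  };
}

// Stand-in error for demonstration; 'EXAMPLE_ONLY' is not a real error code.
var fakeError = new Error('example');
fakeError.code = 'CHARACTER_PARSER:EXAMPLE_ONLY';
fakeError.index = 5;

var described = describeParserError(fakeError);
var unrelated = describeParserError(new TypeError('something else'));
```

In a real `catch` block, a `null` result would typically mean the error should be rethrown rather than reported as a template syntax problem.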

## Transition from v1

In character-parser@2, we have changed the APIs quite a bit. These are some notes that will help you transition to the new version.

### State Object Changes

Instead of keeping depths of different brackets, we are now keeping a stack. We also removed some properties:

```js
state.lineComment → state.current() === parser.TOKEN_TYPES.LINE_COMMENT
state.blockComment → state.current() === parser.TOKEN_TYPES.BLOCK_COMMENT
state.singleQuote → state.current() === parser.TOKEN_TYPES.SINGLE_QUOTE
state.doubleQuote → state.current() === parser.TOKEN_TYPES.DOUBLE_QUOTE
state.regexp → state.current() === parser.TOKEN_TYPES.REGEXP
```
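If existing code reads the removed boolean properties in many places, one possible migration aid is a small shim that derives them from `state.current()`. The function `legacyFlags` is hypothetical and not part of character-parser. To keep the sketch runnable on its own it is exercised with stub objects whose token values are placeholders; in real use you would pass a State returned by the parser and `parser.TOKEN_TYPES`.

```javascript
// Hypothetical shim: rebuilds the v1-style flags from a v2-style state.
function legacyFlags(state, tokenTypes) {
  var current = state.current();
  return {
    lineComment: current === tokenTypes.LINE_COMMENT,
    blockComment: current === tokenTypes.BLOCK_COMMENT,
    singleQuote: current === tokenTypes.SINGLE_QUOTE,
    doubleQuote: current === tokenTypes.DOUBLE_QUOTE,
    regexp: current === tokenTypes.REGEXP
  };
}

// Stubs with placeholder values; these are NOT the library's actual values.
var stubTokenTypes = {
  LINE_COMMENT: 'line',
  BLOCK_COMMENT: 'block',
  SINGLE_QUOTE: 'single',
  DOUBLE_QUOTE: 'double',
  REGEXP: 'regexp'
};
var stubState = {
  stack: ['single'],
  current: function () {
    return this.stack[this.stack.length - 1];
  }
};

var flags = legacyFlags(stubState, stubTokenTypes);
// Only flags.singleQuote is true for this stub state.
```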

### `parseMax`

This function has been removed because its usefulness was in doubt. `parseUntil` should be a better choice for the same task.

### `parseUntil`

The default behavior when the delimiter is a bracket has changed: nesting is now taken into account when determining whether the end has been reached.

To preserve the original behavior, pass `ignoreNesting: true` as an option.

To see the difference between the new and old behaviors, see the "Usage" section earlier.

### `parseMaxBracket`

This function has been merged into `parseUntil`. You can rename the function call directly, with no other changes needed.

## License

MIT