[tests]: http://img.shields.io/travis/mafintosh/csv-parser.svg
[tests-url]: http://travis-ci.org/mafintosh/csv-parser

[cover]: https://codecov.io/gh/mafintosh/csv-parser/branch/master/graph/badge.svg
[cover-url]: https://codecov.io/gh/mafintosh/csv-parser

[size]: https://packagephobia.now.sh/badge?p=csv-parser
[size-url]: https://packagephobia.now.sh/result?p=csv-parser

# csv-parser

[![tests][tests]][tests-url]
[![cover][cover]][cover-url]
[![size][size]][size-url]

Streaming CSV parser that aims for maximum speed as well as compatibility with
the [csv-spectrum](https://npmjs.org/csv-spectrum) CSV acid test suite.

`csv-parser` can convert CSV into JSON at a rate of around 90,000 rows per
second. Performance varies with the data used; try `bin/bench.js <your file>`
to benchmark your data.

`csv-parser` can be used in the browser with [browserify](http://browserify.org/).

[neat-csv](https://github.com/sindresorhus/neat-csv) can be used if a `Promise`
based interface to `csv-parser` is needed.

_Note: This module requires Node v8.16.0 or higher._

## Benchmarks

⚡️ `csv-parser` is greased-lightning fast

```console
→ npm run bench

  Filename                 Rows Parsed  Duration
  backtick.csv                       2     3.5ms
  bad-data.csv                       3    0.55ms
  basic.csv                          1    0.26ms
  comma-in-quote.csv                 1    0.29ms
  comment.csv                        2    0.40ms
  empty-columns.csv                  1    0.40ms
  escape-quotes.csv                  3    0.38ms
  geojson.csv                        3    0.46ms
  large-dataset.csv               7268      73ms
  newlines.csv                       3    0.35ms
  no-headers.csv                     3    0.26ms
  option-comment.csv                 2    0.24ms
  option-escape.csv                  3    0.25ms
  option-maxRowBytes.csv          4577      39ms
  option-newline.csv                 0    0.47ms
  option-quote-escape.csv            3    0.33ms
  option-quote-many.csv              3    0.38ms
  option-quote.csv                   2    0.22ms
  quotes+newlines.csv                3    0.20ms
  strict.csv                         3    0.22ms
  latin.csv                          2    0.38ms
  mac-newlines.csv                   2    0.28ms
  utf16-big.csv                      2    0.33ms
  utf16.csv                          2    0.26ms
  utf8.csv                           2    0.24ms
```

## Install

Using npm:

```console
$ npm install csv-parser
```

Using yarn:

```console
$ yarn add csv-parser
```

## Usage

To use the module, create a readable stream to a desired CSV file, instantiate
`csv`, and pipe the stream to `csv`.

Suppose you have a CSV file `data.csv` which contains the data:

```
NAME,AGE
Daffy Duck,24
Bugs Bunny,22
```

It could then be parsed, and results shown like so:

```js
const csv = require('csv-parser')
const fs = require('fs')
const results = [];

fs.createReadStream('data.csv')
  .pipe(csv())
  .on('data', (data) => results.push(data))
  .on('end', () => {
    console.log(results);
    // [
    //   { NAME: 'Daffy Duck', AGE: '24' },
    //   { NAME: 'Bugs Bunny', AGE: '22' }
    // ]
  });
```

To specify options for `csv`, pass an object argument to the function. For
example:

```js
csv({ separator: '\t' });
```

## API

### csv([options | headers])

Returns: a `Transform` stream that emits one `Object` per parsed row

#### options

Type: `Object`

As an alternative to passing an `options` object, you may pass an `Array[String]`
which specifies the headers to use. For example:

```js
csv(['Name', 'Age']);
```

If you need to specify options _and_ headers, please use the object notation
with the `headers` property as shown below.

#### escape

Type: `String`<br>
Default: `"`

A single-character string used to escape values in a CSV row.

#### headers

Type: `Array[String] | Boolean`

Specifies the headers to use. Headers define the property key for each value in
a CSV row. If no `headers` option is provided, `csv-parser` will use the first
line in a CSV file as the header specification.

If `false`, specifies that the first row in a data file does _not_ contain
headers, and instructs the parser to use the column index as the key for each column.
Using `headers: false` with the same `data.csv` example from above would yield:

```js
[
  { '0': 'Daffy Duck', '1': '24' },
  { '0': 'Bugs Bunny', '1': '22' }
]
```

_Note: If the `headers` option is used on a file whose first line contains headers, specify `skipLines: 1` to skip over that row, or the header row will appear as normal row data. Alternatively, use the `mapHeaders` option to manipulate the existing headers in that scenario._

#### mapHeaders

Type: `Function`

A function that can be used to modify the values of each header. Return a `String` to modify the header. Return `null` to remove the header, and its column, from the results.

```js
csv({
  mapHeaders: ({ header, index }) => header.toLowerCase()
})
```

##### Parameters

**header** _String_ The current column header.<br/>
**index** _Number_ The current column index.

#### mapValues

Type: `Function`

A function that can be used to modify the content of each column. The return value will replace the current column content.

```js
csv({
  mapValues: ({ header, index, value }) => value.toLowerCase()
})
```

##### Parameters

**header** _String_ The current column header.<br/>
**index** _Number_ The current column index.<br/>
**value** _String_ The current column value (or content).

#### newline

Type: `String`<br>
Default: `\n`

Specifies a single-character string to denote the end of a line in a CSV file.

#### quote

Type: `String`<br>
Default: `"`

Specifies a single-character string to denote a quoted string.

#### raw

Type: `Boolean`

If `true`, instructs the parser not to decode UTF-8 strings.

#### separator

Type: `String`<br>
Default: `,`

Specifies a single-character string to use as the column separator for each row.

#### skipComments

Type: `Boolean | String`<br>
Default: `false`

Instructs the parser to ignore lines which represent comments in a CSV file. Since no specification dictates what a CSV comment looks like, comments should be considered non-standard. The most common character used to signify a comment in a CSV file is `"#"`. If this option is set to `true`, lines which begin with `#` will be skipped. If a custom character is needed to denote a commented line, this option may be set to a string which represents the leading character(s) signifying a comment line.

#### skipLines

Type: `Number`<br>
Default: `0`

Specifies the number of lines at the beginning of a data file that the parser should
skip over, prior to parsing headers.

#### maxRowBytes

Type: `Number`<br>
Default: `Number.MAX_SAFE_INTEGER`

The maximum number of bytes permitted per row. An error is thrown if a line exceeds this value. The default effectively places no limit (2^53 − 1 bytes, roughly 9 petabytes).

#### strict

Type: `Boolean`

If `true`, instructs the parser that the number of columns in each row must match
the number of `headers` specified.

## Events

The following events are emitted during parsing:

### `data`

Emitted for each row of data parsed, with the notable exception of the header
row. Please see [Usage](#usage) for an example.

### `headers`

Emitted after the header row is parsed. The first parameter of the event
callback is an `Array[String]` containing the header names.

```js
fs.createReadStream('data.csv')
  .pipe(csv())
  .on('headers', (headers) => {
    console.log(`First header: ${headers[0]}`)
  })
```

### Readable Stream Events

Events available on Node's built-in
[Readable Streams](https://nodejs.org/api/stream.html#stream_class_stream_readable)
are also emitted. The `end` event should be used to detect the end of parsing.

## CLI

This module also provides a CLI which will convert CSV to
[newline-delimited](http://ndjson.org/) JSON. The following CLI flags can be
used to control how input is parsed:

```
Usage: csv-parser [filename?] [options]

  --escape,-e         Set the escape character (defaults to quote value)
  --headers,-h        Explicitly specify csv headers as a comma separated list
  --help              Show this help
  --output,-o         Set output file. Defaults to stdout
  --quote,-q          Set the quote character ('"' by default)
  --remove            Remove columns from output by header name
  --separator,-s      Set the separator character ("," by default)
  --skipComments,-c   Skip CSV comments that begin with '#'. Set a value to change the comment character.
  --skipLines,-l      Set the number of lines to skip before parsing headers
  --strict            Require column length to match headers length
  --version,-v        Print out the installed version
```

For example, to parse a TSV file:

```
cat data.tsv | csv-parser -s $'\t'
```

## Encoding

Users may encounter issues with the encoding of a CSV file. Transcoding the
source stream can be done neatly with a module such as:

- [`iconv-lite`](https://www.npmjs.com/package/iconv-lite)
- [`iconv`](https://www.npmjs.com/package/iconv)

Or with native [`iconv`](http://man7.org/linux/man-pages/man1/iconv.1.html) if part
of a pipeline.

## Byte Order Marks

Some CSV files may be generated with, or contain, a leading [Byte Order Mark](https://en.wikipedia.org/wiki/Byte_order_mark#UTF-8). This may cause issues parsing headers and/or data from your file. From Wikipedia:

> The Unicode Standard permits the BOM in UTF-8, but does not require nor recommend its use. Byte order has no meaning in UTF-8.

To use this module with a file containing a BOM, please use a module like [strip-bom-stream](https://github.com/sindresorhus/strip-bom-stream) in your pipeline:

```js
const fs = require('fs');

const csv = require('csv-parser');
const stripBom = require('strip-bom-stream');

fs.createReadStream('data.csv')
  .pipe(stripBom())
  .pipe(csv())
  ...
```

When using the CLI, the BOM can be removed by first running:

```console
$ sed $'s/\xEF\xBB\xBF//g' data.csv
```

## Meta

[CONTRIBUTING](./.github/CONTRIBUTING)

[LICENSE (MIT)](./LICENSE)