[![Build Status](https://travis-ci.org/Keyang/node-csvtojson.svg?branch=master)](https://travis-ci.org/Keyang/node-csvtojson)

# CSVTOJSON
An all-you-need Node.js CSV to JSON converter.
* Large CSV data
* Command Line Tool and Node.js Lib
* Complex/nested JSON
* Easy Customised Parser
* Stream based
* Multi CPU core support
* Easy Usage
* More!

# Demo

[Here](http://keyangxiang.com/csvtojson/) is a free online CSV to JSON service utilising the latest csvtojson module.

## Menu
* [Installation](#installation)
* [Usage](#usage)
  * [Library](#library)
    * [Convert from a file](#from-file)
    * [Convert from a web resource / Readable stream](#from-web)
    * [Convert from CSV string](#from-string)
* [Parameters](#params)
* [Result Transform](#result-transform)
  * [Synchronous Transformer](#synchronous-transformer)
  * [Asynchronous Transformer](#asynchronous-transformer)
  * [Convert to other data type](#convert-to-other-data-type)
* [Hooks](#hooks)
* [Events](#events)
* [Flags](#flags)
* [Big CSV File Streaming](#big-csv-file)
* [Process Big CSV File in CLI](#convert-big-csv-file-with-command-line-tool)
* [Parse String](#parse-string)
* [Empowered JSON Parser](#empowered-json-parser)
* [Field Type](#field-type)
* [Multi-Core / Fork Process](#multi-cpu-core)
* [Header Configuration](#header-configuration)
* [Error Handling](#error-handling)
* [Customised Parser](#parser)
* [Stream Options](#stream-options)
* [Change Log](#change-log)

GitHub: https://github.com/Keyang/node-csvtojson

## Installation

>npm install -g csvtojson

>npm install csvtojson --save

## Usage

### Library

#### From File

You can use a file stream:

```js
//Converter Class
var Converter = require("csvtojson").Converter;
var converter = new Converter({});

//end_parsed will be emitted once parsing finished
converter.on("end_parsed", function (jsonArray) {
  console.log(jsonArray); //here is your result json array
});

//read from file
require("fs").createReadStream("./file.csv").pipe(converter);
```

Or use the convenient fromFile function:

```js
//Converter Class
var Converter = require("csvtojson").Converter;
var converter = new Converter({});
converter.fromFile("./file.csv", function (err, result) {
  //result is the converted json array; err is set if parsing failed
});
```

#### From Web

To convert CSV data from any readable stream, simply pipe the data in.

```js
//Converter Class
var Converter = require("csvtojson").Converter;
var converter = new Converter({constructResult:false}); //for big csv data

//record_parsed will be emitted for each csv row processed
converter.on("record_parsed", function (jsonObj) {
  console.log(jsonObj); //here is your result json object
});

require("request").get("http://csvwebserver").pipe(converter);
```

#### From String

```js
var Converter = require("csvtojson").Converter;
var converter = new Converter({});
converter.fromString(csvString, function (err, result) {
  //your code here
});
```
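
For example, with a literal CSV string (a minimal sketch; the two-row data is invented for illustration):

```js
var Converter = require("csvtojson").Converter;
var converter = new Converter({});
var csvString = "name,age\nTom,12\nJerry,10";
converter.fromString(csvString, function (err, result) {
  if (err) {
    return console.error(err);
  }
  //with the default checkType:true, age is converted to a number
  console.log(result); //[{"name":"Tom","age":12},{"name":"Jerry","age":10}]
});
```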

### Command Line Tool

>csvtojson <csv file path>

Example:

>csvtojson ./myCSVFile <option1=value>

Or use a pipe:

>cat myCSVFile | csvtojson

Check the current version:

>csvtojson version

For advanced usage with parameter support, check the help:

>csvtojson --help

# Params

The constructor of the CSV Converter accepts parameters:

```js
var Converter = require("csvtojson").Converter;
var converter = new Converter({
  constructResult: false,
  workerNum: 4,
  noheader: true
});
```

The following parameters are supported:

* **constructResult**: true/false. Whether to construct the final json object in memory, which will be populated in the "end_parsed" event. Set to false when dealing with huge csv data. Default: true.
* **delimiter**: delimiter used for separating columns. Use "auto" if the delimiter is unknown in advance; in this case, the delimiter will be auto-detected (by best attempt). Use an array to give a list of potential delimiters e.g. [",","|","$"]. Default: ","
* **quote**: If a column contains the delimiter, a quote character can be used to surround the column content. e.g. "hello, world" won't be split into two columns while parsing. Set to "off" to ignore all quotes. Default: " (double quote)
* **trim**: Indicates whether the parser trims spaces surrounding column content. e.g. " content " will be trimmed to "content". Default: true
* **checkType**: Turns field type checking on and off. Default: true. See [Field Type](#field-type)
* **toArrayString**: Stringify the stream output to a JSON array. This is useful when piping output to a file which expects a stringified JSON array. Default: false, in which case only stringified JSON (without []) is pushed downstream.
* **ignoreEmpty**: Ignore empty values in CSV columns. If a column value is not given, set this to true to skip it. Default: false.
* **workerNum**: Number of worker processes. The worker processes use multiple CPU cores to help process CSV data. Set to the number of cores to improve the performance of processing large csv files. Keep it at 1 for small csv files. Default: 1.
* **fork (Deprecated, same as workerNum=2)**: Use another CPU core to process the CSV stream.
* **noheader**: Indicates that the csv data has no header row and the first row is a data row. Default: false. See [Header Configuration](#header-configuration)
* **headers**: An array to specify the headers of the CSV data. If noheader is false, this value will override the CSV header row. Default: null. Example: ["my field","name"]. See [Header Configuration](#header-configuration)
* **flatKeys**: Don't interpret dots (.) and square brackets in header fields as nested object or array identifiers (treat them like regular characters for JSON field identifiers). Default: false.
* **maxRowLength**: the maximum number of characters a csv row can have. 0 means no limit. If the maximum is exceeded, the parser will emit an "error" of "row_exceed". If possibly corrupted csv data is provided, give it a number like 65535 so the parser won't exhaust memory. Default: 0
* **checkColumn**: whether to check that the column count of a row matches the header count. If the counts mismatch, an error of "mismatched_column" will be emitted. Default: false
* **eol**: End of line character. If omitted, the parser will attempt to retrieve it from the first chunk of CSV data. If no valid eol is found, the operating system eol will be used.
* **escape**: escape character used in quoted columns. Default is double quote (") according to RFC 4180. Change to backslash (\) or other characters for your own case.

All parameters can also be used in the command line tool. See:

```
csvtojson --help
```

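For example, constructor parameters map onto `--key=value` flags (a hypothetical invocation using the documented delimiter and checkType parameters):

```
csvtojson --delimiter="|" --checkType=false ./myfile.csv
```
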
# Result Transform

To transform the JSON result (e.g. change the value of a column), simply add a 'transform' handler.

## Synchronous Transformer

```js
var Converter = require("csvtojson").Converter;
var csvConverter = new Converter({});
csvConverter.transform = function (json, row, index) {
  json["rowIndex"] = index;
  /* some other examples:
  delete json["myfield"]; //remove a field
  json["dateOfBirth"] = new Date(json["dateOfBirth"]); //convert a field type
  */
};
csvConverter.fromString(csvString, function (err, result) {
  //every result row will have a 'rowIndex' field indicating its row number in the csv data:
  /*
  [{
    field1: value1,
    rowIndex: 0
  }]
  */
});
```

As shown in the example above, any changes applied to the result JSON will be pushed downstream and to the "record_parsed" event.

## Asynchronous Transformer

Asynchronous transformation can be achieved either through the "record_parsed" event or by creating a Writable stream.

### Use record_parsed

To transform data asynchronously, it is suggested to use csvtojson with [Async Queue](https://github.com/caolan/async#queue).

This is mainly used when the transformation of each csv row needs to be mashed with data retrieved from an external source such as a database, server, or file system.

However, this approach will **not** change the json result pushed downstream.

Here is an example:

```js
var Conv = require("csvtojson").Converter;
var async = require("async");
var rs = require("fs").createReadStream("path/to/csv"); //or any readable stream of csv data
var q = async.queue(function (json, callback) {
  //process the json asynchronously.
  require("request").get("http://myserver/user/" + json.userId, function (err, user) {
    //do the data mash here
    json.user = user;
    callback();
  });
}, 10); //10 concurrent workers at a time
q.saturated = function () {
  rs.pause(); //if the queue is full, it is suggested to pause the read stream so csvtojson will suspend populating json data. It is ok not to do so if the CSV data is not very large.
};
q.empty = function () {
  rs.resume(); //resume the paused readable stream. You may need to check whether the readable stream isPaused() (since node 0.12) or has finished.
};
var conv = new Conv({constructResult: false});
conv.transform = function (json) {
  q.push(json);
};
conv.on("end_parsed", function () {
  q.drain = function () {
    //code to run when the queue has finished processing
  };
});
rs.pipe(conv);
```

In the example above, the transformation happens as each csv row is processed. The related user info is pulled from a web server and mashed into the json result.

There will be at most 10 data transformation workers working concurrently, with the help of Async Queue.

### Use Stream

You can also create a Writable (or Transform) stream which processes data asynchronously. See [here](https://nodejs.org/dist/latest-v4.x/docs/api/stream.html#stream_class_stream_transform) for more details.
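
A minimal sketch of this approach, assuming the converter's default output where each downstream chunk is one stringified JSON row (per the toArrayString parameter above): the Writable's write callback runs per chunk, and stream back-pressure makes the converter wait until cb() is called.

```js
var Writable = require("stream").Writable;
var Converter = require("csvtojson").Converter;

var sink = new Writable({
  write: function (chunk, encoding, cb) {
    var json = JSON.parse(chunk.toString()); //one stringified row per chunk (assumed)
    setTimeout(function () {
      //asynchronous work on json here (e.g. a database insert)
      cb(); //signals the converter to push the next row
    }, 10);
  }
});

require("fs").createReadStream("./file.csv")
  .pipe(new Converter({constructResult: false}))
  .pipe(sink);
```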

## Convert to other data type

Below is an example of a result transformation which converts csv data to column arrays rather than JSON objects.

```js
var fs = require("fs");
var Converter = require("csvtojson").Converter;
var columArrData = __dirname + "/data/columnArray";
var rs = fs.createReadStream(columArrData);
var result = {};
var csvConverter = new Converter();
//end_parsed will be emitted once parsing finished
csvConverter.on("end_parsed", function (jsonObj) {
  console.log(result);
  console.log("Finished parsing");
});

//record_parsed will be emitted each time a row has been parsed.
csvConverter.on("record_parsed", function (resultRow, rawRow, rowIndex) {
  for (var key in resultRow) {
    if (!(result[key] instanceof Array)) {
      result[key] = [];
    }
    result[key][rowIndex] = resultRow[key];
  }
});
rs.pipe(csvConverter);
```

Here is an example input:

```csv
TIMESTAMP,UPDATE,UID,BYTES SENT,BYTES RCVED
1395426422,n,10028,1213,5461
1395426422,n,10013,9954,13560
1395426422,n,10109,221391500,141836
1395426422,n,10007,53448,308549
1395426422,n,10022,15506,72125
```

It will be converted to:

```json
{
  "TIMESTAMP": ["1395426422", "1395426422", "1395426422", "1395426422", "1395426422"],
  "UPDATE": ["n", "n", "n", "n", "n"],
  "UID": ["10028", "10013", "10109", "10007", "10022"],
  "BYTES SENT": ["1213", "9954", "221391500", "53448", "15506"],
  "BYTES RCVED": ["5461", "13560", "141836", "308549", "72125"]
}
```

# Hooks

## preProcessRaw

This hook is called when the parser receives any data from the upstream source, and allows developers to change it. e.g.
```js
/*
CSV data:
a,b,c,d,e
12,e3,fb,w2,dd
*/

var conv = new Converter();
conv.preProcessRaw = function (data, cb) {
  //change all 12 to 23
  cb(data.replace(/12/g, "23"));
};
conv.fromString(csv, function (err, json) {
  //json: {a:23 ...}
});
```

By default, preProcessRaw just returns the data from the source:

```js
Converter.prototype.preProcessRaw = function (data, cb) {
  cb(data);
};
```
It is also a good place to sanitise/prepare the CSV data stream:

```js
var headWhiteSpaceRemoved = false;
conv.preProcessRaw = function (data, cb) {
  if (!headWhiteSpaceRemoved) {
    headWhiteSpaceRemoved = true; //only strip leading whitespace from the first chunk
    cb(data.replace(/^\s+/, ""));
  } else {
    cb(data);
  }
};
```
## preProcessLine

This hook is called when a file line is emitted. It is called with two parameters, `fileLineData` and `lineNumber`. The `lineNumber` starts from 1.

```js
/*
CSV data:
a,b,c,d,e
12,e3,fb,w2,dd
*/

var conv = new Converter();
conv.preProcessLine = function (line, lineNumber) {
  //only change 12 to 23 on line 2
  if (lineNumber === 2) {
    line = line.replace("12", "23");
  }
  return line;
};
conv.fromString(csv, function (err, json) {
  //json: {a:23 ...}
});
```

Note that preProcessLine is synchronous: unlike the preProcessRaw hook, it does not support asynchronous changes.


# Events

The following events are emitted by the Converter class:

* end_parsed: emitted when parsing has finished. The callback receives the JSON object if constructResult is set to true.
* record_parsed: emitted each time a row has been parsed. The callback has the following parameters: result row JSON object reference, original row array object reference, row index of the current row in the csv (the header row does not count; the first content row has index 0).

To subscribe to the events:

```js
//Converter Class
var Converter = require("csvtojson").Converter;
var csvConverter = new Converter();

//end_parsed will be emitted once parsing finished
csvConverter.on("end_parsed", function (jsonObj) {
  console.log(jsonObj); //here is your result json object
});

//record_parsed will be emitted each time a row has been parsed.
csvConverter.on("record_parsed", function (resultRow, rawRow, rowIndex) {
  console.log(resultRow); //here is your result row json object
});
```

# Flags

The library supports the following flags:

\*omit\*: Omit a column. The values in the column will not be built into the JSON result.

\*flat\*: Mark a header column so that its name is used as the JSON key "as is", without nested interpretation.

Example:

```csv
*flat*user.name, user.age, *omit*user.gender
Joe , 40, Male
```

It will be converted to:

```js
[{
  "user.name": "Joe",
  "user": {
    "age": 40
  }
}]
```
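
The same data can be fed in programmatically (a minimal sketch re-using the example above):

```js
var Converter = require("csvtojson").Converter;
var converter = new Converter({});
var csv = "*flat*user.name, user.age, *omit*user.gender\nJoe , 40, Male";
converter.fromString(csv, function (err, result) {
  console.log(result); //[{"user.name":"Joe","user":{"age":40}}]
});
```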

# Big CSV File

The csvtojson library was designed to handle big csv file conversion. To avoid excessive memory consumption, it is recommended to use a read stream and a write stream.

```js
var Converter = require("csvtojson").Converter;
//constructResult:false turns off final result construction. This avoids huge memory
//consumption while parsing. The trade-off is that the final result will not be
//populated to the end_parsed event.
var csvConverter = new Converter({constructResult: false});

var readStream = require("fs").createReadStream("inputData.csv");

var writeStream = require("fs").createWriteStream("outputData.json");

readStream.pipe(csvConverter).pipe(writeStream);
```

constructResult:false tells the converter not to accumulate the final result, which would otherwise drain memory as parsing progresses. The output is piped directly to writeStream.

# Convert Big CSV File with Command Line Tool

The csvtojson command line tool supports streaming in a big csv file and streaming out a json file.

This makes it very convenient to process any kind of big csv file. It has been shown to process csv files of over 3,000,000 lines (over 500MB) with memory usage under 30MB.

Once you have installed [csvtojson](#installation), you can use the tool with:

```
csvtojson [path to bigcsvdata] > converted.json
```

Or, if you prefer streaming data in from another application:

```
cat [path to bigcsvdata] | csvtojson > converted.json
```

They will do the same job.

# Parse String

To parse a string, simply call the fromString(csvString, callback) method. The callback parameter is optional.

For example:

```js
var fs = require("fs");
var Converter = require("csvtojson").Converter;

var testData = __dirname + "/data/testData";
var data = fs.readFileSync(testData).toString();
var csvConverter = new Converter();

//end_parsed will be emitted once parsing finished
csvConverter.on("end_parsed", function (jsonObj) {
  //the final result is populated here as normal
});
csvConverter.fromString(data, function (err, jsonObj) {
  if (err) {
    //error handling
  }
  console.log(jsonObj);
});
```

# Empowered JSON Parser

*Note: If you want to keep the original CSV header values as JSON keys "as is", without them being interpreted as (complex) JSON structures, you can set the option `--flatKeys=true`.*

Since version 0.3.8, csvtojson can replicate any complex JSON structure.
As we know, a JSON object represents a graph, while CSV is only a 2-dimensional data structure (a table).
To make JSON and CSV contain the same amount of information, we need to "flatten" some of the information in JSON.

Here is an example. Original CSV:

```csv
fieldA.title, fieldA.children[0].name, fieldA.children[0].id,fieldA.children[1].name, fieldA.children[1].employee[].name,fieldA.children[1].employee[].name, fieldA.address[],fieldA.address[], description
Food Factory, Oscar, 0023, Tikka, Tim, Joe, 3 Lame Road, Grantstown, A fresh new food factory
Kindom Garden, Ceil, 54, Pillow, Amst, Tom, 24 Shaker Street, HelloTown, Awesome castle
```

The data above contains nested JSON, including nested arrays of JSON objects and plain text.

Using csvtojson to convert, the result would be:

```json
[{
  "fieldA": {
    "title": "Food Factory",
    "children": [{
      "name": "Oscar",
      "id": "0023"
    }, {
      "name": "Tikka",
      "employee": [{
        "name": "Tim"
      }, {
        "name": "Joe"
      }]
    }],
    "address": ["3 Lame Road", "Grantstown"]
  },
  "description": "A fresh new food factory"
}, {
  "fieldA": {
    "title": "Kindom Garden",
    "children": [{
      "name": "Ceil",
      "id": "54"
    }, {
      "name": "Pillow",
      "employee": [{
        "name": "Amst"
      }, {
        "name": "Tom"
      }]
    }],
    "address": ["24 Shaker Street", "HelloTown"]
  },
  "description": "Awesome castle"
}]
```

Here are the rules for CSV data headers:

* Use a dot (.) to represent nested JSON. e.g. field1.field2.field3 will be converted to {field1:{field2:{field3:< value >}}}
* Use square brackets ([]) to represent an array. e.g. field1.field2[< index >] will be converted to {field1:{field2:[< values >]}}. Different columns with the same header name will be added to the same array.
* An array can contain nested JSON objects. e.g. field1.field2[< index >].name will be converted to {field1:{field2:[{name:< value >}]}}
* The index can be omitted in some situations. However, this causes information loss. Therefore the index should **NOT** be omitted if the array contains JSON objects with more than 1 field (see the fieldA.children[1].employee field in the example above; omitting is still ok if the child JSON contains only 1 field). A runnable sketch of these rules follows below.

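A small runnable sketch of these header rules (the order data is invented for illustration; with the default checkType:true, the numeric id is converted to a number):

```js
var Converter = require("csvtojson").Converter;
var converter = new Converter({});
var csv = "order.id,order.items[0].sku,order.items[1].sku,note\n" +
  "1001,A-1,B-2,first order";
converter.fromString(csv, function (err, result) {
  console.log(JSON.stringify(result));
  //[{"order":{"id":1001,"items":[{"sku":"A-1"},{"sku":"B-2"}]},"note":"first order"}]
});
```
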
Since 0.3.8, the JSON parser is the default parser. There is no need to add "\*json\*" to column titles. Theoretically, the JSON parser should now cover the functionality of the "Array" parser, the "JSONArray" parser, and the old "JSON" parser.

This mainly prepares for the next few versions, where csvtojson could convert a JSON object back to CSV format without losing information.
It can be used to process JSON data exported from NoSQL databases like MongoDB.

# Field Type

From version 0.3.14, field types are supported by csvtojson.
The checkType parameter controls whether to check and convert field types.
See [here](#params) for the parameter usage.

Thanks to all who have contributed to ticket [#20](https://github.com/Keyang/node-csvtojson/issues/20).

## Implicit Type

When checkType is turned on, the parser will try to convert a value to its implicit type if it is not explicitly specified.

For example, the csv data:

```csv
name, age, married, msg
Tom, 12, false, {"hello":"world","total":23}
```

will be converted into:

```json
{
  "name": "Tom",
  "age": 12,
  "married": false,
  "msg": {
    "hello": "world",
    "total": "23"
  }
}
```

If checkType is turned **OFF**, it will be converted to:

```json
{
  "name": "Tom",
  "age": "12",
  "married": "false",
  "msg": "{\"hello\":\"world\",\"total\":23}"
}
```

## Explicit Type

A CSV header column can explicitly define the type of the field.
Simply add the type before the column name, separated by a hash and exclamation mark (#!).

### Supported types:
* string
* number

### Define Type
To define field types, see the following example:

```csv
string#!appNumber, string#!finished, *flat*string#!user.msg, unknown#!msg
201401010002, true, {"hello":"world","total":23},a message
```

The data will be converted to:

```json
{
  "appNumber": "201401010002",
  "finished": "true",
  "user.msg": "{\"hello\":\"world\",\"total\":23}"
}
```

## Multi-CPU (Core)

Since version 0.4.0, csvtojson supports multiple CPU cores to process large csv files.
The implementation and benchmark results can be found [here](http://keyangxiang.com/2015/06/11/node-js-multi-core-programming-pracitse/).

To enable multi-core, just pass the worker number as a parameter to the constructor:

```js
var Converter = require("csvtojson").Converter;
var converter = new Converter({
  workerNum: 2 //use two cores
});
```

The minimum worker number is 1. When the worker number is larger than 1, the parser will balance the job load among the workers.

For the command line, just use the `--workerNum` argument:

```
csvtojson --workerNum=3 ./myfile.csv
```

It is worth mentioning that for small CSV files, it actually costs more time to create the processes and keep them communicating. Therefore, use fewer workers for small CSV files.

### Fork Process (Deprecated since 0.5.0)

*Node.js runs on a single thread; you will not want to convert a large csv file in the same process where your node.js web server is running. csvtojson gives the option to fork the whole conversion to a new system process, while the original process only pipes the input and result in and out. It is very simple to enable this feature:*

```js
var Converter = require("csvtojson").Converter;
var converter = new Converter({
  fork: true //use a child process to convert
});
```

*As with multiple workers, forking a new process causes extra cost in process communication and life cycle management. Use it wisely.*

Since 0.5.0, fork=true is the same as workerNum=2.

### Header Configuration

The CSV header row can be configured programmatically.

The *noheader* parameter indicates whether the first row of the csv is a header row or not. e.g. CSV data:

```
CC102-PDMI-001,eClass_5.1.3,10/3/2014,12,40,green,40
CC200-009-001,eClass_5.1.3,11/3/2014,5,3,blue,38,extra field!
```

With noheader=true:

```
csvtojson ./test/data/noheadercsv --noheader=true
```

we get the following result:

```json
[
{"field1":"CC102-PDMI-001","field2":"eClass_5.1.3","field3":"10/3/2014","field4":"12","field5":"40","field6":"green","field7":"40"},
{"field1":"CC200-009-001","field2":"eClass_5.1.3","field3":"11/3/2014","field4":"5","field5":"3","field6":"blue","field7":"38","field8":"extra field!"}
]
```

Or we can use it in code:

```js
var Converter = require("csvtojson").Converter;
var converter = new Converter({noheader: true});
```

The *headers* parameter specifies the header row as an array. If *noheader* is false, this value will override the csv header row. With the csv data above, run the command:

```
csvtojson ./test/data/noheadercsv --noheader=true --headers='["hell","csv"]'
```

we get the following result:

```json
[
  {"hell":"CC102-PDMI-001","csv":"eClass_5.1.3","field3":"10/3/2014","field4":"12","field5":"40","field6":"green","field7":"40"},
  {"hell":"CC200-009-001","csv":"eClass_5.1.3","field3":"11/3/2014","field4":"5","field5":"3","field6":"blue","field7":"38","field8":"extra field!"}
]
```

If the headers array is shorter than the csv columns, the converter will automatically name the remaining columns "field*", where * is the current column index starting from 1.

We can also use it in code:

```js
var Converter = require("csvtojson").Converter;
var converter = new Converter({headers: ["my header1", "hello world"]});
```

# Error Handling

Since version 0.4.4, the parser detects CSV data corruption. It is important to catch these errors if the CSV data is not guaranteed to be correct. Simply register a listener on the error event:

```js
var Converter = require("csvtojson").Converter;
var converter = new Converter();
converter.on("error", function (errMsg, errData) {
  //do error handling here
});
```

Once an error is emitted, the parser will continue parsing the csv data if the upstream source is still populating data. Therefore, a general practice is to close / destroy the upstream source once an error is captured, as sketched below.

Here are the built-in error messages and their corresponding error data:

* unclosed_quote: If a quote in the csv is not closed, this error will be populated. The error data is a string containing the unclosed csv row.
* row_exceed: If maxRowLength is given a number larger than 0 and a row is longer than that value, this error will be populated. The error data is a string containing the csv row exceeding the length.
* row_process: Any error that happens while the parser processes a csv row will populate this error message. The error data is a detailed error message (e.g. checkColumn is true and the column count of a row does not match that of the header).
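
A minimal sketch of that practice, assuming a file source (the file name and parameter values are illustrative):

```js
var fs = require("fs");
var Converter = require("csvtojson").Converter;

var readStream = fs.createReadStream("./possiblyCorrupted.csv");
var converter = new Converter({maxRowLength: 65535, checkColumn: true});

converter.on("error", function (errMsg, errData) {
  console.error(errMsg, errData);
  //stop feeding the parser once corruption is detected
  readStream.unpipe(converter);
  readStream.destroy();
});

readStream.pipe(converter);
```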


# Parser

**Parser will be replaced by [Result Transform](#result-transform) and [Flags](#flags).**

This feature will be disabled in a future release.

CSVTOJSON allows adding customised parsers which concentrate on what to parse and how to parse it.
This is the main power of the tool: the developer only needs to concentrate on how to deal with the data, while other concerns like streaming, memory, web, cli etc. are handled automatically.

How to add a customised parser:

```js
//Parser Manager
var parserMgr = require("csvtojson").parserMgr;

parserMgr.addParser("myParserName", /^\*parserRegExp\*/, function (params) {
  var columnTitle = params.head; //params.head is like: *parserRegExp*ColumnName
  var fieldName = columnTitle.replace(this.regExp, ""); //this.regExp is the regular expression above.
  params.resultRow[fieldName] = "Hello my parser" + params.item;
});
```

parserMgr's addParser function takes three parameters:

1. Parser name: the name of your parser. It should be unique.

2. Regular expression: used to test whether a column of CSV data uses this parser. In the example above, any column whose header starts with \*parserRegExp\* will use it.

3. Parse function callback: where the parsing happens. The converter works row by row, so the function is called each time a cell of CSV data needs to be parsed.

The parameter of the parse function is a JSON object containing the following fields:

**head**: the column's header (first row) data. It generally contains field information. e.g. *array*items

**item**: the data inside the current cell. e.g. item1

**itemIndex**: the index of the current cell in its row. e.g. 0

**rawRow**: a reference to the current row in array format. e.g. ["item1", 23, "hello"]

**resultRow**: a reference to the result row in JSON format. e.g. {"name":"Joe"}

**rowIndex**: the index of the current row in the CSV data, starting from 1 since 0 is the header. e.g. 1

**resultObject**: a reference to the result object in JSON format. It always has a field called csvRows which is in array format, and it changes as parsing goes on. e.g.

```json
{
  "csvRows": [
    {
      "itemName": "item1",
      "number": 10
    },
    {
      "itemName": "item2",
      "number": 4
    }
  ]
}
```
# Stream Options

Since version 1.0.0, the Converter constructor takes stream options as a second parameter.

```js
const conv = new Converter(params, {
  objectMode: true, //stream down JSON objects instead of a JSON array
  highWaterMark: 65535 //buffer level
});
```
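
For example, with objectMode:true each "data" event receives a parsed JSON object rather than a string chunk (a minimal sketch, assuming the option behaves as described above):

```js
var Converter = require("csvtojson").Converter;
var conv = new Converter({}, {objectMode: true});
conv.on("data", function (jsonObj) {
  console.log(jsonObj); //a plain object per csv row, not a stringified chunk
});
require("fs").createReadStream("./file.csv").pipe(conv);
```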

See more detailed information [here](https://nodejs.org/api/stream.html#stream_class_stream_transform).

# Change Log

## 1.1.0

* Removed support for `new Converter(true)`

## 1.0.2
* Supported ndjson format as per #113 and #87
* Issue: #120

## 1.0.0
* Added [Stream Options](#stream-options)
* Changed version syntax to follow x.y.z

## 0.5.12
* Added support for scientific notation numbers (#100)
* Added "off" option to the quote parameter

## 0.5.4
* Added new feature: the delimiter parameter accepts "auto" and arrays

## 0.5.2

* Changed the type separator from # to #!
* Fixed bugs

## 0.5.0

* Fixed some bugs
* Performance improvements
* **Implicit typing of numbers now uses the RegExp /^[-+]?[0-9]*\.?[0-9]+$/. Previously 00131 was a string; now it will be recognised as a number**
* **If a column has no header, the current column index is now used as the column name: 'field*'. Previously the parser used a fixed index starting from 1. e.g. for csv data 'aa,bb,cc' with header 'a,b', it previously converted to {'a':'aa','b':'bb','field1':'cc'} and now converts to {'a':'aa','b':'bb','field3':'cc'}**

## 0.4.7
* ignoreEmpty now ignores empty rows as well
* Optimised performance
* Added fromFile method

## 0.4.4
* Added error handling for corrupted CSV data
* Exposed "eol" param

## 0.4.3
* Added header configuration
* Refactored worker code
* **A number type field now returns 0 if parseFloat returns NaN for the value of the field. Previously it returned the original value as a string.**

## 0.4.0
* Added multi-core CPU support to increase performance
* Added "fork" option to delegate csv converting work to another process
* Refactored general flow

## 0.3.21
* Refactored command line tool
* Added ignoreEmpty parameter

## 0.3.18
* Fixed double quote parsing as per the CSV standard

## 0.3.14
* Added field type support
* Fixed some minor bugs

## 0.3.8
* Empowered the built-in JSON parser
* Change: use the JSON parser as the default parser
* Added the trim parameter to the constructor. Default: true. trim will trim spaces surrounding content

## 0.3.5
* Added fromString method to support direct string input

## 0.3.4
* Added more parameters to the command line tool

## 0.3.2
* Added the quote parameter to support quoted column content containing delimiters
* Changed row index to start from 0 instead of 1 when populated from the record_parsed event

## 0.3
* Removed all dependencies
* Deprecated applyWebServer
* Added construct parameter for the Converter class
* The Converter class now works as a proper stream object

# IMPORTANT!!

Since version 0.3, the core class of csvtojson has inherited from the stream.Transform class. It therefore behaves like a normal Stream object, and the old non-stream interfaces (like `from`) are no longer available. Now the usage is like:

```js
//Converter Class
var fs = require("fs");
var Converter = require("csvtojson").Converter;
var fileStream = fs.createReadStream("./file.csv");
//new converter instance
var converter = new Converter({constructResult: true});
//end_parsed will be emitted once parsing finished
converter.on("end_parsed", function (jsonObj) {
  console.log(jsonObj); //here is your result json object
});
//read from file
fileStream.pipe(converter);
```

To convert from a string, previously the code was:

```js
csvConverter.from(csvString);
```

Now it is:

```js
csvConverter.fromString(csvString, callback);
```

The callback function above is optional. See [Parse String](#parse-string).

After version 0.3, csvtojson requires Node.js 0.10 or above.