# xml-flow
[![NPM version][npm-image]][npm-url] [![Build status][travis-image]][travis-url] [![Test coverage][coveralls-image]][coveralls-url]

Dealing with XML data can be frustrating, especially if you have a lot of it. Most XML readers work on the entire XML document as a String, which can be problematic if you need to read very large XML files. With xml-flow, you can use streams to load only a small part of an XML document into memory at a time.

xml-flow has only one dependency, [sax-js](https://github.com/isaacs/sax-js). Because sax-js is pure JavaScript, xml-flow runs nicely in Windows environments.

## Installation

```
$ npm install xml-flow
```

## Getting started
xml-flow tries to keep the parsed output as simple as possible. Here's an example:

### Input File
```xml
<root>
  <person>
    <name>Bill</name>
    <id>1</id>
    <age>27</age>
  </person>
  <person>
    <name>Sally</name>
    <id>2</id>
    <age>29</age>
  </person>
  <person>
    <name>Kelly</name>
    <id>3</id>
    <age>37</age>
  </person>
</root>
```

### Usage
```js
var fs = require('fs')
  , flow = require('xml-flow')
  , inFile = fs.createReadStream('./your-xml-file.xml')
  , xmlStream = flow(inFile)
;

// Fires once for each <person> tag as soon as it has been parsed
xmlStream.on('tag:person', function(person) {
  console.log(person);
});
```

### Output
```js
{name: 'Bill', id: '1', age: '27'}
{name: 'Sally', id: '2', age: '29'}
{name: 'Kelly', id: '3', age: '37'}
```

## Features
### Attribute-only Tags
The above example shows the output for an XML document with no attributes. What about the opposite, where all of the data is in attributes?

##### Input
```xml
<root>
  <person name="Bill" id="1" age="27"/>
  <person name="Sally" id="2" age="29"/>
  <person name="Kelly" id="3" age="37"/>
</root>
```

##### Output
```js
{name: 'Bill', id: '1', age: '27'}
{name: 'Sally', id: '2', age: '29'}
{name: 'Kelly', id: '3', age: '37'}
```

### Both Attributes and Subtags
When a tag has both attributes and subtags, the attributes are grouped under `$attrs`, repeated subtags are collected into an array, and any loose text goes under `$text`:

##### Input
```xml
<root>
  <person name="Bill" id="1" age="27">
    <friend id="2"/>
  </person>
  <person name="Sally" id="2" age="29">
    <friend id="1"/>
    <friend id="3"/>
  </person>
  <person name="Kelly" id="3" age="37">
    <friend id="2"/>
    Kelly likes to ride ponies.
  </person>
</root>
```

##### Output
```js
{
  $attrs: {name: 'Bill', id: '1', age: '27'},
  friend: '2'
}
{
  $attrs: {name: 'Sally', id: '2', age: '29'},
  friend: ['1', '3']
}
{
  $attrs: {name: 'Kelly', id: '3', age: '37'},
  friend: '2',
  $text: 'Kelly likes to ride ponies.'
}
```
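
Because `friend` comes back as a string when there is one subtag and as an array when there are several, consuming code often has to handle both shapes. Here is a minimal sketch of one way to normalize them. `toArray` is a hypothetical helper, not part of xml-flow, and the sample objects are copied from the output above:

```js
// Hypothetical helper (not part of xml-flow): wrap a property that may be
// missing, a single value, or already an array into an array.
function toArray(value) {
  if (value === undefined || value === null) return [];
  return Array.isArray(value) ? value : [value];
}

// Objects shaped like the output above.
var bill = {$attrs: {name: 'Bill', id: '1', age: '27'}, friend: '2'};
var sally = {$attrs: {name: 'Sally', id: '2', age: '29'}, friend: ['1', '3']};

console.log(toArray(bill.friend));   // [ '2' ]
console.log(toArray(sally.friend));  // [ '1', '3' ]
```

Setting the `useArrays` option to `flow.ALWAYS` (see Options) is an alternative that should make the helper unnecessary.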

### Read as Markup
If you need to keep track of sub-tag order within a tag, or if a more markup-style object model makes sense for your data, subtags and text can be stored in order in a `$markup` array (see the `preserveMarkup` option). Here's how it works:

##### Input
```html
<div class="science">
  <h1>Title</h1>
  <p>Some introduction</p>
  <h2>Subtitle</h2>
  <p>Some more text</p>
  This text is not inside a p-tag.
</div>
```

##### Output
```js
{
  $attrs: {class: 'science'},
  $markup: [
    {$name: 'h1', $text: 'Title'},
    {$name: 'p', $text: 'Some introduction'},
    {$name: 'h2', $text: 'Subtitle'},
    {$name: 'p', $text: 'Some more text'},
    'This text is not inside a p-tag.'
  ]
}
```
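
One thing the `$markup` array makes possible is walking the content in document order. The sketch below flattens a node shaped like the output above into plain text. `markupToText` is a hypothetical helper, not part of xml-flow, and it assumes that nested entries follow the same `$text`/`$markup` shape:

```js
// Hypothetical helper (not part of xml-flow): join the text of a node in
// document order. Assumes each entry is either a string or an object that
// carries $text and/or a nested $markup array.
function markupToText(node) {
  if (typeof node === 'string') return node;
  var parts = [];
  if (node.$text !== undefined) parts.push(String(node.$text));
  (node.$markup || []).forEach(function (child) {
    parts.push(markupToText(child));
  });
  return parts.join('\n');
}

// Object copied from the output above.
var div = {
  $attrs: {class: 'science'},
  $markup: [
    {$name: 'h1', $text: 'Title'},
    {$name: 'p', $text: 'Some introduction'},
    {$name: 'h2', $text: 'Subtitle'},
    {$name: 'p', $text: 'Some more text'},
    'This text is not inside a p-tag.'
  ]
};

console.log(markupToText(div));
```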

## Options
You may pass an options object as a second argument, as in `flow(stream, options)`. Every option is optional:
* `strict` - Boolean. Default = false. Refer to the sax-js documentation for more info.
* `lowercase` - Boolean. Default = true. When not in strict mode, all tag names are lowercased, or uppercased when set to false.
* `trim` - Boolean. Default = true. Whether to trim leading and trailing whitespace from text.
* `normalize` - Boolean. Default = true. Turns any run of whitespace into a single space.
* `preserveMarkup` - One of `flow.ALWAYS`, `flow.SOMETIMES` (default), or `flow.NEVER`. When set to `ALWAYS`, all subtags and text are stored in the `$markup` property with their original order preserved. When set to `NEVER`, all subtags are collected as separate properties. When set to `SOMETIMES`, markup is preserved only when subtags have non-contiguous repetition.
* `simplifyObjects` - Boolean. Default = true. Whether to drop empty `$attrs`, pull properties out of `$attrs` when there are no subtags, and use a String instead of an object when `$text` is the only property.
* `useArrays` - One of `flow.ALWAYS`, `flow.SOMETIMES` (default), or `flow.NEVER`. When set to `ALWAYS`, all subtags and text are enclosed in arrays, even if only one is found. When set to `NEVER`, only the first instance of a subtag or text node is kept. When set to `SOMETIMES`, arrays are used only when multiple items are found. *NOTE:* When `useArrays` is set to `NEVER`, `preserveMarkup` is ignored.
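
As a rough illustration of passing options, the sketch below builds an options object that turns on strict mode and always wraps subtags in arrays. `arrayOptions` is a hypothetical helper, not part of xml-flow. It takes the function exported by `require('xml-flow')` as its argument so that it can read the `ALWAYS` and `NEVER` constants named above:

```js
// Hypothetical helper: build an options object for flow(stream, options).
// `flow` is assumed to be the function exported by require('xml-flow').
function arrayOptions(flow) {
  return {
    strict: true,               // match tag-name case as written in the document
    trim: true,                 // default, shown for clarity
    normalize: true,            // default, shown for clarity
    preserveMarkup: flow.NEVER, // collect subtags as separate properties
    useArrays: flow.ALWAYS      // wrap subtags in arrays even when only one is found
  };
}

// Usage, assuming fs and flow are required as in "Getting started":
//   var inFile = fs.createReadStream('./your-xml-file.xml');
//   var xmlStream = flow(inFile, arrayOptions(flow));
```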

## Events
All events can be listened to using the standard Node.js EventEmitter syntax.

`tag:<<TAG_NAME>>` - Fires each time a `<<TAG_NAME>>` tag has been parsed, with the parsed object passed to the listener. Note that this is case sensitive. If the `lowercase` option is set, make sure you listen to lowercase tag names. If the `strict` option is set, match the case of the tags in your document.

`end` - Fires when the end of the stream has been reached.

`error` - Fires when there are errors.

`query:<<QUERY>>` - Coming soon...
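
The three events that are available today can be combined to collect every matching tag and report completion or failure through a single callback. This is only a sketch: `collectTags` is a hypothetical helper, not part of xml-flow, and it assumes that the object returned by `flow()` behaves like a standard EventEmitter, as described above.

```js
// Hypothetical helper (not part of xml-flow): gather every parsed object for
// tagName, then call done(err, items) exactly once.
function collectTags(xmlStream, tagName, done) {
  var items = [];
  var finished = false;

  function finish(err) {
    if (finished) return;       // ignore anything after the first end/error
    finished = true;
    done(err, err ? undefined : items);
  }

  xmlStream.on('tag:' + tagName, function (item) { items.push(item); });
  xmlStream.on('end', function () { finish(null); });
  xmlStream.on('error', function (err) { finish(err); });
}

// Usage, assuming xmlStream was created as in "Getting started":
//   collectTags(xmlStream, 'person', function (err, people) {
//     if (err) throw err;
//     console.log(people.length + ' people parsed');
//   });
```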

## Utility Methods
* `toXml(node)` - Returns the XML encoding of an object as a string. Encodes `$name`, `$attrs`, `$text`, and `$markup` as you would expect. Pretty-print coming soon!

## Authors

 - [Matthew Larson](https://github.com/matthewmatician)

## License

MIT

[npm-image]: https://img.shields.io/npm/v/xml-flow.svg?style=flat
[npm-url]: https://npmjs.org/package/xml-flow
[travis-image]: https://img.shields.io/travis/matthewmatician/xml-flow.svg?style=flat
[travis-url]: https://travis-ci.org/matthewmatician/xml-flow
[coveralls-image]: https://img.shields.io/coveralls/matthewmatician/xml-flow.svg?style=flat
[coveralls-url]: https://coveralls.io/r/matthewmatician/xml-flow