# 2021 Update

Consider using [`@root/walk`](https://npmjs.org/package/@root/walk) instead.

I created `walk` quite literally a decade ago, in the Node v0.x days.
Back then using an EventEmitter seemed like the thing to do. Nowadays,
it seems a bit overkill for the simple task of walking over directories.

There's nothing wrong with `walk` - it's about the same as it was 10 years ago -
however, at only 50 lines of code, `@root/walk` is much simpler and much faster.

# node-walk

| a [Root](https://rootprojects.org) project

A Node.js walk implementation.

This is somewhat of a port of Python's `os.walk`, but using Node.js conventions.

- EventEmitter
- Asynchronous
- Chronological (optionally)
- Built-in flow-control
- Includes a synchronous version (same API as the asynchronous one)

As few file descriptors are opened at a time as possible.
This is particularly well suited for single hard disks which are not flash or solid state.

## Installation

```bash
npm install --save walk
```

# Getting Started

```javascript
'use strict';

var walk = require('walk');
var fs = require('fs');
var path = require('path');
var walker;
var options = {};

walker = walk.walk('/tmp', options);

walker.on('file', function (root, fileStats, next) {
  // fileStats.name is only the file's name, so join it with root before reading
  fs.readFile(path.join(root, fileStats.name), function () {
    // doStuff
    next();
  });
});

walker.on('errors', function (root, nodeStatsArray, next) {
  // each entry in nodeStatsArray carries its error in the `error` attribute
  next();
});

walker.on('end', function () {
  console.log('all done');
});
```

## Common Events

All single event callbacks are in the form of `function (root, stat, next) {}`.

All multiple event callbacks are in the form of `function (root, stats, next) {}`, except **names**, which receives an array of strings.

All **error** event callbacks are in the form `function (root, stat/stats, next) {}`.
**`stat.error`** contains the error.

- `names`
- `directory`
- `directories`
- `file`
- `files`
- `end`
- `nodeError` (`stat` failed)
- `directoryError` (`stat` succeeded, but `readdir` failed)
- `errors` (a collection of any errors encountered)

A typical `stat` object passed to an event looks like this:

```javascript
{ dev: 16777223,
  mode: 33188,
  nlink: 1,
  uid: 501,
  gid: 20,
  rdev: 0,
  blksize: 4096,
  ino: 49868100,
  size: 5617,
  blocks: 16,
  atime: Mon Jan 05 2015 18:18:10 GMT-0700 (MST),
  mtime: Thu Sep 25 2014 21:21:28 GMT-0600 (MDT),
  ctime: Thu Sep 25 2014 21:21:28 GMT-0600 (MDT),
  birthtime: Thu Sep 25 2014 21:21:28 GMT-0600 (MDT),
  name: 'README.md',
  type: 'file' }
```

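To make the extra `name` and `type` attributes concrete, here is a minimal sketch that builds a similar object using only Node's `fs` module. The `describeNode` helper is hypothetical and is not how `walk` itself is implemented; it also lumps devices, FIFOs, and sockets into a single `'other'` type for brevity.

```javascript
'use strict';

var fs = require('fs');
var os = require('os');
var path = require('path');

// Hypothetical helper (not walk's actual code): stat a node and attach the
// extra `name` and `type` attributes shown in the object above.
function describeNode(root, name) {
  // lstat does not follow symbolic links, so links can be reported as links
  var stats = fs.lstatSync(path.join(root, name));

  stats.name = name;
  if (stats.isFile()) {
    stats.type = 'file';
  } else if (stats.isDirectory()) {
    stats.type = 'directory';
  } else if (stats.isSymbolicLink()) {
    stats.type = 'symbolicLink';
  } else {
    stats.type = 'other'; // simplification: devices, FIFOs, sockets
  }
  return stats;
}

// Demo against a throwaway directory so the sketch is self-contained
var tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), 'walk-demo-'));
fs.writeFileSync(path.join(tmpDir, 'README.md'), 'hello');

var node = describeNode(tmpDir, 'README.md');
console.log(node.name, node.type, node.size);
```

Note that `name` is only the base name; the containing directory arrives separately as `root`.
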
# Advanced Example

Both Asynchronous and Synchronous versions are provided.

```javascript
'use strict';

var walk = require('walk');
var fs = require('fs');
var path = require('path');
var options;
var walker;

options = {
  followLinks: false,
  // directories with these names will be skipped
  filters: ['Temp', '_Temp'],
};

walker = walk.walk('/tmp', options);

// OR
// walker = walk.walkSync("/tmp", options);

walker.on('names', function (root, nodeNamesArray) {
  // sort in place so that nodes are visited in alphabetical order
  nodeNamesArray.sort(function (a, b) {
    if (a > b) return 1;
    if (a < b) return -1;
    return 0;
  });
});

walker.on('directories', function (root, dirStatsArray, next) {
  // dirStatsArray is an array of `stat` objects with the additional attributes
  // * type
  // * error
  // * name

  next();
});

walker.on('file', function (root, fileStats, next) {
  // fileStats.name is only the file's name, so join it with root before reading
  fs.readFile(path.join(root, fileStats.name), function () {
    // doStuff
    next();
  });
});

walker.on('errors', function (root, nodeStatsArray, next) {
  next();
});

walker.on('end', function () {
  console.log('all done');
});
```

### Sync

Note: You **can't use EventEmitter** if you want a truly synchronous walker
(although it's synchronous under the hood, it appears not to be due to the use of `process.nextTick()`).

Instead **you must use `options.listeners`** for a truly synchronous walker.

Although the sync version uses all of the `fs.readSync`, `fs.readdirSync`, and other sync methods,
I don't think I can prevent the `process.nextTick()` that `EventEmitter` calls.

```javascript
(function () {
  'use strict';

  var walk = require('walk');
  var fs = require('fs');
  var path = require('path');
  var options;
  var walker;

  // To be truly synchronous in the emitter and maintain a compatible api,
  // the listeners must be listed before the object is created
  options = {
    listeners: {
      names: function (root, nodeNamesArray) {
        nodeNamesArray.sort(function (a, b) {
          if (a > b) return 1;
          if (a < b) return -1;
          return 0;
        });
      },
      directories: function (root, dirStatsArray, next) {
        // dirStatsArray is an array of `stat` objects with the additional attributes
        // * type
        // * error
        // * name

        next();
      },
      file: function (root, fileStats, next) {
        // read synchronously; an async read would finish after 'all done' is logged
        fs.readFileSync(path.join(root, fileStats.name));
        // doStuff
        next();
      },
      errors: function (root, nodeStatsArray, next) {
        next();
      },
    },
  };

  walker = walk.walkSync('/tmp', options);

  console.log('all done');
})();
```

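Why `options.listeners` can stay synchronous where an emitter may not is easier to see in a stripped-down form. The following is only an illustrative sketch built on Node's `fs` module: `miniWalkSync` is a hypothetical stand-in that supports just a `file` listener, and it is not `walk`'s implementation. Because the listener is called directly as a plain function, everything has run by the time the call returns.

```javascript
'use strict';

var fs = require('fs');
var os = require('os');
var path = require('path');

// Hypothetical stand-in for a listener-driven synchronous walk.
// Not walk's code: it only knows about the `file` listener.
function miniWalkSync(root, options) {
  var listeners = (options && options.listeners) || {};
  var names = fs.readdirSync(root).sort();

  names.forEach(function (name) {
    var fullPath = path.join(root, name);
    var stats = fs.lstatSync(fullPath);
    stats.name = name;

    if (stats.isDirectory()) {
      miniWalkSync(fullPath, options);
    } else if (stats.isFile() && listeners.file) {
      // Called directly (no event queue), so it runs before we return
      listeners.file(root, stats, function () {});
    }
  });
}

// Build a small throwaway tree: a.txt and sub/b.txt
var demoRoot = fs.mkdtempSync(path.join(os.tmpdir(), 'walk-sync-'));
fs.mkdirSync(path.join(demoRoot, 'sub'));
fs.writeFileSync(path.join(demoRoot, 'a.txt'), '');
fs.writeFileSync(path.join(demoRoot, 'sub', 'b.txt'), '');

var seen = [];
miniWalkSync(demoRoot, {
  listeners: {
    file: function (root, fileStats, next) {
      seen.push(path.relative(demoRoot, path.join(root, fileStats.name)));
      next();
    },
  },
});

// `seen` is already fully populated here
console.log(seen);
```
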
# API

Emitted Values

- `on('XYZ', function(root, stats, next) {})`

- `root` - the directory containing the files to be inspected
- _stats[Array]_ - a single `stats` object, or an array of them, with some added attributes
  - type - 'file', 'directory', etc
  - error
  - name - the name of the file, dir, etc
- next - no more files will be read until this is called

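The `next` contract described above (nothing further is read until the handler calls it) can be illustrated without `walk` at all. This is a minimal sketch with a hypothetical `eachSerial` helper; it shows the general callback-driven flow-control pattern, not the library's internals.

```javascript
'use strict';

// Hypothetical helper (not part of walk): hand items to `handler` one at a
// time, and only move on once the handler calls `next()`.
function eachSerial(items, handler, done) {
  var i = 0;

  function next() {
    if (i >= items.length) {
      done();
      return;
    }
    var item = items[i];
    i += 1;
    handler(item, next);
  }

  next();
}

var visited = [];
var finished = false;

eachSerial(['a.txt', 'b.txt', 'c.txt'], function (name, next) {
  visited.push(name);
  next(); // the next item is not delivered until this is called
}, function () {
  finished = true;
  console.log('all done', visited);
});
```

A handler that never calls `next()` stalls the iteration, which is why every handler in the examples calls it, even on errors.
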
Single Events - fired immediately

- `end` - No files, dirs, etc left to inspect

- `directoryError` - Error when `fstat` succeeded, but reading the path failed (probably due to permissions).
- `nodeError` - Error when `fstat` did not succeed.
- `node` - a `stats` object for a node of any type
- `file` - includes links when `followLinks` is `true`
- `directory` - **NOTE** you could get a recursive loop if `followLinks` is `true` and a directory links to its parent
- `symbolicLink` - always empty when `followLinks` is `true`
- `blockDevice`
- `characterDevice`
- `FIFO`
- `socket`

Events with Array Arguments - fired after all files in the dir have been `stat`ed

- `names` - before any `stat` takes place. Useful for sorting and filtering.

  - Note: the array is an array of `string`s, not `stat` objects
  - Note: the `next` argument is a `noop`

- `errors` - errors encountered by `fs.stat` when reading nodes in a directory
- `nodes` - an array of `stats` of any type
- `files`
- `directories` - modification of this array - sorting, removing, etc - affects traversal
- `symbolicLinks`
- `blockDevices`
- `characterDevices`
- `FIFOs`
- `sockets`

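Since modifying the `directories` array affects traversal, a handler can skip subtrees by removing entries from it. The sketch below uses a hypothetical `pruneInPlace` helper and hand-written stand-ins for the `stat` objects. It assumes, as the note on `directories` suggests, that the same array is consulted after the handler returns, which is why entries are spliced out in place instead of being filtered into a new array.

```javascript
'use strict';

// Hypothetical helper: remove matching entries from the array itself.
// Assigning a new array to the parameter would not be seen by the caller.
function pruneInPlace(dirStatsArray, shouldSkip) {
  // iterate backwards so splicing does not shift unvisited indexes
  for (var i = dirStatsArray.length - 1; i >= 0; i -= 1) {
    if (shouldSkip(dirStatsArray[i])) {
      dirStatsArray.splice(i, 1);
    }
  }
}

// Stand-ins for the stat objects a `directories` handler would receive
var dirStatsArray = [
  { name: 'src', type: 'directory' },
  { name: 'node_modules', type: 'directory' },
  { name: '.git', type: 'directory' },
];

pruneInPlace(dirStatsArray, function (stat) {
  return stat.name === 'node_modules' || stat.name.charAt(0) === '.';
});

var remaining = dirStatsArray.map(function (stat) {
  return stat.name;
});
console.log(remaining);
```

Inside a real handler the same call would be followed by `next()`.
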
**Warning** beware of infinite loops when `followLinks` is true (using `walk-recurse` variant).

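One common way to guard against such loops, shown here only as a sketch and not as something `walk` is known to do, is to remember the resolved path of every directory entered and refuse to enter it twice. The `listDirsFollowingLinks` function is hypothetical, uses only Node's `fs` module, and assumes a platform where creating symbolic links is permitted.

```javascript
'use strict';

var fs = require('fs');
var os = require('os');
var path = require('path');

// Hypothetical link-following walk with a cycle guard (not walk's own code).
// Each directory's resolved path is recorded so it is entered at most once.
function listDirsFollowingLinks(root, visited, found) {
  var realRoot = fs.realpathSync(root);
  if (visited.has(realRoot)) {
    return found; // already entered via another path: stop here
  }
  visited.add(realRoot);
  found.push(realRoot);

  fs.readdirSync(root).forEach(function (name) {
    var fullPath = path.join(root, name);
    // fs.statSync follows symbolic links, unlike fs.lstatSync
    if (fs.statSync(fullPath).isDirectory()) {
      listDirsFollowingLinks(fullPath, visited, found);
    }
  });
  return found;
}

var loopRoot = fs.mkdtempSync(path.join(os.tmpdir(), 'walk-loop-'));
fs.mkdirSync(path.join(loopRoot, 'child'));
// child/up points back at its parent, which would recurse forever unguarded
fs.symlinkSync(loopRoot, path.join(loopRoot, 'child', 'up'), 'dir');

var dirs = listDirsFollowingLinks(loopRoot, new Set(), []);
console.log(dirs.length + ' directories visited');
```
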
# Comparisons

Tested on my `/System` containing 59,490 (+ self) directories (and lots of files).
The size of the text output was 6 MB.

`find`:

    time bash -c "find /System -type d | wc"
    59491 97935 6262916

    real 2m27.114s
    user 0m1.193s
    sys 0m14.859s

`find.js`:

Note that `find.js` omits the start directory

    time bash -c "node examples/find.js /System -type d | wc"
    59490 97934 6262908

    # Test 1
    real 2m52.273s
    user 0m20.374s
    sys 0m27.800s

    # Test 2
    real 2m23.725s
    user 0m18.019s
    sys 0m23.202s

    # Test 3
    real 2m50.077s
    user 0m17.661s
    sys 0m24.008s

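In conclusion, the asynchronous Node.js walk is slower than regular `find`: wall-clock time ranged from about the same to roughly 17% longer, and it used roughly 15 to 17 times as much user CPU time.
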
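The timings above are easier to compare once converted to seconds. The sketch below does that arithmetic on the figures reported in this section; the `toSeconds` and `mean` helpers are hypothetical names introduced only for this illustration.

```javascript
'use strict';

// Convert a `time` reading such as '2m27.114s' into seconds.
function toSeconds(reading) {
  var match = /^(\d+)m([\d.]+)s$/.exec(reading);
  if (!match) {
    throw new Error('unrecognized time format: ' + reading);
  }
  return parseInt(match[1], 10) * 60 + parseFloat(match[2]);
}

function mean(values) {
  var total = values.reduce(function (a, b) {
    return a + b;
  }, 0);
  return total / values.length;
}

// Figures copied from the `find` and `find.js` runs above
var findReal = toSeconds('2m27.114s');
var findUser = toSeconds('0m1.193s');
var walkReal = ['2m52.273s', '2m23.725s', '2m50.077s'].map(toSeconds);
var walkUser = ['0m20.374s', '0m18.019s', '0m17.661s'].map(toSeconds);

var realRatio = mean(walkReal) / findReal;
var userRatio = mean(walkUser) / findUser;

console.log('real: ' + realRatio.toFixed(2) + 'x, user: ' + userRatio.toFixed(1) + 'x');
```

Averaged over the three runs, the gap shows up mostly in CPU time and much less in elapsed time.
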
# LICENSE

`node-walk` is available under the following licenses:

- MIT
- Apache 2

Copyright 2011 - Present AJ ONeal