[![NPM Version][npm-image]][npm-url]
[![NPM Downloads][downloads-image]][downloads-url]
[![Build Status][travis-image]][travis-url]
[![Coverage][coverage-image]][coverage-url]

[npm-image]: https://img.shields.io/npm/v/unzipper.svg
[npm-url]: https://npmjs.org/package/unzipper
[travis-image]: https://api.travis-ci.org/ZJONSSON/node-unzipper.png?branch=master
[travis-url]: https://travis-ci.org/ZJONSSON/node-unzipper?branch=master
[downloads-image]: https://img.shields.io/npm/dm/unzipper.svg
[downloads-url]: https://npmjs.org/package/unzipper
[coverage-image]: https://3tjjj5abqi.execute-api.us-east-1.amazonaws.com/prod/node-unzipper/badge
[coverage-url]: https://3tjjj5abqi.execute-api.us-east-1.amazonaws.com/prod/node-unzipper/url

# unzipper

This is an active fork and drop-in replacement of [node-unzip](https://github.com/EvanOxfeld/node-unzip), and it addresses the following issues:
* finish/close events are not always triggered, particularly when the input stream is slower than the receiver
* file contents are buffered into memory before being passed on to the entry stream

The structure of this fork is similar to the original, but it uses Promises and the inherent guarantees provided by node streams to ensure a low memory footprint and to emit finish/close events at the end of processing. The new `Parser` will push any parsed `entries` downstream if you pipe from it, while still supporting the legacy `entry` event as well.

Breaking changes: The new `Parser` will not automatically drain entries if there are no listeners or pipes in place.

Unzipper provides simple APIs similar to [node-tar](https://github.com/isaacs/node-tar) for parsing and extracting zip files.
There are no added compiled dependencies - inflation is handled by node.js's built-in zlib support.

Please note: Methods that use the Central Directory instead of parsing the entire file can be found under [`Open`](#open).

Chrome extension files (.crx) are zipfiles with an [extra header](http://www.adambarth.com/experimental/crx/docs/crx.html) at the start of the file. Unzipper will parse .crx files with the streaming methods (`Parse` and `ParseOne`). The `Open` methods will check for `crx` headers and parse crx files, but only if you provide `crx: true` in the options.

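For example, here is a minimal sketch of listing the contents of a crx file with the `Open` interface (the file path is hypothetical):

```js
const unzipper = require('unzipper');

async function listCrxContents() {
  // `crx: true` tells the Open methods to check for the crx header
  // and shift all file offsets by its length
  const directory = await unzipper.Open.file('path/to/extension.crx', { crx: true });
  directory.files.forEach(file => console.log(file.path));
}
```
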
## Installation

```bash
$ npm install unzipper
```

## Quick Examples

### Extract to a directory
```js
const fs = require('fs');
const unzipper = require('unzipper');

fs.createReadStream('path/to/archive.zip')
  .pipe(unzipper.Extract({ path: 'output/path' }));
```

Extract emits the 'close' event once the zip's contents have been fully extracted to disk. `Extract` uses [fstream.Writer](https://www.npmjs.com/package/fstream) and therefore needs an absolute path to the destination directory. This directory will be automatically created if it doesn't already exist.

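A minimal sketch of waiting for extraction to finish by listening for that 'close' event (the output path is hypothetical):

```js
fs.createReadStream('path/to/archive.zip')
  .pipe(unzipper.Extract({ path: '/absolute/output/path' }))
  .on('close', () => console.log('extraction complete'));
```
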
### Parse zip file contents

Process each zip file entry or pipe entries to another stream.

__Important__: If you do not intend to consume an entry stream's raw data, call autodrain() to dispose of the entry's
contents. Otherwise the stream will halt. `.autodrain()` returns an empty stream that provides `error` and `finish` events.
Additionally you can call `.autodrain().promise()` to get a promisified version of the success or failure of the autodrain.

```js
// If you want to handle autodrain errors you can either:
entry.autodrain().promise().catch(e => handleError(e));
// or
entry.autodrain().on('error', handleError);
```

Here is a quick example:

```js
fs.createReadStream('path/to/archive.zip')
  .pipe(unzipper.Parse())
  .on('entry', function (entry) {
    const fileName = entry.path;
    const type = entry.type; // 'Directory' or 'File'
    const size = entry.vars.uncompressedSize; // There is also compressedSize
    if (fileName === "this IS the file I'm looking for") {
      entry.pipe(fs.createWriteStream('output/path'));
    } else {
      entry.autodrain();
    }
  });
```

And the same example using async iterators:

```js
// Note: for await...of must run inside an async function
// (or at the top level of an ES module)
const zip = fs.createReadStream('path/to/archive.zip').pipe(unzipper.Parse({ forceStream: true }));
for await (const entry of zip) {
  const fileName = entry.path;
  const type = entry.type; // 'Directory' or 'File'
  const size = entry.vars.uncompressedSize; // There is also compressedSize
  if (fileName === "this IS the file I'm looking for") {
    entry.pipe(fs.createWriteStream('output/path'));
  } else {
    entry.autodrain();
  }
}
```

### Parse zip by piping entries downstream

If you `pipe` from unzipper, the downstream components will receive each `entry` for further processing. This allows for clean pipelines transforming zipfiles into unzipped data.

Example using `stream.Transform`:

```js
const stream = require('stream');

fs.createReadStream('path/to/archive.zip')
  .pipe(unzipper.Parse())
  .pipe(new stream.Transform({
    objectMode: true,
    transform: function(entry, e, cb) {
      const fileName = entry.path;
      const type = entry.type; // 'Directory' or 'File'
      const size = entry.vars.uncompressedSize; // There is also compressedSize
      if (fileName === "this IS the file I'm looking for") {
        entry.pipe(fs.createWriteStream('output/path'))
          .on('finish', cb);
      } else {
        entry.autodrain();
        cb();
      }
    }
  }));
```

Example using [etl](https://www.npmjs.com/package/etl):

```js
fs.createReadStream('path/to/archive.zip')
  .pipe(unzipper.Parse())
  .pipe(etl.map(entry => {
    if (entry.path === "this IS the file I'm looking for")
      return entry
        .pipe(etl.toFile('output/path'))
        .promise();
    else
      entry.autodrain();
  }));
```

### Parse a single file and pipe contents

`unzipper.ParseOne([regex])` is a convenience method that unzips only one file from the archive and pipes its contents downstream (not the entry itself). If no search criteria are specified, the first file in the archive will be unzipped. Otherwise, each filename will be compared to the criteria and the first one to match will be unzipped and piped downstream. If no file matches, the stream will end without any content.

Example:

```js
fs.createReadStream('path/to/archive.zip')
  .pipe(unzipper.ParseOne())
  .pipe(fs.createWriteStream('firstFile.txt'));
```

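To target a specific file rather than the first one, you can pass a regular expression; a small sketch (the pattern is illustrative):

```js
fs.createReadStream('path/to/archive.zip')
  .pipe(unzipper.ParseOne(/readme\.md$/i))
  .pipe(fs.createWriteStream('readme.md'));
```
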
### Buffering the content of an entry into memory

While the recommended strategy for consuming unzipped contents is to use streams, it is sometimes convenient to get the full buffered contents of each file. Each `entry` provides a `.buffer` function that consumes the entry by buffering the contents into memory and returning a promise of the complete buffer.

```js
fs.createReadStream('path/to/archive.zip')
  .pipe(unzipper.Parse())
  .pipe(etl.map(async entry => {
    if (entry.path === "this IS the file I'm looking for") {
      const content = await entry.buffer();
      await fs.promises.writeFile('output/path', content);
    } else {
      entry.autodrain();
    }
  }));
```

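The same can be done without etl by handling the `entry` event directly; a minimal sketch:

```js
fs.createReadStream('path/to/archive.zip')
  .pipe(unzipper.Parse())
  .on('entry', async entry => {
    if (entry.path === "this IS the file I'm looking for") {
      const content = await entry.buffer();
      await fs.promises.writeFile('output/path', content);
    } else {
      entry.autodrain();
    }
  });
```
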
### Parse.promise() syntax sugar

The parser emits `finish` and `error` events like any other stream. The parser additionally provides a promise wrapper around those two events to allow easy folding into existing Promise-based structures.

Example:

```js
fs.createReadStream('path/to/archive.zip')
  .pipe(unzipper.Parse())
  .on('entry', entry => entry.autodrain())
  .promise()
  .then(() => console.log('done'), e => console.log('error', e));
```

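The same flow with async/await (inside an async function) might look like this sketch:

```js
await fs.createReadStream('path/to/archive.zip')
  .pipe(unzipper.Parse())
  .on('entry', entry => entry.autodrain())
  .promise();
console.log('done');
```
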
### Parse zip created by DOS ZIP or Windows ZIP Folders

Archives created by legacy tools usually have filenames encoded with an IBM PC (Windows OEM) character set.
You can decode filenames with your preferred character set:

```js
const il = require('iconv-lite');
fs.createReadStream('path/to/archive.zip')
  .pipe(unzipper.Parse())
  .on('entry', function (entry) {
    // if the legacy zip tool followed the ZIP spec, this flag will be set
    const isUnicode = entry.props.flags.isUnicode;
    // decode a "non-unicode" filename from the OEM Cyrillic character set
    const fileName = isUnicode ? entry.path : il.decode(entry.props.pathBuffer, 'cp866');
    const type = entry.type; // 'Directory' or 'File'
    const size = entry.vars.uncompressedSize; // There is also compressedSize
    if (fileName === "Текстовый файл.txt") {
      entry.pipe(fs.createWriteStream(fileName));
    } else {
      entry.autodrain();
    }
  });
```

## Open
The previous methods rely on the entire zipfile being received through a pipe. The Open methods take a different approach: they load the central directory first (located at the end of the zipfile) and provide the ability to pick and choose which files to extract, even extracting them in parallel. The Open methods return a promise of the contents of the directory, with the individual `files` listed in an array. Each file element has the following methods:
* `stream([password])` - returns a stream of the unzipped content which can be piped to any destination
* `buffer([password])` - returns a promise of the buffered content of the file

If the file is encrypted you will have to supply a password to decrypt it; otherwise you can leave it blank.
Unlike `adm-zip`, the Open methods will never read the entire zipfile into a buffer.

The last argument is an optional `options` object where you can specify `tailSize` (default 80 bytes), i.e. how many bytes should be read at the end of the zipfile to locate the endOfCentralDirectory. This location can vary depending on the zip64 extensible data sector size. Additionally you can supply the option `crx: true`, which will check for a crx header and parse the file accordingly by shifting all file offsets by the length of the crx header.

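As a sketch, combining these options with a password-protected entry (the password and file index are hypothetical):

```js
async function readProtectedFile() {
  // a larger tailSize can help when the end-of-central-directory record
  // sits further from the end of the file (e.g. large zip64 archives)
  const directory = await unzipper.Open.file('path/to/archive.zip', { tailSize: 1024 });
  const content = await directory.files[0].buffer('secret-password');
  console.log(content.toString());
}
```
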
### Open.file([path], [options])
Returns a Promise to the central directory information, with methods to extract individual files. The `start` and `end` options can be used to avoid reading the whole file.

Example:
```js
async function main() {
  const directory = await unzipper.Open.file('path/to/archive.zip');
  console.log('directory', directory);
  return new Promise((resolve, reject) => {
    directory.files[0]
      .stream()
      .pipe(fs.createWriteStream('firstFile'))
      .on('error', reject)
      .on('finish', resolve);
  });
}

main();
```

### Open.url([requestLibrary], [url | params], [options])
This function will return a Promise to the central directory information from a URL pointing to a zipfile. Range headers are used to avoid reading the whole file. Unzipper does not ship with a request library, so you will have to provide one as the first argument.

Live Example: (extracts a tiny xml file from the middle of a 500MB zipfile)

```js
const request = require('request');
const unzipper = require('unzipper');

async function main() {
  const directory = await unzipper.Open.url(request, 'http://www2.census.gov/geo/tiger/TIGER2015/ZCTA5/tl_2015_us_zcta510.zip');
  const file = directory.files.find(d => d.path === 'tl_2015_us_zcta510.shp.iso.xml');
  const content = await file.buffer();
  console.log(content.toString());
}

main();
```

This function takes a second parameter which can either be a string containing the `url` to request, or an `options` object to invoke the supplied `request` library with. This can be used when other request options are required, such as custom headers or authentication to a third-party service.

```js
const request = require('google-oauth-jwt').requestWithJWT();

const googleStorageOptions = {
  url: `https://www.googleapis.com/storage/v1/b/m-bucket-name/o/my-object-name`,
  qs: { alt: 'media' },
  jwt: {
    email: google.storage.credentials.client_email,
    key: google.storage.credentials.private_key,
    scopes: ['https://www.googleapis.com/auth/devstorage.read_only']
  }
};

async function getFile(req, res, next) {
  const directory = await unzipper.Open.url(request, googleStorageOptions);
  const file = directory.files.find((file) => file.path === 'my-filename');
  return file.stream().pipe(res);
}
```

### Open.s3([aws-sdk], [params], [options])
This function will return a Promise to the central directory information from a zipfile on S3. Range headers are used to avoid reading the whole file. Unzipper does not ship with the aws-sdk, so you have to provide an instantiated client as the first argument. The params object requires `Bucket` and `Key` to fetch the correct file.

Example:

```js
const unzipper = require('unzipper');
const AWS = require('aws-sdk');
const s3Client = new AWS.S3(config);

async function main() {
  const directory = await unzipper.Open.s3(s3Client, { Bucket: 'unzipper', Key: 'archive.zip' });
  return new Promise((resolve, reject) => {
    directory.files[0]
      .stream()
      .pipe(fs.createWriteStream('firstFile'))
      .on('error', reject)
      .on('finish', resolve);
  });
}

main();
```

### Open.buffer(buffer, [options])
If you already have the zip file in memory as a buffer, you can open its contents directly.

Example:

```js
// never use readFileSync - it is only used here to simplify the example
const buffer = fs.readFileSync('path/to/archive.zip');

async function main() {
  const directory = await unzipper.Open.buffer(buffer);
  console.log('directory', directory);
  // ...
}

main();
```

### Open.[method].extract()

The directory object returned from `Open.[method]` provides an `extract` method which extracts all the files to a specified `path`, with an optional `concurrency` (default: 1).

Example (with concurrency of 5):

```js
unzipper.Open.file('path/to/archive.zip')
  .then(d => d.extract({ path: '/extraction/path', concurrency: 5 }));
```

## Licenses
See LICENCE