[![NPM Version][npm-image]][npm-url]
[![NPM Downloads][downloads-image]][downloads-url]
[![Build Status][travis-image]][travis-url]
[![Coverage][coverage-image]][coverage-url]

[npm-image]: https://img.shields.io/npm/v/unzipper.svg
[npm-url]: https://npmjs.org/package/unzipper
[travis-image]: https://api.travis-ci.org/ZJONSSON/node-unzipper.png?branch=master
[travis-url]: https://travis-ci.org/ZJONSSON/node-unzipper?branch=master
[downloads-image]: https://img.shields.io/npm/dm/unzipper.svg
[downloads-url]: https://npmjs.org/package/unzipper
[coverage-image]: https://3tjjj5abqi.execute-api.us-east-1.amazonaws.com/prod/node-unzipper/badge
[coverage-url]: https://3tjjj5abqi.execute-api.us-east-1.amazonaws.com/prod/node-unzipper/url

# unzipper

This is an active fork and drop-in replacement of [node-unzip](https://github.com/EvanOxfeld/node-unzip) and addresses the following issues:
* finish/close events are not always triggered, particularly when the input stream is slower than the receivers
* Files are buffered into memory before being passed on to the entry

The structure of this fork is similar to the original, but it uses Promises and the inherent guarantees provided by node streams to ensure a low memory footprint, and it emits finish/close events at the end of processing. The new `Parser` will push any parsed `entries` downstream if you pipe from it, while still supporting the legacy `entry` event as well.

Breaking changes: The new `Parser` will not automatically drain entries if there are no listeners or pipes in place.

Unzipper provides simple APIs similar to [node-tar](https://github.com/isaacs/node-tar) for parsing and extracting zip files.
There are no added compiled dependencies - inflation is handled by node.js's built-in zlib support.

Please note: Methods that use the Central Directory instead of parsing the entire file can be found under [`Open`](#open)

Chrome extension files (.crx) are zipfiles with an [extra header](http://www.adambarth.com/experimental/crx/docs/crx.html) at the start of the file. Unzipper will parse .crx files with the streaming methods (`Parse` and `ParseOne`). The `Open` methods will check for `crx` headers and parse crx files, but only if you provide `crx: true` in the options.

## Installation

```bash
$ npm install unzipper
```

## Quick Examples

### Extract to a directory
```js
fs.createReadStream('path/to/archive.zip')
  .pipe(unzipper.Extract({ path: 'output/path' }));
```

Extract emits the 'close' event once the zip's contents have been fully extracted to disk. `Extract` uses [fstream.Writer](https://www.npmjs.com/package/fstream) and therefore needs an absolute path to the destination directory. This directory will be automatically created if it doesn't already exist.

### Parse zip file contents

Process each zip file entry or pipe entries to another stream.

__Important__: If you do not intend to consume an entry stream's raw data, call autodrain() to dispose of the entry's
contents. Otherwise the stream will halt. `.autodrain()` returns an empty stream that provides `error` and `finish` events.
Additionally you can call `.autodrain().promise()` to get the promisified version of success or failure of the autodrain.

```js
// If you want to handle autodrain errors you can either:
entry.autodrain().promise().catch(e => handleError(e));
// or
entry.autodrain().on('error', handleError);
```

Here is a quick example:

```js
fs.createReadStream('path/to/archive.zip')
  .pipe(unzipper.Parse())
  .on('entry', function (entry) {
    const fileName = entry.path;
    const type = entry.type; // 'Directory' or 'File'
    const size = entry.vars.uncompressedSize; // There is also compressedSize;
    if (fileName === "this IS the file I'm looking for") {
      entry.pipe(fs.createWriteStream('output/path'));
    } else {
      entry.autodrain();
    }
  });
```

### Parse zip by piping entries downstream

If you `pipe` from unzipper, the downstream components will receive each `entry` for further processing. This allows for clean pipelines transforming zipfiles into unzipped data.

Example using `stream.Transform`:

```js
fs.createReadStream('path/to/archive.zip')
  .pipe(unzipper.Parse())
  .pipe(stream.Transform({
    objectMode: true,
    transform: function(entry, e, cb) {
      const fileName = entry.path;
      const type = entry.type; // 'Directory' or 'File'
      const size = entry.vars.uncompressedSize; // There is also compressedSize;
      if (fileName === "this IS the file I'm looking for") {
        entry.pipe(fs.createWriteStream('output/path'))
          .on('finish', cb);
      } else {
        entry.autodrain();
        cb();
      }
    }
  }));
```

Example using [etl](https://www.npmjs.com/package/etl):

```js
fs.createReadStream('path/to/archive.zip')
  .pipe(unzipper.Parse())
  .pipe(etl.map(entry => {
    if (entry.path == "this IS the file I'm looking for")
      return entry
        .pipe(etl.toFile('output/path'))
        .promise();
    else
      entry.autodrain();
  }));
```

### Parse a single file and pipe contents

`unzipper.ParseOne([regex])` is a convenience method that unzips only one file from the archive and pipes the contents down (not the entry itself). If no search criteria is specified, the first file in the archive will be unzipped. Otherwise, each filename will be compared to the criteria and the first one to match will be unzipped and piped down. If no file matches then the stream will end without any content.

Example:

```js
fs.createReadStream('path/to/archive.zip')
  .pipe(unzipper.ParseOne())
  .pipe(fs.createWriteStream('firstFile.txt'));
```

### Buffering the content of an entry into memory

While the recommended strategy of consuming the unzipped contents is using streams, it is sometimes convenient to be able to get the full buffered contents of each file. Each `entry` provides a `.buffer` function that consumes the entry by buffering the contents into memory and returning a promise to the complete buffer.

```js
fs.createReadStream('path/to/archive.zip')
  .pipe(unzipper.Parse())
  .pipe(etl.map(async entry => {
    if (entry.path == "this IS the file I'm looking for") {
      const content = await entry.buffer();
      await fs.promises.writeFile('output/path', content);
    } else {
      entry.autodrain();
    }
  }));
```

### Parse.promise() syntax sugar

The parser emits `finish` and `error` events like any other stream. The parser additionally provides a promise wrapper around those two events to allow easy folding into existing Promise-based structures.

Example:

```js
fs.createReadStream('path/to/archive.zip')
  .pipe(unzipper.Parse())
  .on('entry', entry => entry.autodrain())
  .promise()
  .then(() => console.log('done'), e => console.log('error', e));
```

### Parse zip created by DOS ZIP or Windows ZIP Folders

Archives created by legacy tools usually have filenames encoded with the IBM PC (Windows OEM) character set.
You can decode filenames with your preferred character set:

```js
const il = require('iconv-lite');
fs.createReadStream('path/to/archive.zip')
  .pipe(unzipper.Parse())
  .on('entry', function (entry) {
    // if the legacy zip tool followed the ZIP spec then this flag will be set
    const isUnicode = entry.props.flags.isUnicode;
    // decode "non-unicode" filename from the OEM Cyrillic character set
    const fileName = isUnicode ? entry.path : il.decode(entry.props.pathBuffer, 'cp866');
    const type = entry.type; // 'Directory' or 'File'
    const size = entry.vars.uncompressedSize; // There is also compressedSize;
    if (fileName === "Текстовый файл.txt") {
      entry.pipe(fs.createWriteStream(fileName));
    } else {
      entry.autodrain();
    }
  });
```

## Open
The previous methods rely on the entire zipfile being received through a pipe. The Open methods take a different approach: they load the central directory first (located at the end of the zipfile) and provide the ability to pick and choose which files to extract, even extracting them in parallel. The open methods return a promise on the contents of the directory, with individual `files` listed in an array. Each file element has the following methods:
* `stream([password])` - returns a stream of the unzipped content which can be piped to any destination
* `buffer([password])` - returns a promise on the buffered content of the file

If the file is encrypted you will have to supply a password to decrypt, otherwise you can leave it blank.
Unlike `adm-zip` the Open methods will never read the entire zipfile into a buffer.

The last argument is an optional `options` object where you can specify `tailSize` (default 80 bytes), i.e. how many bytes should be read at the end of the zipfile to locate the endOfCentralDirectory. This location can vary depending on the zip64 extensible data sector size. Additionally you can supply the option `crx: true`, which will check for a crx header and parse the file accordingly by shifting all file offsets by the length of the crx header.

### Open.file([path], [options])
Returns a Promise to the central directory information with methods to extract individual files. `start` and `end` options are used to avoid reading the whole file.

Example:
```js
async function main() {
  const directory = await unzipper.Open.file('path/to/archive.zip');
  console.log('directory', directory);
  return new Promise((resolve, reject) => {
    directory.files[0]
      .stream()
      .pipe(fs.createWriteStream('firstFile'))
      .on('error', reject)
      .on('finish', resolve);
  });
}

main();
```

### Open.url([requestLibrary], [url | params], [options])
This function will return a Promise to the central directory information from a URL pointing to a zipfile. Range-headers are used to avoid reading the whole file. Unzipper does not ship with a request library so you will have to provide it as the first argument.

Live Example: (extracts a tiny xml file from the middle of a 500MB zipfile)

```js
const request = require('request');
const unzipper = require('unzipper');

async function main() {
  const directory = await unzipper.Open.url(request, 'http://www2.census.gov/geo/tiger/TIGER2015/ZCTA5/tl_2015_us_zcta510.zip');
  const file = directory.files.find(d => d.path === 'tl_2015_us_zcta510.shp.iso.xml');
  const content = await file.buffer();
  console.log(content.toString());
}

main();
```

This function takes a second parameter which can either be a string containing the `url` to request, or an `options` object to invoke the supplied `request` library with. This can be used when other request options are required, such as custom headers or authentication to a third party service.

```js
const request = require('google-oauth-jwt').requestWithJWT();

const googleStorageOptions = {
  url: `https://www.googleapis.com/storage/v1/b/my-bucket-name/o/my-object-name`,
  qs: { alt: 'media' },
  jwt: {
    email: google.storage.credentials.client_email,
    key: google.storage.credentials.private_key,
    scopes: ['https://www.googleapis.com/auth/devstorage.read_only']
  }
};

async function getFile(req, res, next) {
  const directory = await unzipper.Open.url(request, googleStorageOptions);
  const file = directory.files.find((file) => file.path === 'my-filename');
  return file.stream().pipe(res);
}
```

### Open.s3([aws-sdk], [params], [options])
This function will return a Promise to the central directory information from a zipfile on S3. Range-headers are used to avoid reading the whole file. Unzipper does not ship with the aws-sdk so you have to provide an instantiated client as the first argument. The params object requires `Bucket` and `Key` to fetch the correct file.

Example:

```js
const unzipper = require('unzipper');
const AWS = require('aws-sdk');
const s3Client = new AWS.S3(config);

async function main() {
  const directory = await unzipper.Open.s3(s3Client, { Bucket: 'unzipper', Key: 'archive.zip' });
  return new Promise((resolve, reject) => {
    directory.files[0]
      .stream()
      .pipe(fs.createWriteStream('firstFile'))
      .on('error', reject)
      .on('finish', resolve);
  });
}

main();
```

### Open.buffer(buffer, [options])
If you already have the zip file in-memory as a buffer, you can open the contents directly.

Example:

```js
// never use readFileSync - only used here to simplify the example
const buffer = fs.readFileSync('path/to/archive.zip');

async function main() {
  const directory = await unzipper.Open.buffer(buffer);
  console.log('directory', directory);
  // ...
}

main();
```

### Open.[method].extract()

The directory object returned from `Open.[method]` provides an `extract` method which extracts all the files to a specified `path`, with an optional `concurrency` (default: 1).

Example (with concurrency of 5):

```js
unzipper.Open.file('path/to/archive.zip')
  .then(d => d.extract({ path: '/extraction/path', concurrency: 5 }));
```

## Licenses
See LICENCE