# urllib

[![NPM version][npm-image]][npm-url]
[![build status][travis-image]][travis-url]
[![Build Status](https://dev.azure.com/eggjs/egg/_apis/build/status/node-modules.urllib)](https://dev.azure.com/eggjs/egg/_build/latest?definitionId=7)
[![Test coverage][codecov-image]][codecov-url]
[![David deps][david-image]][david-url]
[![Known Vulnerabilities][snyk-image]][snyk-url]
[![npm download][download-image]][download-url]

[npm-image]: https://img.shields.io/npm/v/urllib.svg?style=flat-square
[npm-url]: https://npmjs.org/package/urllib
[travis-image]: https://img.shields.io/travis/node-modules/urllib.svg?style=flat-square
[travis-url]: https://travis-ci.org/node-modules/urllib
[codecov-image]: https://codecov.io/gh/node-modules/urllib/branch/master/graph/badge.svg
[codecov-url]: https://codecov.io/gh/node-modules/urllib
[david-image]: https://img.shields.io/david/node-modules/urllib.svg?style=flat-square
[david-url]: https://david-dm.org/node-modules/urllib
[snyk-image]: https://snyk.io/test/npm/urllib/badge.svg?style=flat-square
[snyk-url]: https://snyk.io/test/npm/urllib
[download-image]: https://img.shields.io/npm/dm/urllib.svg?style=flat-square
[download-url]: https://npmjs.org/package/urllib

Request HTTP URLs in a complex world: basic
and digest authentication, redirections, cookies, timeout and more.

## Install

```bash
$ npm install urllib --save
```

## Usage

### callback

```js
var urllib = require('urllib');

urllib.request('http://cnodejs.org/', function (err, data, res) {
  if (err) {
    throw err; // you need to handle error
  }
  console.log(res.statusCode);
  console.log(res.headers);
  // data is Buffer instance
  console.log(data.toString());
});
```

### Promise

If you've installed [bluebird][bluebird],
[bluebird][bluebird] will be used.
`urllib` does not install [bluebird][bluebird] for you.

Otherwise, if you're using a Node.js version that has native V8 Promises (v0.11.13+),
then those will be used.

Otherwise, this library will crash the process and exit,
so you might as well install [bluebird][bluebird] as a dependency!

```js
var urllib = require('urllib');

urllib.request('http://nodejs.org').then(function (result) {
  // result: {data: buffer, res: response object}
  console.log('status: %s, body size: %d, headers: %j', result.res.statusCode, result.data.length, result.res.headers);
}).catch(function (err) {
  console.error(err);
});
```

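The lookup order described above can be sketched roughly as follows. This only illustrates the documented behaviour and is not urllib's actual source; `pickPromise` is a hypothetical name introduced for the example.

```js
// Hypothetical helper, not part of urllib: prefer bluebird when it is
// installed, fall back to the native Promise, and fail loudly otherwise.
function pickPromise() {
  try {
    // Throws MODULE_NOT_FOUND when bluebird is not installed.
    return require('bluebird');
  } catch (err) {
    if (typeof Promise === 'function') {
      return Promise;
    }
    throw new Error('No Promise implementation found, please install bluebird');
  }
}

var PromiseImpl = pickPromise();
PromiseImpl.resolve('ready').then(function (value) {
  console.log(value);
});
```
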
### co & generator

If you are using [co](https://github.com/visionmedia/co) or [koa](https://github.com/koajs/koa):

```js
var co = require('co');
var urllib = require('urllib');

co(function* () {
  var result = yield urllib.requestThunk('http://nodejs.org');
  console.log('status: %s, body size: %d, headers: %j',
    result.status, result.data.length, result.headers);
}).catch(function (err) {
  // co v4+ returns a promise, so errors are handled here
  console.error(err);
});
```

## Global `response` event

You should create a urllib instance first.

```js
var httpclient = require('urllib').create();

httpclient.on('response', function (info) {
  // info has the following shape:
  // {
  //   error: err,
  //   ctx: args.ctx,
  //   req: {
  //     url: url,
  //     options: options,
  //     size: requestSize,
  //   },
  //   res: res
  // }
  console.log('%s -> %s', info.req.url, info.res.status);
});

httpclient.request('http://nodejs.org', function (err, body) {
  console.log('body size: %d', body.length);
});
```

## API Doc

### Method: `http.request(url[, options][, callback])`

#### Arguments

- **url** String | Object - The URL to request, either a String or an Object returned by [url.parse](http://nodejs.org/api/url.html#url_url_parse_urlstr_parsequerystring_slashesdenotehost).
- ***options*** Object - Optional
  - ***method*** String - Request method, defaults to `GET`. Could be `GET`, `POST`, `DELETE` or `PUT`. Alias `type`.
  - ***data*** Object - Data to be sent. Will be stringified automatically.
  - ***dataAsQueryString*** Boolean - Force converting `data` to a query string.
  - ***content*** String | [Buffer](http://nodejs.org/api/buffer.html) - Manually set the content of the payload. If set, `data` will be ignored.
  - ***stream*** [stream.Readable](http://nodejs.org/api/stream.html#stream_class_stream_readable) - Stream to be piped to the remote. If set, `data` and `content` will be ignored.
  - ***writeStream*** [stream.Writable](http://nodejs.org/api/stream.html#stream_class_stream_writable) - A writable stream that the response stream is piped into. Response data will be written to this stream, and `callback` will be called with `data` set to `null` after writing finishes.
  - ***consumeWriteStream*** [true] - Consume the writeStream and invoke the callback after the writeStream closes.
  - ***contentType*** String - Type of request data. Could be `json`. If it's `json`, the `Content-Type: application/json` header is set automatically.
  - ***nestedQuerystring*** Boolean - By default urllib uses querystring to stringify form data, which doesn't support nested objects. Set this option to `true` to use [qs](https://github.com/ljharb/qs) instead of querystring and support nested objects.
  - ***dataType*** String - Type of response data. Could be `text` or `json`. If it's `text`, the `data` passed to `callback` will be a String. If it's `json`, the `data` passed to `callback` will be a parsed JSON Object and the `Accept: application/json` header is set automatically. By default, the `data` passed to `callback` is a `Buffer`.
  - ***fixJSONCtlChars*** Boolean - Fix the control characters (U+0000 through U+001F) before JSON-parsing the response. Default is `false`.
  - ***headers*** Object - Request headers.
  - ***timeout*** Number | Array - Request timeout in milliseconds for the connecting phase and the response receiving phase. Defaults to `exports.TIMEOUT`; both are 5s. You can use `timeout: 5000` to make urllib use the same timeout for both phases, or set them separately such as `timeout: [3000, 5000]`, which sets the connecting timeout to 3s and the response timeout to 5s.
  - ***auth*** String - `username:password` used in HTTP Basic Authorization.
  - ***digestAuth*** String - `username:password` used in HTTP [Digest Authorization](http://en.wikipedia.org/wiki/Digest_access_authentication).
  - ***agent*** [http.Agent](http://nodejs.org/api/http.html#http_class_http_agent) - HTTP Agent object.
    Set `false` if you do not want to use an agent.
  - ***httpsAgent*** [https.Agent](http://nodejs.org/api/https.html#https_class_https_agent) - HTTPS Agent object.
    Set `false` if you do not want to use an agent.
  - ***ca*** String | Buffer | Array - An array of strings or Buffers of trusted certificates.
    If this is omitted several well known "root" CAs will be used, like VeriSign.
    These are used to authorize connections.
    **Notes**: This is necessary only if the server uses a self-signed certificate.
  - ***rejectUnauthorized*** Boolean - If true, the server certificate is verified against the list of supplied CAs.
    An 'error' event is emitted if verification fails. Default: true.
  - ***pfx*** String | Buffer - A string or Buffer containing the private key,
    certificate and CA certs of the server in PFX or PKCS12 format.
  - ***key*** String | Buffer - A string or Buffer containing the private key of the client in PEM format.
    **Notes**: This is necessary only when using client certificate authentication.
  - ***cert*** String | Buffer - A string or Buffer containing the certificate key of the client in PEM format.
    **Notes**: This is necessary only when using client certificate authentication.
  - ***passphrase*** String - A passphrase string for the private key or pfx.
  - ***ciphers*** String - A string describing the ciphers to use or exclude.
  - ***secureProtocol*** String - The SSL method to use, e.g. SSLv3_method to force SSL version 3.
  - ***followRedirect*** Boolean - Follow HTTP 3xx responses as redirects. Defaults to false.
  - ***maxRedirects*** Number - The maximum number of redirects to follow, defaults to 10.
  - ***formatRedirectUrl*** Function - Format the redirect URL yourself. Default is `url.resolve(from, to)`.
  - ***beforeRequest*** Function - Before-request hook; you can change anything here.
  - ***streaming*** Boolean - Lets you get the `res` object as soon as the request is connected, default `false`. Alias `customResponse`.
  - ***gzip*** Boolean - Accept gzip response content and decode it automatically, default is `false`.
  - ***timing*** Boolean - Enable timing or not, default is `false`.
  - ***enableProxy*** Boolean - Enable proxy request, default is `false`.
  - ***proxy*** String | Object - Proxy agent URI or options, default is `null`.
  - ***lookup*** Function - Custom DNS lookup function, default is `dns.lookup`. Requires node >= 4.0.0 (for the http protocol) and node >= 8 (for the https protocol).
  - ***checkAddress*** Function - Optional. Checks the request address to protect against SSRF and similar attacks. It receives two arguments (`ip` and `family`) and should return true or false to indicate whether the address is legal. It relies on `lookup` and has the same version requirements.
  - ***trace*** Boolean - Capture a stack that includes the call site of the library entrance, default is `false`.
- ***callback(err, data, res)*** Function - Optional callback.
  - **err** Error - Would be `null` if no error occurred.
  - **data** Buffer | String | Object - The response data. A Buffer by default, a String if `dataType` is set to `text`, or a parsed JSON Object if it's set to `json`.
  - **res** [http.IncomingMessage](http://nodejs.org/api/http.html#http_http_incomingmessage) - The response.

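As an illustration of the `checkAddress` option, here is a minimal sketch of a function with the documented `(ip, family)` signature that rejects loopback, private and link-local addresses. The ranges chosen and the function body are assumptions made for this example, not a policy shipped with urllib, and IPv4-mapped IPv6 addresses are not handled.

```js
var net = require('net');

// Hypothetical example policy, not part of urllib: returns true when the
// resolved address may be requested, false when it looks local or private.
function checkAddress(ip, family) {
  if (family === 6 || net.isIPv6(ip)) {
    var lower = ip.toLowerCase();
    // Reject loopback (::1), unspecified (::), and addresses that start
    // like unique-local (fc00::/7) or link-local (fe80::/10) ranges.
    return !(lower === '::1' || lower === '::' ||
      /^f[cd]/.test(lower) || /^fe[89ab]/.test(lower));
  }
  if (!net.isIPv4(ip)) {
    return false; // not an IP address at all
  }
  var parts = ip.split('.').map(Number);
  var a = parts[0];
  var b = parts[1];
  if (a === 0 || a === 10 || a === 127) return false; // 0/8, 10/8, 127/8
  if (a === 169 && b === 254) return false; // 169.254/16 link-local
  if (a === 172 && b >= 16 && b <= 31) return false; // 172.16/12
  if (a === 192 && b === 168) return false; // 192.168/16
  return true;
}

console.log(checkAddress('127.0.0.1', 4));
console.log(checkAddress('93.184.216.34', 4));
```

Such a function would be passed as `options.checkAddress` when calling `urllib.request`.
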
#### Returns

[http.ClientRequest](http://nodejs.org/api/http.html#http_class_http_clientrequest) - The request.

Calling the `.abort()` method of the returned request cancels the request.

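The two accepted forms of the `timeout` option listed under Arguments can be read as in the sketch below. `normalizeTimeout` and `DEFAULT_TIMEOUT` are hypothetical names used only for illustration; this is not urllib's internal code.

```js
// Hypothetical helper, not urllib's internal code.
var DEFAULT_TIMEOUT = 5000; // the documented 5s default for both phases

// Turn `timeout: 5000` or `timeout: [3000, 5000]` into explicit
// connecting-phase and response-phase timeouts, in milliseconds.
function normalizeTimeout(timeout) {
  if (Array.isArray(timeout)) {
    return { connect: timeout[0], response: timeout[1] };
  }
  if (typeof timeout === 'number') {
    return { connect: timeout, response: timeout };
  }
  return { connect: DEFAULT_TIMEOUT, response: DEFAULT_TIMEOUT };
}

console.log(normalizeTimeout(5000));
console.log(normalizeTimeout([3000, 5000]));
```
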
#### Options: `options.data`

When making a request:

```js
urllib.request('http://example.com', {
  method: 'GET',
  data: {
    'a': 'hello',
    'b': 'world'
  }
});
```

For a `GET` request, `data` will be stringified into a query string, e.g. `http://example.com/?a=hello&b=world`.

For other requests such as `POST`, `PATCH` or `PUT`,
`data` will by default be stringified into `application/x-www-form-urlencoded` format
if the `Content-Type` header is not set.

If `Content-Type` is `application/json`, `data` will be serialized with `JSON.stringify`.

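The rules above can be approximated with Node's built-in `querystring` module. This is a sketch under assumptions: `describeData` is a hypothetical helper, and treating every other content type like the form-encoded case is a choice made here for simplicity, not something stated by urllib.

```js
var querystring = require('querystring');

// Hypothetical helper, not part of urllib: shows where `data` ends up
// for a given method and Content-Type.
function describeData(method, data, contentType) {
  if (method.toUpperCase() === 'GET') {
    // GET: data becomes the query string
    return { query: querystring.stringify(data), body: null };
  }
  if (contentType && contentType.indexOf('application/json') === 0) {
    // JSON content type: data is serialized with JSON.stringify
    return { query: null, body: JSON.stringify(data) };
  }
  // Assumption: anything else is form-encoded, like the default case
  return { query: null, body: querystring.stringify(data) };
}

var data = { a: 'hello', b: 'world' };
console.log(describeData('GET', data).query);
console.log(describeData('POST', data).body);
console.log(describeData('POST', data, 'application/json').body);
```
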
#### Options: `options.content`

`options.content` is useful when you wish to construct the request body yourself,
for example when making a `Content-Type: application/json` request.

Note that if you want to send a JSON body, you should stringify it yourself:

```js
urllib.request('http://example.com', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json'
  },
  content: JSON.stringify({
    a: 'hello',
    b: 'world'
  })
});
```

It would make an HTTP request like:

```http
POST / HTTP/1.1
Host: example.com
Content-Type: application/json

{
  "a": "hello",
  "b": "world"
}
```

This example can also use `options.data` with the `application/json` content type:

```js
urllib.request('http://example.com', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json'
  },
  data: {
    a: 'hello',
    b: 'world'
  }
});
```

#### Options: `options.stream`

Uploads a file with [formstream](https://github.com/node-modules/formstream):

```js
var urllib = require('urllib');
var formstream = require('formstream');

var form = formstream();
form.file('file', __filename);
form.field('hello', '你好urllib');

var req = urllib.request('http://my.server.com/upload', {
  method: 'POST',
  headers: form.headers(),
  stream: form
}, function (err, data, res) {
  // upload finished
});
```

### Response Object

The response is a plain object. It contains:

* `status` or `statusCode`: response status code.
  * `-1` means a network error such as `ENOTFOUND`
  * `-2` means ConnectionTimeoutError
* `headers`: response HTTP headers, default is `{}`
* `size`: response size
* `aborted`: whether the response was aborted
* `rt`: total request and response time in ms.
* `timing`: timing object, if timing is enabled.
* `remoteAddress`: HTTP server IP address
* `remotePort`: HTTP server port
* `socketHandledRequests`: number of requests the socket has already handled
* `socketHandledResponses`: number of responses the socket has already handled

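Because `status` can carry the special values `-1` and `-2`, code that inspects the response may want to handle them before treating the value as an HTTP status. A minimal sketch, where `describeStatus` is a hypothetical helper and the returned strings are arbitrary:

```js
// Hypothetical helper, not part of urllib: classify a response object
// using the `status` and `aborted` fields listed above.
function describeStatus(res) {
  if (res.status === -1) {
    return 'network error';
  }
  if (res.status === -2) {
    return 'connection timeout';
  }
  if (res.aborted) {
    return 'aborted after HTTP ' + res.status;
  }
  return 'HTTP ' + res.status;
}

console.log(describeStatus({ status: 200, aborted: false }));
console.log(describeStatus({ status: -2, aborted: false }));
```
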
#### Response: `res.aborted`

If the underlying connection was terminated before `response.end()` was called,
`res.aborted` should be `true`.

```js
var http = require('http');
var urllib = require('urllib');

var server = http.createServer(function (req, res) {
  req.resume();
  req.on('end', function () {
    res.write('foo haha\n');
    setTimeout(function () {
      res.write('foo haha 2');
      setTimeout(function () {
        // close the socket without calling res.end()
        res.socket.end();
      }, 300);
    }, 200);
  });
}).listen(1984);

urllib.request('http://127.0.0.1:1984/socket.end', function (err, data, res) {
  console.log(data.toString()); // 'foo haha\nfoo haha 2'
  console.log(res.aborted); // true
  server.close();
});
```

### HttpClient2

HttpClient2 is a newer client intended for future use. Its `request` method only returns a promise, which is compatible with `async/await` and with generators in co.

#### Options

The options extend those of urllib, with the following additions:

- ***retry*** Number - The retry count. When a request gets an error, it is requested again until the retry count is reached.
- ***retryDelay*** Number - The delay (in ms) to wait between retries.
- ***isRetry*** Function - Determines whether to retry; it receives the response object as its first argument. By default it retries when status >= 500. Request errors are not included.

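To make the interplay of `retry`, `retryDelay` and `isRetry` concrete, here is a minimal sketch of a response-based retry loop. It is not HttpClient2's implementation: `requestWithRetry`, `doRequest` and `retryOnServerErrorOr429` are hypothetical names, `doRequest` stands in for any function returning a promise of a response object, and retrying on thrown request errors is left out.

```js
// Hypothetical custom `isRetry`: also retry on 429, in addition to the
// documented default of status >= 500.
function retryOnServerErrorOr429(res) {
  return res.status >= 500 || res.status === 429;
}

// Hypothetical retry loop, not HttpClient2's source.
function requestWithRetry(doRequest, options) {
  var retryDelay = options.retryDelay || 0;
  var isRetry = options.isRetry || function (res) {
    return res.status >= 500; // documented default
  };

  function attempt(remaining) {
    return doRequest().then(function (res) {
      if (remaining > 0 && isRetry(res)) {
        // wait retryDelay ms, then try again with one fewer retry left
        return new Promise(function (resolve) {
          setTimeout(resolve, retryDelay);
        }).then(function () {
          return attempt(remaining - 1);
        });
      }
      return res;
    });
  }

  return attempt(options.retry || 0);
}
```

With `retry: 2`, a sequence of responses `503, 503, 200` results in three calls to `doRequest` and a final status of `200`.
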
## Proxy

Both the `http` and `https` protocols are supported.

**Notice: Only supported on Node.js >= 4.0.0**

### Programming

```js
urllib.request('https://twitter.com/', {
  enableProxy: true,
  proxy: 'http://localhost:8008',
}, (err, data, res) => {
  console.log(res.status, res.headers);
});
```

### System environment variable

- http

```bash
HTTP_PROXY=http://localhost:8008
http_proxy=http://localhost:8008
```

- https

```bash
HTTP_PROXY=http://localhost:8008
http_proxy=http://localhost:8008
HTTPS_PROXY=https://localhost:8008
https_proxy=https://localhost:8008
```

```bash
$ http_proxy=http://localhost:8008 node index.js
```

### Trace

If `trace` is set to true, the error stack will contain the full call stack, like:

```
Error: connect ECONNREFUSED 127.0.0.1:11
    at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1113:14)
    --------------------
    at ~/workspace/urllib/lib/urllib.js:150:13
    at new Promise (<anonymous>)
    at Object.request (~/workspace/urllib/lib/urllib.js:149:10)
    at Context.<anonymous> (~/workspace/urllib/test/urllib_promise.test.js:49:19)
    ....
```

Enabling trace may hurt urllib's performance, so consider it carefully.

## TODO

* [ ] Support component
* [ ] Browser env use Ajax
* [√] Support Proxy
* [√] Upload file like form upload
* [√] Auto redirect handle
* [√] https & self-signed certificate
* [√] Connection timeout & Response timeout
* [√] Support `Accept-Encoding=gzip` by `options.gzip = true`
* [ ] Support [Digest access authentication](http://en.wikipedia.org/wiki/Digest_access_authentication)

## License

[MIT](LICENSE.txt)


[bluebird]: https://github.com/petkaantonov/bluebird