<h1 vertical-align="middle">
    User Agents
</h1>

<p align="left">
    <a href="https://circleci.com/gh/intoli/user-agents/tree/master">
        <img src="https://img.shields.io/circleci/project/github/intoli/user-agents/master.svg"
            alt="Build Status"></a>
    <a href="https://circleci.com/gh/intoli/user-agents/tree/master">
        <img src="https://img.shields.io/github/last-commit/intoli/user-agents/master.svg"
            alt="Last Commit"></a>
    <a href="https://github.com/intoli/user-agents/blob/master/LICENSE">
        <img src="https://img.shields.io/badge/License-BSD%202--Clause-blue.svg"
            alt="License"></a>
    <a href="https://www.npmjs.com/package/user-agents">
        <img src="https://img.shields.io/npm/v/user-agents.svg"
            alt="NPM Version"></a>
    <span> </span>
    <a target="_blank" href="https://twitter.com/home?status=User%20Agents%20is%20a%20JavaScript%20module%20for%20generating%20random%20user%20agents%20that's%20updated%20daily%20with%20new%20market%20share%20data.%0A%0Ahttps%3A//github.com/intoli/user-agents">
        <img height="26px" src="https://simplesharebuttons.com/images/somacro/twitter.png"
            alt="Tweet"></a>
    <a target="_blank" href="https://www.facebook.com/sharer/sharer.php?u=https%3A//github.com/intoli/user-agents">
        <img height="26px" src="https://simplesharebuttons.com/images/somacro/facebook.png"
            alt="Share on Facebook"></a>
    <a target="_blank" href="http://reddit.com/submit?url=https%3A%2F%2Fgithub.com%2Fintoli%2Fuser-agents&title=User%20Agents%20-%20Random%20user%20agent%20generation%20with%20daily-updated%20market%20share%20data">
        <img height="26px" src="https://simplesharebuttons.com/images/somacro/reddit.png"
            alt="Share on Reddit"></a>
    <a target="_blank" href="https://news.ycombinator.com/submitlink?u=https://github.com/intoli/user-agents&t=User%20Agents%20-%20Random%20user%20agent%20generation%20with%20daily-updated%20market%20share%20data">
        <img height="26px" src="media/ycombinator.png"
            alt="Share on Hacker News"></a>
</p>


###### [Installation](#installation) | [Examples](#examples) | [API](#api) | [How it Works](https://intoli.com/blog/user-agents/) | [Contributing](#contributing)

> User-Agents is a JavaScript package for generating random [User Agents](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/User-Agent) based on how frequently they're used in the wild.
> A new version of the package is automatically released every day, so the data is always up to date.
> The generated data includes hard-to-find browser-fingerprint properties, and powerful filtering capabilities allow you to restrict the generated user agents to fit your exact needs.

Web scraping often involves creating realistic traffic patterns, and doing so generally requires a good source of data.
The User-Agents package provides a comprehensive dataset of real-world user agents and other browser properties which are commonly used for browser fingerprinting and blocking automated web browsers.
Unlike other random user agent generation libraries, the User-Agents package is updated automatically on a daily basis.
This means that you can use it without worrying about whether the data will be stale in a matter of months.

Generating a realistic random user agent is as simple as running `new UserAgent()`, but you can also easily generate user agents which correspond to a specific platform, device category, or even operating system version.
The fastest way to get started is to hop down to the [Examples](#examples) section where you can see it in action!


## Installation

The User-Agents package is available on npm with the package name [user-agents](https://npmjs.com/package/user-agents).
You can install it using your favorite JavaScript package manager in the usual way.

```bash
# With npm:
npm install user-agents
# With pnpm:
pnpm add user-agents
# With yarn:
yarn add user-agents
```


## Examples

The User-Agents library offers a very flexible interface for generating user agents.
These examples illustrate some common use cases, and show how the filtering API can be used in practice.


### Generating a Random User Agent

The most basic usage involves simply instantiating a `UserAgent` instance.
It will be automatically populated with a random user agent and browser fingerprint.


```javascript
import UserAgent from 'user-agents';

const userAgent = new UserAgent();
console.log(userAgent.toString());
console.log(JSON.stringify(userAgent.data, null, 2));
```

In this example, we've generated a random user agent and then logged stringified versions of both `userAgent` itself and the `userAgent.data` object to the console.
Example output might look something like this.

```literal
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.106 Safari/537.36
```

```json
{
  "appName": "Netscape",
  "connection": {
    "downlink": 10,
    "effectiveType": "4g",
    "rtt": 0
  },
  "platform": "Win32",
  "pluginsLength": 3,
  "vendor": "Google Inc.",
  "userAgent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.106 Safari/537.36",
  "viewportHeight": 660,
  "viewportWidth": 1260,
  "deviceCategory": "desktop",
  "screenHeight": 800,
  "screenWidth": 1280
}
```

The `userAgent.toString()` call converts the user agent into a string which corresponds to the actual user agent.
The `data` property includes a randomly generated browser fingerprint that can be used for more detailed emulation.
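
As a rough illustration of what "more detailed emulation" might look like, the sketch below maps a fingerprint-shaped object onto request headers and viewport settings.
The `buildEmulationProfile` helper and the shape of its return value are hypothetical and not part of the package; the sample object simply reuses fields from the example output above.

```javascript
// Hypothetical helper (not part of user-agents): turn a fingerprint-shaped
// object into settings that an HTTP client or headless browser could consume.
function buildEmulationProfile(data) {
  return {
    headers: { 'User-Agent': data.userAgent },
    viewport: { width: data.viewportWidth, height: data.viewportHeight },
    screen: { width: data.screenWidth, height: data.screenHeight },
    isMobile: data.deviceCategory === 'mobile',
  };
}

// Sample values copied from the example output above.
const sampleData = {
  userAgent: 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.106 Safari/537.36',
  viewportHeight: 660,
  viewportWidth: 1260,
  deviceCategory: 'desktop',
  screenHeight: 800,
  screenWidth: 1280,
};

const profile = buildEmulationProfile(sampleData);
console.log(profile.viewport); // { width: 1260, height: 660 }
```

In practice you would pass `userAgent.data` from a generated instance instead of the hard-coded sample.
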


### Restricting Device Categories

When you pass an object as a filter, each property that you specify restricts the corresponding user agent property to the given value.

```javascript
import UserAgent from 'user-agents';

const userAgent = new UserAgent({ deviceCategory: 'mobile' });
```

This code will generate a user agent with a `deviceCategory` of `mobile`.
If you replace `mobile` with either `desktop` or `tablet`, then the user agent will correspond to one of those device types instead.


### Generating Multiple User Agents With The Same Filters

There is some computational overhead involved with applying a set of filters, so it's far more efficient to reuse the filter initialization when you need to generate many user agents with the same configuration.
You can call any initialized `UserAgent` instance like a function, and it will generate a new random instance with the same filters (you can also call `userAgent.random()` if you're not a fan of the shorthand).

```javascript
import UserAgent from 'user-agents';

const userAgent = new UserAgent({ platform: 'Win32' });
const userAgents = Array(1000).fill().map(() => userAgent());
```

This code example initializes a single user agent with a filter that limits the platform to `Win32`, and then uses that instance to generate 1000 more user agents with the same filter.


### Regular Expression Matching

You can pass a regular expression as a filter and the generated user agent will be guaranteed to match that regular expression.

```javascript
import UserAgent from 'user-agents';

const userAgent = new UserAgent(/Safari/);
```

This example will generate a user agent that contains a `Safari` substring.
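
One caveat worth keeping in mind: Chrome's user agent string also contains a `Safari` token, so `/Safari/` matches more than Safari itself.
The sketch below checks two sample strings against a looser and a stricter pattern; the stricter pattern is only an illustrative heuristic and may need adjusting for your data.

```javascript
// The Chrome string comes from the example output earlier in this document.
// The Safari string is a typical desktop Safari user agent, used for illustration.
const chromeUA = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.106 Safari/537.36';
const safariUA = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/11.1.2 Safari/605.1.15';

const loose = /Safari/;
// Assumed heuristic: Safari advertises a `Version/x.y` token before `Safari/`,
// while Chrome does not.
const strict = /Version\/[\d.]+ .*Safari\//;

console.log(loose.test(chromeUA), loose.test(safariUA));   // true true
console.log(strict.test(chromeUA), strict.test(safariUA)); // false true
```
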


### Custom Filter Functions

It's also possible to implement completely custom logic by passing a function as a filter.
The raw `userAgent.data` object will be passed into your function, and that user agent will be included as a possible candidate only if your function returns `true`.
In this example, we'll use the [useragent](https://www.npmjs.com/package/useragent) package to parse the user agent string and then restrict the generated user agents to iOS devices with an operating system version of 11 or greater.

```javascript
import UserAgent from 'user-agents';
import { parse } from 'useragent';

const userAgent = new UserAgent((data) => {
  const os = parse(data.userAgent).os;
  // "11 or greater" means the comparison must be inclusive.
  return os.family === 'iOS' && parseInt(os.major, 10) >= 11;
});
```

The filtering that you apply here is completely up to you, so there's really no limit to how specific it can be.
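
If you'd rather avoid an extra dependency, a filter function can also inspect the user agent string directly.
The sketch below is one hypothetical way to express the same iOS check with a regular expression; `isModernIOS` is a made-up name, and the pattern assumes the common `OS 11_4 like Mac OS X` token that iOS browsers typically send.

```javascript
// Hypothetical filter: accept iPhone/iPad/iPod user agents on iOS 11 or later.
function isModernIOS(data) {
  const match = /\((?:iPhone|iPad|iPod)[^)]*\bOS (\d+)[_\d]* like Mac OS X/.exec(data.userAgent);
  return match !== null && parseInt(match[1], 10) >= 11;
}

// Illustrative sample records shaped like `userAgent.data`.
const samples = [
  { userAgent: 'Mozilla/5.0 (iPhone; CPU iPhone OS 11_4 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/11.0 Mobile/15E148 Safari/604.1' },
  { userAgent: 'Mozilla/5.0 (iPad; CPU OS 10_3_3 like Mac OS X) AppleWebKit/603.3.8 (KHTML, like Gecko) Version/10.0 Mobile/14G60 Safari/602.1' },
  { userAgent: 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.106 Safari/537.36' },
];

const results = samples.map((sample) => isModernIOS(sample));
console.log(results); // [ true, false, false ]
```

A function like this could then be passed to the constructor in place of the arrow function above, as in `new UserAgent(isModernIOS)`.
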


### Combining Filters With Arrays

You can also use arrays to specify collections of filters that will all be applied.
This example combines a regular expression filter with an object filter to generate a user agent with a connection type of `wifi`, a platform of `MacIntel`, and a user agent that includes a `Safari` substring.

```javascript
import UserAgent from 'user-agents';

const userAgent = new UserAgent([
  /Safari/,
  {
    connection: {
      type: 'wifi',
    },
    platform: 'MacIntel',
  },
]);
```

This example also shows that you can specify both multiple and nested properties on object filters.
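
To build some intuition for how multiple and nested properties on an object filter narrow down candidates, here is a minimal sketch of recursive matching.
It is not the package's actual implementation; `matchesFilter` and the sample records are hypothetical.

```javascript
// Hypothetical matcher: every key in `filter` must match the same key in `data`.
// Nested objects recurse; all other values are compared with strict equality.
function matchesFilter(filter, data) {
  if (data === undefined || data === null) return false;
  return Object.entries(filter).every(([key, expected]) => {
    const actual = data[key];
    if (expected !== null && typeof expected === 'object') {
      return matchesFilter(expected, actual);
    }
    return actual === expected;
  });
}

// Illustrative candidates shaped like `userAgent.data`.
const candidates = [
  { platform: 'MacIntel', connection: { type: 'wifi', rtt: 50 } },
  { platform: 'MacIntel', connection: { type: 'cellular', rtt: 150 } },
  { platform: 'Win32', connection: { type: 'wifi', rtt: 0 } },
  { platform: 'MacIntel' },
];

const filter = { connection: { type: 'wifi' }, platform: 'MacIntel' };
const matches = candidates.filter((candidate) => matchesFilter(filter, candidate));
console.log(matches.length); // 1
```

Only the first candidate satisfies both the top-level `platform` constraint and the nested `connection.type` constraint.
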


## API

### class: UserAgent([filters])

- `filters` <`Array`, `Function`, `Object`, `RegExp`, or `String`> - A set of filters to apply to the generated user agents.
    The filter specification is extremely flexible, and reading through the [Examples](#examples) section is the best way to familiarize yourself with what sort of filtering is possible.

`UserAgent` is an object that contains the details of a randomly generated user agent and corresponding browser fingerprint.
Each time the class is instantiated, it will randomly populate the instance with a new user agent based on the specified filters.
The instantiated class can be cast to a user agent string by explicitly calling `toString()`, accessing the `userAgent` property, or implicitly converting the type to a primitive or string in the standard JavaScript ways (*e.g.* `` `${userAgent}` ``).
Other properties can be accessed as outlined below.
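
For readers curious how an object can behave like a string in template literals, the following sketch shows the standard JavaScript hooks involved.
`FakeUserAgent` is a hypothetical stand-in rather than the package's class, and the real implementation may differ.

```javascript
// Hypothetical stand-in class used only to demonstrate string conversion.
class FakeUserAgent {
  constructor(data) {
    this.data = data;
    // Mirror each data property onto the instance for direct access.
    Object.assign(this, data);
  }

  // Explicit conversion.
  toString() {
    return this.data.userAgent;
  }

  // Implicit conversion: used by template literals, concatenation, and String().
  [Symbol.toPrimitive]() {
    return this.data.userAgent;
  }
}

const fake = new FakeUserAgent({
  userAgent: 'Mozilla/5.0 (X11; Linux x86_64)',
  platform: 'Linux x86_64',
});

console.log(`${fake}`);        // Mozilla/5.0 (X11; Linux x86_64)
console.log('UA: ' + fake);    // UA: Mozilla/5.0 (X11; Linux x86_64)
console.log(fake.platform);    // Linux x86_64
```
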


#### userAgent.random()

- returns: <`UserAgent`>

This method generates a new `UserAgent` instance using the same filters that were used to construct `userAgent`.
The following examples both generate two user agents based on the same filters.

```javascript
// Explicitly use the constructor twice.
const firstUserAgent = new UserAgent(filters);
const secondUserAgent = new UserAgent(filters);
```

```javascript
// Use the `random()` method to construct a second user agent.
const firstUserAgent = new UserAgent(filters);
const secondUserAgent = firstUserAgent.random();
```

The reason to prefer the second pattern is that it reuses the filter processing and preparation of the data for random selection.
Subsequent random generations can easily be over 100x faster than the initial construction.
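
The speedup comes from doing the expensive work once.
As a sketch of the idea (not the package's internals), the hypothetical `createSampler` below filters a weighted list and precomputes cumulative weights a single time, then returns a function that draws from the prepared list on each call.
The `weight` field, the sample entries, and the injectable `rng` parameter are all assumptions made for illustration.

```javascript
function createSampler(entries, predicate, rng = Math.random) {
  // One-time preparation: filter, then build cumulative weights.
  const pool = entries.filter(predicate);
  if (pool.length === 0) throw new Error('No entries match the filter.');
  const cumulative = [];
  let total = 0;
  for (const entry of pool) {
    total += entry.weight;
    cumulative.push(total);
  }

  // Cheap per-call work: pick the first entry whose cumulative weight exceeds the draw.
  const sample = () => {
    const target = rng() * total;
    const index = cumulative.findIndex((limit) => target < limit);
    return pool[index === -1 ? pool.length - 1 : index];
  };
  sample.random = sample; // allow both sampler() and sampler.random()
  return sample;
}

// Illustrative data: weights stand in for how often each user agent is seen.
const entries = [
  { userAgent: 'UA-A', platform: 'Win32', weight: 6 },
  { userAgent: 'UA-B', platform: 'Win32', weight: 2 },
  { userAgent: 'UA-C', platform: 'MacIntel', weight: 2 },
];

let predicateCalls = 0;
const sampler = createSampler(entries, (entry) => {
  predicateCalls += 1;
  return entry.platform === 'Win32';
});

const samples = Array.from({ length: 1000 }, () => sampler());
console.log(predicateCalls); // 3: the filter ran once per entry, not once per sample
```

Heavier entries are drawn more often, which mirrors the idea of generating user agents in proportion to how frequently they appear in the wild.
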


#### userAgent()

- returns: <`UserAgent`>

As a bit of syntactic sugar, you can call a `UserAgent` instance like `userAgent()` as a shorthand for `userAgent.random()`.
This allows you to think of the instance as a generator, and lends itself to writing code like this.

```javascript
const generateUserAgent = new UserAgent(filters);
const userAgents = Array(100).fill().map(() => generateUserAgent());
```

#### userAgent.toString()

- returns: <`String`>

Casts the `UserAgent` instance to a string which corresponds to the user agent header.
Equivalent to accessing the `userAgent.userAgent` property.


#### userAgent.data

- returns: <`Object`>
    - `appName` <`String`> - The value of [navigator.appName](https://developer.mozilla.org/en-US/docs/Web/API/NavigatorID/appName).
    - `connection` <`Object`> - The value of [navigator.connection](https://developer.mozilla.org/en-US/docs/Web/API/Navigator/connection).
    - `cpuClass` <`String`> - The value of [navigator.cpuClass](https://msdn.microsoft.com/en-us/library/ms531090\(v=vs.85\).aspx).
    - `deviceCategory` <`String`> - One of `desktop`, `mobile`, or `tablet` depending on the type of device.
    - `oscpu` <`String`> - The value of [navigator.oscpu](https://developer.mozilla.org/en-US/docs/Web/API/Navigator/oscpu).
    - `platform` <`String`> - The value of [navigator.platform](https://developer.mozilla.org/en-US/docs/Web/API/NavigatorID/platform).
    - `pluginsLength` <`Number`> - The value of [navigator.plugins.length](https://developer.mozilla.org/en-US/docs/Web/API/NavigatorPlugins/plugins).
    - `screenHeight` <`Number`> - The value of [screen.height](https://developer.mozilla.org/en-US/docs/Web/API/Screen/height).
    - `screenWidth` <`Number`> - The value of [screen.width](https://developer.mozilla.org/en-US/docs/Web/API/Screen/width).
    - `vendor` <`String`> - The value of [navigator.vendor](https://developer.mozilla.org/en-US/docs/Web/API/Navigator/vendor).
    - `userAgent` <`String`> - The value of [navigator.userAgent](https://developer.mozilla.org/en-US/docs/Web/API/NavigatorID/userAgent).
    - `viewportHeight` <`Number`> - The value of [window.innerHeight](https://developer.mozilla.org/en-US/docs/Web/API/Window/innerHeight).
    - `viewportWidth` <`Number`> - The value of [window.innerWidth](https://developer.mozilla.org/en-US/docs/Web/API/Window/innerWidth).

The `userAgent.data` object contains the randomly generated fingerprint for the `UserAgent` instance.
Note that each property of `data` is also accessible directly on `userAgent`.
For example, `userAgent.appName` is equivalent to `userAgent.data.appName`.


## Versioning

The project follows [the Semantic Versioning guidelines](https://semver.org/).
The automated deployments will always correspond to patch versions, and minor versions should not introduce breaking changes.
It's likely that the structure of user agent data will change in the future, and this will correspond to a new major version.

Please keep in mind that older major versions will cease to be updated after a new major version is released.
You can continue to use older versions of the software, but you'll need to upgrade to get access to the latest data.


## Acknowledgements

The user agent frequency data used in this library is generously provided by [Intoli](https://intoli.com), the premier residential and smart proxy provider for web scraping.
The details of how the data is updated can be found in the blog post [User-Agents — A random user agent generation library that's always up to date](https://intoli.com/blog/user-agents/).

If you have a high-traffic website and would like to contribute data to the project, then send us an email at [contact@intoli.com](mailto:contact@intoli.com).
Additional data sources will help make the library more useful, and we'll be happy to add a link to your site in the acknowledgements.


## Contributing

Contributions are welcome, but please follow the contributor guidelines outlined in [CONTRIBUTING.md](CONTRIBUTING.md).


## License

User-Agents is licensed under a [BSD 2-Clause License](LICENSE) and is copyright [Intoli, LLC](https://intoli.com).