# Pose Detection in the Browser: PoseNet Model

## Note: We've just released Version 2.0 with a **new ResNet** model and API. Check out the new documentation below.

This package contains a standalone model called PoseNet, as well as some demos, for running real-time pose estimation in the browser using TensorFlow.js.

[Try the demo here!](https://storage.googleapis.com/tfjs-models/demos/posenet/camera.html)

<img src="demos/camera.gif" alt="cameraDemo" style="width: 600px;"/>

PoseNet can be used to estimate either a single pose or multiple poses, meaning there is a version of the algorithm that can detect only one person in an image/video and a version that can detect multiple persons in an image/video.

[Refer to this blog post](https://medium.com/tensorflow/real-time-human-pose-estimation-in-the-browser-with-tensorflow-js-7dd0bc881cd5) for a high-level description of PoseNet running on TensorFlow.js.

To keep track of issues we use the [tensorflow/tfjs](https://github.com/tensorflow/tfjs) GitHub repo.

## Documentation Note

> The README you see here is for the [PoseNet 2.0 version](https://www.npmjs.com/package/@tensorflow-models/posenet). For the README of the previous 1.0 version, please see the [README published on NPM](https://www.npmjs.com/package/@tensorflow-models/posenet/v/1.0.3).

## Installation

You can use this as a standalone ES5 bundle like this:

```html
  <script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs"></script>
  <script src="https://cdn.jsdelivr.net/npm/@tensorflow-models/posenet"></script>
```

Or you can install it via npm for use in a TypeScript / ES6 project.

```sh
npm install @tensorflow-models/posenet
```
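
After installing via npm, import the package in your application code. The snippet below is a minimal sketch that assumes your project is compiled with a module bundler (e.g. webpack or Parcel) that resolves npm packages:

```javascript
// Minimal sketch: import PoseNet from the npm package and load the default model.
import * as posenet from '@tensorflow-models/posenet';

async function loadPoseNet() {
  // Loads the default MobileNetV1 model; see "Config params in posenet.load()" below.
  const net = await posenet.load();
  return net;
}
```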

## Usage

Either a single pose or multiple poses can be estimated from an image.
Each methodology has its own algorithm and set of parameters.

### Loading a pre-trained PoseNet Model

In the first step of pose estimation, an image is fed through a pre-trained model. PoseNet **comes with a few different versions of the model,** corresponding to variants of the MobileNetV1 architecture and the ResNet50 architecture. To get started, a model must be loaded from a checkpoint:

```javascript
const net = await posenet.load();
```

By default, `posenet.load()` loads a faster and smaller model based on the MobileNetV1 architecture, at the cost of lower accuracy. If you want to load the larger and more accurate model, specify the architecture explicitly in `posenet.load()` using a `ModelConfig` dictionary:

#### MobileNet (smaller, faster, less accurate)
```javascript
const net = await posenet.load({
  architecture: 'MobileNetV1',
  outputStride: 16,
  inputResolution: { width: 640, height: 480 },
  multiplier: 0.75
});
```

#### ResNet (larger, slower, more accurate) **new!**
```javascript
const net = await posenet.load({
  architecture: 'ResNet50',
  outputStride: 32,
  inputResolution: { width: 257, height: 200 },
  quantBytes: 2
});
```

#### Config params in posenet.load()

* **architecture** - Can be either `MobileNetV1` or `ResNet50`. It determines which PoseNet architecture to load.

* **outputStride** - Can be one of `8`, `16`, `32` (strides `16` and `32` are supported for the ResNet architecture; strides `8`, `16`, and `32` are supported for the MobileNetV1 architecture). It specifies the output stride of the PoseNet model. The smaller the value, the larger the output resolution and the more accurate the model, at the cost of speed. Set this to a larger value to increase speed at the cost of accuracy.

* **inputResolution** - A `number` or an `Object` of type `{width: number, height: number}`. Defaults to `257`. It specifies the size the image is resized and padded to before it is fed into the PoseNet model. The larger the value, the more accurate the model, at the cost of speed. Set this to a smaller value to increase speed at the cost of accuracy. If a number is provided, the image will be resized and padded to be a square with the same width and height. If `width` and `height` are provided, the image will be resized and padded to the specified width and height.

* **multiplier** - Can be one of `1.01`, `1.0`, `0.75`, or `0.50` (the value is used *only* by the MobileNetV1 architecture, not by the ResNet architecture). It is the float multiplier for the depth (number of channels) of all convolution ops. The larger the value, the larger the size of the layers and the more accurate the model, at the cost of speed. Set this to a smaller value to increase speed at the cost of accuracy.

* **quantBytes** - This argument controls the number of bytes used for weight quantization. The available options are:

  - `4`. 4 bytes per float (no quantization). Leads to the highest accuracy and the original model size (~90MB).
  - `2`. 2 bytes per float. Leads to slightly lower accuracy and a 2x reduction in model size (~45MB).
  - `1`. 1 byte per float. Leads to lower accuracy and a 4x reduction in model size (~22MB).

* **modelUrl** - An optional string that specifies a custom URL for the model. This is useful for local development or for countries that don't have access to the model hosted on GCP.

**By default,** PoseNet loads a MobileNetV1 architecture with a **`0.75`** multiplier. This is recommended for computers with **mid-range/lower-end GPUs.** A model with a **`0.50`** multiplier is recommended for **mobile.** The ResNet architecture is recommended for computers with **even more powerful GPUs**.
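
For example, a configuration aimed at mobile devices might combine the `0.50` multiplier with 1-byte quantization to reduce both compute and download size. The following is a minimal sketch; the specific values are illustrative and the right trade-off depends on your target device:

```javascript
// Sketch: a smaller, faster MobileNetV1 configuration for mobile devices.
const mobileNet = await posenet.load({
  architecture: 'MobileNetV1',
  outputStride: 16,
  inputResolution: { width: 257, height: 257 },
  multiplier: 0.50,
  quantBytes: 1 // 1-byte quantization reduces model size at some accuracy cost
});
```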

### Single-Person Pose Estimation

Single-pose estimation is the simpler and faster of the two algorithms. Its ideal use case is when there is only one person in the image. The disadvantage is that if there are multiple persons in an image, keypoints from different persons will likely be estimated as being part of the same single pose; for example, person #1’s left arm and person #2’s right knee might be conflated by the algorithm as belonging to the same pose. Both the MobileNetV1 and the ResNet architectures support single-person pose estimation. The method returns a **single pose**:

```javascript
const net = await posenet.load();

const pose = await net.estimateSinglePose(image, {
  flipHorizontal: false
});
```

#### Params in estimateSinglePose()

* **image** - ImageData|HTMLImageElement|HTMLCanvasElement|HTMLVideoElement
  The input image to feed through the network.
* **inferenceConfig** - an object containing:
  * **flipHorizontal** - Defaults to false. Whether the pose should be flipped/mirrored horizontally. Set this to true for videos that are flipped horizontally by default (i.e. a webcam) when you want the pose to be returned in the proper orientation (see the webcam sketch after this list).
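
For instance, when estimating a pose from a mirrored webcam feed you would typically pass `flipHorizontal: true`. The snippet below is a rough sketch; it assumes a `<video>` element with id `webcam` whose stream has already been started:

```javascript
// Sketch: single-pose estimation on a mirrored webcam feed.
// Assumes `video` is an HTMLVideoElement already playing a webcam stream.
const video = document.getElementById('webcam');

const net = await posenet.load();
const pose = await net.estimateSinglePose(video, {
  flipHorizontal: true // webcam feeds are usually mirrored, so flip the returned keypoints
});
console.log(pose);
```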

#### Returns

It returns a `Promise` that resolves with a **single** `pose`. The `pose` has a confidence score and an array of keypoints indexed by part id, each with a score and position.

#### Example Usage

##### via Script Tag

```html
<html>
  <head>
    <!-- Load TensorFlow.js -->
    <script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs"></script>
    <!-- Load Posenet -->
    <script src="https://cdn.jsdelivr.net/npm/@tensorflow-models/posenet"></script>
  </head>

  <body>
    <img id='cat' src='/images/cat.jpg'/>
  </body>
  <!-- Place your code in the script tag below. You can also use an external .js file -->
  <script>
    var flipHorizontal = false;

    var imageElement = document.getElementById('cat');

    posenet.load().then(function(net) {
      const pose = net.estimateSinglePose(imageElement, {
        flipHorizontal: flipHorizontal
      });
      return pose;
    }).then(function(pose) {
      console.log(pose);
    });
  </script>
</html>
```

##### via NPM

```javascript
import * as posenet from '@tensorflow-models/posenet';

async function estimatePoseOnImage(imageElement) {
  // load the posenet model from a checkpoint
  const net = await posenet.load();

  const pose = await net.estimateSinglePose(imageElement, {
    flipHorizontal: false
  });
  return pose;
}

const imageElement = document.getElementById('cat');

// estimatePoseOnImage is async, so wait for the promise before logging the result
estimatePoseOnImage(imageElement).then(function(pose) {
  console.log(pose);
});
```

which would produce the output:

```json
{
  "score": 0.32371445304906,
  "keypoints": [
    {
      "position": {
        "y": 76.291801452637,
        "x": 253.36747741699
      },
      "part": "nose",
      "score": 0.99539834260941
    },
    {
      "position": {
        "y": 71.10383605957,
        "x": 253.54365539551
      },
      "part": "leftEye",
      "score": 0.98781454563141
    },
    {
      "position": {
        "y": 71.839515686035,
        "x": 246.00454711914
      },
      "part": "rightEye",
      "score": 0.99528175592422
    },
    {
      "position": {
        "y": 72.848854064941,
        "x": 263.08151245117
      },
      "part": "leftEar",
      "score": 0.84029853343964
    },
    {
      "position": {
        "y": 79.956565856934,
        "x": 234.26812744141
      },
      "part": "rightEar",
      "score": 0.92544466257095
    },
    {
      "position": {
        "y": 98.34538269043,
        "x": 399.64068603516
      },
      "part": "leftShoulder",
      "score": 0.99559044837952
    },
    {
      "position": {
        "y": 95.082359313965,
        "x": 458.21868896484
      },
      "part": "rightShoulder",
      "score": 0.99583911895752
    },
    {
      "position": {
        "y": 94.626205444336,
        "x": 163.94561767578
      },
      "part": "leftElbow",
      "score": 0.9518963098526
    },
    {
      "position": {
        "y": 150.2349395752,
        "x": 245.06030273438
      },
      "part": "rightElbow",
      "score": 0.98052614927292
    },
    {
      "position": {
        "y": 113.9603729248,
        "x": 393.19735717773
      },
      "part": "leftWrist",
      "score": 0.94009721279144
    },
    {
      "position": {
        "y": 186.47859191895,
        "x": 257.98034667969
      },
      "part": "rightWrist",
      "score": 0.98029226064682
    },
    {
      "position": {
        "y": 208.5266418457,
        "x": 284.46710205078
      },
      "part": "leftHip",
      "score": 0.97870296239853
    },
    {
      "position": {
        "y": 209.9910736084,
        "x": 243.31219482422
      },
      "part": "rightHip",
      "score": 0.97424703836441
    },
    {
      "position": {
        "y": 281.61965942383,
        "x": 310.93188476562
      },
      "part": "leftKnee",
      "score": 0.98368924856186
    },
    {
      "position": {
        "y": 282.80120849609,
        "x": 203.81164550781
      },
      "part": "rightKnee",
      "score": 0.96947449445724
    },
    {
      "position": {
        "y": 360.62716674805,
        "x": 292.21047973633
      },
      "part": "leftAnkle",
      "score": 0.8883239030838
    },
    {
      "position": {
        "y": 347.41177368164,
        "x": 203.88229370117
      },
      "part": "rightAnkle",
      "score": 0.8255187869072
    }
  ]
}
```

### Keypoints

All keypoints are indexed by part id. The parts and their ids are:

| Id | Part |
| -- | -- |
| 0 | nose |
| 1 | leftEye |
| 2 | rightEye |
| 3 | leftEar |
| 4 | rightEar |
| 5 | leftShoulder |
| 6 | rightShoulder |
| 7 | leftElbow |
| 8 | rightElbow |
| 9 | leftWrist |
| 10 | rightWrist |
| 11 | leftHip |
| 12 | rightHip |
| 13 | leftKnee |
| 14 | rightKnee |
| 15 | leftAnkle |
| 16 | rightAnkle |

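
As an illustration of working with this structure (not part of the PoseNet API), the sketch below keeps only the keypoints of a pose whose score clears a chosen confidence threshold:

```javascript
// Sketch: filter a pose's keypoints by confidence score.
// `pose` is the object returned by estimateSinglePose(); 0.5 is an arbitrary threshold.
const minPartConfidence = 0.5;

const confidentKeypoints = pose.keypoints.filter(
  keypoint => keypoint.score >= minPartConfidence
);

confidentKeypoints.forEach(({ part, position, score }) => {
  console.log(`${part}: (${position.x.toFixed(1)}, ${position.y.toFixed(1)}), score ${score.toFixed(2)}`);
});
```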

### Multi-Person Pose Estimation

Multi-person pose estimation can decode multiple poses in an image. It is more complex and slightly slower than the single-person algorithm, but it has the advantage that if multiple people appear in an image, their detected keypoints are less likely to be associated with the wrong pose. Even if the use case is to detect a single person’s pose, this algorithm may be more desirable, because the accidental effect of two poses being joined together won’t occur when multiple people appear in the image. It uses the `Fast greedy decoding` algorithm from the research paper [PersonLab: Person Pose Estimation and Instance Segmentation with a Bottom-Up, Part-Based, Geometric Embedding Model](https://arxiv.org/pdf/1803.08225.pdf). Both the MobileNetV1 and the ResNet architectures support multi-person pose estimation. The method returns a **promise** that resolves with an **array of poses**:

```javascript
const net = await posenet.load();

const poses = await net.estimateMultiplePoses(image, {
  flipHorizontal: false,
  maxDetections: 5,
  scoreThreshold: 0.5,
  nmsRadius: 20
});
```

#### Params in estimateMultiplePoses()

* **image** - ImageData|HTMLImageElement|HTMLCanvasElement|HTMLVideoElement
  The input image to feed through the network.
* **inferenceConfig** - an object containing:
  * **flipHorizontal** - Defaults to false. Whether the poses should be flipped/mirrored horizontally. Set this to true for videos that are flipped horizontally by default (i.e. a webcam) when you want the poses to be returned in the proper orientation.
  * **maxDetections** - the maximum number of poses to detect. Defaults to 5.
  * **scoreThreshold** - Only return instance detections whose root part score is greater than or equal to this value. Defaults to 0.5.
  * **nmsRadius** - Non-maximum suppression part distance. It needs to be strictly positive. Two parts suppress each other if they are less than `nmsRadius` pixels apart. Defaults to 20.

#### Returns

It returns a `Promise` that resolves with an array of `pose`s, each with a confidence score and an array of `keypoints` indexed by part id, each with a score and position.

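As a hypothetical usage sketch (not part of the API), you might discard low-confidence poses before drawing or processing them:

```javascript
// Sketch: keep only poses whose overall score clears a threshold.
// `poses` is the array returned by estimateMultiplePoses(); 0.15 is an arbitrary cutoff.
const minPoseConfidence = 0.15;

const confidentPoses = poses.filter(pose => pose.score >= minPoseConfidence);

confidentPoses.forEach((pose, i) => {
  console.log(`pose ${i}: score ${pose.score.toFixed(2)}, ${pose.keypoints.length} keypoints`);
});
```
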
##### via Script Tag

```html
<html>
  <head>
    <!-- Load TensorFlow.js -->
    <script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs"></script>
    <!-- Load Posenet -->
    <script src="https://cdn.jsdelivr.net/npm/@tensorflow-models/posenet"></script>
  </head>

  <body>
    <img id='cat' src='/images/cat.jpg'/>
  </body>
  <!-- Place your code in the script tag below. You can also use an external .js file -->
  <script>
    var imageElement = document.getElementById('cat');

    posenet.load().then(function(net) {
      return net.estimateMultiplePoses(imageElement, {
        flipHorizontal: false,
        maxDetections: 2,
        scoreThreshold: 0.6,
        nmsRadius: 20
      });
    }).then(function(poses) {
      console.log(poses);
    });
  </script>
</html>
```

##### via NPM

```javascript
import * as posenet from '@tensorflow-models/posenet';

async function estimateMultiplePosesOnImage(imageElement) {
  const net = await posenet.load();

  // estimate poses
  const poses = await net.estimateMultiplePoses(imageElement, {
    flipHorizontal: false,
    maxDetections: 2,
    scoreThreshold: 0.6,
    nmsRadius: 20
  });

  return poses;
}

const imageElement = document.getElementById('people');

// estimateMultiplePosesOnImage is async, so wait for the promise before logging the result
estimateMultiplePosesOnImage(imageElement).then(function(poses) {
  console.log(poses);
});
```

This produces the output:

```
[
  // pose 1
  {
    // pose score
    "score": 0.42985695206067,
    "keypoints": [
      {
        "position": {
          "x": 126.09371757507,
          "y": 97.861720561981
        },
        "part": "nose",
        "score": 0.99710708856583
      },
      {
        "position": {
          "x": 132.53466176987,
          "y": 86.429876804352
        },
        "part": "leftEye",
        "score": 0.99919074773788
      },
      {
        "position": {
          "x": 100.85626316071,
          "y": 84.421931743622
        },
        "part": "rightEye",
        "score": 0.99851280450821
      },

      ...

      {
        "position": {
          "x": 72.665352582932,
          "y": 493.34189963341
        },
        "part": "rightAnkle",
        "score": 0.0028593824245036
      }
    ]
  },
  // pose 2
  {
    // pose score
    "score": 0.13461434583673,
    "keypoints": [
      {
        "position": {
          "x": 116.58444058895,
          "y": 99.772533416748
        },
        "part": "nose",
        "score": 0.0028593824245036
      },
      {
        "position": {
          "x": 133.49897611141,
          "y": 79.644590377808
        },
        "part": "leftEye",
        "score": 0.99919074773788
      },
      {
        "position": {
          "x": 100.85626316071,
          "y": 84.421931743622
        },
        "part": "rightEye",
        "score": 0.99851280450821
      },

      ...

      {
        "position": {
          "x": 72.665352582932,
          "y": 493.34189963341
        },
        "part": "rightAnkle",
        "score": 0.0028593824245036
      }
    ]
  },
  // pose 3
  {
    // pose score
    "score": 0.13461434583673,
    "keypoints": [
      {
        "position": {
          "x": 116.58444058895,
          "y": 99.772533416748
        },
        "part": "nose",
        "score": 0.0028593824245036
      },
      {
        "position": {
          "x": 133.49897611141,
          "y": 79.644590377808
        },
        "part": "leftEye",
        "score": 0.99919074773788
      },

      ...

      {
        "position": {
          "x": 59.334579706192,
          "y": 485.5936152935
        },
        "part": "rightAnkle",
        "score": 0.004110524430871
      }
    ]
  }
]
```

## Developing the Demos

Details for how to run the demos are included in the `demos/` folder.