1 | # CHINESE-WHISPERS
|
2 |
|
3 | [![Build Status](https://travis-ci.com/huan/chinese-whispers.svg?branch=master)](https://travis-ci.com/huan/chinese-whispers)
|
4 | [![NPM Version](https://badge.fury.io/js/chinese-whispers.svg)](https://badge.fury.io/js/chinese-whispers)
|
5 | [![npm (tag)](https://img.shields.io/npm/v/chinese-whispers/next.svg)](https://www.npmjs.com/package/chinese-whispers?activeTab=versions)
|
6 | [![Downloads](http://img.shields.io/npm/dm/chinese-whispers.svg?style=flat-square)](https://npmjs.org/package/chinese-whispers)
|
7 | [![TypeScript](https://img.shields.io/badge/%3C%2F%3E-TypeScript-blue.svg)](https://www.typescriptlang.org/)
|
8 | [![Greenkeeper badge](https://badges.greenkeeper.io/huan/chinese-whispers.svg)](https://greenkeeper.io/)
|
9 |
|
10 | An Efficient Graph Clustering Algorithm for JavaScript/Node.js
|
11 |
|
12 | ![Chinese Whispers](https://huan.github.io/chinese-whispers/images/chinese-whispers.gif)
|
13 | > Picture credit: http://www.wikihow.com/Play-Chinese-Whispers
|
14 |
|
15 | ## ALGORITHM
|
16 |
|
17 | Chinese Whispers - _an Efficient Graph Clustering Algorithm and its Application to Natural Language Processing Problems_
|
18 | > Author: Chris Biemann, University of Leipzig, NLP-Dept. Leipzig, Germany
|
19 |
|
20 | 1. [Wikipedia](https://en.wikipedia.org/wiki/Chinese_Whispers_(clustering_method))
|
21 | 2. [Slide](https://www.lt.informatik.tu-darmstadt.de/fileadmin/user_upload/Group_LangTech/publications/pre-langtech/BiemannTextgraphs06.ppt) - 2006 TextGraphs 06, NYC, USA
|
22 | 3. [Paper](https://pdfs.semanticscholar.org/3e71/0251cb01ba6e1c0c735591776a212edc461f.pdf)
|
23 |
|
24 | ## INSTALL
|
25 |
|
26 | ```shell
|
27 | npm install chinese-whispers
|
28 | ```
|
29 |
|
30 | ## EXAMPLE
|
31 |
|
32 | Talk is cheap, show me the code!
|
33 |
|
34 | ```ts
|
35 | import { ChineseWhispers } from 'chinese-whispers'
|
36 |
|
37 | function weightFunc(a, b) {
|
38 | const dist = Math.abs(a - b)
|
39 | return 1 / dist
|
40 | }
|
41 |
|
42 | const cw = new ChineseWhispers({
|
43 | weightFunc,
|
44 | })
|
45 |
|
46 | const dataList = [
|
47 | 0, 1, 2,
|
48 | 10, 11, 12,
|
49 | 20, 21, 22,
|
50 | ]
|
51 |
|
52 | const clusterIndicesList = cw.cluster(dataList)
|
53 |
|
54 | for (const i in clusterIndicesList) {
|
55 | const clusterIndices = clusterIndicesList[i]
|
56 | const cluster = clusterIndices.map(j => dataList[j])
|
57 | console.log('Cluster[' + i + ']: ' + cluster)
|
58 | }
|
59 | // Cluster[0]: 0,1,2
|
60 | // Cluster[1]: 10,11,12
|
61 | // Cluster[2]: 20,21,22
|
62 | ```
|
63 |
|
64 | Source code can be found at: <https://github.com/huan/chinese-whispers/blob/master/examples/demo.ts>
|
65 |
|
66 | API
|
67 | ---
|
68 |
|
69 | The `ChineseWhispers` class is all you need to run the Chinese Whispers Algorithm.
|
70 |
|
71 | ### 1. `constructor(options: ChineseWhisperOptions)`
|
72 |
|
73 | ```ts
|
74 | interface ChineseWhispersOptions<T> {
|
75 | weightFunc: WeightFunc<T>,
|
76 | epochs?: number,
|
77 | threshold?: number,
|
78 | }
|
79 | ```
|
80 |
|
81 | * `options`
|
82 | - `weightFunc`: a function that takes two data item, calculate the weight between them and return the value.
|
83 | - `epochs`: how many epoches to run the algorithm, default 15.
|
84 | - `threshold`: minimum weight required for a edge. default 0.
|
85 |
|
86 | ```ts
|
87 | const nj = require('numjs') // a Javascript implementation of numpy in Python
|
88 |
|
89 | // calculate the distance between vectors
|
90 | function weightFunc(v1, v2) {
|
91 | const njV1 = nj.array(v1)
|
92 | const njV2 = nj.array(v2)
|
93 | const l2 = njV1.subtract(njV2)
|
94 | .pow(2)
|
95 | .sum()
|
96 | const dist = Math.sqrt(l2)
|
97 | return 1 / dist
|
98 | }
|
99 |
|
100 | const cw = new ChineseWhispers({
|
101 | weightFunc,
|
102 | })
|
103 | ```
|
104 |
|
105 | ### 2. `cluster(dataList): number[][]`
|
106 |
|
107 | Process `dataList` which is an array of data, returns the cluster results as an array, each array item is a cluster, and each cluster is an array which includes the indices of dataList that belongs to this cluster.
|
108 |
|
109 | ```ts
|
110 | const clusterIndicesList = cw.cluster(dataList)
|
111 |
|
112 | for (const i in clusterIndicesList) {
|
113 | // get the cluster, which stores the array index dataList
|
114 | const clusterIndices = clusterIndicesList[i]
|
115 | // map the array index of dataList to the actual dataList data
|
116 | const cluster = clusterIndices.map(j => dataList[j])
|
117 | console.log('Cluster[' + i + ']: ' + cluster)
|
118 | }
|
119 | ```
|
120 |
|
121 | ## INSPIRATION
|
122 |
|
123 | The code is heavily inspired by the following implementation:
|
124 |
|
125 | * [facenet chinese whispers(face cluster) in Python - zhly0](http://blog.csdn.net/liyuan123zhouhui/article/details/70312716)
|
126 | * [Chinese Whispers Graph Clustering in Python - Alex Loveless](http://alexloveless.co.uk/data/chinese-whispers-graph-clustering-in-python/)
|
127 | * [A Python implementation of Chris Biemann's algorithm for graph clustering](https://github.com/sanmayaj/ChineseWhispers)
|
128 | * [Implementation of the Chinese Whispers graph clustering algorithm in Java](https://github.com/uhh-lt/chinese-whispers)
|
129 | * [Chinese Whispers Graph Clustering Algorithm in Javascript](https://github.com/anvaka/ngraph.cw)
|
130 |
|
131 | ## SEE ALSO
|
132 |
|
133 | * [The meaning and origin of the expression: Chinese whispers](http://www.phrases.org.uk/meanings/chinese-whispers.html)
|
134 | * [TensorFlow backed FaceNet implementation for Node.js](https://github.com/huan/node-facenet) - Face verification, face recognition and face clustering.
|
135 |
|
136 | ## CHANGELOG
|
137 |
|
138 | ### v0.3 master
|
139 |
|
140 | ### v0.2 Aug 2018
|
141 |
|
142 | 1. Upgrade TypeScript to 3.0
|
143 | 2. DevOps to npm@next
|
144 | 3. Upgrade rollup to 1.0
|
145 |
|
146 | ### v0.1 Sep 2017
|
147 |
|
148 | 1. `ChineseWhispers` class
|
149 | 1. `cluster()` class method
|
150 | 1. Unit test cases
|
151 | 1. Travis CI & CD(publish to NPM automatically)
|
152 |
|
153 | ## AUTHOR
|
154 |
|
155 | Huan LI \<zixia@zixia.net\> (http://linkedin.com/in/zixia)
|
156 |
|
157 | <a href="http://stackoverflow.com/users/1123955/zixia">
|
158 | <img src="http://stackoverflow.com/users/flair/1123955.png" width="208" height="58" alt="profile for zixia at Stack Overflow, Q&A for professional and enthusiast programmers" title="profile for zixia at Stack Overflow, Q&A for professional and enthusiast programmers">
|
159 | </a>
|
160 |
|
161 | ## COPYRIGHT & LICENSE
|
162 |
|
163 | * Code & Docs © 2017-2019 Huan LI \<zixia@zixia.net\>
|
164 | * Code released under the Apache-2.0 License
|
165 | * Docs released under Creative Commons
|