UNPKG

5.66 kBMarkdownView Raw
1# CHINESE-WHISPERS
2
3[![Build Status](https://travis-ci.com/huan/chinese-whispers.svg?branch=master)](https://travis-ci.com/huan/chinese-whispers)
4[![NPM Version](https://badge.fury.io/js/chinese-whispers.svg)](https://badge.fury.io/js/chinese-whispers)
5[![npm (tag)](https://img.shields.io/npm/v/chinese-whispers/next.svg)](https://www.npmjs.com/package/chinese-whispers?activeTab=versions)
6[![Downloads](http://img.shields.io/npm/dm/chinese-whispers.svg?style=flat-square)](https://npmjs.org/package/chinese-whispers)
7[![TypeScript](https://img.shields.io/badge/%3C%2F%3E-TypeScript-blue.svg)](https://www.typescriptlang.org/)
8[![Greenkeeper badge](https://badges.greenkeeper.io/huan/chinese-whispers.svg)](https://greenkeeper.io/)
9
10An Efficient Graph Clustering Algorithm for JavaScript/Node.js
11
12![Chinese Whispers](https://huan.github.io/chinese-whispers/images/chinese-whispers.gif)
13> Picture credit: http://www.wikihow.com/Play-Chinese-Whispers
14
15## ALGORITHM
16
17Chinese Whispers - _an Efficient Graph Clustering Algorithm and its Application to Natural Language Processing Problems_
18> Author: Chris Biemann, University of Leipzig, NLP-Dept. Leipzig, Germany
19
201. [Wikipedia](https://en.wikipedia.org/wiki/Chinese_Whispers_(clustering_method))
212. [Slide](https://www.lt.informatik.tu-darmstadt.de/fileadmin/user_upload/Group_LangTech/publications/pre-langtech/BiemannTextgraphs06.ppt) - 2006 TextGraphs 06, NYC, USA
223. [Paper](https://pdfs.semanticscholar.org/3e71/0251cb01ba6e1c0c735591776a212edc461f.pdf)
23
24## INSTALL
25
26```shell
27npm install chinese-whispers
28```
29
30## EXAMPLE
31
32Talk is cheap, show me the code!
33
34```ts
35import { ChineseWhispers } from 'chinese-whispers'
36
37function weightFunc(a, b) {
38 const dist = Math.abs(a - b)
39 return 1 / dist
40}
41
42const cw = new ChineseWhispers({
43 weightFunc,
44})
45
46const dataList = [
47 0, 1, 2,
48 10, 11, 12,
49 20, 21, 22,
50]
51
52const clusterIndicesList = cw.cluster(dataList)
53
54for (const i in clusterIndicesList) {
55 const clusterIndices = clusterIndicesList[i]
56 const cluster = clusterIndices.map(j => dataList[j])
57 console.log('Cluster[' + i + ']: ' + cluster)
58}
59// Cluster[0]: 0,1,2
60// Cluster[1]: 10,11,12
61// Cluster[2]: 20,21,22
62```
63
64Source code can be found at: <https://github.com/huan/chinese-whispers/blob/master/examples/demo.ts>
65
66API
67---
68
69The `ChineseWhispers` class is all you need to run the Chinese Whispers Algorithm.
70
71### 1. `constructor(options: ChineseWhisperOptions)`
72
73```ts
74interface ChineseWhispersOptions<T> {
75 weightFunc: WeightFunc<T>,
76 epochs?: number,
77 threshold?: number,
78}
79```
80
81* `options`
82 - `weightFunc`: a function that takes two data item, calculate the weight between them and return the value.
83 - `epochs`: how many epoches to run the algorithm, default 15.
84 - `threshold`: minimum weight required for a edge. default 0.
85
86```ts
87const nj = require('numjs') // a Javascript implementation of numpy in Python
88
89// calculate the distance between vectors
90function weightFunc(v1, v2) {
91 const njV1 = nj.array(v1)
92 const njV2 = nj.array(v2)
93 const l2 = njV1.subtract(njV2)
94 .pow(2)
95 .sum()
96 const dist = Math.sqrt(l2)
97 return 1 / dist
98}
99
100const cw = new ChineseWhispers({
101 weightFunc,
102})
103```
104
105### 2. `cluster(dataList): number[][]`
106
107Process `dataList` which is an array of data, returns the cluster results as an array, each array item is a cluster, and each cluster is an array which includes the indices of dataList that belongs to this cluster.
108
109```ts
110const clusterIndicesList = cw.cluster(dataList)
111
112for (const i in clusterIndicesList) {
113 // get the cluster, which stores the array index dataList
114 const clusterIndices = clusterIndicesList[i]
115 // map the array index of dataList to the actual dataList data
116 const cluster = clusterIndices.map(j => dataList[j])
117 console.log('Cluster[' + i + ']: ' + cluster)
118}
119```
120
121## INSPIRATION
122
123The code is heavily inspired by the following implementation:
124
125* [facenet chinese whispers(face cluster) in Python - zhly0](http://blog.csdn.net/liyuan123zhouhui/article/details/70312716)
126* [Chinese Whispers Graph Clustering in Python - Alex Loveless](http://alexloveless.co.uk/data/chinese-whispers-graph-clustering-in-python/)
127* [A Python implementation of Chris Biemann's algorithm for graph clustering](https://github.com/sanmayaj/ChineseWhispers)
128* [Implementation of the Chinese Whispers graph clustering algorithm in Java](https://github.com/uhh-lt/chinese-whispers)
129* [Chinese Whispers Graph Clustering Algorithm in Javascript](https://github.com/anvaka/ngraph.cw)
130
131## SEE ALSO
132
133* [The meaning and origin of the expression: Chinese whispers](http://www.phrases.org.uk/meanings/chinese-whispers.html)
134* [TensorFlow backed FaceNet implementation for Node.js](https://github.com/huan/node-facenet) - Face verification, face recognition and face clustering.
135
136## CHANGELOG
137
138### v0.3 master
139
140### v0.2 Aug 2018
141
1421. Upgrade TypeScript to 3.0
1432. DevOps to npm@next
1443. Upgrade rollup to 1.0
145
146### v0.1 Sep 2017
147
1481. `ChineseWhispers` class
1491. `cluster()` class method
1501. Unit test cases
1511. Travis CI & CD(publish to NPM automatically)
152
153## AUTHOR
154
155Huan LI \<zixia@zixia.net\> (http://linkedin.com/in/zixia)
156
157<a href="http://stackoverflow.com/users/1123955/zixia">
158 <img src="http://stackoverflow.com/users/flair/1123955.png" width="208" height="58" alt="profile for zixia at Stack Overflow, Q&amp;A for professional and enthusiast programmers" title="profile for zixia at Stack Overflow, Q&amp;A for professional and enthusiast programmers">
159</a>
160
161## COPYRIGHT & LICENSE
162
163* Code & Docs © 2017-2019 Huan LI \<zixia@zixia.net\>
164* Code released under the Apache-2.0 License
165* Docs released under Creative Commons