# Keyword Extractor

A simple [NPM package](https://npmjs.org/package/keyword-extractor) for extracting _keywords_ from a string by
removing stopwords.

## Installation

```sh
$ npm install keyword-extractor
```

## Running tests

To run the test suite, first install the development dependencies by running the following command within the package's
directory.

```sh
$ npm install
```

To execute the package's tests, run:

```sh
$ make test
```

## Usage of the Module

```javascript
// Include the Keyword Extractor
var keyword_extractor = require("keyword-extractor");

// Opening sentence of a NY Times article at
// http://www.nytimes.com/2013/09/10/world/middleeast/surprise-russian-proposal-catches-obama-between-putin-and-house-republicans.html
var sentence = "President Obama woke up Monday facing a Congressional defeat that many in both parties believed could hobble his presidency.";

// Extract the keywords
var extraction_result = keyword_extractor.extract(sentence, {
    language: "english",
    remove_digits: true,
    return_changed_case: true
});

/*
  extraction_result is:

  [
    "president",
    "obama",
    "woke",
    "monday",
    "facing",
    "congressional",
    "defeat",
    "parties",
    "believed",
    "hobble",
    "presidency"
  ]
*/
```

### Options Parameters

The second argument of the _extract_ method is an object of configuration and processing settings for the extraction.

Parameter Name | Description | Permitted Values
---------------|-------------|-----------------
language | The stopwords list to use. | _english_ or _spanish_
remove_digits | Whether digits are removed from the results. | _true_ or _false_
return_changed_case | The case of the extracted keywords. When _true_, the results are returned in lower case; when _false_, they keep their original case. | _true_ or _false_
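
To make the effect of `language` and `return_changed_case` concrete, here is a minimal, dependency-free sketch of stopword removal. It is not the package's implementation: `extractSketch` and the tiny `STOPWORDS` lists are hypothetical stand-ins, and the tokenization rule (split on whitespace, trim punctuation) is an assumption made for illustration.

```javascript
// Hypothetical, deliberately tiny stopword lists; the real lists are far larger.
const STOPWORDS = {
  english: new Set(["a", "the", "of", "in", "that", "up", "his"]),
  spanish: new Set(["el", "la", "de", "en", "que", "y", "un", "una"]),
};

// Illustrative stand-in for extract(); an assumption, not the package's code.
function extractSketch(text, options = {}) {
  const { language = "english", return_changed_case = false } = options;
  const stopwords = STOPWORDS[language];
  if (!stopwords) {
    throw new Error("Unsupported language: " + language);
  }
  return text
    .split(/\s+/)
    // Trim leading and trailing punctuation from each token.
    .map((word) => word.replace(/^[^\p{L}\p{N}]+|[^\p{L}\p{N}]+$/gu, ""))
    .filter((word) => word.length > 0)
    // Stopwords are matched case-insensitively.
    .filter((word) => !stopwords.has(word.toLowerCase()))
    // Lower-case the survivors only when requested.
    .map((word) => (return_changed_case ? word.toLowerCase() : word));
}

const sample = "The President of the Senate";
console.log(extractSketch(sample, { language: "english", return_changed_case: true }));
// [ 'president', 'senate' ]
console.log(extractSketch(sample, { language: "english", return_changed_case: false }));
// [ 'President', 'Senate' ]
```

With these toy lists, `extractSketch("El gato de la casa", { language: "spanish" })` keeps only `gato` and `casa`. Results from the package itself will differ wherever its full stopword lists contain words this sketch omits.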

## Credits

The initial stopword lists are taken from the following sources:

- [English](http://jmlr.org/papers/volume5/lewis04a/a11-smart-stop-list/english.stop)
- [Spanish](https://stop-words.googlecode.com/svn/trunk/stop-words/stop-words/stop-words-spanish.txt)