# crawler-hbase
A library for interacting with the crawler tables stored in HBase.
crawler-hbase exports two modules: Client, a class that constructs an HBase client, and Utils, an object containing helper functions.

## Class Client
```javascript
var HbaseClient = require("crawler-hbase").Client;
// Pass the host:port of a running HBase Thrift server.
var client = new HbaseClient("0.0.0.0:9090");
```

#### CrawlHbaseClient(dbUrl)
Constructs the client using the provided HBase dbUrl. An HBase Thrift server is assumed to be running at dbUrl.

#### storeRawCrawl(crawl)
Stores a raw crawl in the raw_crawls table.

#### getRows(startKey, endKey, limit, descending, tableName, filterString)
The generic get function on which almost all of the other, more specific getters are built.

#### getLatestRawCrawl()
Returns the latest raw crawl.

#### getRawCrawlByKey(key)
Gets a raw crawl by its key.

#### storeProcessedCrawl(newCrawl, oldCrawl)
Stores newCrawl. oldCrawl is used to calculate the changes that happened between the two crawls.
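
To illustrate the kind of comparison this implies, here is a minimal sketch of diffing two crawls. It is not the library's implementation: the crawl shape (a plain object mapping a node's public key to a version string) and the helper name diffCrawls are assumptions made for this example only.

```javascript
// Hypothetical helper, NOT part of crawler-hbase.
// Assumes each crawl is a plain object: { <pubKey>: <version string>, ... }.
function diffCrawls(newCrawl, oldCrawl) {
  var has = Object.prototype.hasOwnProperty;
  var changes = { added: [], removed: [], changed: [] };

  Object.keys(newCrawl).forEach(function (pubKey) {
    if (!has.call(oldCrawl, pubKey)) {
      changes.added.push(pubKey); // node appears only in the new crawl
    } else if (newCrawl[pubKey] !== oldCrawl[pubKey]) {
      changes.changed.push(pubKey); // node is in both, but its version differs
    }
  });

  Object.keys(oldCrawl).forEach(function (pubKey) {
    if (!has.call(newCrawl, pubKey)) {
      changes.removed.push(pubKey); // node appears only in the old crawl
    }
  });

  return changes;
}

var exampleOldCrawl = { nodeA: "1.0.0", nodeB: "1.0.0" };
var exampleNewCrawl = { nodeA: "1.0.1", nodeC: "1.0.1" };
console.log(diffCrawls(exampleNewCrawl, exampleOldCrawl));
// { added: [ 'nodeC' ], removed: [ 'nodeB' ], changed: [ 'nodeA' ] }
```

The argument order mirrors storeProcessedCrawl(newCrawl, oldCrawl), so the result reads as "what changed going from the old crawl to the new one".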

#### getCrawlInfo(crawlKey)
Gets info about the crawl identified by crawlKey.

#### getNodeHistory(pubKey)
Gets an array of all the different versions under which the given node appeared in crawls.

#### getCrawlNodeStats(crawlKey)
Gets stats about the nodes in the given crawl.

#### getConnections(crawlKey, pubKey, type)
Gets links between nodes. type is either 'in' or 'out', selecting incoming or outgoing connections respectively.
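
The meaning of the type argument can be sketched on an in-memory list of links. Everything in this example is hypothetical: the { from, to } link shape and the filterConnections helper are assumptions for illustration, and the real method reads from HBase rather than from an array.

```javascript
// Hypothetical helper, NOT part of crawler-hbase.
// Assumes each link is { from: <pubKey>, to: <pubKey> }.
function filterConnections(links, pubKey, type) {
  if (type !== "in" && type !== "out") {
    throw new Error("type must be 'in' or 'out'");
  }
  return links.filter(function (link) {
    // 'in'  keeps links that point at the node,
    // 'out' keeps links that start from the node.
    return type === "in" ? link.to === pubKey : link.from === pubKey;
  });
}

var exampleLinks = [
  { from: "nodeA", to: "nodeB" },
  { from: "nodeB", to: "nodeC" },
  { from: "nodeC", to: "nodeA" }
];
console.log(filterConnections(exampleLinks, "nodeA", "in"));  // the nodeC -> nodeA link
console.log(filterConnections(exampleLinks, "nodeA", "out")); // the nodeA -> nodeB link
```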

#### getAllConnections(crawlKey)
Gets all links for the given crawl.

## Utils
Provides helper methods for working with the keys of the HBase tables, which encode a lot of hidden information.

#### keyToStart(key)
Gets the crawl's start time from the crawl's key.

#### keyToEnd(key)
Gets the crawl's end time from the crawl's key.
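
The actual key layout is not described here, so the following is only a sketch of how start and end times could be packed into a key and read back. It assumes a made-up format, "<startMillis>_<endMillis>", and all three helper names are hypothetical; the real Utils functions may parse a different layout.

```javascript
// Hypothetical key layout, for illustration only: "<startMillis>_<endMillis>".
// The actual crawler-hbase key format may differ.
function exampleMakeKey(start, end) {
  return start.getTime() + "_" + end.getTime();
}

function exampleKeyToStart(key) {
  return new Date(Number(key.split("_")[0])); // first field: start time
}

function exampleKeyToEnd(key) {
  return new Date(Number(key.split("_")[1])); // second field: end time
}

var exampleKey = exampleMakeKey(
  new Date(Date.UTC(2015, 0, 1, 12, 0, 0)),
  new Date(Date.UTC(2015, 0, 1, 12, 5, 0))
);
console.log(exampleKeyToStart(exampleKey).toISOString()); // 2015-01-01T12:00:00.000Z
console.log(exampleKeyToEnd(exampleKey).toISOString());   // 2015-01-01T12:05:00.000Z
```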