# Good Enough Recommendations (GER)
<img src="./assets/ger300x200.png" align="right" alt="GER logo" />

[![Build Status](https://travis-ci.org/grahamjenson/ger.svg?branch=master)](https://travis-ci.org/grahamjenson/ger)

Providing good recommendations can increase user engagement and add value that would otherwise not exist. The main reason many applications don't provide recommendations is the difficulty of either implementing a custom engine or integrating an existing one.

Good Enough Recommendations (**GER**) is a recommendation engine that is scalable, easy to use, and easy to integrate. GER's goal is to generate **good enough** recommendations for your application or product, so that you can provide value quickly and painlessly.

## Quick Start Guide

**Note: functions from GER return promises.**

Install `ger` with `npm`:

```bash
npm install ger
```

In your JavaScript code, first require `ger`:

```javascript
var g = require('ger')
```

Initialize an in-memory Event Store Manager (ESM) and create a Good Enough Recommender (GER):

```javascript
var esm = new g.MemESM()
var ger = new g.GER(esm);
```

The next step is to initialize a namespace, e.g. `movies`. *A namespace is a bucket of events that will not interfere with other buckets*.

```javascript
ger.initialize_namespace('movies')
```

Next add events to the namespace. *An event is a triple (person, action, thing)* e.g. `bob` `likes` `xmen`.

```javascript
ger.events([{
  namespace: 'movies',
  person: 'bob',
  action: 'likes',
  thing: 'xmen',
  expires_at: '2020-06-06'
}])
```

An event is used by GER in two ways:

1. to compare two people by looking at their history, e.g. `bob` and `alice` `like` similar movies, so `bob` and `alice` are similar
2. to provide recommendations from a person's history, e.g. `bob` `liked` a movie `alice` might like, so we can recommend that movie to `alice`

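As a rough illustration of the first point, one common way to compare two people by their history is the Jaccard index over the sets of things each has actioned. This is a minimal sketch and not necessarily GER's actual calculation; the function name `jaccardSimilarity` and the plain arrays used as histories are assumptions made for this example.

```javascript
// Hypothetical helper: Jaccard similarity between two histories,
// where each history is an array of thing names a person has liked.
function jaccardSimilarity(historyA, historyB) {
  const a = new Set(historyA);
  const b = new Set(historyB);
  if (a.size === 0 && b.size === 0) return 0; // no history, no evidence of similarity

  let shared = 0;
  for (const thing of a) {
    if (b.has(thing)) shared += 1;
  }
  // |A intersect B| / |A union B|
  return shared / (a.size + b.size - shared);
}

const bobLikes = ['xmen', 'avengers'];
const aliceLikes = ['xmen'];
console.log(jaccardSimilarity(bobLikes, aliceLikes)); // 0.5
```

The more things two people have in common relative to everything either has actioned, the closer the score gets to 1.
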
There are two caveats with using events as recommendations:

1. an action may be negative, e.g. `bob` `dislikes` `xmen`, which is **not** a recommendation
2. a recommendation ALWAYS expires, e.g. `bob` `likes` `xmen` occurred 15 years ago, so maybe he wouldn't recommend `xmen` now

So GER has the rule: *If an event has an expiry date it is treated as a recommendation until it expires*.

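The rule can be pictured as a simple predicate over an event. The sketch below is illustrative only: `isActiveRecommendation` is a hypothetical helper, not part of GER's API, and it assumes that a negative event such as `dislikes` is recorded without an `expires_at` so that it is never offered as a recommendation.

```javascript
// Hypothetical helper illustrating the rule above; not part of GER's API.
// An event counts as a recommendation only if it has an expiry date
// that is still in the future relative to `now`.
function isActiveRecommendation(event, now) {
  if (!event.expires_at) return false; // no expiry date: never a recommendation
  return new Date(event.expires_at).getTime() > now.getTime();
}

const likesXmen = { person: 'bob', action: 'likes', thing: 'xmen', expires_at: '2020-06-06' };
const dislikesXmen = { person: 'bob', action: 'dislikes', thing: 'xmen' };

console.log(isActiveRecommendation(likesXmen, new Date('2015-07-09')));    // true
console.log(isActiveRecommendation(likesXmen, new Date('2021-01-01')));    // false
console.log(isActiveRecommendation(dislikesXmen, new Date('2015-07-09'))); // false
```
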
GER can generate recommendations for a person, e.g. *what would alice like?*

```javascript
ger.recommendations_for_person('movies', 'alice', {actions: {likes: 1}})
```

and recommendations for a thing, e.g. *what would a person who likes xmen like?*

```javascript
ger.recommendations_for_thing('movies', 'xmen', {actions: {likes: 1}})
```

Let's put it all together:

```javascript
var g = require('ger')
var esm = new g.MemESM()
var ger = new g.GER(esm);

ger.initialize_namespace('movies')
.then( function() {
  return ger.events([
    {
      namespace: 'movies',
      person: 'bob',
      action: 'likes',
      thing: 'xmen',
      expires_at: '2020-06-06'
    },
    {
      namespace: 'movies',
      person: 'bob',
      action: 'likes',
      thing: 'avengers',
      expires_at: '2020-06-06'
    },
    {
      namespace: 'movies',
      person: 'alice',
      action: 'likes',
      thing: 'xmen',
      expires_at: '2020-06-06'
    },
  ])
})
.then( function() {
  // What things might alice like?
  return ger.recommendations_for_person('movies', 'alice', {actions: {likes: 1}})
})
.then( function(recommendations) {
  console.log("\nRecommendations For 'alice'")
  console.log(JSON.stringify(recommendations,null,2))
})
.then( function() {
  // What things are similar to xmen?
  return ger.recommendations_for_thing('movies', 'xmen', {actions: {likes: 1}})
})
.then( function(recommendations) {
  console.log("\nRecommendations Like 'xmen'")
  console.log(JSON.stringify(recommendations,null,2))
})
```

This will output:

```
Recommendations For 'alice'
{
  "recommendations": [
    {
      "thing": "xmen",
      "weight": 1.5,
      "last_actioned_at": "2015-07-09T14:33:37+01:00",
      "last_expires_at": "2020-06-06T01:00:00+01:00",
      "people": [
        "alice",
        "bob"
      ]
    },
    {
      "thing": "avengers",
      "weight": 0.5,
      "last_actioned_at": "2015-07-09T14:33:37+01:00",
      "last_expires_at": "2020-06-06T01:00:00+01:00",
      "people": [
        "bob"
      ]
    }
  ],
  "neighbourhood": {
    "bob": 0.5,
    "alice": 1
  },
  "confidence": 0.0007147696406599602
}

Recommendations Like 'xmen'
{
  "recommendations": [
    {
      "thing": "avengers",
      "weight": 0.5,
      "last_actioned_at": "2015-07-09T14:33:37+01:00",
      "last_expires_at": "2020-06-06T01:00:00+01:00",
      "people": [
        "bob"
      ]
    }
  ],
  "neighbourhood": {
    "avengers": 0.5
  },
  "confidence": 0.0007923350883032776
}
```

In the recommendations for `alice`, `xmen` is the highest rated recommendation because alice has `liked` it before, so she probably likes it now. You can filter out recommendations that have been actioned before using the `filter_previous_actions` configuration key described below.

*The code for this example is in the `./examples/basic_recommendations_exmaple.js` script.*

## Configuration

GER lets you set some values to customize recommendations generation using a `configuration`. Below is a description of all the configurable keys and their defaults:

| Key | Default |
|--- |--- |
| `actions` | `{}` |
| `minimum_history_required` | `0` |
| `neighbourhood_search_size` | `100` |
| `similarity_search_size` | `100` |
| `neighbourhood_size` | `25` |
| `recommendations_per_neighbour` | `10` |
| `filter_previous_actions` | `[]` |
| `event_decay_rate` | `1` |
| `time_until_expiry` | `0` |
| `current_datetime` | `now()` |

1. `actions` is an object where the keys are action names and the values are action weights that represent the importance of each action.
2. `minimum_history_required` is the minimum number of events a person must have before recommendations are generated. It is useful for stopping low-confidence recommendations from being generated.
3. `neighbourhood_search_size` is the number of past events that are used to search for the neighbourhood. This value has the highest impact on performance, but past a certain point it has no (or a negative) impact on recommendations.
4. `similarity_search_size` is the number of events in the history used to calculate the similarity between things or people.
5. `neighbourhood_size` is the number of similar people (or things) that are searched for. This value has a significant performance impact, and increasing it past a certain point yields diminishing returns.
6. `recommendations_per_neighbour` is the number of recommendations each similar person can offer. This prevents a single highly similar person from providing all the recommendations.
7. `filter_previous_actions` removes recommendations that the person being recommended to already has in their history. For example, if a person has already liked `xmen` and `filter_previous_actions` is `["liked"]`, they will not be recommended `xmen`.
8. `event_decay_rate` is the rate at which an event's weight decays over time: `weight * event_decay_rate ^ (- days since event)`.
9. `time_until_expiry` is the time (in seconds) from `now()` within which recommendations that expire are removed. For example, recommendations on a website might only need to be valid for minutes, whereas in an email you might want recommendations that are valid for days.
10. `current_datetime` defines a "simulated" current time; events performed after `current_datetime` are not used when generating recommendations.

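To make the `event_decay_rate` formula concrete, here is a small worked sketch. It only evaluates the expression given in the list; `decayedWeight` is a hypothetical name used for this example and is not a GER function.

```javascript
// Applies the decay formula from the list above:
//   weight * event_decay_rate ^ (-days since event)
// `decayedWeight` is a hypothetical name used only for this sketch.
function decayedWeight(weight, eventDecayRate, daysSinceEvent) {
  return weight * Math.pow(eventDecayRate, -daysSinceEvent);
}

console.log(decayedWeight(1, 1, 30));    // 1 (the default rate of 1 means no decay)
console.log(decayedWeight(1, 2, 3));     // 0.125 (the weight halves every day)
console.log(decayedWeight(5, 1.05, 14)); // roughly 2.53
```

A rate slightly above 1, such as `1.05`, makes older events count for less without discarding them entirely.
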
For example, generating recommendations with a configuration from GER:

```javascript
ger.recommendations_for_person('movies', 'alice', {
  "actions": {
    "like": 1,
    "watch": 5
  },
  "minimum_history_required": 5,
  "similarity_search_size": 50,
  "neighbourhood_size": 20,
  "recommendations_per_neighbour": 10,
  "filter_previous_actions": ["watch"],
  "event_decay_rate": 1.05,
  "time_until_expiry": 180
})
```

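The configuration above leaves out `neighbourhood_search_size` and `current_datetime`. Assuming omitted keys fall back to the defaults in the table (how GER merges them internally is not shown here, so treat this purely as an illustration), the effective configuration can be pictured as a shallow merge. `DEFAULT_CONFIGURATION` and `effectiveConfiguration` are hypothetical names.

```javascript
// Defaults copied from the table above. `current_datetime` defaults to
// "now", so it is filled in at call time rather than stored here.
const DEFAULT_CONFIGURATION = {
  actions: {},
  minimum_history_required: 0,
  neighbourhood_search_size: 100,
  similarity_search_size: 100,
  neighbourhood_size: 25,
  recommendations_per_neighbour: 10,
  filter_previous_actions: [],
  event_decay_rate: 1,
  time_until_expiry: 0
};

// Hypothetical helper: keys supplied by the caller win over the defaults.
function effectiveConfiguration(configuration) {
  return Object.assign({ current_datetime: new Date() }, DEFAULT_CONFIGURATION, configuration);
}

const config = effectiveConfiguration({ neighbourhood_size: 20, event_decay_rate: 1.05 });
console.log(config.neighbourhood_size);        // 20
console.log(config.neighbourhood_search_size); // 100
```
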
## Technology

GER is implemented in Coffee-Script on top of Node.js ([here](http://www.maori.geek.nz/post/why_should_you_use_coffeescript_instead_of_javascript) are my reasons for using Coffee-Script). The core logic is implemented in an abstraction called an Event Store Manager (**ESM**); this is where persistence and many of the calculations occur.

Currently there is an in-memory ESM and a PostgreSQL ESM. There is also a RethinkDB ESM in the works being implemented by the awesome [linuxlich](https://github.com/thelinuxlich/ger).

## Event Store Manager

If you ask

> Why is GER not available on X?

where X is some database or store (e.g. Redis, Mongo, Cassandra ...), the way to make it available on that system is to implement your own ESM for it.

The API for an ESM is:

*Initialization*:

1. `esm = new ESM(options)`, where `options` is used to set up connections and the like.
2. `initialize(namespace)` creates a `namespace` for events.
3. `destroy(namespace)` destroys all of the ESM's resources in the namespace.
4. `exists(namespace)` checks whether the namespace exists.
5. `list_namespaces` returns a list of namespaces.

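As a starting point for a custom ESM, the initialization methods could be sketched as follows. The method names come from the list above, but the constructor options, the return values, and the promise-based signatures are assumptions; a real ESM would talk to its backing store instead of a `Map`.

```javascript
// Sketch of the initialization part of a custom ESM, backed by a Map.
// Signatures and return values are assumptions made for illustration.
class SketchESM {
  constructor(options = {}) {
    this.options = options;      // e.g. connection settings for a real store
    this.namespaces = new Map(); // namespace name -> array of events
  }

  initialize(namespace) {
    if (!this.namespaces.has(namespace)) this.namespaces.set(namespace, []);
    return Promise.resolve();
  }

  destroy(namespace) {
    this.namespaces.delete(namespace);
    return Promise.resolve();
  }

  exists(namespace) {
    return Promise.resolve(this.namespaces.has(namespace));
  }

  list_namespaces() {
    return Promise.resolve(Array.from(this.namespaces.keys()));
  }
}

const sketchEsm = new SketchESM();
sketchEsm.initialize('movies')
  .then(() => sketchEsm.exists('movies'))
  .then((found) => console.log(found)); // true
```
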
*Events*:

1. `add_events`
2. `add_event`
3. `find_events`
4. `delete_events`

*Thing Recommendations*:

1. `thing_neighbourhood`
2. `calculate_similarities_from_thing`

*Person Recommendations*:

1. `person_neighbourhood`
2. `calculate_similarities_from_person`
3. `filter_things_by_previous_actions`
4. `recent_recommendations_by_people`

*Compacting*:

1. `pre_compact`
2. `compact_people`
3. `compact_things`
4. `post_compact`

## Additional Reading

Posts about (or related to) GER:

1. Demo Movie Recommendations Site: [Yeah, Nah](http://yeahnah.maori.geek.nz/)
2. Overall description and motivation of GER: [Good Enough Recommendations with GER](http://maori.geek.nz/post/good_enough_recomendations_with_ger)
3. How GER works: [GER's Anatomy: How to Generate Good Enough Recommendations](http://www.maori.geek.nz/post/how_ger_generates_recommendations_the_anatomy_of_a_recommendations_engine)
4. Testing frameworks being used to test GER: [Testing Javascript with Mocha, Chai, and Sinon](http://www.maori.geek.nz/post/introduction_to_testing_node_js_with_mocha_chai_and_sinon)
5. [Postgres Upsert (Update or Insert) in GER using Knex.js](http://www.maori.geek.nz/post/postgres_upsert_update_or_insert_in_ger_using_knex_js)
6. [List of Recommender Systems](https://github.com/grahamjenson/list_of_recommender_systems)

## Changelog

2015-07-09 - Updated readme and fixed basicmem ESM bug.

2015-02-01 - Fixed bug with set_namespace and added tests.

2015-01-30 - Added a few helper methods for namespaces, and removed caches to be truly stateless.

2014-12-30 - Added find and delete events methods.

2014-12-22 - Added exists to check if namespace is initialized. Also changed some indexes in rethinkdb, and changed some semantics around initialize.

2014-12-22 - Added Rethink DB Event Store Manager.

2014-12-09 - Added more explanation to the returned recommendations so they can be reasoned about externally.

2014-12-04 - Changed ESM API to be more understandable and also updated README.

2014-11-27 - Started returning the last actioned at date with recommendations.

2014-11-25 - Added better way of selecting recommendations from similar people.

2014-11-12 - Added better heuristic to select related people, so fewer related people need to be selected to find good values.