- [Instructor] Let's get some terminology out of the way. All of the recommender systems we've looked at so far are what's called top-N recommender systems. That means their job is to produce a finite list of the best things to present to a given person. Here's a shot of my music recommendations on Amazon, and you'll see it's made up of 20 pages of five results per page, so this is a top-N recommender where N is 100.

As you'll soon see, a lot of recommender system research tends to focus on the problem of predicting a user's ratings for everything they haven't rated already, good or bad. But that's very different from what recommender systems need to do in the real world. Usually users don't care about your ability to predict how they'll rate some new item. That's why the ratings you see in this widget are the aggregate ratings from other users, not the ratings the system thinks you'll give them. Customers don't want to see your ability to predict their rating for an item; they just want to see things they're likely to love.

Throughout this course, it's important to remember that ultimately our goal is to put the best content we can find in front of users in the form of a top-N list like this one. Our success depends on our ability to find the best top recommendations for people, so it makes sense to focus on finding things people will love, not on our ability to predict the items people will hate. It sounds obvious, but this point is missed by a lot of researchers. Those top five or ten recommendations for each user are what really matter.

Here's one way a top-N recommender system might work. There are many ways to do it, but generally you start with some data store representing the individual interests of each user, for example, their ratings for movies they've seen, or implicit ratings such as the stuff they've bought in the past.
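To give you a rough idea of what that per-user interest data might look like, here's a minimal sketch in Python. The field names, the user ID, and the one-to-five rating scale are just assumptions for illustration, not anything a particular system prescribes.

```python
# A hypothetical per-user interest record: explicit ratings plus implicit
# signals such as past purchases. In production this would live in a
# distributed data store keyed by user ID, not in a Python dict.
user_interests = {
    "user_42": {
        "explicit_ratings": {          # item ID -> rating on a 1-5 scale
            "star_trek_tos": 5.0,
            "galaxy_quest": 4.0,
        },
        "implicit_signals": {          # item ID -> implicit strength (e.g., purchased = 1.0)
            "star_trek_tng_blu_ray": 1.0,
        },
    }
}
```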
In practice, this interest data usually lives in a big, distributed NoSQL data store like Cassandra, MongoDB, or Memcached, because it has to vend lots of data with very simple queries. Ideally, this interest data is normalized using techniques such as mean centering or z-scores to ensure that the data is comparable between users, but in the real world your data is often too sparse to normalize effectively.

The first step is to generate recommendation candidates: items we think might be interesting to the user based on their past behavior. So the candidate generation phase might take all of the items the user indicated interest in before and consult another data store of items that are similar to those items, based on aggregate behavior.

Let's take an example. Say you're making recommendations for me. You might consult my database of individual interests and see that I've liked Star Trek stuff in the past. Based on everyone else's behavior, you know that people who like Star Trek also like Star Wars, so based on my interest in Star Trek, I might get some recommendation candidates that include Star Wars stuff. In the process of building up those candidates, I might assign a score to each one based on how I rated the items it came from and on how strong the similarity is between those items and the candidate. I might even filter out candidates at this stage if their scores aren't high enough.

Next we move to candidate ranking. Many candidates will appear more than once and need to be combined in some way, maybe boosting their score in the process, since they keep coming up repeatedly. After that, it can just be a matter of sorting the resulting recommendation candidates by score to get our first cut at a top-N list of recommendations, although much more complicated approaches exist, such as learning to rank, where machine learning is employed to find the optimal ranking of candidates at this stage.
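Here's a minimal sketch of the candidate generation and first-cut ranking just described. The item IDs, the similarity values, the scoring rule (the user's rating times the similarity, summed across duplicate candidates), and the minimum-score cutoff are all assumptions chosen for illustration, not a definitive implementation.

```python
from collections import defaultdict

# Hypothetical inputs: the user's explicit ratings and a precomputed
# item-to-item similarity table built offline from aggregate behavior.
user_ratings = {"star_trek": 5.0, "galaxy_quest": 4.0}
item_similarities = {
    "star_trek": [("star_wars", 0.9), ("battlestar_galactica", 0.7)],
    "galaxy_quest": [("star_wars", 0.6), ("spaceballs", 0.8)],
}

def generate_candidates(user_ratings, item_similarities, min_score=1.0):
    """Score each candidate as (user's rating) * (similarity), summing
    across duplicates so items that keep coming up get boosted."""
    scores = defaultdict(float)
    for item, rating in user_ratings.items():
        for candidate, similarity in item_similarities.get(item, []):
            scores[candidate] += rating * similarity
    # Drop weak candidates, then sort by score for a first-cut ranking.
    ranked = [(c, s) for c, s in scores.items() if s >= min_score]
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked

# star_wars comes out on top because it's reachable from both liked items.
print(generate_candidates(user_ratings, item_similarities))
```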
This ranking stage might also have access to more information about the recommendation candidates, such as average review scores, which can be used to boost results for highly rated or popular items, for example.

Some filtering will be required before presenting the final sorted list of recommendation candidates to the user. This filtering stage is where we might eliminate recommendations for items the user has already rated, since we don't want to recommend things the user has already seen. We might also apply a stop list here to remove items that are potentially offensive to the user, or remove items that fall below some minimum quality score or minimum rating threshold. It's also where we apply the N in top-N recommenders and cut things off if we have more results than we need.

The output of the filtering stage is then handed off to your display layer, where a pretty widget of product recommendations is presented to the user. Generally speaking, the candidate generation, ranking, and filtering will live inside some distributed recommendation web service that your web front end talks to in the process of rendering a page for a specific user.

This diagram is a simplified version of what we call item-based collaborative filtering, and it's the same algorithm Amazon published in 2003. You can see it's not really that complicated from an architecture standpoint. The hard part is building up that database of item similarities; that's where the magic happens. Again, this is just one way to do it. We're going to explore many, many others throughout the course.
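Before moving on, here's a minimal sketch of the filtering stage described above, applied to the hypothetical ranked candidates from the earlier sketch. The function name, the stop list, the minimum-score threshold, and the cutoff value are all assumptions for illustration.

```python
def filter_candidates(ranked_candidates, already_rated, stop_list,
                      min_score=0.0, n=10):
    """Apply the filtering stage: drop items the user has already rated,
    drop stop-listed or low-scoring items, then cut the list off at N."""
    final = []
    for item, score in ranked_candidates:
        if item in already_rated:    # don't recommend things they've already seen
            continue
        if item in stop_list:        # remove potentially offensive items
            continue
        if score < min_score:        # enforce a minimum quality threshold
            continue
        final.append((item, score))
        if len(final) == n:          # this is the N in top-N
            break
    return final

# Usage with the hypothetical ranked candidates generated earlier:
ranked = [("star_wars", 6.9), ("battlestar_galactica", 3.5), ("spaceballs", 3.2)]
print(filter_candidates(ranked, already_rated={"star_trek"}, stop_list=set(), n=2))
```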
Another architecture that's popular with researchers, and that we'll see in this course, is to build up a database ahead of time of predicted ratings of every item by every user. The candidate generation phase is then just retrieving all of the rating predictions for a given user across every item, and ranking is just a matter of sorting them. This requires you to look at every single item in your catalog for every single user, however, which isn't very efficient at runtime. In the previous slide, we only started with items the user actually liked and worked from there, instead of looking at every item that exists.

The reason we see this sort of architecture is that people like to measure themselves on how accurately they can predict ratings, good or bad. But, as we'll see, that's not really the right thing to focus on in the real world. If you have a small catalog of items to recommend, however, this approach isn't entirely unreasonable.
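For contrast, here's a sketch of that prediction-based style of architecture, assuming a precomputed table of predicted ratings for every user and item. The table contents, the user ID, and the function name are made up for illustration; the point is simply that candidate generation becomes a lookup and ranking becomes a sort.

```python
# Hypothetical precomputed table: user ID -> {item ID: predicted rating}.
predicted_ratings = {
    "user_42": {"star_wars": 4.8, "spaceballs": 3.9, "gigli": 1.2},
}

def top_n_from_predictions(user_id, predictions, n=10):
    """Rank every item in the catalog by its predicted rating for this user.
    Simple, but it touches every single item per user, which is costly at scale."""
    items = predictions.get(user_id, {})
    return sorted(items.items(), key=lambda pair: pair[1], reverse=True)[:n]

print(top_n_from_predictions("user_42", predicted_ratings, n=2))
# -> [('star_wars', 4.8), ('spaceballs', 3.9)]
```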