WEBVTT
1
00:00:00.004 --> 00:00:02.001
- [Instructor] We've already seen
2
00:00:02.001 --> 00:00:04.005
how to calculate a conditional probability
3
00:00:04.005 --> 00:00:07.001
using the data in our Wordle spreadsheet.
4
00:00:07.001 --> 00:00:10.006
Specifically, we calculated the probability
5
00:00:10.006 --> 00:00:18.007
of having two Os given that we don't have an A, I, E, or U.
6
00:00:18.007 --> 00:00:21.008
But what if we had to calculate the other way?
7
00:00:21.008 --> 00:00:26.005
The probability of no A, I, E, or U
8
00:00:26.005 --> 00:00:30.006
given two Os in our word.
9
00:00:30.006 --> 00:00:33.006
A natural first reaction would be to simply go back
10
00:00:33.006 --> 00:00:34.008
to the spreadsheet.
11
00:00:34.008 --> 00:00:38.004
After all, it wasn't too difficult to do the calculation.
12
00:00:38.004 --> 00:00:40.005
However, in this instance,
13
00:00:40.005 --> 00:00:42.008
we have complete information
14
00:00:42.008 --> 00:00:47.008
and often you don't have complete information.
15
00:00:47.008 --> 00:00:50.007
That's where the famous Bayes's rule
16
00:00:50.007 --> 00:00:53.000
or Bayes's theorem comes in.
17
00:00:53.000 --> 00:00:55.005
It allows you to solve for the probability
18
00:00:55.005 --> 00:01:00.003
of A given B if you know the other three values.
19
00:01:00.003 --> 00:01:03.006
The probability of A, the probability of B
20
00:01:03.006 --> 00:01:07.007
and the probability of B given A.
21
00:01:07.007 --> 00:01:09.006
So here's what it looks like.
22
00:01:09.006 --> 00:01:13.006
We're solving for the probability of A given B
23
00:01:13.006 --> 00:01:15.005
over there on the left,
24
00:01:15.005 --> 00:01:18.000
and on the right side of the equation
25
00:01:18.000 --> 00:01:21.000
is the probability of B given A
26
00:01:21.000 --> 00:01:23.008
times the probability of A
27
00:01:23.008 --> 00:01:27.004
all divided by the probability of B.
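
NOTE
Written out, the equation described in the cues above is the standard statement of Bayes's theorem. The symbols A and B are the ones used in the narration; the notation itself is an editorial rendering, not the slide's exact layout.
```latex
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}
```
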
28
00:01:27.004 --> 00:01:28.009
A fair question at this point
29
00:01:28.009 --> 00:01:31.002
is why is this so helpful?
30
00:01:31.002 --> 00:01:32.008
Now, we've already mentioned
31
00:01:32.008 --> 00:01:36.003
that you might know the probability of B given A
32
00:01:36.003 --> 00:01:39.002
but not be able to calculate the probability
33
00:01:39.002 --> 00:01:42.001
of A given B directly.
34
00:01:42.001 --> 00:01:46.006
What might be unclear is how often this is an issue.
35
00:01:46.006 --> 00:01:49.001
So the second reason listed here
36
00:01:49.001 --> 00:01:51.007
is perhaps the more important one.
37
00:01:51.007 --> 00:01:54.006
But it might seem abstract at first.
38
00:01:54.006 --> 00:01:56.007
The other reason is that we might want
39
00:01:56.007 --> 00:02:01.004
to generate what is called a posterior distribution
40
00:02:01.004 --> 00:02:03.009
from a prior distribution
41
00:02:03.009 --> 00:02:07.000
combined with new information.
42
00:02:07.000 --> 00:02:10.005
As Allen Downey, author of "Think Bayes," puts it,
43
00:02:10.005 --> 00:02:12.009
"This formula, Bayes's theorem,
44
00:02:12.009 --> 00:02:17.004
is a recipe for updating beliefs in light of new data."
45
00:02:17.004 --> 00:02:19.005
That's really what it's all about.
46
00:02:19.005 --> 00:02:22.009
Constantly ingesting new information
47
00:02:22.009 --> 00:02:26.003
and updating all of the probabilities.
48
00:02:26.003 --> 00:02:28.002
I especially like a phrase
49
00:02:28.002 --> 00:02:33.000
from John Kruschke's book, "Doing Bayesian Data Analysis."
50
00:02:33.000 --> 00:02:36.006
"Bayesian inference is the reallocation
51
00:02:36.006 --> 00:02:39.009
of credibility across possibilities."
52
00:02:39.009 --> 00:02:43.006
In the context of our ongoing example,
53
00:02:43.006 --> 00:02:45.009
you wouldn't likely make a word
54
00:02:45.009 --> 00:02:50.007
with two Os your first guess in Wordle,
55
00:02:50.007 --> 00:02:56.000
but if you learned that you have no A, I, E, or U,
56
00:02:56.000 --> 00:02:59.001
you might make it your second guess.
57
00:02:59.001 --> 00:03:01.006
Bayesian models do the same kind of thing
58
00:03:01.006 --> 00:03:03.006
but on a grander scale
59
00:03:03.006 --> 00:03:05.008
to predict political elections
60
00:03:05.008 --> 00:03:08.002
or the locations of missing submarines
61
00:03:08.002 --> 00:03:11.003
and many other use cases.
62
00:03:11.003 --> 00:03:15.005
So let's resolve those fractions we had.
63
00:03:15.005 --> 00:03:17.003
We've got our three values
64
00:03:17.003 --> 00:03:19.004
to plug into our formula.
65
00:03:19.004 --> 00:03:22.008
0.022,
66
00:03:22.008 --> 00:03:28.003
0.082 and 0.171.
67
00:03:28.003 --> 00:03:30.009
And here are the same numbers inserted
68
00:03:30.009 --> 00:03:35.002
into the right-hand side of our equation.
69
00:03:35.002 --> 00:03:39.009
And the answer is 63.7%.
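
NOTE
A minimal sketch of the arithmetic behind the 63.7% figure, using the three values read aloud. The narration does not say which value fills which slot, so the assignment below is an assumption: 0.022 has to be the denominator for the result to match, and swapping the two numerator values leaves the product unchanged. The variable names are illustrative.
```python
# Values from the narration; slot assignment is assumed (see note above).
p_b = 0.022          # P(B): word has two Os (denominator)
p_a = 0.082          # assumed P(A): word has no A, I, E, or U
p_b_given_a = 0.171  # assumed P(B|A): two Os given no A, I, E, or U
# Bayes's theorem: P(A|B) = P(B|A) * P(A) / P(B)
p_a_given_b = p_b_given_a * p_a / p_b
print(f"P(A|B) = {p_a_given_b:.1%}")
```
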
70
00:03:39.009 --> 00:03:41.006
Let's pause for a moment
71
00:03:41.006 --> 00:03:44.008
and reflect on whether that makes sense.
72
00:03:44.008 --> 00:03:47.009
Well, we know that words with two Os
73
00:03:47.009 --> 00:03:50.004
are not incredibly rare
74
00:03:50.004 --> 00:03:53.007
but they aren't incredibly common either.
75
00:03:53.007 --> 00:03:56.006
However, with only five letters,
76
00:03:56.006 --> 00:03:59.007
if two of those letters are O,
77
00:03:59.007 --> 00:04:02.001
that doesn't leave a lot of room
78
00:04:02.001 --> 00:04:03.008
for other vowels.
79
00:04:03.008 --> 00:04:08.000
So a much higher percentage makes sense.
80
00:04:08.000 --> 00:04:11.008
In fact, if no examples come to mind,
81
00:04:11.008 --> 00:04:13.001
our little spreadsheet
82
00:04:13.001 --> 00:04:15.009
has just three examples and they all start
83
00:04:15.009 --> 00:04:17.005
with the letter A.
84
00:04:17.005 --> 00:04:21.007
Afoot, aloof, and agogo.
85
00:04:21.007 --> 00:04:24.008
So there you have it, the famous Bayes's rule.
86
00:04:24.008 --> 00:04:27.002
Many discussions of Bayes start
87
00:04:27.002 --> 00:04:30.002
with the rule on the very first page.
88
00:04:30.002 --> 00:04:33.002
But I hope that the discussion up to this point
89
00:04:33.002 --> 00:04:35.007
has put it into context.
90
00:04:35.007 --> 00:04:38.004
And there'll be much more to say about Bayes
91
00:04:38.004 --> 00:04:40.001
in the second half of the course.