1 00:00:00,000 --> 00:00:08,520 CHRISTINE BREINER: Welcome back to recitation. 2 00:00:08,520 --> 00:00:11,060 In this video, what I'd like you to do 3 00:00:11,060 --> 00:00:14,040 is use least squares to fit a line to the following 4 00:00:14,040 --> 00:00:17,210 data, which includes three points: the point (0, 1), 5 00:00:17,210 --> 00:00:19,910 the point (2, 1), and the point (3, 4). 6 00:00:19,910 --> 00:00:23,050 And I've drawn a rough picture where these points are 7 00:00:23,050 --> 00:00:26,100 on a graph, and I'll be talking a little bit about that 8 00:00:26,100 --> 00:00:28,880 after you try this problem. 9 00:00:28,880 --> 00:00:31,657 So why don't you try to solve the problem based 10 00:00:31,657 --> 00:00:33,240 on the technique you learned in class, 11 00:00:33,240 --> 00:00:35,460 and then bring the video back up. 12 00:00:35,460 --> 00:00:37,740 I'll show you again what you're actually doing, 13 00:00:37,740 --> 00:00:40,990 and then I will also solve the problem and you can see, 14 00:00:40,990 --> 00:00:42,490 you can compare your answer to mine. 15 00:00:42,490 --> 00:00:51,240 16 00:00:51,240 --> 00:00:52,317 OK, welcome back. 17 00:00:52,317 --> 00:00:54,900 So the first thing I want to do when we're talking about least 18 00:00:54,900 --> 00:00:57,730 squares to fit a line to these data, what I'd like to do 19 00:00:57,730 --> 00:00:59,375 is I'm going to draw a line on here, 20 00:00:59,375 --> 00:01:01,916 and we're going to talk about what the least squares actually 21 00:01:01,916 --> 00:01:02,890 is trying to do. 22 00:01:02,890 --> 00:01:08,390 And so I'm going to draw this line, say, something like this. 23 00:01:08,390 --> 00:01:10,700 That's my first guess on what might 24 00:01:10,700 --> 00:01:16,730 be the actual least squares line for these data. 25 00:01:16,730 --> 00:01:19,290 And if you recall, what you're actually trying to do 26 00:01:19,290 --> 00:01:21,530 is you're trying to minimize a certain quantity, 27 00:01:21,530 --> 00:01:23,770 and the quantity you're trying to minimize 28 00:01:23,770 --> 00:01:27,440 is the difference between the actual value you get 29 00:01:27,440 --> 00:01:30,920 and the expected value you get, the square of that. 30 00:01:30,920 --> 00:01:35,120 So this is one thing we're going to square. 31 00:01:35,120 --> 00:01:37,920 So even pictorially, we could say 32 00:01:37,920 --> 00:01:43,109 we have something that contributes some area, OK, 33 00:01:43,109 --> 00:01:44,650 and then this is another thing that's 34 00:01:44,650 --> 00:01:47,120 the difference between the expected value in y 35 00:01:47,120 --> 00:01:50,171 and the actual value you get. 36 00:01:50,171 --> 00:01:51,670 So let me try and make that squared, 37 00:01:51,670 --> 00:01:54,100 because we square that value. 38 00:01:54,100 --> 00:01:55,800 And then this last one is pretty small. 39 00:01:55,800 --> 00:01:57,680 The difference between the expected value 40 00:01:57,680 --> 00:02:00,950 and the actual value is very small over in this case 41 00:02:00,950 --> 00:02:02,814 from the picture I drew. 42 00:02:02,814 --> 00:02:04,230 And so what you're trying to do is 43 00:02:04,230 --> 00:02:06,350 you're trying to find the line that 44 00:02:06,350 --> 00:02:09,220 minimizes the area of the sum of these three, 45 00:02:09,220 --> 00:02:11,020 in this case, three squares. 46 00:02:11,020 --> 00:02:14,750 So if I, you know, if I tip the line down further here, 47 00:02:14,750 --> 00:02:17,600 but I kept it fixed there, the area of this square 48 00:02:17,600 --> 00:02:21,160 would get bigger, because this distance would get bigger. 49 00:02:21,160 --> 00:02:23,624 But, of course, the area of this square would get smaller. 50 00:02:23,624 --> 00:02:25,040 And so you're trying to figure out 51 00:02:25,040 --> 00:02:28,080 a way to optimize, in this case, minimize, 52 00:02:28,080 --> 00:02:31,180 the area of these-- sorry, the sum 53 00:02:31,180 --> 00:02:34,230 of the areas of these squares by figuring out 54 00:02:34,230 --> 00:02:36,310 where to put the line. 55 00:02:36,310 --> 00:02:39,600 Now, I mentioned if I could fix this here and move it 56 00:02:39,600 --> 00:02:42,550 up and down, then that would change the slope 57 00:02:42,550 --> 00:02:43,942 but fix the intercept. 58 00:02:43,942 --> 00:02:45,942 But I could also move the intercept up and down, 59 00:02:45,942 --> 00:02:48,830 and I could slide the line up and down, 60 00:02:48,830 --> 00:02:52,980 and that would fix the slope but vary the intercept, 61 00:02:52,980 --> 00:02:54,022 or I could do both. 62 00:02:54,022 --> 00:02:55,730 And ultimately, that's what you're doing. 63 00:02:55,730 --> 00:02:59,720 You're trying to find the best y-intercept and the best slope 64 00:02:59,720 --> 00:03:02,790 at the same time, and that's why you learned this process 65 00:03:02,790 --> 00:03:06,150 in multivariable calculus, because you're 66 00:03:06,150 --> 00:03:08,920 trying to optimize something that has two 67 00:03:08,920 --> 00:03:10,580 quantities that you can change. 68 00:03:10,580 --> 00:03:13,780 You can change slope and you can change the y-intercept, 69 00:03:13,780 --> 00:03:15,510 so that's why you learned it now instead 70 00:03:15,510 --> 00:03:18,290 of in single-variable calculus. 71 00:03:18,290 --> 00:03:20,180 Now, if we come over here, I wrote down ahead 72 00:03:20,180 --> 00:03:26,590 of time what the equations are you get when you optimize, 73 00:03:26,590 --> 00:03:29,430 when you're trying to optimize the least squares line. 74 00:03:29,430 --> 00:03:32,150 So if you write down, you know, the function 75 00:03:32,150 --> 00:03:33,610 as a function of a and b, and you 76 00:03:33,610 --> 00:03:35,350 take the derivative with respect to a 77 00:03:35,350 --> 00:03:37,365 and you set it equal to zero, you get the first equation. 78 00:03:37,365 --> 00:03:38,930 When you take the derivative with respect 79 00:03:38,930 --> 00:03:40,420 to b and you set it equal to zero, 80 00:03:40,420 --> 00:03:41,980 you get the second equation. 81 00:03:41,980 --> 00:03:45,020 And so I'm just going to fill in the details 82 00:03:45,020 --> 00:03:46,790 and then tell you what the solution is, 83 00:03:46,790 --> 00:03:49,820 and then I'm going to mention a little more sophisticated thing 84 00:03:49,820 --> 00:03:52,670 we can do, or I guess maybe the next level 85 00:03:52,670 --> 00:03:56,240 of least squares we can do, that's not a linear problem. 86 00:03:56,240 --> 00:03:57,870 So I'll do that last. 87 00:03:57,870 --> 00:04:00,650 So what I want to do is I put all the values I need over here 88 00:04:00,650 --> 00:04:01,420 on the right. 89 00:04:01,420 --> 00:04:03,211 So all the values we need are on the right. 90 00:04:03,211 --> 00:04:05,780 We're going to need the sum of all the x_i's and that's 5. 91 00:04:05,780 --> 00:04:07,565 The sum of all the y_i's is 6. 92 00:04:07,565 --> 00:04:09,345 The sum of the x_i squareds is 13, 93 00:04:09,345 --> 00:04:12,736 and the sum of the x_i*y_i product is 14. 94 00:04:12,736 --> 00:04:14,110 Now, where are these coming from? 95 00:04:14,110 --> 00:04:15,780 Remember, what is x_i and what is y_i? 96 00:04:15,780 --> 00:04:23,580 If we come back over here, this is (x_1, y_1), (x_2, y_2), 97 00:04:23,580 --> 00:04:25,330 (x_3, y_3), OK? 98 00:04:25,330 --> 00:04:28,830 So that's sort of how we get-- if we switch the whole pairs 99 00:04:28,830 --> 00:04:30,120 around, it doesn't matter. 100 00:04:30,120 --> 00:04:32,960 Obviously, if we switched x- and y-values together, 101 00:04:32,960 --> 00:04:35,900 it does matter, and that you see actually from the equation 102 00:04:35,900 --> 00:04:36,900 does matter. 103 00:04:36,900 --> 00:04:38,850 Right here you have a product of them. 104 00:04:38,850 --> 00:04:40,450 So let me write this out. 105 00:04:40,450 --> 00:04:43,270 And again, we're solving for a and b, so in this case, 106 00:04:43,270 --> 00:04:46,830 a is the slope and b is the intercept, right? 107 00:04:46,830 --> 00:04:48,080 So let me write everything in. 108 00:04:48,080 --> 00:04:52,640 So we actually get that we want to solve the following system: 109 00:04:52,640 --> 00:05:00,800 13a plus 5b is equal to 14. 110 00:05:00,800 --> 00:05:03,380 Make sure I put in everything correctly. 111 00:05:03,380 --> 00:05:09,570 And then the second one is 5a plus-- n in this case 112 00:05:09,570 --> 00:05:16,130 is 3, because there were three points-- plus 3b is equal to 6. 113 00:05:16,130 --> 00:05:18,930 OK, this is a system of equations. 114 00:05:18,930 --> 00:05:22,180 It has two equations and two unknowns, 115 00:05:22,180 --> 00:05:24,480 and if you solve this-- I should look at my notes, 116 00:05:24,480 --> 00:05:26,240 because I didn't memorize it and I'm not 117 00:05:26,240 --> 00:05:27,620 going to do all the algebra. 118 00:05:27,620 --> 00:05:30,230 If you go to solve this, what you actually get 119 00:05:30,230 --> 00:05:34,780 is that a is equal to 6/7 and b is equal to 4/7. 120 00:05:34,780 --> 00:05:38,470 So you get that the line-- let me write it here. 121 00:05:38,470 --> 00:05:45,590 You get a is equal to 6/7 and b is equal to 4/7, OK? 122 00:05:45,590 --> 00:05:48,720 And so what that tells you is if you come back over here, 123 00:05:48,720 --> 00:05:51,410 the line we actually want-- I'll write the solution right here-- 124 00:05:51,410 --> 00:05:59,551 the line we actually want is the line y equals 6/7 x plus 4/7. 125 00:05:59,551 --> 00:06:01,717 That is our least squares line that is our solution. 126 00:06:01,717 --> 00:06:03,930 OK? 127 00:06:03,930 --> 00:06:08,990 So that is actually-- solving for a line to fit these data, 128 00:06:08,990 --> 00:06:11,514 you get the following solution. 129 00:06:11,514 --> 00:06:13,680 But I want to point out, and what we're going to do, 130 00:06:13,680 --> 00:06:16,780 after this video there will be a challenge problem, to have 131 00:06:16,780 --> 00:06:18,502 you actually finish this. 132 00:06:18,502 --> 00:06:20,210 I want to point out that you can actually 133 00:06:20,210 --> 00:06:25,040 also fit a higher-degree polynomial to these data. 134 00:06:25,040 --> 00:06:26,840 For instance, you could try and use 135 00:06:26,840 --> 00:06:29,360 the technique of least squares to fit a parabola 136 00:06:29,360 --> 00:06:30,267 to these data. 137 00:06:30,267 --> 00:06:32,100 And let me point out what the function would 138 00:06:32,100 --> 00:06:33,820 look like in this case. 139 00:06:33,820 --> 00:06:38,470 So this is for the challenge problem. 140 00:06:38,470 --> 00:06:40,380 For the challenge problem, it now 141 00:06:40,380 --> 00:06:43,320 will be a function of three variables, 142 00:06:43,320 --> 00:06:44,820 so it will look something like this. 143 00:06:44,820 --> 00:06:49,470 We'll say a comma b comma c, and the point that we'll 144 00:06:49,470 --> 00:06:51,450 be wanting to minimize, or the function 145 00:06:51,450 --> 00:06:54,720 we'll be wanting to minimize, we're summing over i, again 146 00:06:54,720 --> 00:06:56,860 from 1 to 3-- I didn't write that down above, 147 00:06:56,860 --> 00:06:58,340 but it was fairly obvious that that 148 00:06:58,340 --> 00:07:02,000 was what we were summing over-- the following thing. 149 00:07:02,000 --> 00:07:08,110 We have y_i minus this big quantity: 150 00:07:08,110 --> 00:07:15,820 a x_i squared plus b*x_i plus c and we square the whole thing. 151 00:07:15,820 --> 00:07:17,430 And so what this is actually giving 152 00:07:17,430 --> 00:07:19,090 you is, just as in the picture before, 153 00:07:19,090 --> 00:07:20,465 this is giving you the difference 154 00:07:20,465 --> 00:07:23,720 between the expected value and the actual value. 155 00:07:23,720 --> 00:07:28,040 So this is your actual y-value, and based on the parabola 156 00:07:28,040 --> 00:07:29,940 you're looking for, what's in parentheses is 157 00:07:29,940 --> 00:07:32,830 your expected y-value, right? 158 00:07:32,830 --> 00:07:36,330 So this would actually be my parabola. 159 00:07:36,330 --> 00:07:38,720 When I evaluate it at x_i, I get a number. 160 00:07:38,720 --> 00:07:42,330 I get a value, and that value will be on the parabola. 161 00:07:42,330 --> 00:07:44,870 This will be the y-value associated 162 00:07:44,870 --> 00:07:46,550 to the x-value from the data. 163 00:07:46,550 --> 00:07:48,720 So this is the actual value. 164 00:07:48,720 --> 00:07:50,390 This is the expected value. 165 00:07:50,390 --> 00:07:52,280 We take the difference and square it. 166 00:07:52,280 --> 00:07:54,960 And because we want to minimize how 167 00:07:54,960 --> 00:07:56,360 we're going to solve this problem 168 00:07:56,360 --> 00:07:58,470 and this is what you'll have to do 169 00:07:58,470 --> 00:08:01,660 is, just like with the line situation, 170 00:08:01,660 --> 00:08:03,880 you're going to take the derivative of this function 171 00:08:03,880 --> 00:08:06,800 with respect to each of these three variables separately. 172 00:08:06,800 --> 00:08:11,020 So you'll take f sub a, and then you'll set that equal to zero. 173 00:08:11,020 --> 00:08:14,224 And then you'll take f sub b, you'll set that equal to zero. 174 00:08:14,224 --> 00:08:15,890 And then you'll take f sub c, and you'll 175 00:08:15,890 --> 00:08:16,870 set that equal to zero. 176 00:08:16,870 --> 00:08:18,460 Again, we set them all equal to zero, 177 00:08:18,460 --> 00:08:20,650 because this is really an optimization problem. 178 00:08:20,650 --> 00:08:22,040 We want to minimize something. 179 00:08:22,040 --> 00:08:24,530 We want to minimize this quantity. 180 00:08:24,530 --> 00:08:27,217 So you take each of those three derivatives, 181 00:08:27,217 --> 00:08:29,050 partial derivatives, set them equal to zero, 182 00:08:29,050 --> 00:08:32,440 and you have a system of three equations with three variables. 183 00:08:32,440 --> 00:08:35,960 And you know how to solve such a system, by using matrices, 184 00:08:35,960 --> 00:08:36,680 for instance. 185 00:08:36,680 --> 00:08:38,840 That would be one way to solve such a system. 186 00:08:38,840 --> 00:08:40,410 And so then you can actually come up 187 00:08:40,410 --> 00:08:46,860 with a formula for the sort of parabola of best fit, 188 00:08:46,860 --> 00:08:48,320 you could think of it as. 189 00:08:48,320 --> 00:08:50,920 And so I'm going to stop there, because I think from here, 190 00:08:50,920 --> 00:08:52,250 you guys will do it. 191 00:08:52,250 --> 00:08:53,680 But I just want to point out this 192 00:08:53,680 --> 00:08:55,442 is where you're going to be starting 193 00:08:55,442 --> 00:08:56,900 this next sort of challenge problem 194 00:08:56,900 --> 00:09:00,900 to find the quadratic of best fit for these three data 195 00:09:00,900 --> 00:09:01,930 points. 196 00:09:01,930 --> 00:09:03,511 So that's where I'll end it. 197 00:09:03,511 --> 00:09:04,010