1 00:00:00,000 --> 00:00:02,560 PROFESSOR: Welcome to class again. 2 00:00:02,560 --> 00:00:05,560 This time it's not Professor Forney it's me, so my name is 3 00:00:05,560 --> 00:00:06,810 Ralf Koetter. 4 00:00:06,810 --> 00:00:11,650 5 00:00:11,650 --> 00:00:14,150 You guys have some substantial chalk here at MIT. 6 00:00:14,150 --> 00:00:20,130 7 00:00:20,130 --> 00:00:24,390 And I'm visiting here from the University of Illinois, so 8 00:00:24,390 --> 00:00:26,860 Professor Forney thought I could teach this class here. 9 00:00:26,860 --> 00:00:31,530 10 00:00:31,530 --> 00:00:31,722 All right, let's see. 11 00:00:31,722 --> 00:00:35,070 So I understand that last time, last Wednesday, you went 12 00:00:35,070 --> 00:00:40,060 through all the finite field stuff, meaning, so you know 13 00:00:40,060 --> 00:00:46,080 what that would mean, the finite field. 14 00:00:46,080 --> 00:00:48,620 There's p elements, p to the m elements. 15 00:00:48,620 --> 00:00:53,190 Whatever q you have here, is a power of a prime in order to 16 00:00:53,190 --> 00:00:54,440 be a field. 17 00:00:54,440 --> 00:00:56,710 18 00:00:56,710 --> 00:01:02,740 So this one, as a notation, is a ring of polynomials. 19 00:01:02,740 --> 00:01:06,010 You've seen that too. 20 00:01:06,010 --> 00:01:09,170 So I assume you know everything about finite fields 21 00:01:09,170 --> 00:01:12,660 that you will need to know here, at least, except for one 22 00:01:12,660 --> 00:01:14,460 more theorem which Professor Forney told 23 00:01:14,460 --> 00:01:17,870 me he did not cover. 24 00:01:17,870 --> 00:01:21,192 And this is the fundamental theorem of algebra. 25 00:01:21,192 --> 00:01:23,630 I have to write a little bit smaller with this thing here, 26 00:01:23,630 --> 00:01:26,626 otherwise I'll run out. 27 00:01:26,626 --> 00:01:27,876 AUDIENCE: [UNINTELLIGIBLE PHRASE] 28 00:01:27,876 --> 00:01:29,950 29 00:01:29,950 --> 00:01:33,440 PROFESSOR: Oh, I know, that's probably better. 30 00:01:33,440 --> 00:01:34,690 Better. 31 00:01:34,690 --> 00:01:40,540 32 00:01:40,540 --> 00:01:53,175 With the algebra, at least that's what it's often called, 33 00:01:53,175 --> 00:01:57,420 and really, about 60 percent of all the proofs in algebra 34 00:01:57,420 --> 00:01:59,520 eventually boil down to this here. 35 00:01:59,520 --> 00:02:30,510 And what it says is, polynomial of degree m, f beta 36 00:02:30,510 --> 00:02:46,490 equals zero, at most, m of beta. 37 00:02:46,490 --> 00:02:48,310 At least, that's one way to formulate it. 38 00:02:48,310 --> 00:02:49,560 Let me see. 39 00:02:49,560 --> 00:02:52,420 40 00:02:52,420 --> 00:02:54,400 So that's fine. 41 00:02:54,400 --> 00:02:57,490 So what it says is a polynomial of degree m has at 42 00:02:57,490 --> 00:03:00,560 most m roots. 43 00:03:00,560 --> 00:03:02,910 Once you all have seen that, probably one way or another, 44 00:03:02,910 --> 00:03:06,250 but because of its importance, I want to 45 00:03:06,250 --> 00:03:07,500 emphasize it once more. 46 00:03:07,500 --> 00:03:11,140 47 00:03:11,140 --> 00:03:14,670 Do we need a proof of this? 48 00:03:14,670 --> 00:03:17,940 In true MIT spirit we do. 49 00:03:17,940 --> 00:03:23,460 And the proof would go something like this. 50 00:03:23,460 --> 00:03:27,020 You look at problem number one in your homework assignment, 51 00:03:27,020 --> 00:03:30,630 and from problem number one, I could prove that here, too, 52 00:03:30,630 --> 00:03:32,280 but since it's in the homework, I won't. 53 00:03:32,280 --> 00:03:35,790 54 00:03:35,790 --> 00:03:38,170 You can write the following given any beta. 55 00:03:38,170 --> 00:03:42,810 56 00:03:42,810 --> 00:03:57,440 Write f of x as f of x is equal to plus alpha. 57 00:03:57,440 --> 00:04:01,064 58 00:04:01,064 --> 00:04:05,340 Alphas are the field so that's by some sort of long division 59 00:04:05,340 --> 00:04:07,754 you get to that. 60 00:04:07,754 --> 00:04:09,480 That's what I'm not going to prove. 61 00:04:09,480 --> 00:04:17,710 Then f of beta is equal zero is the same as saying that 62 00:04:17,710 --> 00:04:19,160 alpha is equal to zero. 63 00:04:19,160 --> 00:04:29,570 So if either is a root of the polynomial, zero, it follows 64 00:04:29,570 --> 00:04:39,464 that f of x has this thing here as a factor, this h of x, 65 00:04:39,464 --> 00:04:52,010 x minus beta, where because of the degree properties of 66 00:04:52,010 --> 00:05:01,740 polynomials, h of x is m minus 1. 67 00:05:01,740 --> 00:05:05,740 And so the rest follows by induction. 68 00:05:05,740 --> 00:05:08,580 So basically, then we can prove that this polynomial 69 00:05:08,580 --> 00:05:10,870 has, at most, m minus 1 roots, and so on. 70 00:05:10,870 --> 00:05:13,450 And you can descend this route, and so the rest follows 71 00:05:13,450 --> 00:05:14,700 by induction. 72 00:05:14,700 --> 00:05:18,480 73 00:05:18,480 --> 00:05:33,180 In particular we can say if f of x has m distinct roots beta 74 00:05:33,180 --> 00:05:51,040 one, beta m, then it factors completely into the linear 75 00:05:51,040 --> 00:05:53,310 factors like this. 76 00:05:53,310 --> 00:05:59,490 So I just wanted to quickly state the fundamental theorem 77 00:05:59,490 --> 00:06:03,380 of algebra, since we need it in a proof later on, and I 78 00:06:03,380 --> 00:06:04,630 think you didn't go through it. 79 00:06:04,630 --> 00:06:09,660 80 00:06:09,660 --> 00:06:09,760 OK. 81 00:06:09,760 --> 00:06:13,080 So last time, you learned everything about fields, 82 00:06:13,080 --> 00:06:17,010 finite fields, extension fields, so chapter eight is 83 00:06:17,010 --> 00:06:18,750 pretty much what we have to cover now. 84 00:06:18,750 --> 00:06:22,310 85 00:06:22,310 --> 00:06:25,120 What is the whole idea of chapter eight? 86 00:06:25,120 --> 00:06:41,190 It's linear codes, codes, MDS codes, and redundant codes. 87 00:06:41,190 --> 00:06:44,520 Oh, by the way, do you have any questions about this here? 88 00:06:44,520 --> 00:06:45,020 That in any way? 89 00:06:45,020 --> 00:06:46,270 It's pretty straight, right? 90 00:06:46,270 --> 00:06:48,350 91 00:06:48,350 --> 00:06:52,280 OK, so I understand in chapter six or so, you had already 92 00:06:52,280 --> 00:06:53,876 linear codes over the binary fields. 93 00:06:53,876 --> 00:06:56,540 94 00:06:56,540 --> 00:07:05,610 So let's just define codes over a larger field, formally, 95 00:07:05,610 --> 00:07:12,380 a linear code C of length n. 96 00:07:12,380 --> 00:07:31,180 97 00:07:31,180 --> 00:07:39,620 No subspace of Fn. 98 00:07:39,620 --> 00:07:41,050 So whatever the field is. 99 00:07:41,050 --> 00:07:43,890 So F could be any extension field, could be the binary 100 00:07:43,890 --> 00:07:49,080 field, so it really generalizes a definition of 101 00:07:49,080 --> 00:07:51,670 code, of what a linear code is. 102 00:07:51,670 --> 00:07:54,330 OK, so it's a subspace. 103 00:07:54,330 --> 00:07:57,690 What can be derived from that? 104 00:07:57,690 --> 00:08:00,580 Since it's a subspace, it's a group. 105 00:08:00,580 --> 00:08:05,440 And then we can derive minimum distance properties. 106 00:08:05,440 --> 00:08:09,620 So let's first define it again, since it's slightly 107 00:08:09,620 --> 00:08:11,865 different than the definition for binary codes. 108 00:08:11,865 --> 00:08:21,310 109 00:08:21,310 --> 00:08:32,480 Between Fn, say Fqn. 110 00:08:32,480 --> 00:08:35,530 So I denote the vectors with an underscore. 111 00:08:35,530 --> 00:08:38,840 I think in the notes, it's boldface notation, so 112 00:08:38,840 --> 00:08:41,835 translate that online as I go here. 113 00:08:41,835 --> 00:08:53,920 The distance between two words x and y, given as dx, the 114 00:08:53,920 --> 00:09:00,786 number of positions that x_i is unequal to y_i. 115 00:09:00,786 --> 00:09:02,540 AUDIENCE: What's the subscript? 116 00:09:02,540 --> 00:09:04,230 PROFESSOR: There, a q. 117 00:09:04,230 --> 00:09:06,650 Oh, this is another thing I should warn you about. 118 00:09:06,650 --> 00:09:10,890 My handwriting is bound to deteriorate during class. 119 00:09:10,890 --> 00:09:13,830 So I usually start out reasonably okay, towards the 120 00:09:13,830 --> 00:09:15,570 end of the class it's -- 121 00:09:15,570 --> 00:09:18,370 I tell my students to throw little pieces of chalk at me 122 00:09:18,370 --> 00:09:23,710 when it gets too bad and I'm not facing them, so please 123 00:09:23,710 --> 00:09:25,400 just say something if it gets too bad. 124 00:09:25,400 --> 00:09:28,720 125 00:09:28,720 --> 00:09:30,595 So distance is defined as that, quickly. 126 00:09:30,595 --> 00:09:33,770 127 00:09:33,770 --> 00:09:36,290 So it doesn't really matter what the values are here. 128 00:09:36,290 --> 00:09:39,700 The x_i and the y_i could assume different values. 129 00:09:39,700 --> 00:09:44,050 It's a somewhat coarse measure for the real, the difference 130 00:09:44,050 --> 00:09:49,520 between code words, or difference between words. 131 00:09:49,520 --> 00:09:52,560 Why do you think I say it's coarse? 132 00:09:52,560 --> 00:09:53,965 In digital communications in particular? 133 00:09:53,965 --> 00:09:58,760 134 00:09:58,760 --> 00:09:59,670 Good question, right? 135 00:09:59,670 --> 00:10:04,600 In the end, we want to map that into a modulation scheme. 136 00:10:04,600 --> 00:10:06,870 In the end, we want to map our codes that we are deriving 137 00:10:06,870 --> 00:10:08,210 here into modulation schemes. 138 00:10:08,210 --> 00:10:09,630 In the end, we want to embed them into 139 00:10:09,630 --> 00:10:11,610 some Euclidean space. 140 00:10:11,610 --> 00:10:15,060 Now, different elements of our alphabet we will map to 141 00:10:15,060 --> 00:10:17,900 different elements in Euclidean space. 142 00:10:17,900 --> 00:10:22,650 So basically, approximating their distance relation in 143 00:10:22,650 --> 00:10:25,410 Euclidean space, which we are really interested in with the 144 00:10:25,410 --> 00:10:30,630 Hamming distance here is pretty coarse, but we can do, 145 00:10:30,630 --> 00:10:31,290 so we do that. 146 00:10:31,290 --> 00:10:32,540 It's an approximation, at least. 147 00:10:32,540 --> 00:10:35,660 148 00:10:35,660 --> 00:10:36,145 That clear? 149 00:10:36,145 --> 00:10:37,770 All set? 150 00:10:37,770 --> 00:10:38,285 All right. 151 00:10:38,285 --> 00:10:38,855 AUDIENCE: [UNINTELLIGIBLE PHRASE] the 152 00:10:38,855 --> 00:10:39,926 Hamming distance [UNINTELLIGIBLE] the same as 153 00:10:39,926 --> 00:10:42,450 the Euclidean distance? 154 00:10:42,450 --> 00:10:44,035 PROFESSOR: Well, it depends on the modulation scheme. 155 00:10:44,035 --> 00:10:45,970 It very much depends on the modulation scheme. 156 00:10:45,970 --> 00:10:54,890 If you have a 8-PSK scheme, where you would label, put in 157 00:10:54,890 --> 00:10:58,730 the words, here, with three bit symbols, or with the 158 00:10:58,730 --> 00:11:05,900 symbol from F8, then it's definitely different. 159 00:11:05,900 --> 00:11:08,690 It's definitely different. 160 00:11:08,690 --> 00:11:10,460 So if you do anti-polar signaling, then 161 00:11:10,460 --> 00:11:13,970 it's directly reflected. 162 00:11:13,970 --> 00:11:17,510 OK, I'm starting to digress already. 163 00:11:17,510 --> 00:11:22,930 164 00:11:22,930 --> 00:11:32,440 So just for completeness, minimum distance, minimum 165 00:11:32,440 --> 00:12:01,540 Hamming, of a code subset Fqn is d as a minimum code of dxy, 166 00:12:01,540 --> 00:12:04,682 and they have to be different, it's the same as before. 167 00:12:04,682 --> 00:12:08,250 168 00:12:08,250 --> 00:12:10,340 So now if I claim that -- 169 00:12:10,340 --> 00:12:15,610 170 00:12:15,610 --> 00:12:21,400 so the minimum distance of a code is also given by the 171 00:12:21,400 --> 00:12:36,870 minimum between 0 and x in the code 0 and x, and this is 172 00:12:36,870 --> 00:12:46,260 minimum of the Hamming weight of x, and you could 173 00:12:46,260 --> 00:12:50,080 do 0 x in the code. 174 00:12:50,080 --> 00:12:51,440 So that's all old stuff. 175 00:12:51,440 --> 00:12:55,850 I just write it down so we get started here. 176 00:12:55,850 --> 00:13:00,050 Is that clear, from the group property, why 177 00:13:00,050 --> 00:13:01,680 this would be true? 178 00:13:01,680 --> 00:13:05,990 So if you just take this, we can add basically x to both x 179 00:13:05,990 --> 00:13:08,890 and y, just translating the whole relation 180 00:13:08,890 --> 00:13:10,380 to somewhere else. 181 00:13:10,380 --> 00:13:13,380 So in particular, we translate it here, once we have it here, 182 00:13:13,380 --> 00:13:16,890 than the distance between 0 and x is just the weight. 183 00:13:16,890 --> 00:13:18,660 OK. 184 00:13:18,660 --> 00:13:21,140 So far, so good. 185 00:13:21,140 --> 00:13:22,410 Now what is next? 186 00:13:22,410 --> 00:13:25,060 Generate a matrix. 187 00:13:25,060 --> 00:13:26,710 This is not really in the notes, but 188 00:13:26,710 --> 00:13:27,960 I think it's important. 189 00:13:27,960 --> 00:13:35,100 190 00:13:35,100 --> 00:13:39,160 So see, the code here is a subspace. 191 00:13:39,160 --> 00:13:42,190 It's a linear space, so it has a generator, it has 192 00:13:42,190 --> 00:13:44,210 generators, k generators. 193 00:13:44,210 --> 00:13:59,793 So let g1 be k, write this off the code. 194 00:13:59,793 --> 00:14:02,430 195 00:14:02,430 --> 00:14:07,080 So as a basis of the vector space, that this would be a 196 00:14:07,080 --> 00:14:09,610 basis of the vector space, any basis would be fine here. 197 00:14:09,610 --> 00:14:12,710 198 00:14:12,710 --> 00:14:27,370 Then C may be defined as all the x in Fqn such that x is 199 00:14:27,370 --> 00:14:29,260 sum over -- 200 00:14:29,260 --> 00:14:30,510 what do I call it -- 201 00:14:30,510 --> 00:14:32,440 202 00:14:32,440 --> 00:14:47,050 fi gi, where fi is in Fq. 203 00:14:47,050 --> 00:14:56,690 And the reason I introduce this, we can -- 204 00:14:56,690 --> 00:14:58,900 this is just the definition of a space, right? 205 00:14:58,900 --> 00:14:59,450 That's clear. 206 00:14:59,450 --> 00:15:08,620 So if you have these generators, you find a 207 00:15:08,620 --> 00:15:13,866 generator matrix, uh-oh, it already starts. 208 00:15:13,866 --> 00:15:15,116 Let me -- 209 00:15:15,116 --> 00:15:24,420 210 00:15:24,420 --> 00:15:42,940 matrix g which contains, as a m matrix 211 00:15:42,940 --> 00:15:48,680 containing the rows gi. 212 00:15:48,680 --> 00:15:52,320 So the i-th row in the generator matrix is just gi. 213 00:15:52,320 --> 00:16:12,540 Then you also can write as x is equal to f times g, f 214 00:16:12,540 --> 00:16:20,040 element Fqk, or just the same statement as this one, so 215 00:16:20,040 --> 00:16:21,290 nothing has happened. 216 00:16:21,290 --> 00:16:25,280 217 00:16:25,280 --> 00:16:28,260 So basically, the reason I did that, I wanted to introduce 218 00:16:28,260 --> 00:16:34,350 the term generator matrix, which is sort of important. 219 00:16:34,350 --> 00:16:52,220 And one more property of this orthogonal complement 220 00:16:52,220 --> 00:16:54,750 of C, of the code. 221 00:16:54,750 --> 00:16:57,270 So what does that mean? 222 00:16:57,270 --> 00:17:09,030 So the orthogonal complement of the code you could write as 223 00:17:09,030 --> 00:17:22,140 Fqn such that sum of x_i y_i is equal to 0. 224 00:17:22,140 --> 00:17:29,210 The sum is obviously over the field for all y in the code. 225 00:17:29,210 --> 00:17:32,560 226 00:17:32,560 --> 00:17:37,084 What's the dimension of this, of the orthogonal complement? 227 00:17:37,084 --> 00:17:38,240 AUDIENCE: n minus k. 228 00:17:38,240 --> 00:17:41,400 PROFESSOR: n minus k, clearly, because we have ambient space 229 00:17:41,400 --> 00:17:46,330 is n dimensional, we impose k linear constraints on this, by 230 00:17:46,330 --> 00:17:50,830 the k generators, so the k dimensions of, take note, by 231 00:17:50,830 --> 00:17:52,680 the generators drop out. 232 00:17:52,680 --> 00:17:58,310 So the dimension of the orthogonal 233 00:17:58,310 --> 00:18:00,910 complement is n minus k. 234 00:18:00,910 --> 00:18:06,100 235 00:18:06,100 --> 00:18:08,400 So what else do we need to say about this? 236 00:18:08,400 --> 00:18:13,390 237 00:18:13,390 --> 00:18:22,910 C is called the dual code for this reason. 238 00:18:22,910 --> 00:18:25,510 239 00:18:25,510 --> 00:18:26,470 C is called dual code. 240 00:18:26,470 --> 00:18:29,300 In particular, it's a code that's a linear space. 241 00:18:29,300 --> 00:18:32,160 It's a subspace of Fqn again, it's a code. 242 00:18:32,160 --> 00:18:36,130 So it's just as nice a code as C at this 243 00:18:36,130 --> 00:18:37,380 point in time at least. 244 00:18:37,380 --> 00:18:42,740 245 00:18:42,740 --> 00:18:45,110 So it's called a dual code. 246 00:18:45,110 --> 00:18:48,785 To C, if it is a code, it has a generator matrix. 247 00:18:48,785 --> 00:18:51,400 248 00:18:51,400 --> 00:19:04,630 Let h be a generator matrix for C dual. 249 00:19:04,630 --> 00:19:07,470 So in particular, we could define C dual now, for 250 00:19:07,470 --> 00:19:10,810 example, by the equivalent of this relation here. 251 00:19:10,810 --> 00:19:17,700 But because it's a dual code, we now also can define the 252 00:19:17,700 --> 00:19:34,420 original code in an equivalent way such that x times h 253 00:19:34,420 --> 00:19:37,990 transpose is 0. 254 00:19:37,990 --> 00:19:43,330 We could define our original code C either as the image of 255 00:19:43,330 --> 00:19:49,580 a matrix g, of a generator matrix g, or as a kernel of a 256 00:19:49,580 --> 00:19:52,470 parity-check matrix h. 257 00:19:52,470 --> 00:19:55,440 So h is a ...WRITING ON BOARD... 258 00:19:55,440 --> 00:20:03,790 259 00:20:03,790 --> 00:20:07,960 for C. So that's all pretty much straight linear algebra, 260 00:20:07,960 --> 00:20:12,340 and I'm sure you've seen that in many different places. 261 00:20:12,340 --> 00:20:13,660 Any questions about any of this? 262 00:20:13,660 --> 00:20:17,962 263 00:20:17,962 --> 00:20:19,396 AUDIENCE: So the addition of the dual 264 00:20:19,396 --> 00:20:20,840 [UNINTELLIGIBLE PHRASE] 265 00:20:20,840 --> 00:20:22,430 the summation [UNINTELLIGIBLE PHRASE] 266 00:20:22,430 --> 00:20:24,994 equals 0 for all [INAUDIBLE] other than x, right? 267 00:20:24,994 --> 00:20:26,260 [UNINTELLIGIBLE PHRASE] 268 00:20:26,260 --> 00:20:28,170 PROFESSOR: Oh no, no, no, it doesn't have to be 269 00:20:28,170 --> 00:20:29,740 different from x. 270 00:20:29,740 --> 00:20:35,460 If y is in the code, if y is in C, then x has to be 271 00:20:35,460 --> 00:20:37,020 orthogonal to it. 272 00:20:37,020 --> 00:20:38,850 They can be the same vector, in particular, if you have 273 00:20:38,850 --> 00:20:41,630 binary vectors, an even made binary vector is 274 00:20:41,630 --> 00:20:43,560 orthogonal to itself. 275 00:20:43,560 --> 00:20:45,820 It's a little bit odd, but that's the 276 00:20:45,820 --> 00:20:47,070 magic of finite fields. 277 00:20:47,070 --> 00:20:51,100 278 00:20:51,100 --> 00:20:51,275 OK. 279 00:20:51,275 --> 00:20:51,830 Good. 280 00:20:51,830 --> 00:20:53,430 So these are codes, now we could stop. 281 00:20:53,430 --> 00:20:57,100 We have defined the object, and obviously it exists, 282 00:20:57,100 --> 00:21:02,720 because we could just write something down and it exists. 283 00:21:02,720 --> 00:21:06,050 So once we have defined it, the next question is, what 284 00:21:06,050 --> 00:21:09,120 sort of codes do exist? 285 00:21:09,120 --> 00:21:13,140 So that's what we're going to do next. 286 00:21:13,140 --> 00:21:16,650 287 00:21:16,650 --> 00:21:17,900 First, question one. 288 00:21:17,900 --> 00:21:30,000 289 00:21:30,000 --> 00:21:36,880 Codes do, what type of codes do exist? 290 00:21:36,880 --> 00:21:40,130 So which codes do you know? 291 00:21:40,130 --> 00:21:42,920 AUDIENCE: [INAUDIBLE] 292 00:21:42,920 --> 00:21:45,800 PROFESSOR: You know Reed-Muller codes, you know 293 00:21:45,800 --> 00:21:49,330 probably sporadic binary codes that are out there. 294 00:21:49,330 --> 00:21:51,900 295 00:21:51,900 --> 00:21:53,430 These are all binary codes. 296 00:21:53,430 --> 00:21:58,220 So what type of codes exist over larger fields? 297 00:21:58,220 --> 00:22:01,600 298 00:22:01,600 --> 00:22:03,070 Many, many classes. 299 00:22:03,070 --> 00:22:07,400 There exists the equivalent of the Reed-Muller codes, there 300 00:22:07,400 --> 00:22:10,910 exist QRE Reed-Muller codes, and there exist generalized 301 00:22:10,910 --> 00:22:13,690 Reed-Muller codes, and, and, and, and, and. 302 00:22:13,690 --> 00:22:19,700 But we are interested in a very special class today, 303 00:22:19,700 --> 00:22:23,080 which is MDS codes. 304 00:22:23,080 --> 00:22:35,475 It stands for Maximum Distance Separable. 305 00:22:35,475 --> 00:22:38,520 306 00:22:38,520 --> 00:22:40,020 It's a strange name. 307 00:22:40,020 --> 00:22:42,350 There's no particular reason for MDS. 308 00:22:42,350 --> 00:22:51,440 309 00:22:51,440 --> 00:22:52,690 But, let's see what we can do with that. 310 00:22:52,690 --> 00:22:55,400 311 00:22:55,400 --> 00:22:58,420 What type of codes do exist? 312 00:22:58,420 --> 00:23:04,640 So we have parameters of codes -- oh, I think you write the 313 00:23:04,640 --> 00:23:06,740 curly bracket, right -- 314 00:23:06,740 --> 00:23:09,730 n, k and d. 315 00:23:09,730 --> 00:23:14,040 So that would mean a code of length n, dimension k, and 316 00:23:14,040 --> 00:23:16,140 distance d. 317 00:23:16,140 --> 00:23:20,450 And let me add something to it a q, if you want to emphasize 318 00:23:20,450 --> 00:23:21,925 that this is a query field. 319 00:23:21,925 --> 00:23:24,810 320 00:23:24,810 --> 00:23:29,140 So are all numbers possible here? 321 00:23:29,140 --> 00:23:36,345 What do we have, a 20, 19, 17 code over, I 322 00:23:36,345 --> 00:23:39,390 don't know, over F8. 323 00:23:39,390 --> 00:23:41,580 Is this possible? 324 00:23:41,580 --> 00:23:42,830 What would you think? 325 00:23:42,830 --> 00:23:49,750 326 00:23:49,750 --> 00:23:51,482 No? 327 00:23:51,482 --> 00:23:52,430 AUDIENCE: [INAUDIBLE] 328 00:23:52,430 --> 00:23:54,100 PROFESSOR: It's not possible. 329 00:23:54,100 --> 00:23:56,160 It doesn't seem likely. 330 00:23:56,160 --> 00:24:01,430 What conflicts here, is the dimension and the distance. 331 00:24:01,430 --> 00:24:03,900 If you get a large dimension, in particular, if we would 332 00:24:03,900 --> 00:24:07,170 make this 20, what would that mean? 333 00:24:07,170 --> 00:24:09,840 It would mean we have to take the entire space. 334 00:24:09,840 --> 00:24:11,990 If you take the entire space, then the minimum 335 00:24:11,990 --> 00:24:13,780 weight word is 1. 336 00:24:13,780 --> 00:24:16,870 So this is possible. 337 00:24:16,870 --> 00:24:17,940 You know this is possible. 338 00:24:17,940 --> 00:24:22,240 If you drop this by 1, that seems very unlikely that we 339 00:24:22,240 --> 00:24:25,920 would get a 17 here. 340 00:24:25,920 --> 00:24:29,080 But what do we get here? 341 00:24:29,080 --> 00:24:30,330 2. 342 00:24:30,330 --> 00:24:31,920 343 00:24:31,920 --> 00:24:34,910 You get a 2 because that's what we can achieve with a 344 00:24:34,910 --> 00:24:37,290 single parity-check code. 345 00:24:37,290 --> 00:24:38,810 The parity-check code doesn't have to be 346 00:24:38,810 --> 00:24:41,820 restrained to binary. 347 00:24:41,820 --> 00:24:42,130 Why? 348 00:24:42,130 --> 00:24:43,520 Why would it be restrained to binary? 349 00:24:43,520 --> 00:24:46,850 350 00:24:46,850 --> 00:24:56,430 You could just, the set of all vectors let's define the 351 00:24:56,430 --> 00:25:03,900 single parity-check codes s p c, q, as the set of all 352 00:25:03,900 --> 00:25:11,165 vectors such that sum of the x_i is equal to 0. 353 00:25:11,165 --> 00:25:14,320 354 00:25:14,320 --> 00:25:18,380 So could we have a word of weight 1 in here? 355 00:25:18,380 --> 00:25:19,270 Obviously not, right? 356 00:25:19,270 --> 00:25:22,970 If it has a weight 1, how would it add up to 0? 357 00:25:22,970 --> 00:25:25,745 Because one position would never cancel 358 00:25:25,745 --> 00:25:27,070 with any other position. 359 00:25:27,070 --> 00:25:29,820 So the minimum weight is 2 here, and we get 360 00:25:29,820 --> 00:25:30,600 a distance of 2. 361 00:25:30,600 --> 00:25:32,940 So what's the next one? 362 00:25:32,940 --> 00:25:37,250 363 00:25:37,250 --> 00:25:41,740 It's tempting to say 3, right? 364 00:25:41,740 --> 00:25:47,190 3, but this is very much a question, now. 365 00:25:47,190 --> 00:25:50,710 Because this is not as easy to come by as a single 366 00:25:50,710 --> 00:25:53,060 parity-check. 367 00:25:53,060 --> 00:25:56,320 And that's what we're going to do next. 368 00:25:56,320 --> 00:26:04,010 We're going to define bounds on the maximum distance that a 369 00:26:04,010 --> 00:26:07,310 code can have altogether. 370 00:26:07,310 --> 00:26:12,210 OK, so let's do the following. 371 00:26:12,210 --> 00:26:14,750 372 00:26:14,750 --> 00:26:27,720 Which parameter is possible? 373 00:26:27,720 --> 00:26:28,970 OK. 374 00:26:28,970 --> 00:26:30,690 375 00:26:30,690 --> 00:26:34,270 So let's assume you have a code, an n,k,d code, and now 376 00:26:34,270 --> 00:26:38,720 we want to find a relation, a bound between n, k and d. 377 00:26:38,720 --> 00:26:39,970 How do we do this? 378 00:26:39,970 --> 00:26:43,220 379 00:26:43,220 --> 00:26:44,470 Any ideas? 380 00:26:44,470 --> 00:26:46,720 381 00:26:46,720 --> 00:26:48,740 Let a computer run for eternity and find 382 00:26:48,740 --> 00:26:49,790 all possible codes? 383 00:26:49,790 --> 00:26:51,190 No, no, no, no. 384 00:26:51,190 --> 00:26:53,256 We don't do this. 385 00:26:53,256 --> 00:26:56,770 We wouldn't get far. 386 00:26:56,770 --> 00:27:07,190 Let's assume we have an n,k,d code. 387 00:27:07,190 --> 00:27:10,910 388 00:27:10,910 --> 00:27:12,990 What does that mean? 389 00:27:12,990 --> 00:27:15,660 Well, let's write the code words all down in a huge 390 00:27:15,660 --> 00:27:26,790 matrix, so each row in this matrix 391 00:27:26,790 --> 00:27:28,240 corresponds to one code word. 392 00:27:28,240 --> 00:27:36,330 So this has a length n, this is q to the k, q is whatever 393 00:27:36,330 --> 00:27:41,200 the alphabet is of the code in question, and now we say it's 394 00:27:41,200 --> 00:27:43,640 an n,k,d code. 395 00:27:43,640 --> 00:27:49,660 What that means, it means, among other things is, say we 396 00:27:49,660 --> 00:27:57,850 delete, just punch out, d minus 1 positions. 397 00:27:57,850 --> 00:28:01,100 398 00:28:01,100 --> 00:28:04,450 We punch out d minus 1 positions of all code words 399 00:28:04,450 --> 00:28:07,020 and we look at the code that remains. 400 00:28:07,020 --> 00:28:08,350 You guys don't have colored chalk here, huh? 401 00:28:08,350 --> 00:28:12,020 402 00:28:12,020 --> 00:28:14,270 We look at the code that remains, it means we look at 403 00:28:14,270 --> 00:28:19,875 this part of the matrix. 404 00:28:19,875 --> 00:28:22,930 405 00:28:22,930 --> 00:28:24,180 Is that clear, what I'm doing here? 406 00:28:24,180 --> 00:28:26,440 407 00:28:26,440 --> 00:28:32,470 So if the code indeed had distance d, can there be any 408 00:28:32,470 --> 00:28:36,110 two rows equal in this part? 409 00:28:36,110 --> 00:28:38,150 Remember, we punch out all d minus 1. 410 00:28:38,150 --> 00:28:40,790 411 00:28:40,790 --> 00:28:45,160 Can there be any rows in this part that are equal? 412 00:28:45,160 --> 00:28:46,130 No, right? 413 00:28:46,130 --> 00:28:47,290 Couldn't be. 414 00:28:47,290 --> 00:28:50,360 They all have to be different. 415 00:28:50,360 --> 00:28:51,560 What does that mean? 416 00:28:51,560 --> 00:28:58,070 They all have to be different, but how many different tuples 417 00:28:58,070 --> 00:29:00,650 can we have in this part? 418 00:29:00,650 --> 00:29:10,200 Well, we have at most q to the n minus d minus 1. 419 00:29:10,200 --> 00:29:11,450 That's the length here. 420 00:29:11,450 --> 00:29:14,200 421 00:29:14,200 --> 00:29:17,650 This n minus d minus 1. 422 00:29:17,650 --> 00:29:18,900 Different tuples. 423 00:29:18,900 --> 00:29:32,350 424 00:29:32,350 --> 00:29:35,130 So how can we patch that together into a relation on 425 00:29:35,130 --> 00:29:36,380 the parameters? 426 00:29:36,380 --> 00:29:42,840 427 00:29:42,840 --> 00:29:46,820 It basically says, q, this is q to the k. 428 00:29:46,820 --> 00:29:51,460 q to the k is upper bounded by this. 429 00:29:51,460 --> 00:29:56,790 430 00:29:56,790 --> 00:29:58,040 It's upper bounded by this. 431 00:29:58,040 --> 00:30:00,740 432 00:30:00,740 --> 00:30:04,660 And let me take the logarithm on here, 433 00:30:04,660 --> 00:30:11,600 and we get this relation. 434 00:30:11,600 --> 00:30:14,265 That's a first incarnation of the tension that we get on 435 00:30:14,265 --> 00:30:16,950 code construction, on codes. 436 00:30:16,950 --> 00:30:18,270 And bound on this, at least. 437 00:30:18,270 --> 00:30:22,730 If you choose d large, the distance large, 438 00:30:22,730 --> 00:30:25,930 then k has to go. 439 00:30:25,930 --> 00:30:30,700 If you choose k large, the distance cannot be very large. 440 00:30:30,700 --> 00:30:34,620 So this is where we, for the first time, see this tension. 441 00:30:34,620 --> 00:30:38,210 And it's also important, I'm sorry that I run around like 442 00:30:38,210 --> 00:30:47,605 this here, n has to be at least k plus d, k 443 00:30:47,605 --> 00:30:48,960 plus d minus 1. 444 00:30:48,960 --> 00:30:51,590 445 00:30:51,590 --> 00:31:02,040 So here, you see this 28 in 3, it would just satisfy this. 446 00:31:02,040 --> 00:31:05,580 It would just satisfy this. 447 00:31:05,580 --> 00:31:06,830 So do we know it exists? 448 00:31:06,830 --> 00:31:09,180 449 00:31:09,180 --> 00:31:10,430 No. 450 00:31:10,430 --> 00:31:11,690 No, why would it? 451 00:31:11,690 --> 00:31:18,710 So far, we only have looked at this here, and so, if it would 452 00:31:18,710 --> 00:31:20,420 exist, it would have to satisfy that. 453 00:31:20,420 --> 00:31:23,110 But there's no reason to assume it exists. 454 00:31:23,110 --> 00:31:26,250 At the moment, at least. 455 00:31:26,250 --> 00:31:27,680 OK. 456 00:31:27,680 --> 00:31:29,180 This is called the Singleton bound. 457 00:31:29,180 --> 00:31:44,290 458 00:31:44,290 --> 00:32:02,570 Any code over any field, phi n, this relationship on the 459 00:32:02,570 --> 00:32:03,440 parameters. 460 00:32:03,440 --> 00:32:04,690 Good. 461 00:32:04,690 --> 00:32:11,040 462 00:32:11,040 --> 00:32:36,740 Any code satisfying and bound with equality is called MDS. 463 00:32:36,740 --> 00:32:39,750 So we have an MDS code if and only if it satisfies the 464 00:32:39,750 --> 00:32:42,070 Singleton bound with equality. 465 00:32:42,070 --> 00:32:44,670 That's the definition of MDS codes. 466 00:32:44,670 --> 00:32:48,280 And now it makes maybe a little bit more sense to talk 467 00:32:48,280 --> 00:32:51,530 about Maximum Distance Separable codes, well, in a 468 00:32:51,530 --> 00:32:55,020 sense, they have the maximum distance among all codes. 469 00:32:55,020 --> 00:32:58,300 You find all codes with the given n and k, if they're MDS, 470 00:32:58,300 --> 00:33:01,780 they have the maximum distance. 471 00:33:01,780 --> 00:33:05,720 OK, let's think about this a little here. 472 00:33:05,720 --> 00:33:09,080 473 00:33:09,080 --> 00:33:11,000 AUDIENCE: [INAUDIBLE] 474 00:33:11,000 --> 00:33:13,925 dependence on q [UNINTELLIGIBLE]? 475 00:33:13,925 --> 00:33:15,850 PROFESSOR: Yeah, there's a very strong dependence on q. 476 00:33:15,850 --> 00:33:17,800 The bound, not. 477 00:33:17,800 --> 00:33:20,610 The bound has no dependence on q. 478 00:33:20,610 --> 00:33:23,800 If the guys exist or not, that's very much 479 00:33:23,800 --> 00:33:24,680 dependent on q. 480 00:33:24,680 --> 00:33:26,185 We'll get to that. 481 00:33:26,185 --> 00:33:28,000 AUDIENCE: [INAUDIBLE] when the q is large, we have more 482 00:33:28,000 --> 00:33:30,090 options to [UNINTELLIGIBLE]? 483 00:33:30,090 --> 00:33:30,445 PROFESSOR: Absolutely. 484 00:33:30,445 --> 00:33:31,190 Absolutely. 485 00:33:31,190 --> 00:33:39,180 For binary, there's a very simple argument to show that 486 00:33:39,180 --> 00:33:43,200 there are no binary MDS codes except for the parity-check 487 00:33:43,200 --> 00:33:45,790 codes and the repetition codes and trivial code. 488 00:33:45,790 --> 00:33:53,290 489 00:33:53,290 --> 00:34:09,380 So say we have a binary code, a binary n,k,d code with a 490 00:34:09,380 --> 00:34:12,150 generator matrix. 491 00:34:12,150 --> 00:34:13,929 So what could the generator matrix be? 492 00:34:13,929 --> 00:34:22,159 There will be an identity part, and then there will be 493 00:34:22,159 --> 00:34:26,270 the rest of the generator matrix, and how could we 494 00:34:26,270 --> 00:34:30,489 possibly fill that in, in order to make it MDS? 495 00:34:30,489 --> 00:34:39,460 Because this is n, this is k, and we see in order to make it 496 00:34:39,460 --> 00:34:46,730 MDS, every single row has to have all entries equal to 1. 497 00:34:46,730 --> 00:34:50,250 Because if not all entries are equal to 1, here, then we 498 00:34:50,250 --> 00:34:53,780 immediately have exhibited a code word with a weight less 499 00:34:53,780 --> 00:34:57,560 than n minus k plus 1. 500 00:34:57,560 --> 00:35:01,420 So OK, we know the first row has to have all 1's. 501 00:35:01,420 --> 00:35:05,110 Because now, the weight of this row is exactly on the MDS 502 00:35:05,110 --> 00:35:05,950 [INAUDIBLE]. 503 00:35:05,950 --> 00:35:07,200 What about the next one? 504 00:35:07,200 --> 00:35:09,490 505 00:35:09,490 --> 00:35:11,080 The next one, same thing. 506 00:35:11,080 --> 00:35:13,880 507 00:35:13,880 --> 00:35:16,030 All entries have to be 1. 508 00:35:16,030 --> 00:35:17,500 But now we see the problem, right? 509 00:35:17,500 --> 00:35:20,050 Now we add those two guys, it should again be a code word, 510 00:35:20,050 --> 00:35:22,220 and we have a grade two code word. 511 00:35:22,220 --> 00:35:29,370 So this is, in a nutshell, to prove that there are no binary 512 00:35:29,370 --> 00:35:32,410 MDS codes except the trivial ones. 513 00:35:32,410 --> 00:35:43,390 So the trivial ones are n, n1, n, n minus 1, 2 and n1, n. 514 00:35:43,390 --> 00:35:45,920 These are the trivial ones. 515 00:35:45,920 --> 00:35:48,355 The space itself, so it's in a parity-check code, and the 516 00:35:48,355 --> 00:35:50,890 repetition code. 517 00:35:50,890 --> 00:35:53,510 These are the only binary MDS codes. 518 00:35:53,510 --> 00:35:57,270 And the argument is roughly there. 519 00:35:57,270 --> 00:35:58,520 OK, where was I? 520 00:35:58,520 --> 00:36:03,170 521 00:36:03,170 --> 00:36:05,000 Yeah, let's think about this a little bit more. 522 00:36:05,000 --> 00:36:07,370 And we are getting to exactly your question about the 523 00:36:07,370 --> 00:36:08,620 [UNINTELLIGIBLE]. 524 00:36:08,620 --> 00:36:11,380 525 00:36:11,380 --> 00:36:17,380 This here has to hold, this argument has to hold 526 00:36:17,380 --> 00:36:23,300 regardless of which d minus 1 positions we punch out. 527 00:36:23,300 --> 00:36:25,520 This argument has always to hold. 528 00:36:25,520 --> 00:36:29,100 Which means, think about it, it's an enormously strong 529 00:36:29,100 --> 00:36:32,030 combinatorial condition on the code. 530 00:36:32,030 --> 00:36:39,620 So you have a code, that means you have a code, you write it 531 00:36:39,620 --> 00:36:42,990 in a matrix like this, all the code words. 532 00:36:42,990 --> 00:36:49,000 You punch out an arbitrary collection of d minus 1one 533 00:36:49,000 --> 00:36:53,920 positions, and the rest, the remaining positions, have to 534 00:36:53,920 --> 00:36:58,100 make up the entire space here. 535 00:36:58,100 --> 00:37:02,910 The entire space Fqn minus to that right exponent. 536 00:37:02,910 --> 00:37:04,340 This is a very -- 537 00:37:04,340 --> 00:37:07,830 think about it, I mean, just writing down this is an 538 00:37:07,830 --> 00:37:11,290 enormously strong combinatorial condition. 539 00:37:11,290 --> 00:37:17,870 So that will actually lead to the codes existing only for a 540 00:37:17,870 --> 00:37:22,820 very, very special, for a subset of field sizes. 541 00:37:22,820 --> 00:37:26,360 In particular, like you said, we have to have enough freedom 542 00:37:26,360 --> 00:37:31,800 in the field size to fill up this matrix to satisfy this. 543 00:37:31,800 --> 00:37:34,360 544 00:37:34,360 --> 00:37:42,230 OK, before we get to that, before I say a word about the 545 00:37:42,230 --> 00:37:50,880 field size, let me formalize what I just said here, namely, 546 00:37:50,880 --> 00:37:54,850 that all of the other positions have to be exactly 547 00:37:54,850 --> 00:37:59,450 the q to the n minus d minus 1, different tuples. 548 00:37:59,450 --> 00:38:17,610 And the definition, let the code with q to the k, code 549 00:38:17,610 --> 00:38:30,190 words over alphabet Fq. 550 00:38:30,190 --> 00:38:35,410 551 00:38:35,410 --> 00:39:02,990 Let subset of the positions in C, i is called an information 552 00:39:02,990 --> 00:39:30,970 set if C constrained to i runs exactly through all the q to 553 00:39:30,970 --> 00:39:37,395 the k, runs through all the q to the k. 554 00:39:37,395 --> 00:39:46,060 555 00:39:46,060 --> 00:39:48,570 Fqk. 556 00:39:48,570 --> 00:39:53,300 So what it means is, you have a code, and you have a subset 557 00:39:53,300 --> 00:39:58,960 of positions, maybe this one, this one, this one, this one. 558 00:39:58,960 --> 00:40:03,090 This is a subset of positions if the code words. 559 00:40:03,090 --> 00:40:06,230 So if the matrix that remains after you take out the 560 00:40:06,230 --> 00:40:10,470 punctured columns, runs through all the q to the k 561 00:40:10,470 --> 00:40:14,298 elements of Fqk, then this is an information search. 562 00:40:14,298 --> 00:40:15,548 AUDIENCE: [INAUDIBLE] 563 00:40:15,548 --> 00:40:18,040 564 00:40:18,040 --> 00:40:21,870 PROFESSOR: Constrained to i, because i has size k. 565 00:40:21,870 --> 00:40:25,275 i is just -- its just about enough to describe every code 566 00:40:25,275 --> 00:40:31,830 word, if the restraint of C to the set would indeed be giving 567 00:40:31,830 --> 00:40:35,700 a unique vector for each code word. 568 00:40:35,700 --> 00:40:38,210 The reason to call it -- so this is the definition of 569 00:40:38,210 --> 00:40:40,220 information set. 570 00:40:40,220 --> 00:40:44,910 The reason to call it an information set, it's pretty 571 00:40:44,910 --> 00:40:45,620 straight, right? 572 00:40:45,620 --> 00:40:47,685 Why is it called an information set? 573 00:40:47,685 --> 00:40:53,740 574 00:40:53,740 --> 00:40:55,500 Because it's enough, right? 575 00:40:55,500 --> 00:40:56,250 Because it's enough. 576 00:40:56,250 --> 00:40:59,320 If you know exactly the value of a code word in these 577 00:40:59,320 --> 00:41:02,820 positions, then it is enough to recover 578 00:41:02,820 --> 00:41:04,706 the entire code word. 579 00:41:04,706 --> 00:41:07,210 When some genie tells you, gives you a code world which 580 00:41:07,210 --> 00:41:10,110 was corrupted by noise or something, but tells you, 581 00:41:10,110 --> 00:41:12,930 these k positions are OK. 582 00:41:12,930 --> 00:41:14,030 That's enough, that's all you need. 583 00:41:14,030 --> 00:41:15,050 That's an information set. 584 00:41:15,050 --> 00:41:17,420 You can recover the information from them. 585 00:41:17,420 --> 00:41:21,130 Actually, it is an application that pops up sometimes. 586 00:41:21,130 --> 00:41:23,870 That somehow, you get side information about some 587 00:41:23,870 --> 00:41:26,230 positions in the code word indeed being correct, and 588 00:41:26,230 --> 00:41:27,560 others not. 589 00:41:27,560 --> 00:41:30,340 And others you don't know about. 590 00:41:30,340 --> 00:41:35,755 So that's the information set, with respect to our MDS code. 591 00:41:35,755 --> 00:41:38,300 592 00:41:38,300 --> 00:41:45,180 So with respect to our MDS code, a corollary of the thing 593 00:41:45,180 --> 00:42:14,410 that involved any k positions in an MDS code, 594 00:42:14,410 --> 00:42:15,660 an information set. 595 00:42:15,660 --> 00:42:22,170 596 00:42:22,170 --> 00:42:25,380 So any k positions on information set. 597 00:42:25,380 --> 00:42:26,910 It's a really strong property. 598 00:42:26,910 --> 00:42:30,000 Really strong combinatorial property. 599 00:42:30,000 --> 00:42:33,420 OK, so far, so good. 600 00:42:33,420 --> 00:42:42,470 601 00:42:42,470 --> 00:42:46,970 This is so strong, this property, that we can say 602 00:42:46,970 --> 00:42:50,700 something about these codes even without even knowing if 603 00:42:50,700 --> 00:42:51,950 they exist. 604 00:42:51,950 --> 00:42:54,210 605 00:42:54,210 --> 00:42:57,320 So, so far, we have talked about these codes as if we 606 00:42:57,320 --> 00:42:59,080 knew they existed. 607 00:42:59,080 --> 00:43:01,890 Well, it's not entirely trivial, since we know those 608 00:43:01,890 --> 00:43:03,400 guys here exist. 609 00:43:03,400 --> 00:43:05,710 So it's not entirely empty, we're not out in 610 00:43:05,710 --> 00:43:08,250 cuckoo space, here. 611 00:43:08,250 --> 00:43:12,750 But do any other one exist, except for those? 612 00:43:12,750 --> 00:43:15,480 That's the question. 613 00:43:15,480 --> 00:43:16,470 We don't know that yet. 614 00:43:16,470 --> 00:43:19,920 We will show in a little while that they do, but we don't 615 00:43:19,920 --> 00:43:21,170 know that yet. 616 00:43:21,170 --> 00:43:23,960 617 00:43:23,960 --> 00:43:26,160 But the interesting part is that we can derive properties 618 00:43:26,160 --> 00:43:29,870 of those codes without even knowing they exist. 619 00:43:29,870 --> 00:43:31,470 And how do we do that? 620 00:43:31,470 --> 00:43:40,470 For example, we want to derive the following property, how 621 00:43:40,470 --> 00:44:00,620 many words of weight d exists in linear MDS code? 622 00:44:00,620 --> 00:44:01,960 One could ask that, right? 623 00:44:01,960 --> 00:44:05,280 If they exist, they're nice, and if they exist, we also 624 00:44:05,280 --> 00:44:09,820 want to know how many words do exist at minimum distance. 625 00:44:09,820 --> 00:44:12,460 Because that translates, again, directly into union 626 00:44:12,460 --> 00:44:17,060 bound arguments later on, and probability of error. 627 00:44:17,060 --> 00:44:18,210 So that's a good question. 628 00:44:18,210 --> 00:44:23,300 How many words of weight d exist in a MDS code? 629 00:44:23,300 --> 00:44:28,630 Let's call this n d, and we want to know how many. 630 00:44:28,630 --> 00:44:34,550 631 00:44:34,550 --> 00:44:37,000 So I'll let you think about this for a sec while I erase 632 00:44:37,000 --> 00:44:40,130 the board, and then somebody will tell me the answer. 633 00:44:40,130 --> 00:44:52,058 634 00:44:52,058 --> 00:44:53,984 So how can we think about this? 635 00:44:53,984 --> 00:45:00,290 636 00:45:00,290 --> 00:45:04,030 Let's try to do a similar argument as this one. 637 00:45:04,030 --> 00:45:18,780 Let's look at a single word, and let's assume that d 638 00:45:18,780 --> 00:45:23,600 positions, we ask the questions does there exist a 639 00:45:23,600 --> 00:45:27,730 code word within the first d positions? 640 00:45:27,730 --> 00:45:29,970 It is equivalent to the question, does there exist a 641 00:45:29,970 --> 00:45:35,410 code word that covers exactly all d positions? 642 00:45:35,410 --> 00:45:37,140 Any set of d positions. 643 00:45:37,140 --> 00:45:43,522 644 00:45:43,522 --> 00:45:44,480 AUDIENCE: 0 everywhere else? 645 00:45:44,480 --> 00:45:47,400 PROFESSOR: And 0 everywhere else. 646 00:45:47,400 --> 00:45:48,660 Why is that nice? 647 00:45:48,660 --> 00:45:51,880 If you could prove that, that there exists a word for all d 648 00:45:51,880 --> 00:45:57,800 positions, because then, we pretty much know what happens. 649 00:45:57,800 --> 00:46:02,837 Then we know that, well, if this is true, then there are n 650 00:46:02,837 --> 00:46:04,700 choose d ways to choose those d positions. 651 00:46:04,700 --> 00:46:08,210 652 00:46:08,210 --> 00:46:11,810 And then within those d positions, and since it's a 653 00:46:11,810 --> 00:46:16,690 linear code, we can multiply with the q minus 1 on the 654 00:46:16,690 --> 00:46:17,940 repeated element. 655 00:46:17,940 --> 00:46:20,200 656 00:46:20,200 --> 00:46:23,540 So if we can choose our d positions arbitrarily, then 657 00:46:23,540 --> 00:46:28,540 this is the number over words at distance d. 658 00:46:28,540 --> 00:46:31,540 So let's look at a word, and let's, without loss of 659 00:46:31,540 --> 00:46:33,515 generality, assume it's a first d positions. 660 00:46:33,515 --> 00:46:36,960 661 00:46:36,960 --> 00:46:39,770 So the first d positions. 662 00:46:39,770 --> 00:46:43,710 So in particular, these would be the first d minus 1 663 00:46:43,710 --> 00:46:58,200 positions, which would mean that this have length k. 664 00:46:58,200 --> 00:47:04,450 So if we have an MDS code, this is an information set. 665 00:47:04,450 --> 00:47:08,020 So if this is an information set, then we can fill up this 666 00:47:08,020 --> 00:47:11,880 thing with just about anything we want. 667 00:47:11,880 --> 00:47:18,330 So we choose this information set to be equal to 1. 668 00:47:18,330 --> 00:47:21,360 This is how we choose this information set, and by the 669 00:47:21,360 --> 00:47:24,840 property of MDS code, we are guaranteed that there exists a 670 00:47:24,840 --> 00:47:27,576 code word which is this part in the information set. 671 00:47:27,576 --> 00:47:30,590 672 00:47:30,590 --> 00:47:34,070 But we also are guaranteed it's a weight d word, right? 673 00:47:34,070 --> 00:47:37,240 The minimum distance is d, that means all of these m 674 00:47:37,240 --> 00:47:44,680 entries here, they must all be non-zero, in this part. 675 00:47:44,680 --> 00:47:47,370 Otherwise, it wouldn't have weight d. 676 00:47:47,370 --> 00:47:48,330 OK? 677 00:47:48,330 --> 00:47:49,435 And there we have it. 678 00:47:49,435 --> 00:47:51,180 That was all we needed to show. 679 00:47:51,180 --> 00:47:51,910 Right? 680 00:47:51,910 --> 00:47:56,160 Because now we have shown that there exists a word of weight 681 00:47:56,160 --> 00:47:59,840 d in the first d positions. 682 00:47:59,840 --> 00:48:01,090 Is that clear? 683 00:48:01,090 --> 00:48:14,650 684 00:48:14,650 --> 00:48:15,250 Let's try again. 685 00:48:15,250 --> 00:48:16,270 I will say the same words. 686 00:48:16,270 --> 00:48:17,575 Maybe it becomes clearer by that. 687 00:48:17,575 --> 00:48:22,710 688 00:48:22,710 --> 00:48:24,130 Let's look at a code word. 689 00:48:24,130 --> 00:48:27,790 This is a generic code word, at first, and we want to 690 00:48:27,790 --> 00:48:31,410 answer the question, does there exist a code word 691 00:48:31,410 --> 00:48:34,240 within, which has support only in the first d positions? 692 00:48:34,240 --> 00:48:36,890 693 00:48:36,890 --> 00:48:42,026 So does there exist a code word which is non-zero here, 694 00:48:42,026 --> 00:48:46,626 up to d, and which is zero everywhere else? 695 00:48:46,626 --> 00:48:49,225 That's the question we want to answer. 696 00:48:49,225 --> 00:48:55,575 697 00:48:55,575 --> 00:48:55,817 OK. 698 00:48:55,817 --> 00:48:57,930 Now here's what we do. 699 00:48:57,930 --> 00:48:59,700 We look at this road and say, you know what? 700 00:48:59,700 --> 00:49:04,890 Let's look at the last k positions, which have an 701 00:49:04,890 --> 00:49:08,380 overlap of 1 with this word here, 702 00:49:08,380 --> 00:49:10,130 because it's an MDS property. 703 00:49:10,130 --> 00:49:14,160 So we have this relation between n, k, and d. 704 00:49:14,160 --> 00:49:17,760 And since any k positions in the word are in information 705 00:49:17,760 --> 00:49:24,210 set, so we can choose whatever we want in this part, and this 706 00:49:24,210 --> 00:49:25,460 is what we choose. 707 00:49:25,460 --> 00:49:28,030 708 00:49:28,030 --> 00:49:33,050 By the property of MDS codes, this corollary, we are 709 00:49:33,050 --> 00:49:37,600 guaranteed there exists the code word which in the second 710 00:49:37,600 --> 00:49:41,180 half of the code word looks like this. 711 00:49:41,180 --> 00:49:44,570 And in the first half, it looks different. 712 00:49:44,570 --> 00:49:48,110 There's something else here, and I say, well it cannot have 713 00:49:48,110 --> 00:49:51,640 any 0 in here, because then it would have 714 00:49:51,640 --> 00:49:53,610 weighed less that d. 715 00:49:53,610 --> 00:49:56,330 So it has non-zeros here. 716 00:49:56,330 --> 00:50:00,780 So indeed, we have shown the existence of a code word which 717 00:50:00,780 --> 00:50:05,630 has non-zeroes in the first d positions. 718 00:50:05,630 --> 00:50:07,870 Very simple. 719 00:50:07,870 --> 00:50:10,320 And that was without loss of generality. 720 00:50:10,320 --> 00:50:14,310 You could make the same argument for any d positions. 721 00:50:14,310 --> 00:50:15,250 What have we shown? 722 00:50:15,250 --> 00:50:19,480 We have shown that indeed, we can choose any d positions in 723 00:50:19,480 --> 00:50:22,325 the code to support the minimum weight code 724 00:50:22,325 --> 00:50:25,540 word of weight d. 725 00:50:25,540 --> 00:50:29,440 This is how many ways we can choose this, then we have to 726 00:50:29,440 --> 00:50:32,840 multiply it with q minus 1. 727 00:50:32,840 --> 00:50:34,970 All non-zero field elements. 728 00:50:34,970 --> 00:50:37,550 The reason is, we might have chosen this, or we might have 729 00:50:37,550 --> 00:50:40,920 chosen omega or omega squared here, or just the multiples, 730 00:50:40,920 --> 00:50:42,812 the scalar multiples of it. 731 00:50:42,812 --> 00:50:47,320 732 00:50:47,320 --> 00:50:48,740 Interesting, right? 733 00:50:48,740 --> 00:50:54,230 This property, that any k positions on information set 734 00:50:54,230 --> 00:50:56,520 is really strong enough to prove the -- 735 00:50:56,520 --> 00:50:58,570 actually, it's strong enough to prove the entire weight 736 00:50:58,570 --> 00:51:00,980 distribution of an MDS code. 737 00:51:00,980 --> 00:51:03,175 AUDIENCE: [INAUDIBLE] 738 00:51:03,175 --> 00:51:04,425 [UNINTELLIGIBLE]? 739 00:51:04,425 --> 00:51:06,870 740 00:51:06,870 --> 00:51:10,030 PROFESSOR: No, no, no, why no, no, no, no. 741 00:51:10,030 --> 00:51:13,660 742 00:51:13,660 --> 00:51:16,190 So then you would get too much. 743 00:51:16,190 --> 00:51:21,000 If you write q minus 1 to the d, then you would want to 744 00:51:21,000 --> 00:51:22,920 multiply each position with a different value. 745 00:51:22,920 --> 00:51:26,590 746 00:51:26,590 --> 00:51:31,440 That would imply that there's more than one code word in the 747 00:51:31,440 --> 00:51:33,380 first d positions. 748 00:51:33,380 --> 00:51:36,790 More than one code word so that they are not scalar 749 00:51:36,790 --> 00:51:38,900 multiples of each other. 750 00:51:38,900 --> 00:51:42,060 If that would be true, then you could find a linear 751 00:51:42,060 --> 00:51:45,940 combination which is still 0 in this part, but has 752 00:51:45,940 --> 00:51:48,470 additional 0 here somewhere. 753 00:51:48,470 --> 00:51:50,470 But if that is true, then we don't have 754 00:51:50,470 --> 00:51:51,730 enough distance anymore. 755 00:51:51,730 --> 00:51:52,980 Then it's not an MDS code. 756 00:51:52,980 --> 00:51:56,630 757 00:51:56,630 --> 00:51:59,850 All right, so it's indeed q minus 1. 758 00:51:59,850 --> 00:52:02,870 Within each d positions, we have one dimensional space. 759 00:52:02,870 --> 00:52:06,184 It's just one dimension. 760 00:52:06,184 --> 00:52:07,434 AUDIENCE: [INAUDIBLE] 761 00:52:07,434 --> 00:52:09,940 762 00:52:09,940 --> 00:52:11,250 far off minimum weight code words? 763 00:52:11,250 --> 00:52:14,230 764 00:52:14,230 --> 00:52:15,690 PROFESSOR: Yeah, yeah, definitely. 765 00:52:15,690 --> 00:52:18,440 766 00:52:18,440 --> 00:52:20,665 Any other code must have less, so it would have less. 767 00:52:20,665 --> 00:52:23,950 768 00:52:23,950 --> 00:52:25,310 But every other code would have a 769 00:52:25,310 --> 00:52:26,560 smaller minimum distance. 770 00:52:26,560 --> 00:52:31,803 771 00:52:31,803 --> 00:52:34,629 AUDIENCE: [INAUDIBLE]. 772 00:52:34,629 --> 00:52:40,172 Suppose we let the last k minus 1 position zero, and the 773 00:52:40,172 --> 00:52:43,670 one before that, [UNINTELLIGIBLE PHRASE]. 774 00:52:43,670 --> 00:52:47,540 And you said that we can do it for any of [UNINTELLIGIBLE] 775 00:52:47,540 --> 00:52:48,490 total field? 776 00:52:48,490 --> 00:52:49,786 PROFESSOR: Sure. 777 00:52:49,786 --> 00:52:55,770 AUDIENCE: Since it's a linear code, some of those code words 778 00:52:55,770 --> 00:52:59,220 should be in the linear code, right? 779 00:52:59,220 --> 00:53:01,240 PROFESSOR: Sure. 780 00:53:01,240 --> 00:53:05,030 AUDIENCE: So because it's a field, also we are going to 781 00:53:05,030 --> 00:53:06,700 [INAUDIBLE] 782 00:53:06,700 --> 00:53:12,830 there exists an inverse [UNINTELLIGIBLE]? 783 00:53:12,830 --> 00:53:14,230 PROFESSOR: Absolutely. 784 00:53:14,230 --> 00:53:20,740 AUDIENCE: So if we add those two code words, we should have 785 00:53:20,740 --> 00:53:22,510 all zero, [UNINTELLIGIBLE] 786 00:53:22,510 --> 00:53:27,590 k minus 1, and have inverse at the one before. 787 00:53:27,590 --> 00:53:34,910 We get that code word which has a minimum weight, which is 788 00:53:34,910 --> 00:53:37,862 less that the one we have here? 789 00:53:37,862 --> 00:53:39,030 PROFESSOR: Good question. 790 00:53:39,030 --> 00:53:40,670 There is a trick out. 791 00:53:40,670 --> 00:53:43,640 There's a way out of this. 792 00:53:43,640 --> 00:53:45,610 Great argument. 793 00:53:45,610 --> 00:53:46,895 But there's a trick out. 794 00:53:46,895 --> 00:53:47,990 AUDIENCE: There's gotta be an upper bound 795 00:53:47,990 --> 00:53:49,390 PROFESSOR: No, no, there is a trick, there 796 00:53:49,390 --> 00:53:51,220 is a way out here. 797 00:53:51,220 --> 00:53:55,580 Namely, so let's put it like this. 798 00:53:55,580 --> 00:53:59,880 Right here we put in a 1, just for simplicity, let's assume 799 00:53:59,880 --> 00:54:01,550 all the other positions are also 1. 800 00:54:01,550 --> 00:54:04,160 801 00:54:04,160 --> 00:54:07,540 And then you say, this would be another code word, which 802 00:54:07,540 --> 00:54:11,300 has here an omega. 803 00:54:11,300 --> 00:54:12,300 I say, you know what? 804 00:54:12,300 --> 00:54:12,850 What's going to happen? 805 00:54:12,850 --> 00:54:15,330 All the other positions are going to be omega 2. 806 00:54:15,330 --> 00:54:18,110 807 00:54:18,110 --> 00:54:20,790 There's no way to combine these two guys to get an 808 00:54:20,790 --> 00:54:23,650 additional 0, unless you get all 0's. 809 00:54:23,650 --> 00:54:24,800 Unless you get to 0. 810 00:54:24,800 --> 00:54:26,250 That's what I said, it's a one-dimensional 811 00:54:26,250 --> 00:54:28,030 space in these positions. 812 00:54:28,030 --> 00:54:29,280 When it's a soft code . 813 00:54:29,280 --> 00:54:31,889 814 00:54:31,889 --> 00:54:33,139 AUDIENCE: [UNINTELLIGIBLE PHRASE] 815 00:54:33,139 --> 00:54:39,710 816 00:54:39,710 --> 00:54:43,870 PROFESSOR: It tells you that if you write down the minimum 817 00:54:43,870 --> 00:54:49,520 weight code words in the q minus 1 times d matrix, that 818 00:54:49,520 --> 00:54:53,620 is, you have a Latin square, basically. 819 00:54:53,620 --> 00:54:54,790 That's what it tells you. 820 00:54:54,790 --> 00:54:58,460 There's in no position, if you have anywhere in here an 821 00:54:58,460 --> 00:55:03,390 element alpha and element omega, the same omega pops up 822 00:55:03,390 --> 00:55:04,640 nowhere else. 823 00:55:04,640 --> 00:55:07,770 824 00:55:07,770 --> 00:55:12,410 There's ramifications of MDS codes in combinatorics left 825 00:55:12,410 --> 00:55:14,060 and right, so this would be a Latin square. 826 00:55:14,060 --> 00:55:18,390 827 00:55:18,390 --> 00:55:21,320 You know, you can learn a lot a lot about MDS codes if you 828 00:55:21,320 --> 00:55:23,750 think a little bit about that, and about combinatorics 829 00:55:23,750 --> 00:55:25,220 altogether. 830 00:55:25,220 --> 00:55:26,760 OK, where was I? 831 00:55:26,760 --> 00:55:29,608 So we know that's fun. 832 00:55:29,608 --> 00:55:38,250 And actually, in the homework, you going to do n d plus 1. 833 00:55:38,250 --> 00:55:40,900 834 00:55:40,900 --> 00:55:42,900 So the next one. 835 00:55:42,900 --> 00:55:49,100 But once you do n d plus 1, do all of them. 836 00:55:49,100 --> 00:55:52,370 In a sense it's just inclusion and exclusion from then on. 837 00:55:52,370 --> 00:55:54,430 The first one is sort of the toughest one, the rest is 838 00:55:54,430 --> 00:55:56,140 inclusion exclusion. 839 00:55:56,140 --> 00:56:00,610 And just for the heck of it, when you go home and do the 840 00:56:00,610 --> 00:56:02,950 homework, write them all out. 841 00:56:02,950 --> 00:56:06,910 It's a pretty looking formula, in the end. 842 00:56:06,910 --> 00:56:10,030 OK, so far, so good. 843 00:56:10,030 --> 00:56:12,955 So we have still talked about MDS codes without knowing if 844 00:56:12,955 --> 00:56:15,840 they exist. 845 00:56:15,840 --> 00:56:17,430 Except for the trivial ones here. 846 00:56:17,430 --> 00:56:21,250 847 00:56:21,250 --> 00:56:31,310 And the existence of MDS codes is actually not known for 848 00:56:31,310 --> 00:56:33,830 which parameters they exist. 849 00:56:33,830 --> 00:56:39,270 So I give you a research problem. 850 00:56:39,270 --> 00:56:56,580 The research problem is the main conjecture on MDS codes. 851 00:56:56,580 --> 00:57:00,320 852 00:57:00,320 --> 00:57:01,520 And it's always sort of tricky. 853 00:57:01,520 --> 00:57:04,330 When a research problem has a name, then 854 00:57:04,330 --> 00:57:07,590 that signifies danger. 855 00:57:07,590 --> 00:57:12,340 Then it means that it's not trivial. 856 00:57:12,340 --> 00:57:20,790 The question is, for which k d and q, for which sets of 857 00:57:20,790 --> 00:57:28,000 parameters n k d q, do MDS codes exist? 858 00:57:28,000 --> 00:57:44,250 And the conjecture this is that n k q, because in MDS 859 00:57:44,250 --> 00:57:46,930 code we can actually get rid of the d here. 860 00:57:46,930 --> 00:58:03,022 e, the longest length of an MDS code. 861 00:58:03,022 --> 00:58:17,150 The longest length of an MDS code, I mentioned k over an 862 00:58:17,150 --> 00:58:26,582 alphabet of size q. 863 00:58:26,582 --> 00:58:45,530 The conjecture is that n, k, d is less than q plus 1 for k at 864 00:58:45,530 --> 00:58:48,360 least 2, unless -- 865 00:58:48,360 --> 00:58:50,310 I always have to look that up -- 866 00:58:50,310 --> 00:58:53,394 867 00:58:53,394 --> 00:58:54,644 I think 2q. 868 00:58:54,644 --> 00:58:58,010 869 00:58:58,010 --> 00:59:07,480 And k plus 1 for k greater than q. 870 00:59:07,480 --> 00:59:11,350 871 00:59:11,350 --> 00:59:22,680 We talk about it in a second, except that n, there's 872 00:59:22,680 --> 00:59:29,380 a 3, 2 to the s. 873 00:59:29,380 --> 00:59:33,830 So if the alphabet is a power of 2, alphabet size an 874 00:59:33,830 --> 00:59:35,185 extension field of 2, basically. 875 00:59:35,185 --> 00:59:38,460 876 00:59:38,460 --> 00:59:49,640 q plus 2 and q minus 1 q to the s. 877 00:59:49,640 --> 00:59:52,740 q plus 2. 878 00:59:52,740 --> 00:59:55,680 OK, so this is the main conjecture on MDS codes. 879 00:59:55,680 --> 00:59:58,700 880 00:59:58,700 --> 01:00:03,970 Basically, it says that the length can essentially be as 881 01:00:03,970 --> 01:00:08,212 large as the alphabet size, but not larger. 882 01:00:08,212 --> 01:00:10,190 AUDIENCE: [INAUDIBLE] 883 01:00:10,190 --> 01:00:12,410 PROFESSOR: This q, yeah? 884 01:00:12,410 --> 01:00:13,490 Oh, yeah, n k q, sorry. 885 01:00:13,490 --> 01:00:15,650 It doesn't make sense otherwise. 886 01:00:15,650 --> 01:00:18,780 887 01:00:18,780 --> 01:00:22,210 So the lengths can be in the same order of magnitude as the 888 01:00:22,210 --> 01:00:23,140 alphabet size. 889 01:00:23,140 --> 01:00:27,700 That gives enough room, enough choices, to fill up this 890 01:00:27,700 --> 01:00:31,390 matrix with the information set, with the MDS property on 891 01:00:31,390 --> 01:00:34,240 the information sets. 892 01:00:34,240 --> 01:00:39,390 This is the parity-check code, this row is just taken out as 893 01:00:39,390 --> 01:00:40,640 a trivial code. 894 01:00:40,640 --> 01:00:43,970 895 01:00:43,970 --> 01:00:49,750 And then, the demon of mathematics conspired that 896 01:00:49,750 --> 01:00:51,940 this would also be true. 897 01:00:51,940 --> 01:00:55,810 So if you have an extension field of 2, and you want to 898 01:00:55,810 --> 01:01:00,720 give it a dimension three, MDS code, they exist for q plus 2. 899 01:01:00,720 --> 01:01:05,130 900 01:01:05,130 --> 01:01:05,241 Right. 901 01:01:05,241 --> 01:01:08,130 There are, of course, reasons for this, but they go pretty 902 01:01:08,130 --> 01:01:11,400 deep, why they exist for those parameters, and this is just 903 01:01:11,400 --> 01:01:14,150 mysterious. 904 01:01:14,150 --> 01:01:15,830 One can give reasons, so on another hand, 905 01:01:15,830 --> 01:01:17,080 it's just so, right. 906 01:01:17,080 --> 01:01:20,610 907 01:01:20,610 --> 01:01:22,870 There are exceptionally enough that they have names. 908 01:01:22,870 --> 01:01:29,370 The first one is the Hexacode, it's something with a 909 01:01:29,370 --> 01:01:42,230 generator matrix, and this goes over F4. 910 01:01:42,230 --> 01:01:49,110 So that's an MDS code of length six, so this is a n6, 911 01:01:49,110 --> 01:01:54,380 3, 4, MDS code over alphabet size 4. 912 01:01:54,380 --> 01:01:58,650 That's the first one, in that sequence here. 913 01:01:58,650 --> 01:02:00,790 Anyway. 914 01:02:00,790 --> 01:02:02,180 Otherwise, we have this conjecture. 915 01:02:02,180 --> 01:02:05,850 If you solve this, you are going to be rich and famous, 916 01:02:05,850 --> 01:02:10,660 you're going to live in Hollywood, and 917 01:02:10,660 --> 01:02:11,760 maybe, maybe not. 918 01:02:11,760 --> 01:02:15,530 But you're going to be probably not rich, you're 919 01:02:15,530 --> 01:02:17,730 going to be famous about a couple of hundred people who 920 01:02:17,730 --> 01:02:23,160 know about this MDS conjecture, but very smart 921 01:02:23,160 --> 01:02:25,475 people have been looking for this for a long, long time. 922 01:02:25,475 --> 01:02:26,725 OK. 923 01:02:26,725 --> 01:02:29,356 924 01:02:29,356 --> 01:02:31,260 All right, 20 minutes left. 925 01:02:31,260 --> 01:02:34,840 So it's better we define, we make sure those codes exist. 926 01:02:34,840 --> 01:02:36,964 Do we have any question about this MDS conjecture? 927 01:02:36,964 --> 01:02:46,220 928 01:02:46,220 --> 01:02:50,070 OK, last 20 minutes, let's at least make sure 929 01:02:50,070 --> 01:02:51,410 those things exist. 930 01:02:51,410 --> 01:02:52,660 Reed-Solomon codes. 931 01:02:52,660 --> 01:03:00,550 932 01:03:00,550 --> 01:03:07,210 So Reed-Solomon codes cover this case. 933 01:03:07,210 --> 01:03:09,970 They are examples of codes which lie, which 934 01:03:09,970 --> 01:03:12,690 satisfy this equality. 935 01:03:12,690 --> 01:03:14,840 OK, so how do we define Reed-Solomon codes? 936 01:03:14,840 --> 01:03:20,780 937 01:03:20,780 --> 01:03:24,590 Now, just in a true mathematician spirit, write 938 01:03:24,590 --> 01:03:27,010 down consider the following. 939 01:03:27,010 --> 01:03:31,593 Consider the following code. 940 01:03:31,593 --> 01:03:37,510 941 01:03:37,510 --> 01:03:38,760 See? 942 01:03:38,760 --> 01:03:49,100 943 01:03:49,100 --> 01:04:01,135 Beta 0 beta q minus 1. 944 01:04:01,135 --> 01:04:09,808 945 01:04:09,808 --> 01:04:14,520 The beta i are the distinct field elements, the distinct 946 01:04:14,520 --> 01:04:17,310 elements in the finite field. 947 01:04:17,310 --> 01:04:21,075 f is a polynomial. 948 01:04:21,075 --> 01:04:24,590 949 01:04:24,590 --> 01:04:33,810 f is a polynomial, and the degree is less than k. 950 01:04:33,810 --> 01:04:34,950 OK, good. 951 01:04:34,950 --> 01:04:37,280 So we have defined a code. 952 01:04:37,280 --> 01:04:40,540 So what that means is we start from polynomials. 953 01:04:40,540 --> 01:04:46,496 The set of all polynomials of degree at most k. 954 01:04:46,496 --> 01:04:48,160 So what do we know about that set? 955 01:04:48,160 --> 01:04:50,520 It's a vector space, right? 956 01:04:50,520 --> 01:04:53,000 The set of all polynomials of degree at most k. 957 01:04:53,000 --> 01:04:55,640 We can add them to get a polynomial of degree at most 958 01:04:55,640 --> 01:04:58,320 k, we can multiply them with a scalar to get a polynomial 959 01:04:58,320 --> 01:05:00,390 with degree at most k. 960 01:05:00,390 --> 01:05:03,790 It's a vector space. 961 01:05:03,790 --> 01:05:10,820 So we take this vector space and evaluate for any element 962 01:05:10,820 --> 01:05:12,090 in that vector space. 963 01:05:12,090 --> 01:05:21,740 This element in all non-zero elements of the field and we 964 01:05:21,740 --> 01:05:22,270 get a code. 965 01:05:22,270 --> 01:05:31,160 We get a set of vectors, so we get a set of 966 01:05:31,160 --> 01:05:33,830 vectors, and that -- 967 01:05:33,830 --> 01:05:35,830 AUDIENCE: [INAUDIBLE] 968 01:05:35,830 --> 01:05:37,180 PROFESSOR: Yeah, I took all elements. 969 01:05:37,180 --> 01:05:38,310 Why not? 970 01:05:38,310 --> 01:05:40,260 Why not all elements? 971 01:05:40,260 --> 01:05:43,910 Strictly speaking, I should have taken one more in order 972 01:05:43,910 --> 01:05:46,070 to get the one here. 973 01:05:46,070 --> 01:05:47,800 We can talk about that in a sec. 974 01:05:47,800 --> 01:05:49,420 But this one more element would be -- 975 01:05:49,420 --> 01:05:53,980 976 01:05:53,980 --> 01:05:55,170 so it's a code. 977 01:05:55,170 --> 01:05:56,130 First of all, it's a code. 978 01:05:56,130 --> 01:05:56,370 Right? 979 01:05:56,370 --> 01:05:58,410 We all see it's a code. 980 01:05:58,410 --> 01:06:01,200 And once you see it's a code, we ask, what are the 981 01:06:01,200 --> 01:06:02,450 parameters? 982 01:06:02,450 --> 01:06:09,310 983 01:06:09,310 --> 01:06:10,560 The parameters. 984 01:06:10,560 --> 01:06:16,640 985 01:06:16,640 --> 01:06:22,440 So length, length is the easy one. 986 01:06:22,440 --> 01:06:25,450 Well, it's q. 987 01:06:25,450 --> 01:06:26,700 What is dimension? 988 01:06:26,700 --> 01:06:29,810 989 01:06:29,810 --> 01:06:33,660 Dimension of C. What's the dimension? 990 01:06:33,660 --> 01:06:38,810 991 01:06:38,810 --> 01:06:40,480 It's a little bit tricky, that question. 992 01:06:40,480 --> 01:06:45,020 993 01:06:45,020 --> 01:06:48,550 I actually, at Illinois, we have to 994 01:06:48,550 --> 01:06:51,560 take a class on teaching. 995 01:06:51,560 --> 01:06:53,840 How to become an effective teacher. 996 01:06:53,840 --> 01:06:58,070 And one of the things they told us is that if you ask a 997 01:06:58,070 --> 01:07:03,610 question, you have to wait for 12 seconds to get an answer. 998 01:07:03,610 --> 01:07:05,760 So what's the dimension? 999 01:07:05,760 --> 01:07:07,010 There you go. 1000 01:07:07,010 --> 01:07:12,560 1001 01:07:12,560 --> 01:07:16,700 This mapping, this mapping from a vector space to a 1002 01:07:16,700 --> 01:07:17,950 vector space. 1003 01:07:17,950 --> 01:07:20,170 1004 01:07:20,170 --> 01:07:22,070 This mapping, also called evaluation 1005 01:07:22,070 --> 01:07:25,300 map, is a linear map. 1006 01:07:25,300 --> 01:07:32,810 It's a linear map, meaning that, well, let's start 1007 01:07:32,810 --> 01:07:34,140 differently. 1008 01:07:34,140 --> 01:07:35,750 Let's start differently. 1009 01:07:35,750 --> 01:07:38,900 Do any two polynomials map to the same code word? 1010 01:07:38,900 --> 01:07:42,430 1011 01:07:42,430 --> 01:07:43,140 That you know. 1012 01:07:43,140 --> 01:07:46,760 That you cannot. 1013 01:07:46,760 --> 01:07:54,860 Are there any two codes, two polynomials, so are there f of 1014 01:07:54,860 --> 01:08:10,740 x, g of x, such that f of beta 0, so that they coincide in 1015 01:08:10,740 --> 01:08:13,520 all positions? 1016 01:08:13,520 --> 01:08:15,370 No, then they would be the same, right? 1017 01:08:15,370 --> 01:08:18,700 And the reason is because if there would be something like 1018 01:08:18,700 --> 01:08:25,880 that, then you could just look at h is f of x minus g of x, 1019 01:08:25,880 --> 01:08:29,130 which is just another polynomial of degree k. 1020 01:08:29,130 --> 01:08:33,100 And this would have to vanish in all positions. 1021 01:08:33,100 --> 01:08:40,950 If k is less than q, it could not possibly vanish in all 1022 01:08:40,950 --> 01:08:44,399 positions, because then the polynomial of degree k would 1023 01:08:44,399 --> 01:08:47,200 vanish in more than k positions. 1024 01:08:47,200 --> 01:08:48,720 Fundamental theorem of algebra. 1025 01:08:48,720 --> 01:08:50,430 The very beginning. 1026 01:08:50,430 --> 01:08:54,859 So the dimension of C is indeed k, the same as the 1027 01:08:54,859 --> 01:08:58,740 dimension of this vector space. 1028 01:08:58,740 --> 01:09:00,850 The dimension of the vector space of polynomials of the 1029 01:09:00,850 --> 01:09:03,000 degree k minus 1. 1030 01:09:03,000 --> 01:09:05,950 And the distance, if k is less than q, the 1031 01:09:05,950 --> 01:09:11,072 distance is equal to q. 1032 01:09:11,072 --> 01:09:18,920 The distance, what is it? 1033 01:09:18,920 --> 01:09:21,770 Same argument, roughly the same argument. 1034 01:09:21,770 --> 01:09:27,120 I think that's a linear code, so if it's a linear code, the 1035 01:09:27,120 --> 01:09:29,490 minimum distance of the code is the same as the minimum 1036 01:09:29,490 --> 01:09:32,580 weight of a non-zero word. 1037 01:09:32,580 --> 01:09:34,580 What's the minimum weight of a non-zero word? 1038 01:09:34,580 --> 01:09:37,200 1039 01:09:37,200 --> 01:09:41,319 These are polynomials of degree k minus 1. 1040 01:09:41,319 --> 01:09:43,920 What's the minimum weight of a non-zero word? 1041 01:09:43,920 --> 01:09:46,910 Well, we start out with the weight 1, and whenever the 1042 01:09:46,910 --> 01:09:52,300 polynomial evaluates to 0, one of the weights drops out. 1043 01:09:52,300 --> 01:09:55,470 So I claim the minimum distance as the minimum 1044 01:09:55,470 --> 01:10:12,100 weight, weight of the non-zero word, and this is n minus, 1045 01:10:12,100 --> 01:10:16,090 well, if any of these polynomials vanishes in all, 1046 01:10:16,090 --> 01:10:19,990 it vanishes in at most, k minus 1 positions. 1047 01:10:19,990 --> 01:10:25,210 At most, k minus 1 of these vectors here, of these 1048 01:10:25,210 --> 01:10:27,220 entries, is equal to 0. 1049 01:10:27,220 --> 01:10:31,970 So it drops by, at most, k minus 1. 1050 01:10:31,970 --> 01:10:35,260 Drops by at most, k minus 1. 1051 01:10:35,260 --> 01:10:36,510 And there we have it. 1052 01:10:36,510 --> 01:10:39,454 1053 01:10:39,454 --> 01:10:40,380 There we have it. 1054 01:10:40,380 --> 01:10:43,114 There we have, oh, this is q. 1055 01:10:43,114 --> 01:10:46,150 1056 01:10:46,150 --> 01:10:46,960 There we have it. 1057 01:10:46,960 --> 01:10:53,280 There we have that the minimum distance of the code satisfies 1058 01:10:53,280 --> 01:10:54,598 this equation. 1059 01:10:54,598 --> 01:10:56,470 AUDIENCE: [INAUDIBLE] 1060 01:10:56,470 --> 01:10:57,640 PROFESSOR: What? 1061 01:10:57,640 --> 01:10:58,230 AUDIENCE: The dimension? 1062 01:10:58,230 --> 01:10:58,760 PROFESSOR: The dimension. 1063 01:10:58,760 --> 01:11:01,900 So, it's the same argument, roughly. 1064 01:11:01,900 --> 01:11:07,410 So I say, the dimension, so let's just say the size of the 1065 01:11:07,410 --> 01:11:11,710 code is q to the k. 1066 01:11:11,710 --> 01:11:15,730 When is the size of the q to the k, if no two elements in 1067 01:11:15,730 --> 01:11:18,540 the space evaluate to the same code word? 1068 01:11:18,540 --> 01:11:21,432 But if two of them would evaluate to the same code 1069 01:11:21,432 --> 01:11:26,930 word, then we would less size than the vector space had. 1070 01:11:26,930 --> 01:11:30,320 But if two of them evaluate to the same code word, that means 1071 01:11:30,320 --> 01:11:37,070 this is true for all four positions. 1072 01:11:37,070 --> 01:11:41,160 Then we could define a polynomial h of the 1073 01:11:41,160 --> 01:11:42,990 degree k minus 1. 1074 01:11:42,990 --> 01:11:49,645 which disappears in more than k minus 1 positions. 1075 01:11:49,645 --> 01:11:52,510 I mean, all positions. 1076 01:11:52,510 --> 01:11:56,230 Cannot be, hence the size of the code is q to the k, so 1077 01:11:56,230 --> 01:11:58,370 this is a linear map, dimension is k. 1078 01:11:58,370 --> 01:12:01,660 1079 01:12:01,660 --> 01:12:03,120 OK. 1080 01:12:03,120 --> 01:12:03,660 So cool. 1081 01:12:03,660 --> 01:12:04,600 So we have it, right? 1082 01:12:04,600 --> 01:12:06,050 We have our MDS codes. 1083 01:12:06,050 --> 01:12:07,195 They exist. 1084 01:12:07,195 --> 01:12:08,080 Here they are. 1085 01:12:08,080 --> 01:12:10,610 They are Reed-Solomon codes. 1086 01:12:10,610 --> 01:12:15,840 Not all MDS codes are Reed-Solomon codes, but the 1087 01:12:15,840 --> 01:12:19,220 ones we are interested in, they are. 1088 01:12:19,220 --> 01:12:21,400 AUDIENCE: [INAUDIBLE] 1089 01:12:21,400 --> 01:12:25,000 PROFESSOR: Well, the distance is at least this, but the MDS 1090 01:12:25,000 --> 01:12:30,270 bounds is at most this, so it's equal to this. 1091 01:12:30,270 --> 01:12:43,720 But the MDS bounds, so the MDS bound has this is. 1092 01:12:43,720 --> 01:12:46,760 So with that. 1093 01:12:46,760 --> 01:12:52,140 So it's indeed, they lie exactly bang on to this. 1094 01:12:52,140 --> 01:12:54,840 There are MDS codes, Reed-Solomon codes. 1095 01:12:54,840 --> 01:12:55,910 So that is good. 1096 01:12:55,910 --> 01:12:57,612 So we know what they are. 1097 01:12:57,612 --> 01:13:03,280 So incidentally, where do you think this one more point is 1098 01:13:03,280 --> 01:13:04,980 that you would evaluate our polynomials in? 1099 01:13:04,980 --> 01:13:11,120 1100 01:13:11,120 --> 01:13:12,400 You've heard about projective geometries? 1101 01:13:12,400 --> 01:13:16,840 1102 01:13:16,840 --> 01:13:20,400 There's one more point, it's infinity. 1103 01:13:20,400 --> 01:13:24,940 You have, basically, if you look at the numbers, in order 1104 01:13:24,940 --> 01:13:27,410 to close it up, you want to add infinity to that, too. 1105 01:13:27,410 --> 01:13:32,020 1106 01:13:32,020 --> 01:13:37,690 In order to get this one more, this one addition in length, 1107 01:13:37,690 --> 01:13:40,450 you want to evaluate this also at infinity. 1108 01:13:40,450 --> 01:13:43,440 You will have opportunity to do that in the homework. 1109 01:13:43,440 --> 01:13:45,420 I looked at the homework and I was pleased to see this 1110 01:13:45,420 --> 01:13:47,220 problem there. 1111 01:13:47,220 --> 01:13:50,190 I hope you will be pleased, too. 1112 01:13:50,190 --> 01:13:53,780 OK, all right. 1113 01:13:53,780 --> 01:13:55,030 Any questions about this? 1114 01:13:55,030 --> 01:13:58,616 1115 01:13:58,616 --> 01:14:00,200 Let's see what else I wanted to say. 1116 01:14:00,200 --> 01:14:10,300 1117 01:14:10,300 --> 01:14:14,240 Because it just gives me a few minutes to talk about a few 1118 01:14:14,240 --> 01:14:28,570 properties of Reed-Solomon codes, a few properties of 1119 01:14:28,570 --> 01:14:29,820 Reed-Solomon codes. 1120 01:14:29,820 --> 01:14:31,890 1121 01:14:31,890 --> 01:14:34,035 And what did I want to say there? 1122 01:14:34,035 --> 01:14:46,430 1123 01:14:46,430 --> 01:14:59,620 On nested codes, so an RS code with parameters n k, maybe we 1124 01:14:59,620 --> 01:15:00,945 define them [UNINTELLIGIBLE] like this. 1125 01:15:00,945 --> 01:15:09,140 1126 01:15:09,140 --> 01:15:21,965 q is properly contained, k minus 1, minus 1. 1127 01:15:21,965 --> 01:15:26,730 1128 01:15:26,730 --> 01:15:31,840 This is pretty straight from the definition of RS codes. 1129 01:15:31,840 --> 01:15:34,850 1130 01:15:34,850 --> 01:15:39,480 The set of polynomials of degree at most k minus 1 1131 01:15:39,480 --> 01:15:41,990 contains the set of polynomials of degree at 1132 01:15:41,990 --> 01:15:44,170 most k minus 1. 1133 01:15:44,170 --> 01:15:51,270 So they are nested codes, property one. 1134 01:15:51,270 --> 01:15:54,030 1135 01:15:54,030 --> 01:15:55,960 You will see this is important, that they are 1136 01:15:55,960 --> 01:15:59,170 nested codes, for various constructions where 1137 01:15:59,170 --> 01:16:01,160 Reed-Solomon codes take part in later on. 1138 01:16:01,160 --> 01:16:04,000 1139 01:16:04,000 --> 01:16:15,990 A punctured RS code is again an MDS code. 1140 01:16:15,990 --> 01:16:20,050 1141 01:16:20,050 --> 01:16:22,460 Why is that so? 1142 01:16:22,460 --> 01:16:23,710 Why is that so? 1143 01:16:23,710 --> 01:16:27,610 1144 01:16:27,610 --> 01:16:30,510 Well, you see it? 1145 01:16:30,510 --> 01:16:33,670 1146 01:16:33,670 --> 01:16:36,490 Say if you puncture a Reed-Solomon code. 1147 01:16:36,490 --> 01:16:41,640 That means we just choose to not evaluate our code in this, 1148 01:16:41,640 --> 01:16:43,450 this position. 1149 01:16:43,450 --> 01:16:44,610 And this field element. 1150 01:16:44,610 --> 01:16:45,960 Well, we just drop that coordinate. 1151 01:16:45,960 --> 01:16:48,830 1152 01:16:48,830 --> 01:16:51,810 Does anything change in the arguments we have made? 1153 01:16:51,810 --> 01:16:58,450 Well, the length is now 1 less, the dimension, well, the 1154 01:16:58,450 --> 01:17:03,630 dimension is still the same, as long as k is not larger 1155 01:17:03,630 --> 01:17:05,680 than the length of the code. 1156 01:17:05,680 --> 01:17:10,650 The distance, still the same as the length, the distance is 1157 01:17:10,650 --> 01:17:16,240 at least the length minus the number of 0's. 1158 01:17:16,240 --> 01:17:18,630 So that equation still holds. 1159 01:17:18,630 --> 01:17:20,390 Well, but that's all we needed. 1160 01:17:20,390 --> 01:17:22,360 Still MDS code. 1161 01:17:22,360 --> 01:17:25,460 So there was really no -- it was not important. 1162 01:17:25,460 --> 01:17:28,640 It was not important if you took all field elements, or a 1163 01:17:28,640 --> 01:17:31,870 subset of the field elements with MDS property. 1164 01:17:31,870 --> 01:17:34,230 That has nothing to do with it. 1165 01:17:34,230 --> 01:17:36,220 In particular, we often in the end, we often 1166 01:17:36,220 --> 01:17:40,240 will drop the 0 element. 1167 01:17:40,240 --> 01:17:44,590 We often choose not to evaluate these polynomials in 1168 01:17:44,590 --> 01:17:48,270 the 0 of the field. 1169 01:17:48,270 --> 01:17:51,775 A punctured Reed-Solomon code is an MDS code. 1170 01:17:51,775 --> 01:17:56,030 1171 01:17:56,030 --> 01:17:58,350 So what else did I want to say about this? 1172 01:17:58,350 --> 01:18:01,801 1173 01:18:01,801 --> 01:18:04,270 What else did I want to say about this? 1174 01:18:04,270 --> 01:18:09,610 1175 01:18:09,610 --> 01:18:10,860 A generator matrix. 1176 01:18:10,860 --> 01:18:17,180 1177 01:18:17,180 --> 01:18:19,060 How would a generator matrix look like? 1178 01:18:19,060 --> 01:18:28,389 1179 01:18:28,389 --> 01:18:30,160 Yeah, how would it look like? 1180 01:18:30,160 --> 01:18:32,850 1181 01:18:32,850 --> 01:18:34,260 Basically, we can come from here, right? 1182 01:18:34,260 --> 01:18:37,085 We can take the generators of that space. 1183 01:18:37,085 --> 01:18:40,360 1184 01:18:40,360 --> 01:18:43,820 So basically, we say that one -- 1185 01:18:43,820 --> 01:18:52,070 1186 01:18:52,070 --> 01:19:02,780 generate the set of polynomials, that vector space 1187 01:19:02,780 --> 01:19:04,800 of polynomials with -- 1188 01:19:04,800 --> 01:19:09,910 1189 01:19:09,910 --> 01:19:12,060 so this is the basis of that vector space. 1190 01:19:12,060 --> 01:19:14,610 1191 01:19:14,610 --> 01:19:20,840 So if we map that basis, then we get a basis of the image of 1192 01:19:20,840 --> 01:19:22,090 the mapping. 1193 01:19:22,090 --> 01:19:24,200 1194 01:19:24,200 --> 01:19:27,610 And the mapping of that basis would give this. 1195 01:19:27,610 --> 01:19:32,080 So we evaluate the function 1 in all field elements -- 1196 01:19:32,080 --> 01:19:35,960 gives us 1. 1197 01:19:35,960 --> 01:19:39,410 We evaluate the function x in all field elements. 1198 01:19:39,410 --> 01:19:41,610 This gives us the next generator of the 1199 01:19:41,610 --> 01:19:43,220 Reed-Solomon code. 1200 01:19:43,220 --> 01:19:54,880 Well, 0 gives 0, 1 gives, oh, let's write like this. 1201 01:19:54,880 --> 01:19:57,160 We evaluate it in all field elements. 1202 01:19:57,160 --> 01:20:01,740 1203 01:20:01,740 --> 01:20:04,910 These are all the field elements. 1204 01:20:04,910 --> 01:20:23,103 The next one, and this goes up to beta -- 1205 01:20:23,103 --> 01:20:28,630 1206 01:20:28,630 --> 01:20:30,745 OK, so this would be a generator matrix. 1207 01:20:30,745 --> 01:20:34,150 1208 01:20:34,150 --> 01:20:35,320 That's fine. 1209 01:20:35,320 --> 01:20:45,030 So now, in order to make things a bit more interesting, 1210 01:20:45,030 --> 01:20:46,645 do you have to stop five minutes early? 1211 01:20:46,645 --> 01:20:47,895 We just started five minutes late? 1212 01:20:47,895 --> 01:20:50,980 1213 01:20:50,980 --> 01:20:54,435 OK then, I think that's over. 1214 01:20:54,435 --> 01:20:56,550 I think it's over. 1215 01:20:56,550 --> 01:21:02,150 One more thing for you guys to think about until you reach 1216 01:21:02,150 --> 01:21:06,250 home, then the rest you do next time. 1217 01:21:06,250 --> 01:21:18,820 So let beta 0 be equal to 0 beta 1, or beta i equal to 1218 01:21:18,820 --> 01:21:21,745 omega i minus 1 where omega is primitive in the field. 1219 01:21:21,745 --> 01:21:32,690 1220 01:21:32,690 --> 01:21:41,885 Then we can write the matrix v of omega. 1221 01:21:41,885 --> 01:22:03,310 1222 01:22:03,310 --> 01:22:07,410 I tend to see that the first k columns, the first k rows of 1223 01:22:07,410 --> 01:22:09,860 this matrix would be a generator matrix of a 1224 01:22:09,860 --> 01:22:10,830 Reed-Solomon code. 1225 01:22:10,830 --> 01:22:13,330 Of course it's the same as [UNINTELLIGIBLE]. 1226 01:22:13,330 --> 01:22:20,390 If we now delete the first position, we erase the first, 1227 01:22:20,390 --> 01:22:23,580 we puncture the first position all out, and we look at the 1228 01:22:23,580 --> 01:22:25,500 rest of the matrix. 1229 01:22:25,500 --> 01:22:26,370 This factor of the matrix. 1230 01:22:26,370 --> 01:22:30,180 Does this remind anybody of anything? 1231 01:22:30,180 --> 01:22:32,740 It's a DFT, it's a Fourier transform. 1232 01:22:32,740 --> 01:22:35,980 And that's what we start with next time. 1233 01:22:35,980 --> 01:22:41,780 So think about why this is a Fourier transform. 1234 01:22:41,780 --> 01:22:45,890 And maybe that's a nice analogy. 1235 01:22:45,890 --> 01:22:47,790 So we get the distance. 1236 01:22:47,790 --> 01:22:51,560 The distance is at least something, which means it's 1237 01:22:51,560 --> 01:22:52,360 not impulsive. 1238 01:22:52,360 --> 01:22:54,000 It's not a single 1 somewhere. 1239 01:22:54,000 --> 01:22:56,400 The vector that we get is not impulsive. 1240 01:22:56,400 --> 01:22:59,680 Maybe it has something to do with the bandwidth constraint 1241 01:22:59,680 --> 01:23:02,380 and the frequency domain. 1242 01:23:02,380 --> 01:23:03,990 That's what you have to think about on the way 1243 01:23:03,990 --> 01:23:05,690 home, and that's it. 1244 01:23:05,690 --> 01:23:06,940 Thanks so much. 1245 01:23:06,940 --> 01:23:15,114