Download presentation
Presentation is loading. Please wait.
1
前回の練習問題 a = とする(|a|=36) a を長さ2のブロックに区切り,x2値を計算せよ 長さ2のパターンは4種類,それぞれ出現の期待値は4.5 |00| = 5, |01| = 1, |10| = 6, |11| = 6 x2 = 0.52/ / / /4.5 = 3.78 長さ3,4のブロックに区切り,それぞれx2値を計算せよ 長さ 3, 期待値 1.5, x2 = 2.67 長さ 4, 期待値 0.56 , x2 = 14.11 線形合同法を実装し,擬似乱数を生成せよ 上記で生成した擬似乱数を,2次元平面にプロットせよ ⇒講義ページ掲載の excel ファイル参照 quiz - let a be given as the slide - divide the sequence into blocks of length two, and compute the chi-square value ... there are four different patterns of length two, and the expected number of occurrences of the pattern is 4.5 each. The actual number of occurrences of the patterns are given as in the slide, and the chi-square value is 3.78. - do the same thing for blocks with length three and four. ... for chi-square values are 2.67 and for block length three and four, respectively. - write a program of the linear congruence method - use the program to generate a pseudorandom sequence, and plot the obtained data in a two-dimensional array these exercises are easily learned by using Excel. sample excel file can be obtained from the lecture home page.
2
第三部:情報を正確に伝える 通信途中で発生する誤りを検出し,訂正する 第二部で扱った符号化方式:情報源符号化 (source coding)
情報源に近いところで,通報を効率よく符号化(圧縮) これから扱う符号化方式:通信路符号化 (channel coding) 信頼性の低い情報伝達媒体を介した情報通信を考える 効率も大事だが,情報を正確に相手に伝えることが最大課題 part 3 of the lecture: transmit information correctly We learn techniques to detect and correct errors which occur in a communication channel. In the second part of the lecture, we learned source coding, which is to represent information in a compact manner. The coding we learn in this part 3 is a channel coding. Assume that we have an unreliable communication channel which may fail to transmit information correctly. The main concern of ours in this situation is the “correctness” of the received information.
3
誤り制御方式について:動機 現実世界における情報伝達: 有線・無線の通信,ハードディスク,CD,メモリ...
送った(記録した)ものが,正確に受け取られる保証はない 伝達過程で誤りが発生し間違った情報が相手に届くことも 誤りが発生しないようにするのがベスト...現実には難しい ⇒ 誤りの影響を,できるだけ小さくするようにできないか? 誤りの発生を検出して,再送信を要求する 前後の文脈から本来の情報を推測し,誤り訂正する motivation In this real world, there are plenty of different communication channels but most of them are erroneous. There is no guarantee that the transmitted information is received correctly at the receiver’s side. Errors which occur in the channel alter the transmitted data. It will be nice if we can improve the channel so that it does not cause any error, but it is not practical unfortunately. We need to accept the fact that errors do occur, and we should investigate techniques which are to minimize the affects of errors. We may be able to request re-transmission of data if we can detect errors. Moreover, we may be able to correct errors if we devise smarter coding techniques.
4
基本原理:いろはの「い」 電話などでの符丁:聞き間違い防止のための「符号化」 いろはの「い」: 本当に伝えたい情報は,「い」一文字だけ
「いろはの」の部分は,誤り訂正のための冗長な情報 principle: Alpha Bravo, Charlie Phonetic codes are sometimes used to prevent possible errors over telephone conversation. Instead of saying “a”, “b” and “c”, you may say Alpha, Bravo and Charlie. Note that this is NOT an efficient way to transmit information. The word “Alpha” contains one important symbol “A” and also contains four redundant symbols “lpha”. These redundant information contributes to increase the “distance” between two valid representation of information. “a” and “b” might be easily confused but “alpha” and “bravo” will not be confused because there is “big distance” between the two words. い り いろはの「い」 りんごの「り」 距離が近い 距離が遠い 「正しい表現」間の距離が大きくなるよう,冗長情報を付加 冗長情報は,誤りの検出・訂正に利用される
5
0と1の世界での誤り訂正符号 2進数2ビットの情報を送信したい そのまま送ると... 受信側で01を受信したとしても,
もともと01が送信されたのか 誤りが発生して01になったのか 判別できない 00 01 10 11 We would like to realize the phonetic code for binary data. Assume that we want to transmit two bits information. If the two bits are transmitted as is, then there is no way to distinguish the erroneous transmission. Even if you receive “01”, you cannot sure if really 01 was transmitted, or, something different data was transmitted and 01 is the result of the alternation by errors. Consider that we add three redundant symbols. This operation is called an “encoding”, and the obtained sequence is a “codeword”. The set of codewords is a “code”. By appending redundant symbols, the distance between two codewords become large, which contributes to prevent the “confusion” caused by noises over a communication channel. 人工的に冗長情報を付加すると... 00 → 00000 01 → 01011 10 → 10101 11 → 11110 00000 01011 01010 00011 00010 このような操作を符号化, 符号化して得られる系列を符号語, 符号語の集合を符号と呼ぶ
6
復号と誤りの訂正 どのような符号化をするか,あらかじめ合意しておく 送信側:送りたい情報を符号化して,符号語を送信
受信側:誤りを含んでいる可能性のある系列から, 送信された可能性の高い符号語を推測 ⇒ 復号 00000 01011 01010 00011 00010 decoding and correction of errors The sender sends one of codewords. The transmitted codeword might be altered by errors, and the receiver can obtain something different from a codeword. From the received sequence, we can estimate the codeword which has big probability of transmission. This operation is “decoding”. Assume that is received. This is not a correct codeword (of the code defined in the previous slide), but we can find that the codeword 00000 is very close to 00010, and we can estimate that the transmitted codeword was Note that the “decoding” is an estimation, and the estimation is not always correct. Error correcting codes and error detecting codes which we will consider are not “almighty” technology, unfortunately. 00010を受信した場合 00010は正しい符号語ではない 最も「近い」符号語は00000 送信符号語は00000と推測する 誤り訂正符号を使うデメリット? 復号結果が常に正しいとは限らない 長い系列を送るので,系列中の誤り発生確率は増える
7
誤り訂正符号は,本当にトクか? 一文字当たり 0.1 の確率で,誤った情報が相手に届く場合 符号化を行わない場合 正しく伝わる確率
誤って伝わる確率 0.92 = 0.81 1 – 0.81 = 0.19 00 → 00000 01 → 01011 10 → 10101 11 → 11110 Are error correcting codes useful? Assume that we have a communication channel which can transmit information correctly with probability 0.9, and incorrectly with probability 0.1. Also assume that we use the code which we have considered in previous slides. If we do not use this code and send two-bit information as it is, then the receiver will be able to receive the information correctly with probability 0.9^2 = 0.81. With the remaining probability 0.19, the receiver obtains incorrect information. Next assume that we use the code. This code can correct one-bit error which occurs in a codeword. Therefore, the probability that the receiver obtains correct data equals to the probability that 1 or less error occurs in the transmitted five symbols. This probability is and this is more than 0.81. In this case, using the code is a good choice. But this is not always the case. Consider, for example, that the bit error probability of the channel is 0.4 instead of 0.1. In that case, using the code reduces the probability of the correct data transmission. 符号化を行う場合 正しい復号結果が得られる確率 = 5ビットのうち,1ビット以下で誤りが発生する確率 0.95 + 5C10.94・0.1 = 誤った復号結果が得られる確率 1 – = 符号や通信路によっては,符号化で損する場合もある
8
通信路のモデル 情報の送受信: 「信頼のできない通信路を解した情報通信」とモデル化可能 片方から情報を「入力」、他方から情報が「出力」
入力と出力が同じとは限らない We consider a communication channel which is not reliable. If we give one symbol to the channel as an input, then the channel produces one symbol as an output. The input symbol and the output symbol are not always the same. In this framework, we do not consider the communication delay, the erasure of transmitted symbol, and so on. 通信路 入力 (送信側) 出力 (受信側) 簡単のため、以下の仮定をおく 入力情報、出力情報は「記号(文字)」である 一個の記号を入力すると、必ず一個の記号が出力 遅延や記号の紛失、順番の入れ替わりは考えない
9
通信路の例1 (ビット誤り率 p の)2元対称通信路(通常 p < 0.5) 入力記号の集合 = 出力記号の集合 = {0, 1}.
1 p 1 – p 入 力 出 力 channel example 1 This slide sketches a binary symmetric channel, BSC, with bit error probability p. The input and the output have the same set of symbols {0, 1}. If 0 is given as an input, then the output is 0 with probability 1-p, and 1 with probability p. If 1 is given as an input, then the output is 0 with probability p, and 1 with probability 1-p. The BSC is said to be memoryless and stable communication channel. If is transmitted, then is received with probability (1-p)^3 p^2. Binary Symmetric Channel, BSC 誤りの発生は,完全に独立 (記憶のない通信路) 誤り発生する確率は,常に一定 (定常通信路) 00000 送信時,受信系列が となる確率 = (1–p)3p2
10
通信路の例2 消失通信路:情報が途中で「消失する」通信路 入力記号は {0,1},出力記号は {0, 1, X}
1 入 力 出 力 X channel example 2 This slide shows an erasure channel. The output symbols contain a special symbol “X” which acts as a place holder of an erased symbol. Synchronous communication channels and storage devices are well explained by this model. 現実世界では... 送信側と受信側で同期が取れている通信方式 決まった位置に値を記録するような記録装置 などに対応
11
通信路の例3 記憶のある通信路 誤りの発生確率に,時間的な相関があるもの バースト誤り通信路が代表的
channel example 3 There are different types of communication channels. Typical channel includes, for example, channels with memory. In this case, the occurrences of errors are not independent. A channel with “burst” errors is one of well known memory channel. Sometimes channels are not stable, but we do not consider those unstable channel in this class. 非定常な通信路 時間経過に伴い,誤り発生の確率分布が変化する 楕円軌道を取る人工衛星など
12
簡単な準備 長さ m の2進系列を Vm と表記する 情報の集合 I = Vk 符号 C ⊂ Vn (n > k とする) V 3
e 00 01 10 11 000 011 101 110 系列 b1b2...bm は (b1, b2, ..., bm) と表記する場合もある preliminary Let V^m denote the set of binary sequences of length m. We consider that the set of data to be encoded is chosen from I = V^k, and that the code C is a subset of V^n, where n > k. We may sometimes write a sequence b_1...b_m as a tuple. Usually we consider additions over a binary field, and the addition of sequences is consider in a symbol-wise. Do not confuse it with a normal addition of binary-represented integers. C とくに断らないかぎり,加減算は2進体上で行う 0 + 0 = 0, = = 1, = 0, a – b = a + b,... 系列の加減算は,各記号ごとに行う (繰り上/下がりなし) = 100 系列を2進数と解釈して計算するのではない点に注意
13
符号選択の規準 集合 I = Vk に対し,符号長 n と符号 C ⊂ Vn を定める Vn: 2n 個の系列を含む
C: 2k 個の系列を含む どのような n, C を選ぶのが良いのか? n は小さいほうが望ましい C は十分な誤り訂正能力を持つこと 符号化および復号が簡単に行えることが望ましい ⇒ C の選び方: To determine a code is to determine the code length n and then to determine the subset C of V^n. Note that V^n contains 2^n sequences, but C contains 2^k sequences. We have many different ways for choosing C. Then which C is good? - n should be small - C must have enough ability for the error correction - the encoding and the decoding must be realizes easily Linear codes are advantageous to establish the last goal. 符号長 n の全ての符号 線形符号 線形符号(linear code): 少し特殊な符号のクラス ...まずは簡単な例から
14
偶パリティ符号 偶パリティ符号(even parity code)の符号化操作:
情報 (a1, a2, ..., ak) ∈ Vk に対し... p = a1 + a ak を計算する (a1, a2, ..., ak , p) を (a1, a2, ..., ak) の符号語とする k = 3 の場合 The even parity code is one of the simplest error detecting codes. Given k symbols a_1, ..., a_k, we compute p as in the slide and append p to the sequence. This is the encoding of the even parity code. This figure sketches the case with k = 3. 000 001 010 011 100 101 110 111 0000 0011 0101 0110 1001 1010 1100 1111 p = = 0 符号 C
15
偶パリティ符号の特徴 1010 符号長は n = k + 1 符号語は, もともと送りたかった情報そのもの(情報記号)
あとで付け足した記号(冗長記号,検査記号) からなる(組織符号, systematic code) 0000 0011 0101 0110 1001 1010 1100 1111 (information symbol) (parity symbol) The code length of an even parity code is k + 1. A symbol in a codeword is either of - information symbol, which was included in the original information - parity symbol, which was appended by the encoding operation In the case of the previous example, the first three symbols are information symbols and the last symbol is the parity symbol. If symbols in a codeword is clearly distinguished as above, then the code is said to be a systematic code. Another important feature of the parity check code is that each codeword contains even number of 1s. If, at a receiver’s end, the number of 1s is not even, then it means that something wrong has happened. We can say that parity check codes can detect odd numbers of errors. 1010 情報記号 冗長記号 符号 C 符号語に含まれる1の個数は,かならず偶数個 受信系列中に1 が奇数個⇒異常事態 奇数個のビット誤りを検出できる線形符号である
16
水平垂直パリティ検査符号 水平垂直パリティ検査符号の符号化: 情報記号を長方形状に並べる 水平方向および垂直方向に偶パリティ符号化を行う
冗長記号を一次元に配列し,情報記号に付加する k = 9 の場合: (a1, a2, ..., a9) の符号化 p1 = a1 + a2 + a3 p2 = a4 + a5 + a6 p3 = a7 + a8 + a9 q1 = a1 + a4 + a7 q2 = a2 + a5 + a8 q3 = a3 + a6 + a9 r = a1 + a a9 The next example we consider is a horizontal and vertical parity check codes. In this coding, we place information symbols in a rectangle manner. Then we do the encoding of the even parity code in the horizontal direction, and then in the vertical direction. In equations, seven parity symbols of this code is defined according to these equations. a1 a2 a3 p1 a4 a5 a6 p2 a7 a8 a9 p3 q1 q2 q3 符号語は (a1, a2, ..., a9, p1, p2, p3, q1, q2, q3, r) r
17
符号化の例 を符号化すると,符号語は 情報記号 冗長記号 example Encode , and the codeword is This is also a systematic code because the first nine symbols are information symbols, and the remaining seven symbols are parity symbols. If k = ab, then the code length is n = ab + a + b + 1. 特徴 水平垂直パリティ検査符号は,組織符号 k = ab のとき,符号長は n = ab + a + b + 1
18
水平垂直パリティ検査符号の性能(1) 1 010110101 011110101 符号語中の1ビット誤りを訂正可能
受信語を,送信語と同じ順番で並べ, 各行および列の1の個数を数える 誤りが発生しなかったとき ⇒ 全ての行,列は偶数個の1を含む 1ビット誤りが発生したとき, ⇒ 1が奇数個ある行,列が,それぞれ1個ずつ存在 ⇒ その行と列が交差するところが誤りを含む We see that the code can correct one-bit error. At the receiver’s end, consider the received symbols according to the pre-determined order. If there is no error, then all rows and all columns contain even number of 1s. If one symbol is flipped by an error, then the receiver will find one row and one column which contain odd number of 1s. The intersecting point of the found row and column involves an error. 1 偶 奇 受信語 誤りを訂正
19
水平垂直パリティ検査符号の性能(2) 2ビット誤りが発生したとき, 実際の誤り どちらのパターンかわからない 1が奇数個の行,列が
What happens if two errors occur in one codeword? In this case, the receiver found two rows and two columns which contain odd number of 1s. These rows and columns must contain errors, but the receiver cannot know which error pattern has occurred actually. The receiver detects that two or more errors have occurred, but it cannot determine the position of errors and correct them. どちらのパターンかわからない 1が奇数個の行,列が 2つずつ存在することになる 誤り検出はできるが,訂正はできない...性能評価は次回以降
20
線形符号の定義 線形符号とは... すべての冗長記号が,情報記号の線形式で与えられる符号 線形式: p = a1x1 + a2x akxk (a1, a2, ..., ak ∈ {0, 1}) 偶パリティ符号 We have seen two examples of linear codes. A linear code is a code whose parity symbols are defined by linear equations of information symbols, where the coefficients of the equation is either of 0 or 1. The yellow boxes show the definition of parity symbols of the two example codes. We can see that the equations are all linear, and the codes are both linear codes. (x1, ..., xk, p) p = x xk 水平垂直パリティ検査符号 (x1, ..., xab, p1, ..., pb, q1, ..., qa, r) ...どちらも線形符号
21
線形符号の性質 定理:任意の符号語 u, v ∈ C に対し,u + v も符号語である
補題:任意の線形式 f(x1, x2, ..., xk) = a1x1 + a2x akxk に対し, f(u1, u2, ..., uk) + f(v1, v2, ..., vk) = f(u1 + v1, u2 + v2, ..., uk + vk) 定理の証明概略 Linear codes have an important property. Theorem: If u and v are codewords of a linear code C, then u + v is also a codeword of C. To show this theorem, we need a technical lemma: For an arbitrary linear function f(x_1, ..., x_k), the equation holds. By using the lemma, the theorem is shown as sketched in the slide. 符号語 (u1, u2, ..., uk, ..., f(u1, u2, ..., uk) , ...) +) 符号語 (v1, v2, ..., vk, ..., f(v1, v2, ..., vk) , ...) (u1 + v1, u2 + v2, ..., uk + vk, ..., f(u1, u2, ..., uk) + f(v1, v2, ..., vk), ...) = (u1 + v1, u2 + v2, ..., uk + vk, ..., f(u1 + v1, u2 + v2, ..., uk + vk), ...) 符号語
22
線形符号の構成例 k=3 の場合:情報記号列は (x1, x2, x3). 検査記号を適当に定義する. p1 = x1 + x2
⇒ 符号語を (x1, x2, x3, p1, p2) と定義 The theorem in the previous slide holds for any linear codes. For example, define a linear code by choosing linear functions at random. This code has eight codewords, and we can see that addition of two codewords is again a codeword.
23
単位ベクトルに対応する符号語 情報記号 x1, x2, ..., xk に対し,m 個の検査記号を定義する
j 番目の検査記号の定義式を,以下の線形式で与える pj = aj,1x1 + aj,2x aj,kxk (aj,i = 0 or 1) (0, 0, ..., 0, 1, 0, ..., 0) (i 番目だけ 1)に対応する符号語 ci は, ci = (0, ..., 1, ..., 0, a1,i, a2,i, ..., am,i) 1 i k ^ ^ ^ We consider another important property of linear codes. Assume that we define m parity symbols for k information symbols x_1, ..., x_k. Also assume that the j-th parity symbol p_j is defined by the equation where the coefficients are either of 0 or 1. Now consider to encode information symbols which are all zero except the i-th symbol. In this case, p_j equals to a_{j,i} and the codeword c_i is defined as in the slide. Consider the example in the previous slide. For three “unit” information symbols, we have three different codewords c_1, c_2 and c_3. Note the transpose relation between the parity symbols of the three codewords and the coefficients of the linear equations which define the parity symbols. c1 = c2 = c3 = 係数を転置 前ページの例: k = 3. 情報記号 (x1, x2, x3) に対し, p1 = x1 + x2 p2 = x2 + x3 (a1,1 a1,2 a1,3) = (1 1 0) (a2,1 a2,2 a2,3) = (0 1 1)
24
線形符号における基底ベクトル 線形符号=線形空間: 基底符号語 b1, b2, ..., bk ∈ C が存在し,任意の u ∈ C は
u = u1b1 + u2b ukbk (ui ∈ {0, 1}) と表現可能 Mathematically saying, a linear code is a linear space: there exist basis codewords b_1, ..., b_k in C, and any codeword in C is represented as a linear combination of those k basis codewords. For a single linear code, there are many different ways to choose the basis codewords. Interestingly, the codewords c_1, ..., c_k which are defined for “unit” information symbols can be a basis. Any codeword in C is represented as a linear combination of those k basis codewords. 定理: (前ページで定義した)c1, c2, ..., ck は C の基底符号語となる 任意の符号語 u ∈ C は, u = u1c1 + u2c ukck (ui ∈ {0, 1}) と表現される.
25
22ページの例では... p1 = x1 + x2 p2 = x2 + x3 情報記号 (x1, x2, x3) に対し,
c1 = c2 = c3 = 符号語 00000 00101 01011 01110 10010 10111 11001 11100 = 0・ ・ ・00101 = 0・ ・ ・00101 = 0・ ・ ・00101 = 0・ ・ ・00101 = 1・ ・ ・00101 = 1・ ・ ・00101 = 1・ ・ ・00101 = 1・ ・ ・00101 This slide shows that, in the code given in slide 22, any codeword can be represented by a linear combinations of c_1, c_2 and c_3 which correspond to the “unit” information symbols. c1, c2, c3 は基底符号語となっていることがわかる.
26
生成行列 検査記号の定義が pj = aj,1x1 + aj,2x2 + ... + aj,kxk のとき,
ci = (0, ..., 1, ..., 0, a1,i, a2,i, ..., am,i) となる c1, c2, ..., ck は符号 C の基底符号語をなす 任意の符号語 u は u = u1c1 + u2c ukck と表現可能 We have seen that - the parity symbols of a codeword c_i for a “unit” information symbols has strong relation with the coefficients of the linear definition of the parity symbols. - those c_1, ..., c_k is a basis of the code, and - any codeword can be obtained as a linear combination of c_1, ..., c_k. The last combination can be written in a vector-matrix format. The codeword (u_1, ..., u_k, p_1, ..., p_m) is obtained by multiplying (u_1, ..., u_k) to the matrix given in the slide. This matrix is called a generator matrix of the code C. With this vector-matrix equation, we can realize the encoding operation in a simple manner. 符号 C の生成行列 この式により,情報記号列と符号語とが一対一対応 ⇒ 符号化の操作が効率的に行える
27
生成行列の構造 生成行列 各行は 単位ベクトルに 対応する符号語 検査記号の定義式の係数を k × k 単位行列 転置して得られる行列
Review the structure of the generator matrix. The left submatrix is a k by k identity matrix, and the right submatrix is a transpose of the coefficients of linear functions which are to define parity symbols. Each row of the generator matrix is a basis vector which corresponds to a unit information symbol. 検査記号の定義式の係数を 転置して得られる行列 k × k 単位行列
28
生成行列の例 偶パリティ符号 p = x1 + x2 + x3 水平垂直パリティ検査符号 x1 x2 x3 x4 p1 p2 q1 q2 r
This slide shows generator matrices of the even parity check code and that of the horizontal and vertical parity check code. p1 = x1 + x2 p2 = x3 + x4 q1 = x x3 q2 = x x4 r = x1 + x2 + x3 + x4
29
生成行列を用いた符号化操作 水平垂直パリティ検査符号 0111 を符号化するには... 符号語は 011110101.
This slide shows that the encoding is a single multiplication of a vector and a generator matrix. 0111 を符号化するには... 符号語は
30
論理回路による符号化器の実現 u1 u2 u3 u4 p1 p2 q1 q2 r 情報記号 符 号 語 排他的論理和
符 号 語 Using the matrix is easy for human to understand, and it makes clear the circuit of the encoder. This example shows that the encoder of a linear code can be realized by a simple combinatorial circuit. 排他的論理和
31
本日のまとめ 線形符号は... 符号語集合がベクトル空間をなすような符号 検査記号が,情報記号の線形式で定義される符号
単位ベクトルを符号化して得られる符号語 ⇒基底ベクトル 生成行列 基底符号語を行とする行列 符号化操作を効率的に行うための道具 生成行列に基づく組合回路により,符号化器を実現可能 summary of today’s class A linear code: - linear space - parity symbols are defined by linear equations The codewords for a unit information symbols become a basis. A generator matrix -each row is a codeword for a unit information symbols - good tool to realize efficient encoder - good also for hardware-realization.
32
練習問題 情報記号 (x1, …, xk) に対し,検査記号 p = x1+…+xk+1として与えられる奇パリティ符号を考える.この符号が線形符号とならないことを証明せよ. 6個の情報記号を,以下のように配置し,水平垂直パリティ検査符号を構成する.この符号の生成行列を求めよ quiz: - Show that odd parity check codes are not linear codes. - Show the generator matrix of the given horizontal and vertical parity check code. a1 a2 a3 a4 a5 a6 p1 p2 q1 q2 q3 r (a1, ..., a6) → (a1, ..., a6, p1, p2, q1, q2, q3, r)
Similar presentations
© 2024 slidesplayer.net Inc.
All rights reserved.