前回の練習問題 6個の情報記号を,以下のように配置し,水平垂直パリティ検査符号を構成する. この符号の検査行列を求めよ

Slides:



Advertisements
Similar presentations
ベイズの定理と ベイズ統計学 東京工業大学大学院 社会理工学研究科 前川眞一. 2 Coffe or Tea 珈琲と紅茶のどちらが好きかと聞いた場合、 Star Trek のファンの 60% が紅茶を好む。 Star Wars のファンの 95% が珈琲を好む。 ある人が紅茶を好むと分かったとき、その人が.
Advertisements

だい六か – クリスマスとお正月 ぶんぽう. て form review ► Group 1 Verbs ► Have two or more ひらがな in the verb stem AND ► The final sound of the verb stem is from the い row.
て -form - Making て -form from ます -form -. With て -form, You can say... ~てもいいですか? (= May I do…) ~てください。 (= Please do…) ~ています。 (= am/is/are doing…) Connecting.
第 5 章 2 次元モデル Chapter 5 2-dimensional model. Contents 1.2 次元モデル 2-dimensional model 2. 弱形式 Weak form 3.FEM 近似 FEM approximation 4. まとめ Summary.
Essay writing rules for Japanese!!. * First ・ There are two directions you can write. ・よこがき / 横書き (same as we write English) ・たてがき / 縦書き (from right to.
VE 01 え form What is え form? え? You can do that many things with え form?
前回の練習問題 12ページの例において,3次,4次のエントロピーを求めよ.可能であれば,n 次エントロピーを計算するプログラムを書け.
英語特別講座 疑問文 #1    英語特別講座 2011 疑問文.
All Rights Reserved, Copyright (C) Donovan School of English
The Bar バー.
第1回レポートの課題 6月15日出題 今回の課題は1問のみ 第2回レポートと併せて本科目の単位を認定 第2回は7月に出題予定
日本語の文法 文型(ぶんけい)をおぼえよう!
Dont’ Ask Me That Question!
Chapter 11 Queues 行列.
日本語... ジェパディー! This is a template for you to use in your classroom.
[復習]通信路符号化の限界 通信路符号化定理(Shannonの第2符号化定理)
Chris Burgess (1号館1308研究室、内線164)
What did you do, mate? Plain-Past
Verb Plain Negativeform
Training on Planning & Setting Goals
前回の練習問題 a = とする(|a|=36) a を長さ2のブロックに区切り,x2値を計算せよ
画像特徴(点、直線、領域)の検出と識別-2 呉海元@和歌山大学 2007年5月14日
Only One Flower in the World
日本人の英語文章の中で「ENJOY」はどういうふうに使われているのか
Noun の 間(に) + Adjective Verb てform + いる間(に) during/while.
How do you talk about Positions/ Locations?
前回の練習問題について 情報記号 (x1, …, xk) に対し,検査記号 p = x1+…+xk+1として与えられる奇パリティ符号を考える.この符号が線形符号とならないことを証明せよ. 解答例: 線形符号とならない反例を示せばよい. x1=1, x2=x3=...=xk=0 ⇒ p = 0,対応する符号語は
A 02 I like sushi! I like origami!
10.通信路符号化手法2 (誤り検出と誤り訂正符号)
にほんご JPN101 Oct. 26, 2009 (Monday).
十年生の 日本語 Year 10 Writing Portfolio
Reasonので + Consequence clause
The future tense Takuya Mochizuki.
Chapter 4 Quiz #2 Verbs Particles を、に、で
The Sacred Deer of 奈良(なら)
Who Is Ready to Survive the Next Big Earthquake?
Did he/she just say that? Get your head out of the gutter! Oh wait….
“You Should Go To Kyoto”
know / knows(s) / ___________
VTA 02 What do you do on a weekend? しゅうまつ、何をしますか。
Nihongo Japanese 日本ご ‘Numbers ’ & ‘Hiragana Revision’
Chapter 1 Hamburger History
ストップウォッチの カード ストップウォッチの カード
Topics on Japan これらは、過去のインターンが作成したパワポの写真です。毎回、同じような題材が多いため、皆さんの出身地等、ここにない題材も取り上げるようにしてください。
て みる.
前回の練習問題 A B C D E F G H 確率 2元ハフマン符号を構成せよ 平均符号長を求めよ 0.363× × ×5= answers.
Causative Verbs Extensively borrowed from Rubin, J “Gone Fishin’”, Power Japanese (1992: Kodansha:Tokyo) Created by K McMahon.
-Get test signed and make corrections
くれます To give (someone gives something to me or my family) くれました くれます
Term paper, Report (1st, first)
Where is Wumpus Propositional logic (cont…) Reasoning where is wumpus
Question Words….
クイズやゲーム形式で紹介した実例です。いずれも過去のインターン作です。
いくらですか?.
9.通信路符号化手法1 (誤り検出と誤り訂正の原理)
2019/4/22 Warm-up ※Warm-up 1~3には、小学校外国語活動「アルファベットを探そう」(H26年度、神埼小学校におけるSTの授業実践)で、5年生が撮影した写真を使用しています(授業者より使用許諾済)。
Genetic Statistics Lectures (4) Evaluation of a region with SNPs
知能ソフトウェア特論 Intelligent Software
知能ソフトウェア特論 Intelligent Software
ー生命倫理の授業を通して生徒の意識に何が生じたかー
The difference between adjectives and adverbs
Created by L. Whittingham
東北大 情報科学 田中和之,吉池紀子 山口大 工 庄野逸 理化学研究所 岡田真人
Cluster EG Face To Face meeting
Grammar Point 2: Describing the locations of objects
Apply sound transmission to soundproofing
線形符号(10章).
Improving Strategic Play in Shogi by Using Move Sequence Trees
2012年度 情報数理 ~ ハミング距離 ~.
2010年度 情報数理 ~ ハミング距離 ~.
Presentation transcript:

前回の練習問題 6個の情報記号を,以下のように配置し,水平垂直パリティ検査符号を構成する. この符号の検査行列を求めよ 110111001010 を復号せよ a1 a2 a3 a4 a5 a6 p1 p2 q1 q2 q3 r p1 = x1 + x2 + x3 p2 = x4 + x5 + x6 q1 = x1 + x4 q2 = x2 + x5 q3 = x3 + x6 r = x1 + x2 + x3 + x4 + x5 + x6 quiz in the previous class - Consider the illustrated code - determine the check matrix - decode the given sequence The check matrix is easily derived from the coefficients of the definition (linear equations) of the parity check symbols. To decode the sequence, multiply the sequence (as a column vector) to the check matrix from the right. The obtained syndrome suggests the place of the error. In this case, 011001 equals to the fourth column of H, which means that the forth symbol in the sequence is in error. Flip the 1 to 0, and we obtain a codeword. 検査行列 H(1 1 0 1 1 1 0 0 1 0 1 0)T = (0 1 1 0 0 1)T これは4列目と等しい ⇒ 4ビット目が誤っている 復号結果は 110011001010

前回の練習問題 (15, 11)ハミング符号について, 検査行列(のひとつ)を作成せよ 生成行列を求めよ (右の行列) quiz in the previous class - Consider a (15, 11) Hamming code - give one of check matrices - give the generator matrix of constructed check matrix You can change the order of column vectors of the check matrix. In that case, the generator matrix also changes accordingly.

本日の講義:符号の性能評価 与えられたパラメータ (符号長,情報記号数)に対し, 非常に多くの線形符号が存在し得る 性能の良い符号もあれば,悪い符号もある 符号の性能を測る指標が必要 本日の内容: 最小ハミング距離を使った性能評価 符号の重み分布を使った性能評価 today’s class: performance evaluation of error correcting codes There are many error correcting codes, some are good and some are bad. In today’s class, we learn how to measure the ability of codes. Two techniques are introduced today. First, we consider a technique to estimate the performance by using the minimum Hamming distance. Then another technique is considered in which we make use of weight distribution of the code.

系列の類似度と符号の能力 符号語 u を送信,途中で誤りが発生し,系列 v が受信 常識的な通信路であれば... u と v の違いは少しだけのはず ビット誤り率 0.1 の BSC で 000 を送信; 000 となる確率:0.729 001 となる確率:0.081 011 となる確率:0.009 111 となる確率:0.001 符号語 u 受信系列 v similarity of two sequences Assume that a codeword u is transmitted and a sequence v is received. The sequence v is not always the same as u because of the errors, but we can expect that the difference between u and v is not so much if the communication channel is a practical one. For example, consider a binary symmetric channel with bit error probability 0.1 If we send 000, then we will receive 000 with probability 0.729, 001 with probability 0.081 and so on. We can see that the received sequence is something close to the originally transmitted codeword. Here we consider two scenarios. In the first case, there is a codeword which is very close to the transmitted codeword u. In this case, it can happen with non-negligible probability that the received codeword happens to coincide with that another codeword. There is no way for the receiver to distinguish if v was the originally sent codeword or any other codeword such as u is altered and become v. In the second scenario. codewords in a code are keeping away from each other. In this case, there is little chance for the error to make v coincide with another codeword. The receiver will be able to detect or correct errors. error on the channel alters u to a different c もし,符号語どうしが接近していると,誤りの影響を受けやすい u 誤り検出不能 u v v 誤り検出可能

系列と系列の「距離」を定義する dH(0100,1101) = 2 a=(a1, a2,..., an) と b=(b1, b2,..., bn) を,同じ長さの系列とする. a と bのハミング距離 dH(a, b) : 系列中で異なる記号の個数 dH(0100,1101) = 2 dH(000, 011) = 2 dH(000, 111) = 3 dH(010, 110) = 1 dH(011, 011) = 0 To proceed the discussion, we need a formal definition of the “distance” between two sequences. Let a and b be sequences with the same length. The Hamming distance between a and b is defined as the number of symbols which are different in a and b. For example, the Hamming distance between 0100 and 1101 is 2, because the first and the fourth symbols are different in the sequences. If the two sequences are the same, then the Hamming distance is 0 obviously. If we send a sequence u of length n over a binary symmetric channel with bit error probability p, then a sequence v with dH(u,v) = i will be received with probability (1-p)^{n-i}p^i. This is monotonically decreasing in the region p < 0.5, meaning that the received sequence v is similar to u with high probability. 厳密に書くと, ただし, ビット誤り率 p の BSCに長さ n の系列 u を送信したとき, dH(u, v) = i である系列 v が受信される確率は (1-p)n-ipi p < 0.5 のとき,i に関して単調減少:大規模な誤りは発生しにくい

符号語と符号語の距離 長さ4の符号を考える... 4次元超立方体を考え,各頂点を系列に対応付けると, 符号 = 頂点の部分集合 C1={0000, 1100, 0011, 1111} C2={0000, 0001, 1110, 1111} Consider a code of length four. We can define a natural correspondence between the sequences of length four and vertices of a four-dimensional cube. First consider the code C_1 given in the slide. It has four codewords and vertices which correspond to codewords are painted in red in the figure. You can see that the distance between two codewords are always two or more. On the other hand, in the code C_2, some codewords are positioned very closely. Generally spealing, C_1 will be a good code, while C_2 will be a bad code. どの符号語間も,ハミング 距離が2以上離れている 符号語間のハミング距離が 1のところがある 良い 悪い

最小ハミング距離 符号の「良さ」の一指標として,最小ハミング距離を考える 符号 C の最小ハミング距離 The minimum Hamming distance is the smallest Hamming distance between two different codewords. The minimum Hamming distance of C_1 is two, and the minimum Hamming distance of C_2 is one. 最小ハミング距離は2 最小ハミング距離は1

最小ハミング距離の例 から生成される線形符号の最小ハミング距離は3 線形符号ならば,もっと簡単に最小ハミング距離を計算できる 000000 001011 010101 011110 100110 101101 110011 111000 3 4 Here is an example of the computation of the minimum Hamming distance. Consider the code with the given generator matrix. This code has eight codewords, and the distance between codewords are summarized in the table. The smallest nonzero value in the table is 3, and the minimum Hamming distance of this code is 3. It seems that the complexity for computing the minimum Hamming distance in the O(N^2) where N is the number of codewords, but we have a smarter way if the code is a linear code. 線形符号ならば,もっと簡単に最小ハミング距離を計算できる

最小ハミング重み 系列 u のハミング重み:u に含まれる1の個数 符号 C の最小ハミング重み: 符号語 000000 001011 010101 011110 100110 101101 110011 111000 ハミング重み 3 4 000000 001011 010101 011110 100110 101101 110011 111000 3 4 The idea is to make use of the minimum Hamming weight. A Hamming weight of a sequence u is defined as the number of nonzero symbols contained in the sequence u. The minimum Hamming weight is defined as the minimum among Hamming weights of the nonzero-codewords. The table in the left shows the Hamming weights of the codewords of the code in the previous slide. It is the same as the first column of the table given in the previous slide, because the Hamming weight can be regarded as a Hamming distance between the sequence and the all-zero sequence. 同一

最小ハミング距離と最小ハミング重み 定理 : 線形符号の最小ハミング距離は,最小ハミング重みに等しい 証明 : u, v を最小ハミング距離にある符号語とする (dH(u, v)=dmin) 線形符号なので,u + v も C の符号語となる u と v で記号が異なるところ…u + v の記号は1 u と v で記号が同じところ…u + v の記号は0 よって u + v の重み(出現する 1の個数)は,dmin. a useful property of linear codes: Theorem : The minimum Hamming distance is always the same as the minimum Hamming weight. proof. Let u and v be codewords which give the minimum Hamming distance. Now we are considering a linear code and u + v is again a codeword. The symbols in the codeword u + v are such that… - 1, if the corresponding symbols in u and v are different - 0, if the corresponding symbols in u and v are the same This means that the Hamming weight of the codeword u + v is d_min, the minimum Hamming distance. Therefore the minimum Hamming distance is d_min or less. We can also show that the minimum Hamming distance equals to or less than the minimum Hamming weight, and the proof completes. u v u + v 01100 +) 01001 00101

線形符号の最小ハミング距離 (9, 4) 水平垂直パリティ検査符号:最小ハミング距離は4 (7, 4) ハミング符号:最小ハミング距離は3. 000000000 010010011 4 100010101 110000110 000101011 010111000 100111110 6 110101101 001001101 011011110 101011000 111001011 001100110 011110101 101110011 111100000 We can see that the minimum Hamming distance of the (9, 4) horizontal and vertical parity check code is four. For the (7, 4) Hamming code, the minimum Hamming distance is three. (7, 4) ハミング符号:最小ハミング距離は3. 0000000 0001011 3 1000101 1001110 4 0100111 0101100 0010110 1101001 0011101 1010011 1011000 0110001 0111010 1110100 1111111 7

一般のハミング符号の最小ハミング距離 定理:ハミング符号の最小ハミング距離は3である 証明方針:検査行列を H=(h1, h2, ..., hn) とする hi は互いに異なる非ゼロ列ベクトル 全ての非ゼロベクトルが H の列ベクトルとして出現 i1, i2, ..., iw ビット目が1 ,他が0の重み w の系列 u を考える. u が符号語 ⇔ HuT = hi1 + hi2 + ... + hiw= 0. w = 1 の場合:hi1 = 0 となり,hi1 が非ゼロであることに矛盾 w = 2 の場合:hi1 + hi2 = 0より hi1 = hi2 で,hi1 ≠ hi2 に矛盾 以上より,ハミング符号は重み2以下の符号語を含まない. We have seen in the previous slide that the minimum Hamming distance of the (7, 4) Hamming code is three. We also learned in the previous class that we can construct longer Hamming codes. You may think that longer Hamming codes have bigger minimum Hamming distances, but that is wrong expectation. Theorem: The minimum Hamming distance of the Hamming code is 3 (independent from the code length). sketch of the proof Let H be the check matrix and h_i be the i-th column matrix of H. Because of the definition of the Hamming code, columns vectors of H are nonzero vectors and they are all different. It also holds that an arbitrary nonzero vector occurs in H as a column vector. Assume that there is a codeword u which has 1 at the i_1, ..., i_w symbol positions (i.e. the Hamming weight of u is w). Since u is a codeword, its syndrome, which is defined as the vector sum of h_{i_1}, ..., h_{i_w} must be 0. If w = 1, then the equation is h_{i_1} = 0, which contradicts to the fact that the column vectors of H are all nonzero, a contradiction. If w = 2, then the equation become h_{i_1} + h_{i_2} = 0. This is equivalent to saying that h_{i_1} = h_{i_2}, but this is also contradicting the fact that all column vectors of H are different. We have seen that the Hamming code cannot contain a codeword whose Hamming weight is two or less.

証明の続き 定理:ハミング符号の最小ハミング距離は3である [証明の続き] ハミング符号は重み2以下の符号語を含まない. w = 3 の場合: HuT = hi1 + hi2 + hi3 hi1 + hi2 は非ゼロなので, h = hi1 + hi2 となるような 列ベクトル h が検査行列の中に必ず存在する hi1 + hi2 = hi3 となる i3 が必ず存在する i1, i2, i3 ビット目が 1 である系列 u に対し,HuT = 0 すなわち,重み3の符号語は必ず存在する. ⇒ ハミング符号の最小ハミング距離は3である. proof continued If w = 3, then the syndrome is a sum of three column vectors. Note that the sum of the first two column vectors cannot be zero. Because H contains all nonzero vectors as its column, h = h_{i_1} + h_{i_2} must exist in H as one of column vectors. Let i_3 be the column position of the vector h in H, then h_{i_1} + h_{i_2} = h_{i_3}. This implies that the syndrome is 0 and the code must contain a codeword with weight three.

最小ハミング距離と誤りの個数 最小ハミング距離 dmin=3: どの符号語も,互いに最低3ビット以上異なる. 送信符号語が誤って他の符号語になるのは, 3ビット以上の誤りが発生したときのみ. 「ある符号語から1ビットだけ誤って得られる系列」が, 「他の符号語から1ビットだけ誤って得られる系列」と 一致することはない. In the Hamming code, the minimum Hamming distance is three. This means that, if we compare two different codewords, then there are at least three symbols which are differ in the codewords. A transmitted codeword is mistakenly become a different codeword at the receivers end only if errors on the channel alter three or more symbols in the codeword. This can be also viewed from a different direction. A sequence which is obtained by a one-bit error of a codeword cannot coincide with any other sequences which are obtained by a one-bit error of different codewords. 誤り 誤り

dmin=3 ならば,復号領域の半径は最大でも1までしか取れない 最小ハミング距離と復号領域 dmin=3:各符号語を中心として,領土を設定する. 領土の半径を1にしても,領土どうしは重ならない. 領土の半径を2にすると,領土どうしが重なってしまう. Now consider that the minimum Hamming distance is three, and that each codeword have its own kingdom. The kingdom is actually a set of vectors consists of the codeword and sequences around the codeword. If the radius of the kingdom is 1 (i.e. the kingdom contains all sequences which are one or less away from the codeword with respect to the Hamming distance), then the kingdoms will not overlap. If the radius of the kingdom is 2, then the kingdoms overlap and some conflict will occur. The “decoding” can be understood in this context. Each codeword has a kingdom, or a decoding region, around the codeword itself, and claims that if a received sequence falls within the decoding region, then the sequence belongs to that codeword. If the minimum Hamming distance d_min = 3, then the radius cannot be more than one. This is the upper-bound limit of the ability of the code. 復号の原理:dmin=3 の場合, 各符号語を中心に半径1の領土を設定 受信系列が符号語 u の領土に入ったら,その系列は u に復号 復号領域 dmin=3 ならば,復号領域の半径は最大でも1までしか取れない 復号領域の半径が1 ⇒ 1ビット誤りを訂正可能

最小ハミング距離と誤り訂正能力 とする は x 以下の最大整数) ( dmin=7, tmax=3 dmin=8, tmax=3 tmax Consider more general case. For d_min, the minimum Hamming distance, define t_max as in the slide where we are using a floor function (floor of x is the maximum integer which is not greater than x). t_max is the maximum radius of the decoding region with which no conflict occurs between codewords. If d_min = 7, then t_max = 3. You can see that the kingdomes do not overlap with radius three. If d_min = 8, then t_max is again 3. In this case, there is a neutral zone between two kingdoms, but we cannot make the radius more than three. The radius of the decoding region equals to the number of bit errors correctable in a codeword. The minimum Hamming distance defines the upper-bound limit on the number of correctable errors. tmax tmax 復号領域の半径をtmaxにしても,復号領域どうしは重ならない ⇒ 符号 C は,tmaxビットの誤りを訂正可能 誤り訂正可能ビット数

最小ハミング距離と誤り訂正能力の例 dmin tmax 3 1 4 1 5 2 6 2 7 3 8 3 Here are some examples. 5 2 6 2 7 3 8 3

復号領域の大きさ は,符号の持つ最大誤り訂正能力. 復号領域の半径をtmax未満に設定してもOK:限界距離復号法 tmax t t_max gives the maximum number of correctable errors, but we can be modest and make the decoding region smaller than we are allowed. Decoding strategy based on this idea is sometimes called a bounded-distance decoding. In this case, we have more neutral zone. Vectors in the neutral zone do not belong to any decoding region. If a received vector is in the neutral zone, then we can detect that errors have occurred, but we do not look for the codeword which is the closest to the received vector. tmax t どの復号領域にも属さない これらの系列に対しては誤りの検出 だけを行い,訂正を行わない

復号領域の大きさと誤り確率 送信符号語 受信語 正しく復号 誤り検出 誤って復号 復号領域の半径と各種復号結果の確率 t 受信語 Here gives another sketch of the bounded-distance decoding. Assume that we are sending a codeword which is marked by the red circle. If the received sequence falls with in the decoding region of the transmitted codeword, then it will be decoded to a correct codeword (case A). If more errors occur, then the received vector falls in the neutral zone, and only error detection is done for the sequence (case B). If further errors occur, then the received sequence goes to the decoding region of the neighbor codeword. If this happens, then the decoder tries to correct errors in the sequence but the result of the decoding will be completely incorrect (case C). If the radius of the decoding region is large, then it increases the probability of the above case A. This is nice but we need to remark that the probability of the unfortunate case C is also increased. So, this is a high-risk and high-return strategy in decoding. If the radius of the decoding region is small, then the probability of the case A decreases. This also decreases the probability of the case C, and we can avoid the worst scenario in which the receiver accepts wrong information. This is a conservative approach than the above, but this choice might be better if we can request the sender to re-transmit the codeword. 正しく復号 誤り検出 誤って復号 復号領域の半径と各種復号結果の確率 半径 正しく復号 誤り検出 誤って復号 大 大 小 大 小 小 大 小 環境と要求により,復号領域を適切に設計する必要がある

直感的なイメージ:復号領域の大きさ A B C D A B C D 狙った的に当たる確率…大 他の的に当たる確率…大 的から外れる確率…小 You may be able to understand what the previous slide means by thinking of this kind of games. If you have only one arrow, and you must hit the arrow to the target C, then you will go for the game with big targets. This is analogous to consider the error correcting in a storage system in which we cannot request re-transmission of data. If you have so many arrows, then you will choose to make the targets small. You can retry many times and finally you will be able to let an arrow correctly hit the correct target. This is analogous of communication system in which re-transmission is allowed. 狙った的に当たる確率…大 他の的に当たる確率…大 的から外れる確率…小 狙った的に当たる確率…小 他の的に当たる確率…小 的から外れる確率…大

ここまでのまとめ 最小ハミング距離を使った性能評価 誤り訂正可能ビット数は,最小ハミング距離から決まる 最小ハミング距離が大きければ,誤り訂正能力も大きい summary of the first part of the class The minimum Hamming distance determines the maximum number of correctable errors in a codeword. The bigger the minimum Hamming distance is, the bigger the error correcting performance is.

ハミング距離による性能評価 最小ハミング距離 d のとき... 符号語中に発生する (d – 1)/2 ビット以下の誤りを訂正可能 同じ符号長で同じ最小ハミング距離なら,同じ性能か? 結論から言うと,同じとは限らない より精密な性能評価を行う手段が必要 Then, our next question: If two codes have the same minimum Hamming distance, then, do they have the same performance? Answer: No We will learn more sophisticated technique to evaluate the performance in detail.

着眼点:符号語の分布と符号の性能 符号語が近接しているほど,誤って復号する可能性が高い 符号語の近くにある他の符号語は,少ないほうが良い A B We already know that the distance between codewords is essential in considering the performance of the code. Assume that the code A and B have the same minimum Hamming distance, say d_min. In the code A, there are many codewords in the orbit which is d_min away from a codeword. In the code B, the number of codewords in the same orbit is small. Which, do you think, is the better code? You will think that B should be better. Yes, that’s right, and we learn techniques to justify your intuition. 最小ハミング距離が同じでも,A より B のほうが望ましい

重み分布を用いた特性評価 C={000000,100111,010110,110001, 001011,101100,011101,111010}. 重み0の符号語; 1個 重み3の符号語; 4個 重み4の符号語; 3個 ビット誤り率 p の2元対称通信路 限界距離復号法による1ビット誤り訂正 符号の重み分布 (weight distribution) Consider this simple linear code with eight codewords. We assume that the communication channel is a binary symmetric channel with bit error probability p, and we employ a bounded-distance decoding up to one-bit error in a codeword. The weight distribution of a code is a profile of numbers of codewords with respect to the Hamming weight. In this simple code, there are one codeword with Hamming weight 0, four codewords with weight 3, and 3 codewords with weight 4. This is the weight distribution of this code. Assume that we transmit an all-zero codeword 000000, and determine the probabilities of - the decoder correctly decodes the received sequence - the decoder mistakenly decodes the received sequence to a wrong codeword, and - the decoder detects errors 線形符号の場合... ゼロ系列を送信して得られる系列を復号し, 正しく復号される確率 誤って復号される確率 訂正不可能な誤りが検出される確率 を評価すればよい.

正しく復号される確率 ビット誤り率 p の通信路に 000000 を送信した場合: 正しく復号される:受信語が 000000 の復号領域に属するとき  000000 の復号領域は, {000000, 000001, 000010, 000100, 001000, 010000, 100000} 復号領域中の個数 小計 the probability of the decoder correctly decodes the received sequence The decoding region of the transmitted codeword 000000 contains seven vectors given in the slide. 000000 is received with probability (1-p)^6 One of codewords with weight one will be received with probability p(1-p)^5, and there are six such codewords in the decoding region. The probability of the decoder correctly decodes the received sequence is given as P_c 000000 が受信される確率 (1 – p)6 1 (1 – p)6 重み 1 の系列が受信される確率 p( 1 – p)5 6 6p( 1 – p)5 正しく復号される確率は Pc = (1 – p)6 + 6p(1-p)5

誤って復号される確率(1) ビット誤り率 p の通信路に 000000 を送信した場合: 符号語 001011 に誤って復号される: 受信語が 001011 の復号領域に属するとき {001011, 001010, 001001, 001111, 000011, 011011, 101011} 復号領域中の個数 小計 the probability of the decoder mistakenly decodes the received sequence to a wrong codeword Consider an incorrect codeword 001011 for example. The decoding region of this codeword is given in the slide. Similar to the previous slide, we can compute the probability that the decoder mistakenly decodes the received sequence into 001011. This is the probability that one of codewords of weight three is output by the decoder. 001011 が受信される確率 p3(1 – p)3 1 p3(1 – p)3 重み 2 の系列が受信される確率 p2( 1 – p)4 3 3p2( 1 – p)4 重み 4 の系列が受信される確率 p4( 1 – p)2 3 3p4( 1 – p)2 001011 に誤って復号される確率:3p2(1 – p)4 + p3(1 – p)3 + 3p4(1 – p)2 重み 3 の符号語のひとつに誤って復号される確率

誤って復号される確率(2) ビット誤り率 p の通信路に 000000 を送信した場合: 符号語 100111 に誤って復号される: 受信語が100111 の復号領域に属するとき {100111, 100110, 100101, 100011, 101111, 110111, 000111} 復号領域中の個数 小計 the probability of the decoder mistakenly decodes the received sequence to a wrong codeword Consider another codeword 100111 whose Hamming weight is four. Similar to the previous slide, we can compute the probability that the decoder mistakenly decodes the received sequence into this codeword. This is the probability that one of codewords of weight four is output by the decoder. 100111 が受信される確率 p4(1 – p)2 1 p4(1 – p)2 重み 3 の系列が受信される確率 p3( 1 – p)3 4 4p3( 1 – p)3 重み 5 の系列が受信される確率 p5( 1 – p) 2 2p5( 1 – p) 100111 に誤って復号される確率: 4p3(1 – p)3 + p4(1 – p)2 + 2p5(1 – p) 重み 4 の符号語のひとつに誤って復号される確率

誤って復号される確率(3) C={000000,100111,010110,110001, 001011,101100,011101,111010}. 重み0の符号語; 1個 重み3の符号語; 4個 重み4の符号語; 3個 誤って復号される確率 Pe = 4×(重み 3 のひとつの符号語に誤って復号される確率) + 3×(重み 4 のひとつの符号語に誤って復号される確率) = 4×(3p2(1 – p)4 + p3(1 – p)3 + 3p4(1 – p)2 ) + 3×(4p3(1 – p)3 + p4(1 – p)2 + 2p5(1 – p)) = 12p2(1 – p)4 + 16p3(1 – p)3 + 15p4(1 – p)2 + 6p5(1 – p). the probability of the decoder mistakenly decodes the received sequence to a wrong codeword The code contains four codewords with weight 3, and three codewords with weight 4. Multiply these numbers of codewords to the probabilities determined in the previous two slides, and we can obtain the probability of the decoder mistakenly decodes the received sequence to a wrong codeword, denoted by P_e. The probability P_d that the decoder detects errors is derived easily by usiing P_C and P_e. 訂正不可能な誤りが検出される確率 Pd = 1 – Pc – Pe.

特性評価の結果 This graph plots the probabilities which are derived in previous slides. The x-axis is the bit error probability of the channel, and the y-axis is the magnitude of the probabilities.

性能比較例 (7,4)ハミング符号: (9, 4)水平垂直パリティ検査符号 重み 3 4 7 符号語数 1 重み 4 6 符号語数 1 9 符号語: 0000000, 1000101, 0100111, 0010110, 0010110, 1010011, 0110001, 1110100, 0001011, 1001110, 0101100, 1101001, 0011101, 1011000, 0111010, 1111111 (9, 4)水平垂直パリティ検査符号 000000000, 000101011, 001001101, 001100110, 010010011, 010111000, 011011110, 011110101, 100010101, 100111110, 101011000, 101110011, 110000110, 110101101, 111001011, 111100000 重み 3 4 7 符号語数 1 Use the learned technique to evaluate the Hamming code and the horizontal and vertical parity check code. The weight distributions of these codes are given here and you can compute the probabilities P_c, P_e and P_d by yourself. 重み 4 6 符号語数 1 9

ハミング符号と水平垂直パリティ検査符号 この環境では,ハミング符号のほうが良いように思えるが... The red curves show the performance of the Hamming code, and blue curves show the performance of the horizontal and vertical parity check code. Solid lines are P_c, dashed lines are P_e and dotted lines are P_d. For the Hamming code, P_d is always zero because it is a perfect code and there is no “neutral zone”. On the other hand, the minimum Hamming distance of the horizontal and vertical parity check code is 4, and it has large neutral zone. Those observations can be confirmed in the graph. It seems that the Hamming code seems better for this environment, but the story might go different if the environment is different. この環境では,ハミング符号のほうが良いように思えるが...

再送が可能な環境での評価 誤りを検出した場合,一回だけ再送を許すとすると... 再送が可能で,ビット誤り率が 0.28 以下ならば, Assume that it is allowed for the receiver of the data to ask the sender re-transmit the data only one time. If the receiver detects an error, then it can ask the receiver the re-transmission of the same data. Such a mechanism does nothing to do with the Hamming code because the Hamming code decoder will never detect errors. On the other hand, the mechanism help improving the performance of the horizontal and vertical parity check code. The graph now have another curve with dirk blue color. The curve shows the probability that the decoder correctly decoder the received sequence at the first trial, plus, the decoder detects an error in the first trial but succeeds in decoding in the second trial. You can see that the performance is better than that of the Hamming code (red solid line) with bit error probability less than 0.28. 再送が可能で,ビット誤り率が 0.28 以下ならば, 水平垂直パリティ検査符号のほうが有利

本日のまとめ 最小ハミング距離を使った性能評価 簡単でわかりやすいが,大雑把な評価しかできない 符号の重み分布を使った性能評価 少し複雑な計算が必要,大規模符号の評価は困難 与えられた環境設定で,定量的な性能評価が可能 summary estimate the performance by using the minimum Hamming distance simple, but just sketch estimate the performance by using the weight distribution of the code detailed, but complicated

練習問題 以下の検査行列を持つ線形符号 C を考える C の重み分布を求めよ C の誤り特性評価を行い,式とグラフで各確率を示せ quiz determine the weight distribution of the code determine the probablities P_c, P_e and P_d of this code and plot the performance curves