Presentation is loading. Please wait.

Presentation is loading. Please wait.

前回の練習問題 ABACABADACABADを,LZ77法により符号化せよ 上記問題により得られた符号語系列を復号せよ

Similar presentations


Presentation on theme: "前回の練習問題 ABACABADACABADを,LZ77法により符号化せよ 上記問題により得られた符号語系列を復号せよ"— Presentation transcript:

1 前回の練習問題 ABACABADACABADを,LZ77法により符号化せよ 上記問題により得られた符号語系列を復号せよ
4文字前 長さ3 A B AC ABAD ACABAD 6文字前 長さ6 2文字前 長さ1 符号語系列は (0, 0, A), (0, 0, B), (2, 1, C), (4, 3, D), (6, 6, *) quiz in the last class - encode the sequence using LZ77 algorithm the figure illustrates the computation. We need two numerical information to represent one block; how many symbols we need to look back, and how many symbols we need to copy. - decode the result of the above encoding do the converse of the above encoding procedure. 上記問題により得られた符号語系列を復号せよ 符号化と逆の手順を行えばよい 4文字前 長さ3 A B AC ABAD ACABAD 6文字前 長さ6 2文字前 長さ1

2 前回の練習問題 LZWアルゴリズムの特許に関し,どのような問題が 発生したか調べよ UNISYS社が特許権を保有していた
当初は特許権の行使をしないと言明 ⇒ 後に方針転換 GIF 画像を扱うソフトウェアの取り扱いに混乱 2003年に米国,2004年に日本での特許失効 非可逆符号化を実現するアルゴリズムにはどのような ものがあるか調べよ (アルゴリズムとデータフォーマットの区別が明確でないが) JPEG, MPEG, GIF 等 quiz in the last class (cnt’d) - survey what has happened concerning the patent of LZW algorithm Originally, UNISYS holds the patent of LZW. At the beginning, UNYSYS said that they will not charge for the use of the LZW algorithm, because they considered that the algorithm will not have commercial importance. Some years later, the web become popular and the people in UNISYS found that good algorithms for images data play important roles in many on-line business. They changed the policy and said that they will charge for the use of LZW algorithm. This made a pretty confusion because the GIF encoding which uses the LZW algorithm were so popular that time. The patent has been expired in 2003 in the US, and in 2004 in Japan. - survey lossy coding algorithms which are used in practical applications there are many algorithms, though the distinction of the “algorithm” and “data format” is not clear...

3 本日の講義 乱数,擬似乱数について...「情報理論」からは少し寄り道 乱数,擬似乱数 Kolmogorovによる乱数の定義
擬似乱数の統計的検定 x2検定 擬似乱数の生成法 today’s class random sequence and pseudo-random sequence this is not on the main-stream of the information theory - random numbers and pseudo random numbers (sometimes “numbers” and “a sequence” are used confusingly in this slide) - the definition of randomness by Kolmogorov - test of randomness - chi-square test - algorithms for generation random numbers

4 乱数とは 乱数,乱数系列(random numbers, random sequence) 広辞苑による記述:
0から9までの数字から無作為復元抽出を繰り返して得られる数 の列.完全に無秩序でかつ全体としては出現の頻度が等しい. 工学の立場からは,上記の記述では不十分 「0から9までの数字」というのは本質的でない ⇒ 0 と 1 だけからなる乱数を考える場合も多い 「完全に無秩序」が,どのような状態を示すのか明確でない what is a random number? according to a dictionary, random numbers is a sequence of numbers, between 0 to 9, which are chosen with no special intensions or rules. It is a disordered sequence, and the probabilities of each number in the sequences are almost all equal. This definition is not “engineering” - the restriction “0 to 9” is not essential, and we often consider random numbers with 0s and 1s. - it is not clear what “disorder”means.

5 「無秩序さ」の定式化 「無秩序さ」:統計的に記述するのが一般的だが... 計算機の出現により,「無秩序さ」にも新しい定式化が出現
有限系列 x を出力するプログラムの集合を (x) とする プログラム...入力なしの決定性プログラム(毎回同じ動作) プログラム pの記述長を |p| と書く(文字数,行数等) x のコルモゴロフ複雑さ(Kolmogorov complexity) formalization of the “disorder” The disorder might be formalized by statistical means, but a new formalization has been investigated after the invention of computers and computer programs. Let x be a sequence with finite-length, and let Pi(x) denote the set of programs which outputs x, where a program in this context is a deterministic procedure which takes no inputs. The size of a program is denoted by |p| (|p| can be a number of lines or characters of the program). The Kolmogorov complexity of a sequence x is defined as the size of the smalless program in Pi(x). (x を出力する最小のプログラムのサイズ)

6 コルモゴロフ複雑さの例(1) x1 = “ ”... 40文字 x1 を出力するプログラム プログラムp1 : printf(“ ”); ...51文字 プログラム p2 : for(i=0;i<20;i++)printf(“01”); ...30文字 ⇒K(x1)  30 < 40 x2 = “ ”... 40文字 x2 を出力するプログラム プログラムp1 : printf(“ ”); ...51文字 ⇒K(x2)  51 Here are some examples. Note that the Kolmogorov complexity is an ideological notion, and it is difficult to determine the Kolmogorov complexity as a numerical value. Consider these examples as just sketch of ideas of Kolmogorov. Let x_1 be the sequence given in the slide. It consists of 40 characters. We can consider two programs which output x_1. The program p_1 outputs the string by a single printf sentence, and the program will contain 51 characters (ignore headers) . We can also consider another program p_2 which makes use of the “for” sentence. The size of this program is 30 characters. This suggests that the Kolmogorov complexity of the sequence x_1 is 30 or less. Next consider x_2. For this sequence we can consider a program (again) referred as p_1, and the program contains 51 characters. Unfortunately I could not find any other program which outputs x_2 and which is smaller than this p_1. The Kolmogorov complexity of x_2 is therefore 51 or less.

7 コルモゴロフ複雑さの例(2) x3 = “11235813213455891442333776109871597...”... 100万文字
プログラムp1 : printf(“ ”); 万+11文字 プログラム p2 : フィボナッチ数列を計算して出力 ⇒ 数百文字で十分 ⇒K(x3) は数百文字以下 コルモゴロフ複雑さ K(x): x の統計量ではなく,x の持つ構造・規則性の指標となっている Here is another example. Assume that there is a sequence x_3 which starts as illustrated and one million characters follow. For this sequence we can consider a program with one printf sentence whose size become one-million plus 11 characters. For this sequence, we can find another program if we realize that the sequence is actually a Fibonacci sequence. A program which computes the Fibonacci sequence will be realized by several hundred characters only, and the Kolmogorov sequence of x_3 is about several hundred bits. An interesting aspects of the Kolmogorov complexity is that, it is a measure of the “structure” and “regularity” rather than statistics.

8 コルモゴロフ複雑さと無秩序さ コルモゴロフ複雑さが大きいほど「無秩序」の程度が大きい ただし,コルモゴロフ複雑さを実際に計算することは困難
「無秩序さ」を定式化するための概念的なもの エントロピーとの違いは? エントロピーは,記号発生の統計的側面だけに着目 コルモゴロフ複雑さは,記号発生のメカニズムも考慮に Kolmogorov complexity and the “disorder” Larger Kolmogorov complexity implies that the sequence is more disorder. However, it is impossible determine the actual values of the Kolmogorov complexity. It is a “conceptual” measure rather than concrete one. You may wonder what is the difference between Kolmogorov complexity and the entropy. The entropy is a measure of disorder but it is defined according to the statistical aspect of the sequence. On the other hand, Kolmogolov complexity suggests the “complicatedness” of the mechanism which generates the sequence.

9 コルモゴロフ的な乱数の定義 コルモゴロフ複雑さに基づく乱数の定義:
系列 x について,K(x)  |x| のとき,x を乱数(系列)という x を記述するには,x を「ベタ書き」するしかない x をコンパクトに表現することができない(圧縮不能) プログラムもデータも2進系列で表現すると考えると... 長さ n の2進系列...2n 個存在 長さ n 未満の2進系列...2n – 1 個しか存在しない ⇒ 必ず,K(x)  x である系列(圧縮不能な系列)が存在する ⇒ 任意長の(コルモゴロフの意味での)乱数が必ず存在する The Kolmogorov complexity is used to formalize randomness. A sequence x is said to be random if K(x), the Kormogolov complexity of x, equals to or more than the length of the sequence x. In other words, to write x “as is” (by using one printf sentence”) is the most efficient way to represent x. There is no way to represent x more compactly than x itself. This definition allows us discussing how many random sequences exist in the world. Remind that a program is also a sequence, and assume naturally that a program is written as a binary sequence. For a positive integer n, there exist 2^n sequences whose length is n. On the other hand, the number of sequences which are shorter than n is only 2^n – 1. This suggests that there exists a sequence x with K(x) >= x, and that there exists a random sequence with an arbitrary length.

10 乱数の存在確率 Vn:長さ n の2進系列集合,|Vn| = 2n, |V1|+ |V2|+...+|Vn–1| = 2n–1
Next we consider “how common” random sequences is. Let V^n be the set of binary sequence with length n. The first line of the slide shows basic properties of the numbers of sequences in V^n. Assume that, 2^n -1 sequences out of 2^n sequences in V^n can be compressed (i.e. there is a shorter representation), then each of those sequences has a program which generates the sequences. The size of the program is less than n, and the program is a sequence in either of V^1, ..., V^{n-1}. This implies that no sequence in V^1, ..., V^{n-1} can be compressed, and that the sequences in V^1, ..., V^{n-1} are random. By extending this discussion, we can show that almost all sequences are random in the sense of Kolmogorov complexity. ⇒ ほとんどすべての系列は圧縮不能(乱数である)ことを導ける

11 乱数の計算可能性 ほとんどすべての系列は乱数である: 乱数の存在は保証しているが,具体的な作り方は不明
サイコロを振る,無作為に生成されるデータを利用 etc. 十分に無秩序で良い乱数を得ることができる 大量の乱数を得るには効率が悪い 専用の乱数生成装置を利用 大量の乱数を効率よく生成できる 特殊なハードウェアが必要でコスト高 真の乱数(真性乱数)を,大量かつ安価に生成するのは困難 Asymptotically saying, almost all sequences are random. However, we do not know how to generate random sequences. One common procedure to generate random sequences is to use a dice. This gives us random sequences with very good quality, but unfortunately it is not very efficient way. We will not be able to obtain mega bytes of random data by casting dices by ourselves. We may devise special hardware for generating random numbers, but such hardware must be expensive. Unfortunately, it is very difficult to generate “real” random sequences economically and efficiently.

12 擬似乱数 擬似乱数 (pseudorandom numbers): ある種の規則性にしたがって生成される「乱数のようなもの」
コルモゴロフ的な意味での乱数ではない コンピュータプログラム等により,大量かつ安価に生成可能 ⇒ 各種の実験やシミュレーション等の実施に有益 擬似乱数にも良いものと悪いものがある ⇒ 各種の検定により,擬似乱数の品質を評価することが必要 (検定法を頭に入れた後で,具体的な擬似乱数生成法を学ぶ) a pseudorandom number s are numbers which seem like random numbers, but generated according to a certain rule. In this sense, the Kolmogorove complexity of pseudorandom numbers is smaller than the original sequence, and they are not random numbers. However, such pseudorandom numbers are useful in various kinds of experiments and computer simulations. One note we need to remind is that there are “good” pseudorandom numbers and “bad” pseudorandom numbers. We would like to distinguish the good one bad one. After reviewing those “test” scheme, we consider algorithms for generating pseudorandom numbers.

13 統計的検定 擬似乱数の評価尺度: 予測が困難か? 暗号等の用途では重要だが,評価が難しい 十分に無秩序か?
数列の統計的性質を検定すれば良い...比較的容易 本講義では,統計的検定法について紹介 「あるプログラムから生成された十分長い系列が, どれだけ真性乱数に近い統計的性質を有するか」を議論 「特定の系列が乱数か否か」を判定する問題ではない statistical test there are several aspects in evaluating pseudorandom numbers. - is it difficult to guess the forthcoming sequences? this property is important if we use pseudorandom numbers in cryptographic purpose, but this is rather difficult subject. We do not discuss this issue in this class. - does the sequence have certain statistical biases? this can be evaluated by comparing statistics of the sequence with that of the real random sequences. In this class, we consider statistical test to evaluate pseudorandom numbers. We assume that we have a very long sequence, and evaluate how the statistics of the sequence show similar properties to real random sequences. Unfortunately this study does not give a clear answer to a naive question “is this sequence random?”

14 復習:確率,確率密度関数 確率変数:ある試行にともなって,なんらかの値を取る変数 確率分布:
確率変数が,どの確率でどの値を取るかをあらわしたモノ 確率変数が離散的な場合,確率分布は表の形で与えられる 確率変数が連続的な場合,確率分布は関数により与えられる review of statistics. remind the notion of a random variable and probability distribution function. If the random variable is not continuous, then the probability distribution function might be given by a table. 0.3 0.4 0.2 0.1 確率密度関数

15 確率密度関数について 確率密度関数(probability density function, pdf )
連続的な確率変数の確率分布を表現する関数 確率変数が x 以上 y 未満の値を取る確率= x y ここの面積 x → 最小値,y → 最大値のとき,面積=1 The probability distribution function, pdf is a function which represents the distribution of a certain continuous random variable. The probability that a random variable x takes a certain value between x and y is given by the integral. If x goes to the minimum value and y goes to the maximum value, then the integral approaches to one.

16 x2値の定義 確率変数Xの取る値を,有限個のクラス C1, C2, ..., Cl に分割する
C1 = [0, 10), C2 = [10, 30), C3 = {30以上の偶数}, C4 =その他 理想的なケースにおいて,X が Ci に属する確率は pi とする 実際に観測された値の系列を a = a1, a2, ... an とする 系列 a が,どれだけ理想に近いか評価したい ⇒ 以下の x2 値(x2 value, Chi square value)を評価すれば良い This slides define x^2 values, where “x” is a Greek symbol “chi”. Assume that we partition the domain of a random variable X into l classes C_1, C_2, ..., C_l. If X takes integer values between 0 to 100, then one possible partition is C_1, C_2, C_3 (even numbers with 30 or more) and C_4 (others). In X follows an ideal distribution, then we can determine p_i which is the probability that X takes a value in C_i. In the real observations, X does not follow the ideal distribution. Let a_1, ..., a_l be the actual sequence which is observed as X in a real trial. To evaluate how a_1, .., a_l are close to the ideal case, we deifne x^2 square values as given in the slide, where n_i is the number of observations that X takes a value in C_i. (カイ2乗値) ni...系列 a において Ci に属する要素の個数

17 x2値の解釈 ni...系列 a において Ci に属する要素の個数 npi...長さnの理想的系列における ni の期待値
⇒ 分子部分は 0 に近づく ⇒ x2 値も 0 に近づく Review the equation. n_i is the number of observations that X takes a value in C_i. np_i is the expected number of observations that ideal X takes a value in C_i. If the sequence a_1, ..., a_n is close to the ideal scenario, there it is expected that is not so much difference between n_i and np_i. In this case, the chi-square value approaches to zero.

18 x2値の計算例 1, 2, ...6 からなる42文字の系列(n = 42)を考える
これが真性乱数なら,pi = 1/6 ⇒ npi = 7 a1 = n1 = 10, n2 = 5, n3 = 8, n4 = 6, n5 =8, n6 = 5 x2 = 32/7 + 22/7 + 12/7 + 12/7 + 12/7 + 22/7 = 20/7 a2 = n1 = 25, n2 = 2, n3 = 3, n4 = 8, n5 =4, n6 = 0 x2 = 182/7 + 52/7 + 42/7 + 12/7 + 32/7 + 72/7 = 424/7 a1 のほうが a2 よりも理想的なケースに近い Here are examples of computation of chi-square values. We consider several sequences of integers from 1 to 6, where the length of the sequences is n = 42. If this were a truly random sequence, then it is expected that each integer occurs with the same probability of p_i = 1/6, and the expected number of occurrences of each integer is np_i = 7. Consider a_1 first. In this sequence, there are ten “1”, five “2” and so on. The chi-square value of a_1 is computed as given in the slide. Next, consder a_2. This sequence contains so many “1” but no “6”, so biased. The chi-square value of a_2 is 424/7, much larger than the chi-square value of a_1. This suggests that a_1 is close to the truly random sequence than a_2.

19 x2値の計算例(続) a3 = (長さ 72) n1 = 12, n2 = 12, n3 = 12, n4 = 12, n5 =12, n6 = 12 x2 = 02/ / / / / /12 = 0/12 理想的な乱数? 当然 no 長さ2のブロック化を考える...n = 36ブロック 理想は,“11”, ... “66” が1/36 の確率で発生,npi = 1 n11 = 6, n12 = 0, ..., n22 = 6, ... x2 = (6 – 1)2/1 + (0 – 1)2/ = 180 ⇒ かなり大きい 様々なクラス分割により,多面的に評価する必要がある Next, consider a sequence a_3 defined as this. For this example, the length of the sequence is 72. This is a quite unnatural sequence as you can see, but the chi-square value of a_3 is zero. Does this mean that a_3 is a “random” sequence? Obviously no. For example, consider to divide the sequence into blocks of size two. In this case, there are 36 patterns, and each pattern occurs equally likely. The sequence has 36 blocks and the expected number of each block occur is one. However, in the sequence a_3, we have six occurrences of the pattern “11”, zero occurrence of “12” and so on. The chi-square value of this “blocked” sequence is then determined to be 180. This is a pretty big value, suggesting that this sequence is not a random sequence. This example tells us one important fact. We need to evaluate a sequence using different partitions and different definitions of the classes.

20 x2値の評価 理想的な系列であっても,x2 値が必ず0になるとは限らない 理想的系列における x2 値の分布と,実測値とを比較すべき
[定理] 分割クラス数が l のとき,理想的系列における x2 値の分布は, 自由度が l – 1 のx2 分布 (Chi square distribution)にしたがう We remark that truly random sequences do not always make the chi-square value to zero. To evaluate the randomness of pseudorandom numbers, we need to known the distribution of the chi-square values of truly random sequence. This distribution is known as the Chi-square distribution. Theorem: If the number of classes is l, then the chi-square values of truly random sequences obeys the chi-square distribution with degree l-1. The graph sketches the distributions for several degrees. 自由度 2 自由度 4 自由度 6 x2 O

21 x2値の比較 a3 = (長さ 72) x2 = 02/ / / / / /12 = 0/12 クラス数が6 ⇒ 自由度が 5 の x2 分布に従うはず x2 O 自由度 5 Remind a_3 which we used previously. Without the “block” partition, we had considered six classes with each class corresponds to an integer between 1 to 6. If the sequence were truly random, then the chi-square value obeys the chi-square distribution with degree 5, which is illustrated as this. This graph suggests that it is quite rare that the chi-square value of a truly random sequence become zero. Too small chi-square values suggest that the sequence is unnaturally “ordered”. 理想系列では,x2 値が 0 になる確率はきわめて小さい 観測系列の x2 値が 0 ⇒ 理想系列では考えにくい振る舞い x2 値の大小だけでなく,理想的なx2分布と比較することが重要

22 x2値検定の実施例 x2 値の大小だけでなく,理想的なx2分布と比較することが重要 実際には...
元のデータをいくつかのデータセット(部分集合)に分割 各データセットに対して x2 値を計算 得られた x2 値の集合の妥当性を検証 再び統計的検定を行う 上下パーセント点と比較する 上側1%点... 100回に1回くらいは,この値を超えるかもという点 Here are some suggestions to make use of chi-square tests. Basically, the chi-square should be considered together with the chi-square distributions. In a typical scenario, the chi-square test is done as follows. 1. Divide the original sequence into several subsequences. Each subsequence is called a “data set” in this slide. 2. For each data set, compute the chi-square value. We have a set of chi-square values. 3. Compare the distribution of the chi-square values of the data sets and the chi-square distribution. If the two distributions are similar, then we can say that the original sequence has good randomness. The similarity of distribution can be defined in several different ways. - use a statistical test again - use percentile points: upper one-percentile point is a value (of chi-square values) such that one out of one hundred samples can have a chi-square value above this point.

23 その他の統計的検定手法 KS検定 (Kolmogorov-Smirnov検定) x2 検定の連続変数バージョン(クラス分割不要) ラン長検定
長さ k のラン発生確率 = 0.5×長さ k – 1 のラン発生確率 (2元の無記憶定常情報源で,通報発生が等確率のとき) ランの発生個数について,x2検定を行う手法 ポーカー検定,衝突検定,間隔検定 etc. いずれの方法も,十分長い系列に対して行う必要がある (短い系列では,統計的性質が出現しにくい) This slide shows several other statistical tests. KS test can be regarded as a continuous version of the chi-square test. We don’t have to define the family of classes in advance. The run-length test makes use of the distribution of runs in a sequence. In binary sequences, the probability that “a run with length k occurs” is half of the probability that “a run with length k – 1 occurs”. This allows us estimating the expected numbers of runs in a sequence. We can make use of this expected numbers to make the chi-square test based on run-length. There are several more test, but one common remark that they are useless if the length of the original sequence is too short.

24 擬似乱数生成アルゴリズムについて 様々な擬似乱数生成アルゴリズムが提案されている
その多くが,少量の乱数の種(シード, seed)から,多量の 擬似乱数系列を生成するもの 線形合同法 (linear congruent method) 漸化式に基づいて擬似乱数を生成する方式 i 番目の値が Xi のとき,Xi+1 = aXi + c mod M とする a, c, M は適当なパラメータ(選び方にはコツがある) 得られる値は 0 から M – 1の整数値 algorithm for generating pseudorandom sequences. There are several algorithms for generating pseudorandom sequences (such algorithms are sometimes called pseudorandom sequence generator (PSG). PSG usually accepts a seed information, and generates plenty number of pseudorandom numbers by using the seed. The linear congruent method is one of typical PSG algorithm. In this algorithm, random numbers are determined according to a certain recursion. For example, if the i-th value of the sequence is X_i, then the next number is determined as X_{i+1} = aX_i + c mod M, where a, c and M are appropriately chosen parameters. This recursion gives us pseudorandom numbers between 0 to M – 1.

25 線形合同法の性質について 得られる擬似乱数列の周期は,必ず M 以下となる M は,ある程度大きな値にすべき
M の選択がまずいと,Xiの下位ビットのランダム性が悪化する M は素数にするのが良い 処理を簡便にするため,M = 2e とする実装もある パラメータ a や c の選択についても,ある程度経験則が存在 Properties of linear congruent method have been studied widely. - The length of the cycle of the sequence is M or less. Thus M must be a big number. - If the choice of M is poor, then lower bits of X_i will not be random. It is recommended to choose a prime number for M, but it is OK in some cases to use M as a power of two, which is advantageous in implementation. - There are some heuristics for choosing parameters a and c.

26 線形合同法の問題点 (Xi, Xi+1) で与えられる2次元平面上の点を考える (Xi, Xi+1) の点が,平面全体に広がることが理想的
⇒ 直線 x = a 上には,高々一個の点しか配置されない One serious problem of the linear congruence method is that the randomness is poor if the algorithm is used in a high-dimensional context. Consider that two consecutive numbers X_i and X_{i+1} are used to choose one point in a two-dimensional plane. When one uses a PSG, he or she expects that the sampled points distribute all over the plane. However, if we employ the linear congruence method as PSG algorithm, then this expectation is betrayed. In the linear congruence method, X_{i+1} is uniquely determined from X_i. This means that the y-value is uniquely determined for the x-value in the plane. The sampled points align on a certain line on the plane. The example shows hat happens if we choose the equation in the slide is used. This alorithm is not appropriate if we would like to sample a point over a plane. (初期値が 5のとき) Xi+1 = 5Xi + 1 mod 7 のとき... 6 5 4 3 2 平面上の点を一個選んで... という用途には使えない 1 O 1 2 3 4 5 6

27 M系列法 M系列法 (M-sequence method) 線形フィードバックシフトレジスタを利用して擬似乱数を生成
p 段シフトレジスタ ⇒ 状態数は 2p 通り存在する 結線方法によっては,周期2p–1の擬似乱数系列が生成可能 特異点(全ゼロ状態)以外のすべての状態を経由 ⇒ 2p–1 という周期は,レジスタ数 p 段では最大値(Max) Another well known algorithm is the M-sequence method. In this method, we generate sequences by making use of a linear feedback shift register. If the depth of the register is p, then the circuit has 2^p different states. By carefully designing the value of the feedback switches, the register can generate a sequence whose cycle length is 2^p -1. In this case, the registers experience all the possible states except the all-zero singular state. Note that the cycle length is the maximum among all sequences which can be generated by shift registers with depth p. Xi Xi–1 Xi–p Xi–p+1 Xi–p+2 結線 or 断線

28 M系列法について 結線方法は,原始多項式の係数にしたがって決定(省略) 生成される系列は,非常に優れた統計的性質を持つ
レジスタの初期値の違い ⇒ 生成される系列の位相の違い シフト加法性 高い自己相関性を持つ ⇒ CDMA などの通信方式でも利用 properties of the M-sequence method - the values of feedback switches can be determined algebraically - the generated sequence has superior statistical properties - the difference of the initial contents of the registers does not make different sequences. It just makes a phase shift of an identical sequence - shift additive property: if we shift an M-sequence and add it to the original sequence, then the obtained sequence is again the M-sequence, just phase shifted - strong self correlation property: compute XOR of two sequences which are obtained by phase shifting an identical sequence. The number of zeros in the XOR become small if the two sequences are similar. If the sequence is the M-sequence, then the number of zeros shows strong peak only if the two shift phases are the same. This property is used in some communication systems.

29 その他の擬似乱数生成法 メルセンヌ・ツイスタ法 (MT法,Mersenne Twister method)
松本眞氏(現広島大),西村拓士氏(現山形大)により提案 メルセンヌ素数を使い,シフトレジスタ法を一捻りした方 高品質の擬似乱数を高速に生成することが可能 予測不能な擬似乱数生成法 いわゆる「記憶のない」乱数生成 ⇒ 暗号的用途に向く Blum 法など 線形合同法,M系列法,MT法は,暗号用途には使えない other algorithms - Mersenne Twister (MT) method - invented by Drs. Matsumot and Nishimura - give one twist of a shift-register based scheme - efficiently generates high-quality pseudorandom numbers - unpredictable PSG - some algorithms generate sequences which seems “memoryless”. This property is contributing in some cryptographic applications. - Blum algorithm is one typical algorithm - linear congruence, M-sequence and MT schemes are not of this type.

30 まとめ コルモゴロフ的な乱数の定式化 擬似乱数の統計的検定 x2検定 具体的な擬似乱数生成アルゴリズム 線形合同法,M系列法
summary of the class - Kolmogorov complexity - statistical test - algorithms

31 練習問題 a = とする(|a|=36) a を長さ2のブロックに区切り,x2値を計算せよ 長さ3,4のブロックに区切り,それぞれx2値を計算せよ 線形合同法を実装し,擬似乱数を生成せよ 上記で生成した擬似乱数を,2次元平面にプロットせよ (スライドの26ページ参照) レポート課題: 参照,5/10提出 report assignment: available from the above URL, due by May 10 quiz - let a be given as the slide - divide the sequence into blocks of length two, and compute the chi-square value - do the same thing for blocks with length three and four. - write a program of the linear congruence method - use the program to generate a pseudorandom sequence, and plot the obtained data in a two-dimensional array as in page 26 of this slide. Report assignment is given in the URL. The due date is May 10.


Download ppt "前回の練習問題 ABACABADACABADを,LZ77法により符号化せよ 上記問題により得られた符号語系列を復号せよ"

Similar presentations


Ads by Google