4.1 Chapter Overview. 4.2 The History of the 80x86 CPU Family (History of Intel CPUs).

4.1 Chapter Overview

4.2 The History of the 80x86 CPU Family
History of Intel CPUs

4004, 4040
Intel: the first commercial CPUs. They were still only 4-bit.

8008, 8080, 8085
Intel: 8-bit successors to the 4004. The Altair 8800, the first "personal" computer, was built around the 8080.
Competitors:
Motorola 6800: easy to program.
Zilog Z80: compatible with the 8080.

8086
Intel: a 16-bit CPU. This was an era when 16 KB of memory cost $200, so to save memory it could also execute 8-bit code. Its performance was outstanding for the time.
Competitors:
Zilog Z8000.
Motorola 68000: 32-bit registers.
National Semiconductor 16032 (renamed to 32016): 32-bit registers.
The competitors won on ease of programming.

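80186
Intel 80186, 80188 (16-bit): many peripheral circuits integrated on the chip; twice the speed of the 8086. Today's mainstream CPUs remain compatible with it. Memory was limited to 1 MB (an ordinary PC of the day had 4 to 64 KB).
iAPX432 (32-bit): the only compiler available was for the Ada language. Once the IBM PC/AT became popular, it faded away.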

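80286
Intel 80286 (16-bit): introduced in 1982. It added protected mode, raising the maximum memory to 16 MB.
It was adopted in the IBM PC/AT, which led to the spread of compatible machines.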

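80386
Intel: a 32-bit CPU. Today's Pentium 4 still uses an extension of this instruction set. It can address up to 4 GB of memory (as far as register width is concerned). The 80387 FPU coprocessor is available for it. It is a CISC design: the number of cycles per instruction varies widely. Performance is good, but the chip is hard to develop.
Competitors: developed RISC, which standardizes every instruction on a small, uniform cycle count and keeps the CPU simple.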

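80486
Intel: built-in FPU. Pipelining was introduced: instructions are partly processed ahead of time, which makes the chip weak on branches. Few instructions were added beyond the 80386.
Competitors: achieved superscalar execution on RISC, running multiple instructions per cycle. These chips were strong at floating-point arithmetic and were adopted for CAD and similar applications.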

Pentium
Intel: changed its naming scheme because other companies were using the same numbers. The chip was strong in desktop use.

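Pentium Pro
Intel: optimized for 32-bit software. Added instructions aimed at server use. Multiprocessor capable.
However, 16-bit applications were still very common at the time, so it disappeared without ever showing its real strength.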

MMX Pentium
Intel: added the MMX instructions to the Pentium for multimedia use. They handle integers only.

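Pentium II
Intel: Pentium Pro + MMX + improved 16-bit performance.
The focus on performance made the chip expensive, and customers fell away.
Celeron was announced for the low end: no L2 cache, about half the performance.
Xeon was announced for the high end: more cache, and support for multiprocessing with two or more processors.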

Pentium III
Intel: added the SSE instructions, which can also handle floating-point values.
Intel reached 1 GHz two days after AMD. At some point along the way, x86 had overtaken RISC in performance.

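Pentium 4
Intel: the first completely redesigned core since the Pentium Pro. When the text was written, 2 GHz was the fastest version. Features added later: Hyper-Threading, dual core.
Competitors: moved RISC to 64-bit CPUs. Systems with more than 4 GB of memory were no longer unusual.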

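Itanium
Intel + HP: Intel's first 64-bit CPU. It uses IA-64, an entirely new instruction set. Existing x86 code runs under emulation (slow!). The initial clock frequency was 1 GHz at a time when the Pentium 4 had reached 2 GHz. Compilers for 64-bit code were also immature.
Competitors: in the end, AMD64 was the design that succeeded in moving x86 to 64 bits. Intel had to follow with EM64T.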

4.3 A History of Software Development for the x86
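What Intel valued: compatibility.
8085 to 8086: there was no binary compatibility. Instead, the CPU instructions were almost the same.
At that time the operating system in use was CP/M and the language was Microsoft BASIC. Moving these across was the top priority.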

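Important events
The arrival of VisiCalc (the world's first spreadsheet program): Apple grew rapidly because of it.
The arrival of the IBM PC: computers came into serious use for business rather than as a hobby, and a great deal of software began to be written.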

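The 8086 segmented architecture
A 16-bit segment value is shifted left by 4 bits and added to a 16-bit offset to address 20 bits (1 MB) of memory.
A wrong offset value immediately corrupts the contents of memory.
Some IBM PCs shipped with as much as 640 KB of memory.
The scheme was notorious, but given the need for backward compatibility and the hardware technology of the day it was unavoidable.
The Z8000 and 68000 had better designs, but the 8086 already had a large base of existing software.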
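The address calculation described above can be sketched in a few lines. This is only an illustration: the function name `physical_address` is made up for this example, and the wrap-around at 20 bits reflects the 8086's 20 address lines.

```python
def physical_address(segment: int, offset: int) -> int:
    """Combine a 16-bit segment and a 16-bit offset into a 20-bit address.

    Hypothetical helper for illustration: the segment is shifted left by
    4 bits (multiplied by 16), the offset is added, and the sum is masked
    to 20 bits because the 8086 has only 20 address lines.
    """
    return ((segment << 4) + offset) & 0xFFFFF


# Two different segment:offset pairs can name the same physical byte.
a = physical_address(0x1234, 0x0010)
b = physical_address(0x1235, 0x0000)
print(hex(a), hex(b))
```

Because many segment:offset pairs alias the same byte, a small mistake in an offset lands on valid memory belonging to something else, which is why the scheme had such a bad reputation.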

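80286: the arrival of protected mode
Advantages: the memory limit was relaxed to 16 MB, and memory protection was added.
Disadvantages: protected mode and real mode are handled completely differently, and once the CPU enters protected mode it cannot return to real mode without a reset.
In the end it was not used much.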

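MS-DOS 1.0
Operating systems of the day booted from floppy disk (introduced by IBM).
Microsoft bought QD-DOS and sold it as MS-DOS for $50. CP/M cost $595, so nobody used it any more.
MS-DOS 2.0
Added a variety of features borrowed from UNIX.
It offered little real support for hardware, and managing system resources was the application's responsibility, not the operating system's.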

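80386
With a 32-bit CPU, memory could be addressed as a flat space of up to 4 GB. The chip was expensive, however, so many people kept using the 8086 and 80286.
As a result, people kept complaining about segmented addressing until Windows 95 came out.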

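After that
CPUs such as the 486 and Pentium became widespread, and 32-bit operating systems came into general use.
Up to 4 GB of memory could be addressed (up to 64 GB on the Pentium Pro).
MMX, SSE: video playback, 3D games, speech recognition, image recognition.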
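The memory limits quoted in this section (1 MB, 16 MB, 4 GB, 64 GB) all follow from the number of address bits. A minimal sketch, assuming the commonly cited address widths of 20, 24, 32, and 36 bits for these processors; the slides themselves state only the 20-bit figure, and the helper names are invented for this example.

```python
def addressable_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses reachable with the given width."""
    return 1 << address_bits


def human_readable(num_bytes: int) -> str:
    """Format a power-of-two byte count using binary KB/MB/GB units."""
    for unit in ("bytes", "KB", "MB", "GB"):
        if num_bytes < 1024 or unit == "GB":
            return f"{num_bytes} {unit}"
        num_bytes //= 1024


# Address widths are assumptions added for illustration.
ADDRESS_BITS = {"8086": 20, "80286": 24, "80386": 32, "Pentium Pro": 36}

for cpu, bits in ADDRESS_BITS.items():
    print(f"{cpu}: {bits} bits -> {human_readable(addressable_bytes(bits))}")
```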

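4.4 Basic CPU Design
Early programming method: the machine was programmed by physically wiring it. Because that was difficult, it was replaced by a socket-based scheme.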

After adding registers

4.5 Decoding and Executing Instructions: Random Logic Versus Microcode

4.6 RISC vs. CISC vs. VLIW

4.7 Instruction Execution, Step-By-Step

4.8 Parallelism - the Key to Faster Processors
Parallelizing the work: steps that do not affect each other can be carried out at the same time. Example: the EIP update can happen at any point.

Register-to-register move, step by step:
1. Fetch the instruction byte from memory.
2. Update the EIP register to point at the next byte.
3. Decode the instruction to see what it does.
4. Fetch the source register.
5. Store the fetched value into the destination register.

The same instruction with the EIP update overlapped:
1. Fetch the instruction byte from memory.
2. Decode the instruction to see what it does.
3. Fetch the source register and update the EIP register to point at the next byte.
4. Store the fetched value into the destination register.

In some cases, the order of the steps is also rearranged.

Adding a constant to a memory operand at EBX+disp, step by step:
1. Fetch the instruction byte from memory.
2. Decode the instruction and update EIP to point at the next byte.
3. Fetch a displacement for use in the effective address calculation.
4. Update EIP to point beyond the displacement value.
5. Fetch the constant value from memory and send it to the ALU.
6. Compute the address of the memory operand (EBX+disp).
7. Get the value of the source operand from memory and send it to the ALU.
8. Instruct the ALU to add the values.
9. Store the result back into the memory operand, update the flags register with the result of the addition operation, and update EIP to point beyond the constant's value.

The same instruction with steps overlapped and reordered:
1. Fetch the instruction byte from memory.
2. Decode the instruction and update EIP to point at the next byte.
3. Fetch a displacement for use in the effective address calculation and update EIP to point beyond the displacement value.
4. Compute the address of the memory operand (EBX+disp).
5. Fetch the constant value from memory and send it to the ALU.
6. Get the value of the source operand from memory and send it to the ALU.
7. Instruct the ALU to add the values.
8. Store the result back into the memory operand, update the flags register with the result of the addition operation, and update EIP to point beyond the constant's value.
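One way to picture the overlap is to record which earlier steps each step depends on and let every step run as soon as its dependencies are done. The sketch below applies that idea to the register-to-register move. The step names and the dependency table are assumptions made for this illustration, not a description of how any real 80x86 sequences its work.

```python
def schedule(steps):
    """Group steps into time slots; a step runs one slot after its last dependency.

    `steps` maps a step name to the names it depends on. Dependencies must
    appear earlier in the dict (Python dicts preserve insertion order).
    """
    slot_of = {}
    for name, deps in steps.items():
        slot_of[name] = 1 + max((slot_of[d] for d in deps), default=-1)
    slots = [[] for _ in range(max(slot_of.values()) + 1)]
    for name, slot in slot_of.items():
        slots[slot].append(name)
    return slots


# Hypothetical dependency table: updating EIP only needs the fetch to be done.
MOV_STEPS = {
    "fetch_instruction": [],
    "update_eip": ["fetch_instruction"],
    "decode": ["fetch_instruction"],
    "fetch_source_register": ["decode"],
    "store_to_destination": ["fetch_source_register"],
}

stages = schedule(MOV_STEPS)
for number, names in enumerate(stages, start=1):
    print(number, " + ".join(names))
```

Under these assumed dependencies the five steps fit into four time slots, the same saving as in the overlapped list, although this sketch pairs the EIP update with the decode rather than with the source-register fetch.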

4.8.1 The Prefetch Queue - Using Unused Bus Cycles
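Make effective use of the bus between the CPU and memory. Only one item of data can be transferred per cycle. While a value is being assigned to a register the bus is not in use, so that is a good moment to prefetch.

The queue cannot cope when the address changes suddenly, as with JNZ and similar instructions.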
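The behaviour described above can be modelled with a toy queue: it fills during idle bus cycles, serves instruction bytes without a memory access when it can, and has to be thrown away on a jump. The class name, the four-byte capacity, and the method names are all assumptions chosen for this sketch.

```python
from collections import deque


class PrefetchQueue:
    """Toy model of a prefetch queue (illustrative only, not a real CPU design)."""

    def __init__(self, memory, capacity=4):
        self.memory = memory      # list of instruction bytes
        self.capacity = capacity  # assumed queue size
        self.queue = deque()
        self.next_addr = 0        # next address to read from memory

    def idle_bus_cycle(self):
        """Use a cycle in which the bus is otherwise idle to prefetch one byte."""
        if len(self.queue) < self.capacity and self.next_addr < len(self.memory):
            self.queue.append(self.memory[self.next_addr])
            self.next_addr += 1

    def fetch(self):
        """Return (byte, hit); hit is True when the byte came from the queue."""
        if self.queue:
            return self.queue.popleft(), True
        byte = self.memory[self.next_addr]
        self.next_addr += 1
        return byte, False

    def jump(self, target):
        """A taken branch makes the prefetched bytes useless: flush the queue."""
        self.queue.clear()
        self.next_addr = target


memory = list(range(16))
q = PrefetchQueue(memory)
q.idle_bus_cycle()
q.idle_bus_cycle()
before_jump = q.fetch()  # served from the queue
q.jump(8)
after_jump = q.fetch()   # queue was flushed, so this goes to memory
print(before_jump, after_jump)
```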

4.8.2 Pipelining - Overlapping the Execution of Multiple Instructions
Supported from the 80486 onward.

A Typical Pipeline
Executing this instruction four times in the ordinary way takes 24 clocks. With a pipeline it takes only 9 clocks.
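The 24 and 9 clock figures are consistent with a six-stage pipeline in which every stage takes one clock and nothing stalls. That stage count is inferred from the numbers rather than stated in the slides. Under that assumption the arithmetic is:

```python
def sequential_clocks(instructions: int, stages: int) -> int:
    """Each instruction passes through every stage before the next one starts."""
    return instructions * stages


def pipelined_clocks(instructions: int, stages: int) -> int:
    """The first instruction needs `stages` clocks to fill the pipeline;
    after that, one instruction completes on every clock (no stalls assumed)."""
    return stages + (instructions - 1)


print(sequential_clocks(4, 6), pipelined_clocks(4, 6))
```

The advantage grows with the length of the instruction stream: for a long run the pipelined cost approaches one clock per instruction.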

Stalls in a Pipeline
Three problems make pipelining difficult: structural hazards, control hazards, and data hazards.

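Structural hazard: two instructions sometimes need the same circuit at the same time. The later one has to wait until the earlier operation has finished.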

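Control hazard: occurs when execution branches conditionally, as with JNZ.
Data hazard: working on a value too early, before an earlier instruction has finished with it, changes the computed result (covered again in 4.8.4).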
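A data hazard of the read-after-write kind can be spotted mechanically: a later instruction reads a register that a nearby earlier instruction has not finished writing. The sketch below uses a made-up instruction encoding, a destination plus a tuple of sources, and an assumed window of one following instruction. Real pipelines check this in hardware and over a window that depends on their depth.

```python
def find_raw_hazards(program, window=1):
    """Return (writer_index, reader_index, register) for each read-after-write
    dependency between an instruction and the `window` instructions after it."""
    hazards = []
    for i, (dest, _sources) in enumerate(program):
        for j in range(i + 1, min(i + 1 + window, len(program))):
            if dest in program[j][1]:
                hazards.append((i, j, dest))
    return hazards


# Each entry is (destination, sources); comments show the intended instruction.
program = [
    ("ebx", ("some_var",)),  # mov ebx, some_var
    ("eax", ("ebx",)),       # mov eax, [ebx]   needs the new ebx
    ("ecx", ("edx",)),       # mov ecx, edx     independent
]

hazards = find_raw_hazards(program)
print(hazards)
```

Here the second instruction must wait for the first, while the third is independent and could be moved between them to hide the delay.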

4.8.3 Instruction Caches - Providing Multiple Paths to Memory

4.8.4 Hazards

4.8.5 Superscalar Operation- Executing Instructions in Parallel

4.8.6 Out of Order Execution

4.8.7 Register Renaming

4.8.8 Very Long Instruction Word Architecture (VLIW)

4.8.9 Parallel Processing

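4.8.10 Multiprocessing
The techniques introduced so far: fine-grained parallelism. Programs get faster without the programmer having to think about it.
Multiprocessing: coarse-grained parallelism. The programmer has to be aware of the multiple processors. Moving work from one CPU to another is costly. The operating system must support multiple processors. Cache coherency must be maintained.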

4.9 Putting It All Together
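CPU evolution has been remarkable: the x86 architecture has become 10,000 times faster. There is no guarantee that IA-32 will last forever; it may eventually be replaced by IA-64 or something similar.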
