Koji Inoue Kyushu University

Slides:

Advertisements

Similar presentations

だい六か – クリスマスとお正月ぶんぽう. て form review ► Group 1 Verbs ► Have two or more ひらがな in the verb stem AND ► The final sound of the verb stem is from the い row.

Advertisements

て -form - Making て -form from ます -form -. With て -form, You can say... ～てもいいですか？ (= May I do…) ～てください。 (= Please do…) ～ています。 (= am/is/are doing…) Connecting.

Essay writing rules for Japanese!!. ＊ First ・ There are two directions you can write. ・よこがき / 横書き (same as we write English) ・たてがき / 縦書き (from right to.

米国セキュリティ調査（2002 CSI/FBI調査攻撃場所）

07. 値予測五島正裕.

バッファ・オーバフロー・アタックを動的に検出するセキュア・キャッシュ～安全性と消費エネルギーのトレードオフ～

07. 値予測五島正裕.

五段動詞の歌ごだんどうしのうた.

英語勉強会.

　辞書系（じしょけい）.

Chapter 11 Queues 行列.

日本語... ジェパディー！ This is a template for you to use in your classroom.

2010年7月9日　統計数理研究所　オープンハウス確率モデル推定パラメータ値を用いた市場木材価格の期間構造変化の探求 Searching for Structural Change in Market-Based Log Price with Regard to the Estimated Parameters.

Chris Burgess (1号館1308研究室、内線164)

じょし Particles.

What did you do, mate? Plain-Past

Verb Plain Negativeform

Training on Planning & Setting Goals

Object Group ANalizer Graduate School of Information Science and Technology, Osaka University OGAN visualizes representative interactions between a pair.

Explorations in Symbiosis on two Multithreaded Architectures

Only One Flower in the World

日本人の英語文章の中で「ENJOY」はどういうふうに使われているのか

Tohoku University Kyo Tsukada

V 03 I do NOT eat sushi. I do NOT do sumo.

Estimating Position Information by Detecting Network-Connection

十年生の日本語 Year 10 Writing Portfolio

Chapter 4 Quiz #2 Verbs Particles を、に、で

定期考査２　英語.

The Sacred Deer of 奈良(なら)

Who Is Ready to Survive the Next Big Earthquake?

Did he/she just say that? Get your head out of the gutter! Oh wait….

CRLA Project Assisting the Project of

VTA 02 What do you do on a weekend? しゅうまつ、何をしますか。

Chapter 1 Hamburger History

What is the English Lounge?

Air Pen -- an introduction of my recent result --

ストップウォッチのカードストップウォッチのカード

P4-21 ネットワーク上の経路に対する回帰問題について

New accessory hardware Global Platform Division

Session 8: How can you present your research?

非レイテンシ指向レジスタ・キャッシュ・システム

Causative Verbs Extensively borrowed from Rubin, J “Gone Fishin’”, Power Japanese (1992: Kodansha:Tokyo) Created by K McMahon.

Cache Organization for Memory Speculation メモリ投機を支援するキャッシュの構成法

L3. Ｓｅａｒｃｈ for Decisions in Games

WLTC Mode Construction

全国粒子物理会桂林 2019/1/14 Implications of the scalar meson structure from B SP decays within PQCD approach Yuelong Shen IHEP, CAS In collaboration with.

-Get test signed and make corrections

WELCOME TO THE WORLD OF DRAGON BALL

Where is Wumpus Propositional logic (cont…) Reasoning where is wumpus

豊田正史（Masashi Toyoda）福地健太郎（Kentarou Fukuchi)

My Favorite Japanese Rock

第24回応用言語学講座公開連続講演会後援：国際言語文化研究科教育研究プロジェクト経費

クイズやゲーム形式で紹介した実例です。いずれも過去のインターン作です。

研究会「LHCが切り拓く新しい素粒子物理学」

2019/4/22 Warm-up ※Warm-up 1～3には、小学校外国語活動「アルファベットを探そう」（H26年度、神埼小学校におけるSTの授業実践）で、５年生が撮影した写真を使用しています（授業者より使用許諾済）。

Genetic Statistics Lectures （４） Evaluation of a region with SNPs

Tag question Aoyama Shogo.

The difference between adjectives and adverbs

Created by L. Whittingham

東北大情報科学田中和之,吉池紀子山口大工庄野逸理化学研究所岡田真人

英語音声学（７）音連結.

MO装置開発 Core part of RTR-MOI Photograph of core part.

Cluster EG Face To Face meeting

BW: 英語で書いて下さい１）小さくする２）うるさく話す３）大きく書く４）上手になる（なる＝become)

Apply sound transmission to soundproofing

Cluster EG Face To Face meeting 3rd

アノテーションガイドラインの管理を行うアノテーションシステムの提案

Improving Strategic Play in Shogi by Using Move Sequence Trees

自己ルーティングによるラベル識別コリニア音響光学効果を用いたラベル識別スケジューリング経路制御ラベルラベル識別ラベル処理

Presentation transcript:

Koji Inoue Kyushu University Security Aware System Design ~Architectural Support for Secure Computing~ Koji Inoue Kyushu University

IoT Processors MUST be Secure! Architectural Support for Security ? Pipelining SuperScalar Branch Prediction Value Prediction On-chip Cache OOO Exe. ILP TLP MLP Clock Gating Signal Gating DVS Resizing Drowsy Operation Selective Activation High Performance Low Power/Energy

Our Approach (#1) #1 Secure Cache Security High Low Performance Architectural Support for Security #1 Secure Cache Pipelining SuperScalar Branch Prediction Value Prediction On-chip Cache OOO Exe. ILP TLP MLP Clock Gating Signal Gating DVS Resizing Drowsy Operation Selective Activation High Performance Low Power/Energy

Buffer-Overflow Attack R.B.Lee, D.K.Karig, J.P.McGregor, and Z.Shi, “Enlisting Hardware Architecture to Thwart Malicious Code Injection,” Proc. of the Int. Conf. on Security in Pervasive Computing, Mar. 2003. Well-Known vulnerability Exploited by Blaster@2003 Caused by unexpected operations writing an inordinately large amount of data into a buffer This vulnerability exists in the C standard library (e.g. strcpy) Leads to a stack smashing An attack code is inserted The return address is corrupted Highjacks the program execution control

Function Call/Return Start f( ) Call g( ) Execute strcpy( ) int f ( ) { … g (s1); } int g ( char *s1) { char buf [10]; strcpy(buf, s1); Program code Start f( ) Call g( ) Execute strcpy( ) Return to f( )

Function Call/Return Return Address Start f( ) Call g( ) Stack Growth s1 Return Address Saved FP Local Variable buf FP SP Higher Addr. Lower Addr. int f ( ) { … g (s1); } int g ( char *s1) { char buf [10]; strcpy(buf, s1); The Next PC of Call g( ) Program code Program code Start f( ) Call g( ) Execute strcpy( ) Return to f( )

Function Call/Return Return Address String Start f( ) Call g( ) Higher Addr. int f ( ) { … g (s1); } int g ( char *s1) { char buf [10]; strcpy(buf, s1); FP s1 The Next PC of Call g( ) Return Address Program code Stack Growth Saved FP String Local Variable buf SP Lower Addr. Start f( ) Call g( ) Execute strcpy( ) Return to f( )

Function Call/Return Return Address String Start f( ) Call g( ) Stack Growth s1 Return Address Saved FP Local Variable buf FP SP Higher Addr. Lower Addr. The Next PC of Call g( ) String int f ( ) { … g (s1); } int g ( char *s1) { char buf [10]; strcpy(buf, s1); Program code Start f( ) Call g( ) Execute strcpy( ) Return to f( )

Function Call/Return Return Address String Start f( ) Call g( ) Stack Growth s1 Return Address Saved FP Local Variable buf FP SP Higher Addr. Lower Addr. The Next PC of Call g( ) String int f ( ) { … g (s1); } int g ( char *s1) { char buf [10]; strcpy(buf, s1); Program code Start f( ) Call g( ) Execute strcpy( ) Return to f( )

Stack Smashing Return Address Return Address String Start f( ) Growth s1 Return Address Saved FP Local Variable buf FP SP Higher Addr. Lower Addr. Stack Growth s1 Return Address Saved FP Local Variable buf FP SP Higher Addr. Lower Addr. The Next PC of Call g( ) String int f ( ) { … g (s1); } int g ( char *s1) { char buf [10]; strcpy(buf, s1); The Next PC of Call g( ) Program code Start f( ) Call g( ) Execute strcpy( ) Return to f( )

Stack Smashing Insert the attack code! Corrupt the return address! Higher Addr. Stack Growth s1 Return Address Saved FP Local Variable buf FP SP Higher Addr. Lower Addr. The Next PC of Call g( ) String int f ( ) { … g (s1); } int g ( char *s1) { char buf [10]; strcpy(buf, s1); FP s1 Attack Code To the Attack Code Return Address The Next PC of Call g( ) Program code Stack Growth Saved FP Local Variable buf SP Lower Addr. Start f( ) Call g( ) Execute strcpy( ) Return to f( ) Insert the attack code! Corrupt the return address!

Stack Smashing Insert the attack code! Corrupt the return address! Higher Addr. Stack Growth s1 Return Address Saved FP Local Variable buf FP SP Higher Addr. Lower Addr. The Next PC of Call g( ) String int f ( ) { … g (s1); } int g ( char *s1) { char buf [10]; strcpy(buf, s1); FP s1 Attack Code To the Attack Code Return Address The Next PC of Call g( ) Program code Stack Growth Saved FP Local Variable buf SP Lower Addr. Start f( ) Call g( ) Execute strcpy( ) Return to f( ) Insert the attack code! Corrupt the return address! Hijack the program execution!

Secure Cache Architecture Protect return-address (RA) values in the cache!

Secure Cache Architecture Protect return-address (RA) values in the cache! Generate one or more “Replicas” on each RA store

Secure Cache Architecture Protect return-address (RA) values in the cache! Generate one or more “Replicas” on each RA store Compare the original RA with a replica on the RA load

Secure Cache Architecture Protect return-address (RA) values in the cache! Generate one or more “Replicas” on each RA store Compare the original RA with a replica on the RA load If they are not the same, we know that the popped RA has been corrupted!

Security Issue The replica lines are also evicted from the cache Miss the opportunity to check the RA value if no replica lines reside in the cache So… Good for many applications (w/ high cache-hit rates) Bad for memory intensive applications (w/ high cache-miss rates)

Lock & Unlock Approach Prohibit the eviction of replica lines until they are loaded later! (Lock) Keep the replicas in the cache until they are evicted due to the replacement policy! (Unlock) Pros. Avoid premature replica-line invalidation Cons. Waste cache resource

Experimental Set-Up Processor Simulator SCache Model SimpleScalar3.0 4-way OoO Superscalar 4-way 16KB L1 D-Cache SCache Model LRU1-L&U: Lock & Unlock (w/ LRU replica placement) Benchmark Programs SPEC2000 7 integer programs, 4 fp programs Small input

Vulnerability LRU1-L&U Vulnerability Benchmark Programs (Nv-rald / Nrald) * 100 Total #of issued RA load Insecure issued Benchmark Programs

Performance Overhead LRU1-L&U Performance Overhead Benchmark Programs

#2 Dynamic Program Authentication Our Approach (#2) Architectural Support for Security #2 Dynamic Program Authentication Pipelining SuperScalar Branch Prediction Value Prediction On-chip Cache OOO Exe. ILP TLP MLP Clock Gating Signal Gating DVS Resizing Drowsy Operation Selective Activation High Performance Low Power/Energy

Prohibit Malicious Codes from Executing Search Malicious Codes Allow to Execute Only Authenticated Codes Virus Scan （Pattern Matching）（Rule Matching） Static Dynamic Certificate (w/ Key) Static Dynamic ? White List Black List

Concept of Dynamic Program Signature Determine an execution behavior for the program authentication from a secret key E.g., Memory-Access Pattern Control the execution behavior at compile time Monitor the execution behavior at run time E.g., Hardware Profiler Program object code compile Secret Key Behavior Model Behavior Model ? CPU

Dynamic Execution Behavior for Program Authentication Address X is Accessed in Every N Instructions! Program HW Profiler End of Execution Attack Code Hijacking of Execution Control N Instructions HW Profiler

How Can We Control The Execution Behavior At Compile Time? Basic Block A special instruction to generate a dynamic signature Unified original Branches Unify the size of basic blocks Speculative/OOO Execution Monitor the sequence of committed instructions

Security Strength We can NOT detect malicious attacks if The attacker knows the secret key information and algorithm how to determine the pre-defined behavior The attack codes are inserted at the compile time The attacker can predict the pre-defined behavior Provability: 1/(2ADDRbit-width+2N) where N is the candidate of the unified basic-block size The size of malicious code is smaller than that of unified basic-block E.g., less than 5 or 30 instructions

Performance/Cost Overhead (1/2) Evaluation Code size (Cost) Execution time (Performance) Experimental Environment ARMSimplescalar StrongARM model Support the dynamic signature Benchmark programs SPECint95（129.compress） Dynamic Signature Unified basic-block size：from 5 to 25 instructions #of key-load instructions in each basic block: 1

Performance/Cost Overhead (2/2) Load for Key Nop Original Code size x5 in maximum, and x1.7 in minimum Execution time overhead 50% in max. and 10% in min.

Apply Machine Learning (AI) Tech. Training Program HW Profiler L1D/I misses Instruction Counts #Loads/Stores Etc. Trusted Execution Behavior Model Input-A Input-A Input-A Execution Program HW Profiler Time Window (e.g., = 0.1 sec)

Preliminary Experiments Training lbm SVM (Support Vector Machine) Trusted Execution Behavior Model (for lbm) bzip2 mcf gcc wrf from SPEC2006 Execution Program Accuracy[%] Pos. Example lbm 100.00 Neg. Example bzip2 gcc mcf wrf Program Accuracy[%] Un-Known hmmer 100.00 libquantum perlbench sphinx3 bwaves 95.09

What’s Next? ~Energy-Security Tradeoff~ Don't you have such an experience? P You locked me! Go to your office! Let’s Work! Did I Lock the Door? ? Very Tired Did I Lock The Door? ? I Don’t Care!

Example of Energy-Security Tradeoff ~Case for White-List Execution~ Low energy, but insecure Program HW Profiler Time Window (e.g., = 1.0 sec) Execution High energy, but secure Program HW Profiler Time Window (e.g., = 0.1 sec)

Conclusions IoT processors MUST be… Low cost, low power/energy, high performance, and SECURE! Architectural support is a key technology Adaptive tradeoff optimization Our collaboration w/ IITD focuses on… Energy-security tradeoff in architecture perspective (including CPU/Memory/VMs)

Backup Slides

Example of Energy-Security Tradeoff ~Case for Secure-Cache~ 6.0% 5.4% LRU1L LRU1 LRU2 MRU1 MRU2 ALL MRU1: more than 88.5%of RA load ALL: more than 99.3% of RA load

Example of Energy-Security Tradeoff ~Case for Secure-Cache~ LRU1L LRU1 LRU2 MRU1 MRU2 ALL ALL: 23% of energy overhead MRU1: 9.9% of energy overhead

Example of Energy-Security Tradeoff ~Case for Secure-Cache~ 2.4% 2.3% E2VP 164.gzip 176.gcc 188.ammp 175.vpr 197.parser EV2P 164.gzip 176.gcc 188.ammp 175.vpr 197.parser EVP 164.gzip 176.gcc 188.ammp 175.vpr 197.parser Normalized to LRU1 MRU1: good for Energy-Oriented Applications MRU2: good for Security-Oriented Applications LRU2 MRU1 MRU2 ALL

Example of Energy-Security Tradeoff Security/Energy/Performance Energy SimpleScalar3.0 16KB 4-way D-cache Line size：32B OOO execution SPEC2000 7 integer programs 4 fp programs Small input 4KB SRAM design 0.18μm CMOS technology One way of the 16KB cache Hspice simulation w/ extracted load capacitances Measure the energy consumed for 1-bit accesses

AI技術を活用する！特徴量の可視化とその評価環境 SPEC2006から10種 INT：bzip2, gcc, perlbench, hmmer, libquantum FP ：sphinx3, bwaves, lbm, wrf, mcf パフォーマンスカウンタを特徴量として使用プログラムの実行全体を0.1秒ごとに取得以下の8つのイベントを取得命令(L1)/データキャッシュミス回数、実行命令数、load/store命令の実行回数、分岐命令の実行回数、　　　　浮動小数点除算器使用サイクル数、分岐命令予測ミス数主成分分析(PCA)による特徴量の可視化識別のためには特徴量の分布が重要 8次元の特徴量を2次元に削減実験のために用いるプログラムは、整数演算を用いるSPEC2006 INTから5種、浮動小数点演算を用いるSPEC2006 FPから5種、合計10種選びました。特徴量として、先行研究で用いられたパフォーマンスカウンタを使用しました。本実験で用いたCPUはパフォーマンスカウンタ用のレジスタが8つあるので、同時に8つのイベントを取得しました。イベントは数百種類の中から、先行研究で用いられたものなどを参考に、スライドに載っている8つのイベントを取得しました。プログラムの実行開始1秒後から終了まで、0.1秒ごとに8つのイベントの回数を取得し、プログラムの特徴量として用いました。ここで、ホワイトリスト方式によるプログラム識別のためには、特徴量の分布が重要となります。そのため、まずは各プログラムの特徴量がどのように広がっているかを主成分分析により可視化します。 0.1秒ごとに取得した8次元のパフォーマンスカウンタの値を、主成分分析により2次元に次元削減することでマッピングを行いました。

Visualization of Feature Vector 可視化を行った結果、このようになりました。各プログラムの特徴量に、凡例のように色をつけてマッピングしています。横軸は主成分分析後の第１主成分、縦軸は第２主成分を表しています。なお、それぞれの主成分は0から1の間で正規化しています。

Visualization of Feature Vector lbm 多くのプログラムで特徴量が重なっていますが、青い丸で囲ったプログラムlbmの特徴量は独立です。従って、lbmを識別するような識別器は実現可能であると考えられます。

機械学習による識別とその評価特徴量機械学習アルゴリズム lbmの識別実験特徴量可視化の実験で用いたパフォーマンスカウンタ使用サポートベクトルマシン lbmの識別実験正例：lbm 負例：bzip2, gcc, mcf, wrf プログラム正解率[%] 正例 lbm 100.00 負例 bzip2 gcc mcf wrf プログラム正解率[%] 未知 hmmer 100.00 libquantum perlbench sphinx3 bwaves 95.09 特徴量の可視化を行ったことで、lbmを識別する識別器は実現可能であり、bzip2を識別する識別器は実現が難しいと考察しました。そこで、可視化に用いた特徴量と同じパフォーマンスカウンタを特徴量として採用し、サポートベクトルマシンを用いた機械学習により識別器を作成して評価を行いました。 lbmとbzip2それぞれの識別器の学習に用いた、正例と負例のプログラムはスライドに載せているものを採用し、残りは評価に用いました。各プログラムの実行全体における、0.1秒ごとの8次元のパフォーマンスカウンタの値に、正例もしくは負例のラベルをつけて学習を行いました。ーーーーー Q. 負例はどのような基準で選んだの？ lbmの負例は、lbmの特徴量から離れているプログラムを選びました。また、bzip2の場合はbzip2と特徴量が似ているプログラムを選びました。どのように負例を選ぶべきかは今後考えるべきことかと思います。今後の参考にさせていただきます。