Accelerated Gradient Methods for Stochastic Optimization and Online Learning (Hu, Kwok and Pan, NIPS 2009). Takashi Ninomiya (二宮崇), Machine Learning Study Group, June 17, 2010.

Presentation transcript:

Paper: Chonghai Hu, James T. Kwok and Weike Pan (2009). Accelerated Gradient Methods for Stochastic Optimization and Online Learning. In Proc. of NIPS.

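Introduction. Stochastic Gradient Descent (SGD) can learn from large amounts of data, but its convergence is slow. Acceleration (Nesterov, 1983). Acceleration for objectives with a smooth component and a non-smooth component (Nesterov, 2007) (Beck & Teboulle, 2009); examples of non-smooth components are the hinge loss and the L1 regularization term. Convergence rate, where N is the number of iterations (Lan, 2009): the formula is not preserved in this transcript; its terms were labeled "smooth component" and "non-smooth component".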

Contributions of this paper. SGD and acceleration under strong convexity. Examples of strong convexity: log loss, square loss, the L2 regularization term, etc. Convergence rate. Extension to the online setting, with a regret analysis.

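Problem. The general problem is a stochastic optimization problem, formulated here as follows (the objective formula itself is not preserved in this transcript): ξ is a random vector; f(x) ≡ E[F(x, ξ)] is convex and differentiable (and L-Lipschitz); a second term [symbol not preserved] is convex but non-smooth; [symbol not preserved] is μ-strongly convex.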

Lipschitz Continuity and Strong Convexity. Definitions of Lipschitz continuity and of strong convexity (formulas not preserved in this transcript).
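For reference, the standard textbook forms of the two definitions are sketched below. They are not recovered from the slide. The sketch assumes that "L-Lipschitz" on the problem slide refers to the gradient of the smooth component f, as is usual in this setting, and it uses a generic function h (a hypothetical symbol) for strong convexity.

```latex
% Lipschitz continuity of the gradient of f with constant L
\|\nabla f(x) - \nabla f(y)\| \le L \,\|x - y\| \qquad \text{for all } x, y

% mu-strong convexity of h, for any subgradient g(x) of h at x
h(y) \ge h(x) + \langle g(x),\, y - x \rangle + \frac{\mu}{2}\,\|y - x\|^2 \qquad \text{for all } x, y
```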

Algorithm. Generalized gradient update (Nesterov, 2007). Proposed algorithm: SAGE.
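A minimal Python sketch of a generalized gradient update may help here, using the setting of the experiments (square loss with an L1 regularizer). It is a plain stochastic proximal gradient loop, not the SAGE algorithm itself, whose update rules are not preserved in this transcript. The function names and the step-size schedule L_t are illustrative assumptions.

```python
import numpy as np


def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (elementwise shrinkage toward zero)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)


def generalized_gradient_step(x, grad, L, lam):
    """Return argmin_y <grad, y - x> + (L/2) * ||y - x||^2 + lam * ||y||_1.

    For the L1 regularizer this has the closed form below:
    a gradient step of size 1/L followed by soft-thresholding.
    """
    return soft_threshold(x - grad / L, lam / L)


def stochastic_proximal_gradient(A, b, lam, n_iters=2000, seed=0):
    """Plain (non-accelerated) stochastic proximal gradient for
    0.5 * mean((A x - b)^2) + lam * ||x||_1, sampling one row per step."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    # Assumed schedule: start from the largest per-sample curvature and grow
    # like sqrt(t) so that the effective step size 1/L_t decays.
    L0 = np.max(np.sum(A ** 2, axis=1))
    for t in range(n_iters):
        i = rng.integers(n)
        grad = (A[i] @ x - b[i]) * A[i]  # gradient of 0.5 * (a_i . x - b_i)^2
        L_t = L0 + np.sqrt(t + 1.0)
        x = generalized_gradient_step(x, grad, L_t, lam)
    return x
```

The closed-form step is what makes the L1 case cheap: each iteration costs one sampled gradient and one elementwise shrinkage.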

Convergence Analysis. By choosing good L_t and α_t, the expected value converges rapidly to the optimum. Here [symbol not preserved] is assumed to be convex.

Convergence Analysis. Here [symbol not preserved] is assumed to be μ-strongly convex.

Online Algorithm. SAGE-based online algorithm.

Regret.
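For readers unfamiliar with the term, the usual definition of regret in online learning is sketched below. It is the standard textbook form, not the bound shown on the slide, which is not preserved in this transcript. The symbols φ_t (the loss revealed at round t) and x_t (the learner's choice at round t) are assumed notation.

```latex
% Regret after N rounds: cumulative loss of the learner minus the
% cumulative loss of the best fixed decision chosen in hindsight
R(N) = \sum_{t=1}^{N} \phi_t(x_t) \;-\; \min_{x} \sum_{t=1}^{N} \phi_t(x)
```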

Experiments. Data sets: pcmac (a subset of 20 Newsgroups) and RCV1 (a filtered collection of the Reuters corpus). Loss function: square loss. Regularizer: L1 regularizer.

Experiments (four slides of result figures; figure content not preserved in this transcript).