初心者による初心者のための「線形混合モデル」土屋政雄 2011/08/20 第 2 回心理・医学系研究者のためのデータ解析環境 R による統計学の研究会 ※著作権に関わりそうな画像類は削除.

初心者による初心者のための「線形混合モデル」土屋政雄 2011/08/20 第 2 回心理・医学系研究者のためのデータ解析環境 R による統計学の研究会 ※著作権に関わりそうな画像類は削除

本日の内容線形混合モデルって何？ – 名前がいろいろどういう時に使うの？ – 横断と縦断どれ位有名なの？ – 推奨してる論文や使われ具合今までの方法はだめなの？ – ANOVA や重回帰との違い

本日の内容線形混合モデルを使うとどんないいことがあるの？中はどんな風になってるの？ – かんたんにすうしき絵で描いてよ！ – グラフで図示 R でやってみてよ！ – データセット変換＆ library(HSAUR) 論文に，書きたいです

線形混合モデルって何？ Linear Mixed Model ざっくり説明すると，線形モデル（ ANOVA や重回帰）の進化版より細やかに – 式が増えたよ！より柔軟に – 必要な仮定が減ったよ！ ( イメージ ) 本の表紙画像 Multilevel analysis for applied research It's just regression! Robert Bickel

線形混合モデルって何？ Mixed models – 固定効果（ fixed effects ）とランダム効果 * （ random effects ）の混ざったモデルのこと。 – 回帰係数（切片や共変量効果）が上位のユニットでランダムにばらつくものとばらつかないものにわかれる – 一般的なマルチレベルモデルに対する特別なケースだと考えられるが，同じ意味で使われることも多い Diez Roux, 2002: Glossary for multilevel analysis. JECH;56:588-594 * 「変量効果」とも訳される今までの分析はこれだけ

線形混合モデルって何？一般線形モデル（ General linear model:GLM ）一般化線形モデル（ Generalized linear model:GLLM ）・ロジスティック回帰・ポアソン回帰・順序回帰等・ t 検定・分散分析・重回帰分析・共分散分析等ここにランダム効果が混ぜ込まれた図．伝統的な統計手法との関係ここにランダム効果が混ざったら，一般化線形混合モデルになる

線形混合モデルって何？名前がいろいろ – マルチレベル分析（ multilevel analysis ） – マルチレベルモデル・多水準モデル – 階層（線形）モデル（ hierarchical [linear] models ） →HLM – ランダム（効果 / 係数 / 回帰）モデル（ random [effects/coefficient/regression] models ） – 共分散要因モデル（ covariance components models ） – 分散要因モデル（ variance components models ） Diez Roux, (2002) ；小野寺編訳 2006 『基礎から学ぶマルチレベルモデル』

線形混合モデルって何？名前がいろいろ – 経験的ベイズモデル（ Empirical Bayes models ） – 成長曲線モデル（ growth curve models ） – 混合効果モデル（ mixed-effect model ）間違えやすいもの，似ているもの（土屋の印象） – 反復測定 ANOVA の between と within の混合 – 潜在成長曲線モデル →SEM – 一般化推定方程式（ GEE ） Gueorguieva & Krystal (2004)

どういう時に使うの？下位レベルの集団が上位レベルの集団に入れ子構造になっている時 – 横断でも縦断でも・・・・・・クラス A ・・・・・・クラス B ・・・・・・・・・・・・クラス C 学校 A （例）・学校・職場・医者と患者・・・・・・

「線形混合モデル」と言った時には，特に縦断デザインで使われることが多いかも → 今日のテーマ（例）・コホート研究・介入研究どういう時に使うの？・・・・・・・・・・・・学校とか病院とか・・・・・・ t1t1 t2t2 tntn ・・・・・・ t1t1 t2t2 tntn ・・・・・・ t1t1 t2t2 tntn 時点ごとの測定値

どれ位有名なの？色んな領域の journal で解説 & 紹介論文などが出ている – 発達研究・子ども（ Boyle & Willms, 2001 ） – 看護（ Shin, 2009 ） – 社会・パーソナリティ心理 (Nezlek, 2008; West, 2011) – 家族・カップル心理（ Atkins, 2005 ） – 精神生理（ Kristjansson et al., 2007 ）

今までの方法はだめなの？『無作為化試験の分析で ANOVA はたやすく誤って適用される：批判と代替統計アプローチの議論』 Vickers. （ 2005 ） Psychosom Med. 『 ANOVA の地位をゆずれ：反復測定データ分析の進歩と Archive of General Psychiatry の論文への反映』 Gueorguieva & Krystal (2004). Arch Gen Psychiatry. 『消えますよ・・・ほら消えた。：臨床試験での縦断フォローアップデータの解析における，従来の回帰 vs ランダム効果回帰モデルの比較』 Nich & Carroll, (1997) J Consult Clin Psychol ANOVA is old!

今までの方法はだめなの？ ANOVA の典型的な問題代替統計アプローチ F 値と p 値の報告だけ群ごとに平均と標準偏差を報告；平均値の差と 95% 信頼区間の報告 3 つ以上の群の試験の時，「全ての群は同等であるか？」の仮説のみへの言及効果や興味のある群比較の後，係数または平均値差と 95% 信頼区間と p 値の報告ベースラインで取られた指標による，試験への「群」効果，「時間」効果，「群 × 時間の交互作用」の報告ベースラインの得点を共変量とした ANCOVA を行い，群間の違いのみを報告するベースラインと治療中 or 後の何回かで取られた指標による，試験への「群 × 時間の交互作用」の意味は不明確ベースラインの得点を共変量とした ANCOVA を行い，従属変数の要約（例：フォローアップ指標の平均）を使う。または縦断混合効果モデルを使う Vickers. （ 2005 ） Psychosom Med.

今までの方法はだめなの？ Gueorguieva & Krystal (2004) エンドポイント分析 rANOVArMANOVAMixed –Effects Analysis 全対象者の完全データが必要 YN*N* YN 欠損のある対象者を取り除く影響サンプルバイアス非適用欠損値代入の影響推定バイアス非適用異なった時点での測定 YNNY 時間効果の記述単純柔軟個人の傾向の推定 NNNY 相関パターンの制約のある推定非適用 YNN 時間依存共変量 NYNY 実行の簡単さとても簡単簡単難しい計算の複雑さ低低中高 * 欠損値のある者はデータセットから削除される

今までの方法はだめなの？（特に rANOVA との比較）欠損への対処のため，欠損値のある者を解析から除外するか，値を代入する必要がある – 前回の観測値を代入（ last observation carried forward: LOCF ） → 推定バイアスのリスク全ての対象者が同じ時点でデータを得る必要がある Gueorguieva & Krystal (2004)

今までの方法はだめなの？（特に rANOVA との比較）球面性の仮定（ sphericity or circularity ）が必要 – 全ての測定値が同じばらつきで，同一個人内で全ての 2 つの連続した測定時点間で同じ相関であること – 測定時点間の時間が長いと相関が弱くなるので，この仮定が満たされることは少ないタイプ I エラーの増大 → 効果の過大評価 – だめなら， Greenhouse-Geisser などで自由度の修保守的なので効果の検出が難しくなる Gueorguieva & Krystal (2004)

線形混合モデルを使うとどんないいことがあるの？各対象者の利用可能なデータをすべて使えるランダムな欠測には影響されない時間の効果を柔軟にモデル化できる現実的で倹約的な分散・共分散パターンが使える（複合シンメトリー，自己回帰など）いくつかの異なったパターンのモデルを比較することができる – 適切な分散のパターンを選ぶとより正確な治療効果や SE の推定につながる Gueorguieva & Krystal (2004)

線形混合モデルを使うとどんないいことがあるの？正確さが up すると必要なサンプルサイズが減るので，検定力 up につながる時間依存共変量の扱いが簡単査読でつっこまれない！（ Hamer & Simpson, 2009 ） – "Mixed models are preferable... when the number of subjects is sufficiently large and the proportion of missing data is small enough" Gueorguieva & Krystal (2004)

中はどんな風になってるの？ Blackwell et al., (2006) 基本のすうしき ※本や論文の著者により使われる記号が違うので，混乱しないように！ Y ij : 時点 i ，個人 j のアウトカム指標 A ij ：時点 i ，個人 j のレベル 1 説明変数 β 0j ：個人 j における個人特有の切片 β 1j ：個人 j における個人特有の傾き R ij ：個人 j における個人特有の誤差項

中はどんな風になってるの？ Blackwell et al., (2006) γ 00 : 全ての個人間の平均切片 γ 10 : 全ての個人間の平均傾き U 0j ：レベル 2 誤差，つまり個人 j における γ 00 から説明できない逸脱 U 1j ：レベル 2 誤差，つまり個人 j における γ 10 から説明できない逸脱ランダム傾きランダム切片追加

絵で描いてよ！通常の回帰ランダム切片のみモデル

絵で描いてよ！ランダム傾きのみモデルランダム切片＆傾きモデル

R でやってみてよ！使用データセット – パッケージ “HSAUR” のデータセット “BtheB” – コンピューターによる認知行動療法プログラムの臨床試験データ（ Proudfoot et al., 2003 ） – 患者 100 名, 変数 8 個のデータフレーム ◆ R プログラム data(“BtheB”, package=“HSAUR”) # データを開く head(BtheB) # 最初の 6 名分のデータ表示

drug length treatment bdi.pre bdi.2m bdi.4m bdi.6m bdi.8m 1 No >6m TAU 29 2 2 NA NA 2 Yes >6m BtheB 32 16 24 17 20 3 Yes <6m TAU 25 20 NA NA NA 4 No >6m BtheB 21 17 16 10 9 5 Yes >6m BtheB 26 23 NA NA NA 6 Yes <6m BtheB 7 0 0 0 0 【 R 出力】【変数】 drug: 抗うつ薬を使用したかどうか（ No or Yes ） length: 現在の抑うつエピソードの長さ（ 6m ， 6 カ月以上） treatment ：治療群（ TAU ，通常治療； BtheB ， Beat the Blues ） bdi.pre ～ bdi.8m ：（ベック抑うつ尺度の治療前から 8 カ月フォローアップまでの値）

BtheB$subject <- factor(rownames(BtheB)) # id 番号付与 nobs <- nrow(BtheB) # オブザベーション数を nobs に格納 BtheB_long <- reshape(BtheB, idvar = "subject", varying = c("bdi.2m", "bdi.4m", "bdi.6m", "bdi.8m"), direction = "long") ◆ id 番号（ subject ）の付与と解析準備 ◆データセットを ”wide” から ”long” へ BtheB_long$time <- rep(c(2, 4, 6, 8), rep(nobs, 4)) ◆時間変数を作成 subset(BtheB_long, subject %in% c("1", "2", "3")) ◆ id が 1~3 の者のデータのみ表示

drug length treatment bdi.pre subject time bdi 1.2m No >6m TAU 29 1 2 2 2.2m Yes >6m BtheB 32 2 2 16 3.2m Yes <6m TAU 25 3 2 20 1.4m No >6m TAU 29 1 4 2 2.4m Yes >6m BtheB 32 2 4 24 3.4m Yes <6m TAU 25 3 4 NA 1.6m No >6m TAU 29 1 6 NA 2.6m Yes >6m BtheB 32 2 6 17 3.6m Yes <6m TAU 25 3 6 NA 1.8m No >6m TAU 29 1 8 NA 2.8m Yes >6m BtheB 32 2 8 20 3.8m Yes <6m TAU 25 3 8 NA 同じ id の者が 4 回出現しており， time がそれぞれ 2 ヵ月から 8 カ月まで変わり， bdi がそれぞれに対応した値になっている

library(“lme4”) # 線形混合モデルを行うためのパッケージ呼び出し # ランダム切片モデル BtheB_lmer1 <- lmer(bdi ~ bdi.pre + time + treatment + drug +length + (1 | subject), data = BtheB_long, method = "ML", na.action = na.omit) # ランダム傾きモデル BtheB_lmer2 <- lmer(bdi ~ bdi.pre + time + treatment + drug + length + (time | subject), data = BtheB_long, method = "ML", na.action = na.omit) ◆線形混合モデルの適用赤字部分がランダム効果を表す。 1 は切片，｜の後に入れ子の上位ユニットを表す変数がくる。ここでは個人

anova(BtheB_lmer1, BtheB_lmer2) ◆モデルの比較 summary(BtheB_lmer1) ◆線形混合モデルの結果出力 Models: BtheB_lmer1: bdi ~ bdi.pre + time + treatment + drug + length + (1 | subject) BtheB_lmer2: bdi ~ bdi.pre + time + treatment + drug + length + (time | subject) Df AIC BIC logLik Chisq Chi Df Pr(>Chisq) BtheB_lmer1 8 1886.6 1915.7 -935.31 BtheB_lmer2 10 1889.8 1926.2 -934.90 0.8161 2 0.665 有意でないので time のランダム傾きを含める意味はなさそう

Linear mixed model fit by maximum likelihood Formula: bdi ~ bdi.pre + time + treatment + drug + length + (1 | subject) Data: BtheB_long AIC BIC logLik deviance REMLdev 1887 1916 -935.3 1871 1866 Random effects: Groups Name Variance Std.Dev. subject (Intercept) 48.304 6.9501 Residual 25.128 5.0127 Number of obs: 280, groups: subject, 97 Fixed effects: Estimate Std. Error t value (Intercept) 5.94371 2.24911 2.643 bdi.pre 0.63819 0.07759 8.225 time -0.71703 0.14605 -4.909 treatmentBtheB -2.37311 1.66365 -1.426 drugYes -2.79786 1.71990 -1.627 length>6m 0.25639 1.63210 0.157

参考： Stata xtmixed bdir bdipre time treatmentr drugr lengthr || subject:, mle

他にも必要な知識級内相関（ intraclass correlation: ICC ）推定法 – 最尤法（ maximum liklihood ： ML) – 制限付き最尤法 (restricted ML: REML) 共分散パターン – unstructured ，自己回帰，複合シンメトリーなど中心化 – 例：個々の独立変数－平均値 → 平均を 0 に

論文に，書きたいです記述例：無作為割付比較試験の効果推定 – ANOVA タイプいい例が見つからず・・・ – 回帰タイプ例： Sumathipala et al., 2008 Cognitive-behavioural therapy v. structured care for medically unexplained symptoms: randomised controlled trial. Br J Psychiatry. 193(1):51-9.

Sumathipala et al., 2008 – Statistical analysis ・・・ analysed the scores from the four fixed time points (3 months, 6 months, 9 months and 12 months after randomisation) using a mixed effects model. We included as fixed effects group allocation, baseline score, time, and the interaction of group and time. Time was coded as months after the 3-month time point, giving values of 0, 3, 6 and 9, so that in the presence of the interaction the effect of group represents the difference at 3 months. We included the patient as a random effect. We also fitted models with a random effect of time, with various covariance patterns, with treating doctor as an effect, and pattern mixture models to allow for the different drop-out patterns. We report here the simpler models as they fit as well as any of the more complex ones. We also examined model residuals. A mixed effects model was used as this enables effective use of all the information even from participants who had some missing scores. We used r for the analysis, 35 with the nlme package for fitting the mixed effects models. 36 中心化

Sumathipala et al., 2008 Table 3 Fig 2

文献 Atkins DC. Using multilevel models to analyze couple and family treatment data: basic and advanced issues. J Fam Psychol. 2005 Mar;19(1):98-110. Blackwell E, de Leon CF, Miller GE. Applying mixed regression models to the analysis of repeated-measures data in psychosomatic medicine. Psychosom Med. 2006;68(6):870-8. Boyle MH, Willms JD. Multilevel modelling of hierarchical data in developmental studies. J Child Psychol Psychiatry. 2001;42(1):141-62. Diez Roux AV. A glossary for multilevel analysis. J Epidemiol Community Health. 2002;56(8):588-94. Gueorguieva R, Krystal JH. Move over ANOVA: progress in analyzing repeated-measures data and its reflection in papers published in the Archives of General Psychiatry. Arch Gen Psychiatry. 2004;61(3):310-7

Hamer RM, Simpson PM. Last observation carried forward versus mixed models in the analysis of psychiatric clinical trials. Am J Psychiatry. 2009 Jun;166(6):639-41. Kristjansson SD, Kircher JC, Webb AK. Multilevel models for repeated measures research designs in psychophysiology: an introduction to growth curve modeling. Psychophysiology. 2007;44(5):728-36. 小野寺孝義編訳岩田昇・菱村豊・長谷川孝治・村山航訳 2006 基礎から学ぶマルチレベルモデル入り組んだ文脈から新たな理論を創出するための統計手法ナカニシヤ出版（ Kreft IGG, De Leeuw J ： Introducing multilevel modeling Sage London, 1998 ） Nezlek, J. B. (2008). An Introduction to Multilevel Modeling for Social and Personality Psychology. Social and Personality Psychology Compass, 2(2), 842-860.

Nich C, Carroll K. Now you see it, now you don't: a comparison of traditional versus random-effects regression models in the analysis of longitudinal follow-up data from a clinical trial. J Consult Clin Psychol. 1997;65(2):252-61. Shin JH. Application of repeated-measures analysis of variance and hierarchical linear model in nursing research. Nurs Res. 2009;58(3):211-7. Vickers AJ. Analysis of variance is easily misapplied in the analysis of randomized trials: a critique and discussion of alternative statistical approaches. Psychosom Med. 2005;67(4):652-5. West SG, Ryu E, Kwok OM, Cham H. Multilevel modeling: current and future applications in personality research. J Pers. 2011;79(1):2-50.

初心者による初心者のための「線形混合モデル」土屋政雄 2011/08/20 第 2 回心理・医学系研究者のためのデータ解析環境 R による統計学の研究会 ※著作権に関わりそうな画像類は削除.

Similar presentations

Presentation on theme: "初心者による初心者のための「線形混合モデル」土屋政雄 2011/08/20 第 2 回心理・医学系研究者のためのデータ解析環境 R による統計学の研究会 ※著作権に関わりそうな画像類は削除."— Presentation transcript:

Similar presentations

About project

フィードバック

ログインする

Auth with social network:

初心者による初心者の ための「線形混合モデル」 土屋政雄 2011/08/20 第 2 回 心理・医学系研究者のためのデー タ解析環境 R による統計学の研究会 ※著作権に関わりそうな画像類は削除.

Similar presentations

Similar presentations

About project

フィードバック

初心者による初心者のための「線形混合モデル」土屋政雄 2011/08/20 第 2 回心理・医学系研究者のためのデータ解析環境 R による統計学の研究会 ※著作権に関わりそうな画像類は削除.