R 语言|​重复测量方差分析:传统方法与线性混合模型|多重比较

文摘   2024-06-21 12:07   江苏  

引言

重复测量方差分析是科学研究中常用的统计方法,适用于评估相同被试对象在不同时间点的多次测量数据。

本文将比较两种主要的重复测量方差分析方法:传统的方差分析(使用 stats 包的 aov 函数)和线性混合模型(使用 lme4 包的 lmer 函数)。

需要的包

library(ggplot2)
library(lme4)
library(lmerTest)
library(emmeans)
library(multcomp)

数据

# 生成示例数据
set.seed(1)

subject <- factor(rep(1:9, each = 3))
time <- factor(rep(1:3, times = 9))
treatment <- rep(c("Control", "DrugA", "DrugB"), each = 3)
value <- unlist(lapply(rep(1:3, times = 3), function(i) rnorm(n = 3, mean = i, sd = 1)))

data <- data.frame(subject, time, treatment, value)
data
   subject time treatment      value
1 1 1 Control 0.3735462
2 1 2 Control 1.1836433
3 1 3 Control 0.1643714
4 2 1 DrugA 3.5952808
5 2 2 DrugA 2.3295078
6 2 3 DrugA 1.1795316
7 3 1 DrugB 3.4874291
8 3 2 DrugB 3.7383247
9 3 3 DrugB 3.5757814
10 4 1 Control 0.6946116
11 4 2 Control 2.5117812
12 4 3 Control 1.3898432
13 5 1 DrugA 1.3787594
14 5 2 DrugA -0.2146999
15 5 3 DrugA 3.1249309
16 6 1 DrugB 2.9550664
17 6 2 DrugB 2.9838097
18 6 3 DrugB 3.9438362
19 7 1 Control 1.8212212
20 7 2 Control 1.5939013
21 7 3 Control 1.9189774
22 8 1 DrugA 2.7821363
23 8 2 DrugA 2.0745650
24 8 3 DrugA 0.0106483
25 9 1 DrugB 3.6198257
26 9 2 DrugB 2.9438713
27 9 3 DrugB 2.8442045
table(data$time, data$treatment)
   
Control DrugA DrugB
1 3 3 3
2 3 3 3
3 3 3 3

本文使用的示例数据包括了 10 名被试,在不同时间点(1 至 3)和三种处理(treatment)下的测量值。数据结构旨在展示如何应用不同的统计方法来分析重复测量数据。

可视化

fig <- ggplot(data, aes(time, value, color = treatment)) +
stat_summary(
geom = "line",
fun = mean,
linewidth = 1,
position = position_dodge(0.4),
show.legend = FALSE
) +
stat_summary(
size = 1,
linewidth = 1,
position = position_dodge(0.4)
) +
scale_color_brewer(palette = "Set2") +
ggthemes::theme_gdocs(
base_family = "Prompt",
base_size = 22
) +
theme(
plot.background = element_rect(color = NA)
)
# fig

ggsave("dat-viz.png", fig, width = 6, height = 5, dpi = 600, bg = "white")
No summary function supplied, defaulting to `mean_se()`
`geom_line()`: Each group consists of only one observation.
ℹ Do you need to adjust the group aesthetic?

图中展示了不同时间点和处理下测量值的趋势。线段连接每个处理组在不同时间点的平均测量值,不同颜色表示不同的处理(treatment)。

aov 方差分析

# 使用 aov 做重复测量方差分析
rm_aov <- aov(value ~ treatment * time + Error(subject / time), data = data)
summary(rm_aov)

Error: subject
Df Sum Sq Mean Sq F value Pr(>F)
treatment 2 20.466 10.233 14.49 0.00505 **
Residuals 6 4.237 0.706
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Error: subject:time
Df Sum Sq Mean Sq F value Pr(>F)
time 2 0.369 0.1845 0.182 0.835
treatment:time 4 3.488 0.8720 0.863 0.514
Residuals 12 12.129 1.0108

传统的重复测量方差分析(aov 函数)通过误差项(Error)来区分被试内(subject)和时间(subject:time)的变异。

分析结果显示处理(treatment)对测量值有显著影响(F = 14.49,P = 0.00505),而时间点及其交互效应不显著,而时间(F = 0.182,P = 0.835)和处理的交互效应(F = 0.863,P = 0.514)不显著。

aov 多重比较

# 多重比较
emm_aov <- emmeans(rm_aov, specs = ~treatment)
Note: re-fitting model with sum-to-zero contrasts
NOTE: Results may be misleading due to involvement in interactions
# 输出多重比较结果
contrast(emm_aov, method = "pairwise", adjust = "tukey")
 contrast        estimate    SE df t.ratio p.value
Control - DrugA -0.512 0.396 6 -1.293 0.4489
Control - DrugB -2.049 0.396 6 -5.172 0.0050
DrugA - DrugB -1.537 0.396 6 -3.879 0.0192

Results are averaged over the levels of: time
P value adjustment: tukey method for comparing a family of 3 estimates

lmer 分析

# 使用 lmer 做重复测量方差分析
lmm <- lmer(value ~ treatment * time + (1 | subject), data = data)
boundary (singular) fit: see help('isSingular')
anova(lmm, type = 3)
Type III Analysis of Variance Table with Satterthwaite's method
Sum Sq Mean Sq NumDF DenDF F value Pr(>F)
treatment 20.4664 10.2332 2 18 11.2545 0.0006753 ***
time 0.3689 0.1845 2 18 0.2029 0.8182309
treatment:time 3.4881 0.8720 4 18 0.9590 0.4536017
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

线性混合模型(lmer 函数)考虑了每个被试(subject)的随机效应(random effect)。

结果显示处理(treatment)对测量值有显著影响(F = 11.2545,P = 0.0006753),而时间(F = 0.2029,P = 0.8182309)和处理的交互效应(F = 0.9590,P = 0.4536017)不显著。

lmer 多重比较

# 多重比较 1
summary(
glht(
lmm,
emm(pairwise ~ treatment)
)
)
NOTE: Results may be misleading due to involvement in interactions
Note: df set to 6

Simultaneous Tests for General Linear Hypotheses

Fit: lmer(formula = value ~ treatment * time + (1 | subject), data = data)

Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
Control - DrugA == 0 -0.5121 0.4495 -1.139 0.52728
Control - DrugB == 0 -2.0489 0.4495 -4.558 0.00896 **
DrugA - DrugB == 0 -1.5368 0.4495 -3.419 0.03263 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)
# 多重比较 2
emm_lmm <- emmeans(lmm, specs = ~treatment)
NOTE: Results may be misleading due to involvement in interactions
contrast(emm_lmm, method = "pairwise", adjust = "tukey")
 contrast        estimate   SE df t.ratio p.value
Control - DrugA -0.512 0.45 6 -1.139 0.5273
Control - DrugB -2.049 0.45 6 -4.558 0.0092
DrugA - DrugB -1.537 0.45 6 -3.419 0.0327

Results are averaged over the levels of: time
Degrees-of-freedom method: kenward-roger
P value adjustment: tukey method for comparing a family of 3 estimates


ecologyR
🚀打赏可获取本公众号代码合集(见置顶文章)📌统计案例📌统计制图📌显著性标记📌结构方程模型可视化工具📌SEM教程与案例📌论文代码复现📌地图可视化
 最新文章