R 语言 SEM|结构方程模型直接间接效应|作图|可视化

文摘   科学   2024-11-20 10:02   江苏  

往期

👉 鼠标修改布局!全球独家 R 语言 SEM 可视化工具
👉 完全用 R 语言实现 Manuel Delgado-Baquerizo 风格的 SEM 图
👉️ 尝试用 R 语言画一个 PNAS 文章的 SEM 图
👉 R 语言 SEM 之多组分析(Multigroup)
👉 R 语言 SEM 之复合变量(Composite)的构造案例
👉 R 语言 SEM 之使用 ggplot2 衍生包画 Nature 结构方程模型图
👉 R 语言 SEM 之贝叶斯(Bayesian)结构方程模型
👉 R 语言 SEM 合集


(点击 👉 即可跳转。)

目录

引言
1 需要的包
2 需要的数据
3 使用 piecewiseSEM 拟合
4 SEM 可视化:计算直接和间接效应
4.0 用鼠标绘制 SEM 图示 APP(更新)
4.1 第一步:将 SEM 结果变成网络
4.2 绘图:结构方程模型路径图
4.3 绘图:直接和间接效应
6 SEM:重抽样的直接和间接效应
6.1 重抽样:直接和间接效应
6.2 重抽样:直接和间接效应的置信区间
7 附:getEffects 函数

引言

探索使用代码(而不是动手)计算 SEM 的直接和间接效应。

1 需要的包

library(tidyverse)
library(piecewiseSEM)
library(tidygraph)
library(ggraph)
library(igraph)

2 需要的数据

数据标准化,

# 来自 piecewiseSEM 包
data(keeley)

# 数据标准化
keeleyScaled <- keeley |>
dplyr::mutate_if(
is.numeric,
~ as.numeric(scale(.x))
)
str(keeleyScaled)
'data.frame': 90 obs. of 8 variables:
$ distance: num 0.473 -1.381 0.505 0.505 0.309 ...
$ elev : num 3.1 -1.41 -0.87 -0.87 2.11 ...
$ abiotic : num 1.489 -1.08 0.228 1.552 -0.335 ...
$ age : num 1.1486 -0.0451 -0.8409 -0.8409 -0.2043 ...
$ hetero : num 0.6423 -1.672 1.4037 0.0656 -1.1992 ...
$ firesev : num -0.645 -0.312 -1.189 -1.008 -0.16 ...
$ cover : num 1.096 -0.673 0.812 1.588 1.913 ...
$ rich : num 0.117 -1.207 1.441 0.978 1.242 ...

3 使用 piecewiseSEM 拟合

# 拟合 SEM
pSem <- psem(
lm(age ~ distance, data = keeleyScaled),
lm(hetero ~ distance, data = keeleyScaled),
lm(abiotic ~ distance, data = keeleyScaled),
lm(firesev ~ age, data = keeleyScaled),
lm(cover ~ firesev, data = keeleyScaled),
lm(rich ~ distance + hetero + abiotic + cover, data = keeleyScaled)
)

# 路径系数
# coefs(pSem) |> print()

4 SEM 可视化:计算直接和间接效应

整理 SEM 结果,

# 结果整理为 from、to、weight、p 数据框
semCoefs <- coefs(pSem) |>
dplyr::relocate(
from = Predictor,
to = Response,
weight = Estimate,
p = P.Value
)
semCoefs |> print(row.names = FALSE)
from to weight p Std.Error DF Crit.Value Std.Estimate
distance age -0.2781 0.0079 0.1024 88 -2.7164 -0.2781 **
distance hetero 0.3460 0.0008 0.1000 88 3.4593 0.3460 ***
distance abiotic 0.4597 0.0000 0.0947 88 4.8562 0.4597 ***
age firesev 0.4539 0.0000 0.0950 88 4.7781 0.4539 ***
firesev cover -0.4371 0.0000 0.0959 88 -4.5594 -0.4371 ***
distance rich 0.2829 0.0021 0.0892 85 3.1710 0.2829 **
hetero rich 0.3383 0.0001 0.0825 85 4.1035 0.3383 ***
abiotic rich 0.2495 0.0037 0.0837 85 2.9815 0.2495 **
cover rich 0.2866 0.0005 0.0790 85 3.6297 0.2866 ***

对于 lavaan、plspm 或者 brms 的结构方程模型的整理,见以下超链接,

👉 拟合 SEM:lavaan、piecewiseSEM 和 plspm 三个包

(点击 👉 即可跳转。)

4.0 用鼠标绘制 SEM 图示 APP(更新)

将结果整理为 from、to、weight、p 数据框,并存为 .csv 文件,还可以使用独家 APP,用鼠标点击,修改 SEM 图示的布局!

  • APP 有一个小更新,见本期另一篇推送。

# 保存数据
write.csv(semCoefs, "sem_table.csv", row.names = FALSE)

打赏并私信,可以获取此 APP!

  • 详情见本期另一篇推送,

  • 另见以下超链接,

👉 全球独家 SEM 工具:用鼠标点击,可视化结构方程模型

(点击 👉 即可跳转。)

4.1 第一步:将 SEM 结果变成网络

此处,对于 piecewiseSEM 的结果,可以这样转换为一个网络,

# igraph 和 tidygraph 包,将数据转换为一个图(graph)
semGraph <- semCoefs |>
select(from, to, weight, p) |>
igraph::graph_from_data_frame(directed = TRUE) |>
tidygraph::as_tbl_graph()
semGraph
# A tbl_graph: 7 nodes and 9 edges
#
# A directed acyclic simple graph with 1 component
#
# Node Data: 7 × 1 (active)
name
<chr>
1 distance
2 age
3 firesev
4 hetero
5 abiotic
6 cover
7 rich
#
# Edge Data: 9 × 4
from to weight p
<int> <int> <dbl> <dbl>
1 1 2 -0.278 0.0079
2 1 4 0.346 0.0008
3 1 5 0.460 0
# ℹ 6 more rows

4.2 绘图:结构方程模型路径图

# 字体
customizedFont <- "DejaVu Sans"

# 简单的 SEM 可视化
semGraph |>
ggraph(layout = "tree") +# sugiyama, circle, tree
geom_edge_arc(
strength = -0.2,
aes(
edge_color = ifelse(weight > 0, "+", "-"),
edge_width = abs(weight),
label = round(weight, 3)
),
start_cap = circle(0.5),
end_cap = circle(0.5),
label_size = 3,
angle_calc = "along",
vjust = 1.5,
arrow = arrow(angle = 60, length = unit(5, "pt")),
family = customizedFont
) +
geom_node_label(
aes(label = name),
size = 4,
family = customizedFont
) +
scale_edge_color_brewer(palette = "Set2", direction = -1) +
scale_edge_width_continuous(range = c(0.2, 2)) +
coord_equal(clip = "off") +
theme_void(
base_family = customizedFont,
base_size = 18
) +
theme(
legend.position = "none",
strip.text = element_text(size = 12)
)
Warning: Using the `size` aesthetic in this geom was deprecated in ggplot2 3.4.0.
ℹ Please use `linewidth` in the `default_aes` field and elsewhere instead.
This warning is displayed once every 8 hours.
Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
generated.

4.3 绘图:直接和间接效应

使用代码,计算直接和间接效应,

  • 好处是,适用性强,只要将结果整理为 from、to、weight、p 数据框(如下)就可以使用,

  • 缺点是,无法检验间接路径显著性(但是 semEff 包可以,见后文),

  • getEffects 函数见文末,另存为 getEffects.R,然后 source("getEffects.R") 调用之。

# 使用函数
source("getEffects.R")
effectData <- getEffects(semGraph)

# 绘制效应
effectData |>
ggplot() +
aes(
x = from,
y = value,
fill = name
) +
geom_col(
color = "grey0",
lwd = 1.5
) +
geom_label(
aes(label = round(value, 3)),
size = 3,
position = position_stack(vjust = 0.5),
family = customizedFont
) +
scale_fill_manual(
name = NULL,
values = c("Direct" = "green4", "Indirect" = "grey100"),
na.translate = FALSE
) +
facet_grid(. ~ paste("Effects to:", to)) +
coord_cartesian(ylim = c(-0.2, 0.6)) +
theme_minimal(
base_family = customizedFont,
base_size = 16,
base_line_size = 2
) +
theme(
panel.grid.major.x = element_blank(),
panel.grid.minor = element_blank(),
strip.text = element_text(size = 14)
)
Warning: Removed 5 rows containing missing values or values outside the scale range
(`geom_col()`).
Warning: Removed 5 rows containing missing values or values outside the scale range
(`geom_label()`).

6 SEM:重抽样的直接和间接效应

用了另一个现成的包,如下,

  • 因为是重抽样获得,所以与上述根据结果计算略有不同,

  • 而且,可以检验间接效应是否显著!

# https://murphymv.github.io/semEff/articles/semEff.html
library(semEff)

# 重抽样 1000 次(并计时)
system.time(
pSemBoot <- semEff::bootEff(
mod = pSem,
R = 100,# 可设置为 1000 或以上
seed = 1,
parallel = "multicore"
)
)
user system elapsed
13.331 7.083 2.675

获取效应,

# 获取效应
semEff <- semEff::semEff(pSemBoot)
# summary(semEff) # 此代码,获取所有效应,去掉注释查看效果

# 各类效应
semEff::getTotEff(semEff, "rich") |> round(3)# 总效应
(Intercept) distance age hetero abiotic firesev
0.000 0.453 -0.053 0.301 0.219 -0.117
cover
0.267
semEff::getDirEff(semEff, "rich") |> round(3)# 总效应
(Intercept) distance hetero abiotic cover
0.000 0.233 0.301 0.219 0.267
semEff::getIndEff(semEff, "rich") |> round(3)# 总效应
(Intercept) distance age firesev
0.000 0.220 -0.053 -0.117

6.1 重抽样:直接和间接效应

  • 因为是重抽样获得,所以与上述根据结果计算略有不同。

# 结果存储在 effRes 变量中
effRes <- semEff::getEff(semEff, "rich")

# 提取直接、间接、总效应
dEf <- effRes$rich$Direct
indEf <- effRes$rich$Indirect
totalEf <- effRes$rich$Total

# 转换为数据框
funGetEff <-function(Tp, Ef) {
data.frame(effType = Tp, Effect = Ef, row.names = names(Ef)) |>
tibble::rownames_to_column(var = "Predictor")
}
dEfDf <- funGetEff("Direct", dEf)
indEfDf <- funGetEff("Indirect", indEf)
totalEfDf <- funGetEff("Total", totalEf)

# 合并数据框
combinedDf <- rbind(dEfDf, indEfDf, totalEfDf) |>
dplyr::filter(Predictor != "(Intercept)")

# 绘图
combinedDf |>
dplyr::filter(effType != "Total") |>
ggplot() +
aes(
x = Predictor,
y = Effect,
fill = effType
) +
geom_col(
color = "grey0",
lwd = 1.5
) +
geom_label(
aes(label = round(Effect, 3)),
size = 3,
position = position_stack(vjust = 0.5),
family = customizedFont
) +
scale_fill_manual(
name = NULL,
values = c("Direct" = "green4", "Indirect" = "grey100"),
na.translate = FALSE
) +
facet_grid(. ~ paste("Effects to: rich")) +
coord_cartesian(ylim = c(-0.2, 0.6)) +
theme_minimal(
base_family = customizedFont,
base_size = 16,
base_line_size = 2
) +
theme(
panel.grid.major.x = element_blank(),
panel.grid.minor = element_blank(),
strip.text = element_text(size = 14)
)

6.2 重抽样:直接和间接效应的置信区间

bootEff <- semEff$`Bootstrapped Effects`[["rich"]]
lapply(bootEff, head, 3)
$Direct
(Intercept) distance hetero abiotic cover
[1,] -6.001212e-17 0.2489151 0.3069260 0.04232212 0.32140204
[2,] 1.596619e-17 0.3025537 0.4142906 0.12890911 0.10502234
[3,] -1.513694e-17 0.2867629 0.1815622 0.29990303 0.09868543

$Indirect
(Intercept) distance age firesev
[1,] -6.001212e-17 0.1586614 -0.044291734 -0.17155155
[2,] 1.596619e-17 0.1707798 -0.014439474 -0.03006690
[3,] -1.513694e-17 0.1651902 -0.006246364 -0.01870544

$Total
(Intercept) distance age hetero abiotic firesev
[1,] -6.001212e-17 0.4075764 -0.044291734 0.3069260 0.04232212 -0.17155155
[2,] 1.596619e-17 0.4733334 -0.014439474 0.4142906 0.12890911 -0.03006690
[3,] -1.513694e-17 0.4519531 -0.006246364 0.1815622 0.29990303 -0.01870544
cover
[1,] 0.32140204
[2,] 0.10502234
[3,] 0.09868543

$Mediators
age hetero abiotic firesev cover
[1,] 0.006250031 0.13073737 0.02167396 -0.038041703 -0.20959325
[2,] 0.002024480 0.10452326 0.06423202 -0.012414993 -0.04248189
[3,] 0.001294425 0.03769445 0.12620135 -0.004951939 -0.02365738
# 将 list 中的每个矩阵转换为数据框,保留其名字 names;并合并所有数据框 bind_rows
bootEffDf <- map(bootEff, as.data.frame) |>
map2(names(bootEff), ~ mutate(.x, effType = .y)) |>
bind_rows() |>
dplyr::select(-`(Intercept)`)

# 计算每个路径的均值和 95% 置信区间
summaryStats <- bootEffDf |>
# 将数据转换为长格式,以便进行计算
pivot_longer(
cols = -effType,
names_to = "Predictor",
values_to = "tempEstimate"
) |>
# 按分组 .by 计算均值和 95% 置信区间
summarise(
Estimate = mean(tempEstimate, na.rm = TRUE),
ci95Lower = quantile(tempEstimate, 0.025, na.rm = TRUE),
ci95Upper = quantile(tempEstimate, 0.975, na.rm = TRUE),
.by = c(effType, Predictor)
)

# 绘图
summaryStats |>
filter(!effType %in% c("Total", "Mediators")) |>
ggplot() +
aes(
x = Predictor,
y = Estimate,
ymin = ci95Lower,
ymax = ci95Upper,
color = effType
) +
geom_col(
width = 0.9,
lwd = 1.5,
fill = "white",
position = position_stack(vjust = 0.5)
# position = position_dodge(1)
) +
# geom_errorbar(
# width = 0,
# lwd = 0.5,
# position = position_dodge(1),
# show.legend = FALSE
# ) +
geom_text(
label = "✱",
size = 2,
position = position_stack(vjust = 0.5),
# position = position_dodge(1),
family = customizedFont
) +
scale_color_manual(
name = NULL,
values = c("Direct" = "green4", "Indirect" = "grey0"),
na.translate = FALSE
) +
facet_grid(. ~ paste("Effects to: rich")) +
coord_cartesian(ylim = c(-0.2, 0.6)) +
theme_minimal(
base_family = customizedFont,
base_size = 16,
base_line_size = 2
) +
theme(
panel.grid.major.x = element_blank(),
panel.grid.minor = element_blank(),
strip.text = element_text(size = 14)
)
Warning: Stacking not well defined when not anchored on the axis
Warning: Removed 5 rows containing missing values or values outside the scale range
(`geom_col()`).
Warning: Removed 5 rows containing missing values or values outside the scale range
(`geom_text()`).

7 附:getEffects 函数

getEffects <-function(semDag) {
# 获取拓扑排序的节点名称(假设 topo 最后的节点,是关注的自变量;最终只有一个关注的自变量)
dagTopo <- igraph::topo_sort(semDag)$name
lenTopo <- length(dagTopo)

# 初始化一个向量来存储所有路径的直接和间接效应
dEfVector <- numeric(length(dagTopo))# 直接效应
indEfVector <- numeric(length(dagTopo))# 间接效应

# 循环遍历拓扑排序的节点
for (jin seq_along(dagTopo)[-lenTopo]) {
# 所有简单路径
allSimplePaths <- igraph::all_simple_paths(
semDag,
from = dagTopo[j],
to = dagTopo[lenTopo]
)

# 初始化间接效应的变量
directEf <- indirectEfSum <- 0

# 遍历所有路径
# 如果路径长度大于 2,则为间接路径,需要相乘再求和
# 如果路径长度为 2,则为直接路径,直接取宽度
for (iin allSimplePaths) {
eWd <- E(semDag, path = i)$weight
# 路径长度为 2 直接取宽度
if (length(i) == 2) {
directEf = eWd
}else {
# 路径长度大于 2 需要相乘再求和
indirectEfSum = indirectEfSum + prod(eWd)
}
}

# 存储直接效应和间接效应
dEfVector[j] = directEf
indEfVector[j] = indirectEfSum
}

# 创建一个数据框来存储结果
efDf <- data.frame(
from = dagTopo[-lenTopo],
to = dagTopo[lenTopo],
Direct = dEfVector[-lenTopo],
Indirect = indEfVector[-lenTopo]
) |>
tidyr::pivot_longer(cols = c(Direct, Indirect)) |>
dplyr::mutate(value = ifelse(value == 0, NA, value))

# 返回
return(efDf)
}



ecologyR
🚀打赏可获取本公众号代码合集(见置顶文章)📌统计案例📌统计制图📌显著性标记📌结构方程模型可视化工具📌SEM教程与案例📌论文代码复现📌地图可视化
 最新文章