写在前面
又是行善积德的一天,把手里的5篇稿件都审回去了,只有一篇由于作者的回复态度极其恶劣,让作者又修改了一轮。所以小伙伴们回复审稿人的时候一定要认真对待,面对自己不会修改的问题或不想修改的问题,一定要做合理回复。好了,回到正题。本来想连续的更新层次分割的,但是昨天做结构方程模型的时候,想起来之前讲过如果只描述SEM没有凸显出自己想要的信息的时候,我们可以计算SEM的直接、间接和总效应达到我们的目的。关于如何计算SEM的直接、间接和总效应我在之前的推文里做了详细的解释,后来也尝试过自动化的计算,但是之前写的代码有些小bug,没有普适性,当因子无限多的时候会无法计算,所以我更新了代码。调试了好几遍,加多少因子和路径都能计算。
准备数据集
数据如下,小编已上传,文末下载。
开始分析
我们先读取数据并运行SEM
library(piecewiseSEM)
library(ggplot2)
library(reshape2)
library(dplyr)
# 读取数据
litter <- read.csv(file.choose(), header = TRUE)
# 定义模型
model <- psem(
lm(TRAD ~ BD+AK+AN+SM+AP, litter ),
lm(AK ~ BD+SM, litter ),
lm(AN ~ BD+SM, litter ),
lm(AP ~ BD+SM, litter ),
#双向在后边的计算中会被忽略 ~~%BD)
)
#模型结果
summary(model, .progressBar = F)
模型结果如下,我们可以看见AIC BIC Fisher's C 以及P值,还有就是我们最关键的路径系数
#模型结果
summary(model, .progressBar = F)
Structural Equation Model of model
Call:
TRAD ~ BD + AK + AN + SM + AP
AK ~ BD + SM
AN ~ BD + SM
AP ~ BD + SM
SM ~~ BD
AIC BIC
87.215
---
Tests of directed separation:
Test.Type DF Crit.Value P.Value
AN ~ AK + ... coef 50 -1.0486 0.2994
AP ~ AK + ... coef 50 0.6232 0.5360
AP ~ AN + ... coef 50 2.3911 0.0206 *
Global goodness-of-fit:
C = 11.424 with P-value = 0.076 and on 6 degrees of freedom
---
Coefficients:
Response Predictor Estimate Std.Error DF Crit.Value P.Value Std.Estimate
TRAD BD 0.0963 0.0213 48 4.5111 0.0000 0.3650 ***
TRAD AK 0.0005 1e-04 48 7.2792 0.0000 0.5074 ***
TRAD AN -0.0009 2e-04 48 -5.5825 0.0000 -0.4558 ***
TRAD SM -0.0019 5e-04 48 -3.5507 0.0009 -0.2427 ***
TRAD AP -0.0117 0.0039 48 -3.0216 0.0040 -0.2306 **
AK BD -84.1092 31.0104 51 -2.7123 0.0091 -0.3297 **
AK SM -3.0687 0.9367 51 -3.2762 0.0019 -0.3983 **
AN BD 76.5119 14.4832 51 5.2828 0.0000 0.5764 ***
AN SM 1.1743 0.4375 51 2.6842 0.0098 0.2929 **
AP BD 2.8427 0.6011 51 4.7292 0.0000 0.5482 ***
AP SM 0.0272 0.0182 51 1.4959 0.1408 0.1734
~~BD -0.0669 - 52 -0.4834 0.6308 -0.0669
codes: 0 '***' 0.001 '**' 0.01 '*' 0.05
---
Individual R-squared:
Response method R.squared
TRAD none 0.83
AK none 0.25
AN none 0.40
AP none 0.32
开始计算三个效应
开始计算直接、间接、总效应,代码很长,但是大家直接调用即可,这步的目的就是把结果计算出来并列成表
#开始计算直接、间接、总效应
coefs_df <- coefs(model)
print("=== 原始 coefs_df ===")
print(coefs_df)
# 若过滤后无数据,则无法计算路径效应
if (nrow(coefs_df) == 0) {
stop("过滤后没有有效的单向回归路径。请检查模型或数据。")
}
coefs_df$effect <- coefs_df$Std.Estimate
build_graph <- function(coefs_df) {
graph <- list()
for (i in 1:nrow(coefs_df)) {
from <- coefs_df$Predictor[i]
to <- coefs_df$Response[i]
effect <- coefs_df$effect[i]
if (!is.null(graph[[from]])) {
graph[[from]] <- rbind(graph[[from]], data.frame(to = to, effect = effect, stringsAsFactors = FALSE))
} else {
graph[[from]] <- data.frame(to = to, effect = effect, stringsAsFactors = FALSE)
}
}
return(graph)
}
graph <- build_graph(coefs_df)
print("=== 构建的 graph ===")
print(graph)
find_all_paths <- function(graph, start, end, visited = character()) {
if (start == end) {
return(list(c(end)))
}
if (!start %in% names(graph)) {
return(list())
}
visited <- c(visited, start)
paths <- list()
for (i in 1:nrow(graph[[start]])) {
next_node <- graph[[start]]$to[i]
if (!(next_node %in% visited)) {
sub_paths <- find_all_paths(graph, next_node, end, visited)
for (sp in sub_paths) {
paths <- c(paths, list(c(start, sp)))
}
}
}
return(paths)
}
path_effect <- function(path, graph) {
eff <- 1
for (i in 1:(length(path)-1)) {
from <- path[i]
to <- path[i+1]
edge <- graph[[from]]
eff <- eff * edge$effect[edge$to == to]
}
return(eff)
}
compute_all_effects <- function(graph, variables) {
results <- data.frame(
predictor = character(),
outcome = character(),
direct_effect = numeric(),
indirect_effect = numeric(),
total_effect = numeric(),
stringsAsFactors = FALSE
)
for (pred in variables) {
for (outc in variables) {
if (pred != outc) {
paths <- find_all_paths(graph, pred, outc)
# Debug print
# print(paste("Paths from", pred, "to", outc, ":"))
# print(paths)
if (length(paths) > 0) {
direct_effect <- 0
indirect_effect <- 0
for (p in paths) {
eff <- path_effect(p, graph)
if (length(p) == 2) {
direct_effect <- direct_effect + eff
} else {
indirect_effect <- indirect_effect + eff
}
}
total_effect <- direct_effect + indirect_effect
results <- rbind(results, data.frame(
predictor = pred,
outcome = outc,
direct_effect = direct_effect,
indirect_effect = indirect_effect,
total_effect = total_effect,
stringsAsFactors = FALSE
))
}
}
}
}
return(results)
}
all_vars <- unique(c(coefs_df$Response, coefs_df$Predictor))
effects_df <- compute_all_effects(graph, all_vars)
print("=== 计算的效应结果 effects_df ===")
#整体的计算结果
print(effects_df)
最后整理的结果如下,这个时候我们得到了所有的因子的直接、间接和总效应
> #整体的计算结果
> print(effects_df)
predictor outcome direct_effect indirect_effect total_effect
1 AK TRAD 0.5074 0.0000000 0.5074000
2 AN TRAD -0.4558 0.0000000 -0.4558000
3 AP TRAD -0.2306 0.0000000 -0.2306000
4 BD TRAD 0.3650 -0.5564278 -0.1914278
5 BD AK -0.3297 0.0000000 -0.3297000
6 BD AN 0.5764 0.0000000 0.5764000
7 BD AP 0.5482 0.0000000 0.5482000
8 SM TRAD -0.2427 -0.3755873 -0.6182873
9 SM AK -0.3983 0.0000000 -0.3983000
10 SM AN 0.2929 0.0000000 0.2929000
11 SM AP 0.1734 0.0000000 0.1734000
12 ~~BD ~~SM -0.0669 0.0000000 -0.0669000
虽然得到了所有的因子的直接、间接和总效应,但是别忘了我们的目的,我们是为了计算这些因子多目标因子TRAD的直接、间接和总效应,所以我在这里接了一步确定目标因子的代码。第16行是规定最后的绘图显示的因子有哪些(因为全画会有双向因子的出现,而我们双向因子的路径系数是不纳入效应值计算的),26行则是明确我们的目标因子是TRAD
if (nrow(effects_df) == 0) {
stop("没有计算出任何效应值,请检查路径是否存在。")
}
# 自定义三个效应的展示顺序
metrics_to_show <- c("direct_effect", "indirect_effect", "total_effect")
plot_df <- melt(effects_df,
id.vars = c("predictor", "outcome"),
measure.vars = metrics_to_show,
variable.name = "effect_type",
value.name = "effect_size")
# 自定义X轴显示顺序及需要显示的指标
plot_df$effect_type <- factor(plot_df$effect_type, levels = metrics_to_show)
desired_order <- c("BD", "SM", "AN", "AP", "AK")
# 只保留指定顺序中的 predictor
plot_df <- plot_df %>% filter(predictor %in% desired_order)
# 将 predictor 转换为因子并按顺序排序
plot_df$predictor <- factor(plot_df$predictor, levels = desired_order)
# 假设 'outcome' 是列名,筛选出 outcome 为 'TRAD' 的行
plot_df_filtered <- plot_df %>% filter(outcome == "TRAD")
plot_df_filtered
最后的结果如下,就得到了对目标因子TRAD的直接、间接、总效应。这样我们就大功告成了
plot_df_filtered
predictor outcome effect_type effect_size
1 AK TRAD direct_effect 0.5074000
2 AN TRAD direct_effect -0.4558000
3 AP TRAD direct_effect -0.2306000
4 BD TRAD direct_effect 0.3650000
5 SM TRAD direct_effect -0.2427000
6 AK TRAD indirect_effect 0.0000000
7 AN TRAD indirect_effect 0.0000000
8 AP TRAD indirect_effect 0.0000000
9 BD TRAD indirect_effect -0.5564278
10 SM TRAD indirect_effect -0.3755873
11 AK TRAD total_effect 0.5074000
12 AN TRAD total_effect -0.4558000
13 AP TRAD total_effect -0.2306000
14 BD TRAD total_effect -0.1914278
15 SM TRAD total_effect -0.6182873
开始绘图
计算结果都有了,我们就开始绘图
# 绘图
ggplot(plot_df_filtered, aes(x = predictor, y = effect_size, fill = effect_type)) +
geom_bar(stat = "identity", position = "dodge") +
labs(x = "Predictor", y = "Effect Size (Standardized)",
title = "Effects on TRAD: Direct, Indirect, and Total") +
scale_fill_manual(values = c("direct_effect" = "#008CB7",
"indirect_effect" = "#A5C98C",
"total_effect" = "#FFBD00")) +
theme_bw() +
theme(axis.text.x = element_text(angle = 0, hjust = 0.5),
legend.title = element_blank()) +
geom_hline(yintercept = 0, linetype = "solid", color = "black", size = 0.5)+ # 添加水平线
ylim(-1, 1) # 设置 Y 轴范围
这样,我们就计算完成了,代码的第一版很冗长,修改了好久写了几个循环才简化下来。好了,快到点了,吃席去了。
完整版代码如下
library(piecewiseSEM)
library(ggplot2)
library(reshape2)
library(dplyr)
# 读取数据
litter <- read.csv(file.choose(), header = TRUE)
# 定义模型
model <- psem(
lm(TRAD ~ BD+AK+AN+SM+AP, litter ),
lm(AK ~ BD+SM, litter ),
lm(AN ~ BD+SM, litter ),
lm(AP ~ BD+SM, litter ),
(SM%~~%BD)#双向在后边的计算中会被忽略
)
#模型结果
summary(model, .progressBar = F)
#开始计算直接、间接、总效应
coefs_df <- coefs(model)
print("=== 原始 coefs_df ===")
print(coefs_df)
# 若过滤后无数据,则无法计算路径效应
if (nrow(coefs_df) == 0) {
stop("过滤后没有有效的单向回归路径。请检查模型或数据。")
}
coefs_df$effect <- coefs_df$Std.Estimate
build_graph <- function(coefs_df) {
graph <- list()
for (i in 1:nrow(coefs_df)) {
from <- coefs_df$Predictor[i]
to <- coefs_df$Response[i]
effect <- coefs_df$effect[i]
if (!is.null(graph[[from]])) {
graph[[from]] <- rbind(graph[[from]], data.frame(to = to, effect = effect, stringsAsFactors = FALSE))
} else {
graph[[from]] <- data.frame(to = to, effect = effect, stringsAsFactors = FALSE)
}
}
return(graph)
}
graph <- build_graph(coefs_df)
print("=== 构建的 graph ===")
print(graph)
find_all_paths <- function(graph, start, end, visited = character()) {
if (start == end) {
return(list(c(end)))
}
if (!start %in% names(graph)) {
return(list())
}
visited <- c(visited, start)
paths <- list()
for (i in 1:nrow(graph[[start]])) {
next_node <- graph[[start]]$to[i]
if (!(next_node %in% visited)) {
sub_paths <- find_all_paths(graph, next_node, end, visited)
for (sp in sub_paths) {
paths <- c(paths, list(c(start, sp)))
}
}
}
return(paths)
}
path_effect <- function(path, graph) {
eff <- 1
for (i in 1:(length(path)-1)) {
from <- path[i]
to <- path[i+1]
edge <- graph[[from]]
eff <- eff * edge$effect[edge$to == to]
}
return(eff)
}
compute_all_effects <- function(graph, variables) {
results <- data.frame(
predictor = character(),
outcome = character(),
direct_effect = numeric(),
indirect_effect = numeric(),
total_effect = numeric(),
stringsAsFactors = FALSE
)
for (pred in variables) {
for (outc in variables) {
if (pred != outc) {
paths <- find_all_paths(graph, pred, outc)
# Debug print
# print(paste("Paths from", pred, "to", outc, ":"))
# print(paths)
if (length(paths) > 0) {
direct_effect <- 0
indirect_effect <- 0
for (p in paths) {
eff <- path_effect(p, graph)
if (length(p) == 2) {
direct_effect <- direct_effect + eff
} else {
indirect_effect <- indirect_effect + eff
}
}
total_effect <- direct_effect + indirect_effect
results <- rbind(results, data.frame(
predictor = pred,
outcome = outc,
direct_effect = direct_effect,
indirect_effect = indirect_effect,
total_effect = total_effect,
stringsAsFactors = FALSE
))
}
}
}
}
return(results)
}
all_vars <- unique(c(coefs_df$Response, coefs_df$Predictor))
effects_df <- compute_all_effects(graph, all_vars)
print("=== 计算的效应结果 effects_df ===")
#整体的计算结果
print(effects_df)
if (nrow(effects_df) == 0) {
stop("没有计算出任何效应值,请检查路径是否存在。")
}
# 自定义三个效应的展示顺序
metrics_to_show <- c("direct_effect", "indirect_effect", "total_effect")
plot_df <- melt(effects_df,
id.vars = c("predictor", "outcome"),
measure.vars = metrics_to_show,
variable.name = "effect_type",
value.name = "effect_size")
# 自定义X轴显示顺序及需要显示的指标
plot_df$effect_type <- factor(plot_df$effect_type, levels = metrics_to_show)
desired_order <- c("BD", "SM", "AN", "AP", "AK")
# 只保留指定顺序中的 predictor
plot_df <- plot_df %>% filter(predictor %in% desired_order)
# 将 predictor 转换为因子并按顺序排序
plot_df$predictor <- factor(plot_df$predictor, levels = desired_order)
# 假设 'outcome' 是列名,筛选出 outcome 为 'TRAD' 的行
plot_df_filtered <- plot_df %>% filter(outcome == "TRAD")
plot_df_filtered
# 绘图
ggplot(plot_df_filtered, aes(x = predictor, y = effect_size, fill = effect_type)) +
geom_bar(stat = "identity", position = "dodge") +
labs(x = "Predictor", y = "Effect Size (Standardized)",
title = "Effects on TRAD: Direct, Indirect, and Total") +
scale_fill_manual(values = c("direct_effect" = "#008CB7",
"indirect_effect" = "#A5C98C",
"total_effect" = "#FFBD00")) +
theme_bw() +
theme(axis.text.x = element_text(angle = 0, hjust = 0.5),
legend.title = element_blank()) +
geom_hline(yintercept = 0, linetype = "solid", color = "black", size = 0.5)+ # 添加水平线
ylim(-1, 1) # 设置 Y 轴范围
脚本与示例数据下载
下载的时候顺便点个赞呗~
关注公众号“生态R学社”回复“效应计算和绘图”即可下载完整代码与示例数据