SEM直接、间接、总效应自动计算及绘图

学术   2024-12-07 15:48   云南  

写在前面

又是行善积德的一天,把手里的5篇稿件都审回去了,只有一篇由于作者的回复态度极其恶劣,让作者又修改了一轮。所以小伙伴们回复审稿人的时候一定要认真对待,面对自己不会修改的问题或不想修改的问题,一定要做合理回复。好了,回到正题。本来想连续的更新层次分割的,但是昨天做结构方程模型的时候,想起来之前讲过如果只描述SEM没有凸显出自己想要的信息的时候,我们可以计算SEM的直接、间接和总效应达到我们的目的。关于如何计算SEM的直接、间接和总效应我在之前的推文里做了详细的解释,后来也尝试过自动化的计算,但是之前写的代码有些小bug,没有普适性,当因子无限多的时候会无法计算,所以我更新了代码。调试了好几遍,加多少因子和路径都能计算。

准备数据集

数据如下,小编已上传,文末下载。

开始分析

我们先读取数据并运行SEM

library(piecewiseSEM)library(ggplot2)library(reshape2)library(dplyr)
# 读取数据litter <- read.csv(file.choose(), header = TRUE)
# 定义模型model <- psem( lm(TRAD ~ BD+AK+AN+SM+AP, litter ), lm(AK ~ BD+SM, litter ), lm(AN ~ BD+SM, litter ), lm(AP ~ BD+SM, litter ), (SM%~~%BD)#双向在后边的计算中会被忽略)
#模型结果summary(model, .progressBar = F)

模型结果如下,我们可以看见AIC BIC Fisher's C 以及P值,还有就是我们最关键的路径系数

> #模型结果> summary(model, .progressBar = F)
Structural Equation Model of model
Call: TRAD ~ BD + AK + AN + SM + AP AK ~ BD + SM AN ~ BD + SM AP ~ BD + SM SM ~~ BD
AIC BIC 49.424 87.215
---Tests of directed separation:
Independ.Claim Test.Type DF Crit.Value P.Value AN ~ AK + ... coef 50 -1.0486 0.2994 AP ~ AK + ... coef 50 0.6232 0.5360 AP ~ AN + ... coef 50 2.3911 0.0206 *
Global goodness-of-fit:
Fisher's C = 11.424 with P-value = 0.076 and on 6 degrees of freedom
---Coefficients:
Response Predictor Estimate Std.Error DF Crit.Value P.Value Std.Estimate TRAD BD 0.0963 0.0213 48 4.5111 0.0000 0.3650 *** TRAD AK 0.0005 1e-04 48 7.2792 0.0000 0.5074 *** TRAD AN -0.0009 2e-04 48 -5.5825 0.0000 -0.4558 *** TRAD SM -0.0019 5e-04 48 -3.5507 0.0009 -0.2427 *** TRAD AP -0.0117 0.0039 48 -3.0216 0.0040 -0.2306 ** AK BD -84.1092 31.0104 51 -2.7123 0.0091 -0.3297 ** AK SM -3.0687 0.9367 51 -3.2762 0.0019 -0.3983 ** AN BD 76.5119 14.4832 51 5.2828 0.0000 0.5764 *** AN SM 1.1743 0.4375 51 2.6842 0.0098 0.2929 ** AP BD 2.8427 0.6011 51 4.7292 0.0000 0.5482 *** AP SM 0.0272 0.0182 51 1.4959 0.1408 0.1734 ~~SM ~~BD -0.0669 - 52 -0.4834 0.6308 -0.0669
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05
---Individual R-squared:
Response method R.squared TRAD none 0.83 AK none 0.25 AN none 0.40 AP none 0.32

开始计算三个效应

开始计算直接、间接、总效应,代码很长,但是大家直接调用即可,这步的目的就是把结果计算出来并列成表

#开始计算直接、间接、总效应coefs_df <- coefs(model)print("=== 原始 coefs_df ===")print(coefs_df)
# 若过滤后无数据,则无法计算路径效应if (nrow(coefs_df) == 0) { stop("过滤后没有有效的单向回归路径。请检查模型或数据。")}
coefs_df$effect <- coefs_df$Std.Estimate
build_graph <- function(coefs_df) { graph <- list() for (i in 1:nrow(coefs_df)) { from <- coefs_df$Predictor[i] to <- coefs_df$Response[i] effect <- coefs_df$effect[i] if (!is.null(graph[[from]])) { graph[[from]] <- rbind(graph[[from]], data.frame(to = to, effect = effect, stringsAsFactors = FALSE)) } else { graph[[from]] <- data.frame(to = to, effect = effect, stringsAsFactors = FALSE) } } return(graph)}
graph <- build_graph(coefs_df)print("=== 构建的 graph ===")print(graph)
find_all_paths <- function(graph, start, end, visited = character()) { if (start == end) { return(list(c(end))) } if (!start %in% names(graph)) { return(list()) } visited <- c(visited, start) paths <- list() for (i in 1:nrow(graph[[start]])) { next_node <- graph[[start]]$to[i] if (!(next_node %in% visited)) { sub_paths <- find_all_paths(graph, next_node, end, visited) for (sp in sub_paths) { paths <- c(paths, list(c(start, sp))) } } } return(paths)}
path_effect <- function(path, graph) { eff <- 1 for (i in 1:(length(path)-1)) { from <- path[i] to <- path[i+1] edge <- graph[[from]] eff <- eff * edge$effect[edge$to == to] } return(eff)}
compute_all_effects <- function(graph, variables) { results <- data.frame( predictor = character(), outcome = character(), direct_effect = numeric(), indirect_effect = numeric(), total_effect = numeric(), stringsAsFactors = FALSE ) for (pred in variables) { for (outc in variables) { if (pred != outc) { paths <- find_all_paths(graph, pred, outc) # Debug print # print(paste("Paths from", pred, "to", outc, ":")) # print(paths) if (length(paths) > 0) { direct_effect <- 0 indirect_effect <- 0 for (p in paths) { eff <- path_effect(p, graph) if (length(p) == 2) { direct_effect <- direct_effect + eff } else { indirect_effect <- indirect_effect + eff } } total_effect <- direct_effect + indirect_effect results <- rbind(results, data.frame( predictor = pred, outcome = outc, direct_effect = direct_effect, indirect_effect = indirect_effect, total_effect = total_effect, stringsAsFactors = FALSE )) } } } } return(results)}
all_vars <- unique(c(coefs_df$Response, coefs_df$Predictor))effects_df <- compute_all_effects(graph, all_vars)print("=== 计算的效应结果 effects_df ===")
#整体的计算结果print(effects_df)

最后整理的结果如下,这个时候我们得到了所有的因子的直接、间接和总效应

> #整体的计算结果> print(effects_df)   predictor outcome direct_effect indirect_effect total_effect1         AK    TRAD        0.5074       0.0000000    0.50740002         AN    TRAD       -0.4558       0.0000000   -0.45580003         AP    TRAD       -0.2306       0.0000000   -0.23060004         BD    TRAD        0.3650      -0.5564278   -0.19142785         BD      AK       -0.3297       0.0000000   -0.32970006         BD      AN        0.5764       0.0000000    0.57640007         BD      AP        0.5482       0.0000000    0.54820008         SM    TRAD       -0.2427      -0.3755873   -0.61828739         SM      AK       -0.3983       0.0000000   -0.398300010        SM      AN        0.2929       0.0000000    0.292900011        SM      AP        0.1734       0.0000000    0.173400012      ~~BD    ~~SM       -0.0669       0.0000000   -0.0669000

虽然得到了所有的因子的直接、间接和总效应,但是别忘了我们的目的,我们是为了计算这些因子多目标因子TRAD的直接、间接和总效应,所以我在这里接了一步确定目标因子的代码。第16行是规定最后的绘图显示的因子有哪些(因为全画会有双向因子的出现,而我们双向因子的路径系数是不纳入效应值计算的),26行则是明确我们的目标因子是TRAD

if (nrow(effects_df) == 0) {  stop("没有计算出任何效应值,请检查路径是否存在。")}
# 自定义三个效应的展示顺序metrics_to_show <- c("direct_effect", "indirect_effect", "total_effect")
plot_df <- melt(effects_df, id.vars = c("predictor", "outcome"), measure.vars = metrics_to_show, variable.name = "effect_type", value.name = "effect_size")
# 自定义X轴显示顺序及需要显示的指标plot_df$effect_type <- factor(plot_df$effect_type, levels = metrics_to_show)desired_order <- c("BD", "SM", "AN", "AP", "AK")
# 只保留指定顺序中的 predictorplot_df <- plot_df %>% filter(predictor %in% desired_order)
# 将 predictor 转换为因子并按顺序排序plot_df$predictor <- factor(plot_df$predictor, levels = desired_order)

# 假设 'outcome' 是列名,筛选出 outcome 为 'TRAD' 的行plot_df_filtered <- plot_df %>% filter(outcome == "TRAD")plot_df_filtered

最后的结果如下,就得到了对目标因子TRAD的直接、间接、总效应。这样我们就大功告成了

> plot_df_filtered   predictor outcome     effect_type effect_size1         AK    TRAD   direct_effect   0.50740002         AN    TRAD   direct_effect  -0.45580003         AP    TRAD   direct_effect  -0.23060004         BD    TRAD   direct_effect   0.36500005         SM    TRAD   direct_effect  -0.24270006         AK    TRAD indirect_effect   0.00000007         AN    TRAD indirect_effect   0.00000008         AP    TRAD indirect_effect   0.00000009         BD    TRAD indirect_effect  -0.556427810        SM    TRAD indirect_effect  -0.375587311        AK    TRAD    total_effect   0.507400012        AN    TRAD    total_effect  -0.455800013        AP    TRAD    total_effect  -0.230600014        BD    TRAD    total_effect  -0.191427815        SM    TRAD    total_effect  -0.6182873

开始绘图

计算结果都有了,我们就开始绘图

# 绘图ggplot(plot_df_filtered, aes(x = predictor, y = effect_size, fill = effect_type)) +  geom_bar(stat = "identity", position = "dodge") +   labs(x = "Predictor", y = "Effect Size (Standardized)",        title = "Effects on TRAD: Direct, Indirect, and Total") +  scale_fill_manual(values = c("direct_effect" = "#008CB7",                                "indirect_effect" = "#A5C98C",                                "total_effect" = "#FFBD00")) +  theme_bw() +  theme(axis.text.x = element_text(angle = 0, hjust = 0.5),        legend.title = element_blank()) +  geom_hline(yintercept = 0, linetype = "solid", color = "black", size = 0.5)+  # 添加水平线  ylim(-1, 1)  # 设置 Y 轴范围

这样,我们就计算完成了,代码的第一版很冗长,修改了好久写了几个循环才简化下来。好了,快到点了,吃席去了。

完整版代码如下

library(piecewiseSEM)library(ggplot2)library(reshape2)library(dplyr)
# 读取数据litter <- read.csv(file.choose(), header = TRUE)
# 定义模型model <- psem( lm(TRAD ~ BD+AK+AN+SM+AP, litter ), lm(AK ~ BD+SM, litter ), lm(AN ~ BD+SM, litter ), lm(AP ~ BD+SM, litter ), (SM%~~%BD)#双向在后边的计算中会被忽略)
#模型结果summary(model, .progressBar = F)
#开始计算直接、间接、总效应coefs_df <- coefs(model)print("=== 原始 coefs_df ===")print(coefs_df)
# 若过滤后无数据,则无法计算路径效应if (nrow(coefs_df) == 0) { stop("过滤后没有有效的单向回归路径。请检查模型或数据。")}
coefs_df$effect <- coefs_df$Std.Estimate
build_graph <- function(coefs_df) { graph <- list() for (i in 1:nrow(coefs_df)) { from <- coefs_df$Predictor[i] to <- coefs_df$Response[i] effect <- coefs_df$effect[i] if (!is.null(graph[[from]])) { graph[[from]] <- rbind(graph[[from]], data.frame(to = to, effect = effect, stringsAsFactors = FALSE)) } else { graph[[from]] <- data.frame(to = to, effect = effect, stringsAsFactors = FALSE) } } return(graph)}
graph <- build_graph(coefs_df)print("=== 构建的 graph ===")print(graph)
find_all_paths <- function(graph, start, end, visited = character()) { if (start == end) { return(list(c(end))) } if (!start %in% names(graph)) { return(list()) } visited <- c(visited, start) paths <- list() for (i in 1:nrow(graph[[start]])) { next_node <- graph[[start]]$to[i] if (!(next_node %in% visited)) { sub_paths <- find_all_paths(graph, next_node, end, visited) for (sp in sub_paths) { paths <- c(paths, list(c(start, sp))) } } } return(paths)}
path_effect <- function(path, graph) { eff <- 1 for (i in 1:(length(path)-1)) { from <- path[i] to <- path[i+1] edge <- graph[[from]] eff <- eff * edge$effect[edge$to == to] } return(eff)}
compute_all_effects <- function(graph, variables) { results <- data.frame( predictor = character(), outcome = character(), direct_effect = numeric(), indirect_effect = numeric(), total_effect = numeric(), stringsAsFactors = FALSE ) for (pred in variables) { for (outc in variables) { if (pred != outc) { paths <- find_all_paths(graph, pred, outc) # Debug print # print(paste("Paths from", pred, "to", outc, ":")) # print(paths) if (length(paths) > 0) { direct_effect <- 0 indirect_effect <- 0 for (p in paths) { eff <- path_effect(p, graph) if (length(p) == 2) { direct_effect <- direct_effect + eff } else { indirect_effect <- indirect_effect + eff } } total_effect <- direct_effect + indirect_effect results <- rbind(results, data.frame( predictor = pred, outcome = outc, direct_effect = direct_effect, indirect_effect = indirect_effect, total_effect = total_effect, stringsAsFactors = FALSE )) } } } } return(results)}
all_vars <- unique(c(coefs_df$Response, coefs_df$Predictor))effects_df <- compute_all_effects(graph, all_vars)print("=== 计算的效应结果 effects_df ===")
#整体的计算结果print(effects_df)
if (nrow(effects_df) == 0) { stop("没有计算出任何效应值,请检查路径是否存在。")}
# 自定义三个效应的展示顺序metrics_to_show <- c("direct_effect", "indirect_effect", "total_effect")
plot_df <- melt(effects_df, id.vars = c("predictor", "outcome"), measure.vars = metrics_to_show, variable.name = "effect_type", value.name = "effect_size")
# 自定义X轴显示顺序及需要显示的指标plot_df$effect_type <- factor(plot_df$effect_type, levels = metrics_to_show)desired_order <- c("BD", "SM", "AN", "AP", "AK")
# 只保留指定顺序中的 predictorplot_df <- plot_df %>% filter(predictor %in% desired_order)
# 将 predictor 转换为因子并按顺序排序plot_df$predictor <- factor(plot_df$predictor, levels = desired_order)

# 假设 'outcome' 是列名,筛选出 outcome 为 'TRAD' 的行plot_df_filtered <- plot_df %>% filter(outcome == "TRAD")plot_df_filtered
# 绘图ggplot(plot_df_filtered, aes(x = predictor, y = effect_size, fill = effect_type)) + geom_bar(stat = "identity", position = "dodge") + labs(x = "Predictor", y = "Effect Size (Standardized)", title = "Effects on TRAD: Direct, Indirect, and Total") + scale_fill_manual(values = c("direct_effect" = "#008CB7", "indirect_effect" = "#A5C98C", "total_effect" = "#FFBD00")) + theme_bw() + theme(axis.text.x = element_text(angle = 0, hjust = 0.5), legend.title = element_blank()) + geom_hline(yintercept = 0, linetype = "solid", color = "black", size = 0.5)+ # 添加水平线  ylim(-11)  # 设置 Y 轴范围

脚本与示例数据下载


下载的时候顺便点个赞呗~

关注公众号“生态R学社”回复“效应计算和绘图”即可下载完整代码与示例数据



走天涯徐小洋地理数据科学
一个爱生活的地理土博,分享GIS、遥感、空间分析、R语言、景观生态等地理数据科学实操教程、经典文献、数据资源
 最新文章