更新|不用写代码!但这个 APP 会帮你自动生成代码!复现 Science 论文结构方程模型|模型选择

文摘   2024-05-27 19:27   江苏  

之前发布的一个小工具 R 语言 shiny APP 有个小更新:做结构方程模型,不用自己写代码,但是 APP 会自动生成完整的模型选择代码(见本文第 2 节*)!而且,是以 R Markdown(.Rmd)以及渲染好的 .html 文件的形式保存——以保证分析结果可以复现——尤其在好杂志的部分审稿人可能要求检查或公开代码的情况下,这点尤其重要。

如要获取这个 easypsemselection APP(SEM 模型选择 APP)——另包括一个鼠标点击交互式 SEM 可视化 APP)——这可能适合初学者,或急用但没太多闲学习 R 语言代码的朋友。

👉 打赏 15 元或以上的,即可获取;
👉 或,曾打赏上一个版本的,只需打赏 5 元,即可获取;
👉 或,票圈集赞 20 个,且分享到科研群的,即可获取。

(👉 记得留下邮件地址。)

引言

从模型选择,到可视化——复现一篇 Science 论文包含广义线性型(GLM)的分段结构方程模型(piecewiseSEM)。

这将使用到本公众号制作的,根据 AIC 的做 piecewiseSEM 模型选择的 shiny APP 小工具:easypsemselection,

  • 该 APP 可同时输入多达 24 个模型;一次性完成拟合;

  • 可以根据模型 AIC 对 SEM 做排序;

  • 除安装 R 包之外,使用者不用再写任何代码(只需要稍微了解 R 语言回归公式的写法)。

Disruption of an ant-plant mutualism shapes interactions between lions and their primary prey

(一篇有趣的 Science 论文。)

1 演示

演示视频如下(此更新版,除了增加生成代码功能,其余功能不变,以下演示视频仍及有效):

  • 第 1 部分,模型选择;

  • 第 2 部分,SEM 结果数据整理;

  • 第 3 部分,SEM 路径图可视化;

  • 第 4 部分,演示用 PPT,或鼠标,修改 SEM 路径图!

2 以下是 APP 生成的代码

*以下 R 语言代码由此 APP 自动生成(仅演示了其中 6 个备选模型的选择;当模型备选模型像原文那样有 22 个的时候,这个 APP 能节省大量分析时间)。即使不打赏该 APP,以下代码可能也可以当作一个手动分析的模版

Packages

# Packages: Many tools
library(tidyverse)
library(ggraph)
library(piecewiseSEM)

Data

# Data: Read the uploaded .csv file
DATA <- read.csv("ant-plant-lion-prey.csv", header = TRUE)

Model selection

model1 <- psem(
# Full model
glm( Zebra_kill ~ Visibility + zeb.density + Lion_activity, family = binomial(link = "logit"), DATA ),
glm( Visibility ~ Invasion, family = gaussian(), DATA ),
glm( zeb.density ~ Visibility + Lion_activity, family = gaussian(), DATA ),
glm( Lion_activity ~ Visibility, family = gaussian(), DATA ))

model2 <- psem(
# Model-1
glm( Zebra_kill ~ Visibility + Lion_activity, family = binomial(link = "logit"), DATA ),
glm( Visibility ~ Invasion, family = gaussian(), DATA ),
glm( zeb.density ~ 1, family = gaussian(), DATA ),
glm( Lion_activity ~ Visibility, family = gaussian(), DATA ))

model3 <- psem(
# Model-2
glm( Zebra_kill ~ Visibility + Lion_activity, family = binomial(link = "logit"), DATA ),
glm( Visibility ~ 1, family = gaussian(), DATA ),
glm( zeb.density ~ 1, family = gaussian(), DATA ),
glm( Lion_activity ~ Visibility, family = gaussian(), DATA ))

model4 <- psem(
# Model-3
glm( Zebra_kill ~ Visibility + Lion_activity, family = binomial(link = "logit"), DATA ),
glm( Visibility ~ Invasion, family = gaussian(), DATA ),
glm( zeb.density ~ 1, family = gaussian(), DATA ),
glm( Lion_activity ~ 1, family = gaussian(), DATA ))

model5 <- psem(
# Model-17
glm( Zebra_kill ~ Visibility, family = binomial(link = "logit"), DATA ),
glm( Visibility ~ Invasion, family = gaussian(), DATA ),
glm( zeb.density ~ Lion_activity, family = gaussian(), DATA ),
glm( Lion_activity ~ 1, family = gaussian(), DATA ))

model6 <- psem(
# Model-18
glm( Zebra_kill ~ Visibility, family = binomial(link = "logit"), DATA ),
glm( zeb.density ~ Lion_activity, family = gaussian(), DATA ),
glm( Lion_activity ~ 1, family = gaussian(), DATA ),
glm( Visibility ~ 1, family = gaussian(), DATA ))

Result

# Result: Find the 'best' model with lowest AIC
nModel <- 6
(aicValues <- sapply(1:nModel, function(i) AIC(eval(parse(text = paste0("model", i))))$AIC))
## [1] 1060.145 1062.605 1082.542
## [4] 1061.010 1056.126 1076.063
(orderBestModel <- order(aicValues)[1])
## [1] 5
bestModel <- eval(parse(text = paste0("model", orderBestModel)))
summary(bestModel, .progressBar = FALSE)
## 
## Structural Equation Model of bestModel
##
## Call:
## Zebra_kill ~ Visibility
## Visibility ~ Invasion
## zeb.density ~ Lion_activity
## Lion_activity ~ 1
##
## AIC
## 1056.126
##
## ---
## Tests of directed separation:
##
## Independ.Claim
## Zebra_kill ~ Invasion + ...
## zeb.density ~ Invasion + ...
## Visibility ~ Lion_activity + ...
## Zebra_kill ~ Lion_activity + ...
## zeb.density ~ Visibility + ...
## zeb.density ~ Zebra_kill + ...
## Test.Type DF Crit.Value
## coef 102 -0.2491
## coef 102 -1.9833
## coef 102 0.7135
## coef 102 -1.4676
## coef 101 0.2147
## coef 101 0.7504
## P.Value
## 0.8033
## 0.0500
## 0.4772
## 0.1422
## 0.8304
## 0.4547
##
## --
## Global goodness-of-fit:
##
## Chi-Squared = 7.592 with P-value = 0.27 and on 6 degrees of freedom
## Fisher's C = 13.757 with P-value = 0.316 and on 12 degrees of freedom
##
## ---
## Coefficients:
##
## Response Predictor
## Zebra_kill Visibility
## Visibility Invasion
## zeb.density Lion_activity
## Estimate Std.Error DF
## -0.0867 0.0203 103
## 13.4531 2.7500 103
## -1.0723 0.3945 103
## Crit.Value P.Value
## -4.2682 0.0000
## 4.8920 0.0000
## -2.7181 0.0077
## Std.Estimate
## -0.5794 ***
## 0.4342 ***
## -0.2587 **
##
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05
##
## ---
## Individual R-squared:
##
## Response method
## Zebra_kill nagelkerke
## Visibility none
## zeb.density none
## Lion_activity none
## R.squared
## 0.31
## 0.19
## 0.07
## 0.00

Graph

# Graph: Generate and display the graph of the best model
FONT <- "Prompt"

# Function to plot the SEM
makeGR <- function(graph, layout = "stress") {
ggraph(graph, layout = layout) +
geom_edge_link(
aes(
edge_width = abs(Std.Estimate),
edge_color = ifelse(Std.Estimate > 0, "+", "-"),
edge_linetype = ifelse(P.Value <= 0.05, 1, 2),
label = paste0(
round(Std.Estimate, 3),
symnum(P.Value, cutpoints = c(0, .001,.01,.05, .1, 1), symbols = c('***', '**', '*', '#', ''))
),
start_cap = label_rect(node1.name),
end_cap = label_rect(node2.name)
),
angle_calc = "along",
label_size = 4,
vjust = 1.5,
arrow = arrow(60, length = unit(10, "pt")),
family = FONT
) +
geom_node_text(
aes(label = name),
size = 5,
family = FONT
) +
scale_edge_width_continuous(
name = NULL,
range = c(0.3, 3)
) +
scale_edge_color_brewer(
name = NULL,
palette = "Dark2",
direction = -1
) +
scale_edge_linetype_identity(
name = NULL
) +
coord_cartesian(
clip = "off"
) +
theme_void() +
theme(
plot.margin = margin(10, 20, 10, 20)
)
}

# Plot the 'best' model
piecewiseSEM::coefs(bestModel) |>
dplyr::select(-9) |>
dplyr::relocate(
Predictor,
Response,
Std.Estimate,
P.Value
) |>
tidygraph::as_tbl_graph(directed = TRUE) |>
makeGR()

# Save a .csv for further visualization
piecewiseSEM::coefs(bestModel) |>
dplyr::select(-9) |>
dplyr::relocate(
from = Predictor,
to = Response,
weight = Std.Estimate,
p = P.Value
) |>
write.csv('Result.csv', row.names = FALSE)

4 进一步的 SEM 可视化

all-sem-graph.R 用于 SEM 的可视化,

all-sem-graph-share.R # SEM 可视化
  • 关于 all-sem-graph.R 小工具;另见,
👉 使用鼠标完成 SEM 交互式可视化!独家工具


(点击 👉 跳转。)

结语

另见,
👉 一行代码不写,也能做结构方程模型!初学者神器
👉 正确设置模型结构的重要性!模型选择


(点击 👉 跳转。)


ecologyR
🚀打赏可获取本公众号代码合集(见置顶文章)📌统计案例📌统计制图📌显著性标记📌结构方程模型可视化工具📌SEM教程与案例📌论文代码复现📌地图可视化
 最新文章