中介效应：有序因果中介分析的半参数估计B-实操

学术 2024-09-16 10:01 山西

🍓 课程推荐：2024 空间计量专题
主讲老师：范巧 (兰州大学)
课程时间：2024 年 10 月 2-4 日 (三天)
课程咨询：王老师 18903405450（微信）

课程特色 · 2024空间计量：

👉 一、从“零基础”到“高水平”的课程设计

兼顾基础知识、主流模型与前沿模型
既考虑软件安装、程序编写以及空间权重矩阵设计等 基础知识 讲授，更强调时空面板地理加权回归模型、贝叶斯空间计量模型、矩阵指数模型、空间计量交互模型与空间面板似不相关回归模型等 前沿模型 的传授。

👉 二、“保姆级”的空间计量代码

编写与校准所有模型的MATLAB代码，简化实操环节
模型的估计与检验等 仅按照提供的Excel数据版式 搜集与整理原始数据，即可一次性出结果并作图。

👉 三、“最多上新” 的内容体系

新增矩阵指数模型、短面板空间似不相关模型、空间计量交互模型、贝叶斯空间计量模型等
新增前沿应用案例，包括空间计量与索洛余值法、随机前沿分析与数据包络分析等的互嵌研究，阐释基于空间计量的产业空间结构优化评价方法。
新增 Dagum空间基尼系数、核密度估计、空间马尔科夫链与空间收敛性等内容，阐释现实研究中对空间收敛性的应用“谬误”。

作者： 余坚 (贵州财经大学)
邮箱： yujiangeren@163.com

Source: Xiang Zhou, 2021, Semiparametric estimation for causal mediation analysis with multiple causally ordered mediators. -Link-，-Data-

温馨提示： 文中链接在微信中无法生效。请点击底部「阅读原文」。或直接长按/扫描如下二维码，直达原文：

1. 引言
2. 数据背景
3. R 语言实操

3.1 前期工作
3.2 估计过程
3.3 输出与绘制估计结果

4. 估计结果解读
5. 结语
6. 参考资料
7. 相关推文

1. 引言

在有序因果中介分析的半参数估计理论篇 [ 中介效应：有序因果中介分析的半参数估计A-理论 ] 中，我们已经对个因果有序中介设定下，路径特定效应 (path-specific effects, PSEs) 的估计进行了理论介绍。本推文将在此基础上，复现作者原文中的实证结果，在 R 语言软件中演示具体操作，并对相应结果进行解读。

2. 数据背景

作者使用的数据来源于美国 1997 年全国青年纵向调查 (National Longitudinal Survey of Youth 1997)，数据集中共有 2969 人，在 1997 年他们的年龄为 15 岁至 17 岁，并在 20 岁前完成了高中学业。相关研究表明，大学入学率对美国的政治参与 (选举中是否投票) 有实质性的积极影响，但这种因果关系背后的机制仍不清楚。大学入学率对政治参与的影响可能通过公民和政治兴趣的发展、经济状况的提高、或社会和职业网络等其他途径发挥作用。为了检验这些直接和间接的影响，我们考虑一个因果结构，令表示大学入学率，表示政治参与，和表示两个因果有序中介，分别反映了 (a) 经济状况， (b) 公民和政治兴趣。

在该数据集中，处理是一个二元变量，用来衡量个体在 20 岁之前是否上过 2 年或 4 年的大学。结果是一个二元变量，用来衡量个体是否在 2010 年大选中投票。经济状况由受访者 2006 年至 2009 年的平均年收入度量。公民和政治兴趣由一组变量来度量，反映受访者在 2007 至 2010 年间对政府和公共事务的兴趣，以及参与志愿服务、捐赠、社区团体活动的情况。

预处理协变量包括性别、种族、民族、1997 年的年龄、父母教育、父母收入、父母资产、父亲形象、是否与亲生父母都住在一起、军队职业能力测试分数、高中 GPA、物质使用指数、青少年犯罪指数，受访者在 18 岁之前是否有孩子，受访者的同龄人对大学的期望，以及一些学校水平的特征。

我们在上一篇推文中已经介绍过反映因果路径，，和的 PSEs。为了便于说明，现在我们主要关注的累积路径特定效应 (cumulative path-sprecific effects, cPSEs)：

其中，第一个部分是大学入学率的 NDE，第二个和第三个部分反映了分别通过公民/政治兴趣中介和经济状况中介的处理效应的大小。

3. R 语言实操

3.1 前期工作

首先，我们进行一系列前期工作，包括载入所需的包，设定画图的主题，以及导入数据。

rm(list = ls())
library(haven)
library(survey)
library(gbm)
library(ranger)
library(rlang)
library(glmnet)
library(rsample)
library(caret)
library(tidyverse)
library(mice)
library(SuperLearner)
source("zmisc.R")
set.seed(02138)

# set my ggplot theme
mytheme <- theme_minimal(base_size = 16) + 
  theme(legend.position = "bottom",
        plot.title = element_text(hjust = 0.5),
        plot.caption = element_text(color = "grey30"))
theme_set(mytheme)

# load data
nlsy97_ed3 <- readRDS("nlsy97_ed3.RDS")

并且，我们使用多重插补法 (Multiple Imputation) 补齐缺失的协变量。

##########################################################
# Multiple Imputation for All Missing Covariates
##########################################################

# number of imputed datasets
I <- 10

nlsy97_ed3_mi <- nlsy97_ed3 %>%
  # remove redundant variables for imputation
  select(-id, -weight) %>%
  mice(m = I, maxit = 10, defaultMethod = "rf")

nlsy97_ed3_full <- map(1:I, ~ complete(nlsy97_ed3_mi, .x)) %>%
  map(., ~ mutate(.x,
                  id = nlsy97_ed3$id,
                  weight = nlsy97_ed3$weight,
                  logearn2007 = log(pmax(total_earnings_adj_2007, 1000)),
                  logearn2008 = log(pmax(total_earnings_adj_2008, 1000)),
                  logearn2009 = log(pmax(total_earnings_adj_2009, 1000)),
                  logearn2010 = log(pmax(total_earnings_adj_2010, 1000)),
  ))

saveRDS(nlsy97_ed3_full, file = "nlsy97_ed3_full.RDS")

3.2 估计过程

首先，我们载入所需的包并对一些变量进行预处理。

rm(list = ls())
library(survey)
library(gbm)
library(ranger)
library(glmnet)
library(rsample)
library(caret)
library(rlang)
library(tidyverse)
library(Hmisc)
library(SuperLearner)
source("zmisc.R")
set.seed(02138)

nlsy97_full <- readRDS("nlsy97_ed3_full.RDS") %>%
  map(., ~ mutate(.x,
                  aveearn = rowMeans(select(.x, num_range("total_earnings_adj_", 2007:2010)), na.rm = TRUE),
                  logearn = log(pmax(aveearn, 1000)),
                  interest = 5 - rowMeans(select(.x, interest2008, interest2010), na.rm = TRUE)))

n <- nrow(nlsy97_full[[1]])

接着，我们开始设定变量与公式，此处 a 为处理组的平均处理效应 (Average Treatment Effects on Treated, ATT)；y 为结果变量，即是否在 2010 年选举中投票；第一个中介 m1 为经济状况；第二个中介 m2 为一组多元变量，包括受访者在 2007 至 2010 年间对政府和公共事务的兴趣，以及参与志愿服务、捐赠、社区团体活动的情况；x 为一组控制变量。我们分别对无中介、仅有中介 m1、中介 m1 和 m2 三种情况下的处理变量和结果变量设定计算公式。

##########################################################
# Formulas for treatment and outcome models
##########################################################

# continuous variables
nlsy97_full[[1]] %>%
  map_lgl( ~ length(unique(.x)) > 2) %>%
  which()

a <- "att"
y <- "vote2010"
m1 <- c("logearn")
m2 <- c("volunteer2007", "community2007", "donate2007", "interest")

x <- exprs(age97, female, black, hispan, parinc, paredu, parast, livebio, father,
           afqt3, gpa, children18, substind, delinqind, rural97, south97,
           frcoll_75, frcoll_90, schstolen, schthreat, schfight) %>%
  map_chr(as_string)

a0_form <- as.formula(paste(a, " ~ ", paste(x, collapse= "+")))
a1_form <- as.formula(paste(a, " ~ ", paste(c(x, m1), collapse= "+")))
a2_form <- as.formula(paste(a, " ~ ", paste(c(x, m1, m2), collapse= "+")))

y0_form <- as.formula(paste(y, " ~ ", paste(c(x, a), collapse= "+")))
y1_form <- as.formula(paste(y, " ~ ", paste(c(x, a, m1), collapse= "+")))
y2_form <- as.formula(paste(y, " ~ ", paste(c(x, a, m1, m2), collapse= "+")))

最后，则是对原文中的理论估计模型进行编程并得到估计结果。作者使用了两种方法，一是结合结果模型与迭代回归输入，二是使用目标极大似然估计方法。注意以下命令全部运行完约需两个小时。

##########################################################
# Main analyses
##########################################################

estimands <- expand.grid(c(0, 1), c(0, 1), c(0, 1)) %>%
  `colnames<-`(c("a1", "a2", "a3"))

I <- length(nlsy97_full)
S <- nrow(estimands)
K <- 5

out <- vector(mode = "list", I)

for(i in 1:I){
  
  cat("imputed sample ", i, "\n")
  
  df <- nlsy97_full[[i]]
  
  df_p0 <- model.matrix(a0_form, data = df)[, -1] %>% as_tibble()
  df_p1 <- model.matrix(a1_form, data = df)[, -1] %>% as_tibble()
  df_p2 <- model.matrix(a2_form, data = df)[, -1] %>% as_tibble()
  
  df_mu0 <- model.matrix(y0_form, data = df)[, -1] %>% as_tibble()
  df_mu1 <- model.matrix(y1_form, data = df)[, -1] %>% as_tibble()
  df_mu2 <- model.matrix(y2_form, data = df)[, -1] %>% as_tibble()
  
  df_mu2n <- model.matrix(y2_form, data = mutate(df, att = 0))[, -1] %>% as_tibble()
  df_mu1n <- model.matrix(y1_form, data = mutate(df, att = 0))[, -1] %>% as_tibble()
  df_mu0n <- model.matrix(y0_form, data = mutate(df, att = 0))[, -1] %>% as_tibble()
  
  df_mu2y <- model.matrix(y2_form, data = mutate(df, att = 1))[, -1] %>% as_tibble()
  df_mu1y <- model.matrix(y1_form, data = mutate(df, att = 1))[, -1] %>% as_tibble()
  df_mu0y <- model.matrix(y0_form, data = mutate(df, att = 1))[, -1] %>% as_tibble()
    
  # create cross-fitting split
  cf_fold <- createFolds(df$vote2010, K)
  
  main_list <- vector(mode = "list", K)
  
  for(k in 1:K){
    
    cat(" cross-fitting fold ", k, "\n")
    
    #################################################
    # Design matrices for different models
    #################################################
    
    # auxiliary and main data
    aux <- df[-cf_fold[[k]], ]
    main <- df[cf_fold[[k]], ]
    
    aux_p0 <- df_p0[-cf_fold[[k]], ]
    aux_p1 <- df_p1[-cf_fold[[k]], ]
    aux_p2 <- df_p2[-cf_fold[[k]], ]
    
    main_p0 <- df_p0[cf_fold[[k]], ]
    main_p1 <- df_p1[cf_fold[[k]], ]
    main_p2 <- df_p2[cf_fold[[k]], ]
    
    aux_mu0 <- df_mu0[-cf_fold[[k]], ]
    aux_mu1 <- df_mu1[-cf_fold[[k]], ]
    aux_mu2 <- df_mu2[-cf_fold[[k]], ]
    
    main_mu0 <- df_mu0[cf_fold[[k]], ]
    main_mu1 <- df_mu1[cf_fold[[k]], ]
    main_mu2 <- df_mu2[cf_fold[[k]], ]
    
    #################################################
    # Treatment Models
    #################################################
    
    p0_sl <- SuperLearner(
      Y          = aux$att,
      X          = aux_p0,
      newX       = df_p0,
      family     = binomial(),
      obsWeights = aux$weight,
      SL.library = c("SL.mean", "SL.glmnet", "SL.ranger"),
      control    = list(saveFitLibrary = TRUE, trimLogit = 0.001),
      cvControl  = list(V = 5L, stratifyCV = TRUE, shuffle = TRUE, validRows = NULL)
    )
    
    p1_sl <- SuperLearner(
      Y          = aux$att,
      X          = aux_p1,
      newX       = df_p1,
      family     = binomial(),
      obsWeights = aux$weight,
      SL.library = c("SL.mean", "SL.glmnet", "SL.ranger"),
      control    = list(saveFitLibrary = TRUE, trimLogit = 0.001),
      cvControl  = list(V = 5L, stratifyCV = TRUE, shuffle = TRUE, validRows = NULL)
    )
    
    p2_sl <- SuperLearner(
      Y          = aux$att,
      X          = aux_p2,
      newX       = df_p2,
      family     = binomial(),
      obsWeights = aux$weight,
      SL.library = c("SL.mean", "SL.glmnet", "SL.ranger"),
      control    = list(saveFitLibrary = TRUE, trimLogit = 0.001),
      cvControl  = list(V = 5L, stratifyCV = TRUE, shuffle = TRUE, validRows = NULL)
    )
    
    df <- df %>% mutate(
      p0_fit = p0_sl$SL.predict,
      p1_fit = p1_sl$SL.predict,
      p2_fit = p2_sl$SL.predict,
      w0_n = I(att == 0)/(1 - p0_fit),
      w0_y = I(att == 1)/p0_fit,
      w1_nn = I(att == 0)/(1 - p0_fit) * 1,
      w1_ny = I(att == 1)/(1 - p0_fit) * (1 - p1_fit)/p1_fit,
      w1_yn = I(att == 0)/p0_fit * p1_fit/(1 - p1_fit),
      w1_yy = I(att == 1)/p0_fit * 1,
      w2_000 = I(att == 0)/(1 - p0_fit) * 1 * 1,
      w2_001 = I(att == 1)/(1 - p0_fit) * 1 * (1 - p2_fit)/p2_fit,
      w2_010 = I(att == 0)/(1 - p0_fit) * (1 - p1_fit)/p1_fit * p2_fit/(1 - p2_fit),
      w2_011 = I(att == 1)/(1 - p0_fit) * (1 - p1_fit)/p1_fit * 1,
      w2_100 = I(att == 0)/p0_fit * p1_fit/(1 - p1_fit) * 1,
      w2_101 = I(att == 1)/p0_fit * p1_fit/(1 - p1_fit) * (1 - p2_fit)/p2_fit,
      w2_110 = I(att == 0)/p0_fit * 1 * p2_fit/(1 - p2_fit),
      w2_111 = I(att == 1)/p0_fit * 1 * 1,
    )
    
    #################################################
    # Outcome models
    #################################################
    
    mu2_sl <- SuperLearner(
      Y          = aux$vote2010,
      X          = aux_mu2,
      family     = binomial(),
      obsWeights = aux$weight,
      SL.library = c("SL.mean", "SL.glmnet", "SL.ranger"),
      control    = list(saveFitLibrary = TRUE),
      cvControl  = list(V = 5L, shuffle = TRUE, validRows = NULL)
    )
    
    df$mu2_fit_a3n <- predict.SuperLearner(mu2_sl, newdata = df_mu2n)$pred
    df$mu2_fit_a3y <- predict.SuperLearner(mu2_sl, newdata = df_mu2y)$pred
    
    mu1_sl_a3n <- SuperLearner(
      Y          = df$mu2_fit_a3n[-cf_fold[[k]]],
      X          = aux_mu1,
      family     = gaussian(),
      obsWeights = aux$weight,
      SL.library = c("SL.mean", "SL.glmnet", "SL.ranger"),
      control    = list(saveFitLibrary = TRUE),
      cvControl  = list(V = 5L, shuffle = TRUE, validRows = NULL)
    )
    
    mu1_sl_a3y <- SuperLearner(
      Y          = df$mu2_fit_a3y[-cf_fold[[k]]],
      X          = aux_mu1,
      family     = gaussian(),
      obsWeights = aux$weight,
      SL.library = c("SL.mean", "SL.glmnet", "SL.ranger"),
      control    = list(saveFitLibrary = TRUE),
      cvControl  = list(V = 5L, shuffle = TRUE, validRows = NULL)
    )
    
    df$mu1_fit_a3n_a2n <- predict.SuperLearner(mu1_sl_a3n, newdata = df_mu1n)$pred
    df$mu1_fit_a3n_a2y <- predict.SuperLearner(mu1_sl_a3n, newdata = df_mu1y)$pred
    df$mu1_fit_a3y_a2n <- predict.SuperLearner(mu1_sl_a3y, newdata = df_mu1n)$pred
    df$mu1_fit_a3y_a2y <- predict.SuperLearner(mu1_sl_a3y, newdata = df_mu1y)$pred
    
    mu0_sl_a3n_a2n <- SuperLearner(
      Y          = df$mu1_fit_a3n_a2n[-cf_fold[[k]]],
      X          = aux_mu0,
      family     = gaussian(),
      obsWeights = aux$weight,
      SL.library = c("SL.mean", "SL.glmnet", "SL.ranger"),
      control    = list(saveFitLibrary = TRUE),
      cvControl  = list(V = 5L, shuffle = TRUE, validRows = NULL)
    )
    
    mu0_sl_a3n_a2y <- SuperLearner(
      Y          = df$mu1_fit_a3n_a2y[-cf_fold[[k]]],
      X          = aux_mu0,
      family     = gaussian(),
      obsWeights = aux$weight,
      SL.library = c("SL.mean", "SL.glmnet", "SL.ranger"),
      control    = list(saveFitLibrary = TRUE),
      cvControl  = list(V = 5L, shuffle = TRUE, validRows = NULL)
    )
    
    mu0_sl_a3y_a2n <- SuperLearner(
      Y          = df$mu1_fit_a3y_a2n[-cf_fold[[k]]],
      X          = aux_mu0,
      family     = gaussian(),
      obsWeights = aux$weight,
      SL.library = c("SL.mean", "SL.glmnet", "SL.ranger"),
      control    = list(saveFitLibrary = TRUE),
      cvControl  = list(V = 5L, shuffle = TRUE, validRows = NULL)
    )
    
    mu0_sl_a3y_a2y <- SuperLearner(
      Y          = df$mu1_fit_a3y_a2y[-cf_fold[[k]]],
      X          = aux_mu0,
      family     = gaussian(),
      obsWeights = aux$weight,
      SL.library = c("SL.mean", "SL.glmnet", "SL.ranger"),
      control    = list(saveFitLibrary = TRUE),
      cvControl  = list(V = 5L, shuffle = TRUE, validRows = NULL)
    )
    
    df$mu0_fit_a3n_a2n_a1n <- predict.SuperLearner(mu0_sl_a3n_a2n, newdata = df_mu0n)$pred
    df$mu0_fit_a3n_a2n_a1y <- predict.SuperLearner(mu0_sl_a3n_a2n, newdata = df_mu0y)$pred
    
    df$mu0_fit_a3n_a2y_a1n <- predict.SuperLearner(mu0_sl_a3n_a2y, newdata = df_mu0n)$pred
    df$mu0_fit_a3n_a2y_a1y <- predict.SuperLearner(mu0_sl_a3n_a2y, newdata = df_mu0y)$pred
    
    df$mu0_fit_a3y_a2n_a1n <- predict.SuperLearner(mu0_sl_a3y_a2n, newdata = df_mu0n)$pred
    df$mu0_fit_a3y_a2n_a1y <- predict.SuperLearner(mu0_sl_a3y_a2n, newdata = df_mu0y)$pred
    
    df$mu0_fit_a3y_a2y_a1n <- predict.SuperLearner(mu0_sl_a3y_a2y, newdata = df_mu0n)$pred
    df$mu0_fit_a3y_a2y_a1y <- predict.SuperLearner(mu0_sl_a3y_a2y, newdata = df_mu0y)$pred
    
    #################################################
    # Targeted MLE
    #################################################
    
    # targeted mu2
    
    mu2_tmle_000 <- glm(vote2010 ~ 0 + offset(I(qlogis(mu2_fit_a3n))) + w2_000, family = binomial(), weights = weight, data = df[cf_fold[[k]], ])
    mu2_tmle_001 <- glm(vote2010 ~ 0 + offset(I(qlogis(mu2_fit_a3y))) + w2_001, family = binomial(), weights = weight, data = df[cf_fold[[k]], ])
    mu2_tmle_010 <- glm(vote2010 ~ 0 + offset(I(qlogis(mu2_fit_a3n))) + w2_010, family = binomial(), weights = weight, data = df[cf_fold[[k]], ])
    mu2_tmle_011 <- glm(vote2010 ~ 0 + offset(I(qlogis(mu2_fit_a3y))) + w2_011, family = binomial(), weights = weight, data = df[cf_fold[[k]], ])
    mu2_tmle_100 <- glm(vote2010 ~ 0 + offset(I(qlogis(mu2_fit_a3n))) + w2_100, family = binomial(), weights = weight, data = df[cf_fold[[k]], ])
    mu2_tmle_101 <- glm(vote2010 ~ 0 + offset(I(qlogis(mu2_fit_a3y))) + w2_101, family = binomial(), weights = weight, data = df[cf_fold[[k]], ])
    mu2_tmle_110 <- glm(vote2010 ~ 0 + offset(I(qlogis(mu2_fit_a3n))) + w2_110, family = binomial(), weights = weight, data = df[cf_fold[[k]], ])
    mu2_tmle_111 <- glm(vote2010 ~ 0 + offset(I(qlogis(mu2_fit_a3y))) + w2_111, family = binomial(), weights = weight, data = df[cf_fold[[k]], ])
    
    df$mu2b_fit_000 <- predict(mu2_tmle_000, newdata = df, type = "response")
    df$mu2b_fit_001 <- predict(mu2_tmle_001, newdata = df, type = "response")
    df$mu2b_fit_010 <- predict(mu2_tmle_010, newdata = df, type = "response")
    df$mu2b_fit_011 <- predict(mu2_tmle_011, newdata = df, type = "response")
    df$mu2b_fit_100 <- predict(mu2_tmle_100, newdata = df, type = "response")
    df$mu2b_fit_101 <- predict(mu2_tmle_101, newdata = df, type = "response")
    df$mu2b_fit_110 <- predict(mu2_tmle_110, newdata = df, type = "response")
    df$mu2b_fit_111 <- predict(mu2_tmle_111, newdata = df, type = "response")
    
    # refit mu1
    
    mu1b_sl_000 <- SuperLearner(
      Y          = df$mu2b_fit_000[-cf_fold[[k]]],
      X          = aux_mu1,
      family     = gaussian(),
      obsWeights = aux$weight,
      SL.library = c("SL.mean", "SL.glmnet", "SL.ranger"),
      control    = list(saveFitLibrary = TRUE),
      cvControl  = list(V = 5L, shuffle = TRUE, validRows = NULL)
    )
    
    mu1b_sl_001 <- SuperLearner(
      Y          = df$mu2b_fit_001[-cf_fold[[k]]],
      X          = aux_mu1,
      family     = gaussian(),
      obsWeights = aux$weight,
      SL.library = c("SL.mean", "SL.glmnet", "SL.ranger"),
      control    = list(saveFitLibrary = TRUE),
      cvControl  = list(V = 5L, shuffle = TRUE, validRows = NULL)
    )
    
    mu1b_sl_010 <- SuperLearner(
      Y          = df$mu2b_fit_010[-cf_fold[[k]]],
      X          = aux_mu1,
      family     = gaussian(),
      obsWeights = aux$weight,
      SL.library = c("SL.mean", "SL.glmnet", "SL.ranger"),
      control    = list(saveFitLibrary = TRUE),
      cvControl  = list(V = 5L, shuffle = TRUE, validRows = NULL)
    )
    
    mu1b_sl_011 <- SuperLearner(
      Y          = df$mu2b_fit_011[-cf_fold[[k]]],
      X          = aux_mu1,
      family     = gaussian(),
      obsWeights = aux$weight,
      SL.library = c("SL.mean", "SL.glmnet", "SL.ranger"),
      control    = list(saveFitLibrary = TRUE),
      cvControl  = list(V = 5L, shuffle = TRUE, validRows = NULL)
    )
    
    mu1b_sl_100 <- SuperLearner(
      Y          = df$mu2b_fit_100[-cf_fold[[k]]],
      X          = aux_mu1,
      family     = gaussian(),
      obsWeights = aux$weight,
      SL.library = c("SL.mean", "SL.glmnet", "SL.ranger"),
      control    = list(saveFitLibrary = TRUE),
      cvControl  = list(V = 5L, shuffle = TRUE, validRows = NULL)
    )
    
    mu1b_sl_101 <- SuperLearner(
      Y          = df$mu2b_fit_101[-cf_fold[[k]]],
      X          = aux_mu1,
      family     = gaussian(),
      obsWeights = aux$weight,
      SL.library = c("SL.mean", "SL.glmnet", "SL.ranger"),
      control    = list(saveFitLibrary = TRUE),
      cvControl  = list(V = 5L, shuffle = TRUE, validRows = NULL)
    )
    
    mu1b_sl_110 <- SuperLearner(
      Y          = df$mu2b_fit_110[-cf_fold[[k]]],
      X          = aux_mu1,
      family     = gaussian(),
      obsWeights = aux$weight,
      SL.library = c("SL.mean", "SL.glmnet", "SL.ranger"),
      control    = list(saveFitLibrary = TRUE),
      cvControl  = list(V = 5L, shuffle = TRUE, validRows = NULL)
    )
    
    mu1b_sl_111 <- SuperLearner(
      Y          = df$mu2b_fit_111[-cf_fold[[k]]],
      X          = aux_mu1,
      family     = gaussian(),
      obsWeights = aux$weight,
      SL.library = c("SL.mean", "SL.glmnet", "SL.ranger"),
      control    = list(saveFitLibrary = TRUE),
      cvControl  = list(V = 5L, shuffle = TRUE, validRows = NULL)
    )
    
    df$mu1_fit_000 <- predict.SuperLearner(mu1b_sl_000, newdata = df_mu1n)$pred
    df$mu1_fit_001 <- predict.SuperLearner(mu1b_sl_001, newdata = df_mu1n)$pred
    df$mu1_fit_010 <- predict.SuperLearner(mu1b_sl_010, newdata = df_mu1y)$pred
    df$mu1_fit_011 <- predict.SuperLearner(mu1b_sl_011, newdata = df_mu1y)$pred
    df$mu1_fit_100 <- predict.SuperLearner(mu1b_sl_100, newdata = df_mu1n)$pred
    df$mu1_fit_101 <- predict.SuperLearner(mu1b_sl_101, newdata = df_mu1n)$pred
    df$mu1_fit_110 <- predict.SuperLearner(mu1b_sl_110, newdata = df_mu1y)$pred
    df$mu1_fit_111 <- predict.SuperLearner(mu1b_sl_111, newdata = df_mu1y)$pred
    
    # targeted mu1
    
    mu1_tmle_000 <- lm(mu2b_fit_000 ~ 0 + offset(mu1_fit_000) + w1_nn, weights = weight, data = df[cf_fold[[k]], ])
    mu1_tmle_001 <- lm(mu2b_fit_001 ~ 0 + offset(mu1_fit_001) + w1_nn, weights = weight, data = df[cf_fold[[k]], ])
    mu1_tmle_010 <- lm(mu2b_fit_010 ~ 0 + offset(mu1_fit_010) + w1_ny, weights = weight, data = df[cf_fold[[k]], ])
    mu1_tmle_011 <- lm(mu2b_fit_011 ~ 0 + offset(mu1_fit_011) + w1_ny, weights = weight, data = df[cf_fold[[k]], ])
    mu1_tmle_100 <- lm(mu2b_fit_100 ~ 0 + offset(mu1_fit_100) + w1_yn, weights = weight, data = df[cf_fold[[k]], ])
    mu1_tmle_101 <- lm(mu2b_fit_101 ~ 0 + offset(mu1_fit_101) + w1_yn, weights = weight, data = df[cf_fold[[k]], ])
    mu1_tmle_110 <- lm(mu2b_fit_110 ~ 0 + offset(mu1_fit_110) + w1_yy, weights = weight, data = df[cf_fold[[k]], ])
    mu1_tmle_111 <- lm(mu2b_fit_111 ~ 0 + offset(mu1_fit_111) + w1_yy, weights = weight, data = df[cf_fold[[k]], ])
    
    df$mu1b_fit_000 <- predict(mu1_tmle_000, newdata = df)
    df$mu1b_fit_001 <- predict(mu1_tmle_001, newdata = df)
    df$mu1b_fit_010 <- predict(mu1_tmle_010, newdata = df)
    df$mu1b_fit_011 <- predict(mu1_tmle_011, newdata = df)
    df$mu1b_fit_100 <- predict(mu1_tmle_100, newdata = df)
    df$mu1b_fit_101 <- predict(mu1_tmle_101, newdata = df)
    df$mu1b_fit_110 <- predict(mu1_tmle_110, newdata = df)
    df$mu1b_fit_111 <- predict(mu1_tmle_111, newdata = df)
    
    # refit mu0
    
    mu0b_sl_000 <- SuperLearner(
      Y          = df$mu1b_fit_000[-cf_fold[[k]]],
      X          = aux_mu0,
      family     = gaussian(),
      obsWeights = aux$weight,
      SL.library = c("SL.mean", "SL.glmnet", "SL.ranger"),
      control    = list(saveFitLibrary = TRUE),
      cvControl  = list(V = 5L, shuffle = TRUE, validRows = NULL)
    )
    
    mu0b_sl_001 <- SuperLearner(
      Y          = df$mu1b_fit_001[-cf_fold[[k]]],
      X          = aux_mu0,
      family     = gaussian(),
      obsWeights = aux$weight,
      SL.library = c("SL.mean", "SL.glmnet", "SL.ranger"),
      control    = list(saveFitLibrary = TRUE),
      cvControl  = list(V = 5L, shuffle = TRUE, validRows = NULL)
    )
    
    mu0b_sl_010 <- SuperLearner(
      Y          = df$mu1b_fit_010[-cf_fold[[k]]],
      X          = aux_mu0,
      family     = gaussian(),
      obsWeights = aux$weight,
      SL.library = c("SL.mean", "SL.glmnet", "SL.ranger"),
      control    = list(saveFitLibrary = TRUE),
      cvControl  = list(V = 5L, shuffle = TRUE, validRows = NULL)
    )
    
    mu0b_sl_011 <- SuperLearner(
      Y          = df$mu1b_fit_011[-cf_fold[[k]]],
      X          = aux_mu0,
      family     = gaussian(),
      obsWeights = aux$weight,
      SL.library = c("SL.mean", "SL.glmnet", "SL.ranger"),
      control    = list(saveFitLibrary = TRUE),
      cvControl  = list(V = 5L, shuffle = TRUE, validRows = NULL)
    )
    
    mu0b_sl_100 <- SuperLearner(
      Y          = df$mu1b_fit_100[-cf_fold[[k]]],
      X          = aux_mu0,
      family     = gaussian(),
      obsWeights = aux$weight,
      SL.library = c("SL.mean", "SL.glmnet", "SL.ranger"),
      control    = list(saveFitLibrary = TRUE),
      cvControl  = list(V = 5L, shuffle = TRUE, validRows = NULL)
    )
    
    mu0b_sl_101 <- SuperLearner(
      Y          = df$mu1b_fit_101[-cf_fold[[k]]],
      X          = aux_mu0,
      family     = gaussian(),
      obsWeights = aux$weight,
      SL.library = c("SL.mean", "SL.glmnet", "SL.ranger"),
      control    = list(saveFitLibrary = TRUE),
      cvControl  = list(V = 5L, shuffle = TRUE, validRows = NULL)
    )
    
    mu0b_sl_110 <- SuperLearner(
      Y          = df$mu1b_fit_110[-cf_fold[[k]]],
      X          = aux_mu0,
      family     = gaussian(),
      obsWeights = aux$weight,
      SL.library = c("SL.mean", "SL.glmnet", "SL.ranger"),
      control    = list(saveFitLibrary = TRUE),
      cvControl  = list(V = 5L, shuffle = TRUE, validRows = NULL)
    )
    
    mu0b_sl_111 <- SuperLearner(
      Y          = df$mu1b_fit_111[-cf_fold[[k]]],
      X          = aux_mu0,
      family     = gaussian(),
      obsWeights = aux$weight,
      SL.library = c("SL.mean", "SL.glmnet", "SL.ranger"),
      control    = list(saveFitLibrary = TRUE),
      cvControl  = list(V = 5L, shuffle = TRUE, validRows = NULL)
    )
    
    df$mu0_fit_000 <- predict.SuperLearner(mu0b_sl_000, newdata = df_mu0n)$pred
    df$mu0_fit_001 <- predict.SuperLearner(mu0b_sl_001, newdata = df_mu0n)$pred
    df$mu0_fit_010 <- predict.SuperLearner(mu0b_sl_010, newdata = df_mu0y)$pred
    df$mu0_fit_011 <- predict.SuperLearner(mu0b_sl_011, newdata = df_mu0y)$pred
    df$mu0_fit_100 <- predict.SuperLearner(mu0b_sl_100, newdata = df_mu0n)$pred
    df$mu0_fit_101 <- predict.SuperLearner(mu0b_sl_101, newdata = df_mu0n)$pred
    df$mu0_fit_110 <- predict.SuperLearner(mu0b_sl_110, newdata = df_mu0y)$pred
    df$mu0_fit_111 <- predict.SuperLearner(mu0b_sl_111, newdata = df_mu0y)$pred
    
    # targeted mu0
    
    mu0_tmle_000 <- lm(mu1b_fit_000 ~ 0 + offset(mu0_fit_000) + w0_n, weights = weight, data = df[cf_fold[[k]], ])
    mu0_tmle_001 <- lm(mu1b_fit_001 ~ 0 + offset(mu0_fit_001) + w0_n, weights = weight, data = df[cf_fold[[k]], ])
    mu0_tmle_010 <- lm(mu1b_fit_010 ~ 0 + offset(mu0_fit_010) + w0_n, weights = weight, data = df[cf_fold[[k]], ])
    mu0_tmle_011 <- lm(mu1b_fit_011 ~ 0 + offset(mu0_fit_011) + w0_n, weights = weight, data = df[cf_fold[[k]], ])
    mu0_tmle_100 <- lm(mu1b_fit_100 ~ 0 + offset(mu0_fit_100) + w0_y, weights = weight, data = df[cf_fold[[k]], ])
    mu0_tmle_101 <- lm(mu1b_fit_101 ~ 0 + offset(mu0_fit_101) + w0_y, weights = weight, data = df[cf_fold[[k]], ])
    mu0_tmle_110 <- lm(mu1b_fit_110 ~ 0 + offset(mu0_fit_110) + w0_y, weights = weight, data = df[cf_fold[[k]], ])
    mu0_tmle_111 <- lm(mu1b_fit_111 ~ 0 + offset(mu0_fit_111) + w0_y, weights = weight, data = df[cf_fold[[k]], ])
    
    df$mu0b_fit_000 <- predict(mu0_tmle_000, newdata = df)
    df$mu0b_fit_001 <- predict(mu0_tmle_001, newdata = df)
    df$mu0b_fit_010 <- predict(mu0_tmle_010, newdata = df)
    df$mu0b_fit_011 <- predict(mu0_tmle_011, newdata = df)
    df$mu0b_fit_100 <- predict(mu0_tmle_100, newdata = df)
    df$mu0b_fit_101 <- predict(mu0_tmle_101, newdata = df)
    df$mu0b_fit_110 <- predict(mu0_tmle_110, newdata = df)
    df$mu0b_fit_111 <- predict(mu0_tmle_111, newdata = df)
    
    main_list[[k]] <- df[cf_fold[[k]], ]
  }
  
  main_df <- reduce(main_list, bind_rows)
  
  for (s in 1:S){
    
    a1 <- estimands$a1[[s]]
    a2 <- estimands$a2[[s]]
    a3 <- estimands$a3[[s]]
    
    main_df <- main_df %>%
      mutate(
        
        wt0_deno = a1 * p0_fit + (1 - a1) * (1 - p0_fit),
        wt1_nume = a1 * p1_fit + (1 - a1) * (1 - p1_fit),
        wt1_deno = a2 * p1_fit + (1 - a2) * (1 - p1_fit),
        wt2_nume = a2 * p2_fit + (1 - a2) * (1 - p2_fit),
        wt2_deno = a3 * p2_fit + (1 - a3) * (1 - p2_fit),
        
        !!sym(paste0("w0_", a1, a2, a3)) := 1/trim(wt0_deno),
        !!sym(paste0("w1_", a1, a2, a3)) := !!sym(paste0("w0_", a1, a2, a3)) * wt1_nume/trim(wt1_deno),
        !!sym(paste0("w2_", a1, a2, a3)) := !!sym(paste0("w1_", a1, a2, a3)) * wt2_nume/trim(wt2_deno),
        
        !!sym(paste0("mu2fit_", a1, a2, a3)) := a3 * mu2_fit_a3y + (1 - a3) * mu2_fit_a3n,
        
        !!sym(paste0("mu1fit_", a1, a2, a3)) := a3 * a2 * mu1_fit_a3y_a2y + a3 * (1 - a2) * mu1_fit_a3y_a2n +
          (1 - a3) * a2 * mu1_fit_a3n_a2y + (1 - a3) * (1 - a2) * mu1_fit_a3n_a2n,
        
        !!sym(paste0("mu0fit_", a1, a2, a3)) := a3 * a2 * a1 * mu0_fit_a3y_a2y_a1y + a3 * a2 * (1 - a1) * mu0_fit_a3y_a2y_a1n +
          a3 * (1 - a2) * a1 * mu0_fit_a3y_a2n_a1y + a3 * (1 - a2) * (1 - a1) * mu0_fit_a3y_a2n_a1n +
          (1 - a3) * a2 * a1 * mu0_fit_a3n_a2y_a1y + (1 - a3) * a2 * (1 - a1) * mu0_fit_a3n_a2y_a1n +
          (1 - a3) * (1 - a2) * a1 * mu0_fit_a3n_a2n_a1y + (1 - a3) * (1 - a2) * (1 - a1) * mu0_fit_a3n_a2n_a1n,
        
        !!sym(paste0("www_", a1, a2, a3)) := as.double(att==a3) * !!sym(paste0("w2_", a1, a2, a3)) * vote2010,
        !!sym(paste0("iii_", a1, a2, a3)) := !!sym(paste0("mu0fit_", a1, a2, a3)),
        !!sym(paste0("eif_", a1, a2, a3)) := as.double(att==a3) * !!sym(paste0("w2_", a1, a2, a3)) *
          (vote2010 - !!sym(paste0("mu2fit_", a1, a2, a3))) +
          as.double(att==a2) * !!sym(paste0("w1_", a1, a2, a3)) * (!!sym(paste0("mu2fit_", a1, a2, a3)) -
                                                                         !!sym(paste0("mu1fit_", a1, a2, a3))) +
          as.double(att==a1) * !!sym(paste0("w0_", a1, a2, a3)) * (!!sym(paste0("mu1fit_", a1, a2, a3)) -
                                                                         !!sym(paste0("mu0fit_", a1, a2, a3))) +
          !!sym(paste0("mu0fit_", a1, a2, a3)),
        !!sym(paste0("tmle_", a1, a2, a3)) := as.double(att==a3) * !!sym(paste0("w2_", a1, a2, a3)) *
          (vote2010 - !!sym(paste0("mu2b_fit_", a1, a2, a3))) +
          as.double(att==a2) * !!sym(paste0("w1_", a1, a2, a3)) * (!!sym(paste0("mu2b_fit_", a1, a2, a3)) -
                                                                         !!sym(paste0("mu1b_fit_", a1, a2, a3))) +
          as.double(att==a1) * !!sym(paste0("w0_", a1, a2, a3)) * (!!sym(paste0("mu1b_fit_", a1, a2, a3)) -
                                                                         !!sym(paste0("mu0b_fit_", a1, a2, a3))) +
          !!sym(paste0("mu0b_fit_", a1, a2, a3)),
        !!sym(paste0("tmle2_", a1, a2, a3)) := !!sym(paste0("mu0b_fit_", a1, a2, a3))
      )
  }
  
  out[[i]] <- main_df
  
}

save.image(file = "05_vote_main.RData")

3.3 输出与绘制估计结果

我们同样载入所需包、导入数据、设定绘图主题后，根据之前的估计结果计算得到 ATE 与其分解所得的 PSEs 的估计值与标准误，并通过 ggplot 绘制图形。

rm(list = ls())
library(survey)
library(gbm)
library(ranger)
library(glmnet)
library(rsample)
library(caret)
library(rlang)
library(tidyverse)
library(Hmisc)
library(SuperLearner)
library(scales)
source("zmisc.R")
set.seed(02138)

load("05_vote_main.RData")

# set my ggplot theme
mytheme <- theme_minimal(base_size = 18) + 
  theme(legend.position = "bottom",
        plot.title = element_text(hjust = 0.5),
        plot.caption = element_text(color = "grey30"))
theme_set(mytheme)

out_df <- out %>%
  imap( ~ mutate(.x, imp = .y)) %>%
  reduce(bind_rows) %>%
  mutate(eif_type1_ate = eif_111 - eif_000,
         eif_type2_ate = eif_111 - eif_000,
         eif_type3_ate = eif_111 - eif_000,
         eif_type1_pse3 = eif_001 - eif_000,
         eif_type1_pse2 = eif_011 - eif_001,
         eif_type1_pse1 = eif_111 - eif_011,
         eif_type2_pse1 = eif_100 - eif_000,
         eif_type2_pse2 = eif_110 - eif_100,
         eif_type2_pse3 = eif_111 - eif_110,
         eif_type3_pse3 = eif_001 - eif_000,
         eif_type3_pse1 = eif_101 - eif_001,
         eif_type3_pse2 = eif_111 - eif_101,
         tmle_type1_ate = tmle_111 - tmle_000,
         tmle_type2_ate = tmle_111 - tmle_000,
         tmle_type3_ate = tmle_111 - tmle_000,
         tmle_type1_pse3 = tmle_001 - tmle_000,
         tmle_type1_pse2 = tmle_011 - tmle_001,
         tmle_type1_pse1 = tmle_111 - tmle_011,
         tmle_type2_pse1 = tmle_100 - tmle_000,
         tmle_type2_pse2 = tmle_110 - tmle_100,
         tmle_type2_pse3 = tmle_111 - tmle_110,
         tmle_type3_pse3 = tmle_001 - tmle_000,
         tmle_type3_pse1 = tmle_101 - tmle_001,
         tmle_type3_pse2 = tmle_111 - tmle_101) %>%
  group_by(imp) %>%
  summarise_at(vars(contains("type")), list(est = ~ wtd.mean(.x, weight),
                                            se = ~ sqrt(wtd.var(.x, weight)/length(.x)))) %>%
  ungroup() %>%
  pivot_longer(-imp) %>%
  separate(name, into = c("estimator", "type", "estimand", "measure")) %>%
  pivot_wider(names_from = measure, values_from = value) %>%
  group_by(estimator, type, estimand) %>%
  summarise(within_var = mean(se^2),
            between_var = var(est),
            total_var = within_var + (1 + 1/I)*between_var,
            est = mean(est),
            se = sqrt(total_var)) %>%
  ungroup() %>%
  filter(type == "type1")   %>%
  mutate(estimator = factor(estimator,
                            levels = c("eif", "tmle"),
                            labels = c(expression(paste("Estimating Equation (", hat(psi)[np]^eif2, ")")),
                                       expression(paste("TMLE (", hat(psi)[tmle]^eif2, ")")))),
         estimand = factor(estimand,
                           levels = rev(c("ate", "pse3", "pse2", "pse1")),
                           labels = rev(c(expression(paste("Total Effect (", psi[`111`]-psi[`000`], ")")),
                                          expression(paste("Direct Effect (", psi[`001`]-psi[`000`], ")")),
                                          expression(paste("via Civic/Political Interest (", psi[`011`]-psi[`001`], ")")),
                                          expression(paste("via Economic Status (", psi[`111`]-psi[`011`], ")")))))) 

ggplot(out_df, aes(x = estimand, y = est, shape = estimator)) +
  geom_pointrange(aes(ymin = est - 1.96 * se,  ymax = est + 1.96 * se),
                  position = position_dodge(width = - 0.5), size = 1) +
  geom_hline(yintercept = 0, linetype = 2) +
  scale_shape("", labels = parse_format()) +
  scale_color_discrete("", labels = parse_format()) +
  scale_x_discrete("", labels = parse_format()) +
  scale_y_continuous("Effects of College Attendance on Voting") +
  coord_flip()

out_df[, c("est", "se")] %>% round(3)

4. 估计结果解读

R 的最终估计结果如下所示：

est	se
0.153	0.023
0.008	0.005
0.045	0.008
0.1	0.021
0.157	0.022
0.002	0.005
0.051	0.008
0.103	0.021

我们将其与原文表格 (如下所示，括号中为标准误) 进行对比，可见结果是一致的。

	估计方程 ()	TMLE ()
平均总效应	0.152 (0.022)	0.156 (0.023)
经济状况路径 ( )	0.007 (0.005)	0.002 (0.005)
公民/政治兴趣路径 ( )	0.042 (0.008)	0.049 (0.008)
直接效应 ( )	0.103 (0.021)	0.105 (0.021)

此外，我们还通过 R 将以上结果可视化：

在理论篇中，我们知道本文的多重稳健估计量可以通过两种方法来估计，一是调整结果模型的估计方程，使包含逆概率权重的项等于零，只留下一个位于估计对象参数位置的迭代回归输入估计量，即表中的。二是使用目标极大似然估计方法，通过两步拟合每个结果模型，即表中的。从上表可知，这两个估计量均得到了类似的总效应和 PSEs 的估计值。大学入学率对选举中投票的效应有很大一部分是 “直接的”，也就是说，既不是通过提高经济状况路径发挥作用，也不是通过公民和政治兴趣路径发挥作用。

5. 结语

本推文对个因果有序中介设定下 ATE 与其各类分解——自然直接效应、自然间接效应\总间接效应、自然路径特定效应、累积路径特定效应——的建模与估计实操部分进行了展示。在后续拓展中，感兴趣的读者可以对此操作进行复现，在政策评估等实证研究中进行运用。

6. 参考资料

Pearl, J. (2009) Causality: models, reasoning, and inference. Cambridge: Cambridge University Press.
Robins, J.M. (2003) Semantics of causal dag models and the identification of direct and indirect effects. Highly Structured Stochastic Systems, 70–81.
Tchetgen Tchetgen, E.J. & Shpitser, I. (2012) Semiparametric theory for causal mediation analysis: efficiency bounds, multiple robustness, and sensitivity analysis. Annals of Statistics, 40, 1816. -PDF-
Avin, C., Shpitser, I. & Pearl, J. (2005) Identifiability of path-specific effects. In Proceedings of the 19th International Joint Conference on Artificial Intelligence, Morgan Kaufmann Publishers Inc, pp. 357–363.
Zhou, X., & Yamamoto, T. (2023). Tracing causal paths from experimental and observational data. The Journal of Politics, 85(1), 250-265. -PDF-
Vansteelandt, S., Bekaert, M. & Lange, T. (2012) Imputation strategies for the estimation of natural direct and indirect effects. Epidemiologic Methods, 1, 131–158. -PDF-
VanderWeele, T.J., Vansteelandt, S. & Robins, J.M. (2014) Effect decomposition in the presence of an exposure-induced mediator-outcome confounder. Epidemiology, 25, 300–306. -PDF-

7. 相关推文

Note：产生如下推文列表的 Stata 命令为：
lianxh
安装最新版 lianxh 命令：
ssc install lianxh, replace

专题：回归分析

中介效应分析：三段式中介效应模型真的适用于经济学研究吗？

专题：IV-GMM

Stata：基于IV的因果中介分析-ivmediate

专题：内生性-因果推断

Stata：一般化的因果中介分析

专题：交乘项-调节-中介

Stata：中介效应分析新命令-sgmediation2
medsem-中介效应：基于结构方程模型SEM的中介效应分析-T331
Stata：因果中介分析大比拼-T323
med4way：中介效应和交互效应分析
Stata：调节中介效应检验.md
Stata+R：一文读懂中介效应

🍓 课程推荐：2024 空间计量专题
主讲老师：范巧 (兰州大学)
课程时间：2024 年 10 月 2-4 日 (三天)
课程咨询：王老师 18903405450（微信）

🍉 扫码加入连享会微信群，提问交流更方便

http://mp.weixin.qq.com/s?__biz=MzU5MjYxNTgwMg==&mid=2247502864&idx=1&sn=b5e10d1d885ac03918f30dd857cb1a4b

君泉计量

交流学习经验，探讨论文写作