delta method：获取标准误

文摘 2024-09-08 22:04 山西

👇 连享会 · 推文导航 | www.lianxh.cn

🍎 Stata：Stata基础 | Stata绘图 | Stata程序 | Stata新命令
📘 论文：数据处理 | 结果输出 | 论文写作 | 数据分享
💹 计量：回归分析 | 交乘项-调节 | IV-GMM | 时间序列 | 面板数据 | 空间计量 | Probit-Logit | 分位数回归
⛳ 专题：SFA-DEA | 生存分析 | 爬虫 | 机器学习 | 文本分析
🔃 因果：DID | RDD | 因果推断 | 合成控制法 | PSM-Matching
🔨 工具：工具软件 | Markdown | Python-R-Stata
🎧 课程：最新专题 | 计量专题 | 关于连享会

🍓 课程推荐：2024 空间计量专题
主讲老师：范巧 (兰州大学)
课程时间：2024 年 10 月 2-4 日 (三天)
课程咨询：王老师 18903405450（微信）

课程特色 · 2024空间计量：

👉 一、从“零基础”到“高水平”的课程设计

兼顾基础知识、主流模型与前沿模型
既考虑软件安装、程序编写以及空间权重矩阵设计等 基础知识 讲授，更强调时空面板地理加权回归模型、贝叶斯空间计量模型、矩阵指数模型、空间计量交互模型与空间面板似不相关回归模型等 前沿模型 的传授。

👉 二、“保姆级”的空间计量代码

编写与校准所有模型的MATLAB代码，简化实操环节
模型的估计与检验等 仅按照提供的Excel数据版式 搜集与整理原始数据，即可一次性出结果并作图。

👉 三、“最多上新” 的内容体系

新增矩阵指数模型、短面板空间似不相关模型、空间计量交互模型、贝叶斯空间计量模型等
新增前沿应用案例，包括空间计量与索洛余值法、随机前沿分析与数据包络分析等的互嵌研究，阐释基于空间计量的产业空间结构优化评价方法。
新增 Dagum空间基尼系数、核密度估计、空间马尔科夫链与空间收敛性等内容，阐释现实研究中对空间收敛性的应用“谬误”。

温馨提示: 文中链接在微信中无法生效。请点击底部「阅读原文」。或直接长按/扫描如下二维码，直达原文：

作者：章青慈 (中央财经大学)
邮箱：Quincy_zqc@163.com

编者按：本文主要整理自以下内容，特此致谢！
Alex Gold, Nat Olin, Annie Wang. What is the Delta Method?. 2020. -Link-
Perraillon, M.C., 2020. Lecture, Interpreting model results: Marginal and incremental effects. -Link-

1. 引言

Delta method，又称为增量法或 Delta 法，是一种在概率统计、金融、经济学等领域广泛应用的概率分布逼近方法。该方法由 Cramér 在 20 世纪 40 年代奠定，可通过线性逼近来估计随机变量函数的概率分布，常用于计算复杂随机变量的方差、标准差等统计量，从而简化概率推断的计算过程。

Delta method 是一种半参数方法，它的基本思想是利用泰勒展开式，将一个复杂函数近似为线性函数，进而利用线性函数的性质进行估计。具体而言，假设有一个随机变量是由另一个随机变量通过可导函数得到的，即。Delta method 通过对在的均值处进行泰勒展开 (常用一阶)，得到一个线性逼近的表达式，进而利用这个线性表达式来估计的方差和标准差。

2. 理论背景

2.1 delta method 和泰勒展开

泰勒展开允许我们将可微函数重新表述为的导数 (的无限和) 的形式。更精确地说，一个在点处无限可微的函数在点的值可以写成以下形式：

在展开式中截取一定数量的项 (通常是两项) 之后停止，我们就可以得到一个的近似，这一点在计算复杂函数的标准误时尤其有用。

在预测边际水平下，当回归模型具有链接函数 (link function) 时，预测水平在协变量处的列向量为：

同一回归模型的预测效应是的函数，其导数为：

对于连续变量，这通常是一个实际导数 (即瞬时变化率)；对于离散变量 (分类变量)，这可能是在不同值下计算得到的之间的差，即一阶差分。

使用泰勒展开进行近似

利用泰勒展开，我们可以近似P，即随机变量Xβ在点Xβ附近的任意函数，形式如下：

预测水平的泰勒展开

对于预测水平，泰勒展开为：

注意这里的最后一项实际上为 0，但保留以展示泰勒展开的形式。实际应用中，我们会根据具体的值来计算。

分类变量预测效应的泰勒展开

对于分类变量的预测效应，我们试图估计在处的效应，其泰勒展开为：

这里，我们比较了两个不同值 ( 和 ) 下的效应。

连续变量预测效应的导数

对于连续变量的预测效应，边际效应是一个导数，因此其泰勒展开涉及二阶导数甚至更高阶导数。

2.2 delta method 的工作原理

在上述估计中，的方差是未知的，但可以用 delta method 计算近似的方差。

由于是一个已知点，其方差为零，因此上式简化为：

方差计算

上式中的第一项是雅可比矩阵，矩阵中的元素是预测水平关于回归变量的偏导数。并且，通过回归，我们直接获得的方差，即方差-协方差矩阵。方差的计算公式如下：

步骤总结

计算雅可比矩阵：首先，计算的逆链接函数的雅可比矩阵。
获取方差-协方差矩阵：从回归输出中获取方差-协方差矩阵，或以其他方式进行计算。
矩阵相乘：将雅可比矩阵的转置、原始方差-协方差矩阵、雅可比矩阵三者依次相乘 (“三明治”乘法)，所得矩阵的元素即是所求方差。

3. Stata 实操

3.1 命令简介

margins 命令可以进行边际预测，以及边际效应计算，通过 expression() 选项可以用于任何函数形式。在本文中，我们主要介绍如何通过 margins 命令实现 delta method，计算并提取回归的标准误。

3.2 具体应用

3.2.1 通过数值方法计算标准误

. use https://www.stata-press.com/data/r16/cattaneo2, clear
. gen sm_age = mbsmoke * mage
. qui reg bweight mbsmoke mage sm_age
. matrix list e(V)

symmetric e(V)[4,4]
            mbsmoke        mage      sm_age       _cons
mbsmoke   10733.267
   mage   71.343306   2.6610229
 sm_age  -403.72658  -2.6610229   15.868305
  _cons  -1997.5365  -71.343306   71.343306   1997.5365

* var(beta2)
. di e(V)[2,2]
2.6610229
* var (beta3)
. di e(V)[3,3]
15.868305
* cov(b2,b3)
. di e(V)[3,2]
-2.6610229
* 当mbsmoke=0时，边际效应的标准差是
. di sqrt(e(V)[2,2])
1.6312642
* 当mbsmoke=1时，边际效应的标准差是
. di sqrt(e(V)[2,2] + e(V)[3,3] + 2*e(V)[3,2])
3.6341825

3.2.2 使用 `margins` 命令计算标准误

. qui reg bweight i.mbsmoke##c.mage
. margins, dydx(mage) at(mbsmoke=(0 1)) vsquish

Average marginal effects                                 Number of obs = 4,642
Model VCE: OLS
Expression: Linear prediction, predict()
dy/dx wrt:  mage
1._at: mbsmoke = 0
2._at: mbsmoke = 1
------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
mage         |
         _at |
          1  |   11.36258   1.631264     6.97   0.000     8.164523    14.56063
          2  |  -3.950895   3.634182    -1.09   0.277    -11.07562    3.173831
------------------------------------------------------------------------------

使用 margins 命令计算出的 Delta-method std. err. 和上文计算结果一致。

3.2.3 复现 delta method 计算标准误的原理

. qui reg bweight i.mbsmoke##c.mage
. margins mbsmoke, nofvlabel

Predictive margins                                       Number of obs = 4,642
Model VCE: OLS
Expression: Linear prediction, predict()
------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
     mbsmoke |
          0  |   3409.435   9.221561   369.72   0.000     3391.356    3427.514
          1  |   3132.374   19.85928   157.73   0.000      3093.44    3171.308
------------------------------------------------------------------------------

* 提取雅可比矩阵并保存
. matrix list r(Jacobian)

r(Jacobian)[2,6]
                    0b.           1.               0b.mbsmoke#   1.mbsmoke#             
               mbsmoke      mbsmoke         mage      co.mage       c.mage        _cons
0.mbsmoke            0            0    26.504524            0            0            1
1.mbsmoke            0            1    26.504524            0    26.504524            1

. matrix J = r(Jacobian)

* 保存回归的方差-协方差矩阵
. matrix list e(V)

symmetric e(V)[6,6]
                     0b.           1.               0b.mbsmoke#   1.mbsmoke#             
                mbsmoke      mbsmoke         mage      co.mage       c.mage        _cons
0b.mbsmoke            0
 1.mbsmoke            0    10733.267
      mage            0    71.343306    2.6610229
0b.mbsmoke#
   co.mage            0            0            0            0
1.mbsmoke#
    c.mage            0   -403.72658   -2.6610229            0    15.868305
     _cons            0   -1997.5365   -71.343306            0    71.343306    1997.5365

* 复现delta method计算的标准差
. matrix Vrep = J*e(V)*J'
. matrix list Vrep

symmetric Vrep[2,2]
                   1.         1.
             mbsmoke    mbsmoke
0.mbsmoke  85.037195
1.mbsmoke  1.300e-12  394.39086

* 当mbsmoke=0时，边际效应的标准差是
. di sqrt(Vrep[1,1])
9.2215614
* 当mbsmoke=1时，边际效应的标准差是
. di sqrt(Vrep[2,2])
19.859277

需要注意的是，输入的 mbsmoke 变量取值有 2 种，回归输出的变量有 6 个，这种情况下雅可比矩阵应当是 6×2 矩阵，但 margins 命令返回的雅可比矩阵是 2×6 矩阵，实际上是一般雅可比矩阵的转置形式，所以此处计算方差的公式为。

4. R 语言实操

4.1 包简介

modmarg 包是基于 Stata 的 margins 命令，利用 delta method 进行计算的包。modmarg 通过 CRAN (Comprehensive R Archive Network) 直接提供，命令安装：

install.packages("modmarg")

4.2 具体应用

4.2.1 使用数据集 `margex` 进行逻辑回归分析

library(modmarg)
data(margex)
lg <- glm(outcome ~ treatment * age, data = margex, family = 'binomial')
summary(lg)

Call:
glm(formula = outcome ~ treatment * age, family = "binomial", 
    data = margex)
Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-1.3200  -0.6033  -0.3212  -0.1580   2.9628  
Coefficients:
              Estimate Std. Error z value Pr(>|z|)    
(Intercept)   -7.03092    0.50247 -13.993   <2e-16 ***
treatment      1.35170    0.62208   2.173   0.0298 *  
age            0.11060    0.01069  10.347   <2e-16 ***
treatment:age -0.01046    0.01301  -0.804   0.4216    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
    Null deviance: 2732.1  on 2999  degrees of freedom
Residual deviance: 2169.4  on 2996  degrees of freedom
AIC: 2177.4
Number of Fisher Scoring iterations: 6

treatment 变量在 0 或 1 处的平均结果是什么？仅通过回归结果，我们不能直接解释系数，因为 treatment 的作用还受 age 的水平影响。

4.2.2 计算边际效应

# Extract the n x k matrix of data
x <- model.matrix(lg)
# Extract the coefficients from the model (a column vector with k entries)
beta <- matrix(lg$coefficients)
# CONTROL:
# Set treatment and treatment:age to 0 for all observations
x[, "treatment"] <- 0
x[, 'treatment:age'] <- x[, 'treatment'] * x[, 'age']
# Get linear predictors
pred_ctl <- x %*% beta
# Apply the inverse link function to get predicted probabilities
pp_ctl <- 1 / (1 + exp(-pred_ctl))
# Get the average predicted probability
mean_pp_ctl <- mean(pp_ctl)
# TREATMENT:
# Set treatment to 1 and treatment:age to age for all observations
x[, "treatment"] <- 1
x[, 'treatment:age'] <- x[, 'treatment'] * x[, 'age']
# Get linear predictors
pred_treat <- x %*% beta
# Apply the inverse link function to get predicted probabilities
pp_treat <- 1 / (1 + exp(-pred_treat))
# Get the average predicted probability
mean_pp_treat <- mean(pp_treat)
# RESULTS:
mean_pp_ctl
## [1] 0.1126685
mean_pp_treat
## [1] 0.208375

4.2.3 应用 delta method 计算标准误

# Get the data
x <- model.matrix(lg)
# CONTROL ERROR
# Apply the derivative of the inverse link function to the linear predictors
deriv <- as.vector(exp(-pred_ctl) / (1 + exp(-pred_ctl))^2)
# Set treatment to 0
x[, 'treatment'] <- 0
x[, 'treatment:age'] <- x[, 'treatment'] * x[, 'age']
# Complete the chain rule by matrix-multiplying the derivatives by the data,
# now we have the jacobian
j <- deriv %*% x / nrow(x)
# The variance of our estimate is the cross product of the jacobian and the model's
# variance-covariance matrix
variance <- j %*% vcov(lg) %*% t(j)
# The error is the square root of that
se_ctl <- sqrt(diag(variance))
# TREATMENT ERROR: same logic
deriv <- as.vector(exp(-pred_treat) / (1 + exp(-pred_treat))^2)
x <- model.matrix(lg)
x[, 'treatment'] <- 1
x[, 'treatment:age'] <- x[, 'treatment'] * x[, 'age']
j <- deriv %*% x / nrow(x)
variance <- j %*% vcov(lg) %*% t(j)
se_treat <- sqrt(diag(variance))
se_ctl
## [1] 0.009334252
se_treat
## [1] 0.009089667

4.2.4 汇总计算结果

result <- data.frame(
  Label = c("treatment = 0", "treatment = 1"),
  Margin = c(mean_pp_ctl, mean_pp_treat),
  Standard.Error = c(se_ctl, se_treat)
)

result
          Label    Margin Standard.Error
1 treatment = 0 0.1126685    0.009334252
2 treatment = 1 0.2083750    0.009089667

5. 相关推文

Note：产生如下推文列表的 Stata 命令为：
lianxh 边际效应, m
安装最新版 lianxh 命令：
ssc install lianxh, replace

专题：交乘项-调节-中介

刘聪聪, 陈点点, 2020, Stata：interflex-交乘项该这么分析!, 连享会 No.121.

专题：Probit-Logit

展一帆, 周依仿, 2021, Logit-Probit：非线性模型中交互项的边际效应解读, 连享会 No.616.

专题：分位数回归

曹钰潇, 2022, Stata分位数回归I：理解边际效应和条件边际效应, 连享会 No.857.

专题：回归分析

江鑫, 2021, Stata：边际效应知多少？f_able命令（上）, 连享会 No.689.
江鑫, 2021, Stata：边际效应知多少？f_able命令（下）, 连享会 No.690.
祁本章, 2021, Logit-Probit中的交乘项及边际效应图示, 连享会 No.575.
谭炜荣, 2024, Stata：组间边际效应差异检验-gsem, 连享会 No.1442.

专题：Stata教程

连享会, 2021, Stata-Python交互-5：边际效应三维立体图示, 连享会 No.555.
连享会, 2020, Stata: 手动计算和图示边际效应, 连享会 No.226.
连玉君, 2020, Stata：图示连续变量的连续边际效应, 连享会 No.189.
连玉君, 杨柳, 2020, Stata: 边际效应分析, 连享会 No.64.
连玉君, 杨柳, 2020, Stata：Logit模型一文读懂, 连享会 No.170.

专题：面板数据

郭盼亭, 2022, Stata：面板Logit的边际效应和处理效应估计-mfelogit, 连享会 No.1127.
陈希, 2022, Stata：如何理解三个变量的交乘项？, 连享会 No.1118.
陈美琪, 2023, Stata：展示OLS和GLM的交乘项(一)-icalc, 连享会 No.1157.
陈美琪, 2023, Stata：展示OLS和GLM的交乘项(二)-icalc, 连享会 No.1166.

专题：交互项

陈贤孟, 2022, 解读交互项及其边际效应, 连享会 No.1075.