Double Machine Learning: ddml Worked Examples (Part 1)

Academic 2024-11-21 23:22 Shaanxi

This post walks through several worked examples of double machine learning (DDML) in Stata using the ddml command.


1. Check the installed command versions

which ddml
which pystacked
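
If `which` cannot find a command, it exits with an error instead of printing a version. The following is a minimal sketch of a guarded install, assuming both packages are taken from SSC; this is our assumption, not something the post specifies. The last line is there because pystacked relies on scikit-learn through Stata's Python integration.

```stata
* Install each package from SSC only if -which- cannot find it.
* Assumption: SSC is the desired source for both packages.
foreach pkg in ddml pystacked {
    capture which `pkg'
    if _rc {
        ssc install `pkg'
    }
}

* pystacked calls scikit-learn via Stata's Python integration (Stata 16+);
* this reports which Python installation Stata is currently linked to.
python query
```

After this runs, `which ddml` and `which pystacked` should both print a file path and version line.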

2. Partially linear model

2.1 Error: ddml estimate run before ddml crossfit

use sipp1991.dta, clear
global Y net_tfa
global D e401
global X tw age inc fsize educ db marr twoearn pira hown
set seed 42

ddml init partial, kfolds(2)
ddml E[Y|X]: reg $Y $X

The following command then fails:

ddml estimate, robust

The error message is:
ddml model not cross-fitted; call `ddml crossfit` first
r(198);

Solution: run ddml crossfit first, and only then ddml estimate.
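
The required order can be sketched as a short do-file. It is a minimal version that reuses the globals defined above and adds a single OLS learner per equation. The choice of learners is illustrative, and the partial model needs a learner for E[D|X] as well as for E[Y|X] before cross-fitting can run.

```stata
* Minimal sketch of the required command order (run as a do-file).
* Assumes sipp1991.dta is loaded and $Y, $D, $X are defined as above.
set seed 42
ddml init partial, kfolds(2)    // 1. initialize the model
ddml E[Y|X]: reg $Y $X          // 2. add a learner for the outcome equation
ddml E[D|X]: reg $D $X          //    and one for the treatment equation
ddml crossfit                   // 3. cross-fit the conditional expectations
ddml estimate, robust           // 4. estimate only after cross-fitting
```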

To update ddml to the latest version (and then open its help file), run:

ddml update
help ddml

The full session log follows. Note that ddml estimate fails until ddml crossfit has been run:

. use sipp1991.dta, clear

. global Y net_tfa

. global D e401

. global X tw age inc fsize educ db marr twoearn pira hown

. set seed 42

. ddml init partial, kfolds(2)
warning - model m0 already exists
all existing model results and variables will
be dropped and model m0 will be re-initialized

. ddml E[Y|X]: reg $Y $X
Learner Y1_reg added successfully.

. ddml E[Y|X]: pystacked $Y $X, type(reg) method(rf)
Learner Y2_pystacked added successfully.

. ddml E[D|X]: reg $D $X
Learner D1_reg added successfully.

. ddml E[D|X]: pystacked $D $X, type(reg) method(rf)
Learner D2_pystacked added successfully.

. ddml desc

Model:                  partial, crossfit folds k=2, resamples r=1
Dependent variable (Y): net_tfa
 net_tfa learners:      Y1_reg Y2_pystacked
D equations (1):        e401
 e401 learners:         D1_reg D2_pystacked
Specifications:         4 possible specs

. ddml estimate, robust
ddml model not cross-fitted; call `ddml crossfit` first
r(198);

. ddml crossfit
Cross-fitting E[y|X] equation: net_tfa
Cross-fitting fold 1 2 ...completed cross-fitting
Cross-fitting E[D|X] equation: e401
Cross-fitting fold 1 2 ...completed cross-fitting

. ddml estimate, robust

DDML estimation results:
spec  r     Y learner     D learner         b        SE
 opt  1  Y2_pystacked        D1_reg  7044.518(1126.896)
opt = minimum MSE specification for that resample.

Min MSE DDML model
y-E[y|X]  = Y2_pystacked_1                         Number of obs   =      9915
D-E[D|X,Z]= D1_reg_1
------------------------------------------------------------------------------
             |               Robust
     net_tfa | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
        e401 |   7044.518   1126.896     6.25   0.000     4835.843    9253.193
       _cons |  -317.8379   352.8666    -0.90   0.368    -1009.444     373.768
------------------------------------------------------------------------------


. ddml estimate, robust allcombos

DDML estimation results:
spec  r     Y learner     D learner         b        SE
   1  1        Y1_reg        D1_reg  5397.208(1130.776)
   2  1        Y1_reg  D2_pystacked  6705.740 (878.656)
*  3  1  Y2_pystacked        D1_reg  7044.518(1126.896)
   4  1  Y2_pystacked  D2_pystacked  6979.699 (753.471)
* = minimum MSE specification for that resample.

Min MSE DDML model
y-E[y|X]  = Y2_pystacked_1                         Number of obs   =      9915
D-E[D|X,Z]= D1_reg_1
------------------------------------------------------------------------------
             |               Robust
     net_tfa | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
        e401 |   7044.518   1126.896     6.25   0.000     4835.843    9253.193
       _cons |  -317.8379   352.8666    -0.90   0.368    -1009.444     373.768
------------------------------------------------------------------------------


. ddml estimate, robust spec(1) replay

DDML estimation results:
spec  r     Y learner     D learner         b        SE
 opt  1  Y2_pystacked        D1_reg  7044.518(1126.896)
opt = minimum MSE specification for that resample.

DDML model, specification 1
y-E[y|X]  = Y1_reg_1                               Number of obs   =      9915
D-E[D|X,Z]= D1_reg_1
------------------------------------------------------------------------------
             |               Robust
     net_tfa | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
        e401 |   5397.208   1130.776     4.77   0.000     3180.928    7613.488
       _cons |   -104.854   397.9023    -0.26   0.792     -884.728    675.0201
------------------------------------------------------------------------------
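
As a reference for reading the tables above: each row of the allcombos output applies the same partialling-out estimator to a different pair of learners. The sketch below uses generic textbook notation. The symbols $\theta_0$, $g_0$, $m_0$ and $\ell_0$ are assumptions of this note and do not appear in the ddml output.

```latex
% Partially linear model: Y = net_tfa, D = e401, X = controls
\begin{align}
Y &= \theta_0 D + g_0(X) + U, \qquad E[U \mid X, D] = 0, \\
D &= m_0(X) + V, \qquad E[V \mid X] = 0, \\
% \ell_0(X) = E[Y \mid X]; hats denote cross-fitted (out-of-fold) predictions
\hat\theta &= \frac{\sum_{i=1}^{n} \bigl(D_i - \hat m(X_i)\bigr)\bigl(Y_i - \hat\ell(X_i)\bigr)}
                  {\sum_{i=1}^{n} \bigl(D_i - \hat m(X_i)\bigr)^{2}}
\end{align}
```

The output lines "y-E[y|X]" and "D-E[D|X,Z]" name the learners used for $\hat\ell$ and $\hat m$. The regression ddml reports also includes a constant, which is why a _cons row appears. As we read the output, the "min MSE" specification pairs the learner with the lowest cross-fitted MSE in each equation, here Y2_pystacked and D1_reg.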

Case 2: Interactive model


. webuse cattaneo2, clear
(Excerpt from Cattaneo (2010) Journal of Econometrics 155: 138–154)

. global Y bweight

. global D mbsmoke

. global X prenatal1 mmarried fbaby mage medu

. set seed 42

. ddml init interactive, kfolds(5) reps(5)
warning - model m0 already exists
all existing model results and variables will
be dropped and model m0 will be re-initialized

. ddml E[Y|X,D]: pystacked $Y $X, type(reg) methods(ols gradboost)
Learner Y1_pystacked added successfully.

. ddml E[D|X]: pystacked $D $X, type(class) methods(logit gradboost)
Learner D1_pystacked added successfully.

. ddml crossfit

. ddml estimate
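
For orientation, the interactive model lets the treatment effect vary freely with the covariates, and the default target is the average treatment effect (ATE). The standard doubly robust score can be sketched in generic notation as follows. The symbols $g_0$ and $m_0$ are assumptions of this note, not ddml output.

```latex
% Interactive model: Y = bweight, D = mbsmoke (binary), X = controls
\begin{align}
Y &= g_0(D, X) + U, \qquad E[U \mid X, D] = 0, \\
D &= m_0(X) + V, \qquad E[V \mid X] = 0, \\
\hat\theta_{\mathrm{ATE}} &= \frac{1}{n}\sum_{i=1}^{n}\Bigl[
    \hat g(1, X_i) - \hat g(0, X_i)
    + \frac{D_i\,\bigl(Y_i - \hat g(1, X_i)\bigr)}{\hat m(X_i)}
    - \frac{(1 - D_i)\,\bigl(Y_i - \hat g(0, X_i)\bigr)}{1 - \hat m(X_i)}
\Bigr]
\end{align}
```

Because $m_0(X) = P(D = 1 \mid X)$ is a propensity score, the E[D|X] learner above is specified with type(class), while the outcome learner for E[Y|X,D] uses type(reg).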

Case 3: Double machine learning with instrumental variables (IV)

use AJR.dta, clear
global Y logpgp95
global D avexpr
global Z logem4
global X lat_abst edes1975 avelf temp* humid* steplow-oilres
set seed 42

ddml init iv, kfolds(30)

ddml E[Y|X]: reg $Y $X
ddml E[Y|X], vtype(none): rforest $Y $X, type(reg)
ddml E[D|X]: reg $D $X
ddml E[D|X], vtype(none): rforest $D $X, type(reg)
ddml E[Z|X]: reg $Z $X
ddml E[Z|X], vtype(none): rforest $Z $X, type(reg)

ddml crossfit, shortstack
ddml estimate, robust
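
The three ddml E[...] equations above map onto the three nuisance functions of the partially linear IV model. The sketch below uses generic notation. The symbols $\theta_0$, $g_0$, $\ell_0$, $m_0$ and $r_0$ are assumptions of this note.

```latex
% Partially linear IV model: Y = logpgp95, D = avexpr, Z = logem4
\begin{align}
Y &= \theta_0 D + g_0(X) + U, \qquad E[U \mid X, Z] = 0, \\
% \ell_0(X) = E[Y \mid X], \; m_0(X) = E[D \mid X], \; r_0(X) = E[Z \mid X];
% hats denote cross-fitted (out-of-fold) predictions
\hat\theta &= \frac{\sum_{i=1}^{n} \bigl(Z_i - \hat r(X_i)\bigr)\bigl(Y_i - \hat\ell(X_i)\bigr)}
                  {\sum_{i=1}^{n} \bigl(Z_i - \hat r(X_i)\bigr)\bigl(D_i - \hat m(X_i)\bigr)}
\end{align}
```

In words, the outcome, the treatment and the instrument are each residualized on the controls, and the residualized instrument is then used in an IV regression of the residualized outcome on the residualized treatment.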
