Very practical! One figure that covers the baseline regression, robustness checks, heterogeneity analysis, placebo test, and further extensions

Academic   2024-12-17 08:31   Jiangsu


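Everyone who works in econometrics follows this account.
Email: econometrics666@126.com
All code from the Econometrics Circle (计量经济圈) methodology series, along with macro and micro databases and assorted software, is kept in the community. You are welcome to visit the Econometrics Circle community to exchange ideas.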

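The kind of figure shown below is highly intuitive. A single figure can sometimes say more than a long explanation, and can even be more effective than several tables. For example, once the baseline regression is done, the results of the follow-up analyses (robustness checks, heterogeneity analysis, placebo tests, and so on) can all be collected in one figure. Each row of the figure shows one specification's coefficient and its confidence interval, so every result can be compared directly with the baseline estimate and the differences and similarities are easy to see.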
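To make the idea concrete before turning to the replication code, here is a minimal sketch of such a figure. It uses Stata's built-in auto dataset rather than the paper's data, and the three specifications, the variable names `b`, `se`, `counter`, and the label name `speclbl` are illustrative assumptions. The pattern is the same as in the code further down: store each coefficient and standard error in one row, build the interval bounds, and draw them with `rcap` and `scatter`.

```stata
* Illustrative sketch on Stata's built-in auto data (not the paper's data)
sysuse auto, clear

* Three hypothetical specifications: a baseline plus two sets of added controls
local spec1 ""
local spec2 "mpg"
local spec3 "mpg foreign"

gen b = .
gen se = .
gen counter = .

forvalues i = 1/3 {
    quietly regress price weight `spec`i'', vce(robust)
    replace b  = _b[weight]  in `i'
    replace se = _se[weight] in `i'
    * Reverse the order so that the baseline is drawn at the top
    replace counter = 4 - `i' in `i'
}

* 90% confidence bounds from the normal approximation
gen upper = b + invnormal(0.95)*se
gen lower = b - invnormal(0.95)*se
gen significant = (upper < 0 | lower > 0) if !missing(b)

* Row labels shown on the vertical axis
label define speclbl 3 "1) Baseline" 2 "2) + mpg" 1 "3) + mpg, foreign"
label values counter speclbl

twoway (rcap upper lower counter, horizontal) ///
       (scatter counter b if significant==1, mcolor(red) msymbol(d)) ///
       (scatter counter b if significant==0, mcolor(edkblue) msymbol(dh)), ///
       ytitle("") legend(off) xline(0) ///
       ylabel(1 2 3, valuelabel angle(0)) yscale(range(0.5 3.5))
```

Estimates whose interval excludes zero are drawn as solid red diamonds and the rest as hollow blue ones, which mirrors the convention in the replication code.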

Below are the data and code from Furman, Jeffrey L., Markus Nagler, and Martin Watzinger. 2021. "Disclosure and Subsequent Innovation: Evidence from the Patent Depository Library Program." American Economic Journal: Economic Policy, 13 (4): 239–70. After some light debugging, the code runs directly and reproduces the figures in the original paper.
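The code below hard-codes the folder D:/下载/129581-V1/Kit in every `use` and `graph export` call. One way to adapt it to another machine is to keep the root folder in a global macro, as sketched here. The macro names `root`, `data`, and `results` are my own and are not part of the replication kit.

```stata
* Hypothetical macros: point root at wherever the replication kit was unzipped
global root    "D:/下载/129581-V1/Kit"
global data    "$root/data"
global results "$root/results"

* Create the results folder if it is missing (the error is ignored if it exists)
capture mkdir "$results"

* Paths in the do-file could then be written as, for example:
*   use "$data/main_work", clear
*   graph export "$results/Fig6.pdf", replace
display "Data folder:    $data"
display "Results folder: $results"
```

With this in place, only the first line needs to change when the kit is moved.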

In addition, see also: "Presenting heterogeneity analysis results graphically, for a more intuitive impression!"

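*Community members can download the complete data and code from the community.

**Debugged by Econometrics Circle (计量经济圈); the code runs as-is and produces the results, although some commands, such as gtools, need to be installed first.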
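The code uses several user-written commands: `gduplicates` (gtools), `eststo`/`esttab`/`estadd`, `cem`, and `labmask`. A minimal sketch for checking and installing them follows. The SSC package names listed are my assumption about where each command lives, so confirm them with `search` if an install fails.

```stata
* Commands to look for, and the SSC package assumed to provide each one
local cmds "gtools eststo cem labmask"
local pkgs "gtools estout cem labutil"

local n : word count `cmds'
forvalues i = 1/`n' {
    local c : word `i' of `cmds'
    local p : word `i' of `pkgs'
    * which returns a nonzero code when the command is not on the ado-path
    capture which `c'
    if _rc {
        display as text "`c' not found; installing `p' from SSC"
        ssc install `p', replace
    }
}
```

Running this once before the main do-file avoids "command not found" errors partway through.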

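* Load the main sample and build the difference-in-differences terms
use "D:/下载/129581-V1/Kit/data/main_work", clear
gen post   = yearsopen10>=0
* xi creates _IposXpaten_1, the post x patent_lib interaction used below
xi i.post*patent_lib, noomit
xtset identifier yearsopen10
* Coarsened exact matching on pat_ID and stateid (#0 requests no coarsening); keep matched observations
cem pat_ID(#0) stateid(#0), treatment(treated)
drop if cem_matched==0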

* Baseline
eststo r0: xtreg pat_pop_distance _IposXpaten_1 post  patent_lib i.yearID2 [aweight=cem_weights],  fe cluster(pat_ID) 
sum pat_pop_distance 
estadd scalar mean_scalar = r(mean)

* Add matching on university library
drop uni_library
gen uni_library=regexm(ParentInstitutionofLibrary, "University")
gen control_uni=regexm(LibraryType, "Academic")
replace uni_library = 0 if control_uni==0
xi i.post*patent_lib, noomit
xtset identifier yearsopen10

preserve
cem pat_ID(#0) stateid(#0) uni_library(#0), treatment(treated)
drop if cem_matched==0

eststo r3: xtreg pat_pop_distance _IposXpaten_1 post  patent_lib i.yearID2 [aweight=cem_weights],  fe cluster(pat_ID) 
sum pat_pop_distance 
estadd scalar mean_scalar = r(mean)
restore

* Add matching on patenting per capita
preserve
bysort pat_ID identifier: egen sim_size=sum((yearsopen10<0)*pat_pop_distance)
cem pat_ID(#0) stateid(#0)  uni_library(#0) sim_size(#5), treatment(treated)
drop if cem_matched==0

eststo r4: xtreg pat_pop_distance _IposXpaten_1 post  patent_lib i.yearID2 [aweight=cem_weights],  fe cluster(pat_ID) 
sum pat_pop_distance 
estadd scalar mean_scalar = r(mean)
restore

* Add matching on patenting per capita and on population
preserve
bysort pat_ID identifier: egen sim_size=sum((yearsopen10<0)*pat_pop_distance)
bysort pat_ID identifier: egen sim_size_pop=sum((yearsopen10<0)*pop_15m)
cem pat_ID(#0) stateid(#0)  uni_library(#0) sim_size(#5) sim_size_pop(#5), treatment(treated)
drop if cem_matched==0

eststo r5: xtreg pat_pop_distance _IposXpaten_1 post  patent_lib i.yearID2 [aweight=cem_weights],  fe cluster(pat_ID) 
sum pat_pop_distance 
estadd scalar mean_scalar = r(mean)
restore

* Distance match: < 100 miles
use "D:/下载/129581-V1/Kit/data/main_work_distance_matched", clear
keep if pat_FDL_dist<=100 | pat_FDL_dist==.
gen post   = yearsopen10>=0
xi i.post*patent_lib, noomit
xtset identifier yearsopen10
cem pat_ID(#0), treatment(treated)
drop if cem_matched==0

eststo r6: xtreg pat_pop_distance _IposXpaten_1 post  patent_lib i.yearID2 [aweight=cem_weights],  fe cluster(pat_ID) 
sum pat_pop_distance 
estadd scalar mean_scalar = r(mean)
* Distance match: 5 closest
use "D:/下载/129581-V1/Kit/data/main_work_distance_matched", clear
replace pat_FDL_dist = -pat_FDL_dist
bysort pat_ID yearsopen patent_lib : egen rank = rank(pat_FDL_dist) if pat_FDL_dist!=0,  track
keep if rank<6 | rank==.
egen yearID = group(yearsopen10)
gen post   = yearsopen10>=0
xi i.post*patent_lib, noomit
xtset identifier yearsopen10
cem pat_ID(#0), treatment(treated)
drop if cem_matched==0

eststo r7: xtreg pat_pop_distance _IposXpaten_1 post  patent_lib i.yearID2 [aweight=cem_weights],  fe cluster(pat_ID) 
sum pat_pop_distance 
estadd scalar mean_scalar = r(mean)

* Synthetic

u "D:/下载/129581-V1/Kit/data/patent_characteristics", clear
drop if appln_y>2010
drop if publn_y>2010
drop if lat<23
drop if lon<-130
replace appln_id = inpadoc_family_id
gen summer_other=1 
keep appln_id appln_y summer_other
gduplicates drop
collapse (sum) summer_other, by(appln_y)
save "D:/下载/129581-V1/Kit/data/patent_no_by_year.dta", replace

use "D:/下载/129581-V1/Kit/data/main_work", clear
keep if patent_lib==1
duplicates drop pat_ID yearsopen10, force
drop patent_lib
merge m:1 appln_y using "D:/下载/129581-V1/Kit/data/patent_no_by_year.dta"
keep if _merge==3
gen share = pat_15m/(summer_other)
replace share = . if yearsopen10!=-1
bysort pat_ID: egen mean_share = sum(share)
gen pat_pop_distance_0 = (mean_share*(summer_other))/pop_15m*100000
ren pat_pop_distance pat_pop_distance_1
reshape long pat_pop_distance_, i(pat_ID yearsopen10) j(patent_lib)
gen post   = yearsopen10>=0
xi i.post*patent_lib, noomit
drop identifier
egen identifier = group(pat_ID patent_lib)
xtset identifier yearsopen10

eststo rs: xtreg pat_pop_distance_ i.yearID2 _IposXpaten_1 post  patent_lib ,  fe cluster(pat_ID) 
sum pat_pop_distance_
estadd scalar mean_scalar = r(mean)

erase "D:/下载/129581-V1/Kit/data/patent_no_by_year.dta"

* Test of SUTVA
use "D:/下载/129581-V1/Kit/data/main_work", clear
cap drop strata_id mean
drop if patent_lib==1
bys pat_ID: egen cf=min(pat_FDL_dist)
replace patent_lib=(pat_FDL_dist==cf)
replace treated = patent_lib

egen strata_id = group(pat_ID stateid)
bysort strata_id: egen mean = mean(treated)
drop if mean==0 | mean==1
keep if yearsopen10>=-5 & yearsopen10<=5
gen post   = yearsopen10>=0
xi i.post*patent_lib, noomit
xtset identifier yearsopen10
cem pat_ID(#0) stateid(#0), treatment(treated)
drop if cem_matched==0
eststo r2: xtreg pat_pop_distance _IposXpaten_1 post  patent_lib i.yearID2 [aweight=cem_weights], fe cluster(pat_ID) 
sum pat_pop_distance
estadd scalar mean_scalar = r(mean)

* Distance
use "D:/下载/129581-V1/Kit/data/main_work", clear
gen post   = yearsopen10>=0
xi i.post*patent_lib, noomit
xtset identifier yearsopen10
cem pat_ID(#0) stateid(#0), treatment(treated)
drop if cem_matched==0

foreach x in 50 100  {
eststo rd`x': xtreg pat_pop_`x'm _IposXpaten_1 post  patent_lib i.yearID2 [aweight=cem_weights] , fe cluster(pat_ID) 
sum pat_pop_`x'm
estadd scalar mean_scalar = r(mean)
}

label var _IposXpaten_1 "Pat Lib x Post "
label var post   "Post "
label var patent_lib "Patent Library"

esttab r0 r3 r4 r5 r6 r7 rs r2  r0 rd50 rd100  using "D:/下载/129581-V1/Kit/results/TabB2.tex",  posthead("") keep( _IposXpaten_1 post )order( post _IposXpaten_1 )   fragment compress nogaps nodepvar noisily ///
stats(mean_scalar r2 N, fmt(%9.1f %9.2f %18.0g) labels(`"Mean Dep."' `"R2 (within)"' `"Obs."')) nonotes nonumber nomtitle b(1) se(1) obslast  ///
star(* 0.10 ** 0.05 *** 0.01) label  replace

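* r(coefs) is left behind by esttab; the loop below assumes three columns per model,
* with the point estimate first and its standard error second
mat b= r(coefs)
cap drop b se counter
gen b = .
gen se = . 
gen counter = .
* Row 2 should be _IposXpaten_1 (listed second in order()); copy model i into row i
forvalues i = 1(1)11{
local j = (`i'*3)-2
local k = `j'+1
replace b = b[2,`j'] in `i'
replace se = b[2,`k'] in `i'
replace counter = `i' in `i'
}

* 1.68 is the multiplier used in the replication code, i.e. roughly a 90% interval
* (the normal critical value would be 1.645)
gen upper = b + 1.68*se
gen lower = b - 1.68*se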

replace counter = (counter*-1)+12.5

gen significant = upper<0 | lower>0
replace lower=. if upper==.

replace counter = -4 if counter == 1.5
replace counter = -3 if counter == 2.5
replace counter = -2 if counter == 3.5
replace counter = 0.5 if counter == 4.5
replace counter = 2 if counter == 5.5
replace counter = 3.5 if counter == 6.5
replace counter = 4.5 if counter == 7.5
replace counter = 7 if counter == 8.5
replace counter = 8 if counter == 9.5
replace counter = 9 if counter == 10.5

gen mlabel = ""
replace mlabel = "Increase by 3.2 / 18% relative to baseline"  if counter == 11.5
replace mlabel = "2.9 / 17%"  if counter == 9
replace mlabel = "2.4 / 19%"  if counter == 8
replace mlabel = "2.4 / 20%"  if counter == 7
replace mlabel = "1.9 / 8%"  if counter == 4.5
replace mlabel = "3.3 / 16%"  if counter == 3.5
replace mlabel = "2.4 / 12%"  if counter == 2
replace mlabel = "-0.5 / -3%"  if counter == 0.5
replace mlabel = "3.2 / 18%"  if counter == -2
replace mlabel = "2.1 / 12%"  if counter == -3
replace mlabel = "-1.4 / -7%"  if counter == -4

twoway (rcap upper lower counter, horizontal color(edkblue))  (scatter counter b if significant==1, color(red) msymbol(d) mlabel(mlabel) mlabcolor(black) mlabsize(2) mlabp(6) ) (scatter counter b if significant==0, color(edkblue) msymbol(dh) mlabel(mlabel) mlabcolor(black) mlabsize(2) mlabp(6)), ytitle("") legend(off) xline(0)  ylabel(-4 "11) in 50-100 mi"  -3 "10) in 15-50 mi" -2 "9) <= 15 mi" -1 "{bf:Patents p.c. in...}" 0.5 "8) Pseudo opening"  2 "7) Synthetic libraries" 3.5 "6) 5 closest" 4.5 "5) <100 mi" 5.5 "{bf:Distance Match...}" 7 "4) + Population" 8 "3)  + Patent p.c." 9 "2) +University" 10 "{bf:Add matching on...}" 11.5 "1) Baseline", angle(0)      )   ysize(6)  xline(3.216079, lc(gs12)) yscale(range(-4.5 11.5))
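graph export "D:/下载/129581-V1/Kit/results/Fig6.pdf", replace 

**Debugged by Econometrics Circle (计量经济圈); the code runs as-is, although some commands, such as labmask, need to be installed first.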

use "D:/下载/129581-V1/Kit/data/main_work", clear
egen yearID = group(yearsopen10)
gen post   = yearsopen10>=0
xi i.post*patent_lib, noomit
xtset identifier yearsopen10
cem pat_ID(#0) stateid(#0) $matched, treatment(treated)
drop if cem_matched==0

gen b = .
gen se = .
gen subcategory = ""
gen counter = .
ren pat_pop_distance pat_mainarea30_0
unab k: pat_mainarea30_*

local i = 1
foreach x in  `k' {
quietly{
xtreg `x' _IposXpaten_1 post  patent_lib i.yearID2 [aweight=cem_weights], fe cluster(pat_ID) 
replace b = _b[_IposXpaten_1] in `i'
replace se = _se[_IposXpaten_1] in `i'
replace subcategory = "`x'" in `i'
replace counter = `i' in `i'
local i = `i'+1
}
}

keep if counter!=.
gen upper = b + 1.68*se
gen lower = b - 1.68*se

replace subcategory = subinstr(subcategory, "pat_mainarea30_", "", .)

destring subcategory, replace
ren subcategory mainarea30
replace counter = (counter*-1)+37
joinby mainarea30 using "D:/下载/129581-V1/Kit/data/mainarea35_area35", unmatched(master)
drop mainarea30 _merge
joinby area30 using "D:/下载/129581-V1/Kit/data/mainarea35_area35", unmatched(master)
replace mainarea30=0 if mainarea30==.
keep b se mainarea30 counter upper lower
duplicates drop
sort mainarea30
decode mainarea30, gen(label_ex)

cap drop counter
drop if upper==.
gsort - b
gen counter = _n

replace counter = 7-counter
labmask counter, values(label_ex) 

gen significant = upper<0 | lower>0
replace lower=. if upper==.
twoway (rcap upper lower counter, horizontal color(edkblue))  (scatter counter b if significant==1, color(red) msymbol(d)) (scatter counter b if significant==0, color(edkblue) msymbol(dh)), ytitle("") legend(off) xline(0)  ylabel(0  1 2 3 4 5 6 "Baseline" ,  valuelabel     )   ysize(4)  
graph export "D:/下载/129581-V1/Kit/results/Fig7.pdf", replace 

use "D:/下载/129581-V1/Kit/data/main_work", clear
egen yearID = group(yearsopen10)
gen post   = yearsopen10>=0
xi i.post*patent_lib, noomit
xtset identifier yearsopen10
cem pat_ID(#0) stateid(#0), treatment(treated)
drop if cem_matched==0
egen group_year = group(appln_y pat_ID)

gen b = . 
gen se = .
gen name = ""
gen counter = 0

local i = 1
foreach x in pat_pop_distance pat_independent_pc pat_company_pc   pat_uni_hosp_pc pat_non_profit_pc  {
xtreg `x' _IposXpaten_1 post  patent_lib i.yearID2 [aweight=cem_weights],fe  cluster(pat_ID) 
sum `x'
estadd scalar mean_scalar = r(mean)
replace b = _b[ _IposXpaten_1] in `i'
replace se = _se[ _IposXpaten_1] in `i'
replace counter = `i' in `i'
replace name = "`x'" in `i'
local i = `i'+1
}

xtreg pat_pop_distance _IposXpaten_1 post  patent_lib i.yearID2 [aweight=cem_weights] if hist_high==1,fe  cluster(pat_ID)
sum pat_pop_distance if hist_high==1
estadd scalar mean_scalar = r(mean)
replace b = _b[ _IposXpaten_1] in `i'
replace se = _se[ _IposXpaten_1] in `i'
replace counter = `i' in `i'
replace name = "High" in `i'
local i = `i'+1

xtreg pat_pop_distance _IposXpaten_1 post  patent_lib i.yearID2 [aweight=cem_weights] if hist_high==0,  fe cluster(pat_ID)
sum pat_pop_distance if hist_high==0
estadd scalar mean_scalar = r(mean)
replace b = _b[ _IposXpaten_1] in `i'
replace se = _se[ _IposXpaten_1] in `i'
replace counter = `i' in `i'
replace name = "Low" in `i'
local i = `i'+1

bysort pat_ID: egen uni_pat_lib = max(patent_lib*uni_library)

xtreg pat_pop_distance _IposXpaten_1 post  patent_lib i.yearID2  [aweight=cem_weights] if uni_pat_lib==1 ,  fe cluster(pat_ID)
sum pat_pop_distance if uni_library==1
estadd scalar mean_scalar = r(mean)

xtreg pat_pop_distance _IposXpaten_1 post  patent_lib i.yearID2 [aweight=cem_weights] if uni_pat_lib==0 , fe  cluster(pat_ID)
sum pat_pop_distance if uni_library==0
estadd scalar mean_scalar = r(mean)

gen upper = b + 1.68*se
gen lower = b - 1.68*se

replace counter = (counter*-1)+8

gen significant = upper<0 | lower>0
twoway (rcap upper lower counter, horizontal)  (scatter counter b if significant==1, color(red) msymbol(d)) (scatter counter b if significant==0, color(edkblue) msymbol(dh)), ytitle("") ysize(4) legend(off) xline(0) ylabel(1 "Historically Low Patenting Regions" 2 "Historically High Patenting Regions" 3 "Patents assigned to Government / Military / Non-Profit" 4 "Patents assigned to Universities" 5 "Patents assigned to Companies" 6 "Patents assigned to Individual Inventor" 7 "Baseline" ,  valuelabel labsize(small)) 

graph export "D:/下载/129581-V1/Kit/results/FigB3.pdf", replace

use "D:/下载/129581-V1/Kit/data/main_work", clear
egen yearID = group(yearsopen10)
gen post   = yearsopen10>=0
xi i.post*patent_lib, noomit
xtset identifier yearsopen10
cem pat_ID(#0) stateid(#0) $matched, treatment(treated)
drop if cem_matched==0

gen b = .
gen se = .
gen subcategory = ""
gen counter = .
unab k: pat_area30_*

local i = 1
foreach x in  `k' {
quietly{
xtreg `x' _IposXpaten_1 post  patent_lib i.yearID2 [aweight=cem_weights], fe cluster(pat_ID) 
replace b = _b[_IposXpaten_1] in `i'
replace se = _se[_IposXpaten_1] in `i'
replace subcategory = "`x'" in `i'
replace counter = `i' in `i'
local i = `i'+1
}
}

keep if counter!=.

gen upper = b + 1.68*se
gen lower = b - 1.68*se

replace subcategory = subinstr(subcategory, "pat_area30_", "", .)

destring subcategory, replace
ren subcategory area30
replace counter = (counter*-1)+37

merge m:1 area30 using "D:/下载/129581-V1/Kit/data/area30_label", update

cap drop counter
drop if upper==.
gsort - area30
gen counter = _n

replace counter = counter+2 if counter>2
replace counter = counter+2 if counter>10
replace counter = counter+2 if counter>17
replace counter = counter+2 if counter>27
replace counter = counter+2 if counter>33

set obs 36
replace counter = 3 in 31
replace counter = 11 in 32
replace counter = 18 in 33
replace counter = 28 in 34
replace counter = 34 in 35
replace counter = 41 in 36

replace label_ex = "Other Fields" if counter==3
replace label_ex = "Mechanical Engineering"  if counter==11
replace label_ex = "Process Engineering"  if counter==18
replace label_ex ="Chemistry"  if counter==28
replace label_ex = "Instruments"  if counter==34
replace label_ex = "Electrical Engineering"  if counter==41

labmask counter, values(label_ex) 
gen significant = upper<0 | lower>0

preserve 
keep if counter>18
twoway (rcap upper lower counter, horizontal)  (scatter counter b if significant==1 , color(red) msymbol(d)) (scatter counter b if significant==0, color(edkblue) msymbol(dh)), ytitle("") legend(off) xline(0)  ylabel(  20  21 22 23 24 25 26 27 28   30 31 32 33 34  36 37 38 39 40 41,  valuelabel     tlcolor(none) )   ysize(5)  yline( 4 12 29 35)  
graph export "D:/下载/129581-V1/Kit/results/FigC3a.pdf", replace 
restore

preserve 
keep if counter<=18
twoway (rcap upper lower counter, horizontal)  (scatter counter b if significant==1 , color(red) msymbol(d)) (scatter counter b if significant==0, color(edkblue) msymbol(dh)), ytitle("") legend(off) xline(0)  ylabel(1 2 3  5 6 7 8 9 10 11  13 14 15 16 17 18 ,  valuelabel     tlcolor(none) )   ysize(5)  yline( 4 12 29 35)  
graph export "D:/下载/129581-V1/Kit/results/FigC3b.pdf", replace 
restore
