What Are the Classic Papers on Machine Learning Applications in Economics and Management?

Academic 2024-11-09 14:11 Beijing

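Li Bin

Li Bin is a professor and doctoral supervisor at the Economics and Management School of Wuhan University, where he serves as Party branch secretary and deputy head of the Department of Finance and as director of the Financial Research Center. His research covers empirical asset pricing, financial machine learning, and fintech.

Professor Li has an interdisciplinary background spanning finance and technology. He has published multiple papers in finance and accounting journals, including Journal of Accounting Research, Journal of Financial Research (金融研究), China Industrial Economics (中国工业经济), and Journal of Management Sciences in China (管理科学学报), as well as in CCF A-ranked computer science journals and conferences such as AIJ, JMLR, ICML, and IJCAI. He is also the author of the monograph Online Portfolio Selection: Principles and Algorithms, published by CRC Press in the United States.

Professor Li currently has several machine learning papers at the R&R stage at top finance journals such as RFS, JFE, and MS, and he is a leading scholar in China on machine learning and finance.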

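This answer covers machine learning in finance, focusing on papers in top finance and accounting journals and on working papers by well-known scholars. I happened to have this list from teaching; whether each item counts as a classic is for you to judge.

Machine learning has three main types of application in finance:

1. Machine learning for financial prediction, which aims to improve predictive power;

2. Machine learning for proxy-variable construction, which aims to extract new proxy variables from traditional and alternative data;

3. Machine learning for causal inference, mainly the series of papers by Susan Athey.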

Machine Learning and Financial Prediction

Gu, S., Kelly, B. T., and Xiu, D. (2020). Empirical Asset Pricing via Machine Learning. Review of Financial Studies, 33(5), 2223–2273. (machine learning and cross-sectional return prediction for US stocks)
Leippold, M., Wang, Q., and Zhou, W. (2021). Machine Learning in the Chinese Stock Market. Journal of Financial Economics, forthcoming. (machine learning and cross-sectional return prediction for Chinese stocks)
Li, B., Shao, X., and Li, Y. (2019). 机器学习驱动的基本面量化投资研究 [Research on machine-learning-driven fundamental quantitative investing]. China Industrial Economics (中国工业经济), 08, 61–79. (machine learning and cross-sectional return prediction for Chinese stocks)
Kaniel, R., Lin, Z., Pelger, M., and Van Nieuwerburgh, S. (2021). Machine-Learning the Skill of Mutual Fund Managers. Working Paper. (neural networks and cross-sectional prediction of fund returns)
Li, B., and Rossi, A. G. (2020). Selecting Mutual Funds from the Stocks They Hold: A Machine Learning Approach. Working Paper. (tree-based methods and cross-sectional prediction of fund returns)
Li, B., Rossi, A., Yan, S., and Zheng, L. (2021). Real-time Machine Learning in the Cross-Section of Stock Returns: Evidence from Fundamental Signals. Working Paper. (real-time machine learning prediction)
Bianchi, D., Büchner, M., and Tamoni, A. (2021). Bond Risk Premiums with Machine Learning. The Review of Financial Studies, 34(2), 1046–1089. (machine learning and bond return predictability)
Bali, T. G., Goyal, A., Huang, D., Jiang, F., and Wen, Q. (2020). The Cross-Sectional Pricing of Corporate Bonds Using Big Data and Machine Learning. Working Paper. (predictability across stocks and bonds)
Chinco, A., Clark-Joseph, A. D., and Ye, M. (2019). Sparse Signals in the Cross-Section of Returns. The Journal of Finance, 74(1), 449–492. (LASSO and high-frequency cross-sectional return prediction)
Martin, I., and Nagel, S. (2021). Market Efficiency in the Age of Big Data. Journal of Financial Economics, forthcoming.
Dong, X., Li, Y., Rapach, D., and Zhou, G. (2021). Anomalies and the Expected Market Return. Journal of Finance, forthcoming. (anomaly returns and stock market returns)
Wu, W., Chen, J., Yang, Z. (Ben), and Tindall, M. L. (2020). A Cross-Sectional Machine Learning Approach for Hedge Fund Return Prediction and Selection. Management Science. (machine learning and cross-sectional hedge fund returns)
Gu, S., Kelly, B., and Xiu, D. (2021). Autoencoder Asset Pricing Models. Journal of Econometrics, 222(1), 429–450. (autoencoders for return prediction)
Freyberger, J., Neuhierl, A., and Weber, M. (2020). Dissecting Characteristics Nonparametrically. The Review of Financial Studies, 33, 2326–2377. (adaptive group LASSO for return prediction)
Adämmer, P., and Schüssler, R. A. (2020). Forecasting the Equity Premium: Mind the News! Review of Finance, 24, 1313–1355. (news topics extracted to forecast the equity premium)
Kozak, S., Nagel, S., and Santosh, S. (2020). Shrinking the Cross-Section. Journal of Financial Economics, 135(2), 271–292. (ML methods for the stochastic discount factor, SDF)
Gathergood, J., Mahoney, N., Stewart, N., and Weber, J. (2019). How Do Individuals Repay Their Debt? The Balance-Matching Heuristic. American Economic Review, 109, 844–875. (ML methods applied to credit card transaction data to predict repayment)
Fuster, A., Goldsmith-Pinkham, P., Ramadorai, T., et al. (2022). Predictably Unequal? The Effects of Machine Learning on Credit Markets. The Journal of Finance, 77(1), 5–47. (ML methods for predicting credit decisions)
Bao, Y., Ke, B., Li, B., Yu, Y. J., and Zhang, J. (2020). Detecting Accounting Fraud in Publicly Traded U.S. Firms Using a Machine Learning Approach. Journal of Accounting Research, 58, 199–235. (BRT for predicting accounting fraud)
Brown, N. C., Crowley, R. M., and Elliott, W. B. (2020). What Are You Saying? Using Topic to Detect Financial Misreporting. Journal of Accounting Research, 58, 237–291. (topic models applied to annual reports to predict misreporting)
Chen, X., Cho, Y. H. (Tony), Dou, Y., and Lev, B. (2022). Predicting Future Earnings Changes Using Machine Learning and Detailed Financial Data. Journal of Accounting Research, forthcoming. (machine learning for predicting earnings changes)
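
Many of the prediction papers listed above share a common workflow: fit a regularized model on a training sample of characteristics, then judge it by out-of-sample fit. The following is a minimal sketch of that workflow on synthetic data, not a reproduction of any listed paper. The data-generating process, the ridge penalty `alpha`, the train/test split, and the choice to benchmark out-of-sample R² against a zero forecast are all assumptions made for illustration.

```python
import numpy as np


def ridge_fit(X, y, alpha):
    """Closed-form ridge coefficients: (X'X + alpha * I)^{-1} X'y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)


def oos_r2(y_true, y_pred):
    """Out-of-sample R^2 measured against a zero forecast (no demeaning)."""
    return 1.0 - np.sum((y_true - y_pred) ** 2) / np.sum(y_true ** 2)


def simulate_panel(n_obs=4000, n_chars=20, seed=0):
    """Synthetic 'characteristics' and 'returns'; only three signals matter."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n_obs, n_chars))
    beta = np.zeros(n_chars)
    beta[:3] = [0.03, -0.02, 0.01]
    y = X @ beta + 0.1 * rng.standard_normal(n_obs)
    return X, y


X, y = simulate_panel()
split = 3000  # hypothetical split: first 3000 rows train, the rest test
beta_hat = ridge_fit(X[:split], y[:split], alpha=10.0)
r2 = oos_r2(y[split:], X[split:] @ beta_hat)
print(f"out-of-sample R^2: {r2:.4f}")
```

In real applications the split would follow calendar time, and the penalty would be tuned on a separate validation sample rather than fixed in advance.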

Machine Learning and Proxy-Variable Construction

Loughran, T., and McDonald, B., "Textual Analysis in Accounting and Finance: A Survey", Journal of Accounting Research, 2016, 54(4), 1187-1230.
Gentzkow, M., Kelly, B. T., and Taddy, M., "Text as Data", Journal of Economic Literature, 2019, 57(3), 535-574.
Ma, C., Chen, Z., and Zhang, S., "基于文本大数据分析的会计和金融研究综述" [A survey of accounting and finance research based on textual big-data analysis], Journal of Management Sciences in China (管理科学学报), 2020, No. 9, pp. 19-30.

Obaid, K., and Pukthuanthong, K., "A Picture Is Worth a Thousand Words: Measuring Investor Sentiment by Combining Machine Learning and Photos from News", Journal of Financial Economics, 2021, forthcoming. (image processing and sentiment measures)
Edmans, A., Fernandez-Perez, A., Garel, A., and Indriawan, I., "Music Sentiment and Stock Returns around the World", Journal of Financial Economics, August 24, 2021. (music sentiment and stock returns)
Loughran, T., and McDonald, B., 2011. When Is a Liability Not a Liability? Textual Analysis, Dictionaries, and 10-Ks. The Journal of Finance 66, 35–65. (dictionary-based textual analysis to extract sentiment measures from annual reports)
Antweiler, W., and Frank, M. Z., 2004. Is All That Talk Just Noise? The Information Content of Internet Stock Message Boards. The Journal of Finance 59, 1259–94. (ML methods applied to Yahoo! Finance posts to extract market sentiment)
Bartov, E., Faurel, L., and Mohanram, P. S., 2017. Can Twitter Help Predict Firm-Level Earnings and Stock Returns? The Accounting Review 93, 25–57. (market sentiment extracted from Twitter posts)
Barbon, A., Di Maggio, M., Franzoni, F., and Landier, A., 2019. Brokers and Order Flow Leakage: Evidence from Fire Sales. The Journal of Finance 74, 2707–49. (naive Bayes to extract market sentiment from firm news)
Huang, A. H., Zang, A. Y., and Zheng, R., 2014. Evidence on the Information Content of Text in Analyst Reports. The Accounting Review 89, 2151–80. (naive Bayes applied to analyst reports to extract sentiment)
Manela, A., and Moreira, A., 2017. News Implied Volatility and Disaster Concerns. Journal of Financial Economics 123, 137–62. (sentiment extracted from front-page Wall Street Journal articles)
Tang, V. W., 2018. Wisdom of Crowds: Cross-Sectional Variation in the Informativeness of Third-Party-Generated Product Information on Twitter. Journal of Accounting Research 56, 989–1034. (product-market sentiment extracted from Twitter)
Hsieh, T.-S., Kim, J.-B., Wang, R. R., and Wang, Z., 2020. Seeing Is Believing? Executives' Facial Trustworthiness, Auditor Tenure, and Audit Fees. Journal of Accounting and Economics 69, 101260. (image recognition to assess executives' trustworthiness)
Bandiera, O., Prat, A., Hansen, S., and Sadun, R., 2020. CEO Behavior and Firm Performance. Journal of Political Economy 128, 1325–69. (survey data used to characterize CEO behavior and its effects)
Cookson, J. A., and Niessner, M., "Why Don't We Agree? Evidence from a Social Network of Investors", Journal of Finance, 2020, 75(1), pp. 173-228. (sentiment extracted from posts)
Li, Kai, et al., "Measuring Corporate Culture Using Machine Learning", The Review of Financial Studies, 2021, 34(7), 3265-3315. (five firm-level culture measures extracted from earnings calls)
Buehlmaier, M. M., and Whited, T. M., 2018. Are Financial Constraints Priced? Evidence from Textual Analysis. The Review of Financial Studies 31, 2693–2728. (ML methods to extract financial-constraint measures from annual reports)
Lowry, M., Michaely, R., and Volkova, E., 2020. Information Revealed through the Regulatory Process: Interactions between the SEC and Companies ahead of Their IPO. The Review of Financial Studies 33(12), 5510–5554. (ML methods to extract regulatory measures from SEC filings)
Hanley, K. W., and Hoberg, G., 2019. Dynamic Interpretation of Emerging Risks in the Financial Sector. The Review of Financial Studies 32, 4543–4603. (financial-sector risk exposures extracted from bank annual reports)
Bubna, A., Das, S. R., and Prabhala, N., 2020. Venture Capital Communities. Journal of Financial and Quantitative Analysis 55, 621–51. (ML methods to measure relatedness among venture capital firms)
Baker, S., Bloom, N., and Davis, S., "Measuring Economic Policy Uncertainty", Quarterly Journal of Economics, 2016, 131(4), 1593-1636. (policy uncertainty)
Hassan, T., Hollander, S., van Lent, L., and Tahoun, A., "Firm-Level Political Risk: Measurement and Effects", The Quarterly Journal of Economics, 2019, 134(4), 2135-2202. (political risk)
Hoberg, G., and Phillips, G., "Text-Based Network Industries and Endogenous Product Differentiation", Journal of Political Economy, 2016, 124(5), 1423-1465. (reclassifying industries using text)
Manela, A., and Moreira, A., "News Implied Volatility and Disaster Concerns", Journal of Financial Economics, 2017, 123, 137-162. (implied volatility)
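
Several entries above build a sentiment proxy by counting dictionary words in text. The sketch below illustrates that idea in its simplest form: net tone as positive minus negative word counts, scaled by document length. The two word lists and the sample sentences are hypothetical and tiny; published studies rely on much larger, domain-specific dictionaries and on more careful tokenization.

```python
import re

# Hypothetical word lists for illustration only; not taken from any
# published dictionary.
POSITIVE_WORDS = {"gain", "improve", "strong", "profit"}
NEGATIVE_WORDS = {"loss", "decline", "weak", "impairment"}


def tone_score(text):
    """Return (positive - negative) / total words; 0.0 for empty text."""
    words = re.findall(r"[a-z]+", text.lower())
    if not words:
        return 0.0
    n_pos = sum(word in POSITIVE_WORDS for word in words)
    n_neg = sum(word in NEGATIVE_WORDS for word in words)
    return (n_pos - n_neg) / len(words)


# Made-up snippets standing in for firm disclosures.
filings = {
    "firm_a": "Strong profit growth and a gain in market share",
    "firm_b": "Weak demand led to a loss and an impairment charge",
}
scores = {firm: tone_score(text) for firm, text in filings.items()}
for firm, score in scores.items():
    print(firm, round(score, 3))
```

The resulting score can then enter a regression as a firm-level or market-level proxy, which is the role such measures play in the papers listed above.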

Machine Learning and Causal Inference

Athey, Susan and Guido W. Imbens. 2019. "Machine Learning Methods That Economists Should Know About." Annual Review of Economics 11 (1): 685–725. (survey; describes existing machine learning methods and their role in solving econometric problems, with an emphasis on methods)
Mullainathan, Sendhil and Jann Spiess. 2017. "Machine Learning: An Applied Econometric Approach." Journal of Economic Perspectives 31 (2): 87–106. (survey; discusses how machine learning helps with econometric problems, from the perspective of its underlying ideas and applicability)
Belloni, Alexandre, Daniel Chen, Victor Chernozhukov, and Christian Hansen. 2012. "Sparse Models and Methods for Optimal Instruments With an Application to Eminent Domain." Econometrica 80 (6): 2369–2429. (uses LASSO to select instruments)
Carrasco, Marine. 2012. "A Regularization Approach to the Many Instruments Problem." Journal of Econometrics 170 (2): 383–398. (uses ridge regularization for the many-instruments problem)
Hansen, Christian and Damian Kozbur. 2014. "Instrumental Variables Estimation with Many Weak Instruments Using Regularized JIVE." Journal of Econometrics 182 (2): 290–308. (uses ridge regularization for the weak-instruments problem)
Hartford, Jason, Greg Lewis, Kevin Leyton-Brown, and Matt Taddy. 2017. "Deep IV: A Flexible Approach for Counterfactual Prediction." In Proceedings of the 34th International Conference on Machine Learning - Volume 70, 1414–23. ICML'17. (uses neural networks to extend two-stage least squares)
Angrist, Joshua and Brigham Frandsen. 2019. "Machine Labor." NBER Working Paper No. 26584. National Bureau of Economic Research. (finds that machine learning IV estimation does not outperform traditional methods)
Athey, Susan and Guido W. Imbens. 2016. "Recursive Partitioning for Heterogeneous Causal Effects." Proceedings of the National Academy of Sciences 113 (27): 7353–7360. (tree-based estimation of treatment effects)
Wager, Stefan and Susan Athey. 2018. "Estimation and Inference of Heterogeneous Treatment Effects using Random Forests." Journal of the American Statistical Association 113 (523): 1228–1242. (uses causal forests to estimate treatment effects and establishes their statistical properties)
Athey, Susan and Stefan Wager. 2019. "Estimating Treatment Effects with Causal Forests: An Application." Observational Studies 5 (2): 37–51. (an application of causal forests)
Athey, Susan, Mohsen Bayati, Guido W. Imbens, and Zhaonan Qu. 2019. "Ensemble Methods for Causal Effects in Panel Data Settings." AEA Papers and Proceedings 109: 65–70. (extends from cross-sectional data to panel data)

Lee, Brian K., Justin Lessler, and Elizabeth A. Stuart. 2010. "Improving Propensity Score Weighting Using Machine Learning." Statistics in Medicine 29 (3): 337–46. (tree-based extension of PSM)
Athey, Susan, Guido W. Imbens, Jonas Metzger, and Evan M. Munro. 2021. "Using Wasserstein Generative Adversarial Networks for the Design of Monte Carlo Simulations." Journal of Econometrics, forthcoming. (proposes a GAN-based method built on Monte Carlo simulation to estimate counterfactuals and then compute treatment effects)

Chernozhukov, Victor, Denis Chetverikov, Mert Demirer, Esther Duflo, Christian Hansen, and Whitney Newey. 2017. "Double/Debiased/Neyman Machine Learning of Treatment Effects." American Economic Review, 107 (5): 261–265.

O'Malley, Terry. 2021. "The Impact of Repossession Risk on Mortgage Default." The Journal of Finance 76 (2): 623–650. (uses causal forests to estimate the treatment effect, and its heterogeneity, of a legal change in repossession risk on mortgage default)
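
The double/debiased machine learning entry above rests on a partialling-out idea that is easy to show in miniature. The sketch below assumes a partially linear model y = theta * d + g(X) + e: it predicts y and d from the controls X on held-out folds (cross-fitting), then regresses the outcome residuals on the treatment residuals. The polynomial ridge learner, the synthetic data, and the fold count are illustrative assumptions; this is not the authors' code, and in practice any flexible learner can stand in for the nuisance models.

```python
import numpy as np


def poly_features(X):
    """Intercept, linear, and squared terms as a simple flexible basis."""
    return np.column_stack([np.ones(len(X)), X, X ** 2])


def ridge_predict(X_train, y_train, X_test, alpha=1.0):
    """Fit ridge on the polynomial basis and predict for held-out rows."""
    Z_train, Z_test = poly_features(X_train), poly_features(X_test)
    coef = np.linalg.solve(
        Z_train.T @ Z_train + alpha * np.eye(Z_train.shape[1]),
        Z_train.T @ y_train,
    )
    return Z_test @ coef


def dml_partial_linear(y, d, X, n_folds=5, seed=0):
    """Cross-fitted partialling-out estimate of theta and a robust SE."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    y_res = np.empty(len(y))
    d_res = np.empty(len(y))
    for test_idx in folds:
        train = np.ones(len(y), dtype=bool)
        train[test_idx] = False
        # Residualize outcome and treatment using models fit on other folds.
        y_res[test_idx] = y[test_idx] - ridge_predict(X[train], y[train], X[test_idx])
        d_res[test_idx] = d[test_idx] - ridge_predict(X[train], d[train], X[test_idx])
    theta = (d_res @ y_res) / (d_res @ d_res)
    eps = y_res - theta * d_res
    se = np.sqrt(np.sum((d_res * eps) ** 2)) / (d_res @ d_res)
    return theta, se


# Synthetic data with a true effect of 1.0 and nonlinear confounding.
rng = np.random.default_rng(42)
n = 2000
X = rng.standard_normal((n, 3))
d = 0.5 * X[:, 0] ** 2 + X[:, 1] + rng.standard_normal(n)
y = 1.0 * d + X[:, 0] ** 2 - X[:, 2] + rng.standard_normal(n)
theta_hat, se_hat = dml_partial_linear(y, d, X)
print(f"theta_hat = {theta_hat:.3f}, se = {se_hat:.3f}")
```

Cross-fitting matters here because residuals computed on the same rows used to fit a flexible learner tend to be overfit, which can bias the final-stage estimate.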


