Frontier Lecture Series in Management and Economics, Lecture 484
Lecture Topic
Discrete-Time Mean Variance Strategy based on Reinforcement Learning
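Time
November 8, 2024, 15:00 - 16:00
Venue
Room 333, School of Management, No. 6 Lanyuan, East Campus, Guangzhou Campus
Speaker
Associate Professor Cui Xiangyu (崔翔宇), School of Statistics and Management, Shanghai University of Finance and Economics
Host
Professor Zhu Shushang (朱书尚), School of Management, Sun Yat-sen University
Organizer
Department of Finance and Investment, School of Management, Sun Yat-sen University
Speaker Bio
Cui Xiangyu holds bachelor's and master's degrees from the University of Science and Technology of China and a Ph.D. from The Chinese University of Hong Kong. Cui is currently a tenured associate professor, research fellow, and doctoral supervisor at the School of Statistics and Management, Shanghai University of Finance and Economics. Cui serves as Secretary-General of the Financial Econometrics and Risk Management Research Society of the Society of Management Science and Engineering, and as a council member of the Management and Decision Science Committee of the Chinese Society for Management Modernization. Main research areas include behavioral finance, quantitative finance, and risk management. Cui has published 40 papers in leading international SSCI/SCI journals, including Operations Research, INFORMS Journal on Computing, Journal of Econometrics, Mathematical Finance, and Journal of Economic Dynamics & Control.
Abstract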
We consider a discrete-time mean-variance model based on reinforcement learning. Compared with its continuous-time counterpart in Wang and Zhou (2020), the discrete-time model makes more general assumptions about the asset distribution. Using entropy to measure the cost of exploration, we derive the optimal investment strategy, which is also of Gaussian type. We also design a corresponding reinforcement learning algorithm. Simulation experiments and empirical analysis indicate that the discrete-time model is more applicable to real-world data than the continuous-time model.
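As a rough illustration of the two ingredients named in the abstract (a Gaussian-type exploratory strategy and entropy as the measure of exploration), the Python sketch below samples an allocation from a Gaussian policy and computes its differential entropy. The policy mean and variance used here are placeholder numbers, not the formulas derived in the talk, and the function names are hypothetical.

```python
import math
import random


def gaussian_entropy(variance: float) -> float:
    """Differential entropy of a one-dimensional Gaussian with this variance."""
    return 0.5 * math.log(2.0 * math.pi * math.e * variance)


def sample_allocation(mean: float, variance: float, rng: random.Random) -> float:
    """Draw one exploratory allocation from a Gaussian policy.

    `mean` and `variance` are placeholders; in the talk they would come
    from the derived optimal strategy, which is not reproduced here.
    """
    return rng.gauss(mean, math.sqrt(variance))


if __name__ == "__main__":
    rng = random.Random(0)
    # Assumed illustrative policy parameters (not taken from the talk).
    mean, variance = 0.3, 0.04
    print("sampled allocation:", sample_allocation(mean, variance, rng))
    print("policy entropy:", gaussian_entropy(variance))
```

A wider policy (larger variance) has higher entropy, which is the sense in which entropy can quantify how much the strategy explores.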
Frontier Lecture Series in Management and Economics, Lecture 485
Lecture Topic
An End-to-End Direct Reinforcement Learning Approach for Multi-factor Based Portfolio Management
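Time
November 8, 2024, 16:00 - 17:00
Venue
Room 333, School of Management, No. 6 Lanyuan, East Campus, Guangzhou Campus
Speaker
Associate Professor Zhou Ke (周科), Business School, Hunan University
Host
Professor Zhu Shushang (朱书尚), School of Management, Sun Yat-sen University
Organizer
Department of Finance and Investment, School of Management, Sun Yat-sen University
Speaker Bio
Associate professor and doctoral supervisor at the Business School, Hunan University. Graduated from the Department of Mathematics, Huazhong University of Science and Technology in 2006, received a master's degree from Shandong University in 2009, and received a Ph.D. from The Chinese University of Hong Kong in 2014, studying financial optimization under Professor Li Duan (李端). Currently director of the FinTech MBA program at the Business School, Hunan University. Research interests: financial risk management, risk measures, and dynamic portfolio selection. Principal investigator of one National Natural Science Foundation of China Young Scientists project and one provincial natural science foundation youth project, and participant in several national-level projects. Has published multiple papers in mainstream journals such as SIAM Journal on Control and Optimization and Quantitative Finance. Has held visiting scholar and associate researcher positions at The Chinese University of Hong Kong and City University of Hong Kong. Executive council member and Deputy Secretary-General of the Financial Engineering and Financial Risk Management Branch of the Operations Research Society of China.
Abstract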
This paper introduces an end-to-end online portfolio decision model within the framework of direct reinforcement learning, seamlessly integrating the multi-factor model and mean-variance (MV) portfolio optimization. Recognizing that classical methods, which separate estimation and portfolio optimization into a two-step scheme, may accumulate estimation errors jeopardizing overall performance, our approach unifies these steps into a performance-oriented online decision process. This integration is achieved by tuning the neural network parameters directly with respect to the reward function, designed as a combination of the prediction error and realized MV utility. Specifically, we employ a neural network to estimate future returns and generate the factor loading matrix, enabling the computation of inputs for the MV portfolio optimization model. Implementing the resulting portfolio further provides the realized utility. The network parameters are optimized with respect to the updated reward using the gradient method. We develop an online updating scheme for computing the gradient in backpropagation by providing explicit formulas for MV portfolio derivatives through the portfolio optimization layer. Utilizing real market data, we evaluate the proposed method against several benchmark portfolios in out-of-sample tests. The experiments demonstrate that our approach not only outperforms these benchmarks across various performance metrics but is also transparent to factor analysis, a favorable trait for practitioners.
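The reward described above combines a prediction error with the realized MV utility. A minimal Python sketch of one such combination follows; the specific form (realized utility minus a weighted mean squared error), the parameters `risk_aversion` and `error_weight`, and all function names are assumptions for illustration, not the speaker's formulation.

```python
from statistics import fmean, pvariance


def portfolio_returns(weights, asset_returns):
    """Realized portfolio return in each period for fixed weights."""
    return [sum(w * r for w, r in zip(weights, period)) for period in asset_returns]


def realized_mv_utility(returns, risk_aversion):
    """Mean return minus (risk_aversion / 2) times the population variance."""
    return fmean(returns) - 0.5 * risk_aversion * pvariance(returns)


def prediction_error(predicted, realized):
    """Mean squared error between predicted and realized asset returns."""
    return fmean(
        (p - r) ** 2
        for p_row, r_row in zip(predicted, realized)
        for p, r in zip(p_row, r_row)
    )


def reward(weights, predicted, realized, risk_aversion=2.0, error_weight=1.0):
    """Assumed reward: realized MV utility penalized by prediction error."""
    utility = realized_mv_utility(portfolio_returns(weights, realized), risk_aversion)
    return utility - error_weight * prediction_error(predicted, realized)


if __name__ == "__main__":
    # Toy data: two assets, two periods (made up for illustration).
    realized = [[0.02, 0.00], [0.00, 0.02]]
    predicted = [[0.00, 0.00], [0.00, 0.00]]
    print(reward([0.5, 0.5], predicted, realized))
```

In an end-to-end setup, a scalar of this kind would be what the network parameters are tuned against, so forecast accuracy and portfolio performance are traded off in a single objective rather than in two separate steps.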
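Source: Department of Finance and Investment
Written by: Yu Lingling (于玲玲)
Editorial review: Zhu Shushang (朱书尚), Wei Lijian (韦立坚), Jiang Li (蒋莉)
Managing editor: Luo Ping (罗萍)
Preliminary review: Chen Rongrong (陈融融)
Review: Zhang Yifang (张毅芳)
Final review and release: Zhang Junsheng (张俊生), Zhong Yibiao (钟一彪)
Submissions welcome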
bspr@mail.sysu.edu.cn