Artificial Intelligence and Financial Decision-Making
黄昊,尤海峰,庄倩倩
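Abstract:
Artificial intelligence (AI) driven by machine learning algorithms can effectively extract information from big data and therefore has great potential for application in financial decision-making. In this paper, we summarize the important applications of AI in this area. First, we review AI algorithms that extract information from unstructured data, especially natural language processing algorithms. We then examine how AI extracts and aggregates information from unstructured and structured data to facilitate financial decisions such as investment and fintech lending. Finally, we discuss the complementary roles of AI and humans in improving financial decision-making.
Keywords: artificial intelligence; machine learning; natural language processing; financial decision-making; equity investment; fintech lending
Foundation: Hong Kong Research Grants Council project "Promoting Hong Kong as a Global Fintech Hub" (T31-604/18-N)
Author: 黄昊,尤海峰,庄倩倩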
DOI: 10.16453/j.2096-5036.2023.02.001
References:
- [1]BRYNJOLFSSON E,MCAFEE A.Artificial intelligence,for real[J].Harvard Business Review,2017,1:1-31.
- [2]PricewaterhouseCoopers.Sizing the prize[EB/OL].2018.https://www.pwc.com/gx/en/issues/data-and-analytics/publications/artificial-intelligence-study.html.
- [3]COLBACK L.The impact of AI on business and society[EB/OL].(2020-10-16)[2023-03-15].https://www.ft.com/content/e082b01dfbd6-4ea5-a0d2-05bc5ad7176c.
- [4]The Economist.The world's most valuable resource is no longer oil,but data[EB/OL].(2017-05-06)[2023-03-15].https://www.economist.com/leaders/2017/05/06/the-worlds-most-valuable-resource-is-no-longer-oil-but-data.
- [5]BOCHKAY K,BROWN S V,LEONE A J,et al.Textual analysis in accounting:what's next?[J].Social Science Research Network,2022.
- [6]LI F,LUNDHOLM R,MINNIS M.A measure of competition based on 10‐K filings[J].Journal of Accounting Research,2013,51:399-436.
- [7]HENRY E.Market reaction to verbal components of earnings press releases:event study using a predictive algorithm[J].Journal of Emerging Technologies in Accounting,2006,3:1-19.
- [8]HENRY E.Are investors influenced by how earnings press releases are written?[J].Journal of Business Communication,2008,45:363-407.
- [9]KOTHARI S P,LI X,SHORT J E.The effect of disclosures by management,analysts,and business press on cost of capital,return volatility,and analyst forecasts:a study using content analysis[J].The Accounting Review,2009,84:1639-1670.
- [10]LOUGHRAN T,MCDONALD B.When is a liability not a liability?textual analysis,dictionaries,and 10-Ks[J].Journal of Finance,2011,66:35-65.
- [11]CAMPBELL J L,CHEN H,DHALIWAL D S,et al.The information content of mandatory risk factor disclosures in corporate filings[J].Review of Accounting Studies,2014,19:396-455.
- [12]LI K,MAI F,SHEN R,et al.Measuring corporate culture using machine learning[J].Review of Financial Studies,2021,34:3265-3315.
- [13]LI K,LIU X,MAI F,et al.The role of corporate culture in bad times:evidence from the COVID-19 pandemic[J].Journal of Financial and Quantitative Analysis,2021,56:2545-2583.
- [14]IWASAKI H,CHEN Y,HUANG A H,et al.Neural Network Translated into Bag-of-Words Using Attentions[J].Social Science Research Network,2022.
- [15]CAO S,JIANG W,YANG B,et al.How to talk when a machine is listening:corporate disclosure in the age of AI[J].Social Science Research Network,2020.
- [16]FRANKEL R,JENNINGS J,LEE J.Disclosure sentiment:machine learning vs.dictionary methods[J].Management Science,2022,68:5514-5532.
- [17]LI F.The information content of forward-looking statements in corporate filings-a naïve Bayesian machine learning approach[J].Journal of Accounting Research,2010,48:1049-1102.
- [18]BROWN S V,HINSON L A,TUCKER J W.Financial statement adequacy and firms' MD&A disclosures[J].Social Science Research Network,2021.
- [19]AZIMI M,AGRAWAL A.Is positive sentiment in corporate annual reports informative?evidence from deep learning[J].The Review of Asset Pricing Studies,2021,11:762-805.
- [20]FRANKEL R,JENNINGS J,LEE J.Using unstructured and qualitative disclosures to explain accruals[J].Journal of Accounting and Economics,2016,62:209-227.
- [21]BLEI D M,NG A Y,JORDAN M I.Latent dirichlet allocation[J].Journal of Machine Learning Research,2003,3:993-1022.
- [22]HUANG A H,LEHAVY R,ZANG A Y,et al.Analyst information discovery and interpretation roles:a topic modeling approach[J].Management Science,2018,64:2833-2855.
- [23]MIKOLOV T,CHEN K,CORRADO G,et al.Efficient estimation of word representations in vector space[EB/OL].(2013-09-07)[2023-03-15].https://arxiv.org/abs/1301.3781.
- [24]PENNINGTON J,SOCHER R,MANNING C D.GloVe:global vectors for word representation[C]//EMNLP 2014.ACL,2014:1532-1543.
- [25]HOWARD J,RUDER S.Universal language model fine-tuning for text classification[EB/OL].(2018-05-23)[2023-03-15].https://arxiv.org/abs/1801.06146.
- [26]PETERS M E,NEUMANN M,IYYER M,et al.Deep contextualized word representations[EB/OL].(2018-03-22)[2023-03-15].https://arxiv.org/abs/1802.05365.
- [27]RADFORD A,NARASIMHAN K,SALIMANS T,et al.Improving language understanding by generative pre-training[J].Social Science Research Network,2018.
- [28]DEVLIN J,CHANG M W,LEE K,et al.BERT:pre-training of deep bidirectional transformers for language understanding[EB/OL].(2019-05-24)[2023-03-15].https://arxiv.org/abs/1810.04805.
- [29]VASWANI A,SHAZEER N,PARMAR N,et al.Attention is all you need[J].Advances in Neural Information Processing Systems,2017,30.
- [30]YANG Y,UY M,HUANG A.FinBERT:a pre-trained language model for financial communications[EB/OL].(2020-07-09)[2023-03-15].https://arxiv.org/abs/2006.08097.
- [31]HUANG A H,WANG H,YANG Y.FinBERT:a large language model for extracting information from financial text[J].Contemporary Accounting Research,2022.
- [32]HUANG A H,ZANG A Y,ZHENG R.Evidence on the information content of text in analyst reports[J].The Accounting Review,2014,89:2151-2180.
- [33]SIANO F,WYSOCKI P.Transfer learning and textual analysis of accounting disclosures:applying big data methods to small(er)datasets[J].Accounting Horizons,2021,35:217-244.
- [34]DYER T,LANG M,STICE-LAWRENCE L.The evolution of 10-K textual disclosure:evidence from Latent Dirichlet Allocation[J].Journal of Accounting and Economics,2017,64:221-245.
- [35]GOEL S,GANGOLLY J,FAERMAN S R,et al.Can linguistic predictors detect fraudulent financial filings?[J].Journal of Emerging Technologies in Accounting,2010,7:25-46.
- [36]KIM A G,YOON S.Corporate bankruptcy prediction with domain-adapted BERT[C]//EMNLP 2021.ACL,2021.
- [37]CECCHINI M,AYTUG H,KOEHLER G J,et al.Detecting management fraud in public companies[J].Management Science,2010,56:1146-1160.
- [38]PURDA L,SKILLICORN D.Accounting variables,deception,and a bag of words:assessing the tools of fraud detection[J].Contemporary Accounting Research,2015,32:1193-1223.
- [39]MAYEW W J,SETHURAMAN M,VENKATACHALAM M.MD&A disclosure and the firm's ability to continue as a going concern[J].The Accounting Review,2015,90:1621-1651.
- [40]MAI F,TIAN S,LEE C,et al.Deep learning models for bankruptcy prediction using textual disclosures[J].European Journal of Operational Research,2019,274:743-758.
- [41]DONOVAN J,JENNINGS J,KOHARKI K,et al.Measuring credit risk using qualitative disclosure[J].Review of Accounting Studies,2021,26:815-863.
- [42]CAMPBELL D W,SHANG R.Tone at the bottom:measuring corporate misconduct risk from the text of employee reviews[J].Management Science,2021.
- [43]SONG S.The informational value of segment data disaggregated by underlying industry:evidence from the textual features of business descriptions[J].The Accounting Review,2021,96:361-396.
- [44]GOW I D,KAPLAN S N,LARCKER D F,et al.CEO personality and firm policies[J].Social Science Research Network,2016.
- [45]HARRISON J S,THURGOOD G R,BOIVIE S,et al.Measuring CEO personality:developing,validating,and testing a linguistic tool[J].Strategic Management Journal,2019,40:1316-1330.
- [46]YANG K,LAU R Y K,ABBASI A.Getting personal:a deep learning artifact for text-based measurement of personality[J].Information Systems Research,2022.
- [47]DE LA PARRA D.Disclosure softness of corporate language[J].Social Science Research Network,2021.
- [48]BUEHLMAIER M,WHITED T M.Are financial constraints priced?Evidence from textual analysis[J].Review of Financial Studies,2018,31:2693-2728.
- [49]ABIS S,LINES A.Text-based mutual fund peer groups[C]//ASSA 2021.ASSA,2021.
- [50]ABIS S.Man vs.machine:quantitative and discretionary equity management[J].Quantitative and Discretionary Equity Management,2020.
- [51]MANELA A,MOREIRA A.News implied volatility and disaster concerns[J].Journal of Financial Economics,2017,123:137-162.
- [52]BUEHLMAIER M,ZECHNER J.Financial media,price discovery,and merger arbitrage[J].Review of Finance,2021,25:997-1046.
- [53]CHEN M A,WU Q,YANG B.How valuable is FinTech innovation?[J].Review of Financial Studies,2019,32:2062-2106.
- [54]BELLSTAM G,BHAGAT S,COOKSON J A.A text-based analysis of corporate innovation[J].Management Science,2021,67:4004-4031.
- [55]OBAID K,PUKTHUANTHONG K.A picture is worth a thousand words:Measuring investor sentiment by combining machine learning and photos from news[J].Journal of Financial Economics,2022,144:273-297.
- [56]CURTI F,KAZINNIK S.Let’s face it:quantifying the impact of nonverbal communication in FOMC press conferences[J].Social Science Research Network,2021.
- [57]GORODNICHENKO Y,PHAM T,TALAVERA O.The voice of monetary policy[J].Social Science Research Network,2021.
- [58]MAYEW W J,VENKATACHALAM M.The power of voice:managerial affective states and future firm performance[J].The Journal of Finance,2012,67:1-43.
- [59]HOBSON J L,MAYEW W J,VENKATACHALAM M.Analyzing speech to detect financial misreporting[J].Journal of Accounting Research,2012,50:349-392.
- [60]MUKHERJEE A,PANAYOTOV G,SHON J.Eye in the sky:private satellites and government macro data[J].Journal of Financial Economics,2021,141:234-254.
- [61]KATONA Z,PAINTER M O,PATATOUKAS P N,et al.On the capital market consequences of alternative data:evidence from outer space[J].Social Science Research Network,2021.
- [62]CHRIST M H,EMETT S A,SUMMERS S L,et al.Prepare for takeoff:improving asset measurement and audit quality with drone-enabled inventory audit procedures[J].Review of Accounting Studies,2021,26:1323-1343.
- [63]EULERICH M,PAWLOWSKI J,WADDOUPS N J,et al.A framework for using robotic process automation for audit tasks[J].Contemporary Accounting Research,2022,39:691-720.
- [64]CHEN X,CHO Y H,DOU Y,et al.Predicting future earnings changes using machine learning and detailed financial data[J].Journal of Accounting Research,2022,60:467-515.
- [65]CAO K,YOU H.Fundamental analysis via machine learning[J].Social Science Research Network,2021.
- [66]VAN BINSBERGEN J H,HAN X,LOPEZ-LIRA A.Textual analysis of short-seller research reports,stock prices,and real investment[J].Jacobs Levy Equity Management Center for Quantitative Financial Research Paper,2021.
- [67]BINZ O,SCHIPPER K,STANDRIDGE K.What can analysts learn from artificial intelligence about fundamental analysis?[J].Social Science Research Network,2022.
- [68]NISSIM D,PENMAN S H.Ratio analysis and equity valuation:from research to practice[J].Review of Accounting Studies,2001,6:109-154.
- [69]PEROLS J,BOWEN R,ZIMMERMANN C,et al.Finding needles in a haystack:using data analytics to improve fraud prediction[J].The Accounting Review,2017,92:221-245.
- [70]BAO Y,KE B,LI B,et al.Detecting accounting fraud in publicly traded U.S.firms using a machine learning approach[J].Journal of Accounting Research,2020,58:199-235.
- [71]CHINCO A,CLARK-JOSEPH A D,YE M.Sparse signals in the cross-section of returns[J].Journal of Finance,2019,74:449-492.
- [72]MURRAY S,XIAO H,XIA Y.Charting by machines[J].Social Science Research Network,2021.
- [73]JIANG J,KELLY B,XIU D.(Re-)imag(in)ing price trends[J].Social Science Research Network,2020.
- [74]LIGHT N,MASLOV D,RYTCHKOV O.Aggregation of information about the cross section of stock returns:a latent variable approach[J].Review of Financial Studies,2017,30:1339-1381.
- [75]WOLD H.Soft modelling by latent variables:the non-linear iterative partial least squares(NIPALS) approach[J].Journal of Applied Probability,1975,12:117-142.
- [76]WOLD H.Soft modeling:the basic design and some extensions[J].Systems Under Indirect Observation,1982,2:343.
- [77]GU S,KELLY B,XIU D.Empirical asset pricing via machine learning[J].Review of Financial Studies,2020,33:2223-2273.
- [78]HARVEY C R,LIU Y,ZHU H.… and the cross-section of expected returns[J].Review of Financial Studies,2016,29:5-68.
- [79]GREEN J,HAND J R M,ZHANG X F.The characteristics that provide independent information about average U.S.Monthly stock returns[J].The Review of Financial Studies,2017,30:4389-4436.
- [80]HOU K,XUE C,ZHANG L.Replicating anomalies[J].Review of Financial Studies,2020,33:2019-2133.
- [81]HOROWITZ J L.Variable selection and estimation in high‐dimensional models[J].Canadian Journal of Economics,2015,48:389-407.
- [82]RAPACH D E,STRAUSS J K,ZHOU G.International stock return predictability:what is the role of the United States?[J].Journal of Finance,2013,68:1633-1662.
- [83]RAPACH D E,STRAUSS J K,TU J.Industry return predictability:a machine learning approach[J].The Journal of Financial Data Science,2019,1:9-28.
- [84]HUANG J Z,SHI Z.Machine-learning-based return predictors and the spanning controversy in macro-finance[J].Management Science,2022.
- [85]FREYBERGER J,NEUHIERL A,WEBER M.Dissecting characteristics nonparametrically[J].Review of Financial Studies,2020,33:2326-2377.
- [86]LEIPPOLD M,WANG Q,ZHOU W.Machine learning in the Chinese stock market[J].Journal of Financial Economics,2021,145(2):64-82.
- [87]RASEKHSCHAFFE K C,JONES R C.Machine learning for stock selection[J].Financial Analysts Journal,2019,75:70-88.
- [88]ROSS S A.The arbitrage theory of capital asset pricing[J].Journal of Economic Theory,1976,13:341-360.
- [89]CHEN L,PELGER M,ZHU J.Deep learning in asset pricing[J].Social Science Research Network,2021.
- [90]CHAMBERLAIN G,ROTHSCHILD M.Arbitrage,factor structure,and mean-variance analysis on large asset markets[J].Econometrica,1983,51:1281-1304.
- [91]CONNOR G,KORAJCZYK R A.Risk and return in an equilibrium APT[J].Journal of Financial Economics,1988,21:255-289.
- [92]LETTAU M,PELGER M.Factors that fit the time series and cross-section of stock returns[J].The Review of Financial Studies,2020,33:2274-2325.
- [93]KELLY B T,PRUITT S,SU Y.Characteristics are covariances:a unified model of risk and return[J].Journal of Financial Economics,2019,134:501-524.
- [94]KELLY B T,PALHARES D,PRUITT S.Modeling corporate bond returns[J].Social Science Research Network,2021.
- [95]BÜCHNER M,KELLY B.A factor model for option returns[J].Journal of Financial Economics,2022,143:1140-1161.
- [96]GU S,KELLY B,XIU D.Autoencoder asset pricing models[J].Journal of Econometrics,2021,222:429-450.
- [97]KOZAK S,NAGEL S,SANTOSH S.Shrinking the cross-section[J].Journal of Financial Economics,2020,135:271-292.
- [98]HANSEN L P,JAGANNATHAN R.Implications of security market data for models of dynamic economies[J].Journal of Political Economy,1991,99:225-262.
- [99]KELLY B T,PRUITT S.Market expectations in the cross-section of present values[J].Journal of Finance,2013,68:1721-1756.
- [100]DONG X,LI Y,RAPACH D E,et al.Anomalies and the expected market return[J].Journal of Finance,2022,77:639-681.
- [101]MARKOWITZ H.Portfolio selection[J].The Journal of Finance,1952,7:77-91.
- [102]MARKOWITZ H.The optimization of a quadratic function subject to linear constraints[J].Naval Research Logistics,1956,3:111-133.
- [103]MARKOWITZ H M.Portfolio Selection:Efficient Diversification of Investment[J].The Journal of Finance,1959,15(3).
- [104]DEMIGUEL V,GARLAPPI L,UPPAL R.Optimal versus naive diversification:how inefficient is the 1/N portfolio strategy?[J].Review of Financial Studies,2009,22:1915-1953.
- [105]ELAVIA T,KOTHARI S P,LI X,et al.Gains from Markowitz optimization:evidence from reoptimization of mutual fund holdings[J].The Journal of Portfolio Management,2022,48:199-218.
- [106]AO M,LI Y,ZHENG X.Approaching mean-variance efficiency for large portfolios[J].The Review of Financial Studies,2019,32:2890-2919.
- [107]CONG L,TANG K,WANG J,et al.AlphaPortfolio for investment and economically interpretable AI[J].Social Science Research Network,2022.
- [108]KLEINBERG J,LAKKARAJU H,LESKOVEC J,et al.Human decisions and machine predictions[J].Quarterly Journal of Economics,2018,133:237-293.
- [109]FUSTER A,GOLDSMITH-PINKHAM P,RAMADORAI T.Predictably unequal?The effects of machine learning on credit markets[J].Journal of Finance,2022,77:5-47.
- [110]FUSTER A,PLOSSER M,SCHNABL P,et al.The role of technology in mortgage lending[J].Review of Financial Studies,2019,32:1854-1899.
- [111]DOBBIE W,LIBERMAN A,PARAVISINI D,et al.Measuring bias in consumer lending[J].Review of Economic Studies,2021,88:2799-2832.
- [112]JANSEN M,NGUYEN H,SHAMS A.Human vs.machine:underwriting decisions in finance[J].Social Science Research Network,2021.
- [113]LIU M.Assessing human information processing in lending decisions:a machine learning approach[J].Journal of Accounting Research,2022,60:607-651.
- [114]TANTRI P.FinTech for the poor:financial intermediation without discrimination[J].Review of Finance,2021,25:561-593.
- [115]LIU Y,LI X,ZHENG E.The mercy of AI:combating natural disasters through lending[J].Social Science Research Network,2022.
- [116]DING K,LEV B,PENG X,et al.Machine learning improves accounting estimates:evidence from insurance payments[J].Review of Accounting Studies,2020,25:1098-1134.
- [117]COLEMAN B,MERKLEY K J,PACELLI J.Human versus machine:a comparison of robo-analyst and traditional research analyst investment recommendations[J].The Accounting Review,2022.
- [118]BLANKESPOOR E,DEHAAN E,ZHU C.Capital market effects of media synthesis and dissemination:evidence from robojournalism[J].Review of Accounting Studies,2018,23:1-36.
- [119]CARDINAELS E,HOLLANDER S,WHITE B J.Automatic summarization of earnings releases:attributes and effects on investors' judgments[J].Review of Accounting Studies,2019,24:860-890.
- [120]COSTELLO A M,DOWN A K,MEHTA M N.Machine+man:a field experiment on the role of discretion in augmenting AI-based lending models[J].Journal of Accounting and Economics,2020,70:101360.
- [121]CAO S,JIANG W,WANG J,et al.From man vs.machine to man+machine:the art and AI of stock analyses[J].Social Science Research Network,2022.
- [122]XIE Y,WANG D,CHEN P Y,et al.A word is worth a thousand dollars:adversarial attack on tweets fools stock prediction[EB/OL].(2022-07-12)[2023-03-15].https://arxiv.org/abs/2205.01094.
- [123]ÖNKAL D,GOODWIN P,THOMSON M,et al.The relative influence of advice from human experts and statistical methods on forecast adjustments[J].Journal of Behavioral Decision Making,2009,22:390-409.
- [124]DIETVORST B J,SIMMONS J P,MASSEY C.Algorithm aversion:people erroneously avoid algorithms after seeing them err[J].Journal of Experimental Psychology:General,2015,144:114-126.
- [125]COMMERFORD B P,DENNIS S A,JOE J R,et al.Man versus machine:complex estimates and auditor reliance on artificial intelligence[J].Journal of Accounting Research,2022,60:171-201.
- [126]CASTELO N,BOS M W,LEHMANN D R.Task-dependent algorithm aversion[J].Journal of Marketing Research,2019,56:809-825.
- [127]YEOMANS M,SHAH A,MULLAINATHAN S,et al.Making sense of recommendations[J].Journal of Behavioral Decision Making,2019,32:403-414.
- [128]DIETVORST B J,BHARTI S.People reject algorithms in uncertain decision domains because they have diminishing sensitivity to forecasting error[J].Psychological Science,2020,31:1302-1314.
- [129]CHUANG C,YANG Y.Buy Tesla,sell Ford:assessing implicit stock market preference in pre-trained language models[C]//ACL 2022.ACL,2022:100-105.
- (1) In this chapter we focus on machine learning algorithms; for a broader review of textual analysis, see Bochkay et al. [5].
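- (2) Most word lists are built by researchers through manual annotation, but recent studies also use machine learning algorithms to build them, which lowers cost and improves objectivity. For example, Campbell et al. [11] use topic modeling to build dictionaries for risk factor categories, and Li et al. [12,13] and Iwasaki et al. [14] use word embeddings to build dictionaries for corporate culture, COVID-19, and sentiment, respectively (see Section 1.2).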
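- (3) An alternative to manual annotation is to use "naturally occurring" labels, an approach often called "weakly supervised learning". It costs less than manual annotation but is also less accurate, because "naturally occurring" labels tend to be noisy.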
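- (4) The Merriam-Webster dictionary and the Oxford English Dictionary contain roughly 500,000 English entries, and even domain-specific word lists (such as the LM dictionary) contain up to several thousand words.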
- (5) For example, Word2Vec uses Wikipedia as its training dataset [23].
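- (6) Word2Vec's other objective function predicts the surrounding words from the center word. Similarly, GloVe also uses word co-occurrence, but at a global level, that is, using all words in the text rather than only the words around the center word [24].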
- (7) For example, the word "bank" gets different vectors in the sentence "I accessed the bank account" and in the sentence "I walked along the bank of the river".
- (8) For example, Google's original BERT model has 110 million parameters, while OpenAI's GPT-3 has 175 billion parameters.
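- (9) Another prominent use of AI in auditing is robotic process automation (RPA), which uses autonomous computer programs to automate repetitive, routine business processes. Eulerich et al. [63] discuss how internal and external auditors use RPA to reduce costs and expand audit coverage. RPA can use machine learning algorithms, including image recognition (typically with CNNs) and natural language processing and generation.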
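- (10) The adaptive group LASSO differs from the LASSO in several ways. First, expected returns are modeled as a sum of nonparametric functions of individual stock characteristics, approximated by quadratic splines rather than by a linear function as in the LASSO. Second, the group LASSO penalizes all coefficients associated with a given characteristic (group) rather than individual coefficients. Third, the adaptive step re-optimizes a modified loss function that re-weights the regularization coefficients by group, so that only the characteristics retained after the second step are used.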
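- (11) Combining ■ with the historical average of ■, or with another reasonable proxy for the latent factor premia, yields an estimate of the conditional expectation of asset returns. As discussed by Kelly et al. [94] and Büchner and Kelly [95], IPCA can also be applied to other asset classes, such as corporate bonds and options.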
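
Note (6) describes the Word2Vec objective that predicts surrounding words from the center word (commonly called skip-gram). A minimal sketch of how the training pairs for such an objective could be generated is given below; the function name `skipgram_pairs`, the `window` parameter, and the toy sentence are illustrative assumptions, not part of Word2Vec's actual implementation.

```python
from typing import List, Tuple


def skipgram_pairs(tokens: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Return (center, context) pairs for each token and its neighbours
    within `window` positions on either side (hypothetical helper)."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


# Toy sentence borrowed from note (7); lower-cased and split on whitespace.
sentence = "i walked along the bank of the river".split()
pairs = skipgram_pairs(sentence, window=1)
print(pairs[:3])
```

Each pair would serve as one (input, target) example for a model trained to predict the context word from the center word. GloVe, by contrast, would aggregate co-occurrence counts over the whole corpus before fitting.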
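
Note (10) says the group LASSO penalizes all coefficients tied to one characteristic together. The sketch below illustrates that idea with the standard group penalty (the sum of per-group Euclidean norms) and the matching block soft-thresholding step. It assumes equal group weights and omits the adaptive re-weighting step, and the group names and coefficient values are made up for illustration.

```python
import math
from typing import Dict, List


def group_norm(coefs: List[float]) -> float:
    """Euclidean norm of one group's coefficient vector."""
    return math.sqrt(sum(b * b for b in coefs))


def group_lasso_penalty(groups: Dict[str, List[float]], lam: float) -> float:
    """lam times the sum of group norms (equal group weights assumed)."""
    return lam * sum(group_norm(c) for c in groups.values())


def group_soft_threshold(groups: Dict[str, List[float]], lam: float) -> Dict[str, List[float]]:
    """Shrink each group toward zero as a block; a group whose norm is
    at most lam is set to zero entirely."""
    out = {}
    for name, coefs in groups.items():
        norm = group_norm(coefs)
        scale = max(0.0, 1.0 - lam / norm) if norm > 0 else 0.0
        out[name] = [scale * b for b in coefs]
    return out


# Hypothetical spline coefficients for two characteristics.
groups = {"momentum": [3.0, 4.0], "size": [0.3, 0.4]}
penalty = group_lasso_penalty(groups, lam=1.0)
shrunk = group_soft_threshold(groups, lam=1.0)
print(penalty, shrunk)
```

In this toy case the "size" group has norm 0.5, below the threshold, so both of its coefficients are dropped together. The plain LASSO, in contrast, would threshold each coefficient separately.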
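
The symbols in note (11) were lost in extraction. As a hedged illustration only, the IPCA model of Kelly et al. [93] is commonly written as below. The notation ($z_{i,t}$ for characteristics, $\Gamma_\beta$ for the loading matrix, $f_t$ for latent factors) is an assumption borrowed from that literature, not a recovery of the missing symbols.

```latex
\begin{aligned}
r_{i,t+1} &= \beta_{i,t}\, f_{t+1} + \varepsilon_{i,t+1},
  \qquad \beta_{i,t} = z_{i,t}^{\top}\,\Gamma_{\beta}, \\
\widehat{\mathbb{E}}_t\!\left[r_{i,t+1}\right]
  &= z_{i,t}^{\top}\,\widehat{\Gamma}_{\beta}\,\bar{f},
  \qquad \bar{f} = \frac{1}{T}\sum_{t=1}^{T}\widehat{f}_t .
\end{aligned}
```

Under this reading, the conditional expected return is the estimated conditional loading multiplied by a proxy for the factor premia, here the historical average of the estimated factors.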