A Survey of Large AI Models and Prospects for Their Financial Applications
Liu Anping, Jin Xin, Hu Guoqiang
Abstract:
With the continued development of artificial intelligence, deep learning has come to be characterized by "big compute, big data, and big models," and large models trained with massive computing power and datasets are now widely applied in production and everyday life. Compared with large models, the more widely used traditional deep learning approaches suffer from a shortage of annotated corpora, insufficient generality and generalization, and excessive system complexity, and these problems are especially pronounced in the financial industry. To address them, academia and industry have introduced large model technology to further advance the large-scale application of AI; with the recent explosion of generative large models, large models have become a major trend in AI development. The technology is characterized by low demand for annotated data, fast model building, general-purpose capability, and a standardized development process. Starting from the development of deep learning, this paper reviews the history and current state of large model technology, analyzes its application scenarios, value, and far-reaching impact, discusses its future evolution, and presents its concrete application prospects in the financial industry.
Keywords: large models; generative AI; artificial general intelligence; financial applications of large models
DOI: 10.16453/j.2096-5036.2023.02.003
- (1) OpenAI is an AI research company whose founding was funded by a group led by Elon Musk; it is dedicated to research on artificial general intelligence. Microsoft has reportedly planned to invest 10 billion US dollars for a 49% stake in OpenAI. (2) The Transformer uses a parallel computation pattern: provided the training data are rich enough, a single model can combine single-task proficiency with multi-task general-purpose ability. By contrast, traditional RNNs (recurrent neural networks) compute serially over time steps, which limits parallelism, and neither RNNs nor conventional CNNs (convolutional neural networks) scale well to very deeply stacked structures.
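The contrast drawn in note (2) can be sketched in a few lines of NumPy. The dimensions and random weight matrices below are purely illustrative, not taken from any real model: self-attention produces outputs for all sequence positions in one batched matrix computation, whereas an RNN must advance one time step at a time because each hidden state depends on the previous one.

```python
import numpy as np

rng = np.random.default_rng(0)
T, d = 5, 8                        # sequence length, hidden size (toy values)
X = rng.standard_normal((T, d))    # toy token embeddings

# Transformer-style self-attention: every position attends to every
# other position in one batched matrix computation -- no per-step loop.
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
Q, K, V = X @ Wq, X @ Wk, X @ Wv
scores = Q @ K.T / np.sqrt(d)
weights = np.exp(scores - scores.max(axis=1, keepdims=True))
weights /= weights.sum(axis=1, keepdims=True)   # softmax over each row
attn_out = weights @ V             # all T outputs computed at once

# Simple RNN: the hidden state at step t depends on step t-1, so the
# T steps must run one after another and cannot be parallelized.
Wh, Wx = rng.standard_normal((d, d)), rng.standard_normal((d, d))
h = np.zeros(d)
for t in range(T):                 # inherently sequential loop
    h = np.tanh(h @ Wh + X[t] @ Wx)

print(attn_out.shape, h.shape)     # (5, 8) (8,)
```

The attention path is one fixed-depth chain of matrix products regardless of sequence length, which is what lets Transformers exploit modern accelerators; the RNN loop grows linearly in sequential depth with T.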