A Survey of the Development of Deep Learning Technology and Platforms
Dianhai Yu; Tian Wu
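Abstract:
Deep learning is the key core technology behind the recent breakthroughs in artificial intelligence. It has driven leapfrog progress in speech, vision, and natural language technologies and accelerated the industrial deployment of artificial intelligence. As the underlying infrastructure of deep learning, deep learning frameworks and platforms have played an important role in the rapid development of deep learning technology and its applications. This paper first introduces the basic concepts of deep learning and outlines its history, then describes its core technical characteristics, major advances, and current trends, then reviews the state of development of deep learning frameworks and platforms, and finally surveys the applications of deep learning technology.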
Authors: Dianhai Yu; Tian Wu
DOI: 10.16453/j.cnki.issn2096-5036.2020.03.001