Pretrained Large Models Based on a Device-Cloud Collaborative System and Their Deployment as Services
Authors: 杨洋; 况琨; 陈政聿; 孙逸飞; 方陶然; 张圣宇; 孙建凯; 杨鑫; 杨红霞; 吴飞
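Abstract:
The traditional cloud computing paradigm gathers all data centrally, trains large models in the cloud, and relies on cloud services to meet the diverse service demands of end devices. This paradigm suffers from high network latency, weak privacy and security, and high compute cost. In the era of "ubiquitous connectivity, mobile first, AI-empowered", machine learning must support a rich variety of on-device applications. A major current challenge is therefore to establish a device-cloud collaborative computing paradigm that provides both cloud services and on-device inference, drives the co-evolution of cloud models and device models, makes the leap from cloud computing and on-device intelligence to a device-cloud co-evolutionary computing mode, and makes the best use of the computing resources in the cloud, on devices, and along the device-cloud link. This paper discusses pretrained large models in the cloud, the device-cloud collaborative system, the serving and privacy protection of pretrained large models built on that system, and future challenges.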
Keywords: device-cloud collaboration; pretrained large models; large-model serving; privacy protection