Accelerating Inference in the Data Center
1 .Accelerating Inference in the Data Center Dr. Malini Bhandaru & Karol Zalewski Contributors: Santiago Mok, Konrad Kurdej, Sundar Nadathur, Alexander Kanevskiy, Ismo Puustinen Intel #HWCSAIS11
2 .Autonomous Vehicles – R & D Data Pressure
• 1-20 TB/car/hour
• Depends on the number of cameras, resolution, other sensor arrays
Image Credit:
https://clepa.eu/mediaroom/autonomous-vehicles-will-drive-change-auto-manufacturing-insurance/
https://ia.acs.org.au/article/2017/who-should-the-driverless-car-kill-.html
3 .Inference Everywhere – Faster Please!
• Speed ground truth generation – Human improves upon automated
• Speed privacy transformations – Face/license plate obscuring
• Speed simulation – Detect (edge-ish), Plan, Act
https://medium.com/@xslittlegrass/self-driving-car-in-a-simulator-with-a-tiny-neural-network-13d33b871234
4 .Compute Continuum
• Devices: CPUs, GPUs, FPGAs, ASICs, Movidius
• Range: Flexible, Slower ↔ Fixed, Faster
• Can Spark leverage them? Easily?
5 .FPGA and Movidius Chip
FPGA
• Logic blocks, memory, security, variable sizes
• Programmable, OpenCL
• Fast but Expensive
• Applications: Networking, Telecommunication, Research, Machine Learning
Movidius Chip
• Programmable, SDK
• Low Power
• Tuned for image processing
• Fast, Inexpensive
• Applications: Drones, Cameras, Augmented Reality
6 .Data Center Platform
Goals
• Fungible
• Dynamic
• Resilient
• Easy to Use
• Fast
Layers
• Storage: HDFS, Ceph, MySQL, S3
• Hetero Hardware (with Drivers): CPUs, GPUs, FPGAs, Programmable Inference Chips
• Orchestrator: YARN, Kubernetes
• Stacks: Spark, Hadoop, ..
• AI Frameworks: TensorFlow, Caffe2, ..
• Containers
• Glue Technologies: Kafka, Oozie, Argo
7 .Environment
• Kubernetes – resilient, auto scaling, easy to use
• Spark – big data in-memory processing, possible data locality
8 .Kubernetes Device Plugin
Enables use of new resources
[Diagram labels. MASTER: API Server (Authentication, Authorization, Admission Control), etcd, Controller Manager, Scheduler. NODE: Kubelet, CRI, CRI shim, Container Runtime, Device Plugin API, Vendor Device Plugin, Device Driver. Legend: Core Components, Extensions/Plugins, Device-Specific Software.]
https://kubernetes.io/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins/
New Work for Device Plugins
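Once a device plugin advertises a device to the kubelet, a pod consumes it by requesting the corresponding extended resource in its container limits. The sketch below builds such a pod manifest as a Python dictionary. The resource name `example.com/fpga`, the pod name, and the image are hypothetical placeholders, not names used in the talk; the real resource name is whatever the vendor plugin registers.

```python
import json


def accelerator_pod(name, image, resource_name, count=1):
    """Build a Pod manifest that requests `count` units of an extended resource."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # Extended resources exposed by a device plugin are
                    # requested under `limits`, in whole units.
                    "resources": {"limits": {resource_name: count}},
                }
            ],
        },
    }


# Hypothetical names, for illustration only.
pod = accelerator_pod(
    name="squeezenet-inference",
    image="example.com/inference:latest",
    resource_name="example.com/fpga",
)
manifest = json.dumps(pod, indent=2)
print(manifest)
```

The printed JSON can be saved to a file and submitted with `kubectl apply -f`; the scheduler then places the pod only on a node that reports a free unit of that resource.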
9 .Experiment
• SqueezeNet 1.1
• gRPC calls 3-4 ms
• Data pre-processing 16-30 ms
10 .FPGA Inference
Supported Deep Learning Topologies
• AlexNet
• GoogleNet v1
• VGG-16 & VGG-19
• SqueezeNet 1.0 & 1.1
• ResNet-18
• SqueezeNet-based variant of SSD
• GoogleNet-based variant of SSD
• VGG-based variant of SSD
Considerations
• Model Size
• FPGA Size
• Trade-off
– Model accuracy, speed
– Compile to target hardware
11 .Movidius USB Learnings & Workarounds
Common Paradigm: TensorFlow Serving
USB access from a container
• Access to host network (isolation loss): `--net=host`
• Visibility into Device Manager events in Docker environment: `libusb`
• Privilege Escalation (insecure): `--privileged`
• Access to Virtual File System to access USB device from within container: `-v /dev:/dev`
Learnings & Workarounds
• No Python support - loss of data locality
• Model-as-a-service
• Movidius NCSDK2 – resolves some issues
• Feedback to Movidius team
• Service running on bare metal
• Movidius PCIe device coming soon! USB-related issues moot
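The three Docker options named on this slide can be assembled into a single `docker run` invocation. The sketch below only builds and prints the command line; it does not start a container. The image name and the service command are hypothetical placeholders, and the flag combination is the insecure workaround the slide describes, not a recommended configuration.

```python
import shlex


def movidius_docker_command(image, command=()):
    """Return the argv list for running a container with USB-device access."""
    return [
        "docker", "run",
        "--net=host",        # share the host network (loses network isolation)
        "--privileged",      # privilege escalation (insecure)
        "-v", "/dev:/dev",   # expose host device nodes inside the container
        image,
        *command,
    ]


# Hypothetical image and entry point, for illustration only.
cmd = movidius_docker_command("example/ncs-inference:latest", ["./inference-service"])
print(shlex.join(cmd))
```

Keeping the command as a list makes it safe to hand to `subprocess.run(cmd)` without shell quoting problems.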
12 .Movidius Next
• SDK2 just released
– Up to 10 models may co-exist on one device
– FIFO queue
– 32-bit floating point
• Chip-2 coming soon – at least an order of magnitude faster
https://developer.movidius.com/start
https://github.com/movidius/ncsdk
https://github.com/kzzalews/sparkaisummit_movidius
13 .Results
Software Tools
• CPU: CentOS 7.4
• FPGA: Intel Acceleration Stack 1.0, Intel OpenVINO Toolkit with FPGA Support
• Movidius: SDK 1
Hardware
• CPU: Intel Xeon CPU E5-1650 v2 @ 3.50GHz
• FPGA: Arria 10 GX (1150K logic elements, 8GB DDR4, PCIe Gen3)
• Movidius: Movidius
Inference Time/image
• CPU: 7.5 ms
• FPGA: 3.2 ms
• Movidius: 34 ms
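The per-image inference times can be turned into throughput and relative speedup with simple arithmetic. The sketch below does that, and also estimates an end-to-end latency range by adding the gRPC (3-4 ms) and pre-processing (16-30 ms) costs from the Experiment slide. Two assumptions are mine, not the talk's: images are processed one at a time on a single device with no batching, and the overheads are not already included in the reported inference times and add serially.

```python
# Inference time per image in milliseconds, from the Results slide.
INFERENCE_MS = {"CPU": 7.5, "FPGA": 3.2, "Movidius": 34.0}

# Overheads in milliseconds (low, high), from the Experiment slide.
GRPC_MS = (3.0, 4.0)
PREPROCESS_MS = (16.0, 30.0)


def images_per_second(ms_per_image):
    """Throughput of one device processing images serially."""
    return 1000.0 / ms_per_image


def speedup_over(baseline, target, table=INFERENCE_MS):
    """How many times faster `target` is than `baseline`."""
    return table[baseline] / table[target]


def end_to_end_ms(device, table=INFERENCE_MS):
    """(low, high) latency, ASSUMING overheads add serially to inference."""
    low = GRPC_MS[0] + PREPROCESS_MS[0] + table[device]
    high = GRPC_MS[1] + PREPROCESS_MS[1] + table[device]
    return low, high


for device, ms in INFERENCE_MS.items():
    low, high = end_to_end_ms(device)
    print(f"{device}: {images_per_second(ms):.1f} images/s, "
          f"end-to-end {low:.1f}-{high:.1f} ms (assumed serial overheads)")

print(f"FPGA vs CPU speedup: {speedup_over('CPU', 'FPGA'):.2f}x")
```

Under these assumptions the pre-processing cost is larger than the CPU or FPGA inference time itself, so speeding up inference alone shortens the end-to-end latency by less than the raw speedup suggests.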
14 .Demo
https://videoportal.intel.com/media/0_selfn06l
15 .Future Work
• Kubernetes Device Manager support for Movidius
• Explore native Spark support for Movidius
• Kubernetes/Spark Scheduler Enhancements
– Wait for HW or launch anywhere?
– Speed, power, and latency implications
– Targeted models
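The "wait for HW or launch anywhere?" question can be framed as a break-even comparison on completion time. The sketch below is one hypothetical decision rule, not a scheduler enhancement proposed in the talk: it waits for the accelerator only when the expected queueing delay plus accelerated run time beats running immediately on the fallback device. It uses the FPGA and CPU per-image times from the Results slide and ignores the power and per-request latency implications the slide also lists.

```python
def should_wait_for_accelerator(n_images, accel_ms, fallback_ms, expected_wait_ms):
    """True if waiting for the accelerator finishes the batch sooner."""
    accel_total = expected_wait_ms + n_images * accel_ms
    fallback_total = n_images * fallback_ms
    return accel_total < fallback_total


def break_even_wait_ms(n_images, accel_ms, fallback_ms):
    """Longest wait (ms) for which the accelerator is still the faster choice."""
    return n_images * (fallback_ms - accel_ms)


# Per-image times from the Results slide: FPGA 3.2 ms, CPU 7.5 ms.
FPGA_MS, CPU_MS = 3.2, 7.5

print(break_even_wait_ms(1000, FPGA_MS, CPU_MS))                 # about 4300 ms
print(should_wait_for_accelerator(1000, FPGA_MS, CPU_MS, 2000))  # wait is worthwhile
print(should_wait_for_accelerator(1000, FPGA_MS, CPU_MS, 5000))  # run on CPU instead
```

The break-even wait grows linearly with the job size, so under this rule large jobs justify waiting for scarce hardware while small jobs are better launched anywhere.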
16 .Conclusion
• FPGA support more mature
• Give Movidius a try, delightful at its price point!!
https://developer.movidius.com/start
https://github.com/movidius/ncsdk
https://github.com/kzzalews/sparkaisummit_movidius
17 .References
Kubernetes Device Plugin:
• https://kubernetes.io/docs/tasks/manage-gpus/scheduling-gpus/
• https://kubernetes.io/docs/concepts/cluster-administration/device-plugins/
• https://github.com/kubernetes/community/blob/master/contributors/design-proposals/resource-management/device-plugin.md
FPGAs and the Movidius Chip:
• https://venturebeat.com/2018/02/27/intel-makes-it-easier-to-bring-movidius-ai-accelerator-chip-into-production/
• https://newsroom.intel.com/editorials/introducing-myriad-x-unleashing-ai-at-the-edge/
• https://www.altera.com/products/fpga/stratix-series/stratix-10/overview.html
• https://medium.com/@xslittlegrass/self-driving-car-in-a-simulator-with-a-tiny-neural-network-13d33b871234
SparkCL: A Unified Programming Framework for Accelerators on Heterogeneous Clusters:
• https://arxiv.org/ftp/arxiv/papers/1505/1505.01120.pdf
18 .Thank You! Karol.Zalewski@intel.com Malini.K.Bhandaru@intel.com