Building Resilient and Scalable Data Pipelines by Decoupling Compute and Storage
1. Building Resilient and Scalable Data Pipelines by Decoupling Compute and Storage. Ivan Jibaja, Software Engineer. © 2018 PURE STORAGE INC. PURE PROPRIETARY
2. Our Log Analytics Pipeline in Numbers • 1.5 - 2M events / second • 0.5 - 1 PB of data / day • 5-second SLA • Six 9s of reliability
3. Data Pipeline - Early Stages [Diagram: 1,000+ VMs, 20,000+ tests, 100+ FBs, 400+ clients, rsyslog, 10+ Jenkins, custom code]
4. Data Pipeline - Now [Diagram: 120,000+ tests / day, 2,500+ VMs, 350+ FBs, 1,000+ clients, rsyslog, 20+ Jenkins] • Duplicate bug • Infrastructure failure • Performance regression • Low-level details • Easy-to-read graphs
5. Reliability, Scalability, Flexibility
6. Software Crashes: You need to be able to restart each stage of your pipeline without affecting correctness. Approach: Idempotency
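As an illustration of the idempotency point on this slide, here is a minimal sketch of a stage that can be replayed from the start after a crash without changing its output. It is not taken from the talk: `event_key`, `KeyedSink`, and `run_stage` are hypothetical names, and the content-hash keying scheme is an assumption chosen for the example.

```python
import hashlib
import json


def event_key(event):
    """Derive a deterministic key from the event content (assumed scheme)."""
    payload = json.dumps(event, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


class KeyedSink:
    """Hypothetical sink in which every write is an upsert keyed by event_key."""

    def __init__(self):
        self.rows = {}

    def write(self, event):
        # Writing the same event twice overwrites the same row,
        # so a replay cannot create duplicates.
        self.rows[event_key(event)] = event


def run_stage(events, sink):
    """Process a batch; re-running it after a crash leaves the sink unchanged."""
    for event in events:
        sink.write(event)
```

Running `run_stage` twice over the same batch leaves the sink with one row per distinct event, which is the property that makes a blind restart safe.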
7. Growth: Each stage of your pipeline may grow at a different speed. Approach: Orchestration
8. Efficiency and Flexibility: 1. There are application stacks to solve every kind of problem, and they are easy to set up. 2. Application silos are inefficient and increase operational cost. 3. Scale may require re-architecting a given stage. Approach: Decouple compute and storage
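One way to picture decoupled compute and storage is a stateless worker whose only state (its checkpoint and its output) lives in shared storage, so any replacement worker can pick up where the last one stopped. The sketch below is an assumption-laden illustration, not the pipeline described in the talk: `SharedStore` is an in-memory stand-in for a shared storage system, and `process_batch` and the key names are hypothetical.

```python
class SharedStore:
    """In-memory stand-in for shared storage (hypothetical)."""

    def __init__(self):
        self.blobs = {}

    def get(self, key, default=None):
        return self.blobs.get(key, default)

    def put(self, key, value):
        self.blobs[key] = value


def process_batch(store, source, batch_size):
    """Run one unit of work; the worker itself keeps no state between calls."""
    start = store.get("checkpoint", 0)
    batch = source[start:start + batch_size]
    for offset, line in enumerate(batch, start=start):
        # Output is keyed by offset, so redoing a batch rewrites the same keys.
        store.put(f"out/{offset}", line.upper())
    # The checkpoint advances only after the whole batch has been written.
    store.put("checkpoint", start + len(batch))
    return len(batch)
```

Because the checkpoint is read from the store on every call, a second worker on a different machine can call `process_batch` with the same store and continue from the recorded offset; compute can be added, removed, or replaced without moving data.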
9. Technologies we use • Docker: Containers • Nomad: Orchestration • Prometheus: Monitoring • Grafana: Dashboards • Consul: Service discovery • Chef: Container build • Jenkins: Continuous Integration • Kafka Manager: Kafka Interface • Artifactory: Image repository • Ansible: Configuring servers
10. Takeaways • Reliability: Idempotency • Scalability: Orchestration • Flexibility and Efficiency: Decoupled compute and storage
11. Questions?