When a Distributed Database Meets the Cloud: Lessons Learned
1 .When Distributed Database Meets Cloud: Lessons Learned. Yanqing Weng (iweng@pivotal.io). © Copyright 2017 Pivotal Software, Inc. All Rights Reserved. Version 1.0
2 .About me ■ Principal Software Engineer at Pivotal ■ Apache HAWQ Committer ■ Apache HAWQ PMC Member
3 .Agenda ■ Introduction to Distributed Database ■ Distributed Database on Cloud ■ Lessons Learned ■ Q&A
4 .Distributed Database
5 .Apache HAWQ ■ Apache Hadoop Native SQL: advanced, MPP, elastic query engine ■ Apache Top-Level Project in 2018.8
6 .Apache HAWQ Architecture
7 .Apache HAWQ Query Processing ■ Slice: 1. a portion of the plan that segments can work on independently. 2. a query plan is sliced wherever a motion operation occurs in the plan. ■ Motion: 1. an operation that moves tuples between the segments during query processing. 2. three types: redistribution, broadcast, and gather motion. ■ Virtual Segment: 1. a resource unit for QD and Resource Manager 2. an execution unit 3. the VSEG number dynamically determines the degree of parallelism of a query. ■ Example query: SELECT COUNT(*) FROM lineitem, part WHERE p_partkey=l_partkey AND p_brand = 'Brand#23'
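The slicing rule on this slide (cut the plan wherever a motion occurs) can be sketched in a few lines of Python. This is an illustration only, not HAWQ's planner: the `PlanNode` class, the `slice_plan` function, and the plan shape used for the example query are all assumptions, and the motion node is arbitrarily kept in the receiving (parent) slice.

```python
from dataclasses import dataclass, field
from typing import List

# The three motion types named on the slide.
MOTIONS = {"Gather Motion", "Redistribute Motion", "Broadcast Motion"}


@dataclass
class PlanNode:
    """Hypothetical plan-tree node: an operator name plus child operators."""
    op: str
    children: List["PlanNode"] = field(default_factory=list)


def slice_plan(root: PlanNode) -> List[List[str]]:
    """Cut the plan at every motion node and return operator names per slice."""
    slices: List[List[str]] = [[]]

    def visit(node: PlanNode, slice_id: int) -> None:
        slices[slice_id].append(node.op)
        for child in node.children:
            if node.op in MOTIONS:
                # Everything below a motion runs as a separate slice.
                slices.append([])
                visit(child, len(slices) - 1)
            else:
                visit(child, slice_id)

    visit(root, 0)
    return slices


# Assumed plan shape for the example query (not actual EXPLAIN output).
plan = PlanNode("Aggregate", [
    PlanNode("Gather Motion", [
        PlanNode("Partial Aggregate", [
            PlanNode("Hash Join", [
                PlanNode("Seq Scan lineitem"),
                PlanNode("Hash", [
                    PlanNode("Broadcast Motion", [
                        PlanNode("Seq Scan part"),
                    ]),
                ]),
            ]),
        ]),
    ]),
])

slices = slice_plan(plan)
for i, ops in enumerate(slices):
    print(f"slice {i}: {ops}")
```

With two motions in the assumed plan, the sketch produces three slices; each slice is the unit that segments can work on independently.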
8 .Virtual Segment ■ Resource allocation unit ■ Query execution unit ■ Variable virtual segment number ■ Place on any physical segment
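Because a virtual segment can be placed on any physical segment and the number of virtual segments varies per query, placement reduces to spreading a requested count over whatever capacity is free. The sketch below assumes a simple least-loaded policy; `place_vsegs`, the host names, and the free-slot counts are hypothetical and are not taken from HAWQ's resource manager.

```python
from typing import Dict


def place_vsegs(num_vsegs: int, free_slots: Dict[str, int]) -> Dict[str, int]:
    """Assign each virtual segment to the host with the most free slots left."""
    if num_vsegs > sum(free_slots.values()):
        raise RuntimeError("not enough free slots for the requested vsegs")
    remaining = dict(free_slots)          # copy so the caller's dict is untouched
    placement = {host: 0 for host in free_slots}
    for _ in range(num_vsegs):
        host = max(remaining, key=lambda h: remaining[h])
        remaining[host] -= 1
        placement[host] += 1
    return placement


# Hypothetical cluster: three physical segments with different free capacity.
free = {"seg1": 4, "seg2": 2, "seg3": 3}
placement = place_vsegs(6, free)
print(placement)
```

A query that needs more parallelism simply asks for more virtual segments; nothing ties a virtual segment to a particular host.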
9 .Summary ■ High Performance ■ Separation of storage and computing ■ Fine-grained resource management ■ Elastic query execution engine ■ Stateless segment
10 .Cloud Database
11 .Requirement ■ Database as a Service ■ Efficient Resource Management ■ Infrastructure Agnostic ■ DBA Free
12 .Deployment & Operation ■ Container vs. Virtual Machine ■ Kubernetes ○ Service discovery ○ Load balancing ○ Horizontal and Vertical auto scaling ○ Rolling upgrade ○ Monitor and metrics collection ○ …...
13 .Architecture
14 .Architecture ● Storage Service ○ Cloud Storage, Amazon S3, Hadoop…... ○ Unified Cache Layer by Alluxio ● Computing Service ○ Shared Segment Pool ○ Global Resource Management Service ● Database Service ○ Master/Standby as Database ○ Get Segments for Query on Demand ● Control Plane ○ Operator/Controllers as DBA
15 .Apache HAWQ on Kubernetes
16 .Custom Resource
17 .Controller ■ HAWQ Operator ■ Resource Pool Controller ■ Resource Pool AutoScaler ■ Resource Recommender ■ Query Controller
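Kubernetes controllers follow a reconcile pattern: compare declared (desired) state with observed state and act on the difference. The sketch below shows how a Resource Pool AutoScaler might apply that pattern. It is a minimal illustration in plain Python, with no Kubernetes client; `PoolSpec`, `PoolStatus`, their fields, and the sizing rule are all assumptions, not the talk's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class PoolSpec:
    """Desired state, as a user might declare it (hypothetical fields)."""
    min_segments: int
    max_segments: int
    vsegs_per_segment: int


@dataclass
class PoolStatus:
    """Observed state reported by the cluster (hypothetical fields)."""
    running_segments: int
    requested_vsegs: int


def desired_segments(spec: PoolSpec, status: PoolStatus) -> int:
    """Segments needed to host the requested vsegs, clamped to the spec's bounds."""
    needed = -(-status.requested_vsegs // spec.vsegs_per_segment)  # ceiling division
    return max(spec.min_segments, min(spec.max_segments, needed))


def reconcile(spec: PoolSpec, status: PoolStatus) -> int:
    """Return the segment delta: positive to scale out, negative to scale in."""
    return desired_segments(spec, status) - status.running_segments


spec = PoolSpec(min_segments=2, max_segments=10, vsegs_per_segment=4)
print(reconcile(spec, PoolStatus(running_segments=3, requested_vsegs=22)))
print(reconcile(spec, PoolStatus(running_segments=3, requested_vsegs=0)))
```

In a real controller this function would run in a loop triggered by changes to the resource, and the delta would be applied by creating or deleting segment pods.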
18 .Apache HAWQ on Kubernetes
19 .Lessons Learned
20 . ■ Service Oriented Architecture ○ Monolithic → Micro Service Architecture ■ Resource Centric ○ Abstract Component as Resource ○ Service for Resource Usage ○ Controller for Resource Management
21 .Containerization ■ Container != Image ■ Container != VM ■ Container = Fine-grained Resource
22 .Resource Management ■ Traditional Database ○ Fixed resources ○ Balance resource usage among queries ■ Cloud Database ○ Dynamic resources ○ Maximize resource sharing ○ Maximize resource utilization for each query
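The contrast between fixed and dynamic resources can be made concrete with a toy allocator. This is a deliberately simplified sketch: both functions, the even-split policy, the proportional scale-down, and the core counts are assumptions chosen for illustration, not a description of how HAWQ or any specific database allocates resources.

```python
from typing import Dict


def fixed_allocation(total_cores: int, demands: Dict[str, int]) -> Dict[str, int]:
    """Fixed cluster: balance by splitting cores evenly, capped at each demand."""
    share = total_cores // max(len(demands), 1)
    return {query: min(demand, share) for query, demand in demands.items()}


def dynamic_allocation(max_cores: int, demands: Dict[str, int]) -> Dict[str, int]:
    """Elastic pool: provision what each query asks for, up to max_cores in total."""
    total = sum(demands.values())
    if total <= max_cores:
        return dict(demands)
    # Over the limit: scale every request down proportionally.
    return {query: demand * max_cores // total for query, demand in demands.items()}


demands = {"q1": 2, "q2": 12}           # hypothetical per-query core demands
fixed = fixed_allocation(8, demands)     # 8-core fixed cluster
dynamic = dynamic_allocation(32, demands)  # pool that may grow to 32 cores
print(fixed, dynamic)
```

In the fixed case the large query is capped at its even share while part of the small query's share sits idle; in the dynamic case each query receives what it requested as long as the pool limit allows.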
23 .Resource Monitoring & Tuning ● Database ○ Variant Query Workload ○ Data Size ○ …… ● Query Similarity and Classification ● Query Resource Monitoring ○ Pod Runtime Metrics ○ Application Logs ○ Kubernetes Events ○ …... ● Intelligent Resource Tuning ○ Resource Pool Definition ○ Horizontal & Vertical
24 .Kubernetes Ecosystem ■ Log Collection ○ Fluentd ■ Monitoring and Metrics Collection ○ Prometheus ■ Visualization ○ Grafana ■ …...
25 .Others ■ Management Utilities ■ Imperative vs. Declarative ■ Pod Priority ■ …...
26 .Thank you! Questions?