Hybrid Container Cloud with Kubernetes and Hadoop YARN: Jian He, Alibaba, and Bushuang Gao, Alibaba
1 .
2 . Jian He: Staff Engineer @Alibaba cluster management team; Staff Engineer @Hortonworks; Hadoop Committer & Project Management Committee member. Bushuang Gao: Senior Engineer @Alibaba.
3 .
4 .
5 . Gartner has long talked about the "80% rule": 80 percent of IT budgets are spent simply "keeping the lights on". Average data center CPU utilization is about 10%.
6 .
7 .
8 . Attribute | Online service | Batch jobs
Category | Online shopping web, payment service | MR, Spark, Flink apps
Latency | Sensitive | Insensitive
Priority | High | Low
Traffic pattern | Peaks in the daytime | Peaks at night
Fault tolerance | Should not fail | Fail and retry
Complementary!
9 .
10 .
11 .
12 .
13 . The Borg paper reports that segregating prod and non-prod workloads would need 20% - 30% more machines.
14 .
15 . Diagram labels: retail, search, ads; Spark, MR, Flink; Sigma, Fuxi; node: Kubernetes, YARN
16 . Co-located 40%; separated 10%, 30%
17 .
18 . Resource contention. Scheduling: efficient placement of service containers and tasks. Isolation: when placed together, they don't affect each other.
19 .
- Online workload is low from 1:00am to 6:00am
- Offline jobs scale up while the online workload remains idle
- Offline jobs scale down when the online workload comes back
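The time-based scaling on this slide can be sketched roughly as follows. This is a minimal illustration, not the speakers' implementation: the function name `offline_share` and the 20% / 70% shares are hypothetical; only the 1:00am to 6:00am low-traffic window comes from the slide.

```python
from datetime import time

# Low-traffic window for online services, taken from the slide.
LOW_TRAFFIC_START = time(1, 0)
LOW_TRAFFIC_END = time(6, 0)


def offline_share(now: time, low_share: float = 0.2, high_share: float = 0.7) -> float:
    """Return the fraction of node capacity offered to offline jobs at `now`.

    The share values are made-up defaults for illustration only.
    """
    in_low_traffic_window = LOW_TRAFFIC_START <= now < LOW_TRAFFIC_END
    # Offline jobs scale up inside the window and back down outside it.
    return high_share if in_low_traffic_window else low_share
```

A real system would more likely react to measured online load than to the clock alone; the fixed window is used here only because it is what the slide states.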
20 .
21 .
22 .
23 . Kubernetes: focuses on long-running services; drives current state towards desired state with control loops. YARN: focuses on scheduling jobs.
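The control-loop idea (drive current state towards desired state) can be illustrated with a toy reconcile step. This is a simplified sketch, not Kubernetes code: `reconcile` and the action strings are hypothetical, and real controllers observe state through the API server rather than a plain list.

```python
from typing import List


def reconcile(desired_replicas: int, running: List[str]) -> List[str]:
    """One pass of a control loop: return the actions that move the
    current state (`running`) towards the desired replica count."""
    actions: List[str] = []
    diff = desired_replicas - len(running)
    if diff > 0:
        # Too few replicas: start the missing ones.
        actions += [f"start replica-{len(running) + i}" for i in range(diff)]
    elif diff < 0:
        # Too many replicas: stop the most recently listed ones.
        actions += [f"stop {name}" for name in running[diff:]]
    return actions
```

Calling this repeatedly with freshly observed state is what makes it a loop: each pass only needs to know the gap between desired and current, not how the gap arose.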
24 . Kubernetes: container centric, bottom up. The container is the primitive; other primitives such as ReplicaSet and Deployment are built around containers. YARN: application centric, top down. Scheduling sequence: queue -> user -> application -> container request.
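The top-down sequence queue -> user -> application -> container request can be pictured as a nested walk. The sketch below is a deliberate simplification and not how YARN's schedulers are actually implemented (they also weigh capacities, fairness, and priorities); the `Pending` structure and `next_container_request` are hypothetical names.

```python
from typing import Dict, List, Optional, Tuple

# queue -> user -> application -> pending container requests (memory in MB)
Pending = Dict[str, Dict[str, Dict[str, List[int]]]]


def next_container_request(
    pending: Pending, free_mb: int
) -> Optional[Tuple[str, str, str, int]]:
    """Walk queue -> user -> application and pop the first container
    request that fits into `free_mb`; return None if nothing fits."""
    for queue, users in pending.items():
        for user, apps in users.items():
            for app, requests in apps.items():
                for i, mb in enumerate(requests):
                    if mb <= free_mb:
                        requests.pop(i)  # safe: we return immediately
                        return queue, user, app, mb
    return None
```

The point of the sketch is the ordering: the container request is the last thing chosen, whereas in the container-centric model it is the starting primitive.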
25 . Kubernetes: based on the api-server watch mechanism; everything is stored in etcd. YARN: based on RPC; only application-level metadata is persisted, container data is not. State is recovered from the in-memory state of peers.
26 . Kubernetes: CRI compatible (Docker etc.). YARN: Docker + tarball.
27 .
28 . Architecture diagram labels: resource management console; online service and offline jobs; RPC: VTRON (on both sides); Apiserver; co-location; YARN-RM; L&W scheduler; GRPC; RPC; L&W; NODE: kubelet, agent, YARN-NM, cgroup; pods; tasks. VTRON: Virtual Total Resources Of Node.
29 . Diagram labels: online service usage and offline job resource usage; Kubernetes and YARN; online service resource quota and offline job resource quota
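One way to read the usage-versus-quota picture is that offline jobs may use whatever the online services leave idle, up to the offline quota. The sketch below encodes that reading as an assumption; the slide does not spell out the rule, and `offline_allocatable`, its parameters, and the 10% safety margin are all hypothetical.

```python
def offline_allocatable(
    node_capacity: float,
    online_usage: float,
    offline_quota: float,
    safety_margin: float = 0.1,
) -> float:
    """CPU cores that offline jobs may use on one node right now.

    Assumed rule (not from the slides): idle capacity after online usage
    and a reserved safety margin, capped by the offline quota.
    """
    reserved = node_capacity * safety_margin
    idle = max(node_capacity - online_usage - reserved, 0.0)
    return min(idle, offline_quota)
```

Because the result depends on actual online usage rather than the online quota, the offline side shrinks automatically when online traffic returns.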