Cloud Cost Management and Apache Spark
1. Cloud Cost Management and Apache Spark
Xuan Wang, Databricks
#DSSAIS13
3. Introduction
● Goal of this talk
  ○ share our experience in managing cloud costs
  ○ tools and technologies
  ○ lessons learnt and good practices
  ○ go wide rather than go deep
● Why do we care about cloud cost?
  ○ growth in cloud revenue in Q1 2018: Amazon: 49%, Microsoft: 58%
4. Databricks’ Unified Analytics Platform
● COLLABORATIVE NOTEBOOKS: unifies data engineers and data scientists
● DATABRICKS RUNTIME: unifies data and AI technologies (diagram labels: Delta, SQL, Streaming, XGBoost)
● CLOUD NATIVE SERVICE: eliminates infrastructure complexity
6. Three paths toward cost control
● Native reporting from cloud providers
  ○ Good general information and support
  ○ Limited options, not scalable as the environment grows
● Commercial tools
  ○ More detail and flexibility, connectors to raw data
  ○ Not enough customization, additional charges
● In-house solutions
  ○ Most flexible, deeper understanding of the costs
  ○ Opportunity costs
7. Challenges in cloud cost control
● overwhelming and complex usage details
  ○ need to convert data into insights/actions
● gaps between “hands” and “wallets”
  ○ developers consume resources without realizing the charges
● evolving cloud landscape
  ○ external: new services, new discounts, ...
  ○ internal: new use cases, new architecture, ...
9. Our solutions
● Raw Data: cost and usage, s3 access logs, s3 inventory, ec2/rds snapshot, reserved instances, ...
● DATABRICKS DELTA DATA LAKE
● Analytics: Databricks Notebooks; BI tools: Superset, Tableau, ...; Monitors and alerts
● The data problem: ETL and attribute costs
● The process problem: prioritize, optimize, monitor, automate
12. The data problem
● cost and usage report (detailed billing)
  ○ CSV, grouped by month, updated daily
● EC2/RDS snapshots and reserved instances
  ○ JSON, from REST API
● S3 inventory
  ○ CSV/ORC, snapshot, updated daily/weekly
● S3 access logs
  ○ raw logs in text, updated multiple times a day
14. Data pipelines with Spark
Raw Data → (ETL) → Data Lake → (Analytics) → Insight
Challenges
● Data corruption
● Multiple jobs/staging tables
● Reliability and consistency
15. Databricks Delta: Analytics Ready Data
  1. Data Reliability
    ○ ACID Compliant Transactions
    ○ Schema Enforcement & Evolution
  2. Query Performance
    ○ Very Fast at Scale
    ○ Indexing & Caching (10-100x Faster)
  3. Simplified Architecture
    ○ Unify batch & streaming
    ○ Early data availability for analytics
Diagram: LOTS OF NEW DATA (Customer Data, Click Streams, Sensor data (IoT), Video/Speech, …) → DATABRICKS DELTA DATA LAKE → Reporting, Dashboards, Alerting, Machine Learning
16. ETL: AWS cost and usage
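The slide itself carries no text beyond its title. As a rough illustration of what this ETL step has to do, the sketch below reads a cost and usage CSV and sums cost per day and product. It is plain standard-library Python rather than the Spark job from the talk. The sample rows are made up, and the choice of the three columns (`lineItem/UsageStartDate`, `lineItem/ProductCode`, `lineItem/UnblendedCost`) is an assumption about which report fields matter.

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows; a real report has many more columns.
SAMPLE_CUR_CSV = """lineItem/UsageStartDate,lineItem/ProductCode,lineItem/UnblendedCost
2018-05-01T00:00:00Z,AmazonEC2,1.50
2018-05-01T01:00:00Z,AmazonEC2,2.25
2018-05-01T00:00:00Z,AmazonS3,0.75
2018-05-02T00:00:00Z,AmazonS3,0.25
"""


def daily_cost_by_product(csv_text):
    """Sum unblended cost per (usage date, product code)."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        day = row["lineItem/UsageStartDate"][:10]  # keep only YYYY-MM-DD
        key = (day, row["lineItem/ProductCode"])
        totals[key] += float(row["lineItem/UnblendedCost"])
    return dict(totals)


daily_totals = daily_cost_by_product(SAMPLE_CUR_CSV)
```

Because the report is grouped by month and refreshed daily, a real job would probably overwrite the current month's data on each run instead of appending to it.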
18. ETL: AWS s3 access logs
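Turning raw access log text into structured rows is the core of this step. The sketch below parses only the leading fields of one log line with a regular expression. It is an illustration, not the parser used in the talk: the sample line is synthetic, the field names are chosen here, and lines that do not fit the pattern are simply returned as `None`.

```python
import re

# Leading fields of an S3 server access log line; trailing fields are ignored.
LOG_PATTERN = re.compile(
    r'(?P<bucket_owner>\S+) (?P<bucket>\S+) \[(?P<time>[^\]]+)\] '
    r'(?P<remote_ip>\S+) (?P<requester>\S+) (?P<request_id>\S+) '
    r'(?P<operation>\S+) (?P<key>\S+) "(?P<request_uri>[^"]*)" '
    r'(?P<http_status>\S+) (?P<error_code>\S+) (?P<bytes_sent>\S+)'
)

# Synthetic example line (all identifiers are placeholders).
SAMPLE_LINE = (
    'ownerid my-bucket [06/Feb/2018:00:00:38 +0000] 192.0.2.3 requesterid '
    '3E57427F3EXAMPLE REST.GET.OBJECT images/spark.tgz '
    '"GET /my-bucket/images/spark.tgz HTTP/1.1" 200 - 1048576 1048576 70 10 '
    '"-" "aws-cli" -'
)


def parse_line(line):
    """Return a dict of parsed fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    # "-" marks a missing value in the log; treat it as zero bytes.
    sent = record["bytes_sent"]
    record["bytes_sent"] = int(sent) if sent.isdigit() else 0
    return record


record = parse_line(SAMPLE_LINE)
```

Returning `None` for malformed lines lets the caller count or quarantine them, which matters when the raw logs may be corrupted.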
20. Manage Databricks Delta tables
● Create table
  CREATE TABLE s3_access_logs USING delta LOCATION '$path'
● Optimize table
  OPTIMIZE s3_access_logs ZORDER BY bucket
● Query table
  SELECT * FROM s3_access_logs WHERE bucket = 'my-bucket'
● Delta Logs: files layout & statistics
  ○ File1: min='a', max='g'
  ○ File2: min='g', max='n'
  ○ File3: min='o', max='z'
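The per-file min/max statistics are what let the query skip files. The sketch below is a simplified illustration of that idea using the three files from the slide. It is not Delta's implementation, and the function name is hypothetical.

```python
# Per-file (min, max) of the `bucket` column, as listed on the slide.
FILE_STATS = {
    "File1": ("a", "g"),
    "File2": ("g", "n"),
    "File3": ("o", "z"),
}


def files_to_scan(stats, value):
    """Keep only files whose [min, max] range could contain `value`."""
    return [name for name, (lo, hi) in stats.items() if lo <= value <= hi]


# WHERE bucket = 'my-bucket' only needs File2.
candidates = files_to_scan(FILE_STATS, "my-bucket")
```

Sorting the data by `bucket` keeps each file's range narrow, so an equality filter touches few files.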
21. Attributions
● Rule based attributions
  ○ accounts
    ■ dedicated accounts for different teams / use cases
  ○ tagging
    ■ tag resources with budget groups
  ○ manual rules
    ■ should avoid this as much as possible
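The three rule types can be combined into a single lookup. The sketch below is one possible arrangement, not the one from the talk: the precedence (account, then tag, then manual rule) is an assumption, and the account ID, field names, and rule table are all hypothetical.

```python
# Hypothetical mapping of dedicated accounts to budget groups.
ACCOUNT_OWNERS = {"111111111111": "data-eng"}

# Hypothetical manual rules: (product code, budget group). Keep this list short.
MANUAL_RULES = [("AmazonRoute53", "infra")]


def attribute(item):
    """Return the budget group for one cost line item."""
    owner = ACCOUNT_OWNERS.get(item.get("account_id"))
    if owner:
        return owner  # dedicated account
    tag = item.get("budget_group_tag")
    if tag:
        return tag  # resource tagged with a budget group
    for product, group in MANUAL_RULES:
        if item.get("product") == product:
            return group  # last-resort manual rule
    return "unattributed"
```

Returning an explicit `"unattributed"` group makes the share of unexplained cost visible, which shows where tagging still needs work.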
22. The process problem
● Prioritize
  ○ high data transfer cost
● Optimize
  ○ reserved instance purchases
● Monitor
  ○ predictions and alerts
● Automate
  ○ auto-shutdown unused resources
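The talk does not say how its predictions are made. As a minimal sketch of the "predictions and alerts" item, the code below projects month-end spend by extending the average daily cost so far and compares it with a budget. The projection method, the function names, and the numbers are all assumptions.

```python
def projected_month_end(daily_costs, days_in_month):
    """Project month-end spend from the average daily cost so far."""
    if not daily_costs:
        return 0.0
    return sum(daily_costs) / len(daily_costs) * days_in_month


def over_budget(daily_costs, days_in_month, budget):
    """True when the projection exceeds the budget, i.e. an alert should fire."""
    return projected_month_end(daily_costs, days_in_month) > budget


month_to_date = [100.0, 120.0, 110.0]  # made-up daily costs
projection = projected_month_end(month_to_date, 30)
```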
24. Story: high data transfer cost
● Observation
  ○ Cross-region data transfers are expensive
  ○ Two buckets cost about $1k/day
● Root cause
  ○ downloading Spark images
26. Story: high data transfer cost
● Actions
  ○ Distribute images to multiple regions
  ○ Monitor cross-region cost
● Results
  ○ Significantly reduced cost
  ○ Faster cluster creation
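Monitoring cross-region cost could be as simple as a daily per-bucket check. The sketch below assumes the transfer charges have already been extracted into records with `day`, `bucket`, and `cost` fields. Those field names, the sample values, and the threshold are hypothetical.

```python
from collections import defaultdict


def buckets_over_threshold(records, threshold):
    """Return sorted (day, bucket) pairs whose daily transfer cost exceeds threshold."""
    totals = defaultdict(float)
    for rec in records:
        totals[(rec["day"], rec["bucket"])] += rec["cost"]
    return sorted(key for key, total in totals.items() if total > threshold)


# Made-up cross-region transfer charges.
transfer_records = [
    {"day": "2018-05-01", "bucket": "images", "cost": 400.0},
    {"day": "2018-05-01", "bucket": "images", "cost": 200.0},
    {"day": "2018-05-01", "bucket": "logs", "cost": 20.0},
]
flagged = buckets_over_threshold(transfer_records, threshold=500.0)
```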
28. Optimization: reserved instances
● Reserved instances (RI)
  ○ 1-yr/3-yr commitment in exchange for discounts
  ○ downsides: underutilized instances, upfront cost
  ○ upsides: significant discounts, availability
● Challenges
  ○ non-trivial to decide how much RI to purchase
  ○ need to predict the future
29. Optimization: reserved instances
● Assign budgets to teams
● Provide tool to compute the optimal RI to buy
● Define process for RI purchase requests and approvals
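The talk does not show how its tool computes the optimal purchase. One simple way to frame the problem is sketched below, under strong assumptions: a single instance type, past hourly usage standing in for future usage, and an RI price expressed as an amortized hourly rate. The prices and usage numbers are made up.

```python
def total_cost(reserved, hourly_usage, on_demand_rate, reserved_rate):
    """Cost of holding `reserved` RIs and paying on-demand for the overflow."""
    # Reserved instances are paid for every hour, whether used or not.
    reserved_cost = reserved * reserved_rate * len(hourly_usage)
    # Usage above the reserved count is billed at the on-demand rate.
    overflow = sum(max(0, used - reserved) for used in hourly_usage)
    return reserved_cost + overflow * on_demand_rate


def optimal_reserved(hourly_usage, on_demand_rate, reserved_rate):
    """Try every count from 0 to peak usage and keep the cheapest."""
    candidates = range(max(hourly_usage, default=0) + 1)
    return min(
        candidates,
        key=lambda n: total_cost(n, hourly_usage, on_demand_rate, reserved_rate),
    )


usage = [2, 2, 4, 4, 6, 6, 2, 2]  # made-up instance counts per hour
best = optimal_reserved(usage, on_demand_rate=1.0, reserved_rate=0.6)
```

With these numbers the result is 2: a third reserved instance would be busy only half the time, which is less than the 60% utilization needed to beat on-demand pricing at a 0.6 rate ratio.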