14/06 - Apache Cassandra Best Practices at Ebay
1 .Cassandra Best Practices at eBay Inc.
Feng Qu, Principal Database Engineer, eBay Inc.
September 11, 2014
CassandraSummit2014 | #CassandraSummit
2 .Agenda
• eBay Inc. Cassandra footprint
• NoSQL life cycle
• Cassandra best practices
• Q&A
3 .eBay Inc.
4 .eBay Inc. Database Platforms
• We manage thousands of databases powering eBay and PayPal
5 .Why NoSQL?
• Challenges of traditional RDBMS
  • Performance penalty to maintain ACID features
  • Lack of native sharding and replication features
  • Lack of linear scalability
  • Cost of software/hardware
  • Higher cost of commit
• NoSQL used in eBay Inc.
  • Cassandra, Couchbase, MongoDB managed by DBAs
  • HBase, Redis, OpenTSDB managed by developers
6 .Cassandra @ eBay Inc.
• Started in 2011 at eBay and later expanded to PayPal
• Started with Apache Cassandra 0.8, now using Apache Cassandra 2.0 and DataStax Enterprise 4.0
• Over a dozen production clusters on hundreds of servers across 3 data centers
• Choice between a dedicated cluster for large/critical use cases and a multi-tenant cluster for small use cases
• Over 20 billion daily reads/writes to Cassandra
• Cluster size varies from 4 nodes to 80 nodes
• 100TB+ user data on HDD, local SSD and SSD array
• One cluster is estimated to grow to more than a few PB
7 .NoSQL Life Cycle
[Diagram: cycle of stages: Use Case Analysis, Data Modeling, Capacity Planning, Deployment, Operation]
8 .Data Modeling Phase
• The development team requests a review meeting with a data architect for a new use case
• Once the data architect understands the requirements, they recommend a suitable data store. It could be either an RDBMS or one of the NoSQL products we support
• Both parties work on data modeling together
• The outputs of the engagement are a set of tickets, for tracking purposes, which capture project information and data configuration for the chosen data store
9 .Data Modeling Best Practices
• Data modeling for Cassandra is quite different from data modeling for a traditional RDBMS
• Model around query patterns, not entities
• De-normalize to improve read performance
• Separate read-heavy data from write-heavy data
• Store values in column names, as names are already physically sorted
• Former eBay architect Jay Patel published a few technical blogs on Cassandra data modeling
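To illustrate "store values in column names", here is a minimal in-memory sketch in Python. It is not Cassandra code: `WideRowStore`, the row key `user42`, and the item names are all hypothetical, and the sorted list only imitates how columns within a row are kept ordered by name. The point is that putting the queried value (here a timestamp and an item ID) into the column name makes a range read a simple slice of already-sorted data.

```python
import bisect
from collections import defaultdict


class WideRowStore:
    """Toy stand-in for a column family: row key -> columns sorted by name."""

    def __init__(self):
        # Each row holds a list of (column_name, value) kept sorted by name.
        self._rows = defaultdict(list)

    def insert(self, row_key, column_name, value=""):
        columns = self._rows[row_key]
        names = [name for name, _ in columns]
        pos = bisect.bisect_left(names, column_name)
        if pos < len(columns) and columns[pos][0] == column_name:
            columns[pos] = (column_name, value)  # same name: overwrite
        else:
            columns.insert(pos, (column_name, value))

    def slice(self, row_key, start, end):
        """Return columns with start <= name < end, in sorted order."""
        columns = self._rows.get(row_key, [])
        names = [name for name, _ in columns]
        lo = bisect.bisect_left(names, start)
        hi = bisect.bisect_left(names, end)
        return columns[lo:hi]


# Hypothetical query pattern: "items a user viewed in a time window".
# The data lives in the column name (timestamp, item_id); the value is empty.
store = WideRowStore()
store.insert("user42", (1410400000, "item-b"))
store.insert("user42", (1410300000, "item-a"))
store.insert("user42", (1410500000, "item-c"))

recent = store.slice("user42", (1410350000,), (1410600000,))
recent_items = [name[1] for name, _ in recent]
```

Because the columns are kept in name order at write time, the read needs no sorting or filtering step, which is the property the slide relies on.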
10 .Data Modeling Best Practices - indexing
• Secondary index
  + Less overhead, as it is built in
  + Data and index are changed atomically
  - Does not scale well with high-cardinality data
• Column family as index
  + No hot spot
  - Index is maintained manually by the application
  - Index changes are not atomic
• Avoid secondary indexes and use a column family as index if possible
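The "column family as index" trade-off can be sketched with two plain Python dictionaries standing in for two column families. Everything here (`UserStore`, `users_by_email`, the field names) is hypothetical and only illustrates the pattern: the application performs separate writes for data and index, so a reader should tolerate an index entry that is briefly out of step with the data.

```python
class UserStore:
    """Toy model of a data column family plus an application-maintained index."""

    def __init__(self):
        self.users = {}           # data column family: user_id -> row
        self.users_by_email = {}  # index column family: email -> user_id

    def save_user(self, user_id, email, name):
        old_row = self.users.get(user_id)
        # Write 1: the data row.
        self.users[user_id] = {"email": email, "name": name}
        # Write 2: the index row. This is a separate write, so it is not
        # atomic with write 1; a failure in between leaves them out of sync.
        self.users_by_email[email] = user_id
        # Write 3: remove the stale index entry when the email changed.
        if old_row is not None and old_row["email"] != email:
            self.users_by_email.pop(old_row["email"], None)

    def find_by_email(self, email):
        user_id = self.users_by_email.get(email)
        if user_id is None:
            return None
        row = self.users.get(user_id)
        # Verify against the data row to guard against a stale index entry.
        if row is None or row["email"] != email:
            return None
        return user_id, row


store = UserStore()
store.save_user("u1", "old@example.com", "Alice")
store.save_user("u1", "new@example.com", "Alice")
found = store.find_by_email("new@example.com")
missing = store.find_by_email("old@example.com")
```

The read-side check in `find_by_email` is one possible way for the application to cope with the non-atomic index update; the slides do not prescribe a specific approach.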
11 .Benchmark Testing
• Benchmark testing is key to capacity planning
• Performance baseline with near-real traffic in a production-size environment
  • for different types of hardware
  • for different software releases
  • for different use cases or workloads
• A proactive and repetitive process
12 .Capacity Planning Phase
• Key to avoiding surprises in production
• The concept behind capacity planning is simple, but the mechanics are harder
• Business requirements may increase, so we need to forecast how much resource must be added to the system to ensure that the user experience continues uninterrupted
• Input: a clearly defined capacity goal coming from business requirements, and a performance baseline from benchmark tests
• Output: the resources to be added, such as memory, CPU, storage, I/O, network
• Always prepare for peak + headroom
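The "peak + headroom" rule can be turned into simple arithmetic. The sketch below is one possible way to do it, not the process used at eBay: the function names, the 30% headroom, and every number in the example are hypothetical. It assumes the per-node throughput comes from a benchmark run with the same replication settings as production, and it uses integer math so that rounding up is exact.

```python
def ceil_div(numerator, denominator):
    """Integer ceiling division, avoiding floating-point rounding surprises."""
    return -(-numerator // denominator)


def nodes_for_throughput(peak_ops_per_sec, per_node_ops_per_sec, headroom_pct):
    """Nodes needed to serve peak traffic plus headroom."""
    required = peak_ops_per_sec * (100 + headroom_pct)
    return ceil_div(required, per_node_ops_per_sec * 100)


def nodes_for_storage(raw_data_tb, replication_factor, usable_tb_per_node,
                      headroom_pct):
    """Nodes needed to hold all replicas of the data plus headroom."""
    required = raw_data_tb * replication_factor * (100 + headroom_pct)
    return ceil_div(required, usable_tb_per_node * 100)


def plan_cluster_size(peak_ops_per_sec, per_node_ops_per_sec, raw_data_tb,
                      replication_factor, usable_tb_per_node, headroom_pct=30):
    """The cluster must satisfy both constraints, so take the larger count."""
    return max(
        nodes_for_throughput(peak_ops_per_sec, per_node_ops_per_sec,
                             headroom_pct),
        nodes_for_storage(raw_data_tb, replication_factor, usable_tb_per_node,
                          headroom_pct),
    )


# Hypothetical inputs: 200,000 ops/s at peak, 10,000 ops/s per node from a
# benchmark, 20 TB of raw data, replication factor 3, 2 TB usable per node.
throughput_nodes = nodes_for_throughput(200_000, 10_000, 30)
storage_nodes = nodes_for_storage(20, 3, 2, 30)
cluster_size = plan_cluster_size(200_000, 10_000, 20, 3, 2)
```

In this made-up example storage, not throughput, determines the cluster size, which shows why each resource listed on the slide needs its own estimate.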
13 .Deployment Best Practices
• Software packages with customized optimization
  • kernel, JVM heap, compaction
• Deployment automation for efficiency
• Multi data center deployment for load balancing and disaster recovery
• Vnode is a must for manageability
• SSD as default storage requires additional OS-level tuning
14 .Operation Best Practices
• Collect system and database metrics
• Monitoring and alerting
  • event-driven and metrics-driven alerts
• Operation runbook
  • Reduce human error
• Performance tuning runbook
  • nodetool tpstats for dropped requests
  • nodetool cfhistograms for latency distribution
• Troubleshooting runbook
  • Document previous incidents as future reference
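As a sketch of how the `nodetool tpstats` check for dropped requests could feed a metrics-driven alert, the Python below parses the "Message type / Dropped" section of the command's text output. The sample text is a hypothetical, abbreviated output with made-up counts, the exact layout can differ between Cassandra versions, and the function names and threshold are assumptions for illustration.

```python
# Hypothetical, abbreviated tpstats output; all counts are invented.
SAMPLE_TPSTATS = """\
Pool Name                    Active   Pending      Completed   Blocked  All time blocked
ReadStage                         0         0         103245         0                 0
MutationStage                     0         0         984512         0                 0

Message type           Dropped
RANGE_SLICE                  0
READ_REPAIR                  0
READ                        12
MUTATION                   340
REQUEST_RESPONSE             0
"""


def parse_dropped(tpstats_output):
    """Return {message_type: dropped_count} from tpstats text output."""
    dropped = {}
    in_dropped_section = False
    for line in tpstats_output.splitlines():
        if line.startswith("Message type"):
            in_dropped_section = True  # rows after this header are counts
            continue
        fields = line.split()
        if in_dropped_section and len(fields) == 2 and fields[1].isdigit():
            dropped[fields[0]] = int(fields[1])
    return dropped


def dropped_alerts(dropped, threshold=0):
    """Keep only message types whose dropped count exceeds the threshold."""
    return {name: count for name, count in dropped.items() if count > threshold}


alerts = dropped_alerts(parse_dropped(SAMPLE_TPSTATS))
```

On a live node the input text could be captured with `subprocess.run(["nodetool", "tpstats"], capture_output=True, text=True).stdout`. The dropped counters are typically cumulative since the node started, so an alert is more useful when it compares successive samples rather than absolute values.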
15 .Operation Best Practices
• Routine repair is not really needed if there are no deletes. You still need to run repair after bringing up a node that has been down for a while
• Use a CNAME in client configuration to avoid client configuration changes when hardware is replaced with a new IP/name
• Reduce gc_grace to reduce overall data size
• Disable row cache, unless you have <100K rows
• Collect statistics, real-time or historical, to monitor overall system performance
• Disable swap to avoid a slow node
16 .Capacity Review
• Routine capacity review and adjustment
• When to scale up and when to scale out
• In general, with NoSQL, scale out by adding nodes to increase capacity
• Sometimes it's cost efficient to scale up at the component level by identifying the scaling bottleneck and resolving it accordingly
  • Network bandwidth: upgrade to a 10 Gbps network
  • I/O latency: upgrade to (better) SSD
  • Storage: add/expand data volume
17 .Typical Use Cases
• Write intensive: metrics collection, logging
  • Collecting metrics from tens of thousands of devices periodically
• Read intensive: home page feeds
  • Recommendation backend to generate a dynamic taste graph
• Mixed workload: personalization, classification
  • Data is loaded from the data warehouse periodically in bulk and from user events continuously
  • Data is retrieved in real time when a user visits the eBay site
18 .Metrics Collection Application
19 .The End
• We are hiring for NoSQL talent.
• Contact:
  • fengqu@ebay.com
  • www.linkedin.com/in/fengqu/
• Q&A