HBase in Practice at Netease
1 .Apache HBase At Netease
XINXIN FAN, HONGXIANG JIANG
2 .Agenda
- Overview HBase Service In Netease
- Key Practices Over HBase
- What We Have Done To HBase
- What We Are Doing Now
3 .BigData System In Netease - mengma
4 .BigData System In Netease - youdata
5 .HBase In Netease
HBase users come from 6 major departments and more than 40 different applications.
6 .HBase In Netease
7 .HBase In Netease
- 7 HBase clusters
- 200+ RegionServers
- Hundreds of terabytes of data
8 .Agenda
- Overview HBase Service In Netease
- Key Practices Over HBase
- What We Have Done To HBase
- What We Are Doing Now
9 .Key Practices - Linux System
- Turn transparent huge pages (THP) off
- Set vm.swappiness = 0
- Set vm.min_free_kbytes to at least 1 GB
- Disable NUMA zone reclaim with vm.zone_reclaim_mode = 0
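The four settings above can be verified on a host with a small script. The following is a minimal sketch, assuming a Linux machine that exposes the standard /proc/sys and /sys/kernel/mm paths; the function names (thp_mode, check_host) are illustrative and not part of the talk.

```python
import re
from pathlib import Path

# Recommended values from the slide; min_free_kbytes is a lower bound
# (1 GB expressed in kB is 1024 * 1024).
SYSCTL_CHECKS = {
    "/proc/sys/vm/swappiness": lambda v: int(v) == 0,
    "/proc/sys/vm/min_free_kbytes": lambda v: int(v) >= 1024 * 1024,
    "/proc/sys/vm/zone_reclaim_mode": lambda v: int(v) == 0,
}
THP_PATH = "/sys/kernel/mm/transparent_hugepage/enabled"


def thp_mode(raw):
    """Return the active THP mode, which the kernel shows in brackets,
    e.g. 'always madvise [never]' means THP is off."""
    match = re.search(r"\[(\w+)\]", raw)
    return match.group(1) if match else raw.strip()


def read_file(path):
    """Read one kernel setting from the filesystem."""
    return Path(path).read_text().strip()


def check_host(read=read_file):
    """Return (path, current_value) pairs that violate the recommendations.

    `read` is injectable so the check can be exercised without a real host.
    """
    problems = []
    for path, is_ok in SYSCTL_CHECKS.items():
        value = read(path)
        if not is_ok(value):
            problems.append((path, value))
    mode = thp_mode(read(THP_PATH))
    if mode != "never":
        problems.append((THP_PATH, mode))
    return problems
```

Calling check_host() on a RegionServer host returns an empty list when all four recommendations are met; anything else lists the path and the value currently in effect.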
10 .Key Practices - Linux System
11 .Key Practices - Schema
Do not use PREFIX_TREE as the DATA_BLOCK_ENCODING!
- HBASE-12959: compaction never ends
- HBASE-12817 (fixed): data missing while scanning
12 .Key Practices - Schema
Use the more useful table-level configurations!
- MAX_FILESIZE
- MEMSTORE_FLUSHSIZE
- DFS_REPLICATION
13 .Key Practices - GC
Use BucketCache (off-heap) instead of LRUBlockCache!
[Figure: two charts, "HBase traffic (before optimization)" and "HBase traffic (after optimization)", both with a y-axis from 0 to 160000]
14 .Key Practices - GC
CMS GC: -Xmn between 1G and 3G, -XX:SurvivorRatio=2
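To see what these flags mean for the young generation, the arithmetic can be worked through in a few lines. This is a sketch assuming the HotSpot definition of SurvivorRatio (eden size divided by the size of one survivor space) and ignoring the alignment the JVM applies in practice; the function name is illustrative.

```python
def young_gen_layout(xmn_mb, survivor_ratio):
    """Split the young generation into eden and two survivor spaces.

    SurvivorRatio = eden / one survivor space, so the young generation
    consists of (survivor_ratio + 2) survivor-sized units.
    """
    survivor_mb = xmn_mb / (survivor_ratio + 2)
    eden_mb = survivor_mb * survivor_ratio
    return {"eden_mb": eden_mb, "survivor_mb_each": survivor_mb}


# -Xmn2g with -XX:SurvivorRatio=2
print(young_gen_layout(2048, 2))
# prints {'eden_mb': 1024.0, 'survivor_mb_each': 512.0}
```

With a ratio of 2, half of the young generation is eden and each survivor space gets a quarter, which is considerably more survivor space than the HotSpot default ratio of 8 provides.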
15 .Agenda
- Overview HBase Service In Netease
- Key Practices Over HBase
- What We Have Done To HBase
- What We Are Doing Now
16 .Request Queue At Table-Level
Different workloads may frequently interfere with each other!
- Write requests with large fields may affect small write requests
- High-throughput scan requests may affect other scan requests
Active handler preemption? Assign an independent request queue to the large requests.
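The idea of an independent queue per table can be illustrated with a toy dispatcher. This is a simplified sketch, not the HBase RPC scheduler or the implementation from the talk; the class and method names (TableAwareScheduler, dispatch, and so on) and the table names in the usage lines are hypothetical.

```python
from collections import deque


class TableAwareScheduler:
    """Route requests for isolated tables to dedicated queues,
    and everything else to one shared queue."""

    def __init__(self, isolated_tables):
        self.shared = deque()
        self.dedicated = {table: deque() for table in isolated_tables}

    def dispatch(self, table, request):
        # Requests for isolated tables never enter the shared queue,
        # so they cannot delay small requests from other tables.
        self.dedicated.get(table, self.shared).append(request)

    def next_for_shared_handler(self):
        """Next request for a handler serving the shared queue, or None."""
        return self.shared.popleft() if self.shared else None

    def next_for_dedicated_handler(self, table):
        """Next request for a handler bound to one isolated table, or None."""
        queue = self.dedicated[table]
        return queue.popleft() if queue else None


scheduler = TableAwareScheduler(isolated_tables=["big_blobs"])
scheduler.dispatch("big_blobs", "put 5MB value")
scheduler.dispatch("users", "put 1KB value")
```

Handlers that serve the shared queue only ever see the small request here; the large write waits in its own queue for a handler assigned to that table.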
17 .Request Queue At Table-Level
18 .Request Queue At Table-Level
19 .Improvement – Table Metrics View
- RegionServer metrics? Region metrics?
- Sometimes, table metrics are more useful!
20 .Improvement – Table Metrics View
21 .Improvement – Table Metrics View
22 .Improvement – Table Metrics View
23 .Improvement – Table Metrics View
[Diagram: Table → Region → Store → StoreFile → Block]
- Block: cache hit or miss?
- StoreFile: hot file or cold?
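Rolling block-level cache counters up through the hierarchy to a table-level view might look like the following. This is a minimal sketch with an assumed input shape (one tuple of counters per StoreFile); the function names and the read-count threshold for a "hot" file are illustrative, not taken from the talk.

```python
from collections import defaultdict


def table_cache_metrics(store_file_stats):
    """Aggregate per-StoreFile block cache counters up to table level.

    store_file_stats: iterable of (table, region, store_file, hits, misses).
    Returns {table: {"hits": int, "misses": int, "hit_ratio": float}}.
    """
    totals = defaultdict(lambda: {"hits": 0, "misses": 0})
    for table, _region, _store_file, hits, misses in store_file_stats:
        totals[table]["hits"] += hits
        totals[table]["misses"] += misses
    result = {}
    for table, counts in totals.items():
        reads = counts["hits"] + counts["misses"]
        ratio = counts["hits"] / reads if reads else 0.0
        result[table] = {**counts, "hit_ratio": ratio}
    return result


def hot_files(store_file_stats, min_reads):
    """Return StoreFiles whose block reads (hits + misses) reach min_reads."""
    return [
        store_file
        for _table, _region, store_file, hits, misses in store_file_stats
        if hits + misses >= min_reads
    ]
```

The same per-file counters answer both questions on the slide: summed per table they give the hit ratio, and compared against a threshold they separate hot files from cold ones.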
24 .Hot Files, You May Do More
- Compaction policy based on hot files?
- Hierarchical storage policy based on hot files?
25 .Improvement – Table JMX Metrics
26 .Others
- Check for and merge empty regions periodically
- Set the request priority per table
- More configuration moved to table level
  - COMPACTION_THRESHOLD
  - MAJOR_COMPACTION_PERIOD
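The periodic empty-region check could plan its merges roughly as follows. This is a sketch under stated assumptions: the region list is already sorted by start key, a region counts as empty when its size is zero, and the actual merge would be issued separately through the HBase admin interface. The function name is hypothetical.

```python
def plan_empty_region_merges(regions):
    """Pair each empty region with its right-hand neighbour for merging.

    regions: list of (name, size_bytes) sorted by start key, so list
    neighbours are adjacent in the key space. Returns (region_a, region_b)
    pairs; each region appears in at most one pair per pass. An empty
    region at the very end of the table is left for a later pass.
    """
    plan = []
    i = 0
    while i < len(regions) - 1:
        name, size = regions[i]
        next_name, _next_size = regions[i + 1]
        if size == 0:
            plan.append((name, next_name))
            i += 2  # both regions are consumed by this merge
        else:
            i += 1
    return plan
```

Because the check runs periodically, a run of several empty regions collapses over successive passes rather than in one step.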
27 .What We Are Doing Now
- Inverted Index
- RegionServer Group
- Highly Available HBase
28 .Improvement – InvertedIndex
Table layout: rowkeys uid1, uid2, …; columns under basic: username, age, school, city, …
- ✓ select * from table where uid = 'xxx'
- Full table scan: select * from table where school = 'shenzhen' and age > 30
29 .Improvement – InvertedIndex
InvertedIndex:
- shenzhen : <uid1, uid3, … uidy>
- wuhan : <uid2, uid5, … uidx>
- …
Query flow:
- where city = "wuhan"
- Rowkeys : <uid2, uid5, … uidx>
- select * from user where uid in (uid2, uid5, … uidx)
- Result: <user2, user5, … userx>
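The lookup flow on this slide (indexed value to rowkeys, then rowkeys to rows) can be sketched with plain dictionaries. This is an in-memory illustration only, not the authors' HBase implementation; the class name IndexedUserTable and its methods are hypothetical.

```python
from collections import defaultdict


class IndexedUserTable:
    """In-memory stand-in for a user table plus an inverted index on one column."""

    def __init__(self, indexed_column):
        self.indexed_column = indexed_column
        self.rows = {}                 # rowkey -> {column: value}
        self.index = defaultdict(set)  # column value -> set of rowkeys

    def put(self, rowkey, columns):
        """Write (or overwrite) a whole row and keep the index in sync."""
        old = self.rows.get(rowkey)
        if old is not None and self.indexed_column in old:
            # Remove the stale index entry before overwriting the row.
            self.index[old[self.indexed_column]].discard(rowkey)
        self.rows[rowkey] = dict(columns)
        if self.indexed_column in columns:
            self.index[columns[self.indexed_column]].add(rowkey)

    def query(self, value):
        """Look up rowkeys in the index, then fetch each row by key (no full scan)."""
        rowkeys = sorted(self.index.get(value, ()))
        return [(key, self.rows[key]) for key in rowkeys]


users = IndexedUserTable("city")
users.put("uid1", {"username": "a", "city": "shenzhen"})
users.put("uid2", {"username": "b", "city": "wuhan"})
users.put("uid5", {"username": "c", "city": "wuhan"})
print([key for key, _row in users.query("wuhan")])
# prints ['uid2', 'uid5']
```

The cost that a real implementation has to manage is visible even here: every write to the indexed column touches the index as well, and the two must stay consistent.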