HBase Synchronous Replication
1. HBase Synchronous Replication
Meng Qingyi, Shen Chunhui
www.aliyun.com/aliware
2. Outline
- Where to use replication
- Asynchronous replication
- Synchronous replication
3. Where to use replication
- Master-Slave
- Multi Datacenters
- Data Export
[Diagram: Cluster A in DC A, Cluster B in DC B, DC C, and an "Other Exporter System"; labels 40%, 30%, 30%]
4. Asynchronous replication: improvements
- Increase parallelism on the sending side
- Improve batching on the sink side
- Use idle resources to reduce hotspots
- Online configuration change
- Replication failover isolation
5. Asynchronous replication: reducing hotspots
[Diagram: HRegionServers of the master cluster, each replicating its hlog to the slave cluster; labels "request idle resource" and "hlog replication"]
6. Asynchronous replication: replication topology
- Table scope replication
- Replication topology monitor
- Replication cycle
[Diagram: Cluster 1, Cluster 2, Cluster 3, and Cluster 4, each holding a subset of Table A, Table B, and Table C]
7. Synchronous replication: motivation
- Replication between two datacenters
- Access the master under normal conditions
- Switch to the slave when the master is down
- Strong consistency on access
[Diagram: Master, Slave]
8. Synchronous replication: consistency semantics
- Write
  - Success when the response is “success”
  - Unknown when the response is “failure”
- Read
  - Data is always readable after it has been written successfully
- Under all circumstances, data remains eventually consistent between master and slave
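9. Synchronous replication: write path
[Diagram: Application, Master (handler, Mem, HDFS, Async-Replication Manager), Slave (handler, Mem, HDFS, Async-Replication Manager)]
- Step 1: put (Application to the master handler)
- Step 2: local log and, with the same step number, remote log
- Step 3: Mem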
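
The write path on this slide can be sketched as follows. This is a minimal Python illustration, not HBase code: `ListLog`, `sync_put`, and the dict used as the memstore are hypothetical stand-ins, and issuing the two step-2 appends in parallel is an assumption based on both carrying the same step number in the diagram.

```python
from concurrent.futures import ThreadPoolExecutor


class ListLog:
    """Hypothetical in-memory stand-in for a log file stored on HDFS."""

    def __init__(self):
        self.entries = []

    def append(self, entry):
        self.entries.append(entry)


def sync_put(entry, local_log, remote_log, memstore):
    """Apply one put: write both logs (step 2), then update memory (step 3)."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Step 2: append to the local log and the remote log (assumed parallel).
        futures = [
            pool.submit(local_log.append, entry),
            pool.submit(remote_log.append, entry),
        ]
        for future in futures:
            future.result()  # re-raises if either append failed
    # Step 3: the write reaches memory only after both appends succeeded.
    key, value = entry
    memstore[key] = value
    return "success"
```

The sketch only covers the case where both appends succeed; the partial-failure cases are the subject of the consistency slide later in the deck.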
10. Remote log
- Log content
  - Data not yet replicated by asynchronous replication
- File format
  - Same as hlog: a collection of entries
- Log organization
  - Remote logs and hlogs have a many-to-one relationship
  - A remote log uses the same file name prefix as its hlog
  - Stored on the slave HDFS
11. Remote log cleaning
- When is a remote log cleaned?
  - When the corresponding hlog has been replicated by asynchronous replication
- Who cleans the remote log?
  - The master cluster
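
The cleaning rule can be sketched as a filter over remote log names. This is an illustrative Python sketch under a stated assumption: the slides only say that a remote log shares its file name prefix with its hlog, so the concrete naming scheme `<hlog name>.<sequence>` and the function names below are hypothetical.

```python
def hlog_name_of(remote_log_name):
    """Return the hlog name, assuming the name is '<hlog name>.<sequence>'."""
    return remote_log_name.rsplit(".", 1)[0]


def remote_logs_to_clean(remote_log_names, replicated_hlog_names):
    """Select remote logs whose hlog was already shipped by async replication."""
    replicated = set(replicated_hlog_names)
    return [name for name in remote_log_names if hlog_name_of(name) in replicated]


# Two remote logs belong to hlog "rs1.1001" (many to one), one to "rs1.1002".
remote_logs = ["rs1.1001.0", "rs1.1001.1", "rs1.1002.0"]
cleanable = remote_logs_to_clean(remote_logs, ["rs1.1001"])
```

Matching on the exact hlog name rather than a raw string prefix avoids treating `rs1.100` as the owner of `rs1.1001.0`.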
12. Disabling the remote log
- When does the remote log need to be disabled?
  - Before a switch, because some clients may still be accessing the master
- How is the remote log disabled?
  - Create a lock file
  - Recover the lease for the current remote logs
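
The lock-file step can be illustrated on a local file system. This is a hedged Python sketch, not the HBase or HDFS implementation: the lock file name and both function names are hypothetical, and lease recovery for remote logs that are already open is not modeled here.

```python
import os

LOCK_FILE_NAME = "remote-log.lock"  # hypothetical name, not taken from the slides


def disable_remote_log(log_dir):
    """Create the lock file atomically; return False if it already existed."""
    path = os.path.join(log_dir, LOCK_FILE_NAME)
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.close(fd)
    return True


def create_remote_log(log_dir, name):
    """Create a new remote log file unless the lock file is present."""
    if os.path.exists(os.path.join(log_dir, LOCK_FILE_NAME)):
        raise RuntimeError("remote log is disabled")
    path = os.path.join(log_dir, name)
    open(path, "x").close()  # "x" mode fails if the file already exists
    return path
```

The lock file only blocks the creation of new remote logs. A writer that still holds an open remote log is unaffected by it, which is presumably why the slide lists lease recovery as a second step.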
13. Failure scenarios. Case: master crash
[Diagram: Application, Master (handler, Mem, HDFS, Async-Replication Manager), Slave (handler, Mem, HDFS, Async-Replication Manager)]
- Step 1: Disable remote log
- Step 2: Replay remote log
- Step 3: Switch client
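
The three failover steps can be sketched as one function. This is an illustrative Python sketch: the `cluster_state` dict, the list of remote log entries, and the dict standing in for the slave's memory are hypothetical simplifications of the components in the diagram.

```python
def fail_over_to_slave(cluster_state, remote_log_entries, slave_memstore):
    """Run the master-crash steps in the order given on the slide."""
    # Step 1: disable the remote log so no new entries are written to it.
    cluster_state["remote_log_enabled"] = False
    # Step 2: replay the remote log, which holds the data that asynchronous
    # replication had not yet delivered to the slave.
    for key, value in remote_log_entries:
        slave_memstore[key] = value
    # Step 3: point clients at the slave.
    cluster_state["active"] = "slave"
    return cluster_state


state = {"active": "master", "remote_log_enabled": True}
slave = {"row1": "old"}
fail_over_to_slave(state, [("row1", "new"), ("row2", "v2")], slave)
```

Disabling the remote log before replaying it keeps the set of entries to replay fixed while the replay runs.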
14. Failure scenarios. Case: master recovery
[Diagram: Application, Master (handler, Mem, HDFS, Async-Replication Manager), Slave (handler, Mem, HDFS, Async-Replication Manager)]
- Step 1: Disable read/write (Master)
- Step 2: Enable remote log
- Step 3: Wait until sync delay < 10s
- Step 4: Disable read/write (Slave)
- Step 5: Wait until consistent
- Step 6: Switch client
- Step 7: Enable read/write
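
The ordering of the seven recovery steps, including the two waits, can be sketched as below. This is a hedged Python sketch: `recover_master`, `_poll`, and the two injected callables are hypothetical, the function only records step labels instead of acting on a cluster, and a real implementation would sleep between polls.

```python
def _poll(condition, max_polls):
    """Call condition() until it returns True, up to max_polls times."""
    for _ in range(max_polls):
        if condition():
            return
    raise TimeoutError("condition not met")


def recover_master(sync_delay_seconds, is_consistent, max_polls=1000):
    """Return the recovery steps in the order they complete."""
    steps = ["1 disable read/write", "2 enable remote log"]
    # Step 3: wait until the reported sync delay drops below 10 seconds.
    _poll(lambda: sync_delay_seconds() < 10, max_polls)
    steps.append("3 sync delay < 10s")
    steps.append("4 disable read/write")
    # Step 5: wait until the two clusters are fully consistent.
    _poll(is_consistent, max_polls)
    steps.append("5 consistent")
    steps.append("6 switch client")
    steps.append("7 enable read/write")
    return steps


delays = iter([30, 12, 4])  # simulated sync delay readings, in seconds
completed = recover_master(lambda: next(delays), lambda: True)
```

Waiting for a small delay before step 4 presumably keeps the window in which neither cluster accepts reads and writes (steps 4 to 7) short.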
15. Failure scenarios. Case: slave crash
[Diagram: Application, Master (handler, Mem, HDFS, Async-Replication Manager), Slave (handler, Mem, HDFS, Async-Replication Manager)]
- Step 1: Degrade to async replication
16. Consistency

| Case | Action | Consistency |
| --- | --- | --- |
| Local log success, remote log fail | Block and retry forever | (1) Consistency is kept across retries. (2) If the server crashes, the remote log write succeeds again on replay. |
| Local log fail, remote log success | Return fail to client | (1) If the client keeps accessing the master, the remote log will be deleted and never replayed on the slave. (2) If the client switches to the slave before the remote log is deleted, the remote log will be replayed and seen by the client, and async replication will deliver this log back to the master. |
| Local log fail, remote log fail | Return fail to client | Remains consistent |
| Local log success, remote log success | Return success to client | Remains consistent |
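
The four cases on this slide reduce to a small decision function. This is an illustrative Python sketch: `write_action` is a hypothetical name, and the returned strings simply restate the "Action" entries from the slide.

```python
def write_action(local_log_ok, remote_log_ok):
    """Map the outcome of the two log writes to the action on the slide."""
    if local_log_ok and remote_log_ok:
        return "return success to client"
    if local_log_ok:
        # Local log succeeded but the remote log failed.
        return "block and retry forever"
    # The local log failed, whether or not the remote log succeeded.
    return "return fail to client"


actions = {
    (local, remote): write_action(local, remote)
    for local in (True, False)
    for remote in (True, False)
}
```

The client sees "fail" whenever the local log write fails, which matches the write semantics stated earlier in the deck: a "failure" response means the outcome is unknown, because the entry may still exist in the remote log.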
17. Switch support
- Availability monitor
  - Network partition
  - Node crash
  - Error rate
- Switch API
  - Define active and backup
    - The active cluster is the one accessed by clients
    - The backup cluster is disabled for access
  - Define the switch process from cluster A to cluster B
    - Switch A from active to backup
    - Switch B from backup to active
  - Unify synchronous and asynchronous
- Client switch
  - Logical cluster address
  - Push new cluster address
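
The active/backup roles and the A-to-B switch process can be sketched as a small class. This is a hedged Python sketch, not the actual switch API: `LogicalCluster`, its methods, and the example addresses are hypothetical, and pushing the new address is modeled as calling subscriber callbacks.

```python
class LogicalCluster:
    """Hypothetical logical cluster that maps to one active physical cluster."""

    def __init__(self, clusters, active):
        self.clusters = dict(clusters)  # cluster name -> physical address
        self.roles = {
            name: ("active" if name == active else "backup") for name in self.clusters
        }
        self.subscribers = []  # callbacks that receive the new address

    def active_address(self):
        """Return the physical address of the cluster that clients may access."""
        for name, role in self.roles.items():
            if role == "active":
                return self.clusters[name]
        return None

    def switch(self, source, target):
        """Switch source from active to backup, then target from backup to active."""
        if self.roles[source] != "active" or self.roles[target] != "backup":
            raise ValueError("source must be active and target must be backup")
        self.roles[source] = "backup"
        self.roles[target] = "active"
        for notify in self.subscribers:
            notify(self.clusters[target])  # push the new cluster address


pushed = []
logical = LogicalCluster({"A": "zk-a:2181", "B": "zk-b:2181"}, active="A")
logical.subscribers.append(pushed.append)
logical.switch("A", "B")
```

Demoting A before promoting B follows the order on the slide and means there is never a moment with two active clusters.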
18. Synchronous replication
- Use cases
  - Internal state for stream processing
  - Sequential access: pub/sub systems
  - CheckAndPut operations
- Performance
  - 2% lower throughput than async replication (network delay = 0.5 ms)
19. Synchronous vs. Asynchronous

| | Asynchronous | Synchronous |
| --- | --- | --- |
| Read path | No effect | No effect |
| Write path | No effect | ~2% throughput decline |
| Network | 100% for asynchronous replication | 200% for asynchronous replication and remote log |
| Eventual consistency | No, if the master crashes and cannot recover | Yes |
| Availability | Blocks until master replication recovery, which may take hours after a massive crash | Blocks for a few minutes waiting for remote log replay |
| Storage space | 2 copies | 2 copies + remote log (small) |
20. Thanks
tianwu.sch@alibaba-inc.com
qingyi.mqy@alibaba-inc.com