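Apache Pulsar in Practice at BIGO
Chen Hang, lead of the messaging system on BIGO's big data platform; Apache Pulsar & BookKeeper contributor
The talk covers why BIGO adopted Pulsar, BIGO's hands-on experience tuning Pulsar performance, and how Flink combined with Pulsar is used in BIGO's real-time data processing.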
2 .Apache Pulsar in Practice at BIGO | BIGO
3 .About Me
- BIGO
- Apache Pulsar Contributor
- Apache BookKeeper Contributor
- StreamNative/Pulsar-Flink-Connector Contributor
4 .Why BIGO Chose Apache Pulsar - BIGO BIGO Kafka Kafka o o ISR o catch-up PageCache o HDD o (KMM) o , BIGO Kafka 0.5 / 1 /
5 .Why BIGO Chose Apache Pulsar – Apache Pulsar Apache Pulsar Apache Ø Ø Yahoo - Pub-Sub Ø 5 ms Ø Pulsar Apache BookKeeper Kafka Ø BookKeeper IO catch-up Ø : Pulsar geo-replication , Ø : Pulsar
6 .Apache Pulsar BIGO 2019.12 2020.4 Apache Pulsar Apache Pulsar Apache Pulsar Apache Pulsar Apache Pulsar & Apache Kafka 2019.11 2020.4 2020.5
7 .BIGO Pulsar / … Pulsar Pulsar 2~3 GB/s
8 .Apache Pulsar BIGO Baina (BIGO C++ ) KMM (Kafka Mirror Maker) Flink SQL … … PUB - SUB
9 .Apache Pulsar BIGO … …
10 . StreamNative/Pulsar-flink-connector PR-115 Ø flink 1.11 BIGO DEMO – flink sql Ø Flink SQL ❶ Pulsar ❸ Hive
11 .BIGOer Apache Pulsar &
12 . (Apache Pulsar 2.5.1 HDD) Ø ZooKeeper Pulsar Ø Pulsar broker Pulsar client Lookup Timeout Exception” Ø Pulsar broker Ø Bookie Ø Pulsar broker Cache bookie Ø Reader API(eg. Pulsar Flink Connector) Pulsar topic Pulsar 2.5.2 Ø broker direct memory OOM Ø Journal HDD fsync bookie add entry 99th latency Ø bookie add entry latency
13 .Apache Pulsar
ü Bookie Journal/Ledger
ü Journal/Ledger HDD ZooKeeper dataDir/dataLogDir Journal/Ledger
§ OS: 1 ~ 2 GB
§ JVM: 1/2
  § heap: 1/3
  § direct memory: 2/3
§ PageCache: 1/2
o jvm heap/gc
o bytes in per broker
o message in per broker
o loadbalance
o broker Cache
o bookie client quarantine ratio
o bookie client request queue
Broker
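The memory ratios on this slide can be turned into a quick sizing calculation. The sketch below is illustrative only: the function name `split_memory` and the 64 GB example node are hypothetical, and it assumes the 1/2 JVM and 1/2 PageCache shares apply to whatever is left after the OS reservation, which the slide does not state explicitly.

```python
def split_memory(total_gb, os_reserved_gb=2.0):
    """Split a node's memory using the ratios listed on the slide.

    Assumption (not stated on the slide): the JVM and PageCache halves
    are taken from the memory remaining after the OS reservation.
    """
    usable = total_gb - os_reserved_gb       # slide: OS keeps 1 ~ 2 GB
    jvm = usable / 2                         # slide: JVM gets 1/2
    return {
        "os": os_reserved_gb,
        "jvm_heap": jvm / 3,                 # slide: heap is 1/3 of the JVM share
        "jvm_direct": jvm * 2 / 3,           # slide: direct memory is 2/3
        "page_cache": usable / 2,            # slide: PageCache gets 1/2
    }


# Hypothetical 64 GB node with 2 GB left to the OS.
plan = split_memory(64)
for name, gb in plan.items():
    print(f"{name}: {gb:.1f} GB")
```

The four shares always add back up to the node's total memory, so the result can be checked against the machine size before it is written into JVM flags.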
14 .Pulsar Broker Ø ZooKeeper Ø auto bundle split Ø § Broker § Bookie Ø Ø Cache Bookkeeper Ø Journal Ø Ledger
15 . ZooKeeper § HDD ZooKeeper dataDir/dataLogDir IO bookie Journal/Ledger ) SSD § ZooKeeper dataDir dataLogDir SSD § broker/bookie ZooKeeper
17 . auto bundle split Pulsar bundle split producer/consumer/reader namespace bundle auto bundle split broker1 broker1 broker4 broker5 broker2 broker3
19 . – Broker (PR-6772)
new_avg = old_avg * factor + (1-factor) * avg
Loadbalance: broker resource usage > average resource usage + threshold
loadBalancerLoadSheddingStrategy=org.apache.pulsar.broker.loadbalance.impl.ThresholdShedder
[Figure: bundles B1 to B6 placed on Broker1 and Broker2, with B4 shown on both brokers in the second panel; usage axis marked 0, avg, avg+threshold, 100]
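The two expressions on this slide can be read as a smoothing step followed by a threshold test. The following is a minimal sketch of that reading, not the ThresholdShedder source: the helper names, the broker names, and the values `factor=0.9` and `threshold=10` are assumptions chosen for illustration.

```python
def update_avg(old_avg, current, factor):
    """Slide formula: new_avg = old_avg * factor + (1 - factor) * avg."""
    return old_avg * factor + (1 - factor) * current


def overloaded_brokers(usage, threshold):
    """Return brokers whose usage exceeds the cluster average plus threshold.

    Slide condition: broker resource usage > average resource usage + threshold.
    """
    avg = sum(usage.values()) / len(usage)
    return sorted(b for b, u in usage.items() if u > avg + threshold)


# Hypothetical resource usage in percent (0 to 100).
history = {"broker1": 80.0, "broker2": 30.0, "broker3": 35.0}
latest = {"broker1": 90.0, "broker2": 20.0, "broker3": 35.0}

# Blend the latest sample into the historical average for each broker.
smoothed = {b: update_avg(history[b], latest[b], factor=0.9) for b in history}

# Brokers that would be candidates for shedding bundles.
candidates = overloaded_brokers(smoothed, threshold=10)
print(candidates)
```

With a factor close to 1 the historical value dominates, so a single usage spike moves the smoothed value only slightly and is less likely to trigger shedding on its own.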
21 .—Bookie
22 .—Bookie Bookkeeper PR-2327 Bookie Client
24 .– Broker Ø
26 . Cache (PR-6769, PR-7894)
Ø broker Cache Tailing Read
Ø bookie write Cache (Memtable)
Ø bookie read Cache
Ø OS PageCache
Has Active Cursor:
§ durable cursor
§ Cursor lag managedLedgerCursorBackloggedThreshold (1000 entries)
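The "Has Active Cursor" conditions can be pictured as a simple predicate. This is a hedged sketch rather than Pulsar's implementation: the `Cursor` class and both function names are hypothetical, and because the comparison operator was lost from the slide text, it assumes a cursor counts as active when it is durable and its lag is below managedLedgerCursorBackloggedThreshold.

```python
from dataclasses import dataclass

# Slide: managedLedgerCursorBackloggedThreshold (1000 entries)
BACKLOG_THRESHOLD = 1000


@dataclass
class Cursor:
    """Hypothetical stand-in for a subscription cursor."""
    durable: bool
    lag_entries: int


def is_active(cursor, threshold=BACKLOG_THRESHOLD):
    # Assumption: "active" means durable AND lagging by fewer entries
    # than the backlog threshold.
    return cursor.durable and cursor.lag_entries < threshold


def has_active_cursor(cursors, threshold=BACKLOG_THRESHOLD):
    """True if at least one cursor on the topic is active."""
    return any(is_active(c, threshold) for c in cursors)


tailing = Cursor(durable=True, lag_entries=12)        # keeps up with writers
backlogged = Cursor(durable=True, lag_entries=50_000) # catch-up reader
reader = Cursor(durable=False, lag_entries=0)         # non-durable cursor

print(has_active_cursor([backlogged, reader]))
print(has_active_cursor([tailing, backlogged]))
```

Under this reading, only topics with at least one tailing durable subscriber would benefit from keeping newly written entries in the broker cache.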
28 .Bookkeeper Journal
29 .Bookkeeper Journal Bookkeeper PR-2287 JOURNAL DEVICE fsync = true SSD Journal 1. Roll file 2. OS flush ( 30s) fsync = false 3. flush JOURNAL flush DEVICE PageCache HDD
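The difference between the two fsync settings on this slide can be illustrated at the operating-system level. The sketch below is not BookKeeper code: the helper `append_entry` and the file name `journal.log` are hypothetical. It only shows that with fsync the writer waits for the device, while without it the data sits in the PageCache until the OS flushes it later.

```python
import os
import tempfile


def append_entry(fd, payload, fsync):
    """Append one entry; optionally force it to the storage device."""
    os.write(fd, payload)
    if fsync:
        # fsync = true: block until the device confirms the write.
        os.fsync(fd)
    # fsync = false: return immediately; the data stays in the OS
    # PageCache until the kernel flushes it in the background.


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "journal.log")  # hypothetical journal file
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND)
    try:
        append_entry(fd, b"entry-1\n", fsync=True)
        append_entry(fd, b"entry-2\n", fsync=False)
    finally:
        os.close(fd)

    with open(path, "rb") as f:
        written = f.read()

print(written)
```

Both entries are readable afterwards in either mode; the trade-off is that an entry written without fsync can be lost if the machine loses power before the background flush, in exchange for lower add-entry latency on slow devices.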