- 快召唤伙伴们来围观吧
- 微博 QQ QQ空间 贴吧
- 文档嵌入链接
- 复制
- 微信扫一扫分享
- 已成功复制到剪贴板
Bestpay-Pulsar
通过此次分享您可以了解到:
Lambda架构和Apache Pulsar
Orange Financial如何使用Lambda进行风险控制决策部署,以及如何利用Pulsar提高效率
展开查看详情
1 . How Orange Financial combat financial frauds over 50M transactions a day using Apache Pulsar Vincent Xie (Bestpay), Jia Zhai (StreamNative)
2 .About us Vincent (Weisheng) Xie Jia Zhai ❏ Current Director @ Orange Financial ❏ Co-Founder of StreamNative ❏ Previous Tech lead of ML engineering ❏ Apache Pulsar PMC Member team @ Intel ❏ Apache BookKeeper PMC Member
3 .Agenda ❏ Background ❏ Apache Pulsar ❏ Unified Data Processing ❏ Our Practices ❏ Q&A
4 .Background Intro
5 .Orange Financial Orange Financial Services Group (Chinese: 甜橙金融), formerly known as Bestpay, is an affiliate company of China Telecom. It reached 1.13 trillion CNY ($18.37 Billion) transaction volume in 2018, with 500 million registered users and 41.9 million active users. Subsidiaries: Bestpay - a mobile wallet and payment app Jieqian - a consumer loan service Orange Wealth Orange Insurance Orange Credit Orange Financial Cloud
6 .
7 .Source: iiMedia Research Inc.
8 .High Industry Penetration Rate Source: China Unionpay
9 .Source: RSA
10 .Challenges ❏ High concurrency ❏ > 50M transactions, 1 billion events a day (peek: 35K/s) ❏ Low latency demand ❏ response < 200ms ❏ Large number of batch jobs and streaming jobs
11 .“A merchant’s total transaction volume ($) within the past month (30days) (current transaction included)” = sum($past_29days) + sum($today_upto_current) batch streaming
12 .Architecture V1 API Gateway
13 .Architecture V1 - Lambda Speed/Streaming Layer API Serving Gateway Layer Batch Layer
14 .Drawbacks ❏ S/W stacks complexity ❏ Realtime / Offline / Serving stacks ❏ Multiple clusters to maintain (Kafka / Hive / Spark / Flink) ❏ Different skill sets to manipulate (Scala / Java / SQL) ❏ Segmented Logics ❏ Historical/Current ❏ Data redundancy ❏ Multiple duplications to move over
15 .Introduce Apache Pulsar
16 .What is Apache Pulsar?
17 . “Flexible Pub/Sub Messaging Backed by durable log storage”
18 .Pulsar - A cloud-native architecture Stateless Serving Durable Storage
19 .Pulsar - Segment Centric Storage ❏ Topic Partition (Managed Ledger) ❏ The storage layer for a single topic partition ❏ Segment (Ledger) ❏ Single writer, append-only ❏ Replicated to multiple bookies
20 .Pulsar - Pub/Sub
21 .Pulsar - Topic Partitions
22 .Pulsar - Segments
23 .Pulsar - Stream
24 .Pulsar - Stream as a unified view on data
25 .Pulsar - Two levels of reading API ❏ Pub/Sub (Streaming) ❏ Read data from brokers ❏ Consume / Seek / Receive ❏ Subscription Mode - Failover, Shared, Key_Shared ❏ Reprocessing data by rewinding (seeking) the cursors ❏ Segment (Batch) ❏ Read data from storage (bookkeeper or tiered storage) ❏ Fine-grained Parallelism ❏ Predicate pushdown (publish timestamp)
26 .Unified Data Processing on Pulsar
27 .Architecture V2 Spark Structured Streaming Spark SQL API Gateway
28 .Architecture V2 Spark Structured Streaming Spark SQL API Gateway ❏ Single Data Store (Pulsar) ❏ Single Computing Engine (Spark) ❏ Unified API
29 .Pulsar-Spark ❏ Deeply integrated with Pulsar schema ❏ Pulsar topics as Structured Streams ❏ Pulsar Connectors for Spark Structured Streaming ❏ Pulsar Connectors for Spark SQL https://github.com/streamnative/pulsar-spark