- 快召唤伙伴们来围观吧
- 微博 QQ QQ空间 贴吧
- 文档嵌入链接
- 复制
- 微信扫一扫分享
- 已成功复制到剪贴板
PingCAP-Infra-Meetup-105-Happy+Hacking+TiDB
展开查看详情
1 . Streaming, SQL and TiDB Batch Streaming SQL Presented By Du Chuan
2 .About me • Infrastructure engineer - working on cloud database • TiDB committer - mainly in the TiDB SQL layer • Technophile - database & distribute systems
3 .Agenda • Streaming Overview • SQL and Streaming SQL • Introduction to TBSSQL
4 .Part I – Streaming Overview
5 .“Before it’s news, it’s data. After it‘s news, it’s data.”
6 .Batch systems VS Streaming systems
7 .Batch VS Streaming • Processing on data • Get data via processing • Finite data set • Infinite data streaming
8 .But wait… What’s streaming, indeed?
9 .Essentials of Streaming • Bind with time • Strictly ordered by time • Infinite
10 .Streaming is the way the world organizing data
11 . Batch …… …… Streaming
12 .Batch is a snapshot of some Streaming At some point of time In some scope
13 .So... One system to rule them both ?
14 .Batch && Streaming systems • Using a batch runtime to simulate • Using a streaming runtime to simulate streaming batch
15 .What about the RDMS?
16 . An example – Ads click analysis Click- User ID Click User 1 ADS Firehose 1 DML Click- User ID Click User 2 ADS 2 Click- User ID Click User 3 ADS 3 Click- User ID Click User 1 ADS 1 Click- User ID Click User 3 ADS 3 Click- User ID User Table Ads Table Click User 1 ADS 1
17 .Part II – SQL and Streaming SQL
18 .The volcano model… Again
19 .SQL consists various kinds of operators
20 . The SQL operators SELECT t1.a, count(*) FROM t1 JOIN t2 ON t1.b = t2.b WHERE t1.c > 10 GROUP BY t1.a; Projection Join Selection Aggregation
21 .The SQL operator interface • Open: Init operator SQL Operator • Next: Consume input and generate output • Close: Do cleanup job
22 .The volcano model input = child.Next() return proj(input, t.a + t.b) SELECT t.a + t.b (". $ + ". &) FROM t input = child.Next() WHERE t.c > 10; return filter(input, t.c > 10) (t.c > 10) t return scan(t)
23 .The SQL operator reordering >1 ( ) ) ) > 2 < 2 > ( ) ) ) 2 < >1 2
24 . Aggregation operator • SELECT a, COUNT(c) AS cnt FROM t GROUP BY a HAVING cnt > 1; 1. Grouping 2. Apply aggregators 3. Filter using“Having”clause
25 . Join operator • SELECT * FROM t1 (LEFT/RIGHT/FULL OUTER) JOIN)t2 ON t1.a = t2.a; 1. Get Cartesian product 2. Filter using “On” . clause
26 .When SQL operator meets Streaming…
27 .SELECT a, COUNT(c) FROM t GROUP BY a; 1. Grouping(Never End) 2. Apply aggregators Infinite Data Stream
28 .Solution: limit the scope of streaming using “window”
29 . SELECT a, COUNT(c) FROM t GROUP BY a; 1. Grouping 2. Apply Aggregators Window1 Infinite Data Stream Window2