- 快召唤伙伴们来围观吧
- 微博 QQ QQ空间 贴吧
- 视频嵌入链接 文档嵌入链接
- 复制
- 微信扫一扫分享
- 已成功复制到剪贴板
TGIP-CN-018: Pulsar Functions Deep Dive
展开查看详情
1 . © 2020 SPLUNK INC. Pulsar Functions A Deep Dive | Pulsar Summit 2020 Sanjeev Kulkarni sanjeevk@splunk.com
2 . © 2020 SPLUNK INC. Pulsar Functions:- A Deep Dive Agenda Brief introduction to Pulsar Functions Deep Dive into internals • Submission workflow • Scheduling workflow • Execution workflow • Java Instance concepts Current/Future Work
3 . © 2020 SPLUNK INC. Pulsar Functions:- A Deep Dive Agenda Brief introduction to Pulsar Functions Deep Dive into internals • Submission workflow • Scheduling workflow • Execution workflow • Java Instance concepts Current/Future Work
4 . © 2020 SPLUNK INC. Pulsar Functions:- A Brief Introduction Core Concept Bringing Serverless concepts to the Abstract View streaming world. Execute processing logic per message on input topic Function output goes to an output topic • Optional
5 . © 2020 SPLUNK INC. Pulsar Functions:- A Brief Introduction Simple API Emphasis on simplicity SDK-less API Great for 90% use-cases on streams • Filtering import java.util.function.Function; public class ExclamationFunction implements Function<String, String> { • Routing @Override • Enrichment public String apply(String input) { return input + "!"; } } Not meant to replace Spark/Flink
6 . © 2020 SPLUNK INC. Pulsar Functions:- A Brief Introduction Function lifecycle Flexible execution environments • Pulsar managed – Thread – Process • Externally managed – Kubernetes CRUD based Rest API
7 . © 2020 SPLUNK INC. Pulsar Functions:- A Deep Dive Agenda Brief introduction to Pulsar Functions Deep Dive into internals • Submission workflow • Scheduling workflow • Execution workflow • Java Instance concepts Current/Future Work
8 . © 2020 SPLUNK INC. Pulsar Functions:- Submission Workflow Function Representation Submit to any worker FunctionConfig public class FunctionConfig { Json repr of FunctionConfig private String tenant; private String namespace; • tenant/namespace/name private String name; • Input/Output private String className; private Collection<String> inputs; • configs private String output; • lot more knobs …. private ProcessingGuarantees processingGuarantees; private Map<String, Object> userConfig; private Map<String, Object> secrets; private Integer parallelism; Function Code private Resources resources; • jars/.py/zip/etc ... }
9 . © 2020 SPLUNK INC. Pulsar Functions:- Submission Workflow Submission Checks AuthN/AuthZ checks FunctionMetaData FunctionConfig validation message FunctionMetaData { • missing parameters FunctionDetails functionDetails; • Incorrect parameters PackageLocationMetaData packageLocation; • Local Configs uint64 version; uint64 createTime; map<int32, FunctionState> instanceStates; Function Code Validation FunctionAuthenticationSpec functionAuthSpec; • class presence, etc } Copy Code to Bookeeper
10 . © 2020 SPLUNK INC. Pulsar Functions:- Submission Workflow Function MetaData Manager Function MetaData Manager System of record Stores all Functions • map from <FQFN, FunctionMetaData> foo -> {functionDetails : {...}, MetaData Topic Tailer version: 2, …} FQFN:- Fully Qualified Function Name Backed by Pulsar Topic MetaData Topic • Function MetaData Topic Contains a MetaData Topic Tailer
11 . © 2020 SPLUNK INC. Pulsar Functions:- Submission Workflow Function MetaData Manager:- Update State Machine Just before Function creation/update/delete foo -> {functionDetails : {...}, MetaData Topic Tailer version: 2, …} MetaData Topic
12 . © 2020 SPLUNK INC. Pulsar Functions:- Submission Workflow Function MetaData Manager:- Update State Machine Make a copy of the current state foo -> {functionDetails : {...}, version: 2, …} foo -> {functionDetails : {...}, MetaData Topic Tailer version: 2, …} MetaData Topic
13 . © 2020 SPLUNK INC. Pulsar Functions:- Submission Workflow Function MetaData Manager:- Update State Machine Make a copy of the current state foo -> {functionDetails : {......}, version: 2, …} Merge the updates foo -> {functionDetails : {...}, MetaData Topic Tailer version: 2, …} MetaData Topic
14 . © 2020 SPLUNK INC. Pulsar Functions:- Submission Workflow Function MetaData Manager:- Update State Machine Make a copy of the current state foo -> {functionDetails : {......}, version: 3, …} Merge the updates foo -> {functionDetails : {...}, Increment the version MetaData Topic Tailer version: 2, …} MetaData Topic
15 . © 2020 SPLUNK INC. Pulsar Functions:- Submission Workflow Function MetaData Manager:- Update State Machine Make a copy of the current state Merge the updates foo -> {functionDetails : {...}, Increment the version MetaData Topic Tailer version: 2, …} Write to MetaData Topic MetaData Topic
16 . © 2020 SPLUNK INC. Pulsar Functions:- Submission Workflow Function MetaData Manager:- Update State Machine Make a copy of the current state Merge the updates foo -> {functionDetails : {...}, Increment the version MetaData Topic Tailer version: 2, …} Write to MetaData Topic Tailer reads and verifies MetaData Topic
17 . © 2020 SPLUNK INC. Pulsar Functions:- Submission Workflow Function MetaData Manager:- Update State Machine Make a copy of the current state Merge the updates foo -> {functionDetails : {.....}, Increment the version MetaData Topic Tailer version: 3, …} Write to MetaData Topic Tailer reads and verifies MetaData Topic Upon no conflict, tailer updates
18 . © 2020 SPLUNK INC. Pulsar Functions:- Submission Workflow Function MetaData Manager:- When do conflicts occur? Worker 2 Worker 1 Multiple Workers foo -> {functionDetails : {...}, foo -> {functionDetails : {...}, version: 2, version: 2, …} …} MetaData Topic Tailer MetaData Topic Tailer MetaData Topic
19 . © 2020 SPLUNK INC. Pulsar Functions:- Submission Workflow Function MetaData Manager:- When do conflicts occur? Worker 2 Worker 1 Multiple Workers foo -> {functionDetails : {...}, foo -> {functionDetails : {...}, Concurrent updates to same function version: 2, …} version: 2, …} MetaData Topic Tailer MetaData Topic Tailer MetaData Topic
20 . © 2020 SPLUNK INC. Pulsar Functions:- Submission Workflow Function MetaData Manager:- When do conflicts occur? Worker 2 Worker 1 Multiple Workers foo -> {functionDetails : {...}, foo -> {functionDetails : {...}, Concurrent updates to same function version: 3, …} version: 3, …} First Writer Wins MetaData Topic Tailer MetaData Topic Tailer MetaData Topic
21 . © 2020 SPLUNK INC. Pulsar Functions:- Submission Workflow Function MetaData Manager:- When do conflicts occur? Worker 2 Worker 1 Multiple Workers foo -> {functionDetails : {...}, foo -> {functionDetails : {...}, Concurrent updates to same function version: 3, …} version: 3, …} First Writer Wins MetaData Topic Tailer MetaData Topic Tailer Others are rejected MetaData Topic
22 . © 2020 SPLUNK INC. Pulsar Functions:- Submission Workflow Advantages Worker 2 Worker 1 Submit to any worker foo -> {functionDetails : {...}, foo -> {functionDetails : {...}, Validation load scales linearly version: 3, …} version: 3, …} Deterministic State Machine MetaData Topic Tailer MetaData Topic Tailer MetaData Topic is audit log MetaData Topic
23 . © 2020 SPLUNK INC. Pulsar Functions:- Submission Workflow Pitfalls Worker 2 Worker 1 MetaData topic topic growth foo -> {functionDetails : {...}, foo -> {functionDetails : {...}, MetaData Topic compaction version: 3, version: 3, …} …} non-trivial Worker Start time MetaData Topic Tailer MetaData Topic Tailer All Workers know everything MetaData Topic
24 . © 2020 SPLUNK INC. Pulsar Functions:- A Deep Dive Agenda Brief introduction to Pulsar Functions Deep Dive into internals • Submission workflow • Scheduling workflow • Execution workflow • Java Instance concepts Current/Future Work
25 . © 2020 SPLUNK INC. Pulsar Functions:- Scheduling Workflow Pluggable Scheduler IScheduler Interface Abstracts out Scheduler Executed only on a Leader public interface IScheduler { List<Assignment> schedule(<List<Instance> unassigned, List<Instance> current, Invoked when Set<String> workers); • Function CRUD operations } – create/update – delete • Worker Changes – Unresponsive/dead workers – New workers – Periodic – Leadership changes
26 . © 2020 SPLUNK INC. Pulsar Functions:- Scheduling Workflow Leader Election Leader Election Empty Coordination Topic Failover Subscription Worker 3 Worker 2 Worker 1 Active Consumer is the Leader Coordination Topic
27 . © 2020 SPLUNK INC. Pulsar Functions:- Scheduling Workflow Function Assignments Worker 3 Worker 2 Worker 1 Assignment Topic {foo, 1} : worker-1, {foo, 1} : worker-1, {foo, 1} : worker-1, ... ... ... Written by the Leader Compacted based on key(FQFN + Assignment Tailer Assignment Tailer Assignment Tailer Instance Id) All workers know about all assignments Assignment Topic
28 . © 2020 SPLUNK INC. Pulsar Functions:- Scheduling Workflow Assignment Topic Assignment Stores Assignment message Instance { Compacted FunctionMetaData functionMetaData = 1; int32 instanceId = 2; Key -> (FQFN + InstanceId) } message Assignment { Instance instance = 1; string workerId = 2; }
29 . © 2020 SPLUNK INC. Pulsar Functions:- A Deep Dive Agenda Brief introduction to Pulsar Functions Deep Dive into internals • Submission workflow • Scheduling workflow • Execution workflow • Java Instance concepts Current/Future Work