Apache Kyuubi in Practice at eBay - Wang Fei (王斐)
Wang Fei (王斐)
Software Engineer at eBay, Apache Kyuubi PPMC Member
**Talk title:** Apache Kyuubi in Practice at eBay
**Abstract:** The basic architecture and use cases of Apache Kyuubi, and how eBay uses it to build a Unified & Serverless Spark Gateway.
1. Apache Kyuubi (Incubating) in eBay, 2022-03-12
2. Agenda
• Introduction
• Use case
• Improvements for eBay internal requirements
• Unified & Serverless Spark gateway
Apache Kyuubi in eBay 1
3. What is Kyuubi
Apache Kyuubi (Incubating) is a distributed multi-tenant Thrift JDBC/ODBC server for large-scale data management, processing, and analytics, built on top of Apache Spark and designed to support more engines (e.g., Flink, Trino).
4. Spark Thrift Server (STS) vs Kyuubi
Kyuubi - a better Spark SQL Gateway

| | STS | Kyuubi |
|---|---|---|
| submission | JDBC / Spark SQL | JDBC / Spark SQL |
| compile | Spark Catalyst / Server | Spark Catalyst / Engine |
| execution | all the queries share the same Spark app | multiple kinds of Spark app share policies |
5. The Architecture of Kyuubi
Kyuubi Server
- Distributed & Lightweight
- Cloud Native
- Highly available & load balanced
- Shared by all clients
Kyuubi Engine
- Pre-programmed & Extensible
- Full Spark SQL functionality
- Isolated/Shared by tenants
6. Engine Share Level - trade-off between Isolation & Resource
• CONNECTION
• USER
• POOL
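The isolation-versus-resource trade-off can be pictured as the choice of key under which an engine is looked up. The following is a minimal sketch, not Kyuubi's implementation: `ShareLevel`, `engine_key`, and the key layout are hypothetical names introduced for illustration, and POOL is left out because the slide does not describe its semantics.

```python
from enum import Enum


class ShareLevel(Enum):
    """Share levels named on the slide (POOL omitted, see lead-in)."""
    CONNECTION = "CONNECTION"
    USER = "USER"


def engine_key(level: ShareLevel, user: str, session_id: str) -> str:
    """Return the key under which an engine would be looked up or launched.

    Hypothetical helper for illustration only.
    """
    if level is ShareLevel.CONNECTION:
        # One engine per connection: best isolation, highest resource cost.
        return f"{level.value}/{user}/{session_id}"
    # USER: all sessions of the same user resolve to the same engine.
    return f"{level.value}/{user}"


# Two sessions of one user share an engine at USER level but not at CONNECTION level.
shared = engine_key(ShareLevel.USER, "alice", "s1") == engine_key(ShareLevel.USER, "alice", "s2")
isolated = engine_key(ShareLevel.CONNECTION, "alice", "s1") != engine_key(ShareLevel.CONNECTION, "alice", "s2")
```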
7. Engine - Dynamic Resource Scaling
Executor - Spark Dynamic Resource Allocation
    spark.dynamicAllocation.enabled=true
    spark.dynamicAllocation.minExecutors=0
    spark.dynamicAllocation.executorIdleTimeout=120
    spark.dynamicAllocation.cachedExecutorIdleTimeout=300
Driver - Kyuubi Engine Share Level
    kyuubi.engine.share.level=CONNECTION|USER
    kyuubi.session.engine.idle.timeout=PT12H
For the CONNECTION share level, the engine stops as soon as the client connection closes.
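The two timeouts act at different tiers: idle executors are released after a short interval, while the engine (driver) itself is kept much longer. A rough sketch of that relationship follows, assuming the executor timeout is in seconds and that the engine timeout is an ISO-8601 duration. `iso_duration_seconds` and `released_after` are hypothetical helpers, not part of Spark or Kyuubi.

```python
import re


def iso_duration_seconds(text: str) -> int:
    """Parse the PT#H#M#S subset of ISO-8601 durations, e.g. 'PT12H'."""
    match = re.fullmatch(r"PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?", text)
    if match is None or not any(match.groups()):
        raise ValueError(f"unsupported duration: {text!r}")
    hours, minutes, seconds = (int(g or 0) for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds


# Values from the slide; the unit of the executor timeout is assumed to be seconds.
EXECUTOR_IDLE_TIMEOUT_S = 120                            # spark.dynamicAllocation.executorIdleTimeout
ENGINE_IDLE_TIMEOUT_S = iso_duration_seconds("PT12H")    # kyuubi.session.engine.idle.timeout


def released_after(idle_seconds: int) -> list:
    """List which resources would have been given back after this much idle time."""
    released = []
    if idle_seconds >= EXECUTOR_IDLE_TIMEOUT_S:
        released.append("idle executors")
    if idle_seconds >= ENGINE_IDLE_TIMEOUT_S:
        released.append("engine (driver)")
    return released
```

With `minExecutors=0`, an engine that has been idle for a few minutes holds only its driver, and the driver goes away once the engine idle timeout passes.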
8. Use case - Spark SQL & Scala
SQL covers 80% of cases
SQL + Scala covers 95% of cases
KyuubiStatement::executeQuery
KyuubiStatement::executeScala
9. Kyuubi in eBay - Background
10. Improvements for eBay internal requirements
    1. Support KERBEROS and PLAIN authentication for Thrift JDBC at the same time (#1262)
    2. Support launching the query engine asynchronously while opening a session (#1346)
    3. Support fetching the engine launch log asynchronously for JDBC (#1377)
    4. Extend Hive BeeLine to fetch the engine launch log (#1414)
    6. Extend the Kyuubi Thrift API to support upload/download
    7. Support a cluster (data center) selector to serve requests from multiple clusters (based on #687)
    8. Support renewing tokens for Kyuubi engines in the multi-cluster case
    9. Support SPNEGO and BASIC authentication for the RESTful API (#2049)
    10. Enable the RESTful API to run SQL queries and submit batch jobs
11. Hive Thrift Protocol - sync vs async
[Diagram: numbered call sequences between client, server, and engine, comparing Sync OpenSession with Async OpenSession (OpenSession + Launch Engine Operation)]
12. Extend Hive Thrift RPC in a compatible way

    struct TOpenSessionReq {
      1: required TProtocolVersion client_protocol
      2: optional string username
      3: optional string password
      4: optional map<string, string> configuration
    }

    struct TOpenSessionResp {
      1: required TStatus status
      2: required TProtocolVersion serverProtocolVersion
      3: optional TSessionHandle sessionHandle
      4: optional map<string, string> configuration
    }

    struct TExecuteStatementReq {
      1: required TSessionHandle sessionHandle
      2: required string statement
      3: optional map<string, string> confOverlay
      4: optional bool runAsync = false
      5: optional i64 queryTimeout = 0
    }

    struct TExecuteStatementResp {
      1: required TStatus status
      2: optional TOperationHandle operationHandle
    }

Enable: kyuubi.session.engine.launch.async=true
TOperationHandle = TOpenSessionResp + configuration:kyuubi.session.engine.launch.handle.guid + configuration:kyuubi.session.engine.launch.handle.secret
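Because the launch handle travels in the existing optional `configuration` map, a client can recover it without any change to the Thrift structs, and a client that ignores the extra keys keeps working. The sketch below shows the client-side lookup under stated assumptions: the two keys are the ones on the slide, `LaunchEngineHandle` and `launch_handle_from` are hypothetical names, and the values are treated as opaque strings because the slide does not give their encoding.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Configuration keys as shown on the slide.
GUID_KEY = "kyuubi.session.engine.launch.handle.guid"
SECRET_KEY = "kyuubi.session.engine.launch.handle.secret"


@dataclass(frozen=True)
class LaunchEngineHandle:
    """Hypothetical stand-in for the operation handle of the launch-engine operation."""
    guid: str
    secret: str


def launch_handle_from(configuration: Dict[str, str]) -> Optional[LaunchEngineHandle]:
    """Rebuild the launch-engine handle from an OpenSession response configuration.

    Returns None when the keys are absent, e.g. when async launch is not enabled.
    """
    guid = configuration.get(GUID_KEY)
    secret = configuration.get(SECRET_KEY)
    if guid is None or secret is None:
        return None
    return LaunchEngineHandle(guid=guid, secret=secret)
```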
13. Hive Thrift Protocol - async open session
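The difference between the two modes is where the client waits. The toy simulation below is not the Thrift protocol itself: a thread pool stands in for the server, `launch_engine` fakes a slow engine start with a sleep, and all names are hypothetical.

```python
import time
from concurrent.futures import Future, ThreadPoolExecutor

# Stand-in for the server's background workers.
_pool = ThreadPoolExecutor(max_workers=2)


def launch_engine(delay_s: float) -> str:
    """Pretend to start an engine; the sleep models the startup latency."""
    time.sleep(delay_s)
    return "engine-ready"


def open_session_sync(delay_s: float) -> str:
    """Sync mode: the call returns only after the engine is up."""
    return launch_engine(delay_s)


def open_session_async(delay_s: float) -> Future:
    """Async mode: return at once with a handle to the launch operation."""
    return _pool.submit(launch_engine, delay_s)


# The async call returns immediately; the client can poll or fetch launch logs meanwhile.
pending = open_session_async(0.5)
state = pending.result(timeout=5)  # block only when the engine is actually needed
```

Returning a handle early is what lets a client show engine launch logs while the session is still being prepared, instead of appearing to hang on the connect call.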
14. eBay Deployment - Unified & Serverless Spark
❖ Unified
  ➢ Endpoint
    ■ kyuubi.k8s-lb.ebay.com:10009
  ➢ Clusters
    ■ Cluster-A
      ● jdbc://kyuubi.k8s-lb.ebay.com:10009/default#kyuubi.session.cluster=Cluster-A
    ■ Cluster-B
      ● jdbc://kyuubi.k8s-lb.ebay.com:10009/default#kyuubi.session.cluster=Cluster-B
  ➢ Authentication
    ■ Kerberos
    ■ LDAP
  ➢ Functions
    ■ Ad hoc
      ● Spark-SQL
      ● Spark-Scala
    ■ ETL Spark submission
❖ Serverless
  ➢ Cloud-native Servers
  ➢ HA & LB
  ➢ Multi-tenancy Engines
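With a single endpoint, the target cluster is chosen purely by a session setting appended to the connection URL. A small sketch of building such URLs follows; `cluster_jdbc_url` is a hypothetical helper, and the URL shape, including the `jdbc://` scheme, is copied from the slide as written rather than from driver documentation.

```python
def cluster_jdbc_url(
    cluster: str,
    endpoint: str = "kyuubi.k8s-lb.ebay.com:10009",
    database: str = "default",
) -> str:
    """Build a connection URL that selects a cluster via kyuubi.session.cluster."""
    if not cluster:
        raise ValueError("cluster must be a non-empty name")
    return f"jdbc://{endpoint}/{database}#kyuubi.session.cluster={cluster}"


# Same endpoint, different clusters.
url_a = cluster_jdbc_url("Cluster-A")
url_b = cluster_jdbc_url("Cluster-B")
```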
15. eBay Deployment - Unified & Serverless Spark
❖ Cluster selector
  ➢ Per-cluster properties
    ■ env map (#687)
      ● HADOOP_CONF_DIR
      ● HIVE_CONF_DIR
      ● SPARK_CONF_DIR
    ■ cluster ZooKeeper conf
    ■ other cluster-specific conf
  ➢ Hadoop authentication per cluster
    ■ proxy user verification
  ➢ Hadoop Credentials Manager per cluster
    ■ Hadoop FS delegation token
    ■ Hive delegation token
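The cluster selector can be thought of as a lookup from the session's cluster name to the environment an engine is launched with. The sketch below illustrates that idea only: the directory paths, `CLUSTER_ENV`, and `engine_env` are hypothetical, and only the three environment variable names and the `kyuubi.session.cluster` key come from the slides.

```python
from typing import Dict

# Hypothetical per-cluster env map; the paths are placeholders.
CLUSTER_ENV: Dict[str, Dict[str, str]] = {
    "Cluster-A": {
        "HADOOP_CONF_DIR": "/etc/cluster-a/hadoop",
        "HIVE_CONF_DIR": "/etc/cluster-a/hive",
        "SPARK_CONF_DIR": "/etc/cluster-a/spark",
    },
    "Cluster-B": {
        "HADOOP_CONF_DIR": "/etc/cluster-b/hadoop",
        "HIVE_CONF_DIR": "/etc/cluster-b/hive",
        "SPARK_CONF_DIR": "/etc/cluster-b/spark",
    },
}


def engine_env(session_conf: Dict[str, str], base_env: Dict[str, str]) -> Dict[str, str]:
    """Overlay the selected cluster's env map on the server's base environment."""
    cluster = session_conf.get("kyuubi.session.cluster")
    if cluster not in CLUSTER_ENV:
        raise ValueError(f"unknown or missing cluster: {cluster!r}")
    env = dict(base_env)              # never mutate the caller's environment
    env.update(CLUSTER_ENV[cluster])  # cluster-specific values win
    return env
```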
16. RESTful API (WIP)
❖ Authentication (#2049)
  ➢ NEGOTIATE (SPNEGO)
  ➢ BASIC (Password authentication)
❖ Sources
  ➢ Sessions (for admin)
    ■ POST /sessions
    ■ DELETE /sessions/{sessionId}
  ➢ SQL
    ■ POST /sql
  ➢ Batch (submit Spark app with jar)
    ■ POST /batches
    ■ GET /batches/{batchId}
    ■ GET /batches/{batchId}/log
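To make the batch endpoints concrete, here is a sketch that only constructs the HTTP requests, using BASIC authentication, without sending them. The base URL and the payload field names are hypothetical, since the slide marks the API as work in progress and lists only the paths. `batch_submit_request` and `batch_log_request` are illustrative helpers.

```python
import base64
import json
from urllib.request import Request

BASE_URL = "https://kyuubi.example.com"  # hypothetical server address


def basic_auth_header(user: str, password: str) -> str:
    """Encode credentials for HTTP Basic authentication."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def batch_submit_request(user: str, password: str, payload: dict) -> Request:
    """Build (but do not send) a POST /batches request."""
    return Request(
        url=f"{BASE_URL}/batches",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": basic_auth_header(user, password),
            "Content-Type": "application/json",
        },
        method="POST",
    )


def batch_log_request(user: str, password: str, batch_id: str) -> Request:
    """Build (but do not send) a GET /batches/{batchId}/log request."""
    return Request(
        url=f"{BASE_URL}/batches/{batch_id}/log",
        headers={"Authorization": basic_auth_header(user, password)},
        method="GET",
    )


# Hypothetical payload: the field names are assumptions, not a documented schema.
request = batch_submit_request("alice", "secret", {"resource": "app.jar", "className": "Main"})
```

Passing a built `Request` to `urllib.request.urlopen` would send it. That step is left out here because it needs a reachable server.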
17. Who is using Apache Kyuubi (Incubating)?
18. Q&A
Welcome to join the Kyuubi community!