Intro to MLflow (2018-11-13)
1 . MLflow: Platform for Complete Machine Learning Lifecycle. Tomas Nykodym, Nov 13, 2018
2 . Outline
• Overview of ML development challenges
• MLflow components
• Demo
• How to get started
3 . Machine Learning Development is Complex
4 . ML Lifecycle [Diagram: ML lifecycle with the labels Raw Data, Data Prep, Training, Deploy, Delta, and Governance; Tuning (μ, λ, θ) and Scale recur across the stages]
5 .Example “I build 100s of models/day to lift revenue, using any library: MLlib, PyTorch, R, etc. There’s no easy way to see what data went in a model from a week ago, tune it and rebuild it.” -- Chief scientist at ad tech firm
6 .Example “Our company has 100 teams using ML worldwide. We can’t share work across them: when a new team tries to run some code, it often doesn’t even give the same result.” -- Large consumer electronics firm
7 . Introducing MLflow: Open machine learning platform
• Works with any ML library & language
• Runs the same way anywhere (e.g. any cloud)
• Designed to be useful for 1 or 1000+ person orgs
8 . MLflow Design Philosophy
1. “API-first”, open platform
• Allow submitting runs, models, etc. from any library & language
• Example: a “model” can just be a lambda function that MLflow can then deploy in many places (Docker, Azure ML, Spark UDF, …)
Key enabler: built around REST APIs and CLI
9 . MLflow Design Philosophy
2. Modular design
• Let people use different components individually (e.g., use MLflow’s project format but not its deployment tools)
• Easy to integrate into existing ML platforms & workflows
Key enabler: distinct components (Tracking/Projects/Models)
10 . MLflow Components
Tracking: Record and query experiments: code, configs, results, etc.
Projects: Packaging format for reproducible runs on any platform
Models: General model format that supports diverse deployment tools
11 . MLflow Tracking [Diagram: Notebooks, Local Apps, and Cloud Jobs report through the Python or REST API to a Tracking Server, which offers a UI and an API]
12 . MLflow Tracking Example

    import mlflow

    with mlflow.start_run():
        mlflow.log_param("layers", layers)
        mlflow.log_param("alpha", alpha)
        # train model
        mlflow.log_metric("mse", model.mse())
        mlflow.log_artifact("plot", model.plot(test_df))
        mlflow.tensorflow.log_model(model)
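The slide code is schematic. A runnable variant might look like the sketch below, assuming the `mlflow` package is installed. The `mean_squared_error` helper, the `train_and_log` function, and the `predictions.txt` file name are hypothetical names introduced for illustration. `mlflow.log_artifact` takes the path of an existing local file, so the sketch writes a file before logging it.

```python
import os
import tempfile


def mean_squared_error(y_true, y_pred):
    """Plain-Python MSE, standing in for the slide's model.mse()."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def train_and_log(alpha, y_true, y_pred):
    """Log one run with a parameter, a metric, and a file artifact.

    Assumption: the mlflow package is installed. It is imported lazily
    so the helper above stays usable without it.
    """
    import mlflow

    with mlflow.start_run():
        mlflow.log_param("alpha", alpha)
        # The "train model" step is elided, as on the slide.
        mse = mean_squared_error(y_true, y_pred)
        mlflow.log_metric("mse", mse)
        # log_artifact expects a path to a local file, so write one first.
        with tempfile.TemporaryDirectory() as tmp:
            path = os.path.join(tmp, "predictions.txt")
            with open(path, "w") as f:
                f.write("\n".join(str(p) for p in y_pred))
            mlflow.log_artifact(path)
    return mse
```

Calling `train_and_log(0.1, [1.0, 2.0], [1.0, 4.0])` would record one run in the default local tracking store.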
13 . MLflow Projects [Diagram: a Project Spec covering Code, Config, and Data, with Local Execution and Remote Execution as targets]
14 . Example MLflow Project

    my_project/
    ├── MLproject
    ├── conda.yaml
    ├── main.py
    └── model.py

Contents of MLproject:

    conda_env: conda.yaml
    entry_points:
      main:
        parameters:
          training_data: path
          lambda: {type: float, default: 0.1}
        command: python main.py {training_data} {lambda}

Running the project:

    $ mlflow run git://<my_project> ...
    mlflow.run("git://<my_project>", ...)
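To make the entry-point mechanics concrete, here is a minimal sketch of how a `command` template could be filled in from the declared parameters and their defaults. It is not MLflow's implementation. `build_command`, the `ENTRY_POINT` dictionary layout, and the example data path are assumptions for illustration.

```python
import shlex

# Hypothetical in-memory form of the "main" entry point from the slide.
ENTRY_POINT = {
    "parameters": {
        "training_data": {"type": "path"},
        "lambda": {"type": "float", "default": 0.1},
    },
    "command": "python main.py {training_data} {lambda}",
}


def build_command(entry_point, user_params):
    """Substitute user-supplied values (or defaults) into the command."""
    values = {}
    for name, spec in entry_point["parameters"].items():
        if name in user_params:
            values[name] = user_params[name]
        elif "default" in spec:
            values[name] = spec["default"]
        else:
            raise ValueError(f"missing required parameter: {name}")
    # Quote each value so it is safe to pass to a shell.
    quoted = {key: shlex.quote(str(value)) for key, value in values.items()}
    return entry_point["command"].format_map(quoted)


if __name__ == "__main__":
    print(build_command(ENTRY_POINT, {"training_data": "data/train.csv"}))
```

A parameter without a default, such as `training_data`, has to be supplied by the caller, while `lambda` falls back to 0.1.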
15 . MLflow Models [Diagram: Run Sources produce a Model Format containing Flavor 1 and Flavor 2, described as simple model flavors usable by many tools; consumers are Inference Code, Batch & Stream Scoring, and Cloud Serving Tools]
16 . Example MLflow Model

    my_model/
    ├── MLmodel
    └── estimator/
        ├── saved_model.pb
        └── variables/
            ...

Contents of MLmodel:

    run_id: 769915006efd4c4bbd662461
    time_created: 2018-06-28T12:34
    flavors:
      tensorflow:          # Usable by tools that understand TensorFlow model format
        saved_model_dir: estimator
        signature_def_key: predict
      python_function:     # Usable by any tool that can run Python (Docker, Spark, etc!)
        loader_module: mlflow.tensorflow

Written by:

    >>> mlflow.tensorflow.log_model(...)
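The point of flavors is that each tool uses the richest flavor it understands and falls back to a generic one otherwise. The sketch below shows that selection logic over an already-parsed MLmodel file. `pick_flavor` and the `MLMODEL` dictionary are hypothetical and are not part of MLflow's API.

```python
# Hypothetical parsed form of the MLmodel file shown on the slide.
MLMODEL = {
    "run_id": "769915006efd4c4bbd662461",
    "flavors": {
        "tensorflow": {
            "saved_model_dir": "estimator",
            "signature_def_key": "predict",
        },
        "python_function": {"loader_module": "mlflow.tensorflow"},
    },
}


def pick_flavor(mlmodel, supported):
    """Return (name, config) for the first supported flavor in the model.

    `supported` lists the flavors a tool understands, most preferred first.
    """
    for name in supported:
        if name in mlmodel["flavors"]:
            return name, mlmodel["flavors"][name]
    raise LookupError(f"none of {supported} found in model flavors")


if __name__ == "__main__":
    # A TensorFlow-aware tool prefers the native flavor.
    print(pick_flavor(MLMODEL, ["tensorflow", "python_function"]))
    # A generic Python tool falls back to python_function.
    print(pick_flavor(MLMODEL, ["python_function"]))
```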
17 .Demo
18 . Goal: Predict Price of Airbnb Listings
Listing attributes (input x):
    bathrooms: 1
    bedrooms: 2
    accommodates: 4
    total_reviews: 45
    cleanliness_rating: 9
    location_rating: 10
    checkin_rating: 10
    zip_code: 94105
Model f(x) output:
    price: 150
Based on data from insideairbnb.com
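19 . Advanced MLflow - HyperParameters [Diagram: under Projects, a HyperParam Search run launches Train Model runs with mlflow run ...; under Tracking, the runs are connected through mlflow.log_metric() and mlflow.get_metric(); under Models, a Logged Model is produced with mlflow.log_artifact]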
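The search loop in the diagram can be sketched without any MLflow dependency. In the version below, `train_fn` stands in for launching one training run and `log_fn` for reporting its metric to the tracking server. Random search, the toy objective, and all function names are assumptions for illustration; the slide does not say which search strategy the demo used.

```python
import random


def random_search(train_fn, space, n_trials, log_fn=None, seed=0):
    """Sample hyperparameters uniformly and keep the lowest-metric trial.

    space maps each parameter name to a (low, high) range.
    Returns the best (params, metric) pair.
    """
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        params = {name: rng.uniform(low, high)
                  for name, (low, high) in space.items()}
        metric = train_fn(**params)  # one "Train Model" run
        if log_fn is not None:
            log_fn(params, metric)  # where a metric would be logged
        if best is None or metric < best[1]:
            best = (params, metric)
    return best


def toy_train(alpha):
    """Hypothetical objective whose minimum is at alpha = 0.3."""
    return (alpha - 0.3) ** 2


if __name__ == "__main__":
    history = []
    best_params, best_metric = random_search(
        toy_train, {"alpha": (0.0, 1.0)}, n_trials=20,
        log_fn=lambda params, metric: history.append(metric))
    print(best_params, best_metric)
```

With MLflow, each call to `train_fn` would typically be its own tracked run, so the search run can compare the logged metrics afterwards.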
20 . Advanced MLflow - Multistep Workflow [Diagram: workflow steps Data Collection, ETL, and Model Training, with the labels Streaming and SQL, running on a mix of CPU and GPU nodes]
21 . Ongoing MLflow Roadmap
• TensorFlow, Keras, PyTorch, H2O, MLlib integrations ✔
• R and Java language APIs ✔
• Multi-step workflows
• Hyperparameter tuning
• Data source API based on Spark data sources
• Model metadata & management
22 . Get started with MLflow
install.packages("mlflow") to get started
Find docs & examples at mlflow.org
tinyurl.com/mlflow-slack
23 .Thank you!
24 . Custom ML Platforms
Facebook FBLearner, Uber Michelangelo, Google TFX
+ Standardize the data prep / training / deploy loop: if you work with the platform, you get these!
Can we provide similar benefits in an open manner?
25 . MLflow Tracking [Diagram: Notebooks, Local Apps, and Cloud Jobs report through the R or REST API to a Tracking Server, which offers a UI and an API]
26 . Key Concepts in Tracking
Parameters: key-value inputs to your code
Metrics: numeric values (can update over time)
Artifacts: arbitrary files, including models
Source: what code ran?
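One way to picture these four concepts is as the fields of a single run record. The sketch below is a hypothetical in-memory model, not MLflow's data model or API. It keeps a list of values per metric to reflect that metrics can update over time.

```python
from dataclasses import dataclass, field


@dataclass
class RunRecord:
    """Hypothetical container for what one tracked run stores."""
    source: str                                    # what code ran
    params: dict = field(default_factory=dict)     # key-value inputs
    metrics: dict = field(default_factory=dict)    # name -> list of values
    artifacts: list = field(default_factory=list)  # paths of output files

    def log_param(self, key, value):
        self.params[key] = value

    def log_metric(self, key, value):
        # Append rather than overwrite, so the history is preserved.
        self.metrics.setdefault(key, []).append(float(value))

    def log_artifact(self, path):
        self.artifacts.append(path)


if __name__ == "__main__":
    run = RunRecord(source="main.py")
    run.log_param("alpha", 0.1)
    run.log_metric("mse", 0.5)
    run.log_metric("mse", 0.25)
    run.log_artifact("model.pkl")
    print(run)
```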
27 . Takeaway
Workflow tools can greatly simplify the ML lifecycle
• Improve usability for both data scientists and engineers
• Same way software dev lifecycle tools simplify development
28 . Example MLflow Project

    my_project/
    ├── MLproject
    ├── conda.yaml
    ├── main.py
    └── model.py

Contents of MLproject:

    conda_env: conda.yaml
    entry_points:
      main:
        parameters:
          training_data: path
          lambda: {type: float, default: 0.1}
        command: python main.py {training_data} {lambda}

Running the project:

    $ mlflow run git://<my_project> ...
    mlflow_run("git://<my_project>", ...)