Performance Analysis and Troubleshooting Methodologies for Databases
1. Performance Analysis and Troubleshooting Methodologies for Databases
Peter Zaitsev, CEO
February 3rd, 2018
FOSDEM Monitoring and Cloud devroom
© 2018 Percona.
2. Databases and Performance
• Databases are frequent performance troublemakers
3. Why Are Databases Painful?
• Generally non-linear scalability
• Complex
• Often poorly understood by developers
4. Performance Work with Databases
• Troubleshooting
• Capacity Planning
• Cost and Efficiency Optimization
• Change Management
5. Points of View
• BlackBox – “Application Developer”
• WhiteBox – “DBA, Ops”
6. Developer Point of View
• Database as a BlackBox
• I throw queries at it and it responds
• DBaaS brings this “promise” to Ops too
7. BlackBox Success Criteria for Databases
• Availability
• Response Time
• Correctness
• Cost
8. Ops Point of View
• Load
• Resource Utilization
• System/Hardware Problems
• Scaling/Capacity Planning
9. Methodologies for Performance Troubleshooting and Analysis
10. Typical Default
• Troubleshooting by Random Googling
11. Problems with the Typical Approach
• Hard to Assure Outcome
• Hard to Train People
• Hard to Automate
12. Methodologies Save the Day
• USE (Utilization, Saturation, Errors) Method by Brendan Gregg
• Golden Signals (Latency - Traffic - Errors - Saturation) Method by Rob Ewaschuk
• RED (Rate, Errors (Rate), Duration) Method by Tom Wilkie
13. USE Method
14. USE Method Basics
• Developed to Troubleshoot Server Performance Issues
• Resolve 80% of problems with 5% of Effort
• Operating System Specific Checklists Available
15. USE Method in One Sentence
• “For every resource, check utilization, saturation, and errors.”
16. USE Method Terminology Definitions
• Resource: all physical server functional components (CPUs, disks, busses, ...)
• Utilization: the average time that the resource was busy servicing work
• Saturation: the degree to which the resource has extra work which it can't service, often queued
• Errors: the count of error events
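The definitions above can be turned into a simple per-resource check. The following is a minimal sketch, not part of the original talk: the `ResourceSample` type, the `use_report` helper, the sample numbers, and the 80% utilization threshold are all assumptions chosen for illustration. In practice the inputs would come from operating system or database counters.

```python
from dataclasses import dataclass


@dataclass
class ResourceSample:
    """One observation window for a single resource (hypothetical structure)."""
    name: str
    busy_seconds: float      # time the resource spent servicing work
    interval_seconds: float  # length of the observation window
    queued: int              # work that could not be serviced yet (saturation)
    errors: int              # error events seen during the window


def use_report(sample, util_threshold=0.8):
    """Return (utilization, findings) for one resource.

    util_threshold is an assumed cutoff for illustration, not a recommendation.
    """
    utilization = sample.busy_seconds / sample.interval_seconds
    findings = []
    if utilization >= util_threshold:
        findings.append("high utilization")
    if sample.queued > 0:
        findings.append("saturated")
    if sample.errors > 0:
        findings.append("errors")
    return utilization, findings


# Made-up samples: for every resource, check utilization, saturation, errors.
samples = [
    ResourceSample("cpu", busy_seconds=54.0, interval_seconds=60.0, queued=3, errors=0),
    ResourceSample("disk", busy_seconds=12.0, interval_seconds=60.0, queued=0, errors=2),
    ResourceSample("network", busy_seconds=6.0, interval_seconds=60.0, queued=0, errors=0),
]
for s in samples:
    utilization, findings = use_report(s)
    print(f"{s.name}: utilization={utilization:.0%} findings={findings or ['ok']}")
```

The same loop applies to software resources such as connections or file descriptors, as long as a busy time (or used/maximum ratio), a queue length, and an error count can be measured for each.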
17. USE Method Resources
• CPUs: sockets, cores, hardware threads (virtual CPUs)
• Memory: capacity
• Network interfaces
• Storage devices: I/O, capacity
• Controllers: storage, network cards
• Interconnects: CPUs, memory, I/O
18. USE Method with Software
• Same Basic Resources Apply
• Additional Software Resources Apply: Mutex Locks, File Descriptors, Connections
19. USE Method Benefits
• Proven Track Record
• Broad Applicability
• Detailed Checklists Available
20. USE Method Drawbacks
• Requires Good Understanding of System Architecture
• Requires Access to Low-Level Resource Monitoring
• Hard to apply in Service “Blackbox” environments
21. RED Method
22. RED Method Focus
• Microservices
• “Cattle not Pets”
• Mapping to Resources can be fluid
23. RED Method
For every Service, Check Request:
• Rate
• Error (Rate)
• Duration (Distribution)
Check that these are within SLO
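As an illustration of the RED check, the sketch below computes request rate, error rate, and duration percentiles for one service over a window and compares them with an SLO. It is a sketch under stated assumptions: the `red_summary` and `slo_violations` helpers, the tuple layout of a request, and the SLO limits are hypothetical, and the nearest-rank percentile is one of several common definitions.

```python
import math


def percentile(sorted_values, p):
    """Nearest-rank percentile (p is an integer 1..100) of a sorted, non-empty list."""
    rank = max(1, math.ceil(p * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def red_summary(requests, window_seconds):
    """Summarize a non-empty list of (duration_seconds, ok) tuples seen in one window."""
    durations = sorted(duration for duration, _ in requests)
    failed = sum(1 for _, ok in requests if not ok)
    return {
        "rate": len(requests) / window_seconds,   # requests per second
        "error_rate": failed / len(requests),     # fraction of failed requests
        "p50": percentile(durations, 50),         # duration distribution,
        "p99": percentile(durations, 99),         # not only the average
    }


def slo_violations(summary, slo):
    """Return the names of metrics that exceed their (assumed) SLO upper limits."""
    return [name for name, limit in slo.items() if summary[name] > limit]


# Made-up window: 100 requests in 10 seconds, one of them failed and slow.
requests = [(0.010, True)] * 95 + [(0.200, True)] * 4 + [(1.500, False)]
summary = red_summary(requests, window_seconds=10)
print(summary)
print(slo_violations(summary, {"error_rate": 0.001, "p99": 0.5}))
```

Grouping the requests by service, by database server, or by query type before calling `red_summary` gives the same three signals at each of those levels.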
24. RED Method for Databases
• Looking at Service Level
• Looking at Individual Database Servers
• Can be applied to Components/Resources
• Can be applied to individual Types of Queries
25. RED Method Benefits
• Easily maps to what Developers Care About
• Does not require as deep an understanding of architecture
• Does not need access to low-level resource monitoring
26. RED Method Drawbacks
• Does not yet have as much tool and checklist support
• More focused on answering WHAT rather than WHY
27. Four Golden Signals
28. Focus
• Monitoring Distributed Systems, from the SRE Book
• To be used for Alerting, Troubleshooting, Trend Analysis
29. Four Golden Signals
• Latency: Distribution, not just Average; Latency for Successful requests vs Errors
• Traffic: How much Demand is being placed on the System
• Errors: Error Codes are Easy; Bad Content is hard
• Saturation: How Full your system “capacity” is. Forecast when Possible.
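Two of the points above lend themselves to a short example: separating latency for successful requests from latency for errors, and forecasting saturation. The sketch below is illustrative only. The helper names, the sample data, and the straight-line extrapolation are assumptions, and real capacity growth is often not linear.

```python
from statistics import mean, median


def latency_by_outcome(requests):
    """Mean and median latency per outcome for a list of (latency_seconds, ok)."""
    groups = {"success": [], "error": []}
    for latency, ok in requests:
        groups["success" if ok else "error"].append(latency)
    return {
        outcome: {"mean": mean(values), "median": median(values)}
        for outcome, values in groups.items()
        if values
    }


def days_until_full(daily_used_fraction):
    """Linear forecast from the first to the last daily sample (needs >= 2 samples).

    Returns None when usage is flat or shrinking.
    """
    days = len(daily_used_fraction) - 1
    growth_per_day = (daily_used_fraction[-1] - daily_used_fraction[0]) / days
    if growth_per_day <= 0:
        return None
    return (1.0 - daily_used_fraction[-1]) / growth_per_day


# Made-up data: errors fail fast, so mixing them in would hide slow successes.
requests = [(0.02, True)] * 8 + [(2.0, True)] * 2 + [(0.001, False)] * 5
for outcome, stats in latency_by_outcome(requests).items():
    print(f"{outcome}: mean={stats['mean']:.3f}s median={stats['median']:.3f}s")

# Made-up disk usage, as a fraction of capacity, sampled once per day.
forecast = days_until_full([0.50, 0.55, 0.60, 0.65, 0.70])
print(f"estimated days until full: {forecast:.1f}")
```

In the sample data the mean latency of successful requests is far above their median, which is the reason the slide asks for the distribution rather than the average.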