Virtualization
1. Advanced Operating Systems (CS 202): Virtualization
2. Virtualization
• One of the natural consequences of the extensibility research we discussed
• What is virtualization and what are the benefits?
3. Virtualization motivation
• Cost: multiplex multiple virtual machines on one hardware machine
  – Cloud computing, data center virtualization
  – Why not processes?
  – Why not containers?
• Heterogeneity:
  – Allow one machine to support multiple OSes
  – Maintaining compatibility
• Other: security, migration, energy optimization, customization, …
4. How do we virtualize?
• Create an operating system to multiplex resources among operating systems!
  – Exports a virtual machine to the guest operating systems
  – Called a hypervisor or Virtual Machine Monitor (VMM)
5. VIRTUALIZATION MODELS
6. Two types of hypervisors
• Type 1: Native (bare metal)
  – Hypervisor runs directly on the bare metal machine
  – e.g., KVM
• Type 2: Hosted
  – Hypervisor runs on top of a host OS, as an emulator
  – e.g., VMware, VirtualBox, QEMU
7. Hybrid organizations
• Some hybrids exist, e.g., Xen
  – Mostly bare metal
  – VM0/Dom0 to keep device drivers out of the VMM
8. Stepping back: some history
• IBM VM/370 (1970s)
• Microkernels (late 80s/90s)
• Extensibility (90s)
• SimOS (late 90s)
  – Eventually became VMware (2000)
• Xen, VMware, others (2000s)
• Ubiquitous use, cloud computing, data centers, …
  – Makes computing a utility
9. Full virtualization
• Idea: run guest operating systems unmodified
• However, the hypervisor is the real privileged software
• When the guest OS executes a privileged instruction, it traps to the hypervisor, which executes it on the OS's behalf
• This can be very expensive
• Also subject to quirks of the architecture
  – Example: x86 fails silently if some privileged instructions execute without privilege
  – E.g., popf
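To make the popf quirk concrete, here is a minimal sketch that models only the interrupt-enable (IF) bit of EFLAGS. The function name `popf_if_only` and its parameters are made up for illustration, and every other flag and mode that the real instruction handles is ignored. The point is that an unprivileged popf neither traps nor takes effect, so a trap-and-emulate VMM never sees it.

```python
IF_BIT = 1 << 9  # interrupt-enable flag (IF) is bit 9 of EFLAGS


def popf_if_only(eflags, popped, cpl, iopl):
    """Return the new EFLAGS after popf, modelling only the IF bit.

    cpl  = current privilege level (0 is most privileged)
    iopl = I/O privilege level currently stored in EFLAGS
    """
    if cpl <= iopl:
        new_if = popped & IF_BIT      # privileged enough: IF is updated
    else:
        new_if = eflags & IF_BIT      # silently ignored: no trap, IF unchanged
    return (eflags & ~IF_BIT) | new_if


# A guest kernel demoted to ring 1 tries to clear IF and nothing happens.
after = popf_if_only(IF_BIT, popped=0, cpl=1, iopl=0)
```

Because no exception is raised in the second branch, the hypervisor gets no chance to record that the guest wanted interrupts off.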
10. Example: Disable Interrupts
• Guest OS tries to disable interrupts
  – The instruction is trapped by the VMM, which makes a note that interrupts are disabled for that virtual machine
• Interrupts arrive for that machine
  – Buffered at the VMM layer until the guest OS enables interrupts
• Other interrupts are directed to VMs that have not disabled them
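The bookkeeping described in this example can be sketched as follows. This is a toy model, not how any particular VMM is written: `VirtualCPU`, `trap_cli`, `trap_sti`, and `raise_interrupt` are hypothetical names, and interrupts are represented as plain integers.

```python
from collections import deque


class VirtualCPU:
    """Per-guest interrupt state kept by a hypothetical VMM."""

    def __init__(self, name):
        self.name = name
        self.interrupts_enabled = True  # virtual flag, not the hardware one
        self.pending = deque()          # interrupts buffered while disabled
        self.delivered = []             # interrupts the guest has seen

    def trap_cli(self):
        # Guest tried to disable interrupts: the VMM only makes a note.
        self.interrupts_enabled = False

    def trap_sti(self):
        # Guest re-enabled interrupts: deliver everything buffered so far.
        self.interrupts_enabled = True
        while self.pending:
            self.delivered.append(self.pending.popleft())

    def raise_interrupt(self, irq):
        # Called by the VMM when an interrupt is destined for this guest.
        if self.interrupts_enabled:
            self.delivered.append(irq)
        else:
            self.pending.append(irq)


vm_a, vm_b = VirtualCPU("A"), VirtualCPU("B")
vm_a.trap_cli()
vm_a.raise_interrupt(14)   # buffered: A has interrupts disabled
vm_b.raise_interrupt(14)   # delivered immediately: B has not disabled them
vm_a.trap_sti()            # A now receives the buffered interrupt
```

The hardware interrupt flag is never touched on the guest's behalf; only the per-VM virtual flag changes.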
11. Binary translation: making full virtualization practical
• Use binary translation to rewrite the guest OS instructions that would otherwise fail silently
• More aggressive translation can be used
  – Translate OS-mode instructions to equivalent VMM instructions
• Some operations are still expensive
• Cache translated code for future use
• Used by VMware ESXi and Microsoft Virtual Server
• Performance on x86 typically ~80-95% of native
12. Binary Translation Example

Guest OS Assembly              Translated Assembly
do_atomic_operation:           do_atomic_operation:
  cli                            call [vmm_disable_interrupts]
  mov eax, 1                     mov eax, 1
  xchg eax, [lock_addr]          xchg eax, [lock_addr]
  test eax, eax                  test eax, eax
  jnz spinlock                   jnz spinlock
  …                              …
  …                              …
  mov [lock_addr], 0             mov [lock_addr], 0
  sti                            call [vmm_enable_interrupts]
  ret                            ret
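The rewrite shown in this example, together with the translation cache mentioned on the previous slide, can be sketched as a toy translator. Real translators operate on machine code rather than text. The `REWRITES` table, `translation_cache`, and `translate_block` are illustrative assumptions; the two `vmm_*` targets are taken from the slide.

```python
# Sensitive instruction -> replacement call into the VMM (from the slide).
REWRITES = {
    "cli": "call [vmm_disable_interrupts]",
    "sti": "call [vmm_enable_interrupts]",
}

# Block label -> translated instructions, so each block is translated once.
translation_cache = {}


def translate_block(label, instructions):
    """Translate one block of guest code, reusing a cached result if present."""
    if label in translation_cache:
        return translation_cache[label]
    translated = []
    for ins in instructions:
        ins = ins.strip()
        # Innocuous instructions are copied through unchanged.
        translated.append(REWRITES.get(ins, ins))
    translation_cache[label] = translated
    return translated


guest_block = ["cli", "mov eax, 1", "xchg eax, [lock_addr]", "sti", "ret"]
first = translate_block("do_atomic_operation", guest_block)
second = translate_block("do_atomic_operation", guest_block)  # cache hit
```

Keying the cache by block label is a simplification; the idea is only that the translation cost is paid once per block rather than on every execution.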
13. Paravirtualization
• Modify the OS to make it aware of the hypervisor
  – Can avoid the tricky features
  – Aware of the fact that it is virtualized
    • Can implement optimizations
• Comparison to binary translation?
• Amount of code change?
  – 1.36% of Linux, 0.04% for Windows
14. Hardware-supported virtualization (Intel VT-x, AMD-V)
• Hardware support for virtualization
• Makes implementing VMMs much simpler
• Streamlines communication between the VMM and the guest OS
• Removes the need for paravirtualization/binary translation
• EPT (Extended Page Tables): hardware support for the second translation layer, taking over the job of software shadow page tables
• More later…
15. NUTS AND BOLTS
16. What needs to be done?
• Virtualize hardware
  – Memory hierarchy
  – CPUs
  – Devices
• Implement data and control transfer between guests and hypervisor
• We’ll cover this by example
  – Xen paper
  – Slides modified from a presentation by Jianmin Chen
17. Xen
• Design principles:
  – Unmodified applications: essential
  – Full-blown multi-tasking OSes: essential
  – Paravirtualization: necessary for performance and isolation
18. Xen
19. Implementation summary
20. Xen VM interface: Memory
• Memory management
  – Guest cannot install highest-privilege-level segment descriptors; top end of linear address space is not accessible
  – Guest has direct (not trapped) read access to hardware page tables; writes are trapped and handled by the VMM
  – Physical memory presented to guest is not necessarily contiguous
21. Two Layers of Virtual Memory
[Figure: three views of RAM, each showing Pages 0 to 3 in a different order: the guest app's view (0x00 to 0xFF), the guest OS's view (0x0000 to 0xFFFF), and the host OS's view (0x00000000 to 0xFFFFFFFF).]
• Virtual address → physical address: known to the guest OS
• Physical address → machine address: unknown to the guest OS
22. Guest’s Page Tables Are Invalid
• Guest OS page tables map virtual page numbers (VPNs) to physical frame numbers (PFNs)
• Problem: the guest is virtualized and doesn’t actually know the true PFNs
  – The true location is the machine frame number (MFN)
  – MFNs are known to the VMM and the host OS
• Guest page tables cannot be installed in cr3
  – They map VPNs to PFNs, but the PFNs are incorrect
• How can the MMU translate addresses used by the guest (VPNs) to MFNs?
23. Shadow Page Tables
• Solution: VMM creates shadow page tables that map VPN → MFN (as opposed to VPN → PFN)

Guest Page Table
• Maintained by the guest OS
• Invalid for the MMU
  VPN      PFN
  00 (0)   01 (1)
  01 (1)   10 (2)
  10 (2)   11 (3)
  11 (3)   00 (0)

Shadow Page Table
• Maintained by the VMM
• Valid for the MMU
  VPN      MFN
  00 (0)   10 (2)
  01 (1)   11 (3)
  10 (2)   00 (0)
  11 (3)   01 (1)

[Figure: virtual memory, physical memory, and machine memory, each drawn as Pages 0 to 3 with address ticks at 0, 16, 32, 48, and 64.]
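The relationship between the two tables can be checked with a short sketch: the shadow table is the guest table composed with the VMM's PFN → MFN map. The slide does not show that map, so `pfn_to_mfn` below is inferred from the two tables, and the 16-unit page size is an assumption read off the figure's address ticks. `build_shadow` and `translate` are illustrative names.

```python
PAGE_SIZE = 16  # assumption: taken from the 0/16/32/48/64 ticks in the figure

# Guest page table from the slide: VPN -> PFN (maintained by the guest OS).
guest_page_table = {0: 1, 1: 2, 2: 3, 3: 0}

# PFN -> MFN map kept by the VMM (inferred so that it matches the slide).
pfn_to_mfn = {0: 1, 1: 2, 2: 3, 3: 0}


def build_shadow(guest_pt, p2m):
    """Compose VPN -> PFN with PFN -> MFN to get the shadow VPN -> MFN table."""
    return {vpn: p2m[pfn] for vpn, pfn in guest_pt.items()}


def translate(shadow_pt, vaddr):
    """Translate a guest virtual address to a machine address, as the MMU would."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    return shadow_pt[vpn] * PAGE_SIZE + offset


shadow_page_table = build_shadow(guest_page_table, pfn_to_mfn)
```

With these values `shadow_page_table` reproduces the slide's second table, and virtual address 20 (VPN 1, offset 4) lands in MFN 3.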
24. Building Shadow Tables
• Problem: how can the VMM maintain consistent shadow page tables?
  – The guest OS may modify its page tables at any time
  – Modifying the tables is a simple memory write, not a privileged instruction
    • Thus, no helpful CPU exceptions :(
• Solution: mark the hardware pages containing the guest’s tables as read-only
  – If the guest updates a table, an exception is generated
  – VMM catches the exception, examines the faulting write, and updates the shadow table
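The write-protect trick can be sketched by routing every guest page-table write through a fault handler. This is a simulation under stated assumptions: `ShadowedGuest`, `guest_write_pte`, and `_on_write_fault` are hypothetical names, and the protection fault is modelled as a direct method call rather than a real CPU exception.

```python
class ShadowedGuest:
    """Keeps a guest page table and its shadow in sync via simulated faults."""

    def __init__(self, pfn_to_mfn):
        self.pfn_to_mfn = dict(pfn_to_mfn)  # VMM-private PFN -> MFN map
        self.guest_pt = {}                  # VPN -> PFN, what the guest sees
        self.shadow_pt = {}                 # VPN -> MFN, what the MMU uses
        self.faults = 0

    def guest_write_pte(self, vpn, pfn):
        # The page holding the guest table is mapped read-only, so the
        # write does not complete; it faults into the VMM instead.
        self._on_write_fault(vpn, pfn)

    def _on_write_fault(self, vpn, pfn):
        # VMM side: examine the faulting write, then apply it to both tables.
        self.faults += 1
        if pfn not in self.pfn_to_mfn:
            raise PermissionError(f"guest does not own PFN {pfn}")
        self.guest_pt[vpn] = pfn
        self.shadow_pt[vpn] = self.pfn_to_mfn[pfn]


guest = ShadowedGuest({0: 1, 1: 2, 2: 3, 3: 0})
guest.guest_write_pte(0, 1)   # guest maps VPN 0 to PFN 1
guest.guest_write_pte(0, 3)   # guest later remaps VPN 0 to PFN 3
```

Each page-table update costs one fault, which is the price of keeping the guest unmodified.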
25. More VMM Tricks
• The VMM can play tricks with virtual memory just like an OS can
• Ballooning:
  – The VMM can page parts of a guest, or even an entire guest, to disk
  – A guest can be written to disk and brought back online on a different machine!
• Deduplication:
  – The VMM can share read-only pages between guests
  – Example: two guests both running Windows XP
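One way deduplication could work is content-based sharing: identical read-only pages are detected by hashing and backed by a single machine frame. The slide does not say how sharing is detected, so the hashing approach and all names here (`MachineMemory`, `map_readonly`) are assumptions for illustration.

```python
import hashlib


class MachineMemory:
    """Toy machine memory that shares identical read-only pages."""

    def __init__(self):
        self.frames = {}     # MFN -> page contents
        self.by_hash = {}    # content digest -> MFN
        self.refcount = {}   # MFN -> number of guest mappings
        self.next_mfn = 0

    def map_readonly(self, content: bytes) -> int:
        """Return an MFN holding `content`, reusing an existing frame if possible."""
        digest = hashlib.sha256(content).digest()
        mfn = self.by_hash.get(digest)
        # Compare the bytes as well, so a hash collision cannot alias pages.
        if mfn is not None and self.frames[mfn] == content:
            self.refcount[mfn] += 1
            return mfn
        mfn = self.next_mfn
        self.next_mfn += 1
        self.frames[mfn] = content
        self.by_hash[digest] = mfn
        self.refcount[mfn] = 1
        return mfn


memory = MachineMemory()
kernel_page = b"\x90" * 4096                  # stand-in for a shared OS code page
guest1_mfn = memory.map_readonly(kernel_page)
guest2_mfn = memory.map_readonly(kernel_page)  # same frame as guest 1
private_mfn = memory.map_readonly(b"\x00" * 4096)
```

A real VMM would also have to break sharing on write (copy-on-write), which this sketch omits because the pages are read-only.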
26. Xen VM interface: CPU
• CPU
  – Guest runs at lower privilege than VMM
  – Exception handlers must be registered with VMM
  – Fast system call handler can be serviced without trapping to VMM
  – Hardware interrupts replaced by lightweight event notification system
  – Timer interface: both real and virtual time
27. Details: CPU
• Frequent exceptions:
  – Software interrupts for system calls
  – Page faults
• Allow the guest to register a “fast” exception handler for system calls that the CPU can invoke directly in ring 1, without switching to ring 0/Xen
  – Handler is validated before it is installed in the hardware exception table, to make sure nothing executes with ring 0 privilege
  – Doesn’t work for page faults
28. Xen VM interface: I/O
• I/O
  – Virtual devices exposed as asynchronous I/O rings to guests
  – Event notification replaces interrupts
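An asynchronous I/O ring can be pictured as a fixed-size circular buffer with a producer index advanced by the guest and a consumer index advanced by the other side. The sketch below shows only the request direction and uses made-up names (`IORing`, `put_request`, `get_request`); it is not Xen's actual ring layout.

```python
class IORing:
    """Toy single-direction ring: the guest produces, the backend consumes."""

    def __init__(self, size):
        self.size = size
        self.slots = [None] * size
        self.req_prod = 0   # next slot the guest will fill
        self.req_cons = 0   # next slot the backend will read

    def put_request(self, request):
        """Guest side: enqueue without blocking; False means the ring is full."""
        if self.req_prod - self.req_cons == self.size:
            return False
        self.slots[self.req_prod % self.size] = request
        self.req_prod += 1
        return True

    def get_request(self):
        """Backend side: dequeue the oldest request, or None if the ring is empty."""
        if self.req_cons == self.req_prod:
            return None
        request = self.slots[self.req_cons % self.size]
        self.req_cons += 1
        return request


ring = IORing(size=2)
ring.put_request("read block 7")
ring.put_request("write block 9")
```

Neither side waits for the other: the guest can queue several requests and continue, which is what makes the interface asynchronous.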
29. Details: I/O 1
• Xen does not emulate hardware devices
  – Exposes device abstractions for simplicity and performance
  – I/O data transferred to/from guest via Xen using shared-memory buffers
  – Virtualized interrupts: lightweight event delivery mechanism from Xen to the guest
    • Update a bitmap in shared memory
    • Optional callback handlers registered by the OS
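The event delivery mechanism above (a pending bitmap plus optional callbacks) might look roughly like this. It is a simplified model: `EventChannel`, its methods, and the `masked` flag are hypothetical names, and shared memory is simulated with an ordinary integer.

```python
class EventChannel:
    """Toy event delivery: Xen sets bits, the guest drains them via callbacks."""

    def __init__(self):
        self.pending = 0     # bitmap standing in for the shared-memory page
        self.masked = False  # guest can defer delivery, like disabling interrupts
        self.handlers = {}   # port number -> callback registered by the guest OS

    def register(self, port, handler):
        self.handlers[port] = handler

    def notify(self, port):
        # Xen side: mark the event pending, then upcall unless masked.
        self.pending |= 1 << port
        if not self.masked:
            self.dispatch()

    def dispatch(self):
        # Guest side: handle pending events, lowest port number first.
        while self.pending:
            lowest = self.pending & -self.pending   # isolate lowest set bit
            port = lowest.bit_length() - 1
            self.pending &= ~lowest
            handler = self.handlers.get(port)
            if handler is not None:
                handler(port)


seen = []
channel = EventChannel()
channel.register(1, seen.append)
channel.register(3, seen.append)
channel.masked = True
channel.notify(3)
channel.notify(1)       # both events are only recorded in the bitmap
channel.masked = False
channel.dispatch()      # guest drains them when it is ready
```

Setting a bit is cheap and idempotent, so repeated notifications on one port collapse into a single pending event.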