diff --git a/doc/book-enea-nfv-access-platform-guide/doc/hypervisor_virtualization.xml b/doc/book-enea-nfv-access-platform-guide/doc/hypervisor_virtualization.xml
new file mode 100644
index 0000000..092b52f
--- /dev/null
+++ b/doc/book-enea-nfv-access-platform-guide/doc/hypervisor_virtualization.xml
@@ -0,0 +1,328 @@
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
"http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">
<chapter id="hypervisor_virt">
  <title>Hypervisor Virtualization</title>

  <para>KVM (Kernel-based Virtual Machine) is a virtualization
  infrastructure for the Linux kernel that turns it into a hypervisor. KVM
  requires a processor with hardware virtualization extensions.</para>

  <para>KVM uses QEMU, an open source machine emulator and virtualizer, to
  virtualize a complete system. With KVM it is possible to run multiple
  guests with a variety of operating systems, each with a complete set of
  virtualized hardware.</para>

  <section id="launch_virt_machine">
    <title>Launching a Virtual Machine</title>

    <para>QEMU can make use of KVM when the target architecture is the same
    as the host architecture. For instance, when running qemu-system-x86_64
    on an x86-64 compatible processor (with the Intel VT or AMD-V
    virtualization extensions), you can take advantage of KVM acceleration,
    to the benefit of both your host and your guest system.</para>

    <para>Enea Linux includes an optimized version of QEMU with KVM-only
    support. To use KVM, pass <command>--enable-kvm</command> to
    QEMU.</para>

    <para>The following is an example of starting a guest:</para>

    <programlisting>taskset -c 0,1 qemu-system-x86_64 \
-cpu host -M q35 -smp cores=2,sockets=1 \
-vcpu 0,affinity=0 -vcpu 1,affinity=1 \
-enable-kvm -nographic \
-kernel bzImage \
-drive file=enea-image-virtualization-guest-qemux86-64.ext4,if=virtio,format=raw \
-append 'root=/dev/vda console=ttyS0,115200' \
-m 4096 \
-object memory-backend-file,id=mem,size=4096M,mem-path=/dev/hugepages,share=on \
-numa node,memdev=mem -mem-prealloc</programlisting>
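
    <para>Before launching, you can verify that KVM acceleration is
    available on the host. As an illustrative check (not part of the Enea
    tooling), confirm that the CPU virtualization flags and the KVM device
    node are present:</para>

    <programlisting>$ egrep -c '(vmx|svm)' /proc/cpuinfo
$ ls /dev/kvm</programlisting>

    <para>A non-zero count means Intel VT-x or AMD-V is exposed, and
    <literal>/dev/kvm</literal> must exist for
    <command>-enable-kvm</command> to work.</para>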
  </section>

  <section id="qemu_boot">
    <title>Main QEMU boot options</title>

    <para>The pertinent boot options for the QEMU emulator are detailed
    below:</para>

    <itemizedlist>
      <listitem>
        <para>SMP - at least 2 cores should be enabled in order to isolate
        the application(s) running in the virtual machine(s) on specific
        cores, for better performance.</para>

        <programlisting>-smp cores=2,threads=1,sockets=1 \</programlisting>
      </listitem>

      <listitem>
        <para>CPU affinity - associate virtual CPUs with physical CPUs and
        optionally assign a default real-time priority to the virtual CPU
        process in the host kernel. This option allows you to start QEMU
        vCPUs on isolated physical CPUs.</para>

        <programlisting>-vcpu 0,affinity=0 \</programlisting>
      </listitem>

      <listitem>
        <para>Hugepages - KVM guests can be deployed with huge page memory
        support in order to reduce memory consumption and improve
        performance by reducing CPU cache usage. Using huge pages for a KVM
        guest means less memory is used for page tables and TLB
        (Translation Lookaside Buffer) misses are reduced, significantly
        increasing performance, especially in memory-intensive
        situations.</para>

        <programlisting>-object memory-backend-file,id=mem,size=4096M,mem-path=/dev/hugepages,share=on \</programlisting>
      </listitem>

      <listitem>
        <para>Memory preallocation - preallocating huge pages at startup
        can improve runtime performance, but may increase QEMU boot
        time.</para>

        <programlisting>-mem-prealloc \</programlisting>
      </listitem>

      <listitem>
        <para>Enable realtime characteristics - run QEMU with realtime
        features. Although the option name suggests that
        <command>-realtime</command> alone might do something, it is only a
        grouping for options that are partially realtime-related. If you
        are running in a realtime or low-latency environment, you do not
        want guest pages to be swapped out, which is what
        <literal>mlock=on</literal> prevents. If you want VM density
        instead, you may want swappable VMs, thus
        <literal>mlock=off</literal>.</para>

        <programlisting>-realtime mlock=on \</programlisting>
      </listitem>
    </itemizedlist>

    <para>If the hardware does not have an IOMMU (known as "Intel VT-d" on
    Intel-based machines and "AMD I/O Virtualization Technology" on
    AMD-based machines), it will not be possible to assign devices to
    guests in KVM. Virtualization Technology features (VT-d, VT-x, etc.)
    must be enabled in the BIOS on the host target before starting a
    virtual machine.</para>
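
    <para>As an illustrative check (the exact messages vary between kernel
    versions and vendors), the kernel log can show whether an IOMMU was
    detected and enabled:</para>

    <programlisting>$ dmesg | grep -i -e DMAR -e IOMMU</programlisting>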
  </section>

  <section id="net_in_guest">
    <title>Networking in the guest</title>

    <section id="vhost-user-support">
      <title>Using vhost-user support</title>

      <para>The goal of vhost-user is to implement a virtio transport,
      staying as close as possible to the vhost paradigm of using shared
      memory, ioeventfds and irqfds. A UNIX domain socket based mechanism
      allows the setup of resources used by a number of vrings, shared
      between two userspace processes and placed in shared memory.</para>

      <para>To run QEMU with the vhost-user backend, you have to provide
      the named UNIX domain socket, which needs to be already opened by the
      backend:</para>

      <programlisting>-object memory-backend-file,id=mem,size=4096M,mem-path=/dev/hugepages,share=on \
-chardev socket,id=char0,path=/var/run/openvswitch/vhost-user1 \
-netdev type=vhost-user,id=mynet1,chardev=char0,vhostforce \
-device virtio-net-pci,netdev=mynet1,mac=52:54:00:00:00:01 \</programlisting>
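
      <para>In this mode QEMU acts as the client, so the socket must
      already have been created by the backend. As a hedged sketch of the
      matching Open vSwitch (DPDK) side, assuming a DPDK-enabled bridge
      named <literal>ovsbr0</literal> already exists:</para>

      <programlisting>$ ovs-vsctl add-port ovsbr0 vhost-user1 \
      -- set Interface vhost-user1 type=dpdkvhostuser</programlisting>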

      <para>The vhost-user standard uses a client-server model. The server
      creates and manages the vhost-user sockets, and the client connects
      to the sockets created by the server. It is recommended to use QEMU
      as the server, so that a vhost-user client can be restarted without
      affecting the server; otherwise, if the server side dies, all clients
      need to be restarted.</para>

      <para>Using vhost-user in QEMU as the server offers the flexibility
      to stop and start the virtual machine with no impact on the virtual
      switch on the host (vhost-user-client):</para>

      <programlisting>-chardev socket,id=char0,path=/var/run/openvswitch/vhost-user1,server \</programlisting>
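
      <para>With QEMU as the server, the switch side connects as a client.
      As a hedged sketch of the corresponding Open vSwitch port (the bridge
      name <literal>ovsbr0</literal> is an assumption):</para>

      <programlisting>$ ovs-vsctl add-port ovsbr0 vhostclient1 \
      -- set Interface vhostclient1 type=dpdkvhostuserclient \
         options:vhost-server-path=/var/run/openvswitch/vhost-user1</programlisting>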
    </section>

    <section id="tap-interface">
      <title>Using TAP Interfaces</title>

      <para>QEMU can use TAP interfaces to provide full networking
      capability for the guest OS:</para>

      <programlisting>-netdev tap,id=net0,ifname=tap0,script=no,downscript=no \
-device virtio-net-pci,netdev=net0,mac=22:EA:FB:A8:25:AE \</programlisting>
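
      <para>Since <literal>script=no</literal> disables QEMU's automatic
      setup script, the TAP interface must exist on the host before QEMU
      starts. A minimal sketch using <command>ip</command>, where the
      bridge name <literal>br0</literal> is an assumption:</para>

      <programlisting>$ ip tuntap add dev tap0 mode tap
$ ip link set tap0 master br0
$ ip link set tap0 up</programlisting>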
    </section>

    <section id="vfio-passthrough">
      <title>VFIO passthrough VF (SR-IOV) to guest</title>

      <para>The KVM hypervisor supports attaching PCI devices on the host
      system to guests. PCI passthrough allows guests to have exclusive
      access to PCI devices for a range of tasks, and allows PCI devices to
      appear and behave as if they were physically attached to the guest
      operating system.</para>

      <para>Preparing an Intel system for PCI passthrough:</para>

      <itemizedlist>
        <listitem>
          <para>Enable the Intel VT-d extensions in BIOS</para>
        </listitem>

        <listitem>
          <para>Activate Intel VT-d in the kernel by using
          <literal>intel_iommu=on</literal> as a kernel boot
          parameter</para>
        </listitem>

        <listitem>
          <para>Allow unsafe interrupts in case the system does not support
          interrupt remapping. This can be done using
          <literal>vfio_iommu_type1.allow_unsafe_interrupts=1</literal> as
          a kernel boot parameter.</para>
        </listitem>
      </itemizedlist>

      <para>Create the guest with direct passthrough via the VFIO framework
      like so:</para>

      <programlisting>-device vfio-pci,host=0000:03:10.2 \</programlisting>

      <para>On the host, one or more Virtual Functions (VFs) must be
      created and bound to the vfio-pci driver, so they can be allocated
      for guest network access, before starting QEMU:</para>

      <programlisting>$ echo 2 > /sys/class/net/eno3/device/sriov_numvfs
$ modprobe vfio_pci
$ dpdk-devbind.py --bind=vfio-pci 0000:03:10.2</programlisting>
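
      <para>As an illustrative check (PCI addresses differ per system), the
      newly created VFs and their driver bindings can be listed
      with:</para>

      <programlisting>$ lspci | grep -i "Virtual Function"
$ dpdk-devbind.py --status</programlisting>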
    </section>

    <section id="multiqueue">
      <title>Multi-queue</title>

      <section id="qemu-multiqueue-support">
        <title>QEMU multi-queue support configuration</title>

        <programlisting>-chardev socket,id=char0,path=/var/run/openvswitch/vhost-user1 \
-netdev type=vhost-user,id=net0,chardev=char0,queues=2 \
-device virtio-net-pci,netdev=net0,mac=22:EA:FB:A8:25:AE,mq=on,vectors=6 \</programlisting>

        <para>where <literal>vectors</literal> is calculated as: 2 + 2 *
        number of queues.</para>
      </section>

      <section id="inside-guest">
        <title>Inside the guest</title>

        <para>For the Linux kernel virtio-net driver (one queue is enabled
        by default):</para>

        <programlisting>$ ethtool -L eth0 combined 2</programlisting>

        <para>For the DPDK Virtio PMD:</para>

        <programlisting>$ testpmd -c 0x7 -- -i --rxq=2 --txq=2 --nb-cores=2 ...</programlisting>
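
        <para>As an illustrative check with the kernel driver,
        <command>ethtool -l</command> (lowercase) prints the maximum and
        currently active queue counts:</para>

        <programlisting>$ ethtool -l eth0</programlisting>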

        <para>For QEMU documentation please see: <ulink
        url="https://qemu.weilnetz.de/doc/qemu-doc.html">https://qemu.weilnetz.de/doc/qemu-doc.html</ulink>.</para>
      </section>
    </section>
  </section>

  <section id="libvirt">
    <title>Libvirt</title>

    <para>One way to manage guests in Enea NFV Access is by using
    <literal>libvirt</literal>. Libvirt is used in conjunction with a
    daemon (<literal>libvirtd</literal>) and a command line utility
    (<literal>virsh</literal>) to manage virtualized environments.</para>

    <para>The libvirt library is a hypervisor-independent virtualization
    API and toolkit that is able to interact with the virtualization
    capabilities of a range of operating systems. Libvirt provides a
    common, generic and stable layer to securely manage domains on a node.
    As nodes may be remotely located, libvirt provides all methods required
    to provision, create, modify, monitor, control, migrate and stop the
    domains, within the limits of hypervisor support for these
    operations.</para>

    <para>The libvirt daemon runs on the Enea NFV Access host. All tools
    built on the libvirt API connect to the daemon to request the desired
    operation, and to collect information about the configuration and
    resources of the host system and guests. <literal>virsh</literal> is a
    command line interface tool for managing guests and the hypervisor,
    built on the libvirt management API.</para>
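
    <para>As an illustrative <command>virsh</command> session (the domain
    name <literal>vm1</literal> and its XML file are assumptions):</para>

    <programlisting>$ virsh define vm1.xml      # register the guest from its XML definition
$ virsh start vm1           # boot the guest
$ virsh list --all          # show running and defined guests
$ virsh shutdown vm1        # request a graceful shutdown
$ virsh undefine vm1        # remove the guest definition</programlisting>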

    <para><emphasis role="bold">Major functionality provided by
    libvirt</emphasis></para>

    <para>The following is a summary from the libvirt <ulink
    url="http://wiki.libvirt.org/page/FAQ#What_is_libvirt.3F">home
    page</ulink> describing the major libvirt features:</para>

    <itemizedlist>
      <listitem>
        <para><emphasis role="bold">VM management:</emphasis> Various
        domain lifecycle operations such as start, stop, pause, save,
        restore, and migrate. Hotplug operations for many device types,
        including disk and network interfaces, memory, and CPUs.</para>
      </listitem>

      <listitem>
        <para><emphasis role="bold">Remote machine support:</emphasis> All
        libvirt functionality is accessible on any machine running the
        libvirt daemon, including remote machines. A variety of network
        transports are supported for connecting remotely, the simplest
        being <literal>SSH</literal>, which requires no extra explicit
        configuration. For more information, see: <ulink
        url="http://libvirt.org/remote.html">http://libvirt.org/remote.html</ulink>.</para>
      </listitem>

      <listitem>
        <para><emphasis role="bold">Network interface
        management:</emphasis> Any host running the libvirt daemon can be
        used to manage physical and logical network interfaces. You can
        enumerate existing interfaces, as well as configure (and create)
        interfaces, bridges, VLANs, and bond devices. For more details see:
        <ulink
        url="https://fedorahosted.org/netcf/">https://fedorahosted.org/netcf/</ulink>.</para>
      </listitem>

      <listitem>
        <para><emphasis role="bold">Virtual NAT and route based
        networking:</emphasis> Any host running the libvirt daemon can
        manage and create virtual networks. Libvirt virtual networks use
        firewall rules to act as a router, providing VMs transparent access
        to the host machine's network. For more information, see: <ulink
        url="http://libvirt.org/archnetwork.html">http://libvirt.org/archnetwork.html</ulink>.</para>
      </listitem>

      <listitem>
        <para><emphasis role="bold">Storage management:</emphasis> Any host
        running the libvirt daemon can be used to manage various types of
        storage: create file images of various formats (raw, qcow2, etc.),
        mount NFS shares, enumerate existing LVM volume groups, create new
        LVM volume groups and logical volumes, partition raw disk devices,
        mount iSCSI shares, and much more. For more details, see: <ulink
        url="http://libvirt.org/storage.html">http://libvirt.org/storage.html</ulink>.</para>
      </listitem>

      <listitem>
        <para><emphasis role="bold">Libvirt configuration:</emphasis> A
        properly running libvirt setup requires that the following elements
        be in place:</para>

        <itemizedlist>
          <listitem>
            <para>Configuration files, located in the directory
            <literal>/etc/libvirt</literal>. They include the daemon's
            configuration file <literal>libvirtd.conf</literal>, and
            hypervisor-specific configuration files, like
            <literal>qemu.conf</literal> for QEMU.</para>
          </listitem>

          <listitem>
            <para>A running libvirtd daemon. The daemon is started
            automatically on the Enea NFV Access host.</para>
          </listitem>

          <listitem>
            <para>Configuration files for the libvirt domains, or guests,
            to be managed by the KVM host. The specifics of each guest
            domain are defined in an XML file of the format specified at
            <ulink
            url="http://libvirt.org/formatdomain.html">http://libvirt.org/formatdomain.html</ulink>.
            XML formats for other structures are specified at <ulink
            url="http://libvirt.org/format.html">http://libvirt.org/format.html</ulink>.</para>
          </listitem>
        </itemizedlist>
      </listitem>
    </itemizedlist>
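
    <para>As a minimal, hedged sketch of such a domain XML file (the name,
    paths and sizes are placeholders, not a complete working
    definition):</para>

    <programlisting>&lt;domain type='kvm'&gt;
  &lt;name&gt;vm1&lt;/name&gt;
  &lt;memory unit='MiB'&gt;1024&lt;/memory&gt;
  &lt;vcpu&gt;2&lt;/vcpu&gt;
  &lt;os&gt;
    &lt;type arch='x86_64'&gt;hvm&lt;/type&gt;
  &lt;/os&gt;
  &lt;devices&gt;
    &lt;disk type='file' device='disk'&gt;
      &lt;driver name='qemu' type='raw'/&gt;
      &lt;source file='/var/lib/libvirt/images/vm1.ext4'/&gt;
      &lt;target dev='vda' bus='virtio'/&gt;
    &lt;/disk&gt;
  &lt;/devices&gt;
&lt;/domain&gt;</programlisting>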
  </section>
</chapter>