<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
"http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">
<chapter id="hypervisor_virt">
  <title>Hypervisor Virtualization</title>

  <para>KVM (Kernel-based Virtual Machine) is a virtualization infrastructure
  for the Linux kernel which turns it into a hypervisor. KVM requires a
  processor with hardware virtualization extensions.</para>
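
  <para>A quick way to check that the host processor provides the required
  extensions and that the KVM device is available on the host is sketched
  below; a non-zero count indicates Intel VT-x (<literal>vmx</literal>) or
  AMD-V (<literal>svm</literal>) support:</para>

  <programlisting># Count hardware virtualization flags reported by the CPU
$ grep -cE 'vmx|svm' /proc/cpuinfo
# Verify that the KVM device node exists
$ ls -l /dev/kvm</programlisting>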

  <para>KVM uses QEMU, an open source machine emulator and virtualizer, to
  virtualize a complete system. With KVM it is possible to run multiple guests
  of a variety of operating systems, each with a complete set of virtualized
  hardware.</para>

  <section id="launch_virt_machine">
    <title>Launching a Virtual Machine</title>

    <para>QEMU can make use of KVM when running a target architecture that is
    the same as the host architecture. For instance, when running
    qemu-system-x86_64 on an x86-64 compatible processor (one with the Intel
    VT or AMD-V virtualization extensions), you can take advantage of KVM
    acceleration, to the benefit of both the host and the guest
    system.</para>

    <para>Enea Linux includes an optimized version of QEMU with KVM-only
    support. To use KVM, pass <command>--enable-kvm</command> to QEMU.</para>

    <para>The following is an example of starting a guest:</para>

    <programlisting>taskset -c 0,1 qemu-system-x86_64 \
-cpu host -M q35 -smp cores=2,sockets=1 \
-vcpu 0,affinity=0 -vcpu 1,affinity=1  \
-enable-kvm -nographic \
-kernel bzImage \
-drive file=enea-image-virtualization-guest-qemux86-64.ext4,if=virtio,format=raw \
-append 'root=/dev/vda console=ttyS0,115200' \
-m 4096 \
-object memory-backend-file,id=mem,size=4096M,mem-path=/dev/hugepages,share=on \
-numa node,memdev=mem -mem-prealloc</programlisting>
  </section>

  <section id="qemu_boot">
    <title>Main QEMU boot options</title>

    <para>The pertinent boot options for the QEMU emulator are detailed
    below:</para>

    <itemizedlist>
      <listitem>
        <para>SMP - at least 2 cores should be enabled so that the
        application(s) running in the virtual machine(s) can be isolated on
        specific cores for better performance.</para>

        <programlisting>-smp cores=2,threads=1,sockets=1 \</programlisting>
      </listitem>

      <listitem>
        <para>CPU affinity - associate virtual CPUs with physical CPUs and
        optionally assign a default real-time priority to the virtual CPU
        process in the host kernel. This option allows you to start QEMU vCPUs
        on isolated physical CPUs.</para>

        <programlisting>-vcpu 0,affinity=0   \</programlisting>
      </listitem>

      <listitem>
        <para>Hugepages - KVM guests can be deployed with huge page memory
        support in order to reduce memory consumption and improve performance,
        by reducing CPU cache usage. By using huge pages for a KVM guest, less
        memory is used for page tables and TLB (Translation Lookaside Buffer)
        misses are reduced, thereby significantly increasing performance,
        especially in memory-intensive situations. A host-side example of
        reserving and mounting huge pages is shown after this list.</para>

        <programlisting>-object memory-backend-file,id=mem,size=4096M,mem-path=/dev/hugepages,share=on \</programlisting>
      </listitem>

      <listitem>
        <para>Memory preallocation - preallocating huge pages at startup can
        improve performance, but it may affect the QEMU boot time.</para>

        <programlisting>-mem-prealloc \</programlisting>
      </listitem>

      <listitem>
        <para>Enable realtime characteristics - run qemu with realtime
        features. While that mildly implies that "-realtime" alone might do
        something, it's just an identifier for options that are partially
        realtime. If you're running in a realtime or low latency environment,
        you don't want your pages to be swapped out and mlock does that, thus
        mlock=on. If you want VM density, then you may want swappable VMs,
        thus mlock=off.</para>

        <programlisting>-realtime mlock=on \</programlisting>
      </listitem>
    </itemizedlist>
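
    <para>The <command>memory-backend-file</command> option used in the
    examples above assumes that huge pages have already been reserved and
    mounted on the host. A minimal host-side sketch, assuming 2 MB pages and
    the <literal>/dev/hugepages</literal> mount point used throughout this
    chapter, is shown below:</para>

    <programlisting># Reserve 2048 huge pages of the default size (2 MB on most x86-64 hosts)
$ echo 2048 &gt; /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
# Mount the hugetlbfs file system if it is not already mounted
$ mount -t hugetlbfs nodev /dev/hugepages
# Verify the reservation
$ grep -i huge /proc/meminfo</programlisting>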

    <para>If the hardware does not have an IOMMU (known as "Intel VT-d" on
    Intel-based machines and "AMD I/O Virtualization Technology" on AMD-based
    machines), it will not be possible to assign devices to a KVM guest.
    Virtualization Technology features (VT-d, VT-x, etc.) must be enabled in
    the BIOS of the host target before starting a virtual machine.</para>
  </section>

  <section id="net_in_guest">
    <title>Networking in guest</title>

    <section id="vhost-user-support">
      <title>Using vhost-user support</title>

      <para>The goal of vhost-user is to implement a Virtio transport, staying
      as close as possible to the vhost paradigm of using shared memory,
      ioeventfds and irqfds. A UNIX domain socket based mechanism allows
      setting up the resources used by a number of Vrings shared between two
      userspace processes, with the Vrings placed in shared memory.</para>

      <para>To run QEMU with the vhost-user backend, you have to provide the
      named UNIX domain socket which needs to be already opened by the
      backend:</para>

      <programlisting>-object memory-backend-file,id=mem,size=4096M,mem-path=/dev/hugepages,share=on \
-chardev socket,id=char0,path=/var/run/openvswitch/vhost-user1 \
-netdev type=vhost-user,id=mynet1,chardev=char0,vhostforce \
-device virtio-net-pci,netdev=mynet1,mac=52:54:00:00:00:01  \</programlisting>

      <para>The vhost-user standard uses a client-server model. The server
      creates and manages the vhost-user sockets, and the client connects to
      the sockets created by the server. It is recommended to use QEMU as the
      server, so that the vhost-user client can be restarted without affecting
      the server; otherwise, if the server side dies, all clients need to be
      restarted.</para>

      <para>Using QEMU as the vhost-user server offers the flexibility to stop
      and start the virtual machine with no impact on the virtual switch
      running on the host (vhost-user-client mode).</para>

      <programlisting>-chardev socket,id=char0,path=/var/run/openvswitch/vhost-user1,server \</programlisting>
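
      <para>The virtual switch on the host then connects to this socket as a
      client. As an illustration, assuming Open vSwitch with DPDK support is
      used as the virtual switch and a bridge named <literal>br0</literal>
      already exists (bridge and port names here are examples), a
      vhost-user-client port can be attached to the socket created by QEMU as
      follows:</para>

      <programlisting>$ ovs-vsctl add-port br0 vhost-client-1 -- \
  set Interface vhost-client-1 type=dpdkvhostuserclient \
  options:vhost-server-path=/var/run/openvswitch/vhost-user1</programlisting>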
    </section>

    <section id="tap-interface">
      <title>Using TAP Interfaces</title>

      <para>QEMU can use TAP interfaces to provide full networking capability
      for the guest OS:</para>

      <programlisting>-netdev tap,id=net0,ifname=tap0,script=no,downscript=no \
-device virtio-net-pci,netdev=net0,mac=22:EA:FB:A8:25:AE \</programlisting>
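
      <para>Before starting QEMU, the TAP interface must exist on the host and
      is typically attached to a bridge. A minimal sketch using iproute2,
      assuming the interface name <literal>tap0</literal> and a bridge named
      <literal>br0</literal> (both are example names), is shown below:</para>

      <programlisting># Create the TAP interface and bring it up
$ ip tuntap add dev tap0 mode tap
$ ip link set dev tap0 up
# Create a bridge (if one does not already exist) and attach tap0 to it
$ ip link add name br0 type bridge
$ ip link set dev br0 up
$ ip link set dev tap0 master br0</programlisting>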
    </section>

    <section id="vfio-passthrough">
      <title>VFIO passthrough VF (SR-IOV) to guest</title>

      <para>The KVM hypervisor supports attaching PCI devices on the host
      system to guests. PCI passthrough allows guests to have exclusive access
      to PCI devices for a range of tasks, and allows those devices to appear
      and behave as if they were physically attached to the guest operating
      system.</para>

      <para>Preparing an Intel system for PCI passthrough (a verification
      example follows the list below):</para>

      <itemizedlist>
        <listitem>
          <para>Enable the Intel VT-d extensions in BIOS</para>
        </listitem>

        <listitem>
          <para>Activate Intel VT-d in the kernel by using
          <literal>intel_iommu=on</literal> as a kernel boot parameter</para>
        </listitem>

        <listitem>
          <para>Allow unsafe interrupts in case the system doesn't support
          interrupt remapping. This can be done using
          <literal>vfio_iommu_type1.allow_unsafe_interrupts=1</literal> as a
          boot kernel parameter.</para>
        </listitem>
      </itemizedlist>
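
      <para>After rebooting with these settings, one way to verify that the
      IOMMU is enabled and that the kernel has populated the IOMMU groups is
      sketched below:</para>

      <programlisting># Check the kernel log for IOMMU/DMAR initialization messages
$ dmesg | grep -i -e DMAR -e IOMMU
# List the IOMMU groups created by the kernel
$ ls /sys/kernel/iommu_groups/</programlisting>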

      <para>Create guest with direct passthrough via VFIO framework like
      so:</para>

      <programlisting>-device vfio-pci,host=0000:03:10.2 \</programlisting>

      <para>On the host, before starting QEMU, one or more Virtual Functions
      (VFs) must be created and bound to the <literal>vfio-pci</literal>
      driver so that they can be allocated to a guest:</para>

      <programlisting>$ echo 2 &gt; /sys/class/net/eno3/device/sriov_numvfs
$ modprobe vfio_pci
$ dpdk-devbind.py --bind=vfio-pci 0000:03:10.2</programlisting>
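
      <para>The PCI address of a newly created VF (such as
      <literal>0000:03:10.2</literal> above) can be looked up from its parent
      interface, for example:</para>

      <programlisting># List the Virtual Functions created for the eno3 parent device
$ ls -l /sys/class/net/eno3/device/virtfn*
# Alternatively, list the Virtual Function entries seen by lspci
$ lspci | grep -i "Virtual Function"</programlisting>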
    </section>

    <section id="multiqueue">
      <title>Multi-queue</title>

      <section id="qemu-multiqueue-support">
        <title>QEMU multi-queue support configuration</title>

        <programlisting>-chardev socket,id=char0,path=/var/run/openvswitch/vhost-user1 \
-netdev type=vhost-user,id=net0,chardev=char0,queues=2 \
-device virtio-net-pci,netdev=net0,mac=22:EA:FB:A8:25:AE,mq=on,vectors=6</programlisting>

        <para>where <literal>vectors</literal> is calculated as: 2 + 2 *
        number of queues.</para>
      </section>

      <section id="inside-guest">
        <title>Inside guest</title>

        <para>When using the Linux kernel virtio-net driver (only one queue is
        enabled by default), additional queues are enabled with:</para>

        <programlisting>$ ethtool -L eth0 combined 2</programlisting>

        <para>When using the DPDK Virtio PMD:</para>

        <programlisting>$ testpmd -c 0x7 -- -i --rxq=2 --txq=2 --nb-cores=2 ...</programlisting>
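
        <para>The number of queues currently enabled in the guest can be
        checked, for example, with:</para>

        <programlisting>$ ethtool -l eth0</programlisting>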

        <para>For QEMU documentation please see: <ulink
        url="https://qemu.weilnetz.de/doc/qemu-doc.html">https://qemu.weilnetz.de/doc/qemu-doc.html</ulink>.</para>
      </section>
    </section>
  </section>

  <section id="libvirt">
    <title>Libvirt</title>

    <para>One way to manage guests in Enea NFV Access is by using
    <literal>libvirt</literal>. Libvirt is used in conjunction with a daemon
    (<literal>libvirtd</literal>) and a command line utility
    (<literal>virsh</literal>) to manage virtualized environments.</para>

    <para>The libvirt library is a hypervisor-independent virtualization API
    and toolkit that is able to interact with the virtualization capabilities
    of a range of operating systems. Libvirt provides a common, generic and
    stable layer to securely manage domains on a node. As nodes may be
    remotely located, libvirt provides all methods required to provision,
    create, modify, monitor, control, migrate and stop the domains, within the
    limits of hypervisor support for these operations.</para>

    <para>The libvirt daemon runs on the Enea NFV Access host. All tools built
    on libvirt API connect to the daemon to request the desired operation, and
    to collect information about the configuration and resources of the host
    system and guests. <literal>virsh</literal> is a command line interface
    tool for managing guests and the hypervisor. The virsh tool is built on
    the libvirt management API.</para>
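
    <para>As an illustration, a typical guest lifecycle handled through
    <literal>virsh</literal> might look as follows, where
    <literal>guest.xml</literal> and the domain name <literal>vm1</literal>
    are example values:</para>

    <programlisting># Define a persistent domain from its XML description
$ virsh define guest.xml
# Start the domain and attach to its console
$ virsh start vm1
$ virsh console vm1
# List all domains, running or not
$ virsh list --all
# Shut down the domain and remove its definition
$ virsh shutdown vm1
$ virsh undefine vm1</programlisting>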

    <para><emphasis role="bold">Major functionality provided by
    libvirt</emphasis></para>

    <para>The following is a summary from the libvirt <ulink
    url="http://wiki.libvirt.org/page/FAQ#What_is_libvirt.3F">home
    page</ulink> describing the major libvirt features:</para>

    <itemizedlist>
      <listitem>
        <para><emphasis role="bold">VM management:</emphasis> Various domain
        lifecycle operations such as start, stop, pause, save, restore, and
        migrate. Hotplug operations for many device types including disk and
        network interfaces, memory, and cpus.</para>
      </listitem>

      <listitem>
        <para><emphasis role="bold">Remote machine support:</emphasis> All
        libvirt functionality is accessible on any machine running the libvirt
        daemon, including remote machines. A variety of network transports are
        supported for connecting remotely, with the simplest being
        <literal>SSH</literal>, which requires no extra explicit
        configuration. For more information, see: <ulink
        url="http://libvirt.org/remote.html">http://libvirt.org/remote.html</ulink>.</para>
      </listitem>

      <listitem>
        <para><emphasis role="bold">Network interface management:</emphasis>
        Any host running the libvirt daemon can be used to manage physical and
        logical network interfaces. Enumerate existing interfaces, as well as
        configure (and create) interfaces, bridges, vlans, and bond devices.
        For more details see: <ulink
        url="https://fedorahosted.org/netcf/">https://fedorahosted.org/netcf/</ulink>.</para>
      </listitem>

      <listitem>
        <para><emphasis role="bold">Virtual NAT and Route based
        networking:</emphasis> Any host running the libvirt daemon can manage
        and create virtual networks. Libvirt virtual networks use firewall
        rules to act as a router, providing VMs transparent access to the host
        machines network. For more information, see: <ulink
        url="http://libvirt.org/archnetwork.html">http://libvirt.org/archnetwork.html</ulink>.</para>
      </listitem>

      <listitem>
        <para><emphasis role="bold">Storage management:</emphasis> Any host
        running the libvirt daemon can be used to manage various types of
        storage: create file images of various formats (raw, qcow2, etc.),
        mount NFS shares, enumerate existing LVM volume groups, create new LVM
        volume groups and logical volumes, partition raw disk devices, mount
        iSCSI shares, and much more. For more details, see: <ulink
        url="http://libvirt.org/storage.html">http://libvirt.org/storage.html</ulink>.</para>
      </listitem>

      <listitem>
        <para><emphasis role="bold">Libvirt Configuration:</emphasis> A
        properly running libvirt requires that the following elements be in
        place:</para>

        <itemizedlist>
          <listitem>
            <para>Configuration files, located in the directory
            <literal>/etc/libvirt</literal>. They include the daemon's
            configuration file <literal>libvirtd.conf</literal>, and
            hypervisor-specific configuration files, like
            <literal>qemu.conf</literal> for QEMU.</para>
          </listitem>

          <listitem>
            <para>A running libvirtd daemon. The daemon is started
            automatically on the Enea NFV Access host.</para>
          </listitem>

          <listitem>
            <para>Configuration files for the libvirt domains, or guests, to
            be managed by the KVM host. The specifics for guest domains are
            defined in an XML file of the format specified at <ulink
            url="http://libvirt.org/formatdomain.html">http://libvirt.org/formatdomain.html</ulink>.
            XML formats for other structures are specified at <ulink
            url="http://libvirt.org/format.html">http://libvirt.org/format.html</ulink>.</para>
          </listitem>
        </itemizedlist>
      </listitem>
    </itemizedlist>
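
    <para>As an example of the remote machine support mentioned above, a
    libvirt daemon on another host can be reached over SSH with a connection
    URI such as the following, where the user name and IP address are
    placeholders:</para>

    <programlisting># List all domains on a remote host over SSH
$ virsh -c qemu+ssh://root@192.168.1.10/system list --all</programlisting>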
  </section>
</chapter>