Diffstat (limited to 'book-enea-nfv-core-release-info/doc/known_bugs_and_limitations.xml')
 book-enea-nfv-core-release-info/doc/known_bugs_and_limitations.xml | 233
 1 file changed, 71 insertions(+), 162 deletions(-)
diff --git a/book-enea-nfv-core-release-info/doc/known_bugs_and_limitations.xml b/book-enea-nfv-core-release-info/doc/known_bugs_and_limitations.xml
index bc682ab..1aa394d 100644
--- a/book-enea-nfv-core-release-info/doc/known_bugs_and_limitations.xml
+++ b/book-enea-nfv-core-release-info/doc/known_bugs_and_limitations.xml
@@ -44,29 +44,35 @@
 </listitem>
 
 <listitem>
-<para>Instances fail to boot when using a direct port (SR-IOV) on
-ThunderX</para>
+<para>ThunderX integrated NICs cannot be used for SR-IOV</para>
 
 <itemizedlist>
 <listitem>
 <para>Description and Impact:</para>
 
-<para>Deployment is successful with SR-IOV enabled interfaces
-however, instances fail to boot when a direct bound (SR-IOV) port
-is added. This has been tested using a SR-IOV capable PCI Express
-Network Interface. As a consequence it is impossible to
-passthrough a SR-IOV port on ThunderX.</para>
+<para>For the moment, Enea NFV Core is missing the support to
+configure ThunderX integrated NICs for deployment. Furthermore,
+ThunderX integrated NICs cannot be used for SR-IOV even if
+configured manually after deployment. This happens because
+ThunderX integrated NICs are themselves virtual functions and are
+incorrectly handled by libvirt when trying to assign them to a
+virtual machine.</para>
+
+<para>It is however possible to deploy with SR-IOV over add-on
+interfaces via the PCI-E expansion slots.</para>
 </listitem>
 
 <listitem>
-<para>Workaround: N/A.</para>
+<para>Workaround: there is no workaround for this issue. As an
+alternative, the user can configure an external PCI-E NIC for
+SR-IOV.</para>
 </listitem>
 </itemizedlist>
 </listitem>
 
 <listitem>
-<para>Security groups are not working correctly for ICMP traffic in
-deployments with OpenDaylight.</para>
+<para>ThunderX integrated NICs cannot be used for PCI passthrough with
+direct-physical bound Neutron ports</para>
 
 <itemizedlist>
 <listitem>
@@ -74,52 +80,32 @@
 
 <itemizedlist>
 <listitem>
-<para>When OPNFV is deployed with OpenDaylight as an SDN
-controller, the Security Groups rules pertaining to ICMP do
-not work as expected. The OpenFlow rules describing the ICMP
-rules are inconsistent, so VMs can be pinged even when this is
-not desired.</para>
-</listitem>
-
-<listitem>
-<para>This reproduces on aarch64. On x86 the security groups
-work correctly.</para>
+<para>PCI Passthrough using direct-physical bound ports also
+uses the neutron-sriov-agent. Because the interfaces are
+represented as virtual functions, it will be impossible to use
+Neutron ports bound as direct-physical (the Nova driver will
+identify them as type-VF, not type-PF).</para>
+
+<para>Due to this, it is not possible to claim PCI devices
+using direct-physical bound ports.</para>
 </listitem>
 </itemizedlist>
-</listitem>
-
-<listitem>
-<para>Workaround: N/A.</para>
-</listitem>
-</itemizedlist>
-</listitem>
-
-<listitem>
-<para>Virtual instances do not get IP from DHCP in SFC scenarios with
-ODL</para>
-
-<itemizedlist>
-<listitem>
-<para>Description and Impact:</para>
 
 <itemizedlist>
 <listitem>
-<para>After a fresh deploy of OPNFV with OpenDaylight and the
-Service Function Chaining scenario configurations, instances
-fail to get IP from DHCP, due to OpenDaylight
-malfunctioning.</para>
-</listitem>
-
-<listitem>
-<para>The SFC VNFs are not reachable via SSH for management
-and configuration.</para>
+<para>It is however possible to passthrough any device using
+the PCI alias method, which requires configuring a whitelist
+of PCI devices and assigning an alias which is set as metadata
+in the Nova flavor.</para>
 </listitem>
 </itemizedlist>
 </listitem>
 
 <listitem>
-<para>Workaround: Restarting OpenDaylight via
-<literal>systemctl</literal> fixes the problem.</para>
+<para>Workaround:</para>
+
+<para>There is no workaround for this issue. As an alternative,
+the user can configure a PCI alias instead.</para>
 </listitem>
 </itemizedlist>
 </listitem>
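The PCI alias method referred to in the hunk above follows the standard Nova PCI passthrough pattern: whitelist the device on the compute node, define a matching alias, and reference the alias in flavor metadata. A minimal sketch under stated assumptions — the vendor/product IDs, alias name, and flavor name below are hypothetical placeholders, not taken from this patch, and the exact option names (`[pci]` section vs. legacy `pci_passthrough_whitelist`/`pci_alias` in `[DEFAULT]`) depend on the OpenStack release in use:

```ini
# nova.conf on the compute node: whitelist the PCI device
# (vendor_id/product_id are hypothetical examples)
[pci]
passthrough_whitelist = { "vendor_id": "177d", "product_id": "a034" }

# nova.conf where nova-api runs: define an alias for the same device
[pci]
alias = { "vendor_id": "177d", "product_id": "a034", "device_type": "type-PCI", "name": "nic1" }
```

An instance then requests the device through flavor metadata, e.g. `openstack flavor set myflavor --property "pci_passthrough:alias"="nic1:1"`, instead of through a direct-physical bound Neutron port.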
@@ -242,106 +228,6 @@
 </listitem>
 
 <listitem>
-<para>Fuel Healthcheck Stack update test fails</para>
-
-<itemizedlist>
-<listitem>
-<para>Description and Impact:</para>
-
-<para>The Platform test case number 5 (Update stack) from the Fuel
-Healthcheck sometimes fails. This has no impact on the overal
-cluster functionality.</para>
-</listitem>
-
-<listitem>
-<para>Workaround: N/A.</para>
-</listitem>
-</itemizedlist>
-</listitem>
-
-<listitem>
-<para>Issue #1 with Openstack Resource Agents and Compute Fencing
-functionality</para>
-
-<itemizedlist>
-<listitem>
-<para>Description and Impact:</para>
-
-<para>In an OPNFV deployment that uses Openstack Resource Agents,
-the neutron-openvswitch-agent is killed by Pacemaker when booting,
-due to Pacemaker misconfiguration.</para>
-</listitem>
-
-<listitem>
-<para>Workaround:</para>
-
-<para>Starting the <literal>systemd</literal> service manually
-makes it run successfully. Enea NFV Core 1.0.1 is shipped without
-Openstack Resource Agents, therefore this issue should not affect
-the user.</para>
-</listitem>
-</itemizedlist>
-</listitem>
-
-<listitem>
-<para>Issue #2 with Openstack Resource Agents and Compute Fencing
-functionality</para>
-
-<itemizedlist>
-<listitem>
-<para>Description and Impact:</para>
-
-<para>In an OPNFV deployment that uses Openstack Resource Agents,
-when we configure the <literal>fence_compute</literal> as a
-Pacemaker resource, the Controller nodes start to reboot each
-other endlessly.</para>
-</listitem>
-
-<listitem>
-<para>Workaround:</para>
-
-<para>Enea NFV Core 1.0.1 is shipped without Openstack Resource
-Agents, therefore this issue should not affect the user.</para>
-</listitem>
-</itemizedlist>
-</listitem>
-
-<listitem>
-<para>Virtual instances are not affected by removing a node from the
-Ceph Storage Cluster</para>
-
-<itemizedlist>
-<listitem>
-<para>Description and Impact:</para>
-
-<para>Engineering wanted to validate the survival of storage
-systems when a single disk is removed, without causing data loss.
-Without physical access to the test setup, this test is not
-feasible.</para>
-</listitem>
-
-<listitem>
-<para>Workaround:</para>
-
-<itemizedlist>
-<listitem>
-<para>The chosen approach was to validate what happens to the
-Ceph cluster, when network connectivity is lost for the
-Storage interface of one of the nodes. No impact was observed
-when running an instance using Ceph for volume storage.</para>
-</listitem>
-
-<listitem>
-<para><ulink
-url="http://ceph.com/geen-categorie/admin-guide-replacing-a-failed-disk-in-a-ceph-cluster/">Reference
-information</ulink></para>
-</listitem>
-</itemizedlist>
-</listitem>
-</itemizedlist>
-</listitem>
-
-<listitem>
 <para>Offline Deploy with Fuel fails at times</para>
 
 <itemizedlist>
@@ -394,44 +280,67 @@
 </listitem>
 
 <listitem>
-<para>Fuel Healthcheck Stack creation with wait condition test,
-fails</para>
+<para>On Mixed Arch Deployment, only the aarch64 TestVM Cirros image
+will be installed by Fuel</para>
 
 <itemizedlist>
 <listitem>
 <para>Description and Impact:</para>
 
-<para>The platform test case (create stack with wait condition)
-from the Fuel Healthcheck, fails. This has no impact on overall
-cluster functionality.</para>
+<para>Due to the fact that Fuel will only deploy the aarch64
+image, Yardstick, Functest, and certain Health Check tests will
+not work. These test suites are dependent on a single image name
+at a time, and do not know how to place instances on the
+Compute for images that each require a different arch.</para>
+
+<para>To have both testVM images, the user must add the x86_64
+image manually.</para>
 </listitem>
 
 <listitem>
-<para>Workaround: N/A.</para>
+<para>There is no workaround for the test suite failures.</para>
 </listitem>
 </itemizedlist>
 </listitem>
 
 <listitem>
-<para>On Mixed Arch Deployment, only the aarch64 TestVM Cirros image
-will be installed by Fuel</para>
+<para>Removing QoS policies is unreliable</para>
 
 <itemizedlist>
 <listitem>
 <para>Description and Impact:</para>
 
-<para>Due to the fact that Fuel will only deploy the aarch64
-image, Yardstick, Functest, and certain Health Check tests will
-not work. These test suites are dependent on a single image name
-at a time, and do not know on how to place instances on the
-Compute for images that each require a different arch.</para>
+<para>When removing per-port bandwidth limiting QoS policies, all
+traffic is suddenly dropped. On the other hand, when removing
+QoS policies configured at Openstack network level, traffic flows
+as if the rules are still there.</para>
+</listitem>
 
-<para>To have both testVM images, the user must add the x86_64
-image manually.</para>
+<listitem>
+<para>There is no workaround.</para>
+</listitem>
+</itemizedlist>
+</listitem>
+
+<listitem>
+<para>Enabling Ceph for Glance and Nova ephemeral storage makes the
+deployment fail on aarch64</para>
+
+<itemizedlist>
+<listitem>
+<para>Description and Impact:</para>
+
+<para>There are multiple configurable Storage Backends in Fuel
+settings. Enabling Ceph RBD for images (Glance) and Ceph RBD for
+ephemeral volumes (Nova), makes the deployment fail at the CEPH
+Ready Check performed on the primary Controller node. This only
+occurs when using aarch64 nodes; on x86_64, deployment does not
+fail.</para>
 </listitem>
 
 <listitem>
-<para>There is no workaround for the test suites failures.</para>
+<para>There is no workaround. The user should not enable these
+options.</para>
 </listitem>
 </itemizedlist>
 </listitem>
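For the mixed-arch limitation above, adding the x86_64 TestVM image manually can be done with the standard OpenStack client. This is an untested sketch that requires a deployed cloud — the Cirros version, mirror URL, and image name are assumptions, not values from this patch:

```
# Fetch an x86_64 Cirros image (version and mirror are assumptions)
curl -LO http://download.cirros-cloud.net/0.3.5/cirros-0.3.5-x86_64-disk.img

# Register it alongside the aarch64 TestVM; the "architecture"
# property lets the scheduler's ImagePropertiesFilter place the
# instance on a compute node of the matching arch
openstack image create TestVM-x86_64 \
  --disk-format qcow2 --container-format bare --public \
  --property architecture=x86_64 \
  --file cirros-0.3.5-x86_64-disk.img
```

Note that this only makes both images available; as the diff states, the test suites still cannot handle two image names at once, so there is no workaround for the test failures themselves.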