author    Miruna Paun <Miruna.Paun@enea.com>  2017-09-25 13:57:48 +0200
committer Miruna Paun <Miruna.Paun@enea.com>  2017-09-25 13:57:48 +0200
commit    380e975b1b93e83705c8ed30197b1c23f8193814 (patch)
tree      72e98d39867886b77c6008109080b4edb5ee410c /book-enea-nfv-core-installation-guide/doc/high_availability.xml
parent    2df2d1adbab4c4fbfda61700945d85ca3ce53d74 (diff)
download  doc-enea-nfv-380e975b1b93e83705c8ed30197b1c23f8193814.tar.gz
Create new version of NFV Core 1.0 Installation Guide

USERDOCAP-240

Signed-off-by: Miruna Paun <Miruna.Paun@enea.com>
Diffstat (limited to 'book-enea-nfv-core-installation-guide/doc/high_availability.xml')
-rw-r--r--  book-enea-nfv-core-installation-guide/doc/high_availability.xml  794
1 files changed, 794 insertions, 0 deletions
diff --git a/book-enea-nfv-core-installation-guide/doc/high_availability.xml b/book-enea-nfv-core-installation-guide/doc/high_availability.xml
new file mode 100644
index 0000000..e489101
--- /dev/null
+++ b/book-enea-nfv-core-installation-guide/doc/high_availability.xml
@@ -0,0 +1,794 @@
<?xml version="1.0" encoding="ISO-8859-1"?>
<chapter id="high_availability">
  <title>High Availability Guide</title>

  <para>ENEA NFV Core 1.0 has been designed to provide the high availability
  characteristics needed for developing and deploying telco-grade NFV
  solutions on top of our OPNFV-based platform.</para>

  <para>High Availability in general is a very wide subject and remains an
  important focus both in open source communities and in the market of
  independent/proprietary solutions. ENEA NFV Core 1.0 initially aims to
  leverage the efforts of the upstream OPNFV and OpenStack open source
  projects, combining solutions from both worlds to provide flexibility and
  sufficiently wide use case coverage. ENEA has long-standing expertise and
  proprietary solutions addressing High Availability for telco applications,
  which are candidates for integration with NFV-based solutions; however,
  the initial scope of ENEA NFV Core is to leverage the OPNFV Reference
  Platform and open source projects in general as much as possible, as will
  be seen further ahead in this chapter.</para>

  <section id="levels">
    <title>High Availability Levels</title>

    <para>The feature set in ENEA NFV Core is based on a division into three
    levels:</para>

    <itemizedlist>
      <listitem>
        <para>Hardware Fault</para>
      </listitem>

      <listitem>
        <para>NFV Platform HA</para>
      </listitem>

      <listitem>
        <para>VNF High Availability</para>
      </listitem>
    </itemizedlist>

    <para>The same division of fault management levels can be seen in the
    scope of the High Availability for OPNFV (Availability) project. OPNFV
    also hosts the Doctor project, a fault management and maintenance
    project that develops and realizes the corresponding implementation for
    the OPNFV reference platform.</para>

    <para>These two projects complement each other.</para>

    <para>The Availability project addresses HA requirements and solutions
    from the perspective of the three levels mentioned above. It produces
    high-level requirements and API definitions for High Availability in
    OPNFV and an HA Gap Analysis Report for OpenStack; more recently it
    works on optimizing existing OPNFV test frameworks, such as Yardstick,
    and develops test cases which realize HA-specific use cases and
    scenarios derived from the HA requirements.</para>

    <para>The Doctor project, on the other hand, aims to build a fault
    management and maintenance framework for high availability of Network
    Services on top of virtualized infrastructure. Its key feature is
    immediate notification from the VIM when virtualized resources become
    unavailable, so that recovery of the VNFs running on them can be
    processed. The Doctor project has also collaborated with the
    Availability project on identifying gaps in upstream projects, mainly
    but not exclusively OpenStack, and has worked towards implementing
    missing features or improving existing functionality, one good example
    being the Aodh event-based alarms, which allow for fast notifications
    when certain predefined events occur. The Doctor project also produced
    an architecture design and a reference implementation based on open
    source components, which will be presented later on in this
    document.</para>
  </section>

  <section id="doctor_arch">
    <title>Doctor Architecture</title>

    <para>The Doctor documentation shows the detailed architecture for Fault
    Management and NFVI Maintenance. Since the two are very similar, we will
    focus on Fault Management.</para>

    <para>The architecture specifies a set of functional blocks:</para>

    <itemizedlist>
      <listitem>
        <para>Monitor - monitors the virtualized infrastructure, capturing
        fault events in software and hardware. For this particular
        component we chose Zabbix, which is integrated into the platform by
        means of the Fuel Zabbix Plugin, available upstream.</para>
      </listitem>

      <listitem>
        <para>Inspector - this component receives notifications from
        Monitor components as well as from OpenStack core components, which
        allows it to create logical relationships between entities,
        identify affected resources when faults occur, and communicate with
        Controllers to update the states of the virtual and physical
        resources. For this component ENEA NFV Core 1.0 makes use of
        Vitrage, an OpenStack related project used for Root Cause Analysis,
        which has been adapted to serve as a Doctor Inspector. The
        integration into the platform is realized with the help of a Fuel
        Plugin which has been developed internally by ENEA.</para>
      </listitem>

      <listitem>
        <para>Controller - OpenStack core components act as Controllers,
        responsible for maintaining the resource map between physical and
        virtual resources. They accept update requests from the Inspector
        and are responsible for sending failure event notifications to the
        Notifier. Components such as Nova, Neutron, Glance and Heat act as
        Controllers in the Doctor architecture.</para>
      </listitem>

      <listitem>
        <para>Notifier - the focus of this component is on selecting and
        aggregating failure events received from the Controller, based on
        policies mandated by the Consumer. The role of the Notifier is
        fulfilled by the Aodh component in OpenStack.</para>
      </listitem>
    </itemizedlist>

    <para>Besides the Doctor components, a couple of other blocks are
    mentioned in the architecture:</para>

    <itemizedlist>
      <listitem>
        <para>Administrator - this represents the human role of
        administrating the platform by means of dedicated interfaces,
        either visual dashboards, like OpenStack Horizon or the Fuel
        Dashboard, or CLI tools, like the OpenStack unified CLI that can
        traditionally be accessed from one of the servers acting as
        OpenStack Controller nodes. In the case of ENEA NFV Core 1.0, the
        Administrator can also access the Zabbix dashboard for further
        configuration. The same applies to the Vitrage tool, which comes
        with its own Horizon dashboard that enables the user to visually
        inspect the faults reported by the monitoring tools; it also
        creates visual representations of the virtual and physical
        resources, the relationships between them and the fault
        correlation. For Vitrage, users will usually want to configure
        additional use cases and describe relationships between components
        via template files written in YAML format. More information about
        using Vitrage is presented in a following section.</para>
      </listitem>

      <listitem>
        <para>Consumer - this block is only vaguely described in the Doctor
        architecture, as it is out of its scope. Doctor only deals with
        fault detection and management, making sure faults are handled as
        soon as possible after detection; it identifies affected virtual
        resources and updates their states, but since the actual VNFs are
        managed by a different entity according to the ETSI architecture,
        Doctor does not deal with recovery actions for the VNFs. The role
        of the Consumer thus falls to a VNF Manager and Orchestrator. ENEA
        NFV Core 1.0 provides VNF management capabilities using Tacker, an
        OpenStack project that implements a generic VNF Manager and
        Orchestrator according to the ETSI MANO Architectural Framework (a
        minimal onboarding sketch follows this list).</para>
      </listitem>
    </itemizedlist>
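
    <para>For orientation, onboarding and instantiating a VNF with Tacker
    typically looks like the sketch below; the descriptor file and the
    names used here are hypothetical:</para>

    <programlisting># onboard a VNF descriptor, then instantiate a VNF from it
tacker vnfd-create --vnfd-file sample-vnfd.yaml sample-vnfd
tacker vnf-create --vnfd-name sample-vnfd sample-vnf</programlisting>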

    <para>The functional blocks overview in the picture below has been
    complemented to show the components used for realizing the Doctor
    Architecture:</para>

    <mediaobject>
      <imageobject role="fo">
        <imagedata contentwidth="600" fileref="images/functional_blocks.svg"
        format="SVG" />
      </imageobject>
    </mediaobject>

    <section id="dr_fault_mg">
      <title>Doctor Fault Management</title>

      <para>The architecture described in the Doctor project has been
      demonstrated in various PoCs and demos, but always using sample
      components for either the consumer or the monitor. ENEA has worked
      with the upstream projects, Doctor and Vitrage, to realize the goals
      of the Doctor project using real components, as described
      before.</para>

      <para>The two pictures below show a typical fault management
      scenario, as described in the Doctor documentation.</para>

      <mediaobject>
        <imageobject>
          <imagedata contentwidth="600" fileref="images/dr_fault_mg.svg" />
        </imageobject>
      </mediaobject>

      <mediaobject>
        <imageobject>
          <imagedata contentwidth="600" fileref="images/dr_fault_mg_2.svg" />
        </imageobject>
      </mediaobject>

      <para>ENEA NFV Core 1.0 uses the same approach described above, but
      it is worth going through each step in detail.</para>

      <orderedlist>
        <listitem>
          <para>When creating a VNF, the user has to enable the monitoring
          capabilities of Tacker by passing a template which specifies that
          an alarm will be created when the VM represented by this VNF
          changes state. The support for alarm monitoring in Tacker is
          captured in the Alarm Monitoring Framework spec in the OpenStack
          documentation. In short, Tacker should be able to create a VNF
          and then create an Aodh alarm of type event which triggers when
          the instance is in state ERROR. The action taken when this event
          triggers is an HTTP call to a URL managed by Tacker. As a result,
          Tacker can detect when an instance has failed (for whatever
          reason) and will respawn it somewhere else. A sketch of such an
          alarm is shown after this list.</para>
        </listitem>

        <listitem>
          <para>The subscribe response in this case is an empty operation;
          the Notifier (Aodh) only has to confirm that the alarm has been
          created.</para>
        </listitem>

        <listitem>
          <para>The NFVI sends monitoring events for the resources to which
          the VIM has subscribed. Note: this subscription message exchange
          between the VIM and NFVI is not shown in this message flow. This
          step relates to Vitrage's capability of receiving notifications
          from OpenStack services; at this moment Vitrage supports
          notifications from the nova.host, nova.instance, nova.zone,
          cinder.volume, neutron.network, neutron.port and heat.stack
          OpenStack datasources.</para>
        </listitem>

        <listitem>
          <para>This step describes faults detected by Zabbix, which are
          sent to the Inspector (Vitrage) as soon as they are detected,
          using a push approach by means of an AMQP message sent to a
          dedicated message queue managed by Vitrage. For example, if
          nova-compute fails on one of the compute nodes, Zabbix will
          format a message specifying all the details needed for processing
          the fault, e.g. a timestamp, which host failed, what event
          occurred and others.</para>
        </listitem>

        <listitem>
          <para>Database lookup to find the virtual resources affected by
          the detected fault. In this step Vitrage performs various
          calculations to detect which virtual resources are affected by
          the raw failure reported by Zabbix. Vitrage can be configured via
          templates to correlate instances with the physical hosts they run
          on, so that if a compute node fails, the instances running on
          that host are considered affected. A typical use case is to mark
          the compute node as down (a.k.a. mark_host_down) and update the
          states of all instances running on it, by issuing Nova API calls
          for each of these instances. Step 5c) shows the Controller (Nova
          in this case) acting upon the state change of the instance and
          issuing an event alarm to Aodh.</para>
        </listitem>

        <listitem>
          <para>The Notifier acknowledges the alarm event request from Nova
          and triggers the alarm(s) created by Tacker in step 1). Since
          Tacker has configured the alarm to send an HTTP request, Aodh
          performs that HTTP call to the URL managed by Tacker.</para>
        </listitem>

        <listitem>
          <para>The Consumer (Tacker) reacts to the HTTP call and performs
          the action configured by the user (e.g. respawning the
          VNF).</para>
        </listitem>

        <listitem>
          <para>The action is sent to the Controller (Nova) so that the VNF
          is recreated.</para>
        </listitem>
      </orderedlist>
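
      <para>For illustration, an event alarm similar to the one Tacker
      creates in step 1) could be set up manually with the Aodh CLI. The
      sketch below is only an approximation; the event type, the query and
      the Tacker-managed URL are assumptions, not values mandated by the
      platform:</para>

      <programlisting># hypothetical event alarm firing when an instance enters ERROR state
aodh alarm create --name vnf-vdu1-error --type event /
--event-type "compute.instance.update" /
--query "traits.state=string::error" /
--alarm-action "http://&lt;tacker-managed-url&gt;"</programlisting>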

      <note>
        <para>The ENEA NFV Core 1.0 Pre-Release fully covers the required
        Doctor functionality only for the Vitrage and Zabbix
        components.</para>
      </note>
    </section>

    <section id="zabbix">
      <title>Zabbix Configuration for Push Notifications</title>

      <para>Vitrage supports a Zabbix datasource by means of regularly
      polling Zabbix, which needs to be configured in advance. The Vitrage
      plugin developed internally by ENEA can automatically configure
      Zabbix so that everything works as expected.</para>
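
      <para>The polling datasource that the plugin sets up is driven by a
      small configuration file on the Vitrage node; the sketch below shows
      its general shape, with placeholder credentials and URL (the exact
      values depend on the deployment):</para>

      <programlisting># /etc/vitrage/zabbix_conf.yaml (illustrative values)
zabbix:
- zabbix_user: admin
  zabbix_password: zabbix
  url: http://&lt;vip__zbx_vip_mgmt&gt;/zabbix</programlisting>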

      <para>However, polling is not fast enough for a telco use case, so it
      is necessary to configure push notifications for Zabbix. This
      requires manual configuration on one of the controller nodes; since
      Zabbix uses a centralized database, the configuration becomes
      available on all the other nodes.</para>

      <para>The Zabbix configuration dashboard is available at the same IP
      address where OpenStack can be reached, e.g.
      http://&lt;vip__zbx_vip_mgmt&gt;/zabbix.</para>

      <para>To forward Zabbix events to Vitrage, a new media script needs
      to be created and associated with a user. Follow the steps below as a
      Zabbix Admin user:</para>

      <orderedlist>
        <listitem>
          <para>Create a new media type [Administration &gt; Media Types
          &gt; Create Media Type]</para>

          <itemizedlist>
            <listitem>
              <para>Name: Vitrage Notifications</para>
            </listitem>

            <listitem>
              <para>Type: Script</para>
            </listitem>

            <listitem>
              <para>Script name: zabbix_vitrage.py</para>
            </listitem>
          </itemizedlist>
        </listitem>

        <listitem>
          <para>Modify the Media for the Admin user [Administration &gt;
          Users]</para>

          <itemizedlist>
            <listitem>
              <para>Type: Vitrage Notifications</para>
            </listitem>

            <listitem>
              <para>Send to: rabbit://rabbit_user:rabbit_pass@127.0.0.1:5672/
              (the Vitrage message bus URL; look for the transport_url
              setting in /etc/vitrage/vitrage.conf or
              /etc/nova/nova.conf)</para>
            </listitem>

            <listitem>
              <para>When active: 1-7,00:00-24:00</para>
            </listitem>

            <listitem>
              <para>Use if severity: (all)</para>
            </listitem>

            <listitem>
              <para>Status: Enabled</para>
            </listitem>
          </itemizedlist>
        </listitem>

        <listitem>
          <para>Configure an Action [Configuration &gt; Actions &gt; Create
          Action]</para>

          <itemizedlist>
            <listitem>
              <para>Name: Forward to Vitrage</para>
            </listitem>

            <listitem>
              <para>Default Subject: {TRIGGER.STATUS}</para>
            </listitem>

            <listitem>
              <para>Default Message: host={HOST.NAME1} hostid={HOST.ID1}
              hostip={HOST.IP1} triggerid={TRIGGER.ID}
              description={TRIGGER.NAME} rawtext={TRIGGER.NAME.ORIG}
              expression={TRIGGER.EXPRESSION} value={TRIGGER.VALUE}
              priority={TRIGGER.NSEVERITY} lastchange={EVENT.DATE}
              {EVENT.TIME}</para>
            </listitem>
          </itemizedlist>
        </listitem>

        <listitem>
          <para>To send events, add the following under the Conditions tab:
          "Maintenance status not in 'maintenance'".</para>
        </listitem>

        <listitem>
          <para>Finally, add an operation:</para>

          <itemizedlist>
            <listitem>
              <para>Send to Users: Admin</para>
            </listitem>

            <listitem>
              <para>Send only to: Vitrage Notifications</para>
            </listitem>
          </itemizedlist>
        </listitem>
      </orderedlist>

      <para>Using these instructions, Zabbix will call the
      zabbix_vitrage.py script, which is made readily available by the Fuel
      Vitrage Plugin, passing the arguments described in step 3). The
      zabbix_vitrage.py script will then interpret the parameters and
      format an AMQP message that is sent to the vitrage.notifications
      queue, which is managed by the vitrage-graph service.</para>
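
      <para>The forwarding path can also be exercised manually by invoking
      the media script with the same three arguments Zabbix passes: the
      send-to address, the subject and the message. The script location and
      the field values below are assumptions for the example:</para>

      <programlisting># hypothetical manual test of the Vitrage media script
python /usr/lib/zabbix/alertscripts/zabbix_vitrage.py /
'rabbit://rabbit_user:rabbit_pass@127.0.0.1:5672/' 'PROBLEM' /
'host=node-4.domain.tld hostid=10105 hostip=10.20.0.6 triggerid=13565 /
description=Nova Compute process is not running rawtext=Nova Compute /
process is not running on {HOST.NAME} value=1 priority=4 /
lastchange=2017.09.25 12:00:00'</programlisting>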
    </section>

    <section id="vitrage_config">
      <title>Vitrage Configuration</title>

      <para>The Vitrage team has been collaborating with the OPNFV Doctor
      project in order to support Vitrage as an Inspector component. The
      Doctor use case for Vitrage is described in an OpenStack blueprint.
      Additionally, ENEA NFV Core has complemented Vitrage with the
      capability of setting the states of failed instances, by implementing
      an action type in Vitrage which calls Nova APIs to set instances in
      error state. There is also an action type which allows fencing failed
      hosts.</para>

      <para>In order to make use of these features, Vitrage supports
      additional configuration via YAML templates that must be placed in
      /etc/vitrage/templates on the nodes that have the Vitrage
      role.</para>

      <para>The example below shows how to program Vitrage to mark failed
      compute hosts as down and then change the state of their instances to
      ERROR, by creating Vitrage deduced alarms.</para>

      <programlisting>metadata:
  name: test_nova_mark_instance_err
  description: test description
definitions:
  entities:
    - entity:
        category: ALARM
        type: zabbix
        rawtext: Nova Compute process is not running on {HOST.NAME}
        template_id: zabbix_alarm
    - entity:
        category: RESOURCE
        type: nova.host
        template_id: host
    - entity:
        category: RESOURCE
        type: nova.instance
        template_id: instance
  relationships:
    - relationship:
        source: zabbix_alarm
        relationship_type: on
        target: host
        template_id: nova_process_not_running
    - relationship:
        source: host
        target: instance
        relationship_type: contains
        template_id: host_contains_instance
scenarios:
  - scenario:
      condition: nova_process_not_running and host_contains_instance
      actions:
        - action:
            action_type: mark_down
            action_target:
              target: host
        - action:
            action_type: set_instance_state
            action_target:
              target: instance
        - action:
            action_type: set_state
            action_target:
              target: instance
            properties:
              state: ERROR</programlisting>

      <para>For the fence action type, a similar scenario must be added;
      note that the critical_problem_on_host condition references a
      template_id which must be defined in the definitions section of the
      same template:</para>

      <programlisting>- scenario:
    condition: critical_problem_on_host
    actions:
      - action:
          action_type: fence
          action_target:
            target: host</programlisting>

      <para>After a template is configured, the vitrage-api and
      vitrage-graph services must be restarted:</para>

      <programlisting>root@node-6:~# systemctl restart vitrage-api
root@node-6:~# systemctl restart vitrage-graph</programlisting>
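
      <para>Before restarting the services, the template syntax can
      optionally be checked with the Vitrage client; a sketch, assuming the
      python-vitrageclient CLI is available on the node:</para>

      <programlisting># optional template syntax check
root@node-6:~# vitrage template validate /
--path /etc/vitrage/templates/test_nova_mark_instance_err.yaml</programlisting>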
    </section>

    <section id="vitrage_custom">
      <title>Vitrage Customizations</title>

      <para>ENEA NFV Core 1.0 has added custom features to Vitrage which
      allow two kinds of actions:</para>

      <orderedlist>
        <listitem>
          <para>Perform actions northbound of the VIM (see the sketch after
          this list):</para>

          <itemizedlist>
            <listitem>
              <para>Nova force host down on compute nodes.</para>
            </listitem>

            <listitem>
              <para>Setting the instance state to error in Nova; used in
              conjunction with an alarm created by Tacker, as described
              before, this should allow Tacker to detect when an instance
              is affected and take proper action.</para>
            </listitem>
          </itemizedlist>
        </listitem>

        <listitem>
          <para>Perform actions southbound of the VIM.</para>

          <para>Vitrage templates allow us to program fencing actions for
          hosts with failed services. In the event that systemd is unable
          to recover a critical process, or another type of software error
          occurs on the hardware supporting them, we can program a fencing
          of that node, which performs a reboot, thus attempting to recover
          the failed node.</para>
        </listitem>
      </orderedlist>
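
      <para>For reference, the effect of the two northbound action types is
      similar to what the following OpenStack CLI calls achieve; the host
      and server names are hypothetical and the calls are shown for
      illustration only:</para>

      <programlisting># force the compute service on a failed host down (cf. mark_down)
openstack compute service set --down node-4.domain.tld nova-compute
# set an affected instance to error state (cf. set_instance_state)
openstack server set --state error my-vnf-instance</programlisting>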
    </section>
  </section>

  <section id="pm_high_avail">
    <title>Pacemaker High Availability</title>

    <para>Many of the OpenStack solutions which offer High Availability
    characteristics employ Pacemaker for achieving highly available
    OpenStack services. Traditionally Pacemaker has been used for managing
    only the control plane services, so it can effectively provide
    redundancy and recovery for the Controller nodes only. One reason for
    this is that Controller nodes and Compute nodes have essentially very
    different High Availability requirements that need to be considered.
    Typically, the services that run on Controller nodes are stateless,
    with a few exceptions where only one instance of a given service is
    allowed but redundancy is still desired, one good example being an AMQP
    service (e.g. RabbitMQ). Compute node HA requirements depend on the
    type of services that run on these nodes, but typically it is desired
    that failures are detected as soon as possible, so that the instances
    running on them can be migrated, resurrected or restarted. Another
    aspect is that failures on the physical hosts do not necessarily cause
    a failure of the hosted services (VNFs), but having the hosts
    incapacitated can prevent accessing and controlling those
    services.</para>

    <para>Controller High Availability is thus a subject which is in
    general well understood and well exercised, and the basis for achieving
    it is Pacemaker using Corosync underneath.</para>

    <para>Extending the use of Pacemaker to Compute nodes was considered a
    possible solution for providing VNF high availability, but this turns
    out to be a problem which is not easy to solve. On one hand, Pacemaker
    as a clustering tool can only scale properly up to a limited number of
    nodes, usually fewer than 128. This poses a problem for large scale
    deployments where hundreds of compute nodes are required. On the other
    hand, Compute node HA requires other considerations and calls for
    specially designed solutions.</para>

    <section id="pm_remote">
      <title>Pacemaker Remote</title>

      <para>As mentioned earlier, Pacemaker and Corosync do not scale well
      over a large cluster, because each node has to talk to every other
      node, essentially creating a mesh configuration. One solution to this
      problem could be partitioning the cluster into smaller groups, but
      this has its limitations and is generally difficult to manage.</para>

      <para>A better solution is using pacemaker-remote, a feature of
      Pacemaker which allows extending the cluster beyond the usual limits
      by using the Pacemaker monitoring capabilities, essentially creating
      a new type of resource which enables adding lightweight nodes to the
      cluster. More information about pacemaker-remote can be found on the
      official Clusterlabs website.</para>

      <para>Please note that at this moment pacemaker-remote must be
      configured manually after deployment. The manual steps for doing so
      are:</para>

      <orderedlist>
        <listitem>
          <para>Log on to the Fuel Master using the default credentials, if
          not changed (root/r00tme).</para>
        </listitem>

        <listitem>
          <para>Type fuel node to obtain the list of nodes, their roles and
          their IP addresses:</para>

          <programlisting>[root@fuel ~]# fuel node
id | status | name | cluster | ip | mac | roles /
 | pending_roles | online | group_id
---+--------+------------------+---------+-----------+-------------------+----------/
-----------------+---------------+--------+---------
 1 | ready | Untitled (8c:d4) | 1 | 10.20.0.4 | 68:05:ca:46:8c:d4 | ceph-osd,/
 controller | | 1 | 1
 4 | ready | Untitled (8c:c2) | 1 | 10.20.0.6 | 68:05:ca:46:8c:c2 | ceph-osd,/
 compute | | 1 | 1
 5 | ready | Untitled (8c:c9) | 1 | 10.20.0.7 | 68:05:ca:46:8c:c9 | ceph-osd,/
 compute | | 1 | 1
 2 | ready | Untitled (8b:64) | 1 | 10.20.0.3 | 68:05:ca:46:8b:64 | /
controller, mongo, tacker | | 1 | 1
 3 | ready | Untitled (8c:45) | 1 | 10.20.0.5 | 68:05:ca:46:8c:45 | /
controller, vitrage | | 1 | 1</programlisting>
        </listitem>

        <listitem>
          <para>Each controller has a unique Pacemaker authkey; we need to
          keep one and propagate it to the other servers. Assuming node-1,
          node-2 and node-3 are the controllers, execute the following from
          the Fuel console:</para>

          <programlisting>[root@fuel ~]# scp node-1:/etc/pacemaker/authkey .
[root@fuel ~]# scp authkey node-2:/etc/pacemaker/
[root@fuel ~]# scp authkey node-3:/etc/pacemaker/
[root@fuel ~]# scp authkey node-4:~
[root@fuel ~]# scp authkey node-5:~</programlisting>
        </listitem>

        <listitem>
          <para>For each compute node, log on to it using the corresponding
          IP address and perform the next four steps.</para>
        </listitem>

        <listitem>
          <para>Install the required packages:</para>

          <programlisting>root@node-4:~# apt-get install pacemaker-remote resource-agents crmsh</programlisting>
        </listitem>

        <listitem>
          <para>Copy the authkey from the Fuel Master and make sure the
          right permissions are set:</para>

          <programlisting>[root@node-4:~]# cp authkey /etc/pacemaker
[root@node-4:~]# chown root:haclient /etc/pacemaker/authkey</programlisting>
        </listitem>

        <listitem>
          <para>Add an iptables rule for the default port (3121), and also
          save it to /etc/iptables/rules.v4 to make it persistent:</para>

          <programlisting>root@node-4:~# iptables -A INPUT -s 192.168.0.0/24 -p tcp -m multiport /
--dports 3121 -m comment --comment "pacemaker_remoted from 192.168.0.0/24" -j ACCEPT</programlisting>
        </listitem>

        <listitem>
          <para>Start the pacemaker-remote service:</para>

          <programlisting>[root@node-4:~]# systemctl start pacemaker-remote.service</programlisting>
        </listitem>

        <listitem>
          <para>Log on to one of the controller nodes and configure the
          pacemaker-remote resources:</para>

          <programlisting>[root@node-1:~]# pcs resource create node-4.domain.tld remote
[root@node-1:~]# pcs constraint location node-4.domain.tld prefers /
node-1.domain.tld=100 node-2.domain.tld=100 node-3.domain.tld=100
[root@node-1:~]# pcs constraint location node-4.domain.tld avoids node-5.domain.tld
[root@node-1:~]# pcs resource create node-5.domain.tld remote
[root@node-1:~]# pcs constraint location node-5.domain.tld prefers /
node-1.domain.tld=100 node-2.domain.tld=100 node-3.domain.tld=100
[root@node-1:~]# pcs constraint location node-5.domain.tld avoids node-4.domain.tld</programlisting>
        </listitem>

        <listitem>
          <para>The remote nodes should now appear online:</para>

          <programlisting>[root@node-1:~]# pcs status
Cluster name: OpenStack
Last updated: Thu Aug 24 12:00:21 2017 Last change: Thu Aug 24 11:57:32 2017 /
by root via cibadmin on node-1.domain.tld
Stack: corosync
Current DC: node-1.domain.tld (version 1.1.14-70404b0) - partition with quorum
5 nodes and 78 resources configured

Online: [ node-1.domain.tld node-2.domain.tld node-3.domain.tld ]
RemoteOnline: [ node-4.domain.tld node-5.domain.tld ]</programlisting>
        </listitem>
      </orderedlist>
    </section>

    <section id="pm_fencing">
      <title>Pacemaker Fencing</title>

      <para>ENEA NFV Core 1.0 makes use of the fencing capabilities of
      Pacemaker to isolate faulty nodes and trigger recovery actions by
      means of power cycling the failed nodes. Fencing is configured by
      creating STONITH type resources for each of the servers in the
      cluster, both Controller nodes and Compute nodes. The STONITH adapter
      used for fencing the nodes is fence_ipmilan, which makes use of the
      IPMI capabilities of the Cavium ThunderX servers.</para>

      <para>The steps for enabling fencing capabilities in the cluster are
      listed below; a verification sketch follows the list:</para>

      <orderedlist>
        <listitem>
          <para>Log on to the Fuel Master using the default credentials, if
          not changed (root/r00tme).</para>
        </listitem>

        <listitem>
          <para>Type fuel node to obtain the list of nodes, their roles and
          their IP addresses:</para>

          <programlisting>[root@fuel ~]# fuel node
id | status | name | cluster | ip | mac | roles /
 | pending_roles | online | group_id
---+--------+------------------+---------+-----------+-------------------+----------/
-----------------+---------------+--------+---------
 1 | ready | Untitled (8c:d4) | 1 | 10.20.0.4 | 68:05:ca:46:8c:d4 | ceph-osd,/
 controller | | 1 | 1
 4 | ready | Untitled (8c:c2) | 1 | 10.20.0.6 | 68:05:ca:46:8c:c2 | ceph-osd,/
 compute | | 1 | 1
 5 | ready | Untitled (8c:c9) | 1 | 10.20.0.7 | 68:05:ca:46:8c:c9 | ceph-osd,/
 compute | | 1 | 1
 2 | ready | Untitled (8b:64) | 1 | 10.20.0.3 | 68:05:ca:46:8b:64 | /
controller, mongo, tacker | | 1 | 1
 3 | ready | Untitled (8c:45) | 1 | 10.20.0.5 | 68:05:ca:46:8c:45 | /
controller, vitrage | | 1 | 1</programlisting>
        </listitem>

        <listitem>
          <para>Log on to each server to install additional
          packages:</para>

          <programlisting>[root@node-1:~]# apt-get install fence-agents ipmitool</programlisting>
        </listitem>

        <listitem>
          <para>Configure the Pacemaker fencing resources; this needs to be
          done only once, on one of the controllers. The parameters will
          vary, depending on the BMC address and credentials of each
          node.</para>

          <programlisting>[root@node-1:~]# crm configure primitive ipmi-fencing-node-1 /
stonith::fence_ipmilan params pcmk_host_list="node-1.domain.tld" /
ipaddr=10.0.100.151 login=ADMIN passwd=ADMIN op monitor interval="60s"
[root@node-1:~]# crm configure primitive ipmi-fencing-node-2 /
stonith::fence_ipmilan params pcmk_host_list="node-2.domain.tld" /
ipaddr=10.0.100.152 login=ADMIN passwd=ADMIN op monitor interval="60s"
[root@node-1:~]# crm configure primitive ipmi-fencing-node-3 /
stonith::fence_ipmilan params pcmk_host_list="node-3.domain.tld" /
ipaddr=10.0.100.153 login=ADMIN passwd=ADMIN op monitor interval="60s"
[root@node-1:~]# crm configure primitive ipmi-fencing-node-4 /
stonith::fence_ipmilan params pcmk_host_list="node-4.domain.tld" /
ipaddr=10.0.100.154 login=ADMIN passwd=ADMIN op monitor interval="60s"
[root@node-1:~]# crm configure primitive ipmi-fencing-node-5 /
stonith::fence_ipmilan params pcmk_host_list="node-5.domain.tld" /
ipaddr=10.0.100.155 login=ADMIN passwd=ADMIN op monitor interval="60s"</programlisting>
        </listitem>

        <listitem>
          <para>Activate fencing by enabling the stonith property in
          Pacemaker (by default it is disabled); this also needs to be done
          only once, on one of the controllers.</para>

          <programlisting>[root@node-1:~]# pcs property set stonith-enabled=true</programlisting>
        </listitem>
      </orderedlist>
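
      <para>Each STONITH device can then be checked individually; the
      sketch below assumes the BMC addresses and credentials configured
      above, and note that the second command actually power cycles the
      node:</para>

      <programlisting># query the power status of a node through its BMC
[root@node-1:~]# fence_ipmilan -a 10.0.100.154 -l ADMIN -p ADMIN -o status
# optionally, test fencing of a compute node through Pacemaker
[root@node-1:~]# pcs stonith fence node-4.domain.tld</programlisting>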
    </section>
  </section>

  <section id="ops_resources_agents">
    <title>OpenStack Resource Agents</title>

    <para>The OpenStack community has been working for some time on
    identifying possible solutions for enabling High Availability for
    Compute nodes, although initially the subject of HA on compute nodes
    was very controversial, being seen as something that should not concern
    the cloud platform. Over time it became obvious that even on a true
    cloud platform, where services are designed to run without being
    affected by the availability of the cloud platform, fault management
    and recovery are still very important and desirable. This is very much
    the case for NFV applications, where, in the good tradition of telecom
    applications, operators must have complete engineering control over the
    resources they own and manage.</para>

    <para>The work on compute node high availability is captured in an
    OpenStack user story and documented upstream, showing proposed
    solutions, summit talks and presentations.</para>

    <para>A number of these solutions make use of the OpenStack Resource
    Agents, which are basically a set of specialized Pacemaker resources
    capable of identifying failures on compute nodes, and which can perform
    automatic evacuation of the instances affected by these
    failures.</para>
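
    <para>As an illustration of this approach, the Pacemaker resource below
    is a sketch of how an evacuation agent from the
    openstack-resource-agents project could be registered; the agent name,
    credentials and endpoint are assumptions for the example and not part
    of the ENEA NFV Core 1.0 deployment steps:</para>

    <programlisting># hypothetical registration of the NovaEvacuate OCF agent
[root@node-1:~]# pcs resource create nova-evacuate /
ocf:openstack:NovaEvacuate auth_url=http://192.168.0.2:5000/v2.0 /
username=admin password=admin tenant_name=admin /
op monitor interval="10s"</programlisting>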

    <para>ENEA NFV Core 1.0 aims to validate and integrate this work and to
    make this feature available in the platform, to be used as an
    alternative to the Doctor framework where simple, autonomous recovery
    of the running instances is desired.</para>
  </section>
</chapter>
\ No newline at end of file