Wednesday, March 31, 2021

Pacemaker/Corosync on Ubuntu

One common way of achieving a high-availability setup is by installing
Pacemaker and Corosync on your nodes. Pacemaker controls and manages the
resources, and it depends on Corosync, which handles the communication between
nodes.

We will configure an active/passive setup on 2 nodes running Ubuntu Server
17.04 and mimic a real-world scenario in which one node goes down. Do note that
most of the commands below need to be executed on both nodes unless the task is
explicitly designated for one (or any) node only.

Configuring the cluster
=======================

1. In any clustering software, time is a critical factor in ensuring
synchronization between the nodes, so let's configure our time/date settings
properly by installing ntp. After installation, wait a few minutes and check
that the date and time are correct.

sudo apt-get install ntp
sudo systemctl start ntp
sudo systemctl enable ntp
[...]
date
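
If you want to confirm that ntp is actually syncing before moving on (a quick
sanity check, assuming the classic ntp daemon installed above rather than
chrony or systemd-timesyncd), you can query its peers and the clock status:

ntpq -p        # peers marked with * or + are being used for synchronization
timedatectl    # shows whether the system clock is considered synchronized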

2. Install our cluster suite. In Ubuntu 17.04, which is the OS of our choice,
"corosync" and the other required packages can be installed by just installing
"pacemaker". After installation, make sure the required services are running
and enabled at boot.

sudo apt-get install pacemaker
sudo systemctl start pacemaker corosync
sudo systemctl enable pacemaker corosync
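
To double-check that both services really are running and enabled for boot (a
quick sanity check; the exact wording of the output varies between systemd
versions), you can ask systemd directly:

sudo systemctl is-active pacemaker corosync
sudo systemctl is-enabled pacemaker corosync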

3. (Do this on one node only) Corosync requires an authkey (authorization key)
to be present on all members of the cluster. To create one, install an entropy
package, generate the authkey, and send it to all members of the cluster. In
our case, we will send it to node2. The generated authkey is located in
/etc/corosync/authkey.

sudo apt-get install haveged
sudo corosync-keygen
scp /etc/corosync/authkey node2:/etc/corosync/authkey
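
Note that copying straight into /etc/corosync on node2 requires root
privileges on both ends, so you may need to scp the key to a temporary location
first and then move it into place with sudo. Either way, it is worth verifying
on both nodes that the key arrived intact and kept its restrictive permissions
(corosync-keygen normally creates it root-owned with mode 0400):

sudo ls -l /etc/corosync/authkey
sudo chmod 400 /etc/corosync/authkey   # only needed if the copy loosened the permissions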

4. Back up the default corosync.conf and replace its contents with the config
below. The important items here are the 3 IP addresses: the IP of each node and
the bind IP which will be used by the cluster itself. You must decide which
bind IP to use; just make sure it is not already used by any host on your
network. Later, you will see that it is brought up automatically once we create
the virtual IP resource for our cluster.

sudo cp /etc/corosync/corosync.conf /etc/corosync/corosync.conf.orig
sudo vi /etc/corosync/corosync.conf

--- START ---
totem {
  version: 2
  cluster_name: lbcluster
  transport: udpu
  interface {
    ringnumber: 0
    bindnetaddr: <put bind IP here>
    broadcast: yes
    mcastport: 5405
  }
}
quorum {
  provider: corosync_votequorum
  two_node: 1
}
nodelist {
  node {
    ring0_addr: <put node1 IP>
    name: primary
    nodeid: 1
  }
  node {
    ring0_addr: <put node2 IP>
    name: secondary
    nodeid: 2
  }
}
logging {
  to_logfile: yes
  logfile: /var/log/corosync/corosync.log
  to_syslog: yes
  timestamp: on
}
--- END ---

5. Now, we need to allow the pacemaker service in corosync. We do that by
creating the service directory and a "pcmk" file inside it. We also need to add
one more setting to the default file.

sudo mkdir /etc/corosync/service.d
sudo vi /etc/corosync/service.d/pcmk

--- START ---
service {
  name: pacemaker
  ver: 1
}
--- END ---

echo "START=yes" | sudo tee -a /etc/default/corosync

6. Let's restart corosync and pacemaker to pick up the configuration we made.

sudo systemctl restart corosync pacemaker
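
If the restart went through cleanly, corosync should report a healthy ring. A
quick way to check (the exact output format differs slightly between corosync
versions) is:

sudo corosync-cfgtool -s    # ring status; look for "ring 0 active with no faults"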

7. Let's verify that the cluster honored the node IPs. You should get output
similar to the one below, where both node IPs are detected.

sudo corosync-cmapctl | grep members

* sample output *

runtime.totem.pg.mrp.srp.members.1.config_version (u64) = 0
runtime.totem.pg.mrp.srp.members.1.ip (str) = r(0) ip(10.0.0.1)
runtime.totem.pg.mrp.srp.members.1.join_count (u32) = 1
runtime.totem.pg.mrp.srp.members.1.status (str) = joined
runtime.totem.pg.mrp.srp.members.2.config_version (u64) = 0
runtime.totem.pg.mrp.srp.members.2.ip (str) = r(0) ip(10.0.0.2)
runtime.totem.pg.mrp.srp.members.2.join_count (u32) = 1
runtime.totem.pg.mrp.srp.members.2.status (str) = joined

8. We can now interact with pacemaker and see the status of our cluster using
the "crm" command. In the output below, you will see that both nodes (primary
and secondary) are online, but we still don't have a resource. (The stray "Node
node1: UNCLEAN (offline)" entry is most likely a stale node record left over
from an earlier configuration and can be ignored here.) In the next steps, we
will create a virtual IP resource.

crm status

* sample output *

Stack: corosync
Current DC: primary (version 1.1.16-94ff4df) - partition with quorum
Last updated: Thu Dec 28 20:13:19 2017
Last change: Thu Dec 28 20:10:30 2017 by hacluster via crmd on primary

3 nodes configured
0 resources configured

Node node1: UNCLEAN (offline)
Online: [ primary secondary ]

No resources
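
By the way, "crm status" is essentially a front end to Pacemaker's crm_mon
tool. If you prefer to keep an eye on the cluster continuously while following
the next steps, you can run crm_mon directly (optional, just a convenience):

sudo crm_mon -1    # one-shot snapshot, same information as "crm status"
sudo crm_mon       # keeps refreshing as cluster events happen; exit with Ctrl+C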

9. (Do this on one node only) Before creating our virtual IP resource, let's
disable the quorum and fencing settings for simplicity. Whenever we configure
any property, we can do it on one node only since it will be synchronized to
all members.

sudo crm configure property stonith-enabled=false
sudo crm configure property no-quorum-policy=ignore
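
To confirm that both properties were stored in the cluster configuration, you
can print the live configuration back; the two settings should appear under a
"property" section (the exact layout depends on the crmsh version):

sudo crm configure show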

10. (Do this on one node only) Let's create our first resource using the
command below. This will be a virtual IP (or the bind IP) that represents our
cluster, meaning access to the cluster must go through this IP and not through
the individual IPs of the nodes. Since we are aiming for an active/passive
setup, it is better to add `resource-stickiness="100"` as one of the
parameters. That means that when one node goes offline, the other node takes
over the bind IP and keeps it from that moment on, even after the first node
comes back to life. Be sure to set this IP to the same value as the
`bindnetaddr` inside corosync.conf.

sudo crm configure primitive virtual_ip \
ocf:heartbeat:IPaddr2 params ip="10.1.1.3" \
cidr_netmask="32" op monitor interval="10s" \
meta migration-threshold="2" failure-timeout="60s" \
resource-stickiness="100"

As with configuring a cluster property, the command above needs to be run on
one node only since it will be synchronized across the members.

11. Once a resource is created, it will immediately appear in the status.
Let's verify. From the output below, you can see that the virtual_ip resource
is started on the primary, which refers to node1. So if you log in to that node
and inspect the network interfaces, you should see the bind IP configured on
one of them. Also, at this moment, that bind IP is already UP and pingable.

sudo crm status

* sample output *

Stack: corosync
Current DC: primary (version 1.1.16-94ff4df) - partition with quorum
Last updated: Thu Dec 28 20:33:39 2017
Last change: Thu Dec 28 20:32:54 2017 by root via cibadmin on primary

3 nodes configured
1 resource configured

Online: [ primary secondary ]

Full list of resources:

 virtual_ip     (ocf::heartbeat:IPaddr2):       Started primary
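
On the node currently running the resource, the address added by the IPaddr2
agent can be inspected with the ip tool. It normally shows up as an additional
address on an existing interface rather than as a brand new interface (the
device name eth0 below is only an assumption, use your own):

ip -4 addr show            # look for the bind IP in the list of addresses
ip -4 addr show dev eth0   # or check a specific interface directly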

Testing High-Availability
=========================

Now that we have a fully working cluster, the best way to appreciate its magic
is by testing!

1. (Do this on node1 only) Let's replicate a real-world scenario where the
primary node goes down. What will happen to the cluster? Will the bind IP go
down as well? You may mimic such a scenario by disconnecting the interface or
powering off the server, but for a quicker way, let's use our favorite "crm"
command to switch the primary node into "standby" mode.

sudo crm node standby primary

* sample output *

Stack: corosync
Current DC: primary (version 1.1.16-94ff4df) - partition with quorum
Last updated: Thu Dec 28 20:44:45 2017
Last change: Thu Dec 28 20:40:02 2017 by root via crm_attribute on primary

3 nodes configured
1 resource configured

Node primary: standby
Online: [ secondary ]

Full list of resources:

 virtual_ip     (ocf::heartbeat:IPaddr2):       Started secondary

If you started a continuous ping to the bind IP before this step, you will
notice a 1 - 5 second pause. Our HA is doing its magic: it is moving the bind
IP from the primary (node1) to the secondary (node2). And when you log in to
the secondary, you will see that the bind IP is now configured on one of its
interfaces. The primary node no longer has that address.
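
If you want the failover gap to be visible in the ping output itself, the
iputils ping shipped with Ubuntu can flag missed replies with -O (assuming
10.1.1.3 is the bind IP we configured in the resource above; adjust to your
own):

ping -O 10.1.1.3    # prints a "no answer yet" line for every probe that gets no reply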

2. Now, let's remove the primary from standby mode.

sudo crm node online primary

* sample output *

Stack: corosync
Current DC: primary (version 1.1.16-94ff4df) - partition with quorum
Last updated: Thu Dec 28 20:48:28 2017
Last change: Thu Dec 28 20:48:25 2017 by root via crm_attribute on primary

3 nodes configured
1 resource configured

Online: [ primary secondary ]

Full list of resources:

 virtual_ip     (ocf::heartbeat:IPaddr2):       Started secondary

Both nodes are now online, but the bind IP is still on the secondary since we
added `resource-stickiness="100"` as one of the parameters when we configured
our resource.
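
If you ever want to push the virtual IP back to the primary by hand despite
the stickiness, crmsh can do it with a temporary location constraint; just
remember to clear that constraint afterwards so the cluster is free to place
the resource again (a small optional sketch, not required for this setup):

sudo crm resource migrate virtual_ip primary   # forces the resource onto the primary
sudo crm resource unmigrate virtual_ip         # removes the temporary constraint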

That concludes our post for today. Hope you learned something! :)
