Red Hat Cluster Suite Failover Cluster explained using HTTP as a failover service
Cluster: A group of two or more computers that work together to perform the same task. This document explains a Red Hat Cluster implementation using the Conga project, which has two important services, running on the base node and the cluster nodes respectively.
LUCI: Luci is the service installed on a separate base node. It provides an admin console for creating, configuring and managing the nodes in the cluster.
RICCI: Ricci is the agent service installed on every node in the cluster. It is through this service that the Luci admin console joins the nodes to the cluster.
CMAN (Cluster Management): CMAN manages quorum and cluster membership. It is a very important component (service) of Red Hat Cluster and must run on every node.
Fencing: Fencing is the mechanism for disconnecting a faulty node from the cluster in order to avoid data corruption and maintain data integrity.
Lock Management: Red Hat Cluster provides lock management through DLM (Distributed Lock Manager). GFS uses locks from the lock manager to synchronize access to shared file system metadata. CLVM uses locks from the lock manager to synchronize updates to LVM volumes and volume groups on shared storage.
The cluster configuration file is /etc/cluster/cluster.conf, and it is an XML file.
Cluster resources such as IP address, script and Red Hat Storage GFS2 are defined in the cluster configuration file.
The maximum number of nodes supported in a Red Hat Cluster deployment of GFS/GFS2 is 16.
This document explains the deployment of an Apache application in Red Hat Cluster.
Server: To set up a 2-node cluster we take 2 servers (Quad Core Quad CPU HP machines with a minimum of 4 GB RAM) and 1 base node (Quad Core Quad CPU HP machine with a minimum of 4 GB RAM) on which Luci will be installed.
Note: You can choose virtual nodes as well, but there are certain limitations with VM fencing.
1. IP Detailed Requirement:-
1. Server 1 - 2 Local IP for (bonding) HTTP server + 1 IP for Cluster fencing
2. Server 2 - 2 Local IP for (bonding) HTTP server + 1 IP for Cluster fencing
3. Virtual IP - 1 virtual IP for HTTP ( Cluster IP )
2. Storage Requirement:-
a. One 200 GB SAN LUN for application data (size depends on requirements)
How to Configure Apache Cluster
Step 1: First install RHEL 5.6 on both servers with custom packages.
Step 2 : Configure Network Bonding
The steps for creating the bond are as follows. Create the bond interface file for the public network and save it:
# vim /etc/sysconfig/network-scripts/ifcfg-bond0
IPADDR=192.168.5.20 [This will be actual network IP address]
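Only the IPADDR line of the bond0 file is shown above. As a rough sketch, a complete ifcfg-bond0 on RHEL 5 often looks like the following. The NETMASK value is an assumption (a /24 network fits the 192.168.5.x addresses used here but is never stated), so adjust it to your network.

```shell
# /etc/sysconfig/network-scripts/ifcfg-bond0 (sketch, not the original file)
DEVICE=bond0
IPADDR=192.168.5.20        # actual network IP address of this node
NETMASK=255.255.255.0      # assumption: /24 network
ONBOOT=yes                 # bring the bond up at boot
BOOTPROTO=none             # static addressing, no DHCP
USERCTL=no                 # only root may control the interface
```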
After creating the bond0 file, modify the eth0 and eth1 files respectively.
Make sure you remove the HW address, IP address and gateway information from eth0 and eth1, and add 2 important lines to each of those files.
# vim /etc/sysconfig/network-scripts/ifcfg-eth0
Make sure the file reads as follows for the eth0 interface:
# vim /etc/sysconfig/network-scripts/ifcfg-eth1
Make sure the file reads as follows for the eth1 interface:
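The file contents for the two slave interfaces did not survive above. In a typical RHEL 5 bonding setup the two important lines are MASTER=bond0 and SLAVE=yes. The small helper below (the function name slave_ifcfg is purely illustrative) prints what such a file usually contains, so you can compare it against your own ifcfg-eth0 and ifcfg-eth1.

```shell
#!/bin/bash
# Print a typical slave-interface config for the given NIC (sketch).
slave_ifcfg() {
    local nic="$1"
    cat <<EOF
DEVICE=${nic}
ONBOOT=yes
BOOTPROTO=none
USERCTL=no
MASTER=bond0
SLAVE=yes
EOF
}

# Show the expected contents for both slave interfaces.
slave_ifcfg eth0
slave_ifcfg eth1
```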
Load the bond driver/module.
# vim /etc/modprobe.conf
Append the following two lines:
alias bond0 bonding
options bond0 mode=balance-alb miimon=100
Save the file.
Next, load the bonding module:
# modprobe bonding
Restart the networking service to bring up the bond0 interface:
# service network restart
Use the command below to check whether bonding is actually working:
# cat /proc/net/bonding/bond0
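To make this check scriptable, the sketch below scans a bonding status file and prints each slave interface together with its MII status. It relies only on the "Slave Interface:" and "MII Status:" lines that the bonding driver writes to /proc/net/bonding/bond0. The function name bond_slave_status is illustrative.

```shell
#!/bin/bash
# Print "<slave> <mii-status>" for every slave listed in a bonding status file.
bond_slave_status() {
    local file="${1:-/proc/net/bonding/bond0}"
    awk -F': ' '
        /^Slave Interface:/ { slave = $2 }      # remember the current slave
        /^MII Status:/ && slave != "" {         # status line that follows a slave
            print slave, $2
            slave = ""
        }
    ' "$file"
}

# Only run the check when the bond actually exists on this machine.
if [ -r /proc/net/bonding/bond0 ]; then
    bond_slave_status
fi
```

The first "MII Status:" line in the file describes the bond itself, so it is skipped; both eth0 and eth1 should be reported as up on a healthy node.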
Step 3: Set the hostname on the base node and on the other 2 nodes, and make sure the following entries are present in /etc/hosts on all three machines:
192.168.5.20 station20.example.com station20
192.168.5.10 station10.example.com station10
192.168.5.30 station30.example.com station30
Step 4: Set up passwordless SSH authentication between the two nodes, station20.example.com and station30.example.com.
Log in to station20 and run ssh-keygen, then:
#ssh-copy-id -i /root/.ssh/id_rsa.pub station30
Log in to station30 and run ssh-keygen, then:
#ssh-copy-id -i /root/.ssh/id_rsa.pub station20
Step 5 : Set the yum repository on the base node [ station10 ] and other 2 servers
Step 6: Make sure IPTABLES and SELINUX are disabled on all the three machines
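A minimal sketch of Step 6, assuming the stock RHEL 5 service names and the default /etc/selinux/config path. The helper names are illustrative; set_selinux_disabled takes the file path as an argument so it can be tried on a copy first.

```shell
#!/bin/bash
# Rewrite the SELINUX= line of an SELinux config file to "disabled".
set_selinux_disabled() {
    local cfg="${1:-/etc/selinux/config}"
    sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$cfg"
}

# Run as root on each of the three machines (not called automatically here).
disable_firewall_and_selinux() {
    service iptables stop      # stop the firewall now
    chkconfig iptables off     # keep it off after a reboot
    setenforce 0               # switch SELinux to permissive until reboot
    set_selinux_disabled       # make the change permanent
}
```

Call disable_firewall_and_selinux on station10, station20 and station30. Note that setenforce 0 only makes SELinux permissive; the disabled setting takes effect after a reboot.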
Step 7: Log in to the base node and first install Luci and the cluster packages
#yum groupinstall "ClusterStorage" -y
#yum install luci*
Then run:
#luci_admin init
This command generates an SSL certificate and asks for a password for the admin user.
Assign the password; the command returns to the # prompt and states that we can log in at
https://192.168.5.10:8084 with username admin and password redhat [assuming we set the password to redhat]
#service luci restart && chkconfig luci on
Step 8: Log in to the other 2 nodes and first install the cluster packages and Ricci
#yum groupinstall "ClusterStorage" -y
#yum install ricci* -y
#service ricci restart && chkconfig ricci on
Once all the above steps are done, we can log in at https://192.168.5.10:8084 and start building our cluster.
Step 9: We will use ILO as the fencing device while building this cluster, so add the user ID and password under the ILO configuration option available in BIOS mode and set the IP address accordingly. Do this on both nodes.
We need to check whether manual fencing works by logging in to each node:
Step 10: Log in to station20 and run:
#fence_ilo -a station30 -l admin -p redhat -o reboot
The username admin and the password redhat are the ones we assigned in the ILO configuration. The -a option takes the IP address or hostname of the target node's ILO interface.
Log in to station30 and run:
#fence_ilo -a station20 -l admin -p redhat -o reboot
Step 11: Assign the 200 GB storage LUN for Apache. The LUN should be visible on both server nodes, node1 and node2.
Step 12: Create LVM on the 200 GB LUN
Step 13: Edit /etc/lvm/lvm.conf (vim /etc/lvm/lvm.conf) and set locking_type=3, which makes LVM cluster-aware (clustered locking through clvmd)
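Step 13 can also be scripted. The sketch below rewrites the locking_type line with sed; the function name set_cluster_locking is illustrative, and it assumes the line is present and not commented out. On systems with the lvm2-cluster package, lvmconf --enable-cluster is commonly used to make the same change.

```shell
#!/bin/bash
# Set locking_type to 3 (clustered locking through clvmd) in an lvm.conf file.
# Commented-out lines are left alone; leading indentation is preserved.
set_cluster_locking() {
    local conf="${1:-/etc/lvm/lvm.conf}"
    sed -i 's/^\([[:space:]]*\)locking_type[[:space:]]*=.*/\1locking_type = 3/' "$conf"
}
```

Run set_cluster_locking as root on both cluster nodes, then check the result with grep locking_type /etc/lvm/lvm.conf.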
Step 14: Install Apache on both servers, node1 and node2
Step 15: Configure the cluster. Log in at https://192.168.5.10:8084
The first thing to do after logging in to the Luci console is to create a cluster.
Click Cluster > Create a new cluster and add the node host names and passwords.
Step 16: Enter the cluster name as Cluster_01 and enter both node names with their passwords, as shown in the screenshot.
Step 17: Click on View SSL fingerprint; it should verify the fingerprints, as shown in the screenshot.
Step 18: Once we click on the Submit button, it will install, reboot, configure and join the nodes to the cluster.
Step 19: After the installation succeeds, we can log in to station20.example.com and station30.example.com and check the cluster status with the clustat and cman_tool status commands (clustat output first, followed by cman_tool status output):
Cluster Status for Cluster_01 @ Fri Jun 1 15:56:08 2012
Member Status: Quorate
Member Name ID Status
------ ---- ---- ------
station30.example.com 1 Online
station20.example.com 2 Online, Local
Config Version: 1
Cluster Name: Cluster_01
Cluster Id: 25517
Cluster Member: Yes
Cluster Generation: 8
Membership state: Cluster-Member
Expected votes: 1
Total votes: 2
Active subsystems: 9
Flags: 2node Dirty
Ports Bound: 0 11 177
Node name: station20.example.com
Node ID: 2
Multicast addresses: 188.8.131.52
Node addresses: 192.168.5.20
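When checking several nodes it can help to script the test. The sketch below reads clustat-style output on standard input and succeeds only if the "Member Status" line reports Quorate. The function name is_quorate is illustrative.

```shell
#!/bin/bash
# Succeed (exit status 0) if clustat output on stdin reports a quorate cluster.
is_quorate() {
    grep -q '^Member Status:[[:space:]]*Quorate'
}

# Example: feed in the two header lines from the output shown above.
if printf 'Cluster Status for Cluster_01 @ Fri Jun 1 15:56:08 2012\nMember Status: Quorate\n' | is_quorate; then
    echo "cluster is quorate"
fi
```

On a live node the same check would be run as clustat | is_quorate.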
Step 20: The next step is to generate the fence key. Click on Cluster_01 and then click on the Fence option, as shown in the screenshot. Tick the fence daemon checkbox, enter the node IP and click Apply. This creates fence_xvm.key under the /etc/cluster folder.
Step 21: Now we need to add the fence device and reference it under each node. Since the fence device we are adding is a non-shared fencing device, we create it while adding a fence device under the node itself. Click on the node, click on Manage Fence for this node, and then click on Add a fence device to this node. I am using HP ILO fencing here.
Step 22: A few settings are required on the ILO as well. To enter the ILO2 configuration, reboot the server, wait for the prompt and press F8. The first thing to configure is the IP address: go to Network->DNS/DHCP, as shown in the visual, and set DHCP Enabled to OFF. From the main screen select Network->NIC and TCP/IP and set Network Interface Adapter to ON. Configure the IP address, subnet mask and gateway, and press F10 to save the changes. The next step is to change or create the user account settings: from the main screen go to User->Add.
Step 23: Click on Cluster, then Failover Domain, and Add Failover Domain.
Step 24: Format the clustered LVM volume with the GFS2 file system
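/dev/vg0/lv0 is an existing logical volume.
Create a file system on /dev/vg0/lv0. The first attempt fails because the -t option is missing before the lock table name:
#mkfs.gfs2 -p lock_dlm Cluster_01:vg0 -j 3 /dev/vg0/lv0
mkfs.gfs2: More than one device specified (try -h for help)
With -t added, the command succeeds:
[root@station30 ~]# mkfs.gfs2 -p lock_dlm -t Cluster_01:vg0 -j 3 /dev/vg0/lv0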
This will destroy any data on /dev/vg0/lv0.
It appears to contain a gfs filesystem.
Are you sure you want to proceed? [y/n] y
Device Size 0.48 GB (126976 blocks)
Filesystem Size: 0.48 GB (126973 blocks)
Resource Groups: 2
Locking Protocol: "lock_dlm"
Lock Table: "Cluster_01:vg0"
#mount /dev/vg0/lv0 /var/www/html/
#gfs2_tool df /dev/mapper/vg0-lv0
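The failed first attempt above shows how easy it is to drop the -t option. As a sketch, the helper below assembles the mkfs.gfs2 command line from its parts and prints it without running it. The function name gfs2_mkfs_cmd is illustrative, and the 16-character limit on the file system name is an assumption based on the mkfs.gfs2 documentation of that era; check the man page on your release.

```shell
#!/bin/bash
# Build (but do not run) the mkfs.gfs2 command line for a clustered file system.
gfs2_mkfs_cmd() {
    local cluster="$1" fsname="$2" journals="$3" device="$4"
    # Assumed limit: the fsname part of the lock table is at most 16 characters.
    if [ "${#fsname}" -gt 16 ]; then
        echo "fsname '$fsname' is longer than 16 characters" >&2
        return 1
    fi
    echo "mkfs.gfs2 -p lock_dlm -t ${cluster}:${fsname} -j ${journals} ${device}"
}

# Reproduces the working command from this step.
gfs2_mkfs_cmd Cluster_01 vg0 3 /dev/vg0/lv0
```

The cluster name passed in must match the name in /etc/cluster/cluster.conf (Cluster_01 here), and the journal count should be at least the number of nodes that will mount the file system.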
Step 25: Now we need to add the resources
1. Click on Add Resource and then select the IP
2. Then add the GFS file system
3. Now add the script
Step 26: Now add a service group. Add the resources in dependency order (IP > File System > Script) so that the service runs successfully.
Start the Webby service. The command below relocates it to station30.example.com:
#clusvcadm -r Webby -m station30.example.com
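Luci writes the service definition into /etc/cluster/cluster.conf. As an illustration of the dependency order (IP, then file system, then script), the sketch below prints the kind of nested service element that rgmanager reads for such a setup. The virtual IP 192.168.5.100, the resource name webfs and the use of /etc/init.d/httpd as the script are assumptions, not values from this setup; compare against the file Luci actually generated rather than pasting this in.

```shell
#!/bin/bash
# Print a sketch of a nested <service> block: each child starts after its parent.
webby_service_xml() {
    cat <<'EOF'
<service name="Webby" autostart="1">
  <ip address="192.168.5.100" monitor_link="1">            <!-- assumed virtual IP -->
    <clusterfs name="webfs" device="/dev/vg0/lv0"
               mountpoint="/var/www/html" fstype="gfs2">   <!-- assumed resource name -->
      <script name="httpd" file="/etc/init.d/httpd"/>      <!-- assumed init script -->
    </clusterfs>
  </ip>
</service>
EOF
}

webby_service_xml
```

Nesting is what encodes the ordering: the IP comes up first, the GFS2 file system is mounted next, and the Apache script runs last; on stop the order is reversed.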
******* If you are interested in the Qdisk concept, follow the steps below ********
Quorum Disk: Suppose we have a 3-node cluster and two of the three nodes go down. The cluster then cannot achieve quorum, so it will not run. To keep the cluster running on a single node, that node needs extra votes (2 votes in this case), and the quorum disk provides them.
# mkqdisk -c /dev/qdisk-vg/qdisk-lv -l qdisk
Screenshot: setting up the Qdisk configuration in the cluster
On all Nodes:
# /etc/init.d/qdiskd restart
# chkconfig qdiskd on
# cman_tool status
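The vote arithmetic behind the quorum disk can be sketched as follows. It assumes the usual CMAN rule that quorum is half of the expected votes, rounded down, plus one, and it assumes one vote per node and two votes for the quorum disk in a three-node cluster. The vote values are illustrative; compare them with what cman_tool status reports on your cluster.

```shell
#!/bin/bash
# Votes needed for quorum: floor(expected_votes / 2) + 1.
quorum_needed() {
    echo $(( $1 / 2 + 1 ))
}

nodes=3            # one vote per node (assumption)
qdisk_votes=2      # assumption: number of nodes minus one
expected=$(( nodes + qdisk_votes ))
needed=$(quorum_needed "$expected")

# A single surviving node plus the quorum disk:
surviving=$(( 1 + qdisk_votes ))
echo "expected=$expected needed=$needed surviving=$surviving"
if [ "$surviving" -ge "$needed" ]; then
    echo "one node plus qdisk keeps quorum"
fi
```

With these numbers the expected votes are 5 and quorum is 3, so one node (1 vote) together with the quorum disk (2 votes) is just enough to stay quorate.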
Final step: to check that the cluster works as expected, I am going to power off station30, where my Webby application is currently running.
Expected behavior: the Webby application should be relocated to another cluster node, i.e. station20 or station10.
Now I am going to power off station20 as well, to check whether my Qdisk configuration works as expected. Fingers crossed 🙂
Final cman_tool status to understand the voting calculation:
Cheers!