Red Hat Cluster Suite Failover Cluster explained using HTTP as a failover service
Cluster: A group of two or more computers working together to perform the same task. This document explains a Red Hat Cluster implementation using the Conga project, which has two important services running on the base node and the cluster nodes respectively.
LUCI: Luci is the service installed on a separate base node. It provides an admin console with complete functionality to create, configure, and manage the nodes in the cluster.
RICCI: Ricci is the agent service installed on every node in the cluster. Luci talks to ricci on each node, and it is through this service that nodes are joined to the cluster.
CMAN (Cluster Manager): CMAN manages quorum and cluster membership. It is a core component (service) of Red Hat Cluster and is therefore mandatory on every node.
Fencing: Fencing is the mechanism of disconnecting a node from the cluster when it has gone down or become faulty, in order to avoid data corruption and maintain data integrity.
Lock Management: Red Hat Cluster provides lock management through DLM (Distributed Lock Manager). GFS uses locks from the lock manager to synchronize access to shared file system metadata. CLVM uses locks from the lock manager to synchronize updates to LVM volumes and volume groups on shared storage.
The cluster configuration file lives at /etc/cluster/cluster.conf and is an XML file. Cluster resources such as an IP address, a script, or a Red Hat GFS2 file system are defined in this configuration file. The maximum number of nodes supported in a Red Hat Cluster deployment of GFS/GFS2 is 16.
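For reference, a freshly created two-node cluster.conf looks roughly like the sketch below (the cluster name and node IDs match the cluster built later in this document; the config version and exact attributes will differ on a real system):

```xml
<?xml version="1.0"?>
<cluster name="Cluster_01" config_version="1">
  <!-- two_node="1" with expected_votes="1" lets a 2-node cluster stay quorate
       when only one node is up -->
  <cman two_node="1" expected_votes="1"/>
  <clusternodes>
    <clusternode name="station30.example.com" nodeid="1"/>
    <clusternode name="station20.example.com" nodeid="2"/>
  </clusternodes>
  <fencedevices/>
  <rm/>
</cluster>
```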
This document walks through the deployment of an Apache application in Red Hat Cluster.
Servers: For setting up a 2-node cluster we take 2 servers (quad-core, quad-CPU HP machines with a minimum of 4 GB RAM) and 1 base node (same specification) on which we will have luci installed.
Note: You can choose virtual nodes as well, but there are certain limitations with VM fencing.
1. IP Requirement Details:
1. Server 1 - 2 local IPs for bonding (HTTP server) + 1 IP for cluster fencing
2. Server 2 - 2 local IPs for bonding (HTTP server) + 1 IP for cluster fencing
3. Virtual IP - 1 virtual IP for HTTP (the cluster IP)
2. Storage Requirement:
a. One 200 GB SAN LUN for the Apache data (size depends on requirements).
How to Configure Apache Cluster
Step 1: First, install RHEL 5.6 on both servers with the required custom packages.
Step 2 : Configure Network Bonding
The steps for creating the bond are as follows. Create the bond interface file for the public network:
# vim /etc/sysconfig/network-scripts/ifcfg-bond0
DEVICE=bond0
IPADDR=192.168.5.20 [this will be your actual network IP address]
NETMASK=255.255.255.0
GATEWAY=192.168.5.1
USERCTL=no
BOOTPROTO=static
ONBOOT=yes
After creating the bond0 file, modify the eth0 and eth1 files respectively.
Make sure you remove the HW address / IP address / gateway information from both eth0 and eth1, and add the MASTER/SLAVE lines so each interface is enslaved to bond0.

# vim /etc/sysconfig/network-scripts/ifcfg-eth0

Make sure the file reads as follows for the eth0 interface:
DEVICE=eth0
USERCTL=no
ONBOOT=yes
MASTER=bond0
SLAVE=yes
BOOTPROTO=none

# vim /etc/sysconfig/network-scripts/ifcfg-eth1

Make sure the file reads as follows for the eth1 interface:
DEVICE=eth1
USERCTL=no
ONBOOT=yes
MASTER=bond0
SLAVE=yes
BOOTPROTO=none
Load the bonding driver/module
Edit /etc/modprobe.conf and append the following two lines, then save the file:
alias bond0 bonding
options bond0 mode=balance-alb miimon=100
First, load the bonding module:
# modprobe bonding
Restart the networking service to bring up the bond0 interface:
# service network restart
Check whether bonding is actually working or not:
# cat /proc/net/bonding/bond0
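On a healthy bond the output looks something like the sample below (the driver version, MAC addresses, and slave list will differ on your hardware):

```
Ethernet Channel Bonding Driver: v3.4.0

Bonding Mode: adaptive load balancing
Primary Slave: None
Currently Active Slave: eth0
MII Status: up
MII Polling Interval (ms): 100

Slave Interface: eth0
MII Status: up
...
```

The key things to verify are that the bonding mode matches what you set in modprobe.conf and that the MII status is "up" for the bond and for each slave.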
Step 3: Set the hostname on the base node and on both cluster nodes, and add all three to /etc/hosts.
# vim /etc/hosts
192.168.5.20 station20.example.com station20
192.168.5.10 station10.example.com station10
192.168.5.30 station30.example.com station30
Step 4: Set up passwordless SSH authentication between the two cluster nodes, station20.example.com and station30.example.com.
Login on station20 and run:
# ssh-keygen
# ssh-copy-id -i /root/.ssh/id_rsa.pub station30
Login on station30 and run:
# ssh-keygen
# ssh-copy-id -i /root/.ssh/id_rsa.pub station20
Step 5: Set up the yum repository on the base node [station10] and on both servers.
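A minimal repo file pointing at a local copy of the RHEL 5 installation media might look like the sketch below (the baseurl paths are assumptions; adjust them to wherever your packages actually live):

```
# /etc/yum.repos.d/cluster.repo  (illustrative)
[Server]
name=RHEL Server
baseurl=http://192.168.5.10/rhel/Server
enabled=1
gpgcheck=0

[Cluster]
name=RHEL Cluster
baseurl=http://192.168.5.10/rhel/Cluster
enabled=1
gpgcheck=0

[ClusterStorage]
name=RHEL Cluster Storage
baseurl=http://192.168.5.10/rhel/ClusterStorage
enabled=1
gpgcheck=0
```

The Cluster and ClusterStorage directories on the RHEL 5 media carry the luci, ricci, cman, and GFS2 packages used in the following steps.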
Step 6: Make sure iptables and SELinux are disabled on all three machines.
Step 7: Log in on the base node and install luci and the cluster packages:
# yum groupinstall "ClusterStorage" -y
# yum install luci*
# luci_admin init
The luci_admin init command generates an SSL certificate and asks for a password for the admin user. Assign the password; once it returns to the # prompt we can log in at https://192.168.5.10:8084 with username admin and the password we set [assume we gave the password as redhat].
# service luci restart && chkconfig luci on
Step 8: Log in on the other 2 nodes and install ricci:
# yum groupinstall "ClusterStorage" -y
# yum install ricci* -y
# service ricci restart && chkconfig ricci on
Once all the steps above are done, log in at https://192.168.5.10:8084 and start building the cluster.
Step 9: We will use iLO as the fencing device while building this cluster, so add the user ID and password under the iLO configuration option (reachable at boot time) and set the iLO IP address accordingly. Do this on both nodes.
Check whether manual fencing is working or not by logging in on each node:
Step 10: Login on station20 and run command as :
# fence_ilo -a station30 -l admin -p redhat -o reboot
(admin and redhat are the username and password we assigned in the iLO configuration.)
Login on station30 and run:
# fence_ilo -a station20 -l admin -p redhat -o reboot
Step 11: Assign a 200 GB storage LUN for Apache. The 200 GB LUN should be visible on both servers, node1 and node2.
Step 12: Create a clustered LVM volume on the 200 GB LUN.
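A minimal sketch of the LVM creation, assuming the LUN shows up as /dev/sdb on both nodes and using the vg0/lv0 names that appear later in this document:

```
# pvcreate /dev/sdb                  # initialize the LUN as a physical volume
# vgcreate -c y vg0 /dev/sdb         # -c y marks the volume group as clustered
# lvcreate -l 100%FREE -n lv0 vg0    # one logical volume spanning the VG
```

The -c y flag matters: it tags the volume group as clustered so that CLVM (clvmd) coordinates metadata changes across both nodes.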
Step 13: Edit /etc/lvm/lvm.conf and set locking_type = 3 (running lvmconf --enable-cluster does the same thing). This makes LVM cluster-aware so that CLVM can coordinate volume changes across nodes; make sure the clvmd service is running on both nodes.
Step 14: Install Apache on both servers, node1 and node2.
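A quick sketch of the Apache install on each node. Note that httpd should stay disabled at boot, because the cluster's script resource (added later) is what starts and stops it:

```
# yum install httpd -y
# chkconfig httpd off    # rgmanager, not init, will manage httpd
```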
Step 15: Configure the cluster. Log in at https://192.168.5.10:8084
The first step after logging in to the luci console is to create a cluster.
Click Cluster > Create a new cluster and add the node host names and passwords.
Step 16: Enter the cluster name as Cluster_01 and enter both node names with their passwords, as shown in the screenshot below.
Step 17: Click on View SSL fingerprint; it will verify the node's fingerprint, as shown in the screenshot below.
Step 18: Once we click the Submit button, luci will install packages on, reboot, configure, and join each node into the cluster.
Step 19: After the installation succeeds, log in on station20.example.com and station30.example.com and check the cluster status with the clustat and cman_tool commands:
# clustat
Cluster Status for Cluster_01 @ Fri Jun 1 15:56:08 2012
Member Status: Quorate

 Member Name                ID   Status
 ------ ----                ---- ------
 station30.example.com      1    Online
 station20.example.com      2    Online, Local
# cman_tool status
Version: 6.2.0
Config Version: 1
Cluster Name: Cluster_01
Cluster Id: 25517
Cluster Member: Yes
Cluster Generation: 8
Membership state: Cluster-Member
Nodes: 2
Expected votes: 1
Total votes: 2
Quorum: 1
Active subsystems: 9
Flags: 2node Dirty
Ports Bound: 0 11 177
Node name: station20.example.com
Node ID: 2
Multicast addresses: 220.127.116.11
Node addresses: 192.168.5.20
Step 20: The next step is to generate the fence key. Click on Cluster_01 and then click on the Fence option as shown in the screenshot. Tick the fence daemon checkbox, enter the node IP, and click Apply. Once we do that, it creates fence_xvm.key under the /etc/cluster folder.

Step 21: Now we need to add the fence device and reference it under each node. Since the fence device we are adding is a non-shared fencing device, we create it while adding a fence device under the node itself. Click on the node, then click Manage Fencing for this node, then click Add a fence device to this node. We are using HP iLO fencing here.

Step 22: A few settings are required on the iLO as well. To enter the iLO2 configuration, reboot the server, wait for the prompt, and press F8. The first thing to configure is the IP address: go to Network > DNS/DHCP as shown in the visual and set DHCP Enabled to OFF. Then, from the main screen, select Network > NIC and TCP/IP, set Network Interface Adapter to ON, configure the IP address, subnet mask, and gateway, and press F10 to save the changes. The next step is to change/create the user account settings: from the main screen go to User > Add.

Step 23: Click on Cluster, then Failover Domain, and add a failover domain.

Step 24: Format the clustered LVM volume with the GFS2 file system.
/dev/vg0/lv0 is an existing LVM logical volume. Create a file system on /dev/vg0/lv0. Note that the lock table name must be passed with the -t flag; without it mkfs.gfs2 misreads the arguments, as the first attempt below shows:

# mkfs.gfs2 -p lock_dlm Cluster_01:vg0 -j 3 /dev/vg0/lv0
mkfs.gfs2: More than one device specified (try -h for help)

[root@station30 ~]# mkfs.gfs2 -p lock_dlm -t Cluster_01:vg0 -j 3 /dev/vg0/lv0
This will destroy any data on /dev/vg0/lv0.
It appears to contain a gfs filesystem.
Are you sure you want to proceed? [y/n] y
Device:            /dev/vg0/lv0
Blocksize:         4096
Device Size        0.48 GB (126976 blocks)
Filesystem Size:   0.48 GB (126973 blocks)
Journals:          3
Resource Groups:   2
Locking Protocol:  "lock_dlm"
Lock Table:        "Cluster_01:vg0"
UUID:              A4599910-69AF-5814-8FA9-C1F382B7F5E5

# mount /dev/vg0/lv0 /var/www/html/
# gfs2_tool df /dev/mapper/vg0-lv0
Step 25: Now we need to add the resources:
1. Click on Add a Resource and select IP Address (the virtual IP).
2. Then add the GFS file system resource.
3. Then add the script resource (e.g. /etc/init.d/httpd).
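Once the resources are grouped into a service (this document's service is called Webby), the <rm> section of cluster.conf ends up looking roughly like the sketch below. The virtual IP 192.168.5.40, the failover domain name, and the resource names are assumptions for illustration; the device and mount point match the GFS2 volume created above:

```xml
<rm>
  <failoverdomains>
    <failoverdomain name="Apache_Domain" ordered="0" restricted="0">
      <failoverdomainnode name="station20.example.com"/>
      <failoverdomainnode name="station30.example.com"/>
    </failoverdomain>
  </failoverdomains>
  <resources>
    <ip address="192.168.5.40" monitor_link="1"/>
    <clusterfs name="docroot" device="/dev/vg0/lv0"
               mountpoint="/var/www/html" fstype="gfs2"/>
    <script name="httpd" file="/etc/init.d/httpd"/>
  </resources>
  <service name="Webby" domain="Apache_Domain" autostart="1">
    <ip ref="192.168.5.40"/>
    <clusterfs ref="docroot"/>
    <script ref="httpd"/>
  </service>
</rm>
```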
To relocate the Webby service manually to another node, run:
# clusvcadm -r Webby -m station30.example.com
******** If you are interested in the Qdisk concept then follow the steps below ********
Quorum Disk: Suppose we have a 3-node cluster and two of the three nodes go down. The cluster cannot achieve quorum on the one remaining vote, so the cluster will not run. To keep the cluster running on a single node, that node needs extra votes to reach a majority, and the quorum disk provides exactly this voting functionality.
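The vote arithmetic can be sketched as follows (a common convention, assumed here, is to give the qdisk nodes-minus-one votes so that a single surviving node plus the qdisk still holds a majority):

```shell
# Vote arithmetic for a 3-node cluster with a quorum disk (illustrative numbers).
NODE_VOTES=3                      # one vote per cluster node
QDISK_VOTES=2                     # nodes - 1, carried by the quorum disk
EXPECTED=$((NODE_VOTES + QDISK_VOTES))
QUORUM=$((EXPECTED / 2 + 1))      # majority of the expected votes
echo "Expected votes: $EXPECTED"  # 5
echo "Quorum: $QUORUM"            # 3
# A lone surviving node (1 vote) + the qdisk (2 votes) = 3 votes,
# which still meets the quorum of 3, so services keep running.
```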
Create and label the quorum disk on the shared LUN:
# mkqdisk -c /dev/qdisk-vg/qdisk-lv -l qdisk
On all nodes:
# /etc/init.d/qdiskd restart
# chkconfig qdiskd on
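The qdisk also has to be declared in cluster.conf. A minimal sketch (the interval and tko values are illustrative, and votes follows the nodes-minus-one convention described above):

```xml
<quorumd interval="1" tko="10" votes="2" label="qdisk"/>
```

The label must match the one given to mkqdisk -l so qdiskd can find the device on every node.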
Verify the new vote count:
# cman_tool status
As a final test to check that your cluster works as expected, power off station30, where the Webby application is currently running. Expected behavior: the Webby application should relocate to the other cluster node, i.e. station20. Then power off station20 as well to check whether the qdisk configuration works as expected. Fingers crossed 🙂 🙂 Run a final cman_tool status to understand the voting calculation. Cheers!!!!