I-Space Research Labs

Clustering made EZ

by on Jun.02, 2009, under Clustering, Tech Stuff

I’m going to be a little helpful and explain how to do clustering on Red Hat Enterprise Linux. This should also work for Fedora and CentOS, but don’t ask me about other distros; I don’t use them.

First, some definitions. When I say clustering, I’m talking about high availability. Not parallel or cloud computing. Not that tired old joke “Imagine a Beowulf cluster of that! HAWHAWHAW”.

/SLAP! God, I hate Slashdotters.

When we’re talking clustering, we’re talking about making things highly available by throwing two or more computers at providing some kind of service, be it a file share, an application, etc. We’re also not talking about load balancing; that’s something else in the Red Hat world (Piranha).

So let’s imagine that we have a website that we want to make highly available, management was nice enough to throw us two nearly identical servers, and we scrounged up a third box that happens to have some nice disk we can use as shared storage via iSCSI. A fourth box will be our luci server, which we will use to manage the cluster. Doesn’t that sound EXCITING?! (This is where you nod your head yes.)

Kickstart your two Linux boxes. You don’t need to install Clustering or Cluster Storage yet, but you can if you want to during the package installation phase. Being a lazy admin (those are the best), I have all my RPMs copied up to a web server on the network here, so I can just do a ghetto kickstart since I’m having problems with Spacewalk and CentOS. Once Linux is installed, I just make my web server a yum repository; it makes things a lot easier. Hell, kickstart your iSCSI box too while you’re at it.

Once you have the base OS installed and you’re ready to start clustering... make sure that you’ve made a yum repository from the CD contents, because it’s going to save you a LOT of trouble. Go ahead and do it, I’ll wait (hint: createrepo).
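
If you’ve never done it, it goes something like this (the paths and URL here are just examples; point them at wherever you actually copied the RPMs):

# on the web server holding the RPMs (the createrepo package provides the command)
createrepo /var/www/html/rhel5
# repeat for the Cluster / ClusterStorage directories if your media splits them out

# on each cluster node, point yum at it
cat > /etc/yum.repos.d/local-rhel5.repo <<'EOF'
[local-rhel5]
name=Local RHEL5 media
baseurl=http://192.168.1.10/rhel5
enabled=1
gpgcheck=0
EOF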

I’m now going to give you the command that will install everything you need, if you didn’t install Clustering during package selection phase:

yum install -y cluster-cim cluster-snmp rgmanager ricci gfs-utils gnbd kmod-gnbd-PAE kmod-gfs-PAE kmod-gfs2-PAE lvm2-cluster iscsi-initiator-utils

Sit back and listen to some Amon Amarth while yum does its thing. If you’re not going to use iSCSI for shared storage, you can drop iscsi-initiator-utils from the list.
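
Since we’re using iSCSI for the shared storage, each cluster node also needs to log into the target once the storage box is serving it up. Roughly like this (the portal IP and the IQN are made up; use whatever discovery actually returns):

# discover targets offered by the storage box
iscsiadm -m discovery -t sendtargets -p 192.168.1.20

# log in to the target
iscsiadm -m node -T iqn.2009-06.lab.example:shared0 -p 192.168.1.20 --login

# make sure the session comes back after a reboot
chkconfig iscsi on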

On a separate box, install luci. It makes cluster management a snap. Fire it up, and make sure that ricci is running on your cluster boxes. Connect to luci through a web browser and set up your new cluster. Luci needs to be able to ssh into each host as root, but only needs to do this once, so make sure that your iptables is set to allow ssh for the moment and that sshd is allowing root to log in. Once luci has done its thing, you can disable root from being able to log in via ssh and tweak your firewalls if you’re not allowing ssh in.
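
For reference, getting the agents and the management interface running looks roughly like this (luci listens on port 8084 by default, ricci on 11111):

# on each cluster node
chkconfig ricci on
service ricci start

# on the management box
yum install -y luci
luci_admin init        # first run only: sets the luci admin password
chkconfig luci on
service luci start     # then browse to https://your-luci-box:8084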

The trick here is to set up the initial cluster before worrying about storage. You don’t need any services, resources or failover domains set up yet, but it helps to have the cluster formed and in a basic working state before you start layering storage on top of it.

First thing is to set up a small partition on the shared disk for a quorum disk. This will allow you to expand the cluster later on, and it helps solve the “split-brain” problem when you have only 2 nodes in a cluster. It’s amusing the first couple of times watching both nodes fence each other off, but eventually it gets tiresome and you need to stop it. One thing you could do if you’re not using a quorum disk is to add clean_start="1" to the fence_daemon stanza of cluster.conf, which tells the nodes to assume that the other nodes are in a proper state to be in the cluster.
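
If you go that route, the line in /etc/cluster/cluster.conf ends up looking something like this (the delay values shown are just the usual defaults):

<fence_daemon clean_start="1" post_join_delay="3" post_fail_delay="0"/>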

Or you can do it RIGHT and create a quorum disk. You can create the partition in luci if you want, or with fdisk, parted, etc.; just don’t put a filesystem on the partition. Then use mkqdisk to create the quorum disk and label it. The two main arguments to worry about are -c <partition> and -l <label-name>. If /dev/sdb1 is going to be your quorum disk and it will be called quorum-disk, the command would be mkqdisk -c /dev/sdb1 -l quorum-disk
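
Assuming /dev/sdb is the shared disk, the whole dance is only a couple of commands (the device name and size are just examples, and mkqdisk only needs to be run from one node):

# carve out a small partition for the quorum disk -- a few MB is plenty
parted /dev/sdb mkpart primary 0 100MB

# initialize and label it, from ONE node
mkqdisk -c /dev/sdb1 -l quorum-disk

# verify that every node can see it
mkqdisk -L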

In luci, you then go to the quorum disk tab and enter in the information to have all cluster members use it. Every node gets one vote, and the cluster needs more than half of the expected votes to stay quorate, so the quorum disk is there to cast the deciding vote(s), sort of like the Vice President. Give the quorum disk enough votes that a single surviving node plus the quorum disk still adds up to a quorum; the usual rule of thumb is one vote less than the number of nodes. 2 nodes? Give the quorum disk 1 vote: 3 votes total, quorum is 2, and one node plus the disk is exactly 2. 4 nodes? Give it 3 votes: 7 votes total, quorum is 4, and one node plus the disk is 4.

Use the label name rather than the device name, since the disk order could change for any reason.

Easy.

Some of the other things to look at: Interval is how often, in seconds, qdiskd evaluates the nodes. Votes is the number of votes the quorum disk contributes. TKO is the number of intervals a node can miss checking in before it’s Technically Knocked Out by the cluster. If TKO is set to 3 and the interval is 3, then it will be about 9 seconds before a failed node is declared dead and gets fenced.

We won’t worry so much about the minimum score and additional heuristics; just leave the minimum score at 1 for right now.
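
Put together, the quorum disk section of /etc/cluster/cluster.conf ends up looking something like this for our 2-node cluster (the ping heuristic is only an example; point it at something that proves the node still has a network that matters, like the default gateway):

<quorumd interval="3" tko="3" votes="1" label="quorum-disk" min_score="1">
    <heuristic program="ping -c1 -w1 192.168.1.1" score="1" interval="2"/>
</quorumd>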

So now make the GFS partition on whatever is left of your shared storage. You can do this in luci, just watch out for a few things, and this is why we initialized the cluster before assigning storage. Make sure that the content type is GFS2, the locking mechanism is DLM (lock_dlm) and that the number of journals is at least equal to the number of nodes in the cluster. You can add more if you want, if you plan on adding nodes later. But don’t sweat it if you set up only 2 journals, then want to add a third node… you can always add journals later via the command line, as shown below.
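
From the command line, the same thing looks roughly like this (the cluster name, filesystem name and device are placeholders; the -t value has to be <cluster name>:<filesystem name>, matching the cluster you created in luci):

# make the filesystem with 2 journals and DLM locking, from one node
mkfs.gfs2 -p lock_dlm -t webcluster:gfsdata -j 2 /dev/sdb2

# later, when that third node shows up, add a journal to the mounted filesystem
gfs2_jadd -j 1 /mnt/gfs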

Don’t try to put this partition in fstab to be mounted on boot; we’ll let the cluster handle it. It’s better that way. You can manually mount the partition for now to test it. Make some files and check that both nodes can see them.
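
A quick sanity check, assuming the device and mount point from above:

# on node 1
mkdir -p /mnt/gfs
mount -t gfs2 /dev/sdb2 /mnt/gfs
touch /mnt/gfs/hello-from-node1

# on node 2
mkdir -p /mnt/gfs
mount -t gfs2 /dev/sdb2 /mnt/gfs
ls /mnt/gfs        # hello-from-node1 should be sitting there

# unmount on both when you’re done; the cluster handles mounting from here on
umount /mnt/gfs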

Create a failover domain, put your nodes in it.

Create resources! GFS Filesystem, point it to the GFS partition. Give it a mount point that both nodes will use, like /mnt/gfs or something. IP address – use this as a VIP that Apache will be bound to. And a Script resource that points to /etc/init.d/httpd.

Tie it all together with a Service: set the GFS filesystem as a resource, the IP address as a child of it and the Script resource as a child of THAT, and you’re just about done. Make sure that your httpd.conf binds Apache to the VIP and that the DocumentRoot points at the GFS filesystem, throw your content up there, and you’re set to start cman and rgmanager.
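
The home stretch, on each node (the service name at the end is whatever you called the service in luci):

# start the cluster stack and make it stick across reboots
service cman start
service rgmanager start
chkconfig cman on
chkconfig rgmanager on

# watch the cluster and the service come up
clustat

# kick the service off by hand if it doesn’t start on its own
clusvcadm -e webservice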

That’s it in a VERY SMALL nutshell.
