Raymii.org
Corosync Pacemaker - Execute script on failover
Published: 20-11-2013 | Author: Remy van Elst
❗ This post is over eleven years old. It may no longer be up to date. Opinions may have changed.
With Corosync/Pacemaker there is no easy way to simply run a script on failover. There are good reasons for this, but sometimes you want to do something simple. This tutorial describes how to change the Dummy OCF resource to execute a script on failover.
In this example it is a script that triggers a few SNMP traps, sends an alert to Nagios and sends some data to Graphite. SNMP alone could be handled by the ocf:heartbeat:ClusterMon resource, but the rest could not.
This is a very simple way of doing it; I consider it more of a quick hack. For example, the script path is hard-coded. For me that is not a problem because both the script and the Dummy resource are managed via Ansible, so I can change them at any time.
Start by copying the Dummy resource over to a new resource. On Ubuntu the resource files are located here:
/usr/lib/ocf/resource.d/heartbeat/
In there, copy the Dummy file to a new resource, for example FailOverScript.
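The copy step can be sketched as follows. The RA_DIR variable and the copy_agent helper are names made up for this example; the directory is the Ubuntu path mentioned above and may differ on other distributions.

```shell
#!/bin/sh
# Directory holding the heartbeat resource agents (Ubuntu path from this article).
RA_DIR="${RA_DIR:-/usr/lib/ocf/resource.d/heartbeat}"

# copy_agent DIR NAME: hypothetical helper that copies DIR/Dummy to DIR/NAME
# and makes sure the copy is executable.
copy_agent() {
    cp -p "$1/Dummy" "$1/$2" && chmod 755 "$1/$2"
}

if [ -f "$RA_DIR/Dummy" ] && [ -w "$RA_DIR" ]; then
    copy_agent "$RA_DIR" FailOverScript
    ls -l "$RA_DIR/FailOverScript"
else
    echo "Dummy agent not found (or directory not writable): $RA_DIR" >&2
fi
```

Run it as root on a cluster node; on a machine without the resource agents installed it only prints a message.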
If you don't have the Dummy resource, you can also find it in the upstream ClusterLabs resource-agents source (the heartbeat/Dummy file).
Edit the name and description:
Name:
meta_data() {
cat <<END
<?xml version="1.0"?>
<!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
<resource-agent name="FailOverScript" version="0.9">
<version>1.0</version>
Description:
<longdesc lang="en">
Script run on failover
</longdesc>
<shortdesc lang="en">Script run on failover</shortdesc>
Make sure the script you want to execute is placed on the host and is executable (chmod +x /usr/local/bin/failover.sh).
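As a placeholder while setting things up, the script can be as small as the sketch below, which only appends a line to a log file. The FAILOVER_LOG variable, its default path and the log_failover function are assumptions for illustration; the real script from this article (SNMP traps, Nagios, Graphite) is not shown here.

```shell
#!/bin/sh
# Hypothetical minimal /usr/local/bin/failover.sh: records that a start happened.
# FAILOVER_LOG and its default location are made up for this example.
FAILOVER_LOG="${FAILOVER_LOG:-/tmp/failover.log}"

log_failover() {
    # One line per invocation: timestamp plus the node the script ran on.
    printf '%s failover script ran on %s\n' \
        "$(date '+%Y-%m-%d %H:%M:%S')" "$(uname -n)" >> "$FAILOVER_LOG"
}

log_failover
```

Keep the script quick: it runs synchronously inside the start action of the resource.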
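A bit lower in the file, edit the dummy_start function. Add the script path after the closing fi and above the touch ${OCF_RESKEY_state} line. There the script runs only when the resource is actually being started, and it does not overwrite the $? from dummy_monitor that the if checks. Like so:
dummy_start() {
    dummy_monitor
    if [ $? = $OCF_SUCCESS ]; then
        return $OCF_SUCCESS
    fi
    # Run the failover script, then mark the resource as started
    /usr/local/bin/failover.sh
    touch ${OCF_RESKEY_state}
}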
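To see how the start logic behaves without a cluster, here is a self-contained sketch of one possible ordering, in which the script runs only when the monitor reports that the resource is not yet running. Everything in it is a stand-in: the OCF_* values are normally provided by the OCF shell functions, the state file lives in a temporary directory, and failover_script is a hypothetical replacement for /usr/local/bin/failover.sh that just counts its runs.

```shell
#!/bin/sh
# Stand-ins for values the OCF shell functions normally provide (demo assumptions).
OCF_SUCCESS=0
OCF_NOT_RUNNING=7
demo_dir=$(mktemp -d)
OCF_RESKEY_state="$demo_dir/FailOverScript.state"
run_count_file="$demo_dir/runs"

# Hypothetical stand-in for /usr/local/bin/failover.sh: records each run.
failover_script() {
    echo run >> "$run_count_file"
}

# Same idea as the Dummy agent: "running" means the state file exists.
failoverscript_monitor() {
    if [ -f "$OCF_RESKEY_state" ]; then
        return $OCF_SUCCESS
    fi
    return $OCF_NOT_RUNNING
}

failoverscript_start() {
    # Capture the monitor result first so nothing else can overwrite it.
    rc=0
    failoverscript_monitor || rc=$?
    if [ "$rc" = "$OCF_SUCCESS" ]; then
        return $OCF_SUCCESS      # already started, do not run the script again
    fi
    failover_script
    touch "$OCF_RESKEY_state"
}

failoverscript_start    # first start: script runs, state file is created
failoverscript_start    # second start: monitor reports running, script is skipped
echo "script runs: $(wc -l < "$run_count_file" | tr -d ' ')"
```

The counter stays at one after two starts, which is the behaviour you want from a script that should fire once per failover.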
After that has been done, replace all instances of Dummy and dummy with your name of choice:
sed -i 's/Dummy/FailOverScript/g' /usr/lib/ocf/resource.d/heartbeat/FailOverScript
sed -i 's/dummy/failoverscript/g' /usr/lib/ocf/resource.d/heartbeat/FailOverScript
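A dry run on a small sample shows what the two substitutions do before you touch the real agent. The sample text below is a made-up fragment, not the full Dummy agent, and the sketch leaves out -i so the result is printed instead of written back.

```shell
#!/bin/sh
# Small sample standing in for the copied agent file (not the full Dummy agent).
sample=$(mktemp)
cat > "$sample" <<'EOF'
# Dummy resource agent
dummy_start() {
    dummy_monitor
}
EOF

# The same two substitutions, applied globally on every line, printed to stdout.
renamed=$(sed -e 's/Dummy/FailOverScript/g' -e 's/dummy/failoverscript/g' "$sample")
printf '%s\n' "$renamed"
rm -f "$sample"
```

Note that function names change as well: dummy_start becomes failoverscript_start, and so on.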
Test the script using the ocf-tester program to see if you have made any mistakes:
ocf-tester -n resourcename /usr/lib/ocf/resource.d/heartbeat/FailOverScript
Output:
Beginning tests for /usr/lib/ocf/resource.d/heartbeat/FailOverScript...
/usr/sbin/ocf-tester: 214: /usr/sbin/ocf-tester: xmllint: not found
* rc=127: Your agent produces meta-data which does not conform to ra-api-1.dtd
* Your agent does not support the notify action (optional)
* Your agent does not support the demote action (optional)
* Your agent does not support the promote action (optional)
* Your agent does not support master/slave (optional)
Tests failed: /usr/lib/ocf/resource.d/heartbeat/FailOverScript failed 1 tests
Oops. It seems we need xmllint. On Ubuntu, install it:
apt-get install libxml2-utils
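If you want to check for the tool up front, a small test like the one below works. The have_cmd helper is a name invented for this sketch.

```shell
#!/bin/sh
# have_cmd NAME: succeeds when NAME is found in PATH (helper name is made up here).
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd xmllint; then
    echo "xmllint found: $(command -v xmllint)"
else
    echo "xmllint missing; on Ubuntu it is in the libxml2-utils package" >&2
fi
```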
Test again and you'll see that it passes:
Beginning tests for /usr/lib/ocf/resource.d/heartbeat/FailOverScript...
* Your agent does not support the notify action (optional)
* Your agent does not support the demote action (optional)
* Your agent does not support the promote action (optional)
* Your agent does not support master/slave (optional)
/usr/lib/ocf/resource.d/heartbeat/FailOverScript passed all tests
As an extra test, to see if the script you've created is correctly executed, you can do a test start of the resource:
export OCF_ROOT=/usr/lib/ocf
bash -x /usr/lib/ocf/resource.d/heartbeat/FailOverScript start
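The same manual test can be extended to a few actions in a row, printing the exit code of each. This is only a sketch: the AGENT variable and the run_action helper are names made up here, and the agent path is the one used in this article.

```shell
#!/bin/sh
export OCF_ROOT="${OCF_ROOT:-/usr/lib/ocf}"
AGENT="${AGENT:-$OCF_ROOT/resource.d/heartbeat/FailOverScript}"

# run_action AGENT ACTION: hypothetical helper that runs one action
# and prints its exit code.
run_action() {
    rc=0
    "$1" "$2" >/dev/null 2>&1 || rc=$?
    echo "$2 rc=$rc"
}

if [ -x "$AGENT" ]; then
    for action in start monitor stop monitor; do
        run_action "$AGENT" "$action"
    done
else
    echo "agent not found at $AGENT, nothing to run" >&2
fi
```

In the OCF return code convention 0 means success and 7 means not running, so the last monitor, after the stop, would be expected to report 7. Keep in mind that start really executes your failover script.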
To use this resource, add it like so:
crm configure primitive script ocf:heartbeat:FailOverScript op monitor interval="30"
If you want to test it, you can, for example, have the script send you an email. Put a node in standby and check whether the email arrives.
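The standby test could look roughly like this. NODE is a placeholder node name and the run wrapper is invented for this sketch; by default it only prints the commands, so set DRY_RUN=0 on a real cluster node to execute them.

```shell
#!/bin/sh
# NODE is a placeholder node name; DRY_RUN=1 (the default here) only prints the commands.
NODE="${NODE:-node1}"
DRY_RUN="${DRY_RUN:-1}"

# run CMD...: hypothetical wrapper that prints the command and
# executes it unless DRY_RUN is 1.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" != "1" ]; then
        "$@"
    fi
}

run crm node standby "$NODE"   # resources on NODE move to another node
run crm_mon -1                 # one-shot status overview
run crm node online "$NODE"    # bring the node back afterwards
```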