Thursday 25 October 2018

EDB Postgres Failover Manager | EFM 3.2

EDB Postgres Failover Manager (EFM) is a high-availability module from EnterpriseDB that enables a Postgres Master node to automatically failover to a Standby node in the event of a software or hardware failure on the Master.

In this blog I am trying my best to explain how to Install,Configure and Perform [Failover/Switchover] .

You can get EDB's official document from - Link

Supported Platforms  

Failover Manager 3.2 is supported on EDB Postgres Advanced Server or PostgreSQL(version 9.3 and higher) installations running on:

CentOS 6.x and 7.x
Red Hat Enterprise Linux 6.x and 7.x
Oracle Enterprise Linux 6.x and 7.x
Red Hat Enterprise Linux (IBM Power8 Little Endian or ppc64le) 7.x
Debian 9
SLES 12
Ubuntu 18.04

Prerequisites 


1) Install Java-1.8 or later 
2) Configure Streaming Replication 
3) Make sure below ports are open .


iptables -I INPUT -p tcp --dport 7800:7810 -j ACCEPT

Above command opens a small range of ports (7800 through 7810). Failover Manager will connect via the port that corresponds to the port specified in the cluster properties file. 

So ,Now let's start with actual installation and configuration of EFM considering you have a fresh system .

Download EDB's yum repo .

Edit below file 

/etc/yum.repos.d/edb.repo 

Edit the above file and enable the [enterprisedb-tools] and the [enterprisedb-dependencies] entries. To enable a repository, change the value of the enabled parameter to 1 and replace the user name and password placeholders in the baseurl specification with your user name and the repository password.

[enterprisedb-tools]
name=EnterpriseDB Tools $releasever - $basearch
enabled=1
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/ENTERPRISEDB-GPG-KEY

Install EFM 3.2

yum install yum install edb-efm34
Where 

Failover Manager components are installed in the following locations:

Component
Location
Executables
/usr/edb/efm-3.2/bin
Libraries
/usr/edb/efm-3.2/lib
Cluster configuration files
/etc/edb/efm-3.2
Logs
/var/log/efm-3.2
Lock files
/var/lock/efm-3.2
Log rotation file
/etc/logrotate.d/efm-3.2
sudo configuration file
/etc/sudoers.d/efm-32
Binary to access VIP without sudo
/usr/edb/efm-3.2/bin/secure

Configure EDB EFM 

1) Create dedicated user for EFM .
postgres=# create user efmtest with superuser ;
CREATE ROLE

2) You must modify the pg_hba.conf file on the Master and Standby nodes, adding
entries that allow communication between the all of the nodes in the cluster. 

host postgres efmtest 192.168.1.100/32 md5 ##Master
host postgres efmtest 192.168.1.102/32 md5 ##Slave
host postgres efmtest 192.168.1.103/32 md5 ##Witness


3) Encrypt your database user password .

Failover Manager requires you to encrypt your database password before including it in the cluster properties file.

/usr/edb/efm-3.2/bin/efm encrypt efm

Keep copy of this password for some time .

4) Make a configuration files .

After installation two sample files get created i.e 
     
    a) efm.property.in
    b) efm.nodes.in 

Make a copy of this file as 

cp /etc/edb/efm-3.2/efm.properties.in /etc/edb/efm-3.2/efm.properties
cp /etc/edb/efm-3.2/efm.nodes.in /etc/edb/efm-3.2/efm.nodes

Where 
        efm is your cluster name ,If you want you can use another name .

5) I have changed some parameter in efm.properties which is as below .

db.user=efmtest db.password.encrypted=e32f382eea7b5e2a4b413fa475e1e2c3 db.port=5432 db.database=postgres db.service.owner=postgres db.bin=/opt/PostgreSQL/10/bin db.recovery.conf.dir=/opt/PostgreSQL/10/data user.email=tushar.takate@enterprisedb.com bind.address=192.168.0.192:7800 admin.port=7809 is.witness=false stop.isolated.master=false stop.failed.master=false master.shutdown.as.failure=false pingServerIp=8.8.8.8 pingServerCommand=/bin/ping -q -c3 -w5 auto.allow.hosts=true stable.nodes.file=false auto.failover=true auto.reconfigure=true promotable=true recovery.check.period=2 auto.resume.period=5 virtualIp=192.168.0.108 virtualIp.interface=ens33 virtualIp.prefix=24 virtualIp.single=true check.vip.before.promotion=true

Note : 

      *  bind.address 

                The bind.address property specifies the IP address and port number of the agent on the current node of the Failover Manager cluster. 

     * On witness node make sure it should be (is.witness=true) .


6) Make changes into efm.nodes file .

efm.nodes file contains a list of the current Failover Manager cluster members.

Add the addresses and ports of each node in the cluster to this file. One node will act as the membership coordinator; the list should include at least the membership coordinator's address:

efm.nodes entries on master .


192.168.1.103:7800 192.168.1.102:7800

efm.nodes entries  on Slave .

192.168.1.100:7800 192.168.1.103:7800

efm.nodes entries  on Witness .

192.168.1.100:7800 192.168.1.102:7800

7) Start the efm service on witness .

Generally keep sequence of EFM service start as below .

witness - master - slave . 

systemctl start efm-3.2

## After service successfully start execute allow-node command .

[root@witness ~]# efm allow-node efm 192.168.1.100
allow-node signal sent to local agent.
[root@witness ~]# efm allow-node efm 192.168.1.102
allow-node signal sent to local agent.

After witness service start it will looks like 


[root@witness ~]# efm cluster-status efm
Cluster Status: efm


       Agent Type  Address         Agent DB VIP
       -----------------------------------------------------------------------
       Witness    192.168.1.103        UP N/A 192.168.1.108


Allowed node host list:
       192.168.1.103 192.168.1.100 192.168.1.102


Membership coordinator: 192.168.1.103


Standby priority host list:
       (List is empty.)


Promote Status:


Did not find XLog location for any nodes.

After you start all the efm services .It will look like as below .

You can see now VIP is assigned with Master node .


[root@master ~]# efm cluster-status efm
Cluster Status: efm


       Agent Type  Address         Agent DB VIP
       -----------------------------------------------------------------------
       Master    192.168.1.100        UP UP 192.168.1.108*
       Standby    192.168.1.102        UP UP 192.168.1.108
       Witness    192.168.1.103        UP N/A 192.168.1.108


Allowed node host list:
       192.168.1.103 192.168.1.100 192.168.1.102


Membership coordinator: 192.168.1.103


Standby priority host list:
       192.168.1.102


Promote Status:


       DB Type    Address         XLog Loc Info
       --------------------------------------------------------------
       Master    192.168.1.100        0/F5FCEE28
       Standby    192.168.1.102        0/F5FCEE28


       Standby database(s) in sync with master. It is safe to promote.
[root@master ~]#



8 ) Performing Switchover/failover to make old Master as a slave and old salve as a new master .

You can invoke efm promote on any node of a Failover Manager cluster to start a manual promotion of a Standby database to Master database.
  • Command to perform manual failover .
efm promote efm
  • Command to perform manual switchover  .
efm promote efm -switchover

Now i am testing switchover .


[root@slave ~]# efm promote efm -switchover
Promote/switchover command accepted by local agent. Proceeding with promotion and will reconfigure original master. Run the 'cluster-status' command for information about the new cluster state.
[root@slave ~]#

After performing the switchover cluster status will look like as below .And VIP will get assigned with new Master .

[root@slave ~]# efm cluster-status efm
Cluster Status: efm


       Agent Type  Address         Agent DB VIP
       -----------------------------------------------------------------------
       Standby    192.168.1.100        UP UP 192.168.1.108
       Master    192.168.1.102        UP UP 192.168.1.108*
       Witness    192.168.1.103        UP N/A 192.168.1.108


Allowed node host list:
       192.168.1.103 192.168.1.100 192.168.1.102


Membership coordinator: 192.168.1.103


Standby priority host list:
       192.168.1.100


Promote Status:


       DB Type    Address         XLog Loc Info
       --------------------------------------------------------------
       Master    192.168.1.102        0/F60001A8
       Standby    192.168.1.100        0/F60001A8


       Standby database(s) in sync with master. It is safe to promote.
[root@slave ~]#

For more information and different switches/commands of EFM ,Please visit EDB's website .

You can get EDB's official document from - Link