Welcome to my PostgreSQL blogs: 2018

Thursday 25 October 2018

EDB Postgres Failover Manager | EFM 3.2

EDB Postgres Failover Manager (EFM) is a high-availability module from EnterpriseDB that enables a Postgres Master node to automatically failover to a Standby node in the event of a software or hardware failure on the Master.

In this blog I am trying my best to explain how to Install,Configure and Perform [Failover/Switchover] .

You can get EDB's official document from - Link

Supported Platforms

Failover Manager 3.2 is supported on EDB Postgres Advanced Server or PostgreSQL(version 9.3 and higher) installations running on:

CentOS 6.x and 7.x
Red Hat Enterprise Linux 6.x and 7.x

Oracle Enterprise Linux 6.x and 7.x

Red Hat Enterprise Linux (IBM Power8 Little Endian or ppc64le) 7.x

Debian 9

SLES 12

Ubuntu 18.04

Prerequisites

1) Install Java-1.8 or later

2) Configure Streaming Replication

3) Make sure below ports are open .

iptables -I INPUT -p tcp --dport 7800:7810 -j ACCEPT

Above command opens a small range of ports (7800 through 7810). Failover Manager will connect via the port that corresponds to the port specified in the cluster properties file.

So ,Now let's start with actual installation and configuration of EFM considering you have a fresh system .

Download EDB's yum repo .

wget https://yum.enterprisedb.com/edbrepos/edb-repo-latest.noarch.rpm

rpm -ivh edb-repo-latest.noarch.rpm

Edit below file

/etc/yum.repos.d/edb.repo

Edit the above file and enable the [enterprisedb-tools] and the [enterprisedb-dependencies] entries. To enable a repository, change the value of the enabled parameter to 1 and replace the user name and password placeholders in the baseurl specification with your user name and the repository password.

[enterprisedb-tools]

name=EnterpriseDB Tools $releasever - $basearch

baseurl=https://<username>:<password>@yum.enterprisedb.com/tools-testing/redhat/rhel-$releasever-$basearch

enabled=1

gpgcheck=1

gpgkey=file:///etc/pki/rpm-gpg/ENTERPRISEDB-GPG-KEY

Install EFM 3.2

yum install yum install edb-efm34

Where

Failover Manager components are installed in the following locations:

Component	Location
Executables	/usr/edb/efm-3.2/bin
Libraries	/usr/edb/efm-3.2/lib
Cluster configuration files	/etc/edb/efm-3.2
Logs	/var/log/efm-3.2
Lock files	/var/lock/efm-3.2
Log rotation file	/etc/logrotate.d/efm-3.2
sudo configuration file	/etc/sudoers.d/efm-32
Binary to access VIP without sudo	/usr/edb/efm-3.2/bin/secure

Configure EDB EFM

1) Create dedicated user for EFM .

postgres=# create user efmtest with superuser ;

CREATE ROLE

2) You must modify the pg_hba.conf file on the Master and Standby nodes, adding

entries that allow communication between the all of the nodes in the cluster.

host postgres efmtest 192.168.1.100/32 md5 ##Master

host postgres efmtest 192.168.1.102/32 md5 ##Slave

host postgres efmtest 192.168.1.103/32 md5 ##Witness

3) Encrypt your database user password .

Failover Manager requires you to encrypt your database password before including it in the cluster properties file.

/usr/edb/efm-3.2/bin/efm encrypt efm

Keep copy of this password for some time .

4) Make a configuration files .

After installation two sample files get created i.e

a) efm.property.in
b) efm.nodes.in

Make a copy of this file as

cp /etc/edb/efm-3.2/efm.properties.in /etc/edb/efm-3.2/efm.properties
cp /etc/edb/efm-3.2/efm.nodes.in /etc/edb/efm-3.2/efm.nodes

Where

efm is your cluster name ,If you want you can use another name .

5) I have changed some parameter in efm.properties which is as below .

db.user=efmtest db.password.encrypted=e32f382eea7b5e2a4b413fa475e1e2c3 db.port=5432 db.database=postgres db.service.owner=postgres db.bin=/opt/PostgreSQL/10/bin db.recovery.conf.dir=/opt/PostgreSQL/10/data user.email=tushar.takate@enterprisedb.com bind.address=192.168.0.192:7800 admin.port=7809 is.witness=false stop.isolated.master=false stop.failed.master=false master.shutdown.as.failure=false pingServerIp=8.8.8.8 pingServerCommand=/bin/ping -q -c3 -w5 auto.allow.hosts=true stable.nodes.file=false auto.failover=true auto.reconfigure=true promotable=true recovery.check.period=2 auto.resume.period=5 virtualIp=192.168.0.108 virtualIp.interface=ens33 virtualIp.prefix=24 virtualIp.single=true check.vip.before.promotion=true

Note :

* bind.address

The bind.address property specifies the IP address and port number of the agent on the current node of the Failover Manager cluster.

* On witness node make sure it should be (is.witness=true) .

6) Make changes into efm.nodes file .

efm.nodes file contains a list of the current Failover Manager cluster members.

Add the addresses and ports of each node in the cluster to this file. One node will act as the membership coordinator; the list should include at least the membership coordinator's address:

efm.nodes entries on master .

192.168.1.103:7800 192.168.1.102:7800

efm.nodes entries on Slave .

192.168.1.100:7800 192.168.1.103:7800

efm.nodes entries on Witness .

192.168.1.100:7800 192.168.1.102:7800

7) Start the efm service on witness .

Generally keep sequence of EFM service start as below .

witness - master - slave .

systemctl start efm-3.2

## After service successfully start execute allow-node command .

[root@witness ~]# efm allow-node efm 192.168.1.100

allow-node signal sent to local agent.

[root@witness ~]# efm allow-node efm 192.168.1.102

allow-node signal sent to local agent.

After witness service start it will looks like

[root@witness ~]# efm cluster-status efm

Cluster Status: efm

Agent Type Address Agent DB VIP

-----------------------------------------------------------------------

Witness 192.168.1.103 UP N/A 192.168.1.108

Allowed node host list:

192.168.1.103 192.168.1.100 192.168.1.102

Membership coordinator: 192.168.1.103

Standby priority host list:

(List is empty.)

Promote Status:

Did not find XLog location for any nodes.

After you start all the efm services .It will look like as below .

You can see now VIP is assigned with Master node .

[root@master ~]# efm cluster-status efm

Cluster Status: efm

Agent Type Address Agent DB VIP

-----------------------------------------------------------------------

Master 192.168.1.100 UP UP 192.168.1.108*

Standby 192.168.1.102 UP UP 192.168.1.108

Witness 192.168.1.103 UP N/A 192.168.1.108

Allowed node host list:

192.168.1.103 192.168.1.100 192.168.1.102

Membership coordinator: 192.168.1.103

Standby priority host list:

192.168.1.102

Promote Status:

DB Type Address XLog Loc Info

--------------------------------------------------------------

Master 192.168.1.100 0/F5FCEE28

Standby 192.168.1.102 0/F5FCEE28

Standby database(s) in sync with master. It is safe to promote.

[root@master ~]#

8 ) Performing Switchover/failover to make old Master as a slave and old salve as a new master .

You can invoke efm promote on any node of a Failover Manager cluster to start a manual promotion of a Standby database to Master database.

Command to perform manual failover .

efm promote efm

Command to perform manual switchover .

efm promote efm -switchover

Now i am testing switchover .

[root@slave ~]# efm promote efm -switchover

Promote/switchover command accepted by local agent. Proceeding with promotion and will reconfigure original master. Run the 'cluster-status' command for information about the new cluster state.

[root@slave ~]#

After performing the switchover cluster status will look like as below .And VIP will get assigned with new Master .

[root@slave ~]# efm cluster-status efm

Cluster Status: efm

Agent Type Address Agent DB VIP

-----------------------------------------------------------------------

Standby 192.168.1.100 UP UP 192.168.1.108

Master 192.168.1.102 UP UP 192.168.1.108*

Witness 192.168.1.103 UP N/A 192.168.1.108

Allowed node host list:

192.168.1.103 192.168.1.100 192.168.1.102

Membership coordinator: 192.168.1.103

Standby priority host list:

192.168.1.100

Promote Status:

DB Type Address XLog Loc Info

--------------------------------------------------------------

Master 192.168.1.102 0/F60001A8

Standby 192.168.1.100 0/F60001A8

Standby database(s) in sync with master. It is safe to promote.

[root@slave ~]#

For more information and different switches/commands of EFM ,Please visit EDB's website .

You can get EDB's official document from - Link

Thursday 13 September 2018

Mapping pgbouncer with pg_stat_activity

When we do use pgbounecr for connection pooling we do faces difficulties to trace IP of bad/poor performance query .

Here ,In this blog i am trying to get this archive by using PostgreSQL DB link to get more visibility of query from where it is coming

Let's ses how can we do it .

JFI - I have installed pg-bouncer 1.9

Step 1 -

Create DB link extension .

Step 2 -

Create server

Details need to be mentioned of pg-bouncer with pg-bouncer database

Step 3 -

Create user mapping

Step 4 -

Create View which will fetch data by using server which is created in step 2 .

create view pg_bou_stat_activity AS SELECT

_.type ,

_."user" ,

_.database ,

_.state ,

_.addr ,

_.port ,

_.local_addr ,

_.local_port ,

_.connect_time,

_.request_time,

_.wait ,

_.wait_us ,

_.close_needed,

_.ptr ,

_.link ,

_.remote_pid ,

_.tls ,

_.recv_pos ,

_.pkt_pos ,

_.pkt_remain ,

_.send_pos ,

_.send_remain ,

_.pkt_avail ,

_.send_avail

from dblink('pgbouncer'::text, 'show SOCKETS'::text) _(

type text ,

"user" text ,

database text ,

state text ,

addr text ,

port integer ,

local_addr text ,

local_port integer ,

connect_time timestamp with time zone ,

request_time timestamp with time zone ,

wait integer ,

wait_us integer ,

close_needed integer ,

ptr bytea ,

link text ,

remote_pid integer ,

tls text ,

recv_pos integer ,

pkt_pos integer ,

pkt_remain integer ,

send_pos integer ,

send_remain integer ,

pkt_avail integer ,

send_avail integer );

Step 5 -

Execute the below query to check connection details .

select
datname
, usename
, p.state
, case when b.user is not null then 'pgBouncer' else application_name end app
, case when b.user is null then client_addr else addr::inet end ip
, b.user
, b.state "state"
, b.connect_time "pgbouncer_connect"
, query_start
, pid
from pg_stat_activity p
left outer join pg_bou_stat_activity b
on addr||':'||b.port = regexp_replace(p.application_name,'^.{0,}(- )','')
where pid <> pg_backend_pid()

;

You will get below kind of result set .

Monday 12 March 2018

PostgreSQL pg_rewind

Introduction : -

pg_rewind is a tool for synchronizing a PostgreSQL cluster with another copy of the same cluster, after the clusters' timelines have diverged. A typical scenario is to bring an old master server back online after failover as a standby that follows the new master.

The result is equivalent to replacing the target data directory with the source one. Only changed blocks from relation files are copied; all other files are copied in full, including configuration files. The advantage of pg_rewind over taking a new base backup, or tools like rsync, is that pg_rewind does not require reading through unchanged blocks in the cluster. This makes it a lot faster when the database is large and only a small fraction of blocks differ between the clusters.

How it works :-

The basic idea is to copy all file system-level changes from the source cluster to the target cluster

Scan the WAL log of the target cluster, starting from the last checkpoint before the point where the source cluster's timeline history forked off from the target cluster. For each WAL record, record each data block that was touched. This yields a list of all the data blocks that were changed in the target cluster, after the source cluster forked off.

Copy all those changed blocks from the source cluster to the target cluster, either using direct file system access (--source-pgdata) or SQL (--source-server).

Copy all other files such as pg_clog and configuration files from the source cluster to the target cluster (everything except the relation files).

Apply the WAL from the source cluster, starting from the checkpoint created at failover. (Strictly speaking, pg_rewind doesn't apply the WAL, it just creates a backup label file that makes PostgreSQL start by replaying all WAL from that checkpoint forward.)

Configure Replication : -

I have already configured replication between two server [9.6 PostgreSQL community version] .

Here i have my replication setup .

Situation of fail-over :-

DBA error
Fail-over management tool error
Fail-over script get wrong input which cause standby promotion .
While doing fail-over test .

When this kind of situation occurs and we want to re-attach to connect old master as new slave in less time [Bypassing pg_basebackup/rsync] ,pg_rewind is a good solution to build new standby which start following new master .

Use case of pg_rewind :-

Performe fail-over
Lode some data on both server [insert some rows/create any object]
perform pg_rewind so that old master [New standby] start following new master [old replica] .
Check that both server are in sync and checking which was made on old master after fail-over this should get roll-backed .

Step 1 :- Promote standby server

Step 2 : -

Lode some data in New Master

Lode some data in Old master

Now both server are totally out of sync ,Try to bring them in sync and old master server will start following to new master as a standby .

Step 3 : -

Perform pg_rewind to start old master sync with new master .

pg_rewind --target-pgdata=/usr/local/Postgresql9.6/data_master/ --source-server="port=5444 host=Replica_server_ip"

Note : - Before starting new standby ,Make changes in recovery.conf to change IP address and create replication slot on master server .

Step 4: -

Changes get roll-backed from old master which is not required and blocks which are updated those get replayed on new slave