[pgpool-general: 4920] PGPOOL failover issues

Thu Aug 18 06:34:32 JST 2016

Hi Contributors,

I am testing the failover for PGPOOL, I have scenario where I am testing the failover by shutting down the master, Slave is promoted to master.
I am trying to do vice versa, Old primary not able to bring back to original state automatically  through pgpool.

I have followed the following steps.

  *   Current settings Step-1
  *   postgres=# show pool_nodes;
  *    node_id |    hostname    | port | status | lb_weight |  role  | select_cnt
  *   ---------+----------------+------+--------+-----------+--------+------------
  *    0       | 100.114.116.67 | 5432 | 2      | 1.000000  | master | 0
  *    1       | 100.114.116.68 | 5433 | 2      | 0.000000  | slave  | 0
  *   (2 rows)

Step-2

Shutting down the master server  67

sudo service postgresql-9.5 stop
Stopping PostgreSQL 9.5:
waiting for server to shut down.... done
server stopped

Step-3

l_use_sync_map: we cannot use sync map because STREAM: 0 query in progress: 0 doing extended query: 0
2016-08-17 14:25:04: pid 9795: DEBUG:  reading backend data packet kind
2016-08-17 14:25:04: pid 9795: DETAIL:  backend:1 of 2 kind = 'Z'
2016-08-17 14:25:04: pid 9795: DEBUG:  read_kind_from_backend max_count:1.000000 num_executed_nodes:1
2016-08-17 14:25:04: pid 9795: DEBUG:  processing backend response
2016-08-17 14:25:04: pid 9795: DETAIL:  received kind 'Z'(5a) from backend
2016-08-17 14:25:04: pid 9795: DEBUG:  processing backend response
2016-08-17 14:25:04: pid 9795: DETAIL:  Ready For Query received
2016-08-17 14:25:04: pid 9795: DEBUG:  session context: clearing ignore till sync. DONE
2016-08-17 14:25:04: pid 9795: DEBUG:  pool_use_sync_map: we cannot use sync map because STREAM: 0 query in progress: 0 doing extended query: 0
2016-08-17 14:25:04: pid 9795: DEBUG:  pool_use_sync_map: we cannot use sync map because STREAM: 0 query in progress: 0 doing extended query: 0
2016-08-17 14:25:04: pid 9795: DEBUG:  pool_use_sync_map: we cannot use sync map because STREAM: 0 query in progress: 0 doing extended query: 0
2016-08-17 14:25:04: pid 9795: DEBUG:  reading message length
2016-08-17 14:25:04: pid 9795: DETAIL:  slot: 1 length: 5
2016-08-17 14:25:04: pid 9795: DEBUG:  processing ReadyForQuery

Step-4

section to the server was lost. Attempting reset: Succeeded.
postgres=# show pool_nodes;
 node_id |    hostname    | port | status | lb_weight |  role  | select_cnt
---------+----------------+------+--------+-----------+--------+------------
 0       | 100.114.116.67 | 5432 | 3      | 1.000000  | slave  | 0
 1       | 100.114.116.68 | 5433 | 2      | 0.000000  | master | 0
(2 rows)

postgres=#

Step-5

Start the old master 67 again

sudo service postgresql-9.5 start
Starting PostgreSQL 9.5:

WARNING --> PERL_INSTALL_PATH is not set in /opt/PostgreSQL/9.5/etc/sysconfig/plLanguages.config file
WARNING --> PYTHON_INSTALL_PATH is not set in /opt/PostgreSQL/9.5/etc/sysconfig/plLanguages.config file
WARNING --> TCL_INSTALL_PATH is not set in /opt/PostgreSQL/9.5/etc/sysconfig/plLanguages.config file

waiting for server to start.... done
server started

Step-6

Shutting down the current master 68

postgres=# \q
bash-4.1$ /usr/local/pgsql/bin/pg_ctl -D /pgsql/data/9.5/data stop
waiting for server to shut down......... done
server stopped
bash-4.1$

Step-7

Pgpool is down after master shutdown

bash-4.1$ psql -p 9999
psql: ERROR:  pgpool is not accepting any new connections
DETAIL:  all backend nodes are down, pgpool requires at least one valid node
HINT:  repair the backend nodes and restart pgpool
bash-4.1$

lading startup packet
2016-08-17 14:29:15: pid 9965: DETAIL:  application_name: psql
2016-08-17 14:29:15: pid 9965: DEBUG:  reading startup packet
2016-08-17 14:29:15: pid 9965: DETAIL:  Protocol Major: 3 Minor: 0 database: postgres user: postgres
2016-08-17 14:29:15: pid 9965: DEBUG:  forwarding error message to frontend
2016-08-17 14:29:15: pid 9965: FATAL:  pgpool is not accepting any new connections
2016-08-17 14:29:15: pid 9965: DETAIL:  all backend nodes are down, pgpool requires at least one valid node
2016-08-17 14:29:15: pid 9965: HINT:  repair the backend nodes and restart pgpool
2016-08-17 14:29:15: pid 9595: DEBUG:  reaper handler
2016-08-17 14:29:15: pid 9595: LOG:  child process with pid: 9965 exits with status 256
2016-08-17 14:29:15: pid 9595: LOG:  fork a new child process with pid: 9988
2016-08-17 14:29:15: pid 9595: DEBUG:  reaper handler: exiting normally
2016-08-17 14:29:15: pid 9988: DEBUG:  initializing backend status

Step-8

Restarting the pgpool

and restart pgpool
bash-4.1$ exit
exit
[root at ushdbld00034 bin]# ./pgpool -f /usr/local/etc/pgpool.conf stop
2016-08-17 14:30:27: pid 10015: LOG:  stop request sent to pgpool. waiting for termination...
.done.
[1]+  Done                    ./pgpool -f /usr/local/etc/pgpool.conf -n -d > /tmp/PPOOL.log 2>&1
[root at ushdbld00034 bin]#  ./pgpool -f /usr/local/etc/pgpool.conf -n -d > /tmp/PPOOL.log 2>&1 &
[1] 10017

Step-9

Old master is new master now

postgres=# show pool_nodes;
 node_id |    hostname    | port | status | lb_weight |  role  | select_cnt
---------+----------------+------+--------+-----------+--------+------------
 0       | 100.114.116.67 | 5432 | 2      | 1.000000  | master | 0
 1       | 100.114.116.68 | 5433 | 3      | 0.000000  | slave  | 0
(2 rows)

I am not able to bring back the old master through pgpool , Do I need to execute the failover scripts through pgpoo.conf?

Will pgpool would not automatically bring back the old master after shutdown the new master?
I have recovery.conf on both the servers.

Please let me know.

Your help Is much appreciated.

Thanks !
sai

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.sraoss.jp/pipermail/pgpool-general/attachments/20160817/b2b51f75/attachment-0001.html>