View Issue Details

ID: 0000577
Project: Pgpool-II
Category: Bug
View Status: public
Last Update: 2020-03-02 11:21
Reporter: raj.pandey1982@gmail.com
Assigned To: t-ishii
Priority: high
Severity: major
Reproducibility: always
Status: feedback
Resolution: open
Platform: Linux
OS: centos
OS Version: x86_6
Product Version: 4.1.0
Target Version:
Fixed in Version:
Summary: 0000577: Pgpool-II becomes unstable after DB failover
Description: I have 2 PostgreSQL master-slave nodes.
Each node runs PostgreSQL and Pgpool-II 4.1.0 on the same host, one acting as master and the other as standby.
There is 1 virtual IP.


1st scenario :- Pgpool-II's own (watchdog) failover is going well:-

a - Master-to-slave pgpool failover happens and the slave acquires the VIP as the new master.
b - Old-slave-to-old-master failback happens and the old master comes back as the new master; the old slave also comes up as the new slave.
c - VIP delegation happens correctly.
d - The pgAdmin tool is able to reconnect in the same session.
e - The application (frontend) also acquires connections.
f - Connection pooling happens.
g - Read load balancing goes through.
 
2nd scenario :- Automatic DB failover (which is very important):-

a - failover_command worked well when I stopped the master DB (pgpool-poc01.novalocal:5432); the slave DB (pgpool-poc02.novalocal:5432) was opened as the master.

BUT :-

b - The pgpool log shows a great many errors on both the master and slave nodes and never reaches a stable state, so neither pgAdmin nor the application/frontend is able to connect to the newly promoted DB.

Log:-
2020-01-22 12:39:11: pid 7492:DETAIL: postmaster on DB node 0 was shutdown by administrative command
2020-01-22 12:39:11: pid 7492:LOG: received degenerate backend request for node_id: 0 from pid [7492]
2020-01-22 12:39:11: pid 20780:LOG: new IPC connection received
2020-01-22 12:39:11: pid 20780:LOG: watchdog received the failover command from local pgpool-II on IPC interface
2020-01-22 12:39:11: pid 20780:LOG: watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from local pgpool-II on IPC interface
2020-01-22 12:39:11: pid 20780:LOG: we have got the consensus to perform the failover
2020-01-22 12:39:11: pid 20780:DETAIL: 1 node(s) voted in the favor
2020-01-22 12:39:11: pid 20160:LOG: reading and processing packets
2020-01-22 12:39:11: pid 20160:DETAIL: postmaster on DB node 0 was shutdown by administrative command
2020-01-22 12:39:11: pid 20160:LOG: received degenerate backend request for node_id: 0 from pid [20160]
2020-01-22 12:39:11: pid 14003:LOG: reading and processing packets
2020-01-22 12:39:11: pid 14003:DETAIL: postmaster on DB node 0 was shutdown by administrative command
2020-01-22 12:39:11: pid 14003:LOG: received degenerate backend request for node_id: 0 from pid [14003]
2020-01-22 12:39:11: pid 20780:LOG: new IPC connection received
2020-01-22 12:39:11: pid 20780:LOG: new IPC connection received
2020-01-22 12:39:11: pid 20780:LOG: watchdog received the failover command from local pgpool-II on IPC interface
2020-01-22 12:39:11: pid 20780:LOG: watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from local pgpool-II on IPC interface
2020-01-22 12:39:11: pid 20780:LOG: we have got the consensus to perform the failover
2020-01-22 12:39:11: pid 20780:DETAIL: 1 node(s) voted in the favor
2020-01-22 12:39:11: pid 20778:LOG: Pgpool-II parent process has received failover request
2020-01-22 12:39:11: pid 20780:LOG: watchdog received the failover command from local pgpool-II on IPC interface
2020-01-22 12:39:11: pid 20780:LOG: watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from local pgpool-II on IPC interface
2020-01-22 12:39:11: pid 20780:LOG: we have got the consensus to perform the failover
2020-01-22 12:39:11: pid 20780:DETAIL: 1 node(s) voted in the favor
2020-01-22 12:39:11: pid 20780:LOG: new IPC connection received
2020-01-22 12:39:11: pid 20780:LOG: received the failover indication from Pgpool-II on IPC interface
2020-01-22 12:39:11: pid 20780:LOG: watchdog is informed of failover start by the main process
2020-01-22 12:39:11: pid 9422:LOG: reading and processing packets
2020-01-22 12:39:11: pid 9422:DETAIL: postmaster on DB node 0 was shutdown by administrative command
2020-01-22 12:39:11: pid 20778:LOG: starting degeneration. shutdown host pgpool-poc01.novalocal(5432)
2020-01-22 12:39:11: pid 9422:LOG: received degenerate backend request for node_id: 0 from pid [9422]
2020-01-22 12:39:11: pid 17824:LOG: reading and processing packets
2020-01-22 12:39:11: pid 17824:DETAIL: postmaster on DB node 0 was shutdown by administrative command
2020-01-22 12:39:11: pid 17824:LOG: received degenerate backend request for node_id: 0 from pid [17824]
2020-01-22 12:39:11: pid 21172:LOG: reading and processing packets
2020-01-22 12:39:11: pid 21172:DETAIL: postmaster on DB node 0 was shutdown by administrative command
2020-01-22 12:39:11: pid 21172:LOG: received degenerate backend request for node_id: 0 from pid [21172]
2020-01-22 12:39:11: pid 30846:LOG: reading and processing packets
2020-01-22 12:39:11: pid 30846:DETAIL: postmaster on DB node 0 was shutdown by administrative command
2020-01-22 12:39:11: pid 30846:LOG: received degenerate backend request for node_id: 0 from pid [30846]
2020-01-22 12:39:11: pid 13838:LOG: reading and processing packets
2020-01-22 12:39:11: pid 13838:DETAIL: postmaster on DB node 0 was shutdown by administrative command
2020-01-22 12:39:11: pid 13838:LOG: received degenerate backend request for node_id: 0 from pid [13838]
2020-01-22 12:39:11: pid 20780:LOG: new IPC connection received
2020-01-22 12:39:11: pid 20780:LOG: new IPC connection received
2020-01-22 12:39:11: pid 20780:LOG: watchdog received the failover command from local pgpool-II on IPC interface
2020-01-22 12:39:11: pid 20780:LOG: watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from local pgpool-II on IPC interface
2020-01-22 12:39:11: pid 20780:LOG: we have got the consensus to perform the failover
2020-01-22 12:39:11: pid 20780:DETAIL: 1 node(s) voted in the favor
2020-01-22 12:39:11: pid 20780:LOG: new IPC connection received
2020-01-22 12:39:11: pid 17408:LOG: reading and processing packets
2020-01-22 12:39:11: pid 17408:DETAIL: postmaster on DB node 0 was shutdown by administrative command
2020-01-22 12:39:11: pid 20780:LOG: watchdog received the failover command from local pgpool-II on IPC interface
2020-01-22 12:39:11: pid 20780:LOG: watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from local pgpool-II on IPC interface

2020-01-22 12:39:11: pid 3924:LOG: received degenerate backend request for node_id: 0 from pid [3924]
2020-01-22 12:39:11: pid 20780:LOG: new IPC connection received
2020-01-22 12:39:11: pid 20780:LOG: new IPC connection received
2020-01-22 12:39:11: pid 20780:LOG: watchdog received the failover command from local pgpool-II on IPC interface
2020-01-22 12:39:11: pid 20780:LOG: watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from local pgpool-II on IPC interface
2020-01-22 12:39:11: pid 20780:LOG: we have got the consensus to perform the failover
2020-01-22 12:39:11: pid 20780:DETAIL: 1 node(s) voted in the favor
2020-01-22 12:39:11: pid 21539:LOG: reading and processing packets
2020-01-22 12:39:11: pid 21539:DETAIL: postmaster on DB node 0 was shutdown by administrative command
2020-01-22 12:39:11: pid 21539:LOG: received degenerate backend request for node_id: 0 from pid [21539]
2020-01-22 12:39:11: pid 20780:LOG: new IPC connection received
2020-01-22 12:39:11: pid 20780:LOG: watchdog received the failover command from local pgpool-II on IPC interface
2020-01-22 12:39:11: pid 20780:LOG: watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from local pgpool-II on IPC interface
2020-01-22 12:39:11: pid 20780:LOG: we have got the consensus to perform the failover
2020-01-22 12:39:11: pid 20780:DETAIL: 1 node(s) voted in the favor
2020-01-22 12:39:11: pid 20780:LOG: new IPC connection received
2020-01-22 12:39:11: pid 20780:LOG: watchdog received the failover command from local pgpool-II on IPC interface
2020-01-22 12:39:11: pid 20780:LOG: watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from local pgpool-II on IPC interface
2020-01-22 12:39:11: pid 20780:LOG: we have got the consensus to perform the failover
2020-01-22 12:39:11: pid 20780:DETAIL: 1 node(s) voted in the favor
2020-01-22 12:39:11: pid 20780:LOG: new IPC connection received
2020-01-22 12:39:11: pid 20780:LOG: watchdog received the failover command from local pgpool-II on IPC interface
2020-01-22 12:39:11: pid 20780:LOG: watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from local pgpool-II on IPC interface
2020-01-22 12:39:11: pid 20780:LOG: we have got the consensus to perform the failover
2020-01-22 12:39:11: pid 20780:DETAIL: 1 node(s) voted in the favor
2020-01-22 12:39:11: pid 20780:LOG: new IPC connection received
2020-01-22 12:39:11: pid 10792:LOG: reading and processing packets

2020-01-22 12:39:11: pid 20780:DETAIL: 1 node(s) voted in the favor
2020-01-22 12:39:11: pid 5643:LOG: reading and processing packets
2020-01-22 12:39:11: pid 5643:DETAIL: postmaster on DB node 0 was shutdown by administrative command
2020-01-22 12:39:11: pid 20780:LOG: new IPC connection received
2020-01-22 12:39:11: pid 5643:LOG: received degenerate backend request for node_id: 0 from pid [5643]
2020-01-22 12:39:11: pid 20780:LOG: watchdog received the failover command from local pgpool-II on IPC interface
2020-01-22 12:39:11: pid 20780:LOG: watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from local pgpool-II on IPC interface
2020-01-22 12:39:11: pid 20780:LOG: we have got the consensus to perform the failover
2020-01-22 12:39:11: pid 20780:DETAIL: 1 node(s) voted in the favor
2020-01-22 12:39:11: pid 20780:LOG: new IPC connection received
2020-01-22 12:39:11: pid 20780:LOG: watchdog received the failover command from local pgpool-II on IPC interface
2020-01-22 12:39:11: pid 20780:LOG: watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from local pgpool-II on IPC interface
2020-01-22 12:39:11: pid 20780:LOG: we have got the consensus to perform the failover
2020-01-22 12:39:11: pid 20780:DETAIL: 1 node(s) voted in the favor
2020-01-22 12:39:11: pid 7442:LOG: reading and processing packets
2020-01-22 12:39:11: pid 7442:DETAIL: postmaster on DB node 0 was shutdown by administrative command
2020-01-22 12:39:11: pid 7442:LOG: received degenerate backend request for node_id: 0 from pid [7442]
2020-01-22 12:39:11: pid 20780:LOG: new IPC connection received
2020-01-22 12:39:11: pid 20780:LOG: watchdog received the failover command from local pgpool-II on IPC interface
2020-01-22 12:39:11: pid 20780:LOG: watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from local pgpool-II on IPC interface
2020-01-22 12:39:11: pid 20778:LOG: Restart all children
2020-01-22 12:39:11: pid 20780:LOG: we have got the consensus to perform the failover
2020-01-22 12:39:11: pid 20780:DETAIL: 1 node(s) voted in the favor
2020-01-22 12:39:11: pid 20780:LOG: new IPC connection received
2020-01-22 12:39:11: pid 20780:LOG: watchdog received the failover command from local pgpool-II on IPC interface
2020-01-22 12:39:11: pid 20780:LOG: watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from local pgpool-II on IPC interface
2020-01-22 12:39:11: pid 20780:LOG: we have got the consensus to perform the failover
2020-01-22 12:39:11: pid 20780:DETAIL: 1 node(s) voted in the favor
2020-01-22 12:39:11: pid 20778:LOG: Restart all children
2020-01-22 12:39:11: pid 20780:LOG: we have got the consensus to perform the failover
2020-01-22 12:39:11: pid 20780:DETAIL: 1 node(s) voted in the favor
2020-01-22 12:39:11: pid 20780:LOG: new IPC connection received
2020-01-22 12:39:11: pid 20780:LOG: watchdog received the failover command from local pgpool-II on IPC interface
2020-01-22 12:39:11: pid 20780:LOG: watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from local pgpool-II on IPC interface
2020-01-22 12:39:11: pid 20780:LOG: we have got the consensus to perform the failover
2020-01-22 12:39:11: pid 20780:DETAIL: 1 node(s) voted in the favor
2020-01-22 12:39:11: pid 20778:LOG: execute command: /usr/share/pgpool/4.1.0/etc/failover.sh 0 0 pgpool-poc02.novalocal reppassword /installer/postgresql-11.5/data/im_the_master


Authorized Uses Only.All activity may be Monitored and Reported
promote - Start
DEBUG: The script will be executed with the following arguments:
DEBUG: --trigger-file=/installer/postgresql-11.5/data/im_the_master
DEBUG: --standby_file=/installer/postgresql-11.5/data/im_slave
DEBUG: --demote-host=
DEBUG: --user=replication
DEBUG: --password=reppassword
DEBUG: --force
INFO: Checking if standby file exists...
INFO: Checking if trigger file exists...
INFO: Deleting recovery.conf file...
INFO: Checking if postgresql.conf file exists...
INFO: postgresql.conf file found. Checking if it is for primary server...
INFO: postgresql.conf file corresponds to primary server file. Nothing to do.
pg_ctl: server is running (PID: 25960)
/usr/local/pgsql11.5/bin/postgres "-D" "/installer/postgresql-11.5/data"
INFO: Restarting postgresql service...
waiting for server to shut down.... done
server stopped
waiting for server to start....2020-01-22 12:39:12 +03 LOG: listening on IPv4 address "0.0.0.0", port 5432
2020-01-22 12:39:12 +03 LOG: listening on IPv6 address "::", port 5432
2020-01-22 12:39:12 +03 LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
2020-01-22 12:39:12 +03 LOG: redirecting log output to logging collector process
2020-01-22 12:39:12 +03 HINT: Future log output will appear in directory "/dblogs/logs".
 done
server started
pg_ctl: server is running (PID: 23249)
/usr/local/pgsql11.5/bin/postgres "-D" "/installer/postgresql-11.5/data"
INFO: postgresql already running.
INFO: Ensuring replication role and password...
INFO: Replication role found. Ensuring password...
ALTER ROLE
INFO: Creating primary info file...
promote - Done!
2020-01-22 12:39:15: pid 21791:LOG: failed to connect to PostgreSQL server on "pgpool-poc01.novalocal:5432", getsockopt() detected error "Connection refused"
2020-01-22 12:39:15: pid 21791:ERROR: failed to make persistent db connection
2020-01-22 12:39:15: pid 21791:DETAIL: connection to host:"pgpool-poc01.novalocal:5432" failed
2020-01-22 12:39:16: pid 20780:LOG: watchdog received the failover command from remote pgpool-II node "pgpool-poc02.novalocal:5433 Linux pgpool-poc02.novalocal"
2020-01-22 12:39:16: pid 20780:LOG: watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from pgpool-poc02.novalocal:5433 Linux pgpool-poc02.novalocal
2020-01-22 12:39:16: pid 20780:LOG: we have got the consensus to perform the failover
2020-01-22 12:39:16: pid 20780:DETAIL: 1 node(s) voted in the favor
2020-01-22 12:39:16: pid 20780:LOG: invalid degenerate backend request, node id : 0 status: [3] is not valid for failover
2020-01-22 12:39:20: pid 21791:ERROR: Failed to check replication time lag
2020-01-22 12:39:20: pid 21791:DETAIL: No persistent db connection for the node 0
2020-01-22 12:39:20: pid 21791:HINT: check sr_check_user and sr_check_password
2020-01-22 12:39:20: pid 21791:CONTEXT: while checking replication time lag
2020-01-22 12:39:20: pid 21791:LOG: failed to connect to PostgreSQL server on "pgpool-poc01.novalocal:5432", getsockopt() detected error "Connection refused"
2020-01-22 12:39:20: pid 21791:ERROR: failed to make persistent db connection
2020-01-22 12:39:20: pid 21791:DETAIL: connection to host:"pgpool-poc01.novalocal:5432" failed
2020-01-22 12:39:21: pid 20780:LOG: watchdog received the failover command from remote pgpool-II node "pgpool-poc02.novalocal:5433 Linux pgpool-poc02.novalocal"
2020-01-22 12:39:21: pid 20780:LOG: watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from pgpool-poc02.novalocal:5433 Linux pgpool-poc02.novalocal
2020-01-22 12:39:21: pid 20780:LOG: we have got the consensus to perform the failover
2020-01-22 12:39:21: pid 20780:DETAIL: 1 node(s) voted in the favor
2020-01-22 12:39:21: pid 20780:LOG: invalid degenerate backend request, node id : 0 status: [3] is not valid for failover
2020-01-22 12:39:25: pid 21791:ERROR: Failed to check replication time lag
2020-01-22 12:39:25: pid 21791:DETAIL: No persistent db connection for the node 0
2020-01-22 12:39:25: pid 21791:HINT: check sr_check_user and sr_check_password
2020-01-22 12:39:25: pid 21791:CONTEXT: while checking replication time lag
2020-01-22 12:39:25: pid 21791:LOG: failed to connect to PostgreSQL server on "pgpool-poc01.novalocal:5432", getsockopt() detected error "Connection refused"
2020-01-22 12:39:25: pid 21791:ERROR: failed to make persistent db connection
2020-01-22 12:39:25: pid 21791:DETAIL: connection to host:"pgpool-poc01.novalocal:5432" failed
2020-01-22 12:39:26: pid 20780:LOG: watchdog received the failover command from remote pgpool-II node "pgpool-poc02.novalocal:5433 Linux pgpool-poc02.novalocal"
2020-01-22 12:39:26: pid 20780:LOG: watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from pgpool-poc02.novalocal:5433 Linux pgpool-poc02.novalocal
2020-01-22 12:39:26: pid 20780:LOG: we have got the consensus to perform the failover
2020-01-22 12:39:26: pid 20780:DETAIL: 1 node(s) voted in the favor
2020-01-22 12:39:26: pid 20780:LOG: invalid degenerate backend request, node id : 0 status: [3] is not valid for failover
2020-01-22 12:39:30: pid 21791:ERROR: Failed to check replication time lag
2020-01-22 12:39:30: pid 21791:DETAIL: No persistent db connection for the node 0
2020-01-22 12:39:30: pid 21791:HINT: check sr_check_user and sr_check_password
2020-01-22 12:39:30: pid 21791:CONTEXT: while checking replication time lag
2020-01-22 12:39:30: pid 21791:LOG: failed to connect to PostgreSQL server on "pgpool-poc01.novalocal:5432", getsockopt() detected error "Connection refused"
2020-01-22 12:39:30: pid 21791:ERROR: failed to make persistent db connection
2020-01-22 12:39:30: pid 21791:DETAIL: connection to host:"pgpool-poc01.novalocal:5432" failed
2020-01-22 12:39:31: pid 20780:LOG: watchdog received the failover command from remote pgpool-II node "pgpool-poc02.novalocal:5433 Linux pgpool-poc02.novalocal"
2020-01-22 12:39:31: pid 20780:LOG: watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from pgpool-poc02.novalocal:5433 Linux pgpool-poc02.novalocal
2020-01-22 12:39:31: pid 20780:LOG: we have got the consensus to perform the failover
2020-01-22 12:39:31: pid 20780:DETAIL: 1 node(s) voted in the favor
2020-01-22 12:39:31: pid 20780:LOG: invalid degenerate backend request, node id : 0 status: [3] is not valid for failover
2020-01-22 12:39:35: pid 21791:ERROR: Failed to check replication time lag
2020-01-22 12:39:35: pid 21791:DETAIL: No persistent db connection for the node 0
2020-01-22 12:55:41: pid 21791:HINT: check sr_check_user and sr_check_password
2020-01-22 12:55:41: pid 21791:CONTEXT: while checking replication time lag
2020-01-22 12:55:41: pid 21791:LOG: failed to connect to PostgreSQL server on "pgpool-poc01.novalocal:5432", getsockopt() detected error "Connection refused"
2020-01-22 12:55:41: pid 21791:ERROR: failed to make persistent db connection
2020-01-22 12:55:41: pid 21791:DETAIL: connection to host:"pgpool-poc01.novalocal:5432" failed
2020-01-22 12:55:42: pid 20780:LOG: watchdog received the failover command from remote pgpool-II node "pgpool-poc02.novalocal:5433 Linux pgpool-poc02.novalocal"
2020-01-22 12:55:42: pid 20780:LOG: watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from pgpool-poc02.novalocal:5433 Linux pgpool-poc02.novalocal
2020-01-22 12:55:42: pid 20780:LOG: we have got the consensus to perform the failover
2020-01-22 12:55:42: pid 20780:DETAIL: 1 node(s) voted in the favor
2020-01-22 12:55:42: pid 20780:LOG: invalid degenerate backend request, node id : 0 status: [3] is not valid for failover

Tags: No tags attached.

Activities

raj.pandey1982@gmail.com

2020-01-23 00:23

reporter   ~0003083

I am not sure why, even after the failover command has executed and the slave is up as master, the pgpool services do not come back to normal; they keep showing the errors above, and hence remote connections are not happening.
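
For what it is worth, a quick way to confirm from the pgpool host that the promoted backend is really up and out of recovery could be something like this (a minimal sketch, assuming the hostnames from this report and that the postgres user can log in remotely):

    psql -h pgpool-poc02.novalocal -p 5432 -U postgres -c "SELECT pg_is_in_recovery();"
    # expected after a successful promotion: f (false), i.e. the node is now a primary, not a standby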

raj.pandey1982@gmail.com

2020-01-26 17:49

reporter   ~0003084

Please suggest something here if possible; it is a little urgent.

raj.pandey1982@gmail.com

2020-01-26 17:55

reporter   ~0003085

The messages below keep on coming in the log on both nodes:-
2020-01-23 12:14:16: pid 2909:LOG: watchdog received the failover command from remote pgpool-II node "pgpool-poc02.novalocal:5433 Linux pgpool-poc02.novalocal"
2020-01-23 12:14:16: pid 2909:LOG: watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from pgpool-poc02.novalocal:5433 Linux pgpool-poc02.novalocal
2020-01-23 12:14:16: pid 2909:LOG: we have got the consensus to perform the failover
2020-01-23 12:14:16: pid 2909:DETAIL: 1 node(s) voted in the favor
2020-01-23 12:14:16: pid 2909:LOG: received degenerate backend request for node_id: 0 from pid [2909]
2020-01-23 12:14:16: pid 2907:LOG: Pgpool-II parent process has received failover request
2020-01-23 12:14:16: pid 2909:LOG: new IPC connection received
2020-01-23 12:14:16: pid 2909:LOG: received the failover indication from Pgpool-II on IPC interface
2020-01-23 12:14:16: pid 2909:LOG: watchdog is informed of failover start by the main process
2020-01-23 12:14:16: pid 2907:LOG: starting degeneration. shutdown host pgpool-poc01.novalocal(5432)
2020-01-23 12:14:16: pid 2907:LOG: Restart all children
2020-01-23 12:14:16: pid 2907:LOG: execute command: /usr/share/pgpool/4.1.0/etc/failover.sh 0 0 pgpool-poc02.novalocal reppassword /installer/postgresql-11.5/data/im_the_master
Authorized Uses Only.All activity may be Monitored and Reported
promote - Start
DEBUG: The script will be executed with the following arguments:
DEBUG: --trigger-file=/installer/postgresql-11.5/data/im_the_master
DEBUG: --standby_file=/installer/postgresql-11.5/data/im_slave
DEBUG: --demote-host=
DEBUG: --user=replication
DEBUG: --password=reppassword
DEBUG: --force
INFO: Checking if standby file exists...
INFO: Checking if trigger file exists...
INFO: Deleting recovery.conf file...
INFO: Checking if postgresql.conf file exists...
INFO: postgresql.conf file found. Checking if it is for primary server...
INFO: postgresql.conf file corresponds to primary server file. Nothing to do.
pg_ctl: server is running (PID: 30170)
/usr/local/pgsql11.5/bin/postgres "-D" "/installer/postgresql-11.5/data"
INFO: Restarting postgresql service...
waiting for server to shut down.... done
server stopped
waiting for server to start....2020-01-23 12:14:17 +03 LOG: listening on IPv4 address "0.0.0.0", port 5432
2020-01-23 12:14:17 +03 LOG: listening on IPv6 address "::", port 5432
2020-01-23 12:14:17 +03 LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
2020-01-23 12:14:17 +03 LOG: redirecting log output to logging collector process
2020-01-23 12:14:17 +03 HINT: Future log output will appear in directory "/dblogs/logs".
 done
server started
pg_ctl: server is running (PID: 31964)
/usr/local/pgsql11.5/bin/postgres "-D" "/installer/postgresql-11.5/data"
INFO: postgresql already running.
INFO: Ensuring replication role and password...
INFO: Replication role found. Ensuring password...
ALTER ROLE
INFO: Creating primary info file...
promote - Done!
2020-01-23 12:14:18: pid 3928:LOG: failed to connect to PostgreSQL server on "pgpool-poc01.novalocal:5432", getsockopt() detected error "Connection refused"
2020-01-23 12:14:18: pid 3928:ERROR: failed to make persistent db connection
2020-01-23 12:14:18: pid 3928:DETAIL: connection to host:"pgpool-poc01.novalocal:5432" failed
2020-01-23 12:14:21: pid 2909:LOG: watchdog received the failover command from remote pgpool-II node "pgpool-poc02.novalocal:5433 Linux pgpool-poc02.novalocal"
2020-01-23 12:14:21: pid 2909:LOG: watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from pgpool-poc02.novalocal:5433 Linux pgpool-poc02.novalocal
2020-01-23 12:14:21: pid 2909:LOG: we have got the consensus to perform the failover
2020-01-23 12:14:21: pid 2909:DETAIL: 1 node(s) voted in the favor
2020-01-23 12:14:21: pid 2909:LOG: invalid degenerate backend request, node id : 0 status: [3] is not valid for failover
2020-01-23 12:14:23: pid 3928:ERROR: Failed to check replication time lag
2020-01-23 12:14:23: pid 3928:DETAIL: No persistent db connection for the node 0
2020-01-23 12:14:23: pid 3928:HINT: check sr_check_user and sr_check_password
2020-01-23 12:14:23: pid 3928:CONTEXT: while checking replication time lag
2020-01-23 12:14:23: pid 3928:LOG: failed to connect to PostgreSQL server on "pgpool-poc01.novalocal:5432", getsockopt() detected error "Connection refused"
2020-01-23 12:14:23: pid 3928:ERROR: failed to make persistent db connection
2020-01-23 12:14:23: pid 3928:DETAIL: connection to host:"pgpool-poc01.novalocal:5432" failed
2020-01-23 12:14:26: pid 2909:LOG: watchdog received the failover command from remote pgpool-II node "pgpool-poc02.novalocal:5433 Linux pgpool-poc02.novalocal"
2020-01-23 12:14:26: pid 2909:LOG: watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from pgpool-poc02.novalocal:5433 Linux pgpool-poc02.novalocal
2020-01-23 12:14:26: pid 2909:LOG: we have got the consensus to perform the failover
2020-01-23 12:14:26: pid 2909:DETAIL: 1 node(s) voted in the favor
2020-01-23 12:14:26: pid 2909:LOG: invalid degenerate backend request, node id : 0 status: [3] is not valid for failover
2020-01-23 12:14:28: pid 3928:ERROR: Failed to check replication time lag
2020-01-23 12:14:28: pid 3928:DETAIL: No persistent db connection for the node 0
2020-01-23 12:14:28: pid 3928:HINT: check sr_check_user and sr_check_password
2020-01-23 12:14:28: pid 3928:CONTEXT: while checking replication time lag
2020-01-23 12:14:28: pid 3928:LOG: failed to connect to PostgreSQL server on "pgpool-poc01.novalocal:5432", getsockopt() detected error "Connection refused"
2020-01-23 12:14:28: pid 3928:ERROR: failed to make persistent db connection
2020-01-23 12:14:28: pid 3928:DETAIL: connection to host:"pgpool-poc01.novalocal:5432" failed

raj.pandey1982@gmail.com

2020-01-26 18:20

reporter   ~0003086

What I simply want here is that when I bring the master DB down on node 1, the pgpool instances running on nodes 1 and 2 should switch over to the slave DB on node 2, which has now become the master. That is it.
But it looks like pgpool is still searching for the old master DB on node 1, which is down, and is not syncing with node 2, which became the master after the failover script ran successfully.
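
To illustrate what I expect, pgpool's own view of the backends can be checked right after the failover (a sketch, assuming the pgpool port 5433, the delegate IP and the pcp port 9898 from my configuration):

    psql -h 10.70.184.29 -p 5433 -U postgres -c "show pool_nodes;"
    # node 0 (pgpool-poc01) should show status "down" and node 1 (pgpool-poc02) status "up" with role "primary"
    pcp_node_info -h localhost -p 9898 -U postgres 1
    # should likewise report node 1 as up / primary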

raj.pandey1982@gmail.com

2020-01-27 17:41

reporter   ~0003087

Hello team, any update on this?

raj.pandey1982@gmail.com

2020-01-27 17:45

reporter   ~0003088

I don't know why, once the promotion is done, pgpool still keeps searching for the old node and does not allow stable remote connections:-
Log:-
promote - Done!
2020-01-23 12:14:18: pid 3928:LOG: failed to connect to PostgreSQL server on "pgpool-poc01.novalocal:5432", getsockopt() detected error "Connection refused"
2020-01-23 12:14:18: pid 3928:ERROR: failed to make persistent db connection
2020-01-23 12:14:18: pid 3928:DETAIL: connection to host:"pgpool-poc01.novalocal:5432" failed
2020-01-23 12:14:21: pid 2909:LOG: watchdog received the failover command from remote pgpool-II node "pgpool-poc02.novalocal:5433 Linux pgpool-poc02.novalocal"
2020-01-23 12:14:21: pid 2909:LOG: watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from pgpool-poc02.novalocal:5433 Linux pgpool-poc02.novalocal
2020-01-23 12:14:21: pid 2909:LOG: we have got the consensus to perform the failover
2020-01-23 12:14:21: pid 2909:DETAIL: 1 node(s) voted in the favor
2020-01-23 12:14:21: pid 2909:LOG: invalid degenerate backend request, node id : 0 status: [3] is not valid for failover
2020-01-23 12:14:23: pid 3928:ERROR: Failed to check replication time lag
2020-01-23 12:14:23: pid 3928:DETAIL: No persistent db connection for the node 0
2020-01-23 12:14:23: pid 3928:HINT: check sr_check_user and sr_check_password
2020-01-23 12:14:23: pid 3928:CONTEXT: while checking replication time lag
2020-01-23 12:14:23: pid 3928:LOG: failed to connect to PostgreSQL server on "pgpool-poc01.novalocal:5432", getsockopt() detected error "Connection refused"
2020-01-23 12:14:23: pid 3928:ERROR: failed to make persistent db connection
2020-01-23 12:14:23: pid 3928:DETAIL: connection to host:"pgpool-poc01.novalocal:5432" failed
2020-01-23 12:14:26: pid 2909:LOG: watchdog received the failover command from remote pgpool-II node "pgpool-poc02.novalocal:5433 Linux pgpool-poc02.novalocal"
2020-01-23 12:14:26: pid 2909:LOG: watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from pgpool-poc02.novalocal:5433 Linux pgpool-poc02.novalocal
2020-01-23 12:14:26: pid 2909:LOG: we have got the consensus to perform the failover
2020-01-23 12:14:26: pid 2909:DETAIL: 1 node(s) voted in the favor
2020-01-23 12:14:26: pid 2909:LOG: invalid degenerate backend request, node id : 0 status: [3] is not valid for failover
2020-01-23 12:14:28: pid 3928:ERROR: Failed to check replication time lag
2020-01-23 12:14:28: pid 3928:DETAIL: No persistent db connection for the node 0
2020-01-23 12:14:28: pid 3928:HINT: check sr_check_user and sr_check_password
2020-01-23 12:14:28: pid 3928:CONTEXT: while checking replication time lag
2020-01-23 12:14:28: pid 3928:LOG: failed to connect to PostgreSQL server on "pgpool-poc01.novalocal:5432", getsockopt() detected error "Connection refused"
2020-01-23 12:14:28: pid 3928:ERROR: failed to make persistent db connection
2020-01-23 12:14:28: pid 3928:DETAIL: connection to host:"pgpool-poc01.novalocal:5432" failed

raj.pandey1982@gmail.com

2020-01-27 19:21

reporter   ~0003089

search_primary_node_timeout = 10 is also not working in my case to avoid the error below recurring again and again, as that is a streaming replication feature while mine is master-slave replication:-

"2020-01-23 12:14:18: pid 3928:LOG: failed to connect to PostgreSQL server on "pgpool-poc01.novalocal:5432", getsockopt() detected error "Connection refused""

raj.pandey1982@gmail.com

2020-01-27 19:25

reporter   ~0003090

Is there any way the parameter "search_primary_node_timeout = 10" can be made effective in master/slave mode too, so that pgpool stops searching for the old, down backend master (node 1) and starts accepting remote connections?
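
For reference, once the old master on node 1 is rebuilt as a standby of the new primary, I understand it also has to be re-attached to pgpool explicitly; a rough sketch of that (assuming pg_basebackup for the rebuild and the pcp port 9898 / postgres user from my configuration):

    # on pgpool-poc01, with PostgreSQL stopped and the old data directory cleared first (assumption)
    pg_basebackup -h pgpool-poc02.novalocal -U replication -D /installer/postgresql-11.5/data -X stream -R
    pg_ctl -D /installer/postgresql-11.5/data start
    # then tell pgpool to put node 0 back into the pool as a standby
    pcp_attach_node -h localhost -p 9898 -U postgres 0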

t-ishii

2020-01-28 18:22

developer   ~0003094

I need to check your configuration. Can you share pgpool.conf?

raj.pandey1982@gmail.com

2020-01-28 19:01

reporter   ~0003095

Master node pgpool.conf:-
[root@pgpool-poc01 postgresql]# cat /usr/share/pgpool/4.1.0/etc/pgpool.conf
# ----------------------------
# pgPool-II configuration file
# ----------------------------
#
# This file consists of lines of the form:
#
# name = value
#
# Whitespace may be used. Comments are introduced with "#" anywhere on a line.
# The complete list of parameter names and allowed values can be found in the
# pgPool-II documentation.
#
# This file is read on server startup and when the server receives a SIGHUP
# signal. If you edit the file on a running system, you have to SIGHUP the
# server for the changes to take effect, or use "pgpool reload". Some
# parameters, which are marked below, require a server shutdown and restart to
# take effect.
#


#------------------------------------------------------------------------------
# CONNECTIONS
#------------------------------------------------------------------------------

# - pgpool Connection Settings -

listen_addresses = '*'
                                   # Host name or IP address to listen on:
                                   # '*' for all, '' for no TCP/IP connections
                                   # (change requires restart)
port = 5433
                                   # Port number
                                   # (change requires restart)
socket_dir = '/var/run/postgresql'
                                   # Unix domain socket path
                                   # The Debian package defaults to
                                   # /var/run/postgresql
                                   # (change requires restart)
listen_backlog_multiplier = 2
                                   # Set the backlog parameter of listen(2) to
                                   # num_init_children * listen_backlog_multiplier.
                                   # (change requires restart)
serialize_accept = off
                                   # whether to serialize accept() call to avoid thundering herd problem
                                   # (change requires restart)

# - pgpool Communication Manager Connection Settings -

pcp_listen_addresses = '*'
                                   # Host name or IP address for pcp process to listen on:
                                   # '*' for all, '' for no TCP/IP connections
                                   # (change requires restart)
pcp_port = 9898
                                   # Port number for pcp
                                   # (change requires restart)
pcp_socket_dir = '/var/run/postgresql'
                                   # Unix domain socket path for pcp
                                   # The Debian package defaults to
                                   # /var/run/postgresql
                                   # (change requires restart)

# - Backend Connection Settings -

                                   # Host name or IP address to connect to for backend 0
                                   # Port number for backend 0
                                   # Weight for backend 0 (only in load balancing mode)
                                   # Data directory for backend 0
                                   # Controls various backend behavior
                                   # ALLOW_TO_FAILOVER, DISALLOW_TO_FAILOVER
                                   # or ALWAYS_MASTER

# - Authentication -

enable_pool_hba = on
                                   # Use pool_hba.conf for client authentication
pool_passwd = 'pool_passwd'
                                   # File name of pool_passwd for md5 authentication.
                                   # "" disables pool_passwd.
                                   # (change requires restart)
authentication_timeout = 60
                                   # Delay in seconds to complete client authentication
                                   # 0 means no timeout.

allow_clear_text_frontend_auth = off
                                   # Allow Pgpool-II to use clear text password authentication
                                   # with clients, when pool_passwd does not
                                   # contain the user password


# - SSL Connections -

ssl = off
                                   # Enable SSL support
                                   # (change requires restart)
#ssl_key = './server.key'
                                   # Path to the SSL private key file
                                   # (change requires restart)
#ssl_cert = './server.cert'
                                   # Path to the SSL public certificate file
                                   # (change requires restart)
#ssl_ca_cert = ''
                                   # Path to a single PEM format file
                                   # containing CA root certificate(s)
                                   # (change requires restart)
#ssl_ca_cert_dir = ''
                                   # Directory containing CA root certificate(s)
                                   # (change requires restart)

ssl_ciphers = 'HIGH:MEDIUM:+3DES:!aNULL'
                                   # Allowed SSL ciphers
                                   # (change requires restart)
ssl_prefer_server_ciphers = off
                                   # Use server's SSL cipher preferences,
                                   # rather than the client's
                                   # (change requires restart)
#------------------------------------------------------------------------------
# POOLS
#------------------------------------------------------------------------------

# - Concurrent session and pool size -

num_init_children = 1000
                                   # Number of concurrent sessions allowed
                                   # (change requires restart)
max_pool = 10
#max_pool = 4
                                   # Number of connection pool caches per connection
                                   # (change requires restart)

# - Life time -

child_life_time = 300
                                   # Pool exits after being idle for this many seconds
child_max_connections = 0
                                   # Pool exits after receiving that many connections
                                   # 0 means no exit
connection_life_time = 0
                                   # Connection to backend closes after being idle for this many seconds
                                   # 0 means no close
#client_idle_limit = 0
                                   # Client is disconnected after being idle for that many seconds
                                   # (even inside an explicit transactions!)
                                   # 0 means no disconnection

reserved_connections = 1
#------------------------------------------------------------------------------
# LOGS
#------------------------------------------------------------------------------

# - Where to log -

log_destination = 'stderr'
                                   # Where to log
                                   # Valid values are combinations of stderr,
                                   # and syslog. Default to stderr.

# - What to log -

log_line_prefix = '%t: pid %p:'

log_connections = off
                                   # Log connections
log_hostname = off
                                   # Hostname will be shown in ps status
                                   # and in logs if connections are logged
log_statement = off
                                   # Log all statements
log_per_node_statement = off
                                   # Log all statements
                                   # with node and backend informations
log_client_messages = off
                                   # Log any client messages
log_standby_delay = 'none'
                                   # Log standby delay
                                   # Valid values are combinations of always,
                                   # if_over_threshold, none

# - Syslog specific -

syslog_facility = 'LOCAL0'
                                   # Syslog local facility. Default to LOCAL0
syslog_ident = 'pgpool'
                                   # Syslog program identification string
                                   # Default to 'pgpool'

# - Debug -

#log_error_verbosity = default # terse, default, or verbose messages

#client_min_messages = notice # values in order of decreasing detail:
                                        # debug5
                                        # debug4
                                        # debug3
                                        # debug2
                                        # debug1
                                        # log
                                        # notice
                                        # warning
                                        # error

#log_min_messages = warning # values in order of decreasing detail:
                                        # debug5
                                        # debug4
                                        # debug3
                                        # debug2
                                        # debug1
                                        # info
                                        # notice
                                        # warning
                                        # error
                                        # log
                                        # fatal
                                        # panic

#------------------------------------------------------------------------------
# FILE LOCATIONS
#------------------------------------------------------------------------------

pid_file_name = '/var/run/postgresql/pgpool.pid'
                                   # PID file name
                                   # Can be specified as relative to the
                                   # location of pgpool.conf file or
                                   # as an absolute path
                                   # (change requires restart)
logdir = '/var/log/pgpool'
                                   # Directory of pgPool status file
                                   # (change requires restart)


#------------------------------------------------------------------------------
# CONNECTION POOLING
#------------------------------------------------------------------------------

connection_cache = on
                                   # Activate connection pools
                                   # (change requires restart)

                                   # Semicolon separated list of queries
                                   # to be issued at the end of a session
                                   # The default is for 8.3 and later
reset_query_list = 'ABORT; DISCARD ALL'
                                   # The following one is for 8.2 and before
#reset_query_list = 'ABORT; RESET ALL; SET SESSION AUTHORIZATION DEFAULT'


#------------------------------------------------------------------------------
# REPLICATION MODE
#------------------------------------------------------------------------------

replication_mode = off
                                   # Activate replication mode
                                   # (change requires restart)
replicate_select = off
                                   # Replicate SELECT statements
                                   # when in replication mode
                                   # replicate_select is higher priority than
                                   # load_balance_mode.

insert_lock = on
                                   # Automatically locks a dummy row or a table
                                   # with INSERT statements to keep SERIAL data
                                   # consistency
                                   # Without SERIAL, no lock will be issued
lobj_lock_table = ''
                                   # When rewriting lo_creat command in
                                   # replication mode, specify table name to
                                   # lock

# - Degenerate handling -

replication_stop_on_mismatch = off
                                   # On disagreement with the packet kind
                                   # sent from backend, degenerate the node
                                   # which is most likely "minority"
                                   # If off, just force to exit this session

failover_if_affected_tuples_mismatch = off
                                   # On disagreement with the number of affected
                                   # tuples in UPDATE/DELETE queries, then
                                   # degenerate the node which is most likely
                                   # "minority".
                                   # If off, just abort the transaction to
                                   # keep the consistency


#------------------------------------------------------------------------------
# LOAD BALANCING MODE
#------------------------------------------------------------------------------

load_balance_mode = on
                                   # Activate load balancing mode
                                   # (change requires restart)
ignore_leading_white_space = on
                                   # Ignore leading white spaces of each query
white_function_list = ''
                                   # Comma separated list of function names
                                   # that don't write to database
                                   # Regexp are accepted
black_function_list = 'Get_Appt_code,walkin_appointment_token_no,findAppointmentFacilityTypeIdNew,findAppointmentReferUrgencyType,walkin_appointment_token_no,walkin_appointment_token_no'
black_function_list = 'currval,lastval,nextval,setval'
                                   # Comma separated list of function names
                                   # that write to database
                                   # Regexp are accepted

black_query_pattern_list = ''
                                   # Semicolon separated list of query patterns
                                   # that should be sent to primary node
                                   # Regexp are accepted
                                   # valid for streaming replication mode only.

database_redirect_preference_list = ''
                                   # comma separated list of pairs of database and node id.
                                   # example: postgres:primary,mydb[0-4]:1,mydb[5-9]:2'
                                   # valid for streaming replication mode only.
app_name_redirect_preference_list = ''
                                   # comma separated list of pairs of app name and node id.
                                   # example: 'psql:primary,myapp[0-4]:1,myapp[5-9]:standby'
                                   # valid for streaming replication mode only.
allow_sql_comments = off
                                   # if on, ignore SQL comments when judging if load balance or
                                   # query cache is possible.
                                   # If off, SQL comments effectively prevent the judgment
                                   # (pre 3.4 behavior).

disable_load_balance_on_write = 'transaction'
                                   # Load balance behavior when write query is issued
                                   # in an explicit transaction.
                                   # Note that any query not in an explicit transaction
                                   # is not affected by the parameter.
                                   # 'transaction' (the default): if a write query is issued,
                                   # subsequent read queries will not be load balanced
                                   # until the transaction ends.
                                   # 'trans_transaction': if a write query is issued,
                                   # subsequent read queries in an explicit transaction
                                   # will not be load balanced until the session ends.
                                   # 'always': if a write query is issued, read queries will
                                   # not be load balanced until the session ends.

#------------------------------------------------------------------------------
# MASTER/SLAVE MODE
#------------------------------------------------------------------------------

master_slave_mode = on
                                   # Activate master/slave mode
                                   # (change requires restart)
master_slave_sub_mode = 'stream'
                                   # Master/slave sub mode
                                   # Valid values are combinations stream, slony
                                   # or logical. Default is stream.
                                   # (change requires restart)

# - Streaming -

sr_check_period = 5 #0
                                   # Streaming replication check period
                                   # Disabled (0) by default
sr_check_user = 'replication'
                                   # Streaming replication check user
                                   # This is necessary even if you disable
                                   # streaming replication delay check with
                                   # sr_check_period = 0

sr_check_password = 'reppassword'
                                   # Password for streaming replication check user.
                                   # Leaving it empty will make Pgpool-II to first look for the
                                   # Password in pool_passwd file before using the empty password
sr_check_database = 'mawidstg01'
#sr_check_database = 'postgres'
                                   # Database name for streaming replication check
delay_threshold = 0
                                   # Threshold before not dispatching query to standby node
                                   # Unit is in bytes
                                   # Disabled (0) by default

# - Special commands -

follow_master_command = ''
                                   # Executes this command after master failover
                                   # Special values:
                                   # %d = node id
                                   # %h = host name
                                   # %p = port number
                                   # %D = database cluster path
                                   # %m = new master node id
                                   # %H = hostname of the new master node
                                   # %M = old master node id
                                   # %P = old primary node id
                                   # %r = new master port number
                                   # %R = new master database cluster path
                                   # %% = '%' character

#------------------------------------------------------------------------------
# HEALTH CHECK GLOBAL PARAMETERS
#------------------------------------------------------------------------------

health_check_period = 5
                                   # Health check period
                                   # Disabled (0) by default
health_check_timeout = 10
#health_check_timeout = 0
                                   # Health check timeout
                                   # 0 means no timeout
health_check_user = 'postgres'
                                   # Health check user
health_check_password = 'TEXTpostgrestg'
                                   # Password for health check user
                                   # Leaving it empty will make Pgpool-II to first look for the
                                   # Password in pool_passwd file before using the empty password

health_check_database = 'mawidstg01'
                                   # Database name for health check. If '', tries 'postgres' first, then 'template1'

health_check_max_retries = 0
                                   # Maximum number of times to retry a failed health check before giving up.
health_check_retry_delay = 1
                                   # Amount of time to wait (in seconds) between retries.
connect_timeout = 10000
                                   # Timeout value in milliseconds before giving up to connect to backend.
                                   # Default is 10000 ms (10 second). Flaky network user may want to increase
                                   # the value. 0 means no timeout.
                                   # Note that this value is not only used for health check,
                                   # but also for ordinary connection to backend.

#------------------------------------------------------------------------------
# HEALTH CHECK PER NODE PARAMETERS (OPTIONAL)
#------------------------------------------------------------------------------
#health_check_period0 = 0
#health_check_timeout0 = 20
#health_check_user0 = 'nobody'
#health_check_password0 = ''
#health_check_database0 = ''
#health_check_max_retries0 = 0
#health_check_retry_delay0 = 1
#connect_timeout0 = 10000

#------------------------------------------------------------------------------
# FAILOVER AND FAILBACK
#------------------------------------------------------------------------------
failover_command = '/usr/share/pgpool/4.1.0/etc/failover.sh %d %P %H reppassword /installer/postgresql-11.5/data/im_the_master'
#failover_command = '/etc/pgpool-II/failover.sh %d %P %H reppassword /installer/postgresql-11.5/data/im_the_master'
                                   # Executes this command at failover
                                   # Special values:
                                   # %d = node id
                                   # %h = host name
                                   # %p = port number
                                   # %D = database cluster path
                                   # %m = new master node id
                                   # %H = hostname of the new master node
                                   # %M = old master node id
                                   # %P = old primary node id
                                   # %r = new master port number
                                   # %R = new master database cluster path
                                   # %% = '%' character
failback_command = ''
                                   # Executes this command at failback.
                                   # Special values:
                                   # %d = node id
                                   # %h = host name
                                   # %p = port number
                                   # %D = database cluster path
                                   # %m = new master node id
                                   # %H = hostname of the new master node
                                   # %M = old master node id
                                   # %P = old primary node id
                                   # %r = new master port number
                                   # %R = new master database cluster path
                                   # %% = '%' character

failover_on_backend_error = on
                                   # Initiates failover when reading/writing to the
                                   # backend communication socket fails
                                   # If set to off, pgpool will report an
                                   # error and disconnect the session.

detach_false_primary = off
                                   # Detach false primary if on. Only
                                   # valid in streaming replication
                                   # mode and with PostgreSQL 9.6 or
                                   # after.
search_primary_node_timeout = 10
#search_primary_node_timeout = 300
                                   # Timeout in seconds to search for the
                                   # primary node when a failover occurs.
                                   # 0 means no timeout, keep searching
                                   # for a primary node forever.

#------------------------------------------------------------------------------
# ONLINE RECOVERY
#------------------------------------------------------------------------------

recovery_user = 'postgres'
                                   # Online recovery user
recovery_password = 'postgrestg'
                                   # Online recovery password
                                   # Leaving it empty will make Pgpool-II to first look for the
                                   # Password in pool_passwd file before using the empty password

recovery_1st_stage_command = 'recovery_1st_stage.sh'
                                   # Executes a command in first stage
recovery_2nd_stage_command = ''
                                   # Executes a command in second stage
recovery_timeout = 90
                                   # Timeout in seconds to wait for the
                                   # recovering node's postmaster to start up
                                   # 0 means no wait
client_idle_limit_in_recovery = 0
                                   # Client is disconnected after being idle
                                   # for that many seconds in the second stage
                                   # of online recovery
                                   # 0 means no disconnection
                                   # -1 means immediate disconnection


#------------------------------------------------------------------------------
# WATCHDOG
#------------------------------------------------------------------------------

# - Enabling -

use_watchdog = on
                                    # Activates watchdog
                                    # (change requires restart)

# -Connection to up stream servers -
trusted_servers = 'mohvcasdb01.novalocal,mohcasdevdb.novalocal'
                                    # list of trusted servers used
                                    # to confirm the network connection
                                    # (hostA,hostB,hostC,...)
                                    # (change requires restart)
ping_path = '/bin'
                                    # ping command path
                                    # (change requires restart)

# - Watchdog communication Settings -

wd_hostname = 'pgpool-poc01.novalocal'
                                    # Host name or IP address of this watchdog
                                    # (change requires restart)
wd_port = 9000
                                    # port number for watchdog service
                                    # (change requires restart)
wd_priority = 1
                                    # priority of this watchdog in leader election
                                    # (change requires restart)

wd_authkey = ''
                                    # Authentication key for watchdog communication
                                    # (change requires restart)

wd_ipc_socket_dir = '/var/run/postgresql'
                                    # Unix domain socket path for watchdog IPC socket
                                    # The Debian package defaults to
                                    # /var/run/postgresql
                                    # (change requires restart)


# - Virtual IP control Setting -

delegate_IP = '10.70.184.29'
                                    # delegate IP address
                                    # If this is empty, the virtual IP is never brought up.
                                    # (change requires restart)
if_cmd_path = '/sbin'
                                    # path to the directory where if_up/down_cmd exists
                                    # (change requires restart)
if_up_cmd = 'ip addr add $_IP_$/24 dev eth0 label eth0:0'
                                    # startup delegate IP command
                                    # (change requires restart)
if_down_cmd = 'ip addr del $_IP_$/24 dev eth0'
                                    # shutdown delegate IP command
                                    # (change requires restart)
arping_path = '/usr/sbin'
                                    # arping command path
                                    # (change requires restart)
arping_cmd = 'arping -U $_IP_$ -w 1 -I eth0'
                                    # arping command
                                    # (change requires restart)
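# Illustrative only -- not part of the reporter's configuration file:
# if_up_cmd / if_down_cmd / arping_cmd need root privileges when pgpool runs
# them. A quick way to rule out VIP problems is to run the exact commands
# configured above by hand (as root) on the node that should hold the VIP:
#
#   ip addr add 10.70.184.29/24 dev eth0 label eth0:0
#   arping -U 10.70.184.29 -w 1 -I eth0
#   ip addr show dev eth0      # the VIP should now be listed as eth0:0
#   ip addr del 10.70.184.29/24 dev eth0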

# - Behavior on escalation Setting -

clear_memqcache_on_escalation = on
                                    # Clear all the query cache on shared memory
                                    # when a standby pgpool escalates to active pgpool
                                    # (= virtual IP holder).
                                    # This should be off if clients connect to pgpool
                                    # without using the virtual IP.
                                    # (change requires restart)
wd_escalation_command = ''
                                    # Executes this command at escalation on new active pgpool.
                                    # (change requires restart)
wd_de_escalation_command = ''
                                    # Executes this command when master pgpool resigns from being master.
                                    # (change requires restart)

# - Watchdog consensus settings for failover -

failover_when_quorum_exists = on
                                    # Only perform backend node failover
                                    # when the watchdog cluster holds the quorum
                                    # (change requires restart)

failover_require_consensus = on
                                    # Perform failover when a majority of Pgpool-II nodes
                                    # agrees on the backend node status change
                                    # (change requires restart)

allow_multiple_failover_requests_from_node = off
                                    # A Pgpool-II node can cast multiple votes
                                    # for building the consensus on failover
                                    # (change requires restart)

# - Lifecheck Setting -

# -- common --

wd_monitoring_interfaces_list = ''
                                    # if any interface from the list is active, the watchdog
                                    # considers the network to be fine
                                    # 'any' to enable monitoring on all interfaces except loopback
                                    # '' to disable monitoring
                                    # (change requires restart)


wd_lifecheck_method = 'heartbeat'
                                    # Method of watchdog lifecheck ('heartbeat' or 'query' or 'external')
                                    # (change requires restart)
wd_interval = 3
                                    # lifecheck interval (sec) > 0
                                    # (change requires restart)

# -- heartbeat mode --

wd_heartbeat_port = 9694
                                    # Port number for receiving heartbeat signal
                                    # (change requires restart)
wd_heartbeat_keepalive = 2
                                    # Interval time of sending heartbeat signal (sec)
                                    # (change requires restart)
wd_heartbeat_deadtime = 30
                                    # Deadtime interval for heartbeat signal (sec)
                                    # (change requires restart)
                                    # Host name or IP address of destination 0
                                    # for sending heartbeat signal.
                                    # (change requires restart)
                                    # Port number of destination 0 for sending
                                    # heartbeat signal. Usually this is the
                                    # same as wd_heartbeat_port.
                                    # (change requires restart)
                                    # Name of NIC device (such as 'eth0')
                                    # used for sending/receiving heartbeat
                                    # signal to/from destination 0.
                                    # This works only when this is not empty
                                    # and pgpool has root privilege.
                                    # (change requires restart)

#heartbeat_destination1 = 'host0_ip2'
#heartbeat_destination_port1 = 9694
#heartbeat_device1 = ''

# -- query mode --

wd_life_point = 3
                                    # lifecheck retry times
                                    # (change requires restart)
wd_lifecheck_query = 'SELECT 1'
                                    # lifecheck query to pgpool from watchdog
                                    # (change requires restart)
wd_lifecheck_dbname = 'template1'
                                    # Database name connected for lifecheck
                                    # (change requires restart)
wd_lifecheck_user = 'nobody'
                                    # watchdog user monitoring pgpools in lifecheck
                                    # (change requires restart)
wd_lifecheck_password = ''
                                    # Password for watchdog user in lifecheck
                                    # Leaving it empty will make Pgpool-II first look for the
                                    # password in the pool_passwd file before using the empty password
                                    # (change requires restart)

# - Other pgpool Connection Settings -

                                    # Host name or IP address to connect to for other pgpool 0
                                    # (change requires restart)
                                    # Port number for other pgpool 0
                                    # (change requires restart)
                                    # Port number for other watchdog 0
                                    # (change requires restart)
#other_pgpool_hostname1 = 'host1'
#other_pgpool_port1 = 5432
#other_wd_port1 = 9000


#------------------------------------------------------------------------------
# OTHERS
#------------------------------------------------------------------------------
relcache_expire = 0
                                   # Life time of relation cache in seconds.
                                   # 0 means no cache expiration (the default).
                                   # The relation cache is used to cache query
                                   # results against the PostgreSQL system
                                   # catalogs to obtain various information,
                                   # including table structures or whether a
                                   # table is temporary. The cache is kept in
                                   # a pgpool child's local memory for as long
                                   # as the child survives.
                                   # If someone modifies a table with
                                   # ALTER TABLE or similar, the relcache is
                                   # no longer consistent.
                                   # For this purpose, relcache_expire
                                   # controls the life time of the cache.

relcache_size = 256
                                   # Number of relation cache
                                   # entries. If you frequently see
                                   # "pool_search_relcache: cache replacement happend"
                                   # in the pgpool log, you might want to increase this number.

check_temp_table = on
                                   # If on, enable temporary table check in SELECT statements.
                                   # This initiates queries against system catalog of primary/master
                                   # thus increases load of master.
                                   # If you are absolutely sure that your system never uses temporary tables
                                   # and you want to save access to primary/master, you could turn this off.
                                   # Default is on.

check_unlogged_table = on
                                   # If on, enable unlogged table check in SELECT statements.
                                   # This initiates queries against system catalog of primary/master
                                   # thus increases load of master.
                                   # If you are absolutely sure that your system never uses unlogged tables
                                   # and you want to save access to primary/master, you could turn this off.
                                   # Default is on.

#------------------------------------------------------------------------------
# IN MEMORY QUERY MEMORY CACHE
#------------------------------------------------------------------------------
memory_cache_enabled = off
                                   # If on, use the memory cache functionality, off by default
                                   # (change requires restart)
memqcache_method = 'shmem'
                                   # Cache storage method. either 'shmem'(shared memory) or
                                   # 'memcached'. 'shmem' by default
                                   # (change requires restart)
memqcache_memcached_host = 'localhost'
                                   # Memcached host name or IP address. Mandatory if
                                   # memqcache_method = 'memcached'.
                                   # Defaults to localhost.
                                   # (change requires restart)
memqcache_memcached_port = 11211
                                   # Memcached port number. Mandatory if memqcache_method = 'memcached'.
                                   # Defaults to 11211.
                                   # (change requires restart)
memqcache_total_size = 67108864
                                   # Total memory size in bytes for storing memory cache.
                                   # Mandatory if memqcache_method = 'shmem'.
                                   # Defaults to 64MB.
                                   # (change requires restart)
memqcache_max_num_cache = 1000000
                                   # Total number of cache entries. Mandatory
                                   # if memqcache_method = 'shmem'.
                                   # Each cache entry consumes 48 bytes on shared memory.
                                   # Defaults to 1,000,000(45.8MB).
                                   # (change requires restart)
memqcache_expire = 0
                                   # Memory cache entry life time specified in seconds.
                                   # 0 means infinite life time. 0 by default.
                                   # (change requires restart)
memqcache_auto_cache_invalidation = on
                                   # If on, invalidation of query cache is triggered by corresponding
                                   # DDL/DML/DCL(and memqcache_expire). If off, it is only triggered
                                   # by memqcache_expire. on by default.
                                   # (change requires restart)
memqcache_maxcache = 409600
                                   # Maximum SELECT result size in bytes.
                                   # Must be smaller than memqcache_cache_block_size. Defaults to 400KB.
                                   # (change requires restart)
memqcache_cache_block_size = 1048576
                                   # Cache block size in bytes. Mandatory if memqcache_method = 'shmem'.
                                   # Defaults to 1MB.
                                   # (change requires restart)
memqcache_oiddir = '/var/log/pgpool/oiddir'
                                   # Temporary work directory to record table oids
                                   # (change requires restart)
white_memqcache_table_list = ''
                                   # Comma separated list of table names to memcache
                                   # that don't write to database
                                   # Regexp are accepted
black_memqcache_table_list = ''
                                   # Comma separated list of table names not to memcache
                                   # that don't write to database
                                   # Regexp are accepted
backend_hostname0 = 'pgpool-poc01.novalocal'
backend_port0 = 5432
backend_weight0 = 1
backend_data_directory0 = '/installer/postgresql-11.5/data'
backend_flag0 = 'ALLOW_TO_FAILOVER'

backend_hostname1 = 'pgpool-poc02.novalocal'
backend_port1 = 5432
backend_weight1 = 1
backend_data_directory1 = '/installer/postgresql-11.5/data'
backend_flag1 = 'ALLOW_TO_FAILOVER'

heartbeat_destination0 = 'pgpool-poc02.novalocal'
heartbeat_destination_port0 = 9694

other_pgpool_hostname0 = 'pgpool-poc02.novalocal'
other_pgpool_port0 = 5433
other_wd_port0 = 9000

heartbeat_device0 = 'eth0'
## Added by Raj --> parameters available from pgpool-II 4.1 onwards

statement_level_load_balance = on

enable_consensus_with_half_votes = on

raj.pandey1982@gmail.com

2020-01-28 19:04

reporter   ~0003096

Slave Conf:-
[root@pgpool-poc02 bin]# cat /usr/share/pgpool/4.1.0/etc/pgpool.conf
# ----------------------------
# pgPool-II configuration file
# ----------------------------
#
# This file consists of lines of the form:
#
# name = value
#
# Whitespace may be used. Comments are introduced with "#" anywhere on a line.
# The complete list of parameter names and allowed values can be found in the
# pgPool-II documentation.
#
# This file is read on server startup and when the server receives a SIGHUP
# signal. If you edit the file on a running system, you have to SIGHUP the
# server for the changes to take effect, or use "pgpool reload". Some
# parameters, which are marked below, require a server shutdown and restart to
# take effect.
#


#------------------------------------------------------------------------------
# CONNECTIONS
#------------------------------------------------------------------------------

# - pgpool Connection Settings -

listen_addresses = '*'
                                   # Host name or IP address to listen on:
                                   # '*' for all, '' for no TCP/IP connections
                                   # (change requires restart)
port = 5433
                                   # Port number
                                   # (change requires restart)
socket_dir = '/var/run/postgresql'
                                   # Unix domain socket path
                                   # The Debian package defaults to
                                   # /var/run/postgresql
                                   # (change requires restart)
listen_backlog_multiplier = 2
                                   # Set the backlog parameter of listen(2) to
                                   # num_init_children * listen_backlog_multiplier.
                                   # (change requires restart)
serialize_accept = off
                                   # whether to serialize accept() call to avoid thundering herd problem
                                   # (change requires restart)

# - pgpool Communication Manager Connection Settings -

pcp_listen_addresses = '*'
                                   # Host name or IP address for pcp process to listen on:
                                   # '*' for all, '' for no TCP/IP connections
                                   # (change requires restart)
pcp_port = 9898
                                   # Port number for pcp
                                   # (change requires restart)
pcp_socket_dir = '/var/run/postgresql'
                                   # Unix domain socket path for pcp
                                   # The Debian package defaults to
                                   # /var/run/postgresql
                                   # (change requires restart)

# - Backend Connection Settings -

                                   # Host name or IP address to connect to for backend 0
                                   # Port number for backend 0
                                   # Weight for backend 0 (only in load balancing mode)
                                   # Data directory for backend 0
                                   # Controls various backend behavior
                                   # ALLOW_TO_FAILOVER, DISALLOW_TO_FAILOVER
                                   # or ALWAYS_MASTER

# - Authentication -

enable_pool_hba = on
                                   # Use pool_hba.conf for client authentication
pool_passwd = 'pool_passwd'
                                   # File name of pool_passwd for md5 authentication.
                                   # "" disables pool_passwd.
                                   # (change requires restart)
authentication_timeout = 60
                                   # Delay in seconds to complete client authentication
                                   # 0 means no timeout.

allow_clear_text_frontend_auth = off
                                   # Allow Pgpool-II to use clear text password authentication
                                   # with clients, when pool_passwd does not
                                   # contain the user password


# - SSL Connections -

ssl = off
                                   # Enable SSL support
                                   # (change requires restart)
#ssl_key = './server.key'
                                   # Path to the SSL private key file
                                   # (change requires restart)
#ssl_cert = './server.cert'
                                   # Path to the SSL public certificate file
                                   # (change requires restart)
#ssl_ca_cert = ''
                                   # Path to a single PEM format file
                                   # containing CA root certificate(s)
                                   # (change requires restart)
#ssl_ca_cert_dir = ''
                                   # Directory containing CA root certificate(s)
                                   # (change requires restart)

ssl_ciphers = 'HIGH:MEDIUM:+3DES:!aNULL'
                                   # Allowed SSL ciphers
                                   # (change requires restart)
ssl_prefer_server_ciphers = off
                                   # Use server's SSL cipher preferences,
                                   # rather than the client's
                                   # (change requires restart)
#------------------------------------------------------------------------------
# POOLS
#------------------------------------------------------------------------------

# - Concurrent session and pool size -

num_init_children = 1000
                                   # Number of concurrent sessions allowed
                                   # (change requires restart)
max_pool = 10
#max_pool = 4
                                   # Number of connection pool caches per connection
                                   # (change requires restart)

# - Life time -

child_life_time = 300
                                   # Pool exits after being idle for this many seconds
child_max_connections = 0
                                   # Pool exits after receiving that many connections
                                   # 0 means no exit
connection_life_time = 0
                                   # Connection to backend closes after being idle for this many seconds
                                   # 0 means no close
client_idle_limit = 0
                                   # Client is disconnected after being idle for that many seconds
                                   # (even inside an explicit transaction!)
                                   # 0 means no disconnection

reserved_connections = 1
#------------------------------------------------------------------------------
# LOGS
#------------------------------------------------------------------------------

# - Where to log -

log_destination = 'stderr'
                                   # Where to log
                                   # Valid values are combinations of stderr
                                   # and syslog. Defaults to stderr.

# - What to log -

log_line_prefix = '%t: pid %p:'

log_connections = off
                                   # Log connections
log_hostname = off
                                   # Hostname will be shown in ps status
                                   # and in logs if connections are logged
log_statement = off
                                   # Log all statements
log_per_node_statement = off
                                   # Log all statements
                                   # with node and backend information
log_client_messages = off
                                   # Log any client messages
log_standby_delay = 'none'
                                   # Log standby delay
                                   # Valid values are combinations of always,
                                   # if_over_threshold, none

# - Syslog specific -

syslog_facility = 'LOCAL0'
                                   # Syslog local facility. Default to LOCAL0
syslog_ident = 'pgpool'
                                   # Syslog program identification string
                                   # Default to 'pgpool'

# - Debug -

#log_error_verbosity = default # terse, default, or verbose messages

#client_min_messages = notice # values in order of decreasing detail:
                                        # debug5
                                        # debug4
                                        # debug3
                                        # debug2
                                        # debug1
                                        # log
                                        # notice
                                        # warning
                                        # error

#log_min_messages = warning # values in order of decreasing detail:
                                        # debug5
                                        # debug4
                                        # debug3
                                        # debug2
                                        # debug1
                                        # info
                                        # notice
                                        # warning
                                        # error
                                        # log
                                        # fatal
                                        # panic

#------------------------------------------------------------------------------
# FILE LOCATIONS
#------------------------------------------------------------------------------

pid_file_name = '/var/run/postgresql/pgpool.pid'
                                   # PID file name
                                   # Can be specified as relative to the
                                   # location of pgpool.conf file or
                                   # as an absolute path
                                   # (change requires restart)
logdir = '/var/log/pgpool'
                                   # Directory of pgPool status file
                                   # (change requires restart)


#------------------------------------------------------------------------------
# CONNECTION POOLING
#------------------------------------------------------------------------------

connection_cache = on
                                   # Activate connection pools
                                   # (change requires restart)

                                   # Semicolon separated list of queries
                                   # to be issued at the end of a session
                                   # The default is for 8.3 and later
reset_query_list = 'ABORT; DISCARD ALL'
                                   # The following one is for 8.2 and before
#reset_query_list = 'ABORT; RESET ALL; SET SESSION AUTHORIZATION DEFAULT'


#------------------------------------------------------------------------------
# REPLICATION MODE
#------------------------------------------------------------------------------

replication_mode = off
                                   # Activate replication mode
                                   # (change requires restart)
replicate_select = off
                                   # Replicate SELECT statements
                                   # when in replication mode
                                   # replicate_select has higher priority than
                                   # load_balance_mode.

insert_lock = on
                                   # Automatically locks a dummy row or a table
                                   # with INSERT statements to keep SERIAL data
                                   # consistency
                                   # Without SERIAL, no lock will be issued
lobj_lock_table = ''
                                   # When rewriting lo_creat command in
                                   # replication mode, specify table name to
                                   # lock

# - Degenerate handling -

replication_stop_on_mismatch = off
                                   # On disagreement with the packet kind
                                   # sent from backend, degenerate the node
                                   # which is most likely "minority"
                                   # If off, just force to exit this session

failover_if_affected_tuples_mismatch = off
                                   # On disagreement with the number of affected
                                   # tuples in UPDATE/DELETE queries, then
                                   # degenerate the node which is most likely
                                   # "minority".
                                   # If off, just abort the transaction to
                                   # keep the consistency


#------------------------------------------------------------------------------
# LOAD BALANCING MODE
#------------------------------------------------------------------------------

load_balance_mode = on
                                   # Activate load balancing mode
                                   # (change requires restart)
ignore_leading_white_space = on
                                   # Ignore leading white spaces of each query
white_function_list = ''
                                   # Comma separated list of function names
                                   # that don't write to database
                                   # Regexp are accepted
black_function_list = 'Get_Appt_code,walkin_appointment_token_no,findAppointmentFacilityTypeIdNew,findAppointmentReferUrgencyType,walkin_appointment_token_no,walkin_appointment_token_no'
black_function_list = 'currval,lastval,nextval,setval'
                                   # Comma separated list of function names
                                   # that write to database
                                   # Regexp are accepted

black_query_pattern_list = ''
                                   # Semicolon separated list of query patterns
                                   # that should be sent to primary node
                                   # Regexp are accepted
                                   # valid for streaming replication mode only.

database_redirect_preference_list = ''
                                   # comma separated list of pairs of database and node id.
                                   # example: postgres:primary,mydb[0-4]:1,mydb[5-9]:2'
                                   # valid for streaming replication mode only.
app_name_redirect_preference_list = ''
                                   # comma separated list of pairs of app name and node id.
                                   # example: 'psql:primary,myapp[0-4]:1,myapp[5-9]:standby'
                                   # valid for streaming replication mode only.
allow_sql_comments = off
                                   # if on, ignore SQL comments when judging if load balance or
                                   # query cache is possible.
                                   # If off, SQL comments effectively prevent the judgment
                                   # (pre 3.4 behavior).

disable_load_balance_on_write = 'transaction'
                                   # Load balance behavior when write query is issued
                                   # in an explicit transaction.
                                   # Note that any query not in an explicit transaction
                                   # is not affected by the parameter.
                                   # 'transaction' (the default): if a write query is issued,
                                   # subsequent read queries will not be load balanced
                                   # until the transaction ends.
                                   # 'trans_transaction': if a write query is issued,
                                   # subsequent read queries in an explicit transaction
                                   # will not be load balanced until the session ends.
                                   # 'always': if a write query is issued, read queries will
                                   # not be load balanced until the session ends.

#------------------------------------------------------------------------------
# MASTER/SLAVE MODE
#------------------------------------------------------------------------------

master_slave_mode = on
                                   # Activate master/slave mode
                                   # (change requires restart)
master_slave_sub_mode = 'stream'
                                   # Master/slave sub mode
                                   # Valid values are stream, slony
                                   # or logical. Default is stream.
                                   # (change requires restart)

# - Streaming -

sr_check_period = 5 #0
                                   # Streaming replication check period
                                   # Disabled (0) by default
sr_check_user = 'replication'
                                   # Streaming replication check user
                                   # This is necessary even if you disable
                                   # streaming replication delay check with
                                   # sr_check_period = 0

sr_check_password = 'reppassword'
                                   # Password for streaming replication check user.
                                   # Leaving it empty will make Pgpool-II first look for the
                                   # password in the pool_passwd file before using the empty password
sr_check_database = 'mawidstg01'
#sr_check_database = 'postgres'
                                   # Database name for streaming replication check
delay_threshold = 0
                                   # Threshold before not dispatching query to standby node
                                   # Unit is in bytes
                                   # Disabled (0) by default
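# Illustrative only -- not part of the reporter's configuration file:
# The "get_query_result ... while checking replication time lag" messages quoted
# later in this report come from this sr_check. A quick way to verify that the
# check user can connect and that the primary actually sees its standby (host,
# port, user, password and database values copied from this config;
# <primary-host> is a placeholder for whichever node answers 'f' below):
#
#   PGPASSWORD=reppassword psql -h pgpool-poc01.novalocal -p 5432 \
#       -U replication -d mawidstg01 -c 'SELECT pg_is_in_recovery();'
#   PGPASSWORD=reppassword psql -h pgpool-poc02.novalocal -p 5432 \
#       -U replication -d mawidstg01 -c 'SELECT pg_is_in_recovery();'
#
#   # On the node that answered 'f' (the primary), a connected standby should
#   # be listed (run as a superuser such as postgres):
#   psql -h <primary-host> -p 5432 -U postgres -d postgres \
#       -c 'SELECT application_name, state, sync_state FROM pg_stat_replication;'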

# - Special commands -

follow_master_command = ''
                                   # Executes this command after master failover
                                   # Special values:
                                   # %d = node id
                                   # %h = host name
                                   # %p = port number
                                   # %D = database cluster path
                                   # %m = new master node id
                                   # %H = hostname of the new master node
                                   # %M = old master node id
                                   # %P = old primary node id
                                   # %r = new master port number
                                   # %R = new master database cluster path
                                   # %% = '%' character

#------------------------------------------------------------------------------
# HEALTH CHECK GLOBAL PARAMETERS
#------------------------------------------------------------------------------

health_check_period = 5
                                   # Health check period
                                   # Disabled (0) by default
health_check_timeout = 10
#health_check_timeout = 0
                                   # Health check timeout
                                   # 0 means no timeout
health_check_user = 'postgres'
                                   # Health check user
health_check_password = 'TEXTpostgrestg'
                                   # Password for health check user
                                   # Leaving it empty will make Pgpool-II first look for the
                                   # password in the pool_passwd file before using the empty password

health_check_database = ''
                                   # Database name for health check. If '', tries 'postgres' first, then 'template1'

health_check_max_retries = 0
                                   # Maximum number of times to retry a failed health check before giving up.
health_check_retry_delay = 1
                                   # Amount of time to wait (in seconds) between retries.
connect_timeout = 10000
                                   # Timeout value in milliseconds before giving up connecting to a backend.
                                   # Default is 10000 ms (10 seconds). Users on a flaky network may want to
                                   # increase the value. 0 means no timeout.
                                   # Note that this value is not only used for health checks,
                                   # but also for ordinary connections to backends.

#------------------------------------------------------------------------------
# HEALTH CHECK PER NODE PARAMETERS (OPTIONAL)
#------------------------------------------------------------------------------
#health_check_period0 = 0
#health_check_timeout0 = 20
#health_check_user0 = 'nobody'
#health_check_password0 = ''
#health_check_database0 = ''
#health_check_max_retries0 = 0
#health_check_retry_delay0 = 1
#connect_timeout0 = 10000

#------------------------------------------------------------------------------
# FAILOVER AND FAILBACK
#------------------------------------------------------------------------------
failover_command = '/etc/pgpool/4.1.0/failover.sh %d %P %H reppassword /installer/postgresql-11.5/data/im_the_master'
#failover_command = '/etc/pgpool-II/failover.sh %d %P %H reppassword /installer/postgresql-11.5/data/im_the_master'
                                   # Executes this command at failover
                                   # Special values:
                                   # %d = node id
                                   # %h = host name
                                   # %p = port number
                                   # %D = database cluster path
                                   # %m = new master node id
                                   # %H = hostname of the new master node
                                   # %M = old master node id
                                   # %P = old primary node id
                                   # %r = new master port number
                                   # %R = new master database cluster path
                                   # %% = '%' character
failback_command = ''
                                   # Executes this command at failback.
                                   # Special values:
                                   # %d = node id
                                   # %h = host name
                                   # %p = port number
                                   # %D = database cluster path
                                   # %m = new master node id
                                   # %H = hostname of the new master node
                                   # %M = old master node id
                                   # %P = old primary node id
                                   # %r = new master port number
                                   # %R = new master database cluster path
                                   # %% = '%' character

failover_on_backend_error = on
                                   # Initiates failover when reading/writing to the
                                   # backend communication socket fails
                                   # If set to off, pgpool will report an
                                   # error and disconnect the session.

detach_false_primary = off
                                   # Detach false primary if on. Only
                                   # valid in streaming replication
                                   # mode and with PostgreSQL 9.6 or
                                   # later.
search_primary_node_timeout = 10
#search_primary_node_timeout = 300
                                   # Timeout in seconds to search for the
                                   # primary node when a failover occurs.
                                   # 0 means no timeout, keep searching
                                   # for a primary node forever.
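# Illustrative only -- not part of the reporter's configuration file:
# The failover.sh referenced by failover_command above is not attached to this
# report. Given its argument list (%d %P %H, the replication password and the
# trigger-file path .../im_the_master), a minimal sketch of a trigger-file based
# promotion script for PostgreSQL 11 would look roughly like this (an assumed
# sketch, not the reporter's actual script):
#
#   #!/bin/bash
#   FAILED_NODE_ID="$1"    # %d  id of the failed node
#   OLD_PRIMARY_ID="$2"    # %P  id of the old primary node
#   NEW_MASTER_HOST="$3"   # %H  hostname of the new master node
#   TRIGGER_FILE="$5"      # trigger_file path configured in recovery.conf
#
#   # Only promote when the failed node was the primary; a failing standby
#   # needs no promotion.
#   if [ "$FAILED_NODE_ID" = "$OLD_PRIMARY_ID" ]; then
#       # Creating the trigger file makes the standby leave recovery and open
#       # read/write as the new primary.
#       ssh postgres@"$NEW_MASTER_HOST" "touch $TRIGGER_FILE"
#   fi
#   exit 0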

#------------------------------------------------------------------------------
# ONLINE RECOVERY
#------------------------------------------------------------------------------

recovery_user = 'postgres'
                                   # Online recovery user
recovery_password = 'postgrestg'
                                   # Online recovery password
                                   # Leaving it empty will make Pgpool-II first look for the
                                   # password in the pool_passwd file before using the empty password

recovery_1st_stage_command = 'recovery_1st_stage.sh'
                                   # Executes a command in first stage
recovery_2nd_stage_command = ''
                                   # Executes a command in second stage
recovery_timeout = 90
                                   # Timeout in seconds to wait for the
                                   # recovering node's postmaster to start up
                                   # 0 means no wait
client_idle_limit_in_recovery = 0
                                   # Client is disconnected after being idle
                                   # for that many seconds in the second stage
                                   # of online recovery
                                   # 0 means no disconnection
                                   # -1 means immediate disconnection


#------------------------------------------------------------------------------
# WATCHDOG
#------------------------------------------------------------------------------

# - Enabling -

use_watchdog = on
                                    # Activates watchdog
                                    # (change requires restart)

# - Connection to upstream servers -
trusted_servers = 'mohvcasdb01.novalocal,mohcasdevdb.novalocal'
                                    # list of trusted servers used
                                    # to confirm the network connection
                                    # (hostA,hostB,hostC,...)
                                    # (change requires restart)
ping_path = '/bin'
                                    # ping command path
                                    # (change requires restart)

# - Watchdog communication Settings -

wd_hostname = 'pgpool-poc02.novalocal'
                                    # Host name or IP address of this watchdog
                                    # (change requires restart)
wd_port = 9000
                                    # port number for watchdog service
                                    # (change requires restart)
wd_priority = 1
                                    # priority of this watchdog in leader election
                                    # (change requires restart)

wd_authkey = ''
                                    # Authentication key for watchdog communication
                                    # (change requires restart)

wd_ipc_socket_dir = '/var/run/postgresql'
                                    # Unix domain socket path for watchdog IPC socket
                                    # The Debian package defaults to
                                    # /var/run/postgresql
                                    # (change requires restart)


# - Virtual IP control Setting -

delegate_IP = '10.70.184.29'
                                    # delegate IP address
                                    # If this is empty, the virtual IP is never brought up.
                                    # (change requires restart)
if_cmd_path = '/sbin'
                                    # path to the directory where if_up/down_cmd exists
                                    # (change requires restart)
if_up_cmd = 'ip addr add $_IP_$/24 dev eth0 label eth0:0'
                                    # startup delegate IP command
                                    # (change requires restart)
if_down_cmd = 'ip addr del $_IP_$/24 dev eth0'
                                    # shutdown delegate IP command
                                    # (change requires restart)
arping_path = '/usr/sbin'
                                    # arping command path
                                    # (change requires restart)
arping_cmd = 'arping -U $_IP_$ -w 1 -I eth0'
                                    # arping command
                                    # (change requires restart)

# - Behavior on escalation Setting -

clear_memqcache_on_escalation = on
                                    # Clear all the query cache on shared memory
                                    # when a standby pgpool escalates to active pgpool
                                    # (= virtual IP holder).
                                    # This should be off if clients connect to pgpool
                                    # without using the virtual IP.
                                    # (change requires restart)
wd_escalation_command = ''
                                    # Executes this command at escalation on new active pgpool.
                                    # (change requires restart)
wd_de_escalation_command = ''
                                    # Executes this command when master pgpool resigns from being master.
                                    # (change requires restart)

# - Watchdog consensus settings for failover -

failover_when_quorum_exists = on
                                    # Only perform backend node failover
                                    # when the watchdog cluster holds the quorum
                                    # (change requires restart)

failover_require_consensus = on
                                    # Perform failover when a majority of Pgpool-II nodes
                                    # agrees on the backend node status change
                                    # (change requires restart)

allow_multiple_failover_requests_from_node = off
                                    # A Pgpool-II node can cast multiple votes
                                    # for building the consensus on failover
                                    # (change requires restart)

# - Lifecheck Setting -

# -- common --

wd_monitoring_interfaces_list = ''
                                    # if any interface from the list is active, the watchdog
                                    # considers the network to be fine
                                    # 'any' to enable monitoring on all interfaces except loopback
                                    # '' to disable monitoring
                                    # (change requires restart)


wd_lifecheck_method = 'heartbeat'
                                    # Method of watchdog lifecheck ('heartbeat' or 'query' or 'external')
                                    # (change requires restart)
wd_interval = 3
                                    # lifecheck interval (sec) > 0
                                    # (change requires restart)

# -- heartbeat mode --

wd_heartbeat_port = 9694
                                    # Port number for receiving heartbeat signal
                                    # (change requires restart)
wd_heartbeat_keepalive = 2
                                    # Interval time of sending heartbeat signal (sec)
                                    # (change requires restart)
wd_heartbeat_deadtime = 30
                                    # Deadtime interval for heartbeat signal (sec)
                                    # (change requires restart)
                                    # Host name or IP address of destination 0
                                    # for sending heartbeat signal.
                                    # (change requires restart)
                                    # Port number of destination 0 for sending
                                    # heartbeat signal. Usually this is the
                                    # same as wd_heartbeat_port.
                                    # (change requires restart)
                                    # Name of NIC device (such as 'eth0')
                                    # used for sending/receiving heartbeat
                                    # signal to/from destination 0.
                                    # This works only when this is not empty
                                    # and pgpool has root privilege.
                                    # (change requires restart)

#heartbeat_destination1 = 'host0_ip2'
#heartbeat_destination_port1 = 9694
#heartbeat_device1 = ''

# -- query mode --

wd_life_point = 3
                                    # lifecheck retry times
                                    # (change requires restart)
wd_lifecheck_query = 'SELECT 1'
                                    # lifecheck query to pgpool from watchdog
                                    # (change requires restart)
wd_lifecheck_dbname = 'template1'
                                    # Database name connected for lifecheck
                                    # (change requires restart)
wd_lifecheck_user = 'nobody'
                                    # watchdog user monitoring pgpools in lifecheck
                                    # (change requires restart)
wd_lifecheck_password = ''
                                    # Password for watchdog user in lifecheck
                                    # Leaving it empty will make Pgpool-II first look for the
                                    # password in the pool_passwd file before using the empty password
                                    # (change requires restart)

# - Other pgpool Connection Settings -

                                    # Host name or IP address to connect to for other pgpool 0
                                    # (change requires restart)
                                    # Port number for other pgpool 0
                                    # (change requires restart)
                                    # Port number for other watchdog 0
                                    # (change requires restart)
#other_pgpool_hostname1 = 'host1'
#other_pgpool_port1 = 5432
#other_wd_port1 = 9000


#------------------------------------------------------------------------------
# OTHERS
#------------------------------------------------------------------------------
relcache_expire = 0
                                   # Life time of relation cache in seconds.
                                   # 0 means no cache expiration (the default).
                                   # The relation cache is used to cache query
                                   # results against the PostgreSQL system
                                   # catalogs to obtain various information,
                                   # including table structures or whether a
                                   # table is temporary. The cache is kept in
                                   # a pgpool child's local memory for as long
                                   # as the child survives.
                                   # If someone modifies a table with
                                   # ALTER TABLE or similar, the relcache is
                                   # no longer consistent.
                                   # For this purpose, relcache_expire
                                   # controls the life time of the cache.

relcache_size = 256
                                   # Number of relation cache
                                   # entries. If you frequently see
                                   # "pool_search_relcache: cache replacement happend"
                                   # in the pgpool log, you might want to increase this number.

check_temp_table = on
                                   # If on, enable temporary table check in SELECT statements.
                                   # This initiates queries against system catalog of primary/master
                                   # thus increases load of master.
                                   # If you are absolutely sure that your system never uses temporary tables
                                   # and you want to save access to primary/master, you could turn this off.
                                   # Default is on.

check_unlogged_table = on
                                   # If on, enable unlogged table check in SELECT statements.
                                   # This initiates queries against system catalog of primary/master
                                   # thus increases load of master.
                                   # If you are absolutely sure that your system never uses unlogged tables
                                   # and you want to save access to primary/master, you could turn this off.
                                   # Default is on.

#------------------------------------------------------------------------------
# IN MEMORY QUERY MEMORY CACHE
#------------------------------------------------------------------------------
memory_cache_enabled = off
                                   # If on, use the memory cache functionality, off by default
                                   # (change requires restart)
memqcache_method = 'shmem'
                                   # Cache storage method. either 'shmem'(shared memory) or
                                   # 'memcached'. 'shmem' by default
                                   # (change requires restart)
memqcache_memcached_host = 'localhost'
                                   # Memcached host name or IP address. Mandatory if
                                   # memqcache_method = 'memcached'.
                                   # Defaults to localhost.
                                   # (change requires restart)
memqcache_memcached_port = 11211
                                   # Memcached port number. Mandatory if memqcache_method = 'memcached'.
                                   # Defaults to 11211.
                                   # (change requires restart)
memqcache_total_size = 67108864
                                   # Total memory size in bytes for storing memory cache.
                                   # Mandatory if memqcache_method = 'shmem'.
                                   # Defaults to 64MB.
                                   # (change requires restart)
memqcache_max_num_cache = 1000000
                                   # Total number of cache entries. Mandatory
                                   # if memqcache_method = 'shmem'.
                                   # Each cache entry consumes 48 bytes on shared memory.
                                   # Defaults to 1,000,000(45.8MB).
                                   # (change requires restart)
memqcache_expire = 0
                                   # Memory cache entry life time specified in seconds.
                                   # 0 means infinite life time. 0 by default.
                                   # (change requires restart)
memqcache_auto_cache_invalidation = on
                                   # If on, invalidation of query cache is triggered by corresponding
                                   # DDL/DML/DCL(and memqcache_expire). If off, it is only triggered
                                   # by memqcache_expire. on by default.
                                   # (change requires restart)
memqcache_maxcache = 409600
                                   # Maximum SELECT result size in bytes.
                                   # Must be smaller than memqcache_cache_block_size. Defaults to 400KB.
                                   # (change requires restart)
memqcache_cache_block_size = 1048576
                                   # Cache block size in bytes. Mandatory if memqcache_method = 'shmem'.
                                   # Defaults to 1MB.
                                   # (change requires restart)
memqcache_oiddir = '/var/log/pgpool/oiddir'
                                   # Temporary work directory to record table oids
                                   # (change requires restart)
white_memqcache_table_list = ''
                                   # Comma separated list of table names to memcache
                                   # that don't write to database
                                   # Regexp are accepted
black_memqcache_table_list = ''
                                   # Comma separated list of table names not to memcache
                                   # that don't write to database
                                   # Regexp are accepted
backend_hostname0 = 'pgpool-poc01.novalocal'
backend_port0 = 5432
backend_weight0 = 1
backend_data_directory0 = '/installer/postgresql-11.5/data'
backend_flag0 = 'ALLOW_TO_FAILOVER'

backend_hostname1 = 'pgpool-poc02.novalocal'
backend_port1 = 5432
backend_weight1 = 1
backend_data_directory1 = '/installer/postgresql-11.5/data'
backend_flag1 = 'ALLOW_TO_FAILOVER'

heartbeat_destination0 = 'pgpool-poc01.novalocal'
heartbeat_destination_port0 = 9694

other_pgpool_hostname0 = 'pgpool-poc01.novalocal'
other_pgpool_port0 = 5433
other_wd_port0 = 9000

heartbeat_device0 = 'eth0'
## Added by Raj --> parameters available from pgpool-II 4.1 onwards

statement_level_load_balance = on

enable_consensus_with_half_votes = on

[root@pgpool-poc02 bin]#

raj.pandey1982@gmail.com

2020-01-29 06:47

reporter   ~0003097

One Observation here

:- Whenever the above-mentioned issue occurs after a DB failover, if I restart the pgpool services on both nodes, then remote connections work again (checked by connecting PostgreSQL Admin) and the new master DB on node 2 gets connected using the VIP/port, while the pgpool log on both nodes keeps giving messages like the following:-
 
2020-01-29 00:34:55: pid 30207:LOG: new IPC connection received
2020-01-29 00:34:56: pid 31230:LOG: get_query_result: no rows returned
2020-01-29 00:34:56: pid 31230:DETAIL: node id (1)
2020-01-29 00:34:56: pid 31230:CONTEXT: while checking replication time lag
2020-01-29 00:34:56: pid 31230:LOG: get_query_result falied: status: -1
2020-01-29 00:34:56: pid 31230:CONTEXT: while checking replication time lag
2020-01-29 00:35:01: pid 31230:LOG: get_query_result: no rows returned
2020-01-29 00:35:01: pid 31230:DETAIL: node id (1)
2020-01-29 00:35:01: pid 31230:CONTEXT: while checking replication time lag
2020-01-29 00:35:01: pid 31230:LOG: get_query_result falied: status: -1
2020-01-29 00:35:01: pid 31230:CONTEXT: while checking replication time lag

---> Hence something is not right here: the down old master node is not being separated (detached), so pgpool keeps trying to connect to this down node and does not let the newly elected master DB accept connections via the pgpool VIP/port. If I restart the service, the separation happens, albeit with the "get_query_result falied: status: -1" error above, and then remote connections also work.

t-ishii

2020-01-29 11:49

developer   ~0003099

Last edited: 2020-01-29 12:02


When Pgpool-II goes into the "unstable" state, can you show the output of "pcp_watchdog_info -h pgpool-poc02.novalocal -v"? This should give important info to study the issue.

In the mean time I noticed:
2020-01-22 12:39:16: pid 20780:LOG: invalid degenerate backend request, node id : 0 status: [3] is not valid for failover

Status = 3 means that node 0 (pgpool-poc01) of PostgreSQL went into "invalid" status. The invalid status could only be set if detach_false_primary = on. However, in your pgpool.conf it is set to off. So I am confused...
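
For reference, a minimal way to double-check the value Pgpool-II is actually running with (a sketch only; the pgpool.conf path is an assumption based on the failover.sh path elsewhere in this thread, and the PGPOOL SHOW command assumes Pgpool-II 3.7 or later):

# what the file says vs. what the running pgpool reports
grep -n 'detach_false_primary' /usr/share/pgpool/4.1.0/etc/pgpool.conf
/usr/local/pgsql11.5/bin/psql -h 10.70.184.29 -p 5433 -U postgres -d postgres -c "PGPOOL SHOW detach_false_primary;"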

t-ishii

2020-01-29 14:11

developer   ~0003100

> search_primary_node_timeout=10 also not working in my case to avoid below error again and again as this is streaming replication feature while mine is Master-Slave replication:-

What do you mean by "Master-Slave replication"? Can you please elaborate?

raj.pandey1982@gmail.com

2020-01-29 18:27

reporter   ~0003101

========================pcp_watchdog_info command on 1st/2nd node before failover

pcp_watchdog_info at node 1:-

[root@pgpool-poc01 etc]# pcp_watchdog_info -h pgpool-poc01.novalocal -U postgres -p 9898 -v -w
Watchdog Cluster Information
Total Nodes : 2
Remote Nodes : 1
Quorum state : QUORUM EXIST
Alive Remote Nodes : 1
VIP up on local node : YES
Master Node Name : pgpool-poc01.novalocal:5433 Linux pgpool-poc01.novalocal
Master Host Name : pgpool-poc01.novalocal

Watchdog Node Information
Node Name : pgpool-poc01.novalocal:5433 Linux pgpool-poc01.novalocal
Host Name : pgpool-poc01.novalocal
Delegate IP : 10.70.184.29
Pgpool port : 5433
Watchdog port : 9000
Node priority : 1
Status : 4
Status Name : MASTER

Node Name : pgpool-poc02.novalocal:5433 Linux pgpool-poc02.novalocal
Host Name : pgpool-poc02.novalocal
Delegate IP : 10.70.184.29
Pgpool port : 5433
Watchdog port : 9000
Node priority : 1
Status : 7
Status Name : STANDBY

[root@pgpool-poc01 etc]# pcp_watchdog_info -h pgpool-poc02.novalocal -U postgres -p 9898 -v -w
Watchdog Cluster Information
Total Nodes : 2
Remote Nodes : 1
Quorum state : QUORUM EXIST
Alive Remote Nodes : 1
VIP up on local node : NO
Master Node Name : pgpool-poc01.novalocal:5433 Linux pgpool-poc01.novalocal
Master Host Name : pgpool-poc01.novalocal

Watchdog Node Information
Node Name : pgpool-poc02.novalocal:5433 Linux pgpool-poc02.novalocal
Host Name : pgpool-poc02.novalocal
Delegate IP : 10.70.184.29
Pgpool port : 5433
Watchdog port : 9000
Node priority : 1
Status : 7
Status Name : STANDBY

Node Name : pgpool-poc01.novalocal:5433 Linux pgpool-poc01.novalocal
Host Name : pgpool-poc01.novalocal
Delegate IP : 10.70.184.29
Pgpool port : 5433
Watchdog port : 9000
Node priority : 1
Status : 4
Status Name : MASTER

[root@pgpool-poc01 etc]#



pcp_watchdog_info at node 2:-

[root@pgpool-poc02 replscripts]# pcp_watchdog_info -h pgpool-poc01.novalocal -U postgres -p 9898 -v -w
Watchdog Cluster Information
Total Nodes : 2
Remote Nodes : 1
Quorum state : QUORUM EXIST
Alive Remote Nodes : 1
VIP up on local node : YES
Master Node Name : pgpool-poc01.novalocal:5433 Linux pgpool-poc01.novalocal
Master Host Name : pgpool-poc01.novalocal

Watchdog Node Information
Node Name : pgpool-poc01.novalocal:5433 Linux pgpool-poc01.novalocal
Host Name : pgpool-poc01.novalocal
Delegate IP : 10.70.184.29
Pgpool port : 5433
Watchdog port : 9000
Node priority : 1
Status : 4
Status Name : MASTER

Node Name : pgpool-poc02.novalocal:5433 Linux pgpool-poc02.novalocal
Host Name : pgpool-poc02.novalocal
Delegate IP : 10.70.184.29
Pgpool port : 5433
Watchdog port : 9000
Node priority : 1
Status : 7
Status Name : STANDBY

[root@pgpool-poc02 replscripts]# pcp_watchdog_info -h pgpool-poc02.novalocal -U postgres -p 9898 -v -w
Watchdog Cluster Information
Total Nodes : 2
Remote Nodes : 1
Quorum state : QUORUM EXIST
Alive Remote Nodes : 1
VIP up on local node : NO
Master Node Name : pgpool-poc01.novalocal:5433 Linux pgpool-poc01.novalocal
Master Host Name : pgpool-poc01.novalocal

Watchdog Node Information
Node Name : pgpool-poc02.novalocal:5433 Linux pgpool-poc02.novalocal
Host Name : pgpool-poc02.novalocal
Delegate IP : 10.70.184.29
Pgpool port : 5433
Watchdog port : 9000
Node priority : 1
Status : 7
Status Name : STANDBY

Node Name : pgpool-poc01.novalocal:5433 Linux pgpool-poc01.novalocal
Host Name : pgpool-poc01.novalocal
Delegate IP : 10.70.184.29
Pgpool port : 5433
Watchdog port : 9000
Node priority : 1
Status : 4
Status Name : MASTER

[root@pgpool-poc02 replscripts]#

raj.pandey1982@gmail.com

2020-01-29 18:29

reporter   ~0003102

========================pcp_watchdog_info command on 1st/2nd node After failover


pcp_watchdog_info at node 1:-

[root@pgpool-poc01 etc]# pcp_watchdog_info -h pgpool-poc01.novalocal -U postgres -p 9898 -v -w
Watchdog Cluster Information
Total Nodes : 2
Remote Nodes : 1
Quorum state : QUORUM EXIST
Alive Remote Nodes : 1
VIP up on local node : YES
Master Node Name : pgpool-poc01.novalocal:5433 Linux pgpool-poc01.novalocal
Master Host Name : pgpool-poc01.novalocal

Watchdog Node Information
Node Name : pgpool-poc01.novalocal:5433 Linux pgpool-poc01.novalocal
Host Name : pgpool-poc01.novalocal
Delegate IP : 10.70.184.29
Pgpool port : 5433
Watchdog port : 9000
Node priority : 1
Status : 4
Status Name : MASTER

Node Name : pgpool-poc02.novalocal:5433 Linux pgpool-poc02.novalocal
Host Name : pgpool-poc02.novalocal
Delegate IP : 10.70.184.29
Pgpool port : 5433
Watchdog port : 9000
Node priority : 1
Status : 7
Status Name : STANDBY

[root@pgpool-poc01 etc]# pcp_watchdog_info -h pgpool-poc02.novalocal -U postgres -p 9898 -v -w
Watchdog Cluster Information
Total Nodes : 2
Remote Nodes : 1
Quorum state : QUORUM EXIST
Alive Remote Nodes : 1
VIP up on local node : NO
Master Node Name : pgpool-poc01.novalocal:5433 Linux pgpool-poc01.novalocal
Master Host Name : pgpool-poc01.novalocal

Watchdog Node Information
Node Name : pgpool-poc02.novalocal:5433 Linux pgpool-poc02.novalocal
Host Name : pgpool-poc02.novalocal
Delegate IP : 10.70.184.29
Pgpool port : 5433
Watchdog port : 9000
Node priority : 1
Status : 7
Status Name : STANDBY

Node Name : pgpool-poc01.novalocal:5433 Linux pgpool-poc01.novalocal
Host Name : pgpool-poc01.novalocal
Delegate IP : 10.70.184.29
Pgpool port : 5433
Watchdog port : 9000
Node priority : 1
Status : 4
Status Name : MASTER



pcp_watchdog_info at node 2:-

[root@pgpool-poc02 replscripts]# pcp_watchdog_info -h pgpool-poc01.novalocal -U postgres -p 9898 -v -w
Watchdog Cluster Information
Total Nodes : 2
Remote Nodes : 1
Quorum state : QUORUM EXIST
Alive Remote Nodes : 1
VIP up on local node : YES
Master Node Name : pgpool-poc01.novalocal:5433 Linux pgpool-poc01.novalocal
Master Host Name : pgpool-poc01.novalocal

Watchdog Node Information
Node Name : pgpool-poc01.novalocal:5433 Linux pgpool-poc01.novalocal
Host Name : pgpool-poc01.novalocal
Delegate IP : 10.70.184.29
Pgpool port : 5433
Watchdog port : 9000
Node priority : 1
Status : 4
Status Name : MASTER

Node Name : pgpool-poc02.novalocal:5433 Linux pgpool-poc02.novalocal
Host Name : pgpool-poc02.novalocal
Delegate IP : 10.70.184.29
Pgpool port : 5433
Watchdog port : 9000
Node priority : 1
Status : 7
Status Name : STANDBY

[root@pgpool-poc02 replscripts]# pcp_watchdog_info -h pgpool-poc02.novalocal -U postgres -p 9898 -v -w
Watchdog Cluster Information
Total Nodes : 2
Remote Nodes : 1
Quorum state : QUORUM EXIST
Alive Remote Nodes : 1
VIP up on local node : NO
Master Node Name : pgpool-poc01.novalocal:5433 Linux pgpool-poc01.novalocal
Master Host Name : pgpool-poc01.novalocal

Watchdog Node Information
Node Name : pgpool-poc02.novalocal:5433 Linux pgpool-poc02.novalocal
Host Name : pgpool-poc02.novalocal
Delegate IP : 10.70.184.29
Pgpool port : 5433
Watchdog port : 9000
Node priority : 1
Status : 7
Status Name : STANDBY

Node Name : pgpool-poc01.novalocal:5433 Linux pgpool-poc01.novalocal
Host Name : pgpool-poc01.novalocal
Delegate IP : 10.70.184.29
Pgpool port : 5433
Watchdog port : 9000
Node priority : 1
Status : 4
Status Name : MASTER

[root@pgpool-poc02 replscripts]#

raj.pandey1982@gmail.com

2020-01-29 18:33

reporter   ~0003103

>What do you mean by "Master-Slave replication"? Can you please elaborate?
I mean I am using master-slave WAL replication, with slot replication.
#------------------------------------------------------------------------------
# REPLICATION MODE
#------------------------------------------------------------------------------

replication_mode = off
                                   # Activate replication mode
                                   # (change requires restart)
replicate_select = off

t-ishii

2020-01-29 20:55

developer   ~0003104

> I mean I am using master-slave WAL replication, with slot replication.
That's actually called "streaming replication" in the PostgreSQL world.

raj.pandey1982@gmail.com

2020-01-29 22:31

reporter   ~0003105

Any luck? We are running out of time. Hope you understand the scenario under which the issue is occurring.
>2020-01-23 12:14:21: pid 2909:LOG: invalid degenerate backend request, node id : 0 status: [3] is not valid for failover
The above is, I also think, the first thing to worry about: why this is coming up at all, and why any error would come after a successful DB failover and promotion.

t-ishii

2020-01-29 23:29

developer   ~0003107

But from the code's point of view it should only happen when detach_false_primary = on, as I said. If it were on, I could form some theory. Have you ever turned it on?

raj.pandey1982@gmail.com

2020-01-30 04:48

reporter   ~0003108

I did not do this before, but I just tried once with detach_false_primary = on and restarted the master and standby pgpool services; but then remote connections stopped happening and the logs give the below error:-
2020-01-29 22:27:05: pid 24481:LOG: Backend status file /var/log/pgpool/pgpool_status discarded
2020-01-29 22:27:05: pid 24481:LOG: memory cache initialized
2020-01-29 22:27:05: pid 24481:DETAIL: memcache blocks :64
2020-01-29 22:27:05: pid 24481:LOG: pool_discard_oid_maps: discarded memqcache oid maps
2020-01-29 22:27:05: pid 24481:LOG: waiting for watchdog to initialize
2020-01-29 22:27:05: pid 24483:LOG: setting the local watchdog node name to "pgpool-poc01.novalocal:5433 Linux pgpool-poc01.novalocal"
2020-01-29 22:27:05: pid 24483:LOG: watchdog cluster is configured with 1 remote nodes
2020-01-29 22:27:05: pid 24483:LOG: watchdog remote node:0 on pgpool-poc02.novalocal:9000
2020-01-29 22:27:05: pid 24483:LOG: interface monitoring is disabled in watchdog
2020-01-29 22:27:05: pid 24483:LOG: watchdog node state changed from [DEAD] to [LOADING]
2020-01-29 22:27:10: pid 24483:LOG: watchdog node state changed from [LOADING] to [JOINING]
2020-01-29 22:27:14: pid 24483:LOG: watchdog node state changed from [JOINING] to [INITIALIZING]
2020-01-29 22:27:15: pid 24483:LOG: I am the only alive node in the watchdog cluster
2020-01-29 22:27:15: pid 24483:HINT: skipping stand for coordinator state
2020-01-29 22:27:15: pid 24483:LOG: watchdog node state changed from [INITIALIZING] to [MASTER]
2020-01-29 22:27:15: pid 24483:LOG: I am announcing my self as master/coordinator watchdog node
2020-01-29 22:27:19: pid 24483:LOG: I am the cluster leader node
2020-01-29 22:27:19: pid 24483:DETAIL: our declare coordinator message is accepted by all nodes
2020-01-29 22:27:19: pid 24483:LOG: setting the local node "pgpool-poc01.novalocal:5433 Linux pgpool-poc01.novalocal" as watchdog cluster master
2020-01-29 22:27:19: pid 24483:LOG: I am the cluster leader node. Starting escalation process
2020-01-29 22:27:19: pid 24481:LOG: watchdog process is initialized
2020-01-29 22:27:19: pid 24481:DETAIL: watchdog messaging data version: 1.1
2020-01-29 22:27:19: pid 24483:LOG: escalation process started with PID:24485
2020-01-29 22:27:19: pid 24485:LOG: watchdog: escalation started
2020-01-29 22:27:19: pid 24483:LOG: new IPC connection received
2020-01-29 22:27:19: pid 24481:LOG: Setting up socket for 0.0.0.0:5433
2020-01-29 22:27:19: pid 24481:LOG: Setting up socket for :::5433
2020-01-29 22:27:19: pid 24483:LOG: new IPC connection received
2020-01-29 22:27:19: pid 24486:LOG: 2 watchdog nodes are configured for lifecheck
2020-01-29 22:27:19: pid 24486:LOG: watchdog nodes ID:0 Name:"pgpool-poc01.novalocal:5433 Linux pgpool-poc01.novalocal"
2020-01-29 22:27:19: pid 24486:DETAIL: Host:"pgpool-poc01.novalocal" WD Port:9000 pgpool-II port:5433
2020-01-29 22:27:19: pid 24486:LOG: watchdog nodes ID:1 Name:"Not_Set"
2020-01-29 22:27:19: pid 24486:DETAIL: Host:"pgpool-poc02.novalocal" WD Port:9000 pgpool-II port:5433
2020-01-29 22:27:19: pid 24486:LOG: watchdog lifecheck trusted server "mohvcasdb01.novalocal" added for the availability check
2020-01-29 22:27:19: pid 24486:LOG: watchdog lifecheck trusted server "mohcasdevdb.novalocal" added for the availability check
2020-01-29 22:27:20: pid 24481:LOG: find_primary_node_repeatedly: waiting for finding a primary node
2020-01-29 22:27:20: pid 24481:LOG: verify_backend_node_status: primary 0 owns only 0 standbys out of 1
2020-01-29 22:27:20: pid 24481:LOG: find_primary_node: primary node is 0
2020-01-29 22:27:20: pid 24481:LOG: find_primary_node: standby node is 1
2020-01-29 22:27:20: pid 25492:LOG: PCP process: 25492 started
2020-01-29 22:27:20: pid 24481:LOG: pgpool-II successfully started. version 4.1.0 (karasukiboshi)
2020-01-29 22:27:20: pid 24481:LOG: node status[0]: 1
2020-01-29 22:27:20: pid 24481:LOG: node status[1]: 2
2020-01-29 22:27:20: pid 25493:LOG: verify_backend_node_status: primary 0 owns only 0 standbys out of 1
2020-01-29 22:27:20: pid 24493:LOG: createing watchdog heartbeat receive socket.
2020-01-29 22:27:20: pid 24493:DETAIL: bind receive socket to device: "eth0"
2020-01-29 22:27:20: pid 24493:LOG: set SO_REUSEPORT option to the socket
2020-01-29 22:27:20: pid 24495:LOG: creating socket for sending heartbeat
2020-01-29 22:27:20: pid 24495:DETAIL: bind send socket to device: eth0
2020-01-29 22:27:20: pid 24493:LOG: creating watchdog heartbeat receive socket.
2020-01-29 22:27:20: pid 24493:DETAIL: set SO_REUSEPORT
2020-01-29 22:27:20: pid 24495:LOG: set SO_REUSEPORT option to the socket
2020-01-29 22:27:20: pid 24495:LOG: creating socket for sending heartbeat
2020-01-29 22:27:20: pid 24495:DETAIL: set SO_REUSEPORT
2020-01-29 22:27:23: pid 24485:LOG: successfully acquired the delegate IP:"10.70.184.29"
2020-01-29 22:27:23: pid 24485:DETAIL: 'if_up_cmd' returned with success
2020-01-29 22:27:23: pid 24483:LOG: watchdog escalation process with pid: 24485 exit with SUCCESS.
2020-01-29 22:27:25: pid 25493:LOG: verify_backend_node_status: primary 0 owns only 0 standbys out of 1
2020-01-29 22:27:30: pid 25493:LOG: verify_backend_node_status: primary 0 owns only 0 standbys out of 1
2020-01-29 22:27:35: pid 25493:LOG: verify_backend_node_status: primary 0 owns only 0 standbys out of 1
2020-01-29 22:27:40: pid 25493:LOG: verify_backend_node_status: primary 0 owns only 0 standbys out of 1
2020-01-29 22:27:45: pid 25493:LOG: verify_backend_node_status: primary 0 owns only 0 standbys out of 1
2020-01-29 22:27:50: pid 25493:LOG: verify_backend_node_status: primary 0 owns only 0 standbys out of 1
2020-01-29 22:27:55: pid 25493:LOG: verify_backend_node_status: primary 0 owns only 0 standbys out of 1
2020-01-29 22:28:00: pid 25493:LOG: verify_backend_node_status: primary 0 owns only 0 standbys out of 1
2020-01-29 22:28:03: pid 24483:LOG: new watchdog node connection is received from "10.70.184.28:57574"
2020-01-29 22:28:03: pid 24483:LOG: new node joined the cluster hostname:"pgpool-poc02.novalocal" port:9000 pgpool_port:5433
2020-01-29 22:28:03: pid 24483:DETAIL: Pgpool-II version:"4.1.0" watchdog messaging version: 1.1
2020-01-29 22:28:03: pid 24483:LOG: new outbound connection to pgpool-poc02.novalocal:9000
2020-01-29 22:28:05: pid 25493:LOG: verify_backend_node_status: primary 0 owns only 0 standbys out of 1
2020-01-29 22:28:09: pid 24483:LOG: adding watchdog node "pgpool-poc02.novalocal:5433 Linux pgpool-poc02.novalocal" to the standby list
2020-01-29 22:28:09: pid 24481:LOG: Pgpool-II parent process received watchdog quorum change signal from watchdog
2020-01-29 22:28:09: pid 24483:LOG: new IPC connection received
2020-01-29 22:28:09: pid 24481:LOG: watchdog cluster now holds the quorum
2020-01-29 22:28:09: pid 24481:DETAIL: updating the state of quarantine backend nodes
2020-01-29 22:28:09: pid 24483:LOG: new IPC connection received
2020-01-29 22:28:10: pid 25493:LOG: verify_backend_node_status: primary 0 owns only 0 standbys out of 1
2020-01-29 22:28:15: pid 25493:LOG: verify_backend_node_status: primary 0 owns only 0 standbys out of 1
2020-01-29 22:28:20: pid 25493:LOG: verify_backend_node_status: primary 0 owns only 0 standbys out of 1
2020-01-29 22:28:25: pid 25493:LOG: verify_backend_node_status: primary 0 owns only 0 standbys out of 1
2020-01-29 22:28:30: pid 25493:LOG: verify_backend_node_status: primary 0 owns only 0 standbys out of 1
2020-01-29 22:28:35: pid 25493:LOG: verify_backend_node_status: primary 0 owns only 0 standbys out of 1
2020-01-29 22:28:40: pid 25493:LOG: verify_backend_node_status: primary 0 owns only 0 standbys out of 1
2020-01-29 22:28:45: pid 25493:LOG: verify_backend_node_status: primary 0 owns only 0 standbys out of 1
2020-01-29 22:28:50: pid 25493:LOG: verify_backend_node_status: primary 0 owns only 0 standbys out of 1
2020-01-29 22:28:55: pid 25493:LOG: verify_backend_node_status: primary 0 owns only 0 standbys out of 1
2020-01-29 22:29:00: pid 25493:LOG: verify_backend_node_status: primary 0 owns only 0 standbys out of 1

t-ishii

2020-01-30 10:38

developer   ~0003109

> 2020-01-29 22:28:10: pid 25493:LOG: verify_backend_node_status: primary 0 owns only 0 standbys out of 1
It seems PostgreSQL 0 and 1 are alive and PostgreSQL 0 is primary. But PostgreSQL 1 is not properly connected to the primary as a streaming replication standby. You can confirm this by issuing "show pool_nodes" or sending the query "select * from pg_stat_replication" to the PostgreSQL primary.
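
For reference, a minimal sketch of both checks using the VIP, hostnames and ports seen in this thread (these exact command lines are assumptions, not from the original report):

# via Pgpool-II (delegate VIP, pgpool port 5433)
/usr/local/pgsql11.5/bin/psql -h 10.70.184.29 -p 5433 -U postgres -d postgres -c "show pool_nodes;"
# directly on the current primary (node 0 here)
/usr/local/pgsql11.5/bin/psql -h pgpool-poc01.novalocal -p 5432 -U postgres -d postgres -c "select * from pg_stat_replication;"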

Going back to the original problem:
> 2020-01-22 12:39:16: pid 20780:LOG: invalid degenerate backend request, node id : 0 status: [3] is not valid for failover
I believe that detach_false_primary was on at that time. Anyway, setting detach_false_primary = off should solve the problem.

raj.pandey1982@gmail.com

2020-01-30 15:10

reporter   ~0003110

>You can confirm this by issuing "show pool_nodes" or sending the query "select * from pg_stat_replication" to the PostgreSQL primary.
show pool_nodes:-

mawidstg01=# show pool_nodes;
 node_id | hostname | port | status | lb_weight | role | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
---------+------------------------+------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
 0 | pgpool-poc01.novalocal | 5432 | up | 0.500000 | primary | 0 | false | 0 | | | 2020-01-30 09:04:22
 1 | pgpool-poc02.novalocal | 5432 | up | 0.500000 | standby | 0 | true | 0 | | | 2020-01-30 09:04:22
(2 rows)


select * from pg_stat_replication:
pid usesysid usename application_name client_addr client_hostname client_port backend_start backend_xmin state sent_lsn write_lsn flush_lsn replay_lsn write_lag flush_lag replay_lag sync_priority sync_state
24424 98470 replication walreceiver 10.70.184.28 43510 2020-01-29 22:23:27.035277+03 8915528 streaming 109/B301F4B8 109/B301F4B8 109/B301F4B8 109/B301F4B8 0 async


>I believe that detach_false_primary= on at that time. Anyway, set detach_false_primary= off should solve the problem.
> 2020-01-29 22:28:10: pid 25493:LOG: verify_backend_node_status: primary 0 owns only 0 standbys out of 1
>> I produced the above error after setting detach_false_primary = on and restarting the pgpool services (master and slave were up at that time) just to show you; otherwise detach_false_primary is off by default.

t-ishii

2020-01-30 15:32

developer   ~0003111

In show pool_nodes, the "replication_state" column is empty. This means Pgpool-II fails to collect the streaming replication state. Check the sr_check* parameters. Since "select * from pg_stat_replication" seems to work (Pgpool-II actually sends the same query), streaming replication itself is working. Probably the role, permission or password is not appropriate.
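
For reference, a minimal sr_check_* sketch in pgpool.conf (a sketch only; the user, password and database values here are assumptions based on later comments in this thread, and the account must be allowed to read pg_stat_replication on the primary):

sr_check_period = 10
                                   # Streaming replication check period in seconds
sr_check_user = 'postgres'
sr_check_password = 'postgrestg'
sr_check_database = 'postgres'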

raj.pandey1982@gmail.com

2020-01-30 18:27

reporter   ~0003112

For sr_check* I changed the user from 'replication' to 'postgres' and the password accordingly, then did the failover, but got the same error:-

I don't understand why, when I take the master DB down and the log gives a degenerate backend request, it still keeps trying to connect only to the node 1 master DB even after failover and promotion:-
(FYI: show pool_nodes still gives "replication_state" blank even from the PGAdmin client; when I fire select * from pg_stat_replication, all field values appear)

2020-01-30 12:21:06: pid 14552:DETAIL: postmaster on DB node 0 was shutdown by administrative command
2020-01-30 12:21:06: pid 14552:LOG: received degenerate backend request for node_id: 0 from pid [14552]
2020-01-30 12:21:06: pid 15546:LOG: reading and processing packets
2020-01-30 12:21:06: pid 15546:DETAIL: postmaster on DB node 0 was shutdown by administrative command
2020-01-30 12:21:06: pid 15546:LOG: received degenerate backend request for node_id: 0 from pid [15546]
2020-01-30 12:21:06: pid 14527:LOG: new IPC connection received
2020-01-30 12:21:06: pid 14527:LOG: new IPC connection received
2020-01-30 12:21:06: pid 14527:LOG: watchdog received the failover command from local pgpool-II on IPC interface
2020-01-30 12:21:06: pid 14527:LOG: watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from local pgpool-II on IPC interface
2020-01-30 12:21:06: pid 14527:LOG: we have got the consensus to perform the failover
2020-01-30 12:21:06: pid 14527:DETAIL: 1 node(s) voted in the favor
2020-01-30 12:21:06: pid 14527:LOG: watchdog received the failover command from local pgpool-II on IPC interface
2020-01-30 12:21:06: pid 14527:LOG: watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from local pgpool-II on IPC interface
2020-01-30 12:21:06: pid 14527:LOG: we have got the consensus to perform the failover
2020-01-30 12:21:06: pid 14527:DETAIL: 1 node(s) voted in the favor
2020-01-30 12:21:06: pid 14525:LOG: Pgpool-II parent process has received failover request
2020-01-30 12:21:06: pid 14527:LOG: new IPC connection received
2020-01-30 12:21:06: pid 14527:LOG: received the failover indication from Pgpool-II on IPC interface
2020-01-30 12:21:06: pid 14527:LOG: watchdog is informed of failover start by the main process
2020-01-30 12:21:06: pid 14525:LOG: starting degeneration. shutdown host pgpool-poc01.novalocal(5432)
2020-01-30 12:21:06: pid 15522:LOG: reading and processing packets
2020-01-30 12:21:06: pid 15522:DETAIL: postmaster on DB node 0 was shutdown by administrative command
2020-01-30 12:21:06: pid 15522:LOG: received degenerate backend request for node_id: 0 from pid [15522]
2020-01-30 12:21:06: pid 14527:LOG: new IPC connection received
2020-01-30 12:21:06: pid 14527:LOG: watchdog received the failover command from local pgpool-II on IPC interface
2020-01-30 12:21:06: pid 14527:LOG: watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from local pgpool-II on IPC interface
2020-01-30 12:21:06: pid 14527:LOG: we have got the consensus to perform the failover
2020-01-30 12:21:06: pid 14527:DETAIL: 1 node(s) voted in the favor
2020-01-30 12:21:06: pid 14525:LOG: Restart all children
2020-01-30 12:21:06: pid 14525:LOG: execute command: /usr/share/pgpool/4.1.0/etc/failover.sh 0 0 pgpool-poc02.novalocal reppassword /installer/postgresql-11.5/data/im_the_master
Authorized Uses Only.All activity may be Monitored and Reported
promote - Start
DEBUG: The script will be executed with the following arguments:
DEBUG: --trigger-file=/installer/postgresql-11.5/data/im_the_master
DEBUG: --standby_file=/installer/postgresql-11.5/data/im_slave
DEBUG: --demote-host=
DEBUG: --user=replication
DEBUG: --password=reppassword
DEBUG: --force
INFO: Checking if standby file exists...
INFO: Checking if trigger file exists...
INFO: Deleting recovery.conf file...
INFO: Checking if postgresql.conf file exists...
INFO: postgresql.conf file found. Checking if it is for primary server...
INFO: postgresql.conf file corresponds to primary server file. Nothing to do.
pg_ctl: server is running (PID: 2689)
/usr/local/pgsql11.5/bin/postgres "-D" "/installer/postgresql-11.5/data"
INFO: Restarting postgresql service...
waiting for server to shut down....2020-01-30 12:21:06: pid 15554:LOG: failed to connect to PostgreSQL server on "pgpool-poc02.novalocal:5432", getsockopt() detected error "Connection refused"
2020-01-30 12:21:06: pid 15554:ERROR: failed to make persistent db connection
2020-01-30 12:21:06: pid 15554:DETAIL: connection to host:"pgpool-poc02.novalocal:5432" failed
2020-01-30 12:21:06: pid 15554:LOG: health check failed on node 1 (timeout:0)
2020-01-30 12:21:06: pid 15554:LOG: received degenerate backend request for node_id: 1 from pid [15554]
2020-01-30 12:21:06: pid 14527:LOG: new IPC connection received
2020-01-30 12:21:06: pid 14527:LOG: watchdog received the failover command from local pgpool-II on IPC interface
2020-01-30 12:21:06: pid 14527:LOG: watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from local pgpool-II on IPC interface
2020-01-30 12:21:06: pid 14527:LOG: we have got the consensus to perform the failover
2020-01-30 12:21:06: pid 14527:DETAIL: 1 node(s) voted in the favor
 done
server stopped
waiting for server to start....2020-01-30 12:21:06 +03 LOG: listening on IPv4 address "0.0.0.0", port 5432
2020-01-30 12:21:06 +03 LOG: listening on IPv6 address "::", port 5432
2020-01-30 12:21:06 +03 LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
2020-01-30 12:21:06 +03 LOG: redirecting log output to logging collector process
2020-01-30 12:21:06 +03 HINT: Future log output will appear in directory "/dblogs/logs".
 done
server started
pg_ctl: server is running (PID: 4709)
/usr/local/pgsql11.5/bin/postgres "-D" "/installer/postgresql-11.5/data"
INFO: postgresql already running.
INFO: Ensuring replication role and password...
INFO: Replication role found. Ensuring password...
ALTER ROLE
INFO: Creating primary info file...
promote - Done!
2020-01-30 12:21:08: pid 15551:LOG: failed to connect to PostgreSQL server on "pgpool-poc01.novalocal:5432", getsockopt() detected error "Connection refused"
2020-01-30 12:21:08: pid 15551:ERROR: failed to make persistent db connection
2020-01-30 12:21:08: pid 15551:DETAIL: connection to host:"pgpool-poc01.novalocal:5432" failed
2020-01-30 12:21:09: pid 14527:LOG: watchdog received the failover command from remote pgpool-II node "pgpool-poc02.novalocal:5433 Linux pgpool-poc02.novalocal"
2020-01-30 12:21:09: pid 14527:LOG: watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from pgpool-poc02.novalocal:5433 Linux pgpool-poc02.novalocal
2020-01-30 12:21:09: pid 14527:LOG: we have got the consensus to perform the failover
2020-01-30 12:21:09: pid 14527:DETAIL: 1 node(s) voted in the favor
2020-01-30 12:21:09: pid 14527:LOG: invalid degenerate backend request, node id : 0 status: [3] is not valid for failover
2020-01-30 12:21:13: pid 15551:ERROR: Failed to check replication time lag
2020-01-30 12:21:13: pid 15551:DETAIL: No persistent db connection for the node 0
2020-01-30 12:21:13: pid 15551:HINT: check sr_check_user and sr_check_password
2020-01-30 12:21:13: pid 15551:CONTEXT: while checking replication time lag
2020-01-30 12:21:13: pid 15551:LOG: failed to connect to PostgreSQL server on "pgpool-poc01.novalocal:5432", getsockopt() detected error "Connection refused"
2020-01-30 12:21:13: pid 15551:ERROR: failed to make persistent db connection
2020-01-30 12:21:13: pid 15551:DETAIL: connection to host:"pgpool-poc01.novalocal:5432" failed
2020-01-30 12:21:14: pid 14527:LOG: watchdog received the failover command from remote pgpool-II node "pgpool-poc02.novalocal:5433 Linux pgpool-poc02.novalocal"
2020-01-30 12:21:14: pid 14527:LOG: watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from pgpool-poc02.novalocal:5433 Linux pgpool-poc02.novalocal
2020-01-30 12:21:14: pid 14527:LOG: we have got the consensus to perform the failover
2020-01-30 12:21:14: pid 14527:DETAIL: 1 node(s) voted in the favor
2020-01-30 12:21:14: pid 14527:LOG: invalid degenerate backend request, node id : 0 status: [3] is not valid for failover

raj.pandey1982@gmail.com

2020-01-30 19:15

reporter   ~0003113

I was looking into "https://www.pgpool.net/mantisbt/view.php?id=421", where almost the same error appears in the log; you suggested search_primary_node_timeout=10 and it worked for that reporter. But it is not working in my case.

raj.pandey1982@gmail.com

2020-01-31 04:59

reporter   ~0003114

Problem statement: I am not able to read/write to the newly promoted master node through the pgpool VIP. Only after I restart the pgpool services am I able to read/write to the newly elected master.

Scenario 1: Manual promotion of the standby node as master:-
(a) stopped the master DB manually
(b) ran the failover.sh script manually to promote the standby node as the new master node.
(c) read/write works through the pgpool VIP without restarting the pgpool service.

Scenario 2: Automated promotion of the standby node as master through pgpool (using failover_command):-
(a) stopped the master DB manually
(b) got the consensus to perform the failover, and the failover.sh script promoting the standby node as the new master node completed successfully.
(c) read/write DOES NOT work through the pgpool VIP without restarting the pgpool service. If I restart the pgpool service, then read/write works properly.

In both scenarios the configuration files and parameter files were the same, except that in the 1st scenario the failover_command parameter was commented out for the manual promotion.

Need your help, so that in the automated failover scenario I don't need to restart the pgpool services to read/write to the newly elected master DB using the pgpool VIP.
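
For reference, a minimal sketch of how to check what status pgpool itself holds for the old master backend right after failover, without restarting anything (hostname and PCP port taken from this thread; the trailing 0 is the backend node id):

pcp_node_info -h pgpool-poc02.novalocal -p 9898 -U postgres -w -v 0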

Attaching the latest pgpool.conf.

pgpool.conf (41,787 bytes)

raj.pandey1982@gmail.com

2020-02-02 18:23

reporter   ~0003115

Might I get some update here, please?

raj.pandey1982@gmail.com

2020-02-02 22:03

reporter   ~0003116

replication_state | replication_sync_state are showing blank even though I have set the below:-

backend_application_name0 = 'pgpool-poc01.novalocal'
backend_application_name1 = 'pgpool-poc02.novalocal'
--------------------------------------------------------------------------------------------------------------------
Also At Node 2 Stand by server : node status show 0 in log

2020-02-02 15:51:57: pid 32249:LOG: node status[0]: 0
2020-02-02 15:51:57: pid 32249:LOG: node status[1]: 0

--------------------------------------------------------------------------------------------------------------------------
while At Node 2 Stand by server : node status show 1/2 in log:-
2020-02-02 15:48:24: pid 31835:LOG: node status[0]: 1
2020-02-02 15:48:24: pid 31835:LOG: node status[1]: 2

Is this related to pgpool not responding properly after failover?

t-ishii

2020-02-04 13:48

developer   ~0003120

> Also At Node 2 Stand by server : node status show 0 in log
>
> 2020-02-02 15:51:57: pid 32249:LOG: node status[0]: 0
> 2020-02-02 15:51:57: pid 32249:LOG: node status[1]: 0
>
> --------------------------------------------------------------------------------------------------------------------------
> while At Node 2 Stand by server : node status show 1/2 in log:-
> 2020-02-02 15:48:24: pid 31835:LOG: node status[0]: 1
> 2020-02-02 15:48:24: pid 31835:LOG: node status[1]: 2

So sometimes Node 2 shows status 0, 0 and sometimes 1, 2?

t-ishii

2020-02-04 13:52

developer   ~0003121

I think you haven't told me the version of PostgreSQL. Can you share it?

raj.pandey1982@gmail.com

2020-02-04 14:30

reporter   ~0003122

>So sometimes Node 2 shows status 0, 0 and sometimes 1, 2?
The pgpool master node always shows node0=1 and node1=2; the pgpool standby node always shows node0=0 and node1=0.

>I think you haven't tell me the version of PostgreSQL. Can you share it?>
 PostgreSql 11.5 .

>One more finding for your help:- I get the same kind of error with pgpool version 4.0 during failover.

t-ishii

2020-02-04 14:49

developer   ~0003123

Can you attach the latest pgpool.conf, postgresql.conf and recovery.conf here? Cut-and-paste into the comments is hard to read. Also, you seem to have made modifications to them.

raj.pandey1982@gmail.com

2020-02-04 15:58

reporter  

pg_hba.conf (11,678 bytes)
pgpool-2.conf (41,871 bytes)
failover.sh (855 bytes)

raj.pandey1982@gmail.com

2020-02-04 15:58

reporter   ~0003124

Please find the attached file as per request.

recovery.conf (352 bytes)
promote.sh (11,131 bytes)
pool_hba.conf (4,461 bytes)
postgresql.conf (24,089 bytes)

t-ishii

2020-02-04 16:34

developer   ~0003125

I see a problem in recovery.conf:
primary_conninfo = 'user=replication password=reppassword host=10.70.184.27 port=5432 sslmode=disable sslcompression=0 target_session_attrs=any'

You need to add the "application_name" parameter, something like:
application_name=pgpool-poc01.novalocal
(this is an example for pgpool-poc01.novalocal)

raj.pandey1982@gmail.com

2020-02-04 16:50

reporter   ~0003126

> You need to add "application_name" parameter something like:>

Is the below correct now? I just added application_name=<hostname>:

primary_conninfo = 'user=replication password=reppassword host=10.70.184.27 application_name=pgpool-poc01.novalocal port=5432 sslmode=disable sslcompression=0 target_session_attrs=any'

But I use the above only once, while making the slave DB through PG_backup, and during DB failover/promotion this recovery.conf file gets deleted too... so I am not sure how this file could be the cause of the issue.

t-ishii

2020-02-04 17:13

developer   ~0003127

> Is below correct now:?
Looks good to me.

> But this above i use only once while making slave DB through PG_backup. and while db failover/promotion this recovery.conf file gets deleted too....so not sure how this file could be the cause of the issue.

No. recovery.conf is used by standby PostgreSQL to sync with primary PostgreSQL server even after failover. So you need to keep recovery.conf after making new standby PostgreSQL. The work should be done by recovery_1st_stage_command (in your case recovery_1st_stage.sh). If you are not sure how to do that, you can find an example at:
src/sample/scripts/recovery_1st_stage.sample
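
For reference, a minimal recovery.conf sketch of what recovery_1st_stage.sh could generate for the standby on pgpool-poc02 (values taken from this thread; as clarified later in this thread, the application_name must name the standby node itself):

standby_mode = 'on'
primary_conninfo = 'user=replication password=reppassword host=10.70.184.27 port=5432 application_name=pgpool-poc02.novalocal'
trigger_file = '/installer/tmp/im_the_master'
restore_command = 'cp /installer/archivedir/%f "%p"'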

raj.pandey1982@gmail.com

2020-02-04 17:46

reporter   ~0003128

>No. recovery.conf is used by standby PostgreSQL to sync with primary PostgreSQL server even after failover.
I am not deleting recovery.conf while/after preparing the slave DB. Only when I shut down the master DB does recovery.conf get deleted, by the failover script, during failover to promote the slave as master. And master/slave sync works properly.

Now I just created the slave DB again (as I always need to do once the slave is opened as master after the old master fails over), after adding the application_name=pgpool-poc01.novalocal parameter in the recovery.conf file.
Both Master/Slave are in sync now:-

[postgres@pgpool-poc02 logs]$ tail -200f mawidstg01-2020-02-04_113709.log
2020-02-04 11:37:09 +03 LOG: database system was interrupted; last known up at 2020-02-04 11:00:25 +03
2020-02-04 11:37:09 +03 LOG: entering standby mode
cp: cannot stat ‘/installer/archivedir/0000000100000109000000CD’: Not a directory
2020-02-04 11:37:09 +03 LOG: redo starts at 109/CD000028
2020-02-04 11:37:09 +03 LOG: consistent recovery state reached at 109/CD000130
2020-02-04 11:37:09 +03 LOG: database system is ready to accept read only connections
cp: cannot stat ‘/installer/archivedir/0000000100000109000000CE’: Not a directory
2020-02-04 11:37:09 +03 LOG: started streaming WAL from primary at 109/CE000000 on timeline 1

Now I started the pgpool services on both nodes and checked pool_nodes:

[postgres@pgpool-poc01 logs]$ /usr/local/pgsql11.5/bin/psql -h 10.70.184.29 -p 5433 -U postgres -d mawidstg01
psql (11.5)
Type "help" for help.

mawidstg01=# show pool_nodes;
 node_id | hostname | port | status | lb_weight | role | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
---------+------------------------+------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
 0 | pgpool-poc01.novalocal | 5432 | up | 0.500000 | primary | 210 | true | 0 | | | 2020-02-04 11:38:29
 1 | pgpool-poc02.novalocal | 5432 | up | 0.500000 | standby | 0 | false | 0 | | | 2020-02-04 11:38:29
(2 rows)

Again replication_state | replication_sync_state looks blank.

raj.pandey1982@gmail.com

2020-02-04 17:47

reporter   ~0003129

[postgres@pgpool-poc02 data]$ cat recovery.conf
standby_mode = 'on'
primary_conninfo = 'user=replication password=reppassword host=10.70.184.27 application_name=pgpool-poc01.novalocal port=5432 sslmode=disable sslcompression=0 target_session_attrs=any'
primary_conninfo = 'host=10.70.184.27 port=5432 user=replication password=reppassword'
trigger_file = '/installer/tmp/im_the_master'
restore_command = 'cp /installer/archivedir/%f "%p"'

[postgres@pgpool-poc02 data]$

raj.pandey1982@gmail.com

2020-02-04 18:03

reporter   ~0003130

I want to clarify one more thing:- In our setup, the aim is to fail over from master to slave, and once the slave is promoted as master, the application/remote clients should connect to it.

If the above is successful, then I can create a new slave DB any time from the newly promoted master, either by manual restore using rsync or PG_Backup, or by automation through pgpool.

But right now what I want to achieve is that "when the master DB goes down, the slave should be able to accept connections as the newly promoted master". The script is doing its work and promoting the slave as master, but pgpool is not leaving the down node, keeps bugging it, and is not
accepting remote connections.

raj.pandey1982@gmail.com

2020-02-04 18:37

reporter   ~0003131

After a DB failover, if I restart the pgpool services manually on the master and standby, then remote connections to the newly promoted master work (but that is manual work and not automatic), with the below messages in the pgpool log:-

2020-02-04 12:32:46: pid 31014:CONTEXT: while checking replication time lag
2020-02-04 12:32:46: pid 31014:LOG: get_query_result falied: status: -1
2020-02-04 12:32:46: pid 31014:CONTEXT: while checking replication time lag
2020-02-04 12:32:51: pid 31014:LOG: get_query_result: no rows returned
2020-02-04 12:32:51: pid 31014:DETAIL: node id (1)
2020-02-04 12:32:51: pid 31014:CONTEXT: while checking replication time lag
2020-02-04 12:32:51: pid 31014:LOG: get_query_result falied: status: -1
2020-02-04 12:32:51: pid 31014:CONTEXT: while checking replication time lag
2020-02-04 12:32:56: pid 31014:LOG: get_query_result: no rows returned
2020-02-04 12:32:56: pid 31014:DETAIL: node id (1)
2020-02-04 12:32:56: pid 31014:CONTEXT: while checking replication time lag
2020-02-04 12:32:56: pid 31014:LOG: get_query_result falied: status: -1
2020-02-04 12:32:56: pid 31014:CONTEXT: while checking replication time lag
2020-02-04 12:33:01: pid 31014:LOG: get_query_result: no rows returned
2020-02-04 12:33:01: pid 31014:DETAIL: node id (1)
2020-02-04 12:33:01: pid 31014:CONTEXT: while checking replication time lag
2020-02-04 12:33:01: pid 31014:LOG: get_query_result falied: status: -1
2020-02-04 12:33:01: pid 31014:CONTEXT: while checking replication time lag

raj.pandey1982@gmail.com

2020-02-05 16:44

reporter   ~0003137

Hello team, any findings here to resolve this with some fixes?

t-ishii

2020-02-05 17:45

developer   ~0003138

> Again replication_state | replication_sync_state looks blank.
There must be something wrong with your settings.

Can you try the following command on pgpool-poc01.novalocal?

PGPASSWORD=postgrestg psql -p 5432 -h pgpool-poc01.novalocal -d postgres -U postgres -w

raj.pandey1982@gmail.com

2020-02-05 17:47

reporter   ~0003139

Do I need to execute it as it is, or put it in the pgpool.conf file?

raj.pandey1982@gmail.com

2020-02-05 17:51

reporter   ~0003140

The below are working:

[postgres@pgpool-poc01 data]$ /usr/local/pgsql11.5/bin/psql -h 10.70.184.28 -p 5432 -U postgres -d mawidstg01
psql (11.5)
Type "help" for help.

mawidstg01=# \q
[postgres@pgpool-poc01 data]$ /usr/local/pgsql11.5/bin/psql -h pgpool-poc01.novalocal -p 5432 -U postgres -d mawidstg01
psql (11.5)
Type "help" for help.

mawidstg01=# \q
[postgres@pgpool-poc01 data]$ /usr/local/pgsql11.5/bin/psql -h pgpool-poc01.novalocal -p 5432 -U postgres -d postgres
psql (11.5)
Type "help" for help.

postgres=#

raj.pandey1982@gmail.com

2020-02-05 17:52

reporter   ~0003141

postgres=# \q
[postgres@pgpool-poc01 data]$ /usr/local/pgsql11.5/bin/psql -h pgpool-poc01.novalocal -p 5432 -U postgres -d postgres PGPASSWORD=postgrestg -w
psql: warning: extra command-line argument "PGPASSWORD=postgrestg" ignored
psql (11.5)
Type "help" for help.

postgres=# \q
[postgres@pgpool-poc01 data]$ /usr/local/pgsql11.5/bin/psql -h pgpool-poc01.novalocal -p 5432 -U postgres -d postgres PGPASSWORD=postgrestg
psql: warning: extra command-line argument "PGPASSWORD=postgrestg" ignored
psql (11.5)
Type "help" for help.

postgres=#

raj.pandey1982@gmail.com

2020-02-05 21:54

reporter   ~0003142

Hi, I ran this as you asked, and it worked with both the DB and pgpool ports, but the status is still blank:-
[postgres@pgpool-poc01 postgresql-11.5]$ PGPASSWORD=postgrestg /usr/local/pgsql11.5/bin/psql -h pgpool-poc01.novalocal -p 5432 -U postgres -d postgres -w
psql (11.5)
Type "help" for help.

postgres=# select * from pg_catalog.pg_stat_replication;
  pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | backend_xmin | state | sent_lsn | write_lsn | flush_lsn | replay_lsn | write_lag | flush_
lag | replay_lag | sync_priority | sync_state
-------+----------+-------------+------------------+--------------+-----------------+-------------+-------------------------------+--------------+-----------+--------------+--------------+--------------+--------------+-----------+-------
----+------------+---------------+------------
 23100 | 98470 | replication | walreceiver | 10.70.184.28 | | 46798 | 2020-02-04 17:14:34.695439+03 | 8915781 | streaming | 109/D413DF50 | 109/D413DF50 | 109/D413DF50 | 109/D413DF50 | |
    | | 0 | async
(1 row)

postgres=# \q
[postgres@pgpool-poc01 postgresql-11.5]$ PGPASSWORD=postgrestg /usr/local/pgsql11.5/bin/psql -h pgpool-poc01.novalocal -p 5433 -U postgres -d postgres -w
psql (11.5)
Type "help" for help.

postgres=# show pool_nodes;
 node_id | hostname | port | status | lb_weight | role | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
---------+------------------------+------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
 0 | pgpool-poc01.novalocal | 5432 | up | 0.500000 | primary | 0 | true | 0 | | | 2020-02-05 15:34:17
 1 | pgpool-poc02.novalocal | 5432 | up | 0.500000 | standby | 0 | false | 0 | | | 2020-02-05 15:34:17
(2 rows)

postgres=#

t-ishii

2020-02-06 00:08

developer   ~0003143

Last edited: 2020-02-06 00:15


Oops. I forgot to add the SQL to be executed by the psql command:
PGPASSWORD=postgrestg psql -p 5432 -h pgpool-poc01.novalocal -d postgres -U postgres -w -c "SELECT application_name, state, sync_state FROM pg_stat_replication"

This is effectively the same SQL, with the database, user and password specified in pgpool.conf, that is executed by pgpool to collect the information used by the show pool_nodes command.

raj.pandey1982@gmail.com

2020-02-06 01:59

reporter   ~0003144

[postgres@pgpool-poc01 postgresql-11.5]$ PGPASSWORD=postgrestg /usr/local/pgsql11.5/bin/psql -h pgpool-poc01.novalocal -p 5432 -U postgres -d postgres -w -c "SELECT application_name, state, sync_state FROM pg_stat_replication"
 application_name | state | sync_state
------------------+-----------+------------
 walreceiver | streaming | async
(1 row)

[postgres@pgpool-poc01 postgresql-11.5]$

raj.pandey1982@gmail.com

2020-02-06 02:02

reporter   ~0003145

Finally your query gives the result as expected. So what is wrong with my pgpool settings?

t-ishii

2020-02-06 08:13

developer   ~0003146

application_name should be pgpool-poc02.novalocal rather than walreceiver. walreceiver is the default application name for the wal receiver process. This means that the application name is not properly set in recovery.conf on pgpool-poc02.novalocal. Have you set the application_name parameter to pgpool-poc02.novalocal in the recovery.conf on pgpool-poc02.novalocal? I mean:
application_name=pgpool-poc02.novalocal
in the primary_conninfo line.

raj.pandey1982@gmail.com

2020-02-06 16:33

reporter   ~0003147

Hi Friend,
Thanks. I made the changes and this worked. Both field values are now showing up. Now, what is the next thing to check, as I tried the failover after this but hit the same issue:-
[postgres@pgpool-poc02 data]$ cat /tmp/slaveconfbkp/recovery.conf
standby_mode = 'on'
primary_conninfo = 'user=replication password=reppassword host=10.70.184.27 application_name=pgpool-poc02.novalocal port=5432 sslmode=disable sslcompression=0 target_session_attrs=any'
trigger_file = '/installer/tmp/im_the_master'
restore_command = 'cp /installer/archivedir/%f "%p"'


This worked
[postgres@pgpool-poc01 postgresql-11.5]$ PGPASSWORD=postgrestg /usr/local/pgsql11.5/bin/psql -h pgpool-poc01.novalocal -p 5433 -U postgres -d postgres -w -c "SELECT application_name, state, sync_state FROM pg_stat_replication"
    application_name | state | sync_state
------------------------+-----------+------------
 pgpool-poc02.novalocal | streaming | async
(1 row)

[postgres@pgpool-poc01 postgresql-11.5]$ /usr/local/pgsql11.5/bin/psql -h 10.70.184.29 -p 5433 -U postgres -d mawidstg01
psql (11.5)
Type "help" for help.

mawidstg01=# show pool_nodes;
 node_id | hostname | port | status | lb_weight | role | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
---------+------------------------+------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
 0 | pgpool-poc01.novalocal | 5432 | up | 0.500000 | primary | 3 | false | 0 | | | 2020-02-06 10:16:31
 1 | pgpool-poc02.novalocal | 5432 | up | 0.500000 | standby | 0 | true | 0 | streaming | async | 2020-02-06 10:16:31
(2 rows)

raj.pandey1982@gmail.com

2020-02-06 16:35

reporter   ~0003148

After replication_state | replication_sync_state started displaying, I did the DB failover again but got the same error again :-( ... Please suggest:-

2020-02-06 10:16:09: pid 18247:LOG: successfully acquired the delegate IP:"10.70.184.29"
2020-02-06 10:16:09: pid 18247:DETAIL: 'if_up_cmd' returned with success
2020-02-06 10:16:09: pid 18243:LOG: watchdog escalation process with pid: 18247 exit with SUCCESS.
2020-02-06 10:16:17: pid 18243:LOG: new watchdog node connection is received from "10.70.184.28:21705"
2020-02-06 10:16:17: pid 18243:LOG: new node joined the cluster hostname:"pgpool-poc02.novalocal" port:9000 pgpool_port:5433
2020-02-06 10:16:17: pid 18243:DETAIL: Pgpool-II version:"4.1.0" watchdog messaging version: 1.1
2020-02-06 10:16:17: pid 18243:LOG: new outbound connection to pgpool-poc02.novalocal:9000
2020-02-06 10:16:23: pid 18243:LOG: adding watchdog node "pgpool-poc02.novalocal:5433 Linux pgpool-poc02.novalocal" to the standby list
2020-02-06 10:16:23: pid 18241:LOG: Pgpool-II parent process received watchdog quorum change signal from watchdog
2020-02-06 10:16:23: pid 18243:LOG: new IPC connection received
2020-02-06 10:16:23: pid 18241:LOG: watchdog cluster now holds the quorum
2020-02-06 10:16:23: pid 18241:DETAIL: updating the state of quarantine backend nodes
2020-02-06 10:16:23: pid 18243:LOG: new IPC connection received
2020-02-06 10:16:31: pid 19253:LOG: pool_reuse_block: blockid: 0
2020-02-06 10:16:31: pid 19253:CONTEXT: while searching system catalog, When relcache is missed
2020-02-06 10:16:32: pid 19255:LOG: forked new pcp worker, pid=19282 socket=7
2020-02-06 10:16:32: pid 19255:LOG: PCP process with pid: 19282 exit with SUCCESS.
2020-02-06 10:16:32: pid 19255:LOG: PCP process with pid: 19282 exits with status 0
2020-02-06 10:16:32: pid 19255:LOG: forked new pcp worker, pid=19285 socket=7
2020-02-06 10:16:32: pid 19255:LOG: PCP process with pid: 19285 exit with SUCCESS.
2020-02-06 10:16:32: pid 19255:LOG: PCP process with pid: 19285 exits with status 0
2020-02-06 10:16:32: pid 19255:LOG: forked new pcp worker, pid=19288 socket=7
2020-02-06 10:16:32: pid 19255:LOG: PCP process with pid: 19288 exit with SUCCESS.
2020-02-06 10:16:32: pid 19255:LOG: PCP process with pid: 19288 exits with status 0
2020-02-06 10:20:05: pid 19255:LOG: forked new pcp worker, pid=19545 socket=7
2020-02-06 10:20:05: pid 19255:LOG: PCP process with pid: 19545 exit with SUCCESS.
2020-02-06 10:20:05: pid 19255:LOG: PCP process with pid: 19545 exits with status 0
2020-02-06 10:20:05: pid 19255:LOG: forked new pcp worker, pid=19548 socket=7
2020-02-06 10:20:05: pid 19255:LOG: PCP process with pid: 19548 exit with SUCCESS.
2020-02-06 10:20:05: pid 19255:LOG: PCP process with pid: 19548 exits with status 0
2020-02-06 10:20:05: pid 19255:LOG: forked new pcp worker, pid=19551 socket=7
2020-02-06 10:20:05: pid 19255:LOG: PCP process with pid: 19551 exit with SUCCESS.
2020-02-06 10:20:05: pid 19255:LOG: PCP process with pid: 19551 exits with status 0
2020-02-06 10:21:31: pid 18241:LOG: child process with pid: 19253 exits with status 256
2020-02-06 10:21:31: pid 18241:LOG: fork a new child process with pid: 19646
2020-02-06 10:21:32: pid 18241:LOG: child process with pid: 19232 exits with status 256
2020-02-06 10:21:32: pid 18241:LOG: fork a new child process with pid: 19647
2020-02-06 10:21:58: pid 18241:LOG: child process with pid: 19235 exits with status 256
2020-02-06 10:21:58: pid 18241:LOG: fork a new child process with pid: 19679
2020-02-06 10:23:37: pid 19256:LOG: received degenerate backend request for node_id: 0 from pid [19256]
2020-02-06 10:23:37: pid 18243:LOG: new IPC connection received
2020-02-06 10:23:37: pid 18243:LOG: watchdog received the failover command from local pgpool-II on IPC interface
2020-02-06 10:23:37: pid 18243:LOG: watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from local pgpool-II on IPC interface
2020-02-06 10:23:37: pid 18243:LOG: we have got the consensus to perform the failover
2020-02-06 10:23:37: pid 18243:DETAIL: 1 node(s) voted in the favor
2020-02-06 10:23:37: pid 19256:ERROR: unable to read data from DB node 0
2020-02-06 10:23:37: pid 19256:DETAIL: socket read failed with an error "Success"
2020-02-06 10:23:37: pid 18241:LOG: Pgpool-II parent process has received failover request
2020-02-06 10:23:37: pid 18243:LOG: new IPC connection received
2020-02-06 10:23:37: pid 18243:LOG: received the failover indication from Pgpool-II on IPC interface
2020-02-06 10:23:37: pid 18243:LOG: watchdog is informed of failover start by the main process
2020-02-06 10:23:37: pid 18241:LOG: starting degeneration. shutdown host pgpool-poc01.novalocal(5432)
2020-02-06 10:23:37: pid 18241:LOG: Restart all children
2020-02-06 10:23:37: pid 18241:LOG: execute command: /usr/share/pgpool/4.1.0/etc/failover.sh 0 0 pgpool-poc02.novalocal postgrestg /installer/postgresql-11.5/data/im_the_master
Authorized Uses Only.All activity may be Monitored and Reported
promote - Start
DEBUG: The script will be executed with the following arguments:
DEBUG: --trigger-file=/installer/postgresql-11.5/data/im_the_master
DEBUG: --standby_file=/installer/postgresql-11.5/data/im_slave
DEBUG: --demote-host=
DEBUG: --user=postgres
DEBUG: --password=postgrestg
DEBUG: --force
INFO: Checking if standby file exists...
INFO: Checking if trigger file exists...
INFO: Deleting recovery.conf file...
INFO: Checking if postgresql.conf file exists...
INFO: postgresql.conf file found. Checking if it is for primary server...
INFO: postgresql.conf file corresponds to primary server file. Nothing to do.
pg_ctl: server is running (PID: 28770)
/usr/local/pgsql11.5/bin/postgres "-D" "/installer/postgresql-11.5/data"
INFO: Restarting postgresql service...
waiting for server to shut down.... done
server stopped
waiting for server to start....2020-02-06 10:23:37 +03 LOG: listening on IPv4 address "0.0.0.0", port 5432
2020-02-06 10:23:37 +03 LOG: listening on IPv6 address "::", port 5432
2020-02-06 10:23:37 +03 LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
2020-02-06 10:23:37 +03 LOG: redirecting log output to logging collector process
2020-02-06 10:23:37 +03 HINT: Future log output will appear in directory "/dblogs/logs".
 done
server started
pg_ctl: server is running (PID: 30372)
/usr/local/pgsql11.5/bin/postgres "-D" "/installer/postgresql-11.5/data"
INFO: postgresql already running.
INFO: Ensuring replication role and password...
INFO: Replication role found. Ensuring password...
ALTER ROLE
INFO: Creating primary info file...
promote - Done!
2020-02-06 10:23:42: pid 19256:ERROR: Failed to check replication time lag
2020-02-06 10:23:42: pid 19256:DETAIL: No persistent db connection for the node 0
2020-02-06 10:23:42: pid 19256:HINT: check sr_check_user and sr_check_password
2020-02-06 10:23:42: pid 19256:CONTEXT: while checking replication time lag
2020-02-06 10:23:42: pid 19256:LOG: failed to connect to PostgreSQL server on "pgpool-poc01.novalocal:5432", getsockopt() detected error "Connection refused"
2020-02-06 10:23:42: pid 19256:ERROR: failed to make persistent db connection
2020-02-06 10:23:42: pid 19256:DETAIL: connection to host:"pgpool-poc01.novalocal:5432" failed
2020-02-06 10:23:42: pid 18243:LOG: watchdog received the failover command from remote pgpool-II node "pgpool-poc02.novalocal:5433 Linux pgpool-poc02.novalocal"
2020-02-06 10:23:42: pid 18243:LOG: watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from pgpool-poc02.novalocal:5433 Linux pgpool-poc02.novalocal
2020-02-06 10:23:42: pid 18243:LOG: we have got the consensus to perform the failover
2020-02-06 10:23:42: pid 18243:DETAIL: 1 node(s) voted in the favor
2020-02-06 10:23:42: pid 18243:LOG: invalid degenerate backend request, node id : 0 status: [3] is not valid for failover
2020-02-06 10:23:47: pid 19256:ERROR: Failed to check replication time lag
2020-02-06 10:23:47: pid 19256:DETAIL: No persistent db connection for the node 0
2020-02-06 10:23:47: pid 19256:HINT: check sr_check_user and sr_check_password
2020-02-06 10:23:47: pid 19256:CONTEXT: while checking replication time lag
2020-02-06 10:23:47: pid 19256:LOG: failed to connect to PostgreSQL server on "pgpool-poc01.novalocal:5432", getsockopt() detected error "Connection refused"
2020-02-06 10:23:47: pid 19256:ERROR: failed to make persistent db connection
2020-02-06 10:23:47: pid 19256:DETAIL: connection to host:"pgpool-poc01.novalocal:5432" failed
2020-02-06 10:23:50: pid 18243:LOG: watchdog received the failover command from remote pgpool-II node "pgpool-poc02.novalocal:5433 Linux pgpool-poc02.novalocal"
2020-02-06 10:23:50: pid 18243:LOG: watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from pgpool-poc02.novalocal:5433 Linux pgpool-poc02.novalocal
2020-02-06 10:23:50: pid 18243:LOG: we have got the consensus to perform the failover
2020-02-06 10:23:50: pid 18243:DETAIL: 1 node(s) voted in the favor
2020-02-06 10:23:50: pid 18243:LOG: invalid degenerate backend request, node id : 0 status: [3] is not valid for failover
2020-02-06 10:23:52: pid 19256:ERROR: Failed to check replication time lag
2020-02-06 10:23:52: pid 19256:DETAIL: No persistent db connection for the node 0
2020-02-06 10:23:52: pid 19256:HINT: check sr_check_user and sr_check_password
2020-02-06 10:23:52: pid 19256:CONTEXT: while checking replication time lag
2020-02-06 10:23:52: pid 19256:LOG: failed to connect to PostgreSQL server on "pgpool-poc01.novalocal:5432", getsockopt() detected error "Connection refused"
2020-02-06 10:23:52: pid 19256:ERROR: failed to make persistent db connection
2020-02-06 10:23:52: pid 19256:DETAIL: connection to host:"pgpool-poc01.novalocal:5432" failed
2020-02-06 10:23:57: pid 19256:ERROR: Failed to check replication time lag
2020-02-06 10:23:57: pid 19256:DETAIL: No persistent db connection for the node 0
2020-02-06 10:23:57: pid 19256:HINT: check sr_check_user and sr_check_password
2020-02-06 10:23:57: pid 19256:CONTEXT: while checking replication time lag
2020-02-06 10:23:57: pid 19256:LOG: failed to connect to PostgreSQL server on "pgpool-poc01.novalocal:5432", getsockopt() detected error "Connection refused"
2020-02-06 10:23:57: pid 19256:ERROR: failed to make persistent db connection
2020-02-06 10:23:57: pid 19256:DETAIL: connection to host:"pgpool-poc01.novalocal:5432" failed
2020-02-06 10:23:58: pid 18243:LOG: watchdog received the failover command from remote pgpool-II node "pgpool-poc02.novalocal:5433 Linux pgpool-poc02.novalocal"
2020-02-06 10:23:58: pid 18243:LOG: watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from pgpool-poc02.novalocal:5433 Linux pgpool-poc02.novalocal
2020-02-06 10:23:58: pid 18243:LOG: we have got the consensus to perform the failover
2020-02-06 10:23:58: pid 18243:DETAIL: 1 node(s) voted in the favor
2020-02-06 10:23:58: pid 18243:LOG: invalid degenerate backend request, node id : 0 status: [3] is not valid for failover
2020-02-06 10:24:02: pid 19256:ERROR: Failed to check replication time lag
2020-02-06 10:24:02: pid 19256:DETAIL: No persistent db connection for the node 0
2020-02-06 10:24:02: pid 19256:HINT: check sr_check_user and sr_check_password
2020-02-06 10:24:02: pid 19256:CONTEXT: while checking replication time lag
2020-02-06 10:24:02: pid 19256:LOG: failed to connect to PostgreSQL server on "pgpool-poc01.novalocal:5432", getsockopt() detected error "Connection refused"
2020-02-06 10:24:02: pid 19256:ERROR: failed to make persistent db connection
2020-02-06 10:24:02: pid 19256:DETAIL: connection to host:"pgpool-poc01.novalocal:5432" failed
2020-02-06 10:24:06: pid 18243:LOG: watchdog received the failover command from remote pgpool-II node "pgpool-poc02.novalocal:5433 Linux pgpool-poc02.novalocal"
2020-02-06 10:24:06: pid 18243:LOG: watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from pgpool-poc02.novalocal:5433 Linux pgpool-poc02.novalocal
2020-02-06 10:24:06: pid 18243:LOG: we have got the consensus to perform the failover
2020-02-06 10:24:06: pid 18243:DETAIL: 1 node(s) voted in the favor
2020-02-06 10:24:06: pid 18243:LOG: invalid degenerate backend request, node id : 0 status: [3] is not valid for failover
2020-02-06 10:24:07: pid 19256:ERROR: Failed to check replication time lag
2020-02-06 10:24:07: pid 19256:DETAIL: No persistent db connection for the node 0

raj.pandey1982@gmail.com

2020-02-06 17:20

reporter   ~0003149

If, after the above issue, I restart the primary and standby pgpool services, everything goes well again (remote connections start working) with the log messages below. But I expect this to happen the first time, right when I bring the master DB down on node 1:-

2020-02-06 11:11:33: pid 20295:LOG: watchdog node state changed from [INITIALIZING] to [MASTER]
2020-02-06 11:11:33: pid 20295:LOG: I am announcing my self as master/coordinator watchdog node
2020-02-06 11:11:37: pid 20295:LOG: I am the cluster leader node
2020-02-06 11:11:37: pid 20295:DETAIL: our declare coordinator message is accepted by all nodes
2020-02-06 11:11:37: pid 20295:LOG: setting the local node "pgpool-poc01.novalocal:5433 Linux pgpool-poc01.novalocal" as watchdog cluster master
2020-02-06 11:11:37: pid 20295:LOG: I am the cluster leader node. Starting escalation process
2020-02-06 11:11:37: pid 20293:LOG: watchdog process is initialized
2020-02-06 11:11:37: pid 20293:DETAIL: watchdog messaging data version: 1.1
2020-02-06 11:11:37: pid 20295:LOG: escalation process started with PID:20296
2020-02-06 11:11:37: pid 20296:LOG: watchdog: escalation started
2020-02-06 11:11:37: pid 20295:LOG: new IPC connection received
2020-02-06 11:11:37: pid 20295:LOG: new IPC connection received
2020-02-06 11:11:37: pid 20297:LOG: 2 watchdog nodes are configured for lifecheck
2020-02-06 11:11:37: pid 20297:LOG: watchdog nodes ID:0 Name:"pgpool-poc01.novalocal:5433 Linux pgpool-poc01.novalocal"
2020-02-06 11:11:37: pid 20297:DETAIL: Host:"pgpool-poc01.novalocal" WD Port:9000 pgpool-II port:5433
2020-02-06 11:11:37: pid 20297:LOG: watchdog nodes ID:1 Name:"Not_Set"
2020-02-06 11:11:37: pid 20297:DETAIL: Host:"pgpool-poc02.novalocal" WD Port:9000 pgpool-II port:5433
2020-02-06 11:11:37: pid 20293:LOG: Setting up socket for 0.0.0.0:5433
2020-02-06 11:11:37: pid 20293:LOG: Setting up socket for :::5433
2020-02-06 11:11:37: pid 20297:LOG: watchdog lifecheck trusted server "mohvcasdb01.novalocal" added for the availability check
2020-02-06 11:11:37: pid 20297:LOG: watchdog lifecheck trusted server "mohcasdevdb.novalocal" added for the availability check
RTNETLINK answers: File exists
2020-02-06 11:11:37: pid 20296:LOG: failed to acquire the delegate IP address
2020-02-06 11:11:37: pid 20296:DETAIL: 'if_up_cmd' failed
2020-02-06 11:11:37: pid 20296:WARNING: watchdog escalation failed to acquire delegate IP
2020-02-06 11:11:37: pid 20295:LOG: watchdog escalation process with pid: 20296 exit with SUCCESS.
2020-02-06 11:11:38: pid 20293:LOG: find_primary_node_repeatedly: waiting for finding a primary node
2020-02-06 11:11:38: pid 20293:LOG: failed to connect to PostgreSQL server on "pgpool-poc01.novalocal:5432", getsockopt() detected error "Connection refused"
2020-02-06 11:11:38: pid 20293:ERROR: failed to make persistent db connection
2020-02-06 11:11:38: pid 20293:DETAIL: connection to host:"pgpool-poc01.novalocal:5432" failed
2020-02-06 11:11:38: pid 20293:LOG: find_primary_node: make_persistent_db_connection_noerror failed on node 0
2020-02-06 11:11:38: pid 20293:LOG: find_primary_node: primary node is 1
2020-02-06 11:11:38: pid 21305:LOG: PCP process: 21305 started
2020-02-06 11:11:38: pid 21306:LOG: failed to connect to PostgreSQL server on "pgpool-poc01.novalocal:5432", getsockopt() detected error "Connection refused"
2020-02-06 11:11:38: pid 21306:ERROR: failed to make persistent db connection
2020-02-06 11:11:38: pid 21306:DETAIL: connection to host:"pgpool-poc01.novalocal:5432" failed
2020-02-06 11:11:38: pid 21307:LOG: failed to connect to PostgreSQL server on "pgpool-poc01.novalocal:5432", getsockopt() detected error "Connection refused"
2020-02-06 11:11:38: pid 21307:ERROR: failed to make persistent db connection
2020-02-06 11:11:38: pid 21307:DETAIL: connection to host:"pgpool-poc01.novalocal:5432" failed
2020-02-06 11:11:38: pid 21307:LOG: health check retrying on DB node: 0 (round:1)
2020-02-06 11:11:38: pid 20293:LOG: pgpool-II successfully started. version 4.1.0 (karasukiboshi)
2020-02-06 11:11:38: pid 20293:LOG: node status[0]: 0
2020-02-06 11:11:38: pid 20293:LOG: node status[1]: 1
2020-02-06 11:11:38: pid 20299:LOG: createing watchdog heartbeat receive socket.
2020-02-06 11:11:38: pid 20299:DETAIL: bind receive socket to device: "eth0"
2020-02-06 11:11:38: pid 20299:LOG: set SO_REUSEPORT option to the socket
2020-02-06 11:11:38: pid 20299:LOG: creating watchdog heartbeat receive socket.
2020-02-06 11:11:38: pid 20299:DETAIL: set SO_REUSEPORT
2020-02-06 11:11:38: pid 20302:LOG: creating socket for sending heartbeat
2020-02-06 11:11:38: pid 20302:DETAIL: bind send socket to device: eth0
2020-02-06 11:11:38: pid 20302:LOG: set SO_REUSEPORT option to the socket
2020-02-06 11:11:38: pid 20302:LOG: creating socket for sending heartbeat
2020-02-06 11:11:38: pid 20302:DETAIL: set SO_REUSEPORT
2020-02-06 11:11:39: pid 21307:LOG: failed to connect to PostgreSQL server on "pgpool-poc01.novalocal:5432", getsockopt() detected error "Connection refused"
2020-02-06 11:11:39: pid 21307:ERROR: failed to make persistent db connection
2020-02-06 11:11:39: pid 21307:DETAIL: connection to host:"pgpool-poc01.novalocal:5432" failed
2020-02-06 11:11:39: pid 21307:LOG: health check retrying on DB node: 0 (round:2)
2020-02-06 11:11:40: pid 21307:LOG: failed to connect to PostgreSQL server on "pgpool-poc01.novalocal:5432", getsockopt() detected error "Connection refused"
2020-02-06 11:11:40: pid 21307:ERROR: failed to make persistent db connection
2020-02-06 11:11:40: pid 21307:DETAIL: connection to host:"pgpool-poc01.novalocal:5432" failed
2020-02-06 11:11:40: pid 21307:LOG: health check retrying on DB node: 0 (round:3)
2020-02-06 11:11:41: pid 21307:LOG: failed to connect to PostgreSQL server on "pgpool-poc01.novalocal:5432", getsockopt() detected error "Connection refused"
2020-02-06 11:11:41: pid 21307:ERROR: failed to make persistent db connection
2020-02-06 11:11:41: pid 21307:DETAIL: connection to host:"pgpool-poc01.novalocal:5432" failed
2020-02-06 11:11:41: pid 21307:LOG: health check failed on node 0 (timeout:0)
2020-02-06 11:11:41: pid 21307:LOG: received degenerate backend request for node_id: 0 from pid [21307]
2020-02-06 11:11:41: pid 20295:LOG: new IPC connection received
2020-02-06 11:11:41: pid 20295:LOG: watchdog received the failover command from local pgpool-II on IPC interface
2020-02-06 11:11:41: pid 20295:LOG: watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from local pgpool-II on IPC interface
2020-02-06 11:11:41: pid 20295:LOG: we have got the consensus to perform the failover
2020-02-06 11:11:41: pid 20295:DETAIL: 1 node(s) voted in the favor
2020-02-06 11:11:41: pid 20293:LOG: Pgpool-II parent process has received failover request
2020-02-06 11:11:41: pid 20295:LOG: new IPC connection received
2020-02-06 11:11:41: pid 20295:LOG: received the failover indication from Pgpool-II on IPC interface
2020-02-06 11:11:41: pid 20295:LOG: watchdog is informed of failover start by the main process
2020-02-06 11:11:41: pid 20293:LOG: starting degeneration. shutdown host pgpool-poc01.novalocal(5432)
2020-02-06 11:11:41: pid 20293:LOG: Do not restart children because we are switching over node id 0 host: pgpool-poc01.novalocal port: 5432 and we are in streaming replication mode
2020-02-06 11:11:41: pid 20293:LOG: execute command: /usr/share/pgpool/4.1.0/etc/failover.sh 0 1 pgpool-poc02.novalocal postgrestg /installer/postgresql-11.5/data/im_the_master
2020-02-06 11:11:41: pid 20293:LOG: failover: set new primary node: 1
2020-02-06 11:11:41: pid 20293:LOG: failover: set new master node: 1
2020-02-06 11:11:41: pid 21306:ERROR: Failed to check replication time lag
2020-02-06 11:11:41: pid 21306:DETAIL: No persistent db connection for the node 0
2020-02-06 11:11:41: pid 21306:HINT: check sr_check_user and sr_check_password
2020-02-06 11:11:41: pid 21306:CONTEXT: while checking replication time lag
2020-02-06 11:11:41: pid 21306:LOG: worker process received restart request
2020-02-06 11:11:41: pid 20295:LOG: new IPC connection received
2020-02-06 11:11:41: pid 20295:LOG: received the failover indication from Pgpool-II on IPC interface
2020-02-06 11:11:41: pid 20295:LOG: watchdog is informed of failover end by the main process
failover done. shutdown host pgpool-poc01.novalocal(5432)2020-02-06 11:11:41: pid 20293:LOG: failover done. shutdown host pgpool-poc01.novalocal(5432)
2020-02-06 11:11:42: pid 21305:LOG: restart request received in pcp child process
2020-02-06 11:11:42: pid 20293:LOG: PCP child 21305 exits with status 0 in failover()
2020-02-06 11:11:42: pid 20293:LOG: fork a new PCP child pid 21313 in failover()
2020-02-06 11:11:42: pid 20293:LOG: worker child process with pid: 21306 exits with status 256
2020-02-06 11:11:42: pid 21313:LOG: PCP process: 21313 started
2020-02-06 11:11:42: pid 20293:LOG: fork a new worker child process with pid: 21314
2020-02-06 11:11:42: pid 21314:LOG: get_query_result: no rows returned
2020-02-06 11:11:42: pid 21314:DETAIL: node id (1)
2020-02-06 11:11:42: pid 21314:CONTEXT: while checking replication time lag
2020-02-06 11:11:42: pid 21314:CONTEXT: while checking replication time lag
2020-02-06 11:11:42: pid 21314:LOG: get_query_result falied: status: -1
2020-02-06 11:11:42: pid 21314:CONTEXT: while checking replication time lag
2020-02-06 11:11:47: pid 21314:LOG: get_query_result: no rows returned
2020-02-06 11:11:47: pid 21314:DETAIL: node id (1)
2020-02-06 11:11:47: pid 21314:CONTEXT: while checking replication time lag
2020-02-06 11:11:47: pid 21314:LOG: get_query_result falied: status: -1
2020-02-06 11:11:47: pid 21314:CONTEXT: while checking replication time lag
2020-02-06 11:11:52: pid 21314:LOG: get_query_result: no rows returned
2020-02-06 11:11:52: pid 21314:DETAIL: node id (1)
2020-02-06 11:11:52: pid 21314:CONTEXT: while checking replication time lag
2020-02-06 11:11:52: pid 21314:LOG: get_query_result falied: status: -1
2020-02-06 11:11:52: pid 21314:CONTEXT: while checking replication time lag
2020-02-06 11:11:57: pid 21314:LOG: get_query_result: no rows returned
2020-02-06 11:11:57: pid 21314:DETAIL: node id (1)
2020-02-06 11:11:57: pid 21314:CONTEXT: while checking replication time lag
2020-02-06 11:11:57: pid 21314:LOG: get_query_result falied: status: -1
2020-02-06 11:11:57: pid 21314:CONTEXT: while checking replication time lag
2020-02-06 11:12:02: pid 21314:LOG: get_query_result: no rows returned
2020-02-06 11:12:02: pid 21314:DETAIL: node id (1)
2020-02-06 11:12:02: pid 21314:CONTEXT: while checking replication time lag
2020-02-06 11:12:02: pid 21314:LOG: get_query_result falied: status: -1
2020-02-06 11:12:02: pid 21314:CONTEXT: while checking replication time lag
2020-02-06 11:12:07: pid 21314:LOG: get_query_result: no rows returned
2020-02-06 11:12:07: pid 21314:DETAIL: node id (1)
2020-02-06 11:12:07: pid 21314:CONTEXT: while checking replication time lag
2020-02-06 11:12:07: pid 21314:LOG: get_query_result falied: status: -1
2020-02-06 11:12:07: pid 21314:CONTEXT: while checking replication time lag
2020-02-06 11:12:12: pid 21314:LOG: get_query_result: no rows returned
2020-02-06 11:12:12: pid 21314:DETAIL: node id (1)
2020-02-06 11:12:12: pid 21314:CONTEXT: while checking replication time lag
2020-02-06 11:12:12: pid 21314:LOG: get_query_result falied: status: -1
2020-02-06 11:12:12: pid 21314:CONTEXT: while checking replication time lag
2020-02-06 11:12:17: pid 21314:LOG: get_query_result: no rows returned
2020-02-06 11:12:17: pid 21314:DETAIL: node id (1)
2020-02-06 11:12:17: pid 21314:CONTEXT: while checking replication time lag
2020-02-06 11:12:17: pid 21314:LOG: get_query_result falied: status: -1
2020-02-06 11:12:17: pid 21314:CONTEXT: while checking replication time lag
2020-02-06 11:12:22: pid 21314:LOG: get_query_result: no rows returned
2020-02-06 11:12:22: pid 21314:DETAIL: node id (1)
2020-02-06 11:12:22: pid 21314:CONTEXT: while checking replication time lag

raj.pandey1982@gmail.com

2020-02-06 17:35

reporter   ~0003150

Sadly, after the promotion pgpool should no longer worry about node 1 and should work fine with node 2, but that is not what is happening here.

raj.pandey1982@gmail.com

2020-02-06 17:56

reporter   ~0003151

See the difference in the log messages that follow "1 node(s) voted in the favor": once after the DB failover alone, and once after the DB failover followed by a pgpool service restart.
============================================================================================================================
Log after DB failover:-
2020-02-06 10:23:50: pid 18243:DETAIL: 1 node(s) voted in the favor
2020-02-06 10:23:50: pid 18243:LOG: invalid degenerate backend request, node id : 0 status: [3] is not valid for failover
2020-02-06 10:23:47: pid 19256:ERROR: Failed to check replication time lag

Log after restarting services after Db failover:-

2020-02-06 11:11:41: pid 20295:DETAIL: 1 node(s) voted in the favor
2020-02-06 11:11:41: pid 20293:LOG: Pgpool-II parent process has received failover request
2020-02-06 11:11:41: pid 20295:LOG: new IPC connection received
2020-02-06 11:11:41: pid 20295:LOG: received the failover indication from Pgpool-II on IPC interface
2020-02-06 11:11:41: pid 20295:LOG: watchdog is informed of failover start by the main process
2020-02-06 11:11:41: pid 20293:LOG: starting degeneration. shutdown host pgpool-poc01.novalocal(5432)
2020-02-06 11:11:41: pid 20293:LOG: Do not restart children because we are switching over node id 0 host: pgpool-poc01.novalocal port: 5432 and we are in streaming replication mode


So this is the message where the difference starts: "invalid degenerate backend request, node id : 0 status: [3] is not valid for failover".

Something is not being picked up correctly right after the DB failover, but it is picked up fine after the pgpool services are restarted following the failover.

t-ishii

2020-02-07 07:50

developer   ~0003152

> After replication_state | replication_sync_state started displaying i again did the DB failover but got same error again :-(...Please suggest :-

What exactly did you do? Did you shut down pgpool-poc02.novalocal entirely, or just shut down the postmaster on pgpool-poc02.novalocal?

raj.pandey1982@gmail.com

2020-02-07 18:00

reporter   ~0003153

Master DB and primary PGPOOL are configured on node 1 (pgpool-poc01.novalocal).
Slave DB and standby PGPOOL are configured on node 2 (pgpool-poc02.novalocal).
1st I added recovery.conf with application_name=pgpool-poc02.novalocal.
2nd I stopped the standby PGPOOL.
3rd I stopped the master PGPOOL.
4th I stopped the slave DB and started it again.
5th I started PGPOOL on the primary and the standby.
6th I checked that replication_state | replication_sync_state were showing through the 'show pool_nodes' command.
7th I did a failover by shutting down the master database on node 1.
8th Then I checked the PGPOOL log on both master and standby.
9th Both logs showed the slave DB on node 2 promoted to master, and then the same old errors as below started, and remote connections were not possible:-

2020-02-06 10:23:42: pid 19256:ERROR: Failed to check replication time lag
2020-02-06 10:23:42: pid 19256:DETAIL: No persistent db connection for the node 0
2020-02-06 10:23:42: pid 19256:HINT: check sr_check_user and sr_check_password
2020-02-06 10:23:42: pid 19256:CONTEXT: while checking replication time lag
2020-02-06 10:23:42: pid 19256:LOG: failed to connect to PostgreSQL server on "pgpool-poc01.novalocal:5432", getsockopt() detected error "Connection refused"
2020-02-06 10:23:42: pid 19256:ERROR: failed to make persistent db connection
2020-02-06 10:23:42: pid 19256:DETAIL: connection to host:"pgpool-poc01.novalocal:5432" failed
2020-02-06 10:23:42: pid 18243:LOG: watchdog received the failover command from remote pgpool-II node "pgpool-poc02.novalocal:5433 Linux pgpool-poc02.novalocal"
2020-02-06 10:23:42: pid 18243:LOG: watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from pgpool-poc02.novalocal:5433 Linux pgpool-poc02.novalocal
2020-02-06 10:23:42: pid 18243:LOG: we have got the consensus to perform the failover
2020-02-06 10:23:42: pid 18243:DETAIL: 1 node(s) voted in the favor
2020-02-06 10:23:42: pid 18243:LOG: invalid degenerate backend request, node id : 0 status: [3] is not valid for failover
2020-02-06 10:23:47: pid 19256:ERROR: Failed to check replication time lag
2020-02-06 10:23:47: pid 19256:DETAIL: No persistent db connection for the node 0
2020-02-06 10:23:47: pid 19256:HINT: check sr_check_user and sr_check_password
2020-02-06 10:23:47: pid 19256:CONTEXT: while checking replication time lag
2020-02-06 10:23:47: pid 19256:LOG: failed to connect to PostgreSQL server on "pgpool-poc01.novalocal:5432", getsockopt() detected error "Connection refused"
2020-02-06 10:23:47: pid 19256:ERROR: failed to make persistent db connection
2020-02-06 10:23:47: pid 19256:DETAIL: connection to host:"pgpool-poc01.novalocal:5432" failed

10th I stopped the pgpool services on the standby.
11th I stopped the pgpool services on the master.
12th I started the pgpool services on the master.
13th I started the pgpool services on the slave.
14th Now remote connections (checked with PostgreSQL Admin) are working, with the log messages below:-

2020-02-06 11:11:52: pid 21314:CONTEXT: while checking replication time lag
2020-02-06 11:11:52: pid 21314:LOG: get_query_result falied: status: -1
2020-02-06 11:11:52: pid 21314:CONTEXT: while checking replication time lag
2020-02-06 11:11:57: pid 21314:LOG: get_query_result: no rows returned
2020-02-06 11:11:57: pid 21314:DETAIL: node id (1)
2020-02-06 11:11:57: pid 21314:CONTEXT: while checking replication time lag
2020-02-06 11:11:57: pid 21314:LOG: get_query_result falied: status: -1
2020-02-06 11:11:57: pid 21314:CONTEXT: while checking replication time lag
2020-02-06 11:12:02: pid 21314:LOG: get_query_result: no rows returned
2020-02-06 11:12:02: pid 21314:DETAIL: node id (1)
2020-02-06 11:12:02: pid 21314:CONTEXT: while checking replication time lag
2020-02-06 11:12:02: pid 21314:LOG: get_query_result falied: status: -1
2020-02-06 11:12:02: pid 21314:CONTEXT: while checking replication time lag
2020-02-06 11:12:07: pid 21314:LOG: get_query_result: no rows returned
2020-02-06 11:12:07: pid 21314:DETAIL: node id (1)

Now please suggest what else I am missing here. Time is running out for me: I will have to give up the PGPOOL implementation if the issue is not resolved within a week. After that week I have to complete UAT with pgpool, and only then can we go live if it is successful.

t-ishii

2020-02-08 19:30

developer   ~0003154

You seem to be doing too much at one time. I suggest you test the system step by step. If you find a problem, you'd better not proceed to the next step until you have fixed it.

> Master DB and Primary PGPOOL is configured on node1 (pgpool-poc01novalocal)
> Slave DB and Standby PGOOL is configured on node 2 (pgpool-poc02novalocal).

Probably the first test is stopping the postmaster on node 2. Pgpool-II should trigger a failover, and the "show pool_nodes" command should show that node 2 is down.
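
Roughly, that test would look like this (a sketch reusing the pg_ctl/psql paths and the delegate IP 10.70.184.29 that appear elsewhere in this thread):

# on node 2, stop the standby postmaster
/usr/local/pgsql11.5/bin/pg_ctl -D /installer/postgresql-11.5/data stop
# then connect through pgpool (delegate IP, port 5433) and check the node status
/usr/local/pgsql11.5/bin/psql -h 10.70.184.29 -p 5433 -U postgres -d mawidstg01 -c "show pool_nodes;"
# node 1 (pgpool-poc02.novalocal) should now be reported with status "down"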

raj.pandey1982@gmail.com

2020-02-08 23:58

reporter   ~0003155

Sorry for going too fast with multiple steps.
Now I did as you asked, and the results came out as you expected, as shown below:-

>Probably the first test is stopping postmaster on node 2. Pgpool-II should trigger failover and "show pool_nodes" command should show that node 2 is down.

Before Postgres DB shutdown at node 2:-

[postgres@pgpool-poc01 logs]$ /usr/local/pgsql11.5/bin/psql -h 10.70.184.29 -p 5433 -U postgres -d mawidstg01
psql (11.5)
Type "help" for help.

mawidstg01=# show pool_nodes;
 node_id | hostname               | port | status | lb_weight | role    | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
---------+------------------------+------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
 0       | pgpool-poc01.novalocal | 5432 | up     | 0.500000  | primary | 1913       | true              | 0                 |                   |                        | 2020-02-08 17:38:56
 1       | pgpool-poc02.novalocal | 5432 | up     | 0.500000  | standby | 0          | false             | 0                 | streaming         | async                  | 2020-02-08 17:38:56
(2 rows)


[postgres@pgpool-poc02 logs]$ /usr/local/pgsql11.5/bin/psql -h 10.70.184.29 -p 5433 -U postgres -d mawidstg01
psql (11.5)
Type "help" for help.

mawidstg01=# show pool_nodes;
 node_id | hostname               | port | status | lb_weight | role    | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
---------+------------------------+------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
 0       | pgpool-poc01.novalocal | 5432 | up     | 0.500000  | primary | 0          | true              | 0                 |                   |                        | 2020-02-08 17:47:13
 1       | pgpool-poc02.novalocal | 5432 | up     | 0.500000  | standby | 0          | false             | 0                 | streaming         | async                  | 2020-02-08 17:47:13
(2 rows)



After Postgres DB shutdown at node 2:-

[postgres@pgpool-poc02 logs]$ /usr/local/pgsql11.5/bin/pg_ctl -D /installer/postgresql-11.5/data stop
waiting for server to shut down.... done
server stopped

[postgres@pgpool-poc02 logs]$ /usr/local/pgsql11.5/bin/psql -h 10.70.184.29 -p 5433 -U postgres -d mawidstg01
psql (11.5)
Type "help" for help.

mawidstg01=# show pool_nodes;
 node_id | hostname               | port | status | lb_weight | role    | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
---------+------------------------+------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
 0       | pgpool-poc01.novalocal | 5432 | up     | 0.500000  | primary | 0          | true              | 0                 |                   |                        | 2020-02-08 17:48:56
 1       | pgpool-poc02.novalocal | 5432 | down   | 0.500000  | standby | 0          | false             | 0                 |                   |                        | 2020-02-08 17:48:52
(2 rows)



mawidstg01=# show pool_nodes;

 node_id | hostname               | port | status | lb_weight | role    | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
---------+------------------------+------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
 0       | pgpool-poc01.novalocal | 5432 | up     | 0.500000  | primary | 2553       | true              | 0                 |                   |                        | 2020-02-08 17:38:56
 1       | pgpool-poc02.novalocal | 5432 | down   | 0.500000  | standby | 0          | false             | 0                 |                   |                        | 2020-02-08 17:48:52
(2 rows)



PGPOOL log during the above activity:-

2020-02-08 17:48:52: pid 15644:LOG: reading and processing packets
2020-02-08 17:48:52: pid 15644:DETAIL: postmaster on DB node 1 was shutdown by administrative command
2020-02-08 17:48:52: pid 15644:LOG: received degenerate backend request for node_id: 1 from pid [15644]
2020-02-08 17:48:52: pid 14643:LOG: new IPC connection received
2020-02-08 17:48:52: pid 14643:LOG: watchdog received the failover command from local pgpool-II on IPC interface
2020-02-08 17:48:52: pid 15029:LOG: reading and processing packets
2020-02-08 17:48:52: pid 15029:DETAIL: postmaster on DB node 1 was shutdown by administrative command
2020-02-08 17:48:52: pid 14643:LOG: watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from local pgpool-II on IPC interface
2020-02-08 17:48:52: pid 14643:LOG: we have got the consensus to perform the failover
2020-02-08 17:48:52: pid 14643:DETAIL: 1 node(s) voted in the favor
2020-02-08 17:48:52: pid 14641:LOG: Pgpool-II parent process has received failover request
2020-02-08 17:48:52: pid 14643:LOG: new IPC connection received
2020-02-08 17:48:52: pid 14643:LOG: received the failover indication from Pgpool-II on IPC interface
2020-02-08 17:48:52: pid 14643:LOG: watchdog is informed of failover start by the main process
2020-02-08 17:48:52: pid 14641:LOG: starting degeneration. shutdown host pgpool-poc02.novalocal(5432)
2020-02-08 17:48:52: pid 14641:LOG: Do not restart children because we are switching over node id 1 host: pgpool-poc02.novalocal port: 5432 and we are in streaming replication mode
2020-02-08 17:48:52: pid 14641:LOG: child pid 15644 needs to restart because pool 0 uses backend 1
2020-02-08 17:48:52: pid 14641:LOG: execute command: /usr/share/pgpool/4.1.0/etc/failover.sh 1 0 pgpool-poc01.novalocal postgrestg /installer/postgresql-11.5/data/im_the_master
2020-02-08 17:48:52: pid 14641:LOG: failover: set new primary node: 0
2020-02-08 17:48:52: pid 14641:LOG: failover: set new master node: 0
2020-02-08 17:48:52: pid 14641:LOG: child pid 15644 needs to restart because pool 0 uses backend 1
2020-02-08 17:48:52: pid 14643:LOG: new IPC connection received
2020-02-08 17:48:52: pid 14643:LOG: received the failover indication from Pgpool-II on IPC interface
2020-02-08 17:48:52: pid 14643:LOG: watchdog is informed of failover end by the main process
2020-02-08 17:48:52: pid 15663:LOG: worker process received restart request
failover done. shutdown host pgpool-poc02.novalocal(5432)2020-02-08 17:48:52: pid 14641:LOG: failover done. shutdown host pgpool-poc02.novalocal(5432)
2020-02-08 17:48:53: pid 15662:LOG: restart request received in pcp child process
2020-02-08 17:48:53: pid 14641:LOG: PCP child 15662 exits with status 0 in failover()
2020-02-08 17:48:53: pid 14641:LOG: fork a new PCP child pid 16406 in failover()
2020-02-08 17:48:53: pid 14641:LOG: child process with pid: 15644 exits with status 256
2020-02-08 17:48:53: pid 14641:LOG: child process with pid: 15644 exited with success and will not be restarted
2020-02-08 17:48:53: pid 16406:LOG: PCP process: 16406 started
2020-02-08 17:48:53: pid 14641:LOG: worker child process with pid: 15663 exits with status 256
2020-02-08 17:48:53: pid 14641:LOG: fork a new worker child process with pid: 16407
2020-02-08 17:48:53: pid 16407:LOG: get_query_result: no rows returned
2020-02-08 17:48:53: pid 16407:DETAIL: node id (0)
2020-02-08 17:48:53: pid 16407:CONTEXT: while checking replication time lag
2020-02-08 17:48:53: pid 16407:LOG: get_query_result falied: status: -1
2020-02-08 17:48:53: pid 16407:CONTEXT: while checking replication time lag
2020-02-08 17:48:58: pid 16407:LOG: get_query_result: no rows returned
2020-02-08 17:48:58: pid 16407:DETAIL: node id (0)
2020-02-08 17:48:58: pid 16407:CONTEXT: while checking replication time lag
2020-02-08 17:48:58: pid 16407:LOG: get_query_result falied: status: -1
2020-02-08 17:48:58: pid 16407:CONTEXT: while checking replication time lag
2020-02-08 17:49:03: pid 16407:LOG: get_query_result: no rows returned
2020-02-08 17:49:03: pid 16407:DETAIL: node id (0)
2020-02-08 17:49:03: pid 16407:CONTEXT: while checking replication time lag
2020-02-08 17:49:03: pid 16407:LOG: get_query_result falied: status: -1
2020-02-08 17:49:03: pid 16407:CONTEXT: while checking replication time lag
2020-02-08 17:49:08: pid 16407:LOG: get_query_result: no rows returned
2020-02-08 17:49:08: pid 16407:DETAIL: node id (0)
2020-02-08 17:49:08: pid 16407:CONTEXT: while checking replication time lag
2020-02-08 17:49:08: pid 16407:LOG: get_query_result falied: status: -1

t-ishii

2020-02-09 15:24

developer   ~0003156

Looks good to me. In the log:
2020-02-08 17:49:08: pid 16407:LOG: get_query_result: no rows returned
2020-02-08 17:49:08: pid 16407:DETAIL: node id (0)
2020-02-08 17:49:08: pid 16407:CONTEXT: while checking replication time lag
2020-02-08 17:49:08: pid 16407:LOG: get_query_result falied: status: -1

This says that the query ("SELECT ... FROM pg_stat_replication ...") returned no rows. This is normal because there is no standby node at the moment.

The next test is whether you can recover the standby node by using pcp_recovery_node. After issuing a pcp_recovery_node command, something like:

pcp_recovery_node -h pgpool-poc01.novalocal 1

you should see that pgpool-poc02.novalocal comes back online.
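
For reference, the same check can be run by hand on the primary (a sketch reusing the psql path and the replication-check query that appear elsewhere in this thread):

/usr/local/pgsql11.5/bin/psql -h pgpool-poc01.novalocal -p 5432 -U postgres -d postgres \
  -c "SELECT application_name, state, sync_state FROM pg_stat_replication;"
# With the standby shut down this returns zero rows, which is exactly what
# pgpool's replication check logs as "get_query_result: no rows returned".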

raj.pandey1982@gmail.com

2020-02-09 15:34

reporter   ~0003157

Sure, I will do it and share the result.
But in the meantime, one thing I tested yesterday after sending you the update: I started the slave DB and it came back in sync, then I attached the node with pcp_attach_node and that went fine.
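
For reference, the attach step can be issued like the other pcp commands in this thread (a sketch; the host, port and node id mirror the pcp_recovery_node call shown later):

pcp_attach_node -h pgpool-poc01.novalocal -p 9898 -n 1 -U postgres -w
# re-attaches backend node 1 (pgpool-poc02.novalocal) once its postmaster is
# back up and streaming from the primary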

t-ishii

2020-02-09 15:44

developer   ~0003158

Yes, that should work too, as long as the standby node is healthy after restarting. pcp_recovery_node can also recover a standby node that is out of sync with the primary, for example. So testing pcp_recovery_node is important.

raj.pandey1982@gmail.com

2020-02-09 16:37

reporter   ~0003159

[root@pgpool-poc01 replscripts]# pcp_recovery_node -h pgpool-poc02.novalocal -p 9898 -n 1 -U postgres -w
ERROR: executing recovery, execution of command failed at "1st stage"
DETAIL: command:"recovery_1st_stage.sh"

[root@pgpool-poc01 replscripts]#

t-ishii

2020-02-09 19:31

developer   ~0003160

You'd better check the PostgreSQL log of pgpool-poc01.novalocal and the pgpool log. Can you share them?

raj.pandey1982@gmail.com

2020-02-09 19:40

reporter  

mawidstg01-2020-02-09_000000.log (10,485,834 bytes)
mawidstg01-2020-02-09_131237.log (128,546 bytes)
pgpool-3.conf (41,865 bytes)

raj.pandey1982@gmail.com

2020-02-09 19:53

reporter   ~0003161

++ pgpool log too

raj.pandey1982@gmail.com

2020-02-09 19:55

reporter  

pgpool.zip (1,210,974 bytes)

t-ishii

2020-02-10 07:50

developer   ~0003162

Which one is the pgpool-poc01.novalocal log when you executed pcp_recovery_node?

raj.pandey1982@gmail.com

2020-02-10 15:09

reporter   ~0003164

The zip file pgpool.zip is the pgpool log of pgpool-poc01.novalocal.

t-ishii

2020-02-10 17:14

developer   ~0003165

When did you execute pcp_recovery_node?

raj.pandey1982@gmail.com

2020-02-10 17:32

reporter   ~0003166

For your convenience I removed the old log and did it again today at 11:26 AM.
Step 1:
[postgres@pgpool-poc02 logs]$ /usr/local/pgsql11.5/bin/pg_ctl -D /installer/postgresql-11.5/data stop
waiting for server to shut down.... done
server stopped
[postgres@pgpool-poc02 logs]$

Step 2
[root@pgpool-poc01 pgpool]# pcp_recovery_node -h pgpool-poc02.novalocal -p 9898 -n 1 -U postgres -w
ERROR: executing recovery, execution of command failed at "1st stage"
DETAIL: command:"recovery_1st_stage.sh"


Step 3 Checked pgpool log at pgpool-poc01.novalocal :-

2020-02-10 11:26:24: pid 11668:DETAIL: postmaster on DB node 1 was shutdown by administrative command
2020-02-10 11:26:24: pid 11668:LOG: received degenerate backend request for node_id: 1 from pid [11668]
2020-02-10 11:26:24: pid 11029:LOG: new IPC connection received
2020-02-10 11:26:24: pid 11029:LOG: watchdog received the failover command from local pgpool-II on IPC interface
2020-02-10 11:26:24: pid 11029:LOG: watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from local pgpool-II on IPC interface
2020-02-10 11:26:24: pid 11029:LOG: we have got the consensus to perform the failover
2020-02-10 11:26:24: pid 11029:DETAIL: 1 node(s) voted in the favor
2020-02-10 11:26:24: pid 11027:LOG: Pgpool-II parent process has received failover request
2020-02-10 11:26:24: pid 11029:LOG: new IPC connection received
2020-02-10 11:26:24: pid 11029:LOG: received the failover indication from Pgpool-II on IPC interface
2020-02-10 11:26:24: pid 11029:LOG: watchdog is informed of failover start by the main process
2020-02-10 11:26:24: pid 11726:LOG: reading and processing packets
2020-02-10 11:26:24: pid 11726:DETAIL: postmaster on DB node 1 was shutdown by administrative command
2020-02-10 11:26:24: pid 11726:LOG: received degenerate backend request for node_id: 1 from pid [11726]
2020-02-10 11:26:24: pid 11027:LOG: starting degeneration. shutdown host pgpool-poc02.novalocal(5432)
2020-02-10 11:26:24: pid 11029:LOG: new IPC connection received
2020-02-10 11:26:24: pid 11029:LOG: watchdog received the failover command from local pgpool-II on IPC interface
2020-02-10 11:26:24: pid 11029:LOG: watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from local pgpool-II on IPC interface
2020-02-10 11:26:24: pid 11029:LOG: we have got the consensus to perform the failover
2020-02-10 11:26:24: pid 11029:DETAIL: 1 node(s) voted in the favor
2020-02-10 11:26:24: pid 11027:LOG: Do not restart children because we are switching over node id 1 host: pgpool-poc02.novalocal port: 5432 and we are in streaming replication mode
2020-02-10 11:26:24: pid 11027:LOG: child pid 11668 needs to restart because pool 0 uses backend 1
2020-02-10 11:26:24: pid 11027:LOG: child pid 11726 needs to restart because pool 0 uses backend 1
2020-02-10 11:26:24: pid 11027:LOG: execute command: /usr/share/pgpool/4.1.0/etc/failover.sh 1 0 pgpool-poc01.novalocal postgrestg /installer/postgresql-11.5/data/im_the_master
2020-02-10 11:26:24: pid 11027:LOG: failover: set new primary node: 0
2020-02-10 11:26:24: pid 11027:LOG: failover: set new master node: 0
2020-02-10 11:26:24: pid 11027:LOG: child pid 11668 needs to restart because pool 0 uses backend 1
2020-02-10 11:26:24: pid 11027:LOG: child pid 11726 needs to restart because pool 0 uses backend 1
2020-02-10 11:26:24: pid 11029:LOG: new IPC connection received
2020-02-10 11:26:24: pid 11029:LOG: received the failover indication from Pgpool-II on IPC interface
2020-02-10 11:26:24: pid 11029:LOG: watchdog is informed of failover end by the main process
failover done. shutdown host pgpool-poc02.novalocal(5432)2020-02-10 11:26:24: pid 11027:LOG: failover done. shutdown host pgpool-poc02.novalocal(5432)
2020-02-10 11:26:24: pid 12056:LOG: worker process received restart request
2020-02-10 11:26:24: pid 11029:LOG: new IPC connection received
2020-02-10 11:26:24: pid 11029:LOG: received the failover indication from Pgpool-II on IPC interface
2020-02-10 11:26:24: pid 11029:LOG: watchdog is informed of failover start by the main process
2020-02-10 11:26:24: pid 11027:LOG: failover: no backends are degenerated
2020-02-10 11:26:25: pid 12055:LOG: restart request received in pcp child process
2020-02-10 11:26:25: pid 11027:LOG: PCP child 12055 exits with status 0 in failover()
2020-02-10 11:26:25: pid 11027:LOG: fork a new PCP child pid 12172 in failover()
2020-02-10 11:26:25: pid 11027:LOG: child process with pid: 11668 exits with status 256
2020-02-10 11:26:25: pid 11027:LOG: child process with pid: 11668 exited with success and will not be restarted
2020-02-10 11:26:25: pid 11027:LOG: child process with pid: 11726 exits with status 256
2020-02-10 11:26:25: pid 11027:LOG: child process with pid: 11726 exited with success and will not be restarted
2020-02-10 11:26:25: pid 11027:LOG: worker child process with pid: 12056 exits with status 256
2020-02-10 11:26:25: pid 12172:LOG: PCP process: 12172 started
2020-02-10 11:26:25: pid 11027:LOG: fork a new worker child process with pid: 12173
2020-02-10 11:26:25: pid 12173:LOG: get_query_result: no rows returned
2020-02-10 11:26:25: pid 12173:DETAIL: node id (0)
2020-02-10 11:26:25: pid 12173:CONTEXT: while checking replication time lag
2020-02-10 11:26:25: pid 12173:LOG: get_query_result falied: status: -1
2020-02-10 11:26:25: pid 12173:CONTEXT: while checking replication time lag
2020-02-10 11:26:30: pid 12173:LOG: get_query_result: no rows returned
2020-02-10 11:26:30: pid 12173:DETAIL: node id (0)
2020-02-10 11:26:30: pid 12173:CONTEXT: while checking replication time lag
2020-02-10 11:26:30: pid 12173:LOG: get_query_result falied: status: -1
2020-02-10 11:26:30: pid 12173:CONTEXT: while checking replication time lag
2020-02-10 11:26:35: pid 12173:LOG: get_query_result: no rows returned
2020-02-10 11:26:35: pid 12173:DETAIL: node id (0)
2020-02-10 11:26:35: pid 12173:CONTEXT: while checking replication time lag
2020-02-10 11:26:35: pid 12173:LOG: get_query_result falied: status: -1
2020-02-10 11:26:35: pid 12173:CONTEXT: while checking replication time lag
2020-02-10 11:26:40: pid 12173:LOG: get_query_result: no rows returned
2020-02-10 11:26:40: pid 12173:DETAIL: node id (0)
2020-02-10 11:26:40: pid 12173:CONTEXT: while checking replication time lag
2020-02-10 11:26:40: pid 12173:LOG: get_query_result falied: status: -1
2020-02-10 11:26:40: pid 12173:CONTEXT: while checking replication time lag
2020-02-10 11:26:45: pid 12173:LOG: get_query_result: no rows returned
2020-02-10 11:26:45: pid 12173:DETAIL: node id (0)
2020-02-10 11:26:45: pid 12173:CONTEXT: while checking replication time lag
2020-02-10 11:26:45: pid 12173:LOG: get_query_result falied: status: -1
2020-02-10 11:26:45: pid 12173:CONTEXT: while checking replication time lag
2020-02-10 11:26:50: pid 12173:LOG: get_query_result: no rows returned
2020-02-10 11:26:50: pid 12173:DETAIL: node id (0)
2020-02-10 11:26:50: pid 12173:CONTEXT: while checking replication time lag
2020-02-10 11:26:50: pid 12173:LOG: get_query_result falied: status: -1
2020-02-10 11:26:50: pid 12173:CONTEXT: while checking replication time lag
2020-02-10 11:26:55: pid 12173:LOG: get_query_result: no rows returned

Step 4:- attaching the full logfile also.

raj.pandey1982@gmail.com

2020-02-10 17:34

reporter  

pgpool_10feb1126am.log (340,102 bytes)

raj.pandey1982@gmail.com

2020-02-10 17:44

reporter   ~0003167

Hello friend, I got this in the log while executing the recovery command:-

[root@pgpool-poc01 pgpool]# pcp_recovery_node -h pgpool-poc01.novalocal -p 9898 -n 1 -U postgres -w
ERROR: executing recovery, execution of command failed at "1st stage"
DETAIL: command:"recovery_1st_stage.sh"


2020-02-10 11:42:13: pid 14200:LOG: starting recovering node 1
2020-02-10 11:42:13: pid 14200:LOG: executing recovery
2020-02-10 11:42:13: pid 14200:DETAIL: starting recovery command: "SELECT pgpool_recovery('recovery_1st_stage.sh', 'pgpool-poc02.novalocal', '/installer/postgresql-11.5/data', '5432', 1, '5432')"
2020-02-10 11:42:13: pid 14200:LOG: executing recovery
2020-02-10 11:42:13: pid 14200:DETAIL: disabling statement_timeout
2020-02-10 11:42:13: pid 14200:ERROR: executing recovery, execution of command failed at "1st stage"
2020-02-10 11:42:13: pid 14200:DETAIL: command:"recovery_1st_stage.sh"
2020-02-10 11:42:13: pid 12172:LOG: PCP process with pid: 14200 exit with SUCCESS.
2020-02-10 11:42:13: pid 12172:LOG: PCP process with pid: 14200 exits with status 0
2020-02-10 11:42:17: pid 12173:LOG: get_query_result: no rows returned
2020-02-10 11:42:17: pid 12173:DETAIL: node id (0)
2020-02-10 11:42:17: pid 12173:CONTEXT: while checking replication time lag
2020-02-10 11:42:17: pid 12173:LOG: get_query_result falied: status: -1
2020-02-10 11:42:17: pid 12173:CONTEXT: while checking replication time lag
2020-02-10 11:42:22: pid 12173:LOG: get_query_result: no rows returned
2020-02-10 11:42:22: pid 12173:DETAIL: node id (0)
2020-02-10 11:42:22: pid 12173:CONTEXT: while checking replication time lag
2020-02-10 11:42:22: pid 12173:LOG: get_query_result falied: status: -1
2020-02-10 11:42:22: pid 12173:CONTEXT: while checking replication time lag
2020-02-10 11:42:27: pid 12173:LOG: get_query_result: no rows returned
2020-02-10 11:42:27: pid 12173:DETAIL: node id (0)
2020-02-10 11:42:27: pid 12173:CONTEXT: while checking replication time lag
2020-02-10 11:42:27: pid 12173:LOG: get_query_result falied: status: -1
2020-02-10 11:42:27: pid 12173:CONTEXT: while checking replication time lag
2020-02-10 11:42:32: pid 12173:LOG: get_query_result: no rows returned
2020-02-10 11:42:32: pid 12173:DETAIL: node id (0)
2020-02-10 11:42:32: pid 12173:CONTEXT: while checking replication time lag
2020-02-10 11:42:32: pid 12173:LOG: get_query_result falied: status: -1
2020-02-10 11:42:32: pid 12173:CONTEXT: while checking replication time lag
2020-02-10 11:42:37: pid 12173:LOG: get_query_result: no rows returned
2020-02-10 11:42:37: pid 12173:DETAIL: node id (0)
2020-02-10 11:42:37: pid 12173:CONTEXT: while checking replication time lag
2020-02-10 11:42:37: pid 12173:LOG: get_query_result falied: status: -1
2020-02-10 11:42:37: pid 12173:CONTEXT: while checking replication time lag
2020-02-10 11:42:42: pid 12173:LOG: get_query_result: no rows returned
2020-02-10 11:42:42: pid 12173:DETAIL: node id (0)
2020-02-10 11:42:42: pid 12173:CONTEXT: while checking replication time lag
2020-02-10 11:42:42: pid 12173:LOG: get_query_result falied: status: -1
2020-02-10 11:42:42: pid 12173:CONTEXT: while checking replication time lag

t-ishii

2020-02-10 17:48

developer   ~0003168

Can you share PostgreSQL log at the same time on pgpool-poc01.novalocal?

raj.pandey1982@gmail.com

2020-02-10 20:50

reporter  

pgpool.log_1126amLogWithRecoveryError (1,163,032 bytes)

raj.pandey1982@gmail.com

2020-02-11 16:02

reporter   ~0003170

Hi friend, any findings here?

t-ishii

2020-02-11 16:28

developer   ~0003171

Sorry but what I wanted was PostgreSQL log, not pgpool log.

raj.pandey1982@gmail.com

2020-02-11 19:50

reporter   ~0003172

I did this again to reproduce the issue. Here is the master DB node's PostgreSQL log output (attaching the full log as well):-
2020-02-11 13:45:20 +03 LOG: statement: SELECT pg_current_wal_lsn()
2020-02-11 13:45:20 +03 LOG: statement: SELECT application_name, state, sync_state FROM pg_stat_replication
2020-02-11 13:45:20 +03 LOG: statement: SELECT pg_is_in_recovery()
2020-02-11 13:45:21 +03 LOG: statement: SELECT
           (SELECT count(*) FROM pg_stat_activity) AS "Total",
           (SELECT count(*) FROM pg_stat_activity WHERE state = 'active') AS "Active",
           (SELECT count(*) FROM pg_stat_activity WHERE state = 'idle') AS "Idle"
2020-02-11 13:45:21 +03 LOG: statement: SET statement_timeout To 0
2020-02-11 13:45:21 +03 LOG: statement: SELECT pgpool_recovery('recovery_1st_stage.sh', 'pgpool-poc02.novalocal', '/installer/postgresql-11.5/data', '5432', 1, '5432')
2020-02-11 13:45:21 +03 ERROR: function pgpool_recovery(unknown, unknown, unknown, unknown, integer, unknown) does not exist at character 8
2020-02-11 13:45:21 +03 HINT: No function matches the given name and argument types. You might need to add explicit type casts.
2020-02-11 13:45:21 +03 STATEMENT: SELECT pgpool_recovery('recovery_1st_stage.sh', 'pgpool-poc02.novalocal', '/installer/postgresql-11.5/data', '5432', 1, '5432')
2020-02-11 13:45:21 +03 LOG: statement: SELECT pg_current_wal_lsn()
2020-02-11 13:45:21 +03 LOG: statement: SELECT application_name, state, sync_state FROM pg_stat_replication
2020-02-11 13:45:21 +03 LOG: statement: SELECT pg_is_in_recovery()
2020-02-11 13:45:22 +03 LOG: statement: SELECT
           (SELECT count(*) FROM pg_stat_activity) AS "Total",
           (SELECT count(*) FROM pg_stat_activity WHERE state = 'active') AS "Active",
           (SELECT count(*) FROM pg_stat_activity WHERE state = 'idle') AS "Idle"
2020-02-11 13:45:23 +03 LOG: statement: SELECT
           (SELECT count(*) FROM pg_stat_activity) AS "Total",
           (SELECT count(*) FROM pg_stat_activity WHERE state = 'active') AS "Active",
           (SELECT count(*) FROM pg_stat_activity WHERE state = 'idle') AS "Idle"
2020-02-11 13:45:24 +03 LOG: statement: SELECT
           (SELECT count(*) FROM pg_stat_activity) AS "Total",
           (SELECT count(*) FROM pg_stat_activity WHERE state = 'active') AS "Active",
           (SELECT count(*) FROM pg_stat_activity WHERE state = 'idle') AS "Idle"
2020-02-11 13:45:25 +03 LOG: statement: SELECT
           (SELECT count(*) FROM pg_stat_activity) AS "Total",
           (SELECT count(*) FROM pg_stat_activity WHERE state = 'active') AS "Active",

mawidstg01-2020-02-11_133657.log (351,585 bytes)

t-ishii

2020-02-11 20:05

developer   ~0003173

You need to install the pgpool_recovery extension, which comes with Pgpool-II, into every PostgreSQL node.
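
One quick check on each backend before rerunning pcp_recovery_node (a sketch; note that pg_available_extensions only lists extensions that could be installed, while pg_extension lists what is actually installed in the current database):

/usr/local/pgsql11.5/bin/psql -p 5432 -U postgres -d template1 \
  -c "SELECT extname, extversion FROM pg_extension;"
# If pgpool_recovery is missing here, the SELECT pgpool_recovery(...) issued
# during online recovery fails with "function ... does not exist".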

raj.pandey1982@gmail.com

2020-02-11 20:52

reporter   ~0003174

It's already there, I just checked:-

select * from pg_available_extensions;
'plpgsql','1.0','1.0','PL/pgSQL procedural language'
'pgcrypto','1.3','1.3','cryptographic functions'
'pg_stat_statements','1.6','1.4','track execution statistics of all SQL statements executed'
'pg_buffercache','1.3','1.2','examine the shared buffer cache'
'pgpool_adm','1.2',,'Administrative functions for pgPool'
'pgpool_recovery','1.3',,'recovery functions for pgpool-II for V4.1 or later'
'pgpool_regclass','1.0',,'replacement for regclass'

raj.pandey1982@gmail.com

2020-02-11 20:56

reporter   ~0003175

Also, this is the recovery_1st_stage.sh script that I am using:-

[postgres@pgpool-poc01 data]$ cat recovery_1st_stage.sh
#! /bin/sh
# Online recovery 1st stage script; arguments are passed in by pgpool_recovery().
psql=/usr/local/pgsql11.5/bin/psql
DATADIR_BASE=/installer/postgresql-11.5/data
PGSUPERUSER=postgres
master_db_cluster=$1          # data directory of the current primary
recovery_node_host_name=$2    # host name of the node to be recovered
DEST_CLUSTER=$3               # data directory of the node to be recovered
PORT=$4
recovery_node=$5
pg_rewind_failed="true"
log=/installer/replscripts/recovery.log
echo >> $log
date >> $log
if [ $pg_rewind_failed = "true" ]; then
  $psql -p $PORT -c "SELECT pg_start_backup('Streaming Replication', true)" postgres
  echo "source: $master_db_cluster dest: $DEST_CLUSTER" >> $log
  # NOTE: as written this rsync copies to a local path; to recover the remote
  # standby the destination is presumably meant to be $recovery_node_host_name:$DEST_CLUSTER/.
  rsync -C -a -c --delete --exclude postgresql.conf --exclude postmaster.pid \
    --exclude postmaster.opts --exclude pg_log \
    --exclude recovery.conf --exclude recovery.done \
    --exclude pg_xlog \
    $master_db_cluster/ $DEST_CLUSTER/
  rm -fr $DEST_CLUSTER/pg_xlog
  mkdir $DEST_CLUSTER/pg_xlog
  chmod 700 $DEST_CLUSTER/pg_xlog
  rm $DEST_CLUSTER/recovery.done
fi
# Write recovery.conf for the recovered standby.
cat > $DEST_CLUSTER/recovery.conf <<REOF
standby_mode = 'on'
primary_conninfo = 'port=$PORT user=$PGSUPERUSER'
recovery_target_timeline='latest'
restore_command = 'cp /installer/archivedir/%f "%p" 2> /dev/null'
REOF
if [ $pg_rewind_failed = "true" ]; then
  $psql -p $PORT -c "SELECT pg_stop_backup()" postgres
fi
if [ $pg_rewind_failed = "false" ]; then
  cp /tmp/slaveconfbkp/postgresql.conf $DEST_CLUSTER/
fi

t-ishii

2020-02-11 21:59

developer   ~0003176

The extension needs to be installed into template1 database. Have you done it?
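
The usual way is something like the following (a sketch, assuming the pgpool_recovery extension files are present under the PostgreSQL 11.5 share/extension directory):

/usr/local/pgsql11.5/bin/psql -p 5432 -U postgres -d template1 -c "CREATE EXTENSION pgpool_recovery;"
# Per the Pgpool-II docs, online recovery runs SELECT pgpool_recovery(...)
# in template1 on the primary, so the extension has to exist in that database.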

raj.pandey1982@gmail.com

2020-02-11 22:09

reporter   ~0003177

Did it now:-

root@pgpool-poc01 sql]# sudo -u postgres /usr/local/pgsql11.5/bin/psql -f pgpool-recovery.sql template1
CREATE FUNCTION
CREATE FUNCTION
CREATE FUNCTION
CREATE FUNCTION
[root@pgpool-poc01 sql]#

raj.pandey1982@gmail.com

2020-02-11 22:12

reporter   ~0003178

Template1 Database output:-
select * from pg_available_extensions;
'plpgsql','1.0','1.0','PL/pgSQL procedural language'
'pgcrypto','1.3',,'cryptographic functions'
'pg_stat_statements','1.6',,'track execution statistics of all SQL statements executed'
'pg_buffercache','1.3',,'examine the shared buffer cache'
'pgpool_adm','1.2',,'Administrative functions for pgPool'
'pgpool_recovery','1.3',,'recovery functions for pgpool-II for V4.1 or later'
'pgpool_regclass','1.0',,'replacement for regclass'

raj.pandey1982@gmail.com

2020-02-11 22:35

reporter   ~0003179

I again tried to shut down the slave and fired the recovery command; same issue:-
2020-02-11 16:30:42 +03 LOG: statement: SELECT
           (SELECT sum(blks_read) FROM pg_stat_database WHERE datname = (SELECT datname FROM pg_database WHERE oid = 16416)) AS "Reads",
           (SELECT sum(blks_hit) FROM pg_stat_database WHERE datname = (SELECT datname FROM pg_database WHERE oid = 16416)) AS "Hits"
2020-02-11 16:30:42 +03 LOG: statement: SET statement_timeout To 0
2020-02-11 16:30:42 +03 LOG: statement: SELECT pgpool_recovery('recovery_1st_stage.sh', 'pgpool-poc02.novalocal', '/installer/postgresql-11.5/data', '5432', 1, '5432')
2020-02-11 16:30:42 +03 ERROR: function pgpool_recovery(unknown, unknown, unknown, unknown, integer, unknown) does not exist at character 8
2020-02-11 16:30:42 +03 HINT: No function matches the given name and argument types. You might need to add explicit type casts.
2020-02-11 16:30:42 +03 STATEMENT: SELECT pgpool_recovery('recovery_1st_stage.sh', 'pgpool-poc02.novalocal', '/installer/postgresql-11.5/data', '5432', 1, '5432')
2020-02-11 16:30:42 +03 LOG: statement: SELECT
           (SELECT count(*) FROM pg_stat_activity WHERE datname = (SELECT datname FROM pg_database WHERE oid = 16416)) AS "Total",
           (SELECT count(*) FROM pg_stat_activity WHERE state = 'active' AND datname = (SELECT datname FROM pg_database WHERE oid = 16416)) AS "Acti

t-ishii

2020-02-11 22:45

developer   ~0003180

Why don't you use the CREATE EXTENSION command? Using psql and an SQL script like pgpool-recovery.sql is an obsolete way to register the functions.

raj.pandey1982@gmail.com

2020-02-12 15:45

reporter   ~0003185

It's giving an error:-

template1=# CREATE EXTENSION pgpool_recovery;
ERROR: extension "pgpool_recovery" has no installation script nor update path for version "1.3"

raj.pandey1982@gmail.com

2020-02-12 15:47

reporter   ~0003186

[postgres@pgpool-poc01 logs]$ locate pgpool_recovery
/installer/pgpool-II-4.1.0/src/sql/pgpool-recovery/pgpool_recovery--1.1--1.2.sql
/installer/pgpool-II-4.1.0/src/sql/pgpool-recovery/pgpool_recovery--1.1.sql
/installer/pgpool-II-4.1.0/src/sql/pgpool-recovery/pgpool_recovery--1.2--1.3.sql
/installer/pgpool-II-4.1.0/src/sql/pgpool-recovery/pgpool_recovery--1.2.sql
/installer/pgpool-II-4.1.0/src/sql/pgpool-recovery/pgpool_recovery--1.3.sql
/installer/pgpool-II-4.1.0/src/sql/pgpool-recovery/pgpool_recovery.control
/usr/local/pgsql11.5/share/extension/pgpool_recovery--1.1.sql
/usr/local/pgsql11.5/share/extension/pgpool_recovery.control
[postgres@pgpool-poc01 logs]$

t-ishii

2020-02-12 16:42

developer   ~0003187

It's an installation mistake: pgpool_recovery--1.3.sql is not there. Ask for help from whoever did the installation.
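
For reference, a sketch of how the extension files are normally installed, assuming the source tree and PostgreSQL locations shown in this thread:

cd /installer/pgpool-II-4.1.0/src/sql/pgpool-recovery
# make sure the PostgreSQL 11.5 pg_config is the one picked up
export PATH=/usr/local/pgsql11.5/bin:$PATH
make && make install
# This puts pgpool_recovery.control and all pgpool_recovery--*.sql files into
# /usr/local/pgsql11.5/share/extension. If the functions were already created
# by hand from pgpool-recovery.sql, they may need to be dropped before
# CREATE EXTENSION pgpool_recovery succeeds in template1.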

raj.pandey1982@gmail.com

2020-02-12 16:57

reporter   ~0003188

I copied the script from the installer and tried again, but since I had already created the functions it gives an "already exists" message.
[postgres@pgpool-poc01 lib]$ cp -p /installer/pgpool-II-4.1.0/src/sql/pgpool-recovery/pgpool_recovery--1.3.sql /usr/local/pgsql11.5/share/extension/
[postgres@pgpool-poc01 lib]$ /usr/local/pgsql11.5/bin/psql -h 10.70.184.27 -p 5432 -U postgres -d template1
psql (11.5)
Type "help" for help.

template1=# CREATE EXTENSION pgpool_recovery;
ERROR: function "pgpool_recovery" already exists with same argument types
template1=#

raj.pandey1982@gmail.com

2020-02-12 19:49

reporter   ~0003189

I tried again after this: same issue:-
2020-02-12 13:42:14 +03 LOG: statement: SELECT pg_current_wal_lsn()
2020-02-12 13:42:14 +03 LOG: statement: SELECT application_name, state, sync_state FROM pg_stat_replication
2020-02-12 13:42:14 +03 LOG: statement: SELECT pg_is_in_recovery()
2020-02-12 13:42:15 +03 LOG: statement: SET statement_timeout To 0
2020-02-12 13:42:15 +03 LOG: statement: SELECT pgpool_recovery('recovery_1st_stage.sh', 'pgpool-poc02.novalocal', '/installer/postgresql-11.5/data', '5432', 1, '5432')
2020-02-12 13:42:15 +03 ERROR: function pgpool_recovery(unknown, unknown, unknown, unknown, integer, unknown) does not exist at character 8
2020-02-12 13:42:15 +03 HINT: No function matches the given name and argument types. You might need to add explicit type casts.
2020-02-12 13:42:15 +03 STATEMENT: SELECT pgpool_recovery('recovery_1st_stage.sh', 'pgpool-poc02.novalocal', '/installer/postgresql-11.5/data', '5432', 1, '5432')
2020-02-12 13:42:19 +03 LOG: statement: SELECT pg_current_wal_lsn()
2020-02-12 13:42:19 +03 LOG: statement: SELECT application_name, state, sync_state FROM pg_stat_replication
2020-02-12 13:42:19 +03 LOG: statement: SELECT pg_is_in_recovery()
2020-02-12 13:42:22 +03 LOG: execute <unnamed>: SET extra_float_digits = 3
2020-02-12 13:42:22 +03 LOG: execute <unnamed>: SET application_name = 'his-syncreport-svc'
2020-02-12 13:42:24 +03 LOG: statement: SELECT pg_current_wal_lsn()
2020-02-12 13:42:24 +03 LOG: statement: SELECT application_name, state, sync_state FROM pg_stat_replication

raj.pandey1982@gmail.com

2020-02-13 02:10

reporter   ~0003190

I changed the failover script from my own to the example script in the PGPOOL 4.1 documentation, and then tried a failover by shutting down the DB on node 1:-

The failover itself went well (the slave was promoted to master), but again I could not make connections from the pgAdmin console. I ran the commands below; pcp_node_info showed the newly promoted master with status 'waiting' for a long time, and front-end connections were not happening. When I checked again after 15-20 minutes, the newly promoted master's status showed as up, but remote connections were still not happening:-
 
[root@pgpool-poc01 etc]# pcp_node_info --verbose -h 10.70.184.27 -U postgres 0 -w
Hostname : pgpool-poc01.novalocal
Port : 5432
Status : 3
Weight : 0.500000
Status Name : down
Role : standby
Replication Delay : 0
Replication State :
Replication Sync State :
Last Status Change : 2020-02-12 18:59:49
[root@pgpool-poc01 etc]# pcp_node_info --verbose -h 10.70.184.28 -U postgres 1 -w
Hostname : pgpool-poc02.novalocal
Port : 5432
Status : 1
Weight : 0.500000
Status Name : waiting
Role : primary
Replication Delay : 0
Replication State :
Replication Sync State :
Last Status Change : 1970-01-01 03:00:00
[root@pgpool-poc01 etc]# /usr/local/pgsql11.5/bin/psql -h 10.70.184.29 -p 5433 -U postgres -d mawidstg01
psql (11.5)
Type "help" for help.

mawidstg01=# show pool_nodes
mawidstg01-# show pool_nodes;
ERROR: syntax error at or near "show"
LINE 2: show pool_nodes;
        ^
mawidstg01=# ^C
mawidstg01=# show pool_node;
ERROR: unrecognized configuration parameter "pool_node"
mawidstg01=# \q
[root@pgpool-poc01 etc]# /usr/local/pgsql11.5/bin/psql -h 10.70.184.29 -p 5433 -U postgres -d mawidstg01
psql (11.5)
Type "help" for help.

mawidstg01=# show pool_nodes;
 node_id | hostname               | port | status | lb_weight | role    | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
---------+------------------------+------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
       0 | pgpool-poc01.novalocal | 5432 | down   | 0.500000  | standby | 236        | false             | 0                 |                   |                        | 2020-02-12 18:59:49
       1 | pgpool-poc02.novalocal | 5432 | up     | 0.500000  | primary | 0          | true              | 0                 |                   |                        | 2020-02-12 18:59:49
(2 rows)

The pgpool log at that time showed:-
2020-02-12 19:04:31: pid 19354:LOG: get_query_result: no rows returned
2020-02-12 19:04:31: pid 19354:DETAIL: node id (1)
2020-02-12 19:04:31: pid 19354:CONTEXT: while checking replication time lag
2020-02-12 19:04:31: pid 19354:LOG: get_query_result falied: status: -1
2020-02-12 19:04:31: pid 19354:CONTEXT: while checking replication time lag

raj.pandey1982@gmail.com

2020-02-13 02:15

reporter   ~0003191

2020-02-12 19:00:05: pid 19354:LOG: get_query_result falied: status: -1

raj.pandey1982@gmail.com

2020-02-13 02:20

reporter   ~0003192

2020-02-12 18:59:50: pid 17238:LOG: child process with pid: 18254 exited with success and will not be restarted
2020-02-12 18:59:50: pid 17238:LOG: child process with pid: 18255 exits with status 256
2020-02-12 18:59:50: pid 17238:LOG: child process with pid: 18255 exited with success and will not be restarted
2020-02-12 18:59:50: pid 17238:LOG: child process with pid: 18256 exits with status 256
2020-02-12 18:59:50: pid 17238:LOG: child process with pid: 18256 exited with success and will not be restarted
2020-02-12 18:59:50: pid 17238:LOG: worker child process with pid: 18259 exits with status 256
2020-02-12 18:59:50: pid 17238:LOG: fork a new worker child process with pid: 19354
2020-02-12 18:59:50: pid 17238:LOG: child process with pid: 18353 exits with status 256
2020-02-12 18:59:50: pid 17238:LOG: fork a new child process with pid: 19355
2020-02-12 18:59:50: pid 19354:LOG: get_query_result: no rows returned
2020-02-12 18:59:50: pid 19354:DETAIL: node id (1)
2020-02-12 18:59:50: pid 19354:CONTEXT: while checking replication time lag
2020-02-12 18:59:50: pid 19354:LOG: get_query_result falied: status: -1
2020-02-12 18:59:50: pid 19354:CONTEXT: while checking replication time lag
2020-02-12 18:59:55: pid 19354:LOG: get_query_result: no rows returned
2020-02-12 18:59:55: pid 19354:DETAIL: node id (1)
2020-02-12 18:59:55: pid 19354:CONTEXT: while checking replication time lag
2020-02-12 18:59:55: pid 19354:LOG: get_query_result falied: status: -1
2020-02-12 18:59:55: pid 19354:CONTEXT: while checking replication time lag
2020-02-12 19:00:00: pid 19354:LOG: get_query_result: no rows returned
2020-02-12 19:00:00: pid 19354:DETAIL: node id (1)
2020-02-12 19:00:00: pid 19354:CONTEXT: while checking replication time lag
2020-02-12 19:00:00: pid 19354:LOG: get_query_result falied: status: -1
2020-02-12 19:00:00: pid 19354:CONTEXT: while checking replication time lag
2020-02-12 19:00:05: pid 19354:LOG: get_query_result: no rows returned
2020-02-12 19:00:05: pid 19354:DETAIL: node id (1)
2020-02-12 19:00:05: pid 19354:CONTEXT: while checking replication time lag
2020-02-12 19:00:05: pid 19354:LOG: get_query_result falied: status: -1

pgpool_log_12feb.zip (630,397 bytes)
postgresdblog-2020-02-12_185142.zip (67,948 bytes)

raj.pandey1982@gmail.com

2020-02-13 02:23

reporter   ~0003193

Please help here. I am sure I am very close to a resolution; there must be something that explains the 'waiting' status of the newly promoted master node. If, after failover, this 'waiting' status changed immediately to up, remote connections would start happening.

raj.pandey1982@gmail.com

2020-02-13 02:29

reporter   ~0003194

I just restarted the pgpool services on both nodes and now VIP connections are happening. But why should a restart be needed? :-( It should not be required; pgpool should take care of this by itself.
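(Rather than restarting pgpool, a node that stays in down/waiting state after failover can usually be re-attached with pcp_attach_node; a minimal sketch, assuming the same PCP host, user and passwordless setup as the pcp_node_info calls above. Node 0 is the old primary currently reported as down.)

# Re-attach backend node 0 without restarting pgpool; -w reads the password from ~/.pcppass
pcp_attach_node -h 10.70.184.27 -U postgres 0 -w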

t-ishii

2020-02-13 07:37

developer   ~0003195

> I copied the script from the installer and did it again, but as I have already created the function it gives an "already exists" message.
You cannot do it that way. You should run "make install" under src/sql/pgpool-recovery/.
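(A minimal sketch of that install step, assuming the pgpool-II 4.1.0 source tree is unpacked under /installer - that path is illustrative - and that pg_config of the 11.5 installation is on PATH:)

# Build and install the pgpool_recovery extension against the PostgreSQL 11.5 install
export PATH=/usr/local/pgsql11.5/bin:$PATH
cd /installer/pgpool-II-4.1.0/src/sql/pgpool-recovery
make
make install    # copies pgpool_recovery.so, the .control file and the .sql scripts into the PostgreSQL dirs
# then register it in template1
/usr/local/pgsql11.5/bin/psql -d template1 -c 'CREATE EXTENSION pgpool_recovery;'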

raj.pandey1982@gmail.com

2020-02-16 14:14

reporter   ~0003199

Could you please check notes 0003190-0003193? There I can see that after starting the instance, the primary/master stays in the waiting state and PGAdmin is not able to make connections. I restart pgpool 2-3 times and then connecting with PGAdmin works. After that, when I perform a DB failover, it looks like it goes into the waiting state again.

(Note: I don't think there is any issue with the pgpool_recovery function now, as it is already available; I used the binary installer, so installing it separately with the make command should not have been required.)

t-ishii

2020-02-16 14:40

developer   ~0003200

> (Note: I don't think there is any issue with the pgpool_recovery function now, as it is already available; I used the binary installer, so installing it separately with the make command should not have been required.)

No. A PostgreSQL extension consists not only of binaries (.so files) but also of control files and so on. If you don't have enough knowledge of extensions, you should use the make command to install them. Again, I suggest you test step by step. With random tries you will not get any closer to the goal.

raj.pandey1982@gmail.com

2020-02-16 15:14

reporter   ~0003201

All the pgpool-recovery libraries and control files are available:-

/usr/local/pgsql11.5/lib/pgpool-recovery.c
/usr/local/pgsql11.5/lib/pgpool-recovery.so
[root@pgpool-poc01 bin]#
/usr/local/pgsql11.5/share/extension/pgpool_recovery--1.1.sql
/usr/local/pgsql11.5/share/extension/pgpool_recovery--1.3.sql
/usr/local/pgsql11.5/share/extension/pgpool_recovery.control

raj.pandey1982@gmail.com

2020-02-16 19:15

reporter   ~0003202

It is again saying the same thing:- what does this error mean: "You might need to add explicit type casts."?
2020-02-16 13:07:25 +03 LOG: statement: SET statement_timeout To 0
2020-02-16 13:07:25 +03 LOG: statement: SELECT pgpool_recovery('recovery_1st_stage.sh', 'pgpool-poc02.novalocal', '/installer/postgresql-11.5/data', '5432', 1, '5432')
2020-02-16 13:07:25 +03 ERROR: function pgpool_recovery(unknown, unknown, unknown, unknown, integer, unknown) does not exist at character 8
2020-02-16 13:07:25 +03 HINT: No function matches the given name and argument types. You might need to add explicit type casts.
2020-02-16 13:07:25 +03 STATEMENT: SELECT pgpool_recovery('recovery_1st_stage.sh', 'pgpool-poc02.novalocal', '/installer/postgresql-11.5/data', '5432', 1, '5432')

t-ishii

2020-02-16 20:06

developer   ~0003203

I don't know why you keep on ignoring my suggestions: make install under src/sql.
What does the following command show?

psql template1
select * from pg_available_extensions;

If you do not see the following line, then pgpool_recovery is not installed correctly.

pgpool_recovery | 1.3 | 1.3 | recovery functions for pgpool-II for V4.1 or later

raj.pandey1982@gmail.com

2020-02-17 01:07

reporter   ~0003204

Sorry Sir,
Here is the output :
[postgres@pgpool-poc01 logs]$ /usr/local/pgsql11.5/bin/psql template1
psql (11.5)
Type "help" for help.

template1=# select * from pg_available_extensions;
        name | default_version | installed_version | comment
--------------------+-----------------+-------------------+-----------------------------------------------------------
 plpgsql | 1.0 | 1.0 | PL/pgSQL procedural language
 pgcrypto | 1.3 | | cryptographic functions
 pg_stat_statements | 1.6 | | track execution statistics of all SQL statements executed
 pg_buffercache | 1.3 | | examine the shared buffer cache
 pgpool_adm | 1.2 | | Administrative functions for pgPool
 pgpool_recovery | 1.3 | | recovery functions for pgpool-II for V4.1 or later
 pgpool_regclass | 1.0 | | replacement for regclass
(7 rows)

template1=#

t-ishii

2020-02-17 12:49

developer   ~0003205

"installed_version" is empty. It is apparent that you did not execute CREATE EXTENSION on template1.

raj.pandey1982@gmail.com

2020-02-17 15:55

reporter   ~0003206

[postgres@pgpool-poc01 ~]$ sudo -u postgres /usr/local/pgsql11.5/bin/psql template1

template1=# CREATE EXTENSION pgpool_recovery;
ERROR: function "pgpool_recovery" already exists with same argument types
template1=#

t-ishii

2020-02-17 16:25

developer   ~0003207

That's because you installed the functions without using the CREATE EXTENSION command. You need to remove the pgpool_recovery function(s) manually from the template1 database first.

raj.pandey1982@gmail.com

2020-02-17 16:28

reporter   ~0003208

Should I use a drop command? DROP EXTENSION something?

raj.pandey1982@gmail.com

2020-02-17 16:30

reporter   ~0003209

[postgres@pgpool-poc01 sql]$ /usr/local/pgsql11.5/bin/psql template1
psql (11.5)
Type "help" for help.

template1=# DROP EXTENSION pgpool_recovery;
ERROR: extension "pgpool_recovery" does not exist
template1=# select * from pg_available_extensions;
        name | default_version | installed_version | comment
--------------------+-----------------+-------------------+-----------------------------------------------------------
 plpgsql | 1.0 | 1.0 | PL/pgSQL procedural language
 pgcrypto | 1.3 | | cryptographic functions
 pg_stat_statements | 1.6 | | track execution statistics of all SQL statements executed
 pg_buffercache | 1.3 | | examine the shared buffer cache
 pgpool_adm | 1.2 | | Administrative functions for pgPool
 pgpool_recovery | 1.3 | | recovery functions for pgpool-II for V4.1 or later
 pgpool_regclass | 1.0 | | replacement for regclass
(7 rows)

template1=#

raj.pandey1982@gmail.com

2020-02-17 16:46

reporter   ~0003210

Is it about this? HINT: No function matches the given name and argument types. You might need to add explicit type casts.

ERROR: function pgpool_recovery(unknown, unknown, unknown, unknown, integer, unknown) does not exist
LINE 1: SELECT pgpool_recovery('recovery_1st_stage.sh', 'pgpool-poc...
                ^
HINT: No function matches the given name and argument types. You might need to add explicit type casts.
SQL state: 42883
Character: 9

t-ishii

2020-02-17 16:53

developer   ~0003211

> DROP EXTENSION pgpool_recovery;
No, you mistakenly installed pgpool_recovery without using CREATE EXTENSION, so you need to remove the functions by using the DROP FUNCTION command.

raj.pandey1982@gmail.com

2020-02-17 17:03

reporter   ~0003212

template1=# drop FUNCTION pgpool_recovery;
DROP FUNCTION
template1=# CREATE EXTENSION pgpool_recovery;
ERROR: function "pgpool_remote_start" already exists with same argument types
template1=#

t-ishii

2020-02-18 09:11

developer   ~0003214

You need to DROP all functions defined in pgpool_recovery. Check pgpool_recovery--1.3.sql.
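(Spelled out, the cleanup might look like the sketch below; the function names are taken to match pgpool_recovery--1.3.sql, but check that file for the authoritative list and exact signatures. IF EXISTS keeps the drops harmless for functions that were never installed; if PostgreSQL complains that a name is ambiguous, add the argument types from the old definition.)

# Remove the manually installed functions, then install them properly as an extension
/usr/local/pgsql11.5/bin/psql -d template1 <<'EOQ'
DROP FUNCTION IF EXISTS pgpool_recovery;
DROP FUNCTION IF EXISTS pgpool_remote_start;
DROP FUNCTION IF EXISTS pgpool_pgctl;
DROP FUNCTION IF EXISTS pgpool_switch_xlog;
CREATE EXTENSION pgpool_recovery;
EOQ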

raj.pandey1982@gmail.com

2020-02-19 17:31

reporter   ~0003218

[postgres@pgpool-poc01 ~]$ /usr/local/pgsql11.5/bin/psql template1
psql (11.5)
Type "help" for help.

template1=# drop FUNCTION pgpool_recovery;
DROP FUNCTION
template1=# CREATE EXTENSION pgpool_recovery;
ERROR: function "pgpool_remote_start" already exists with same argument types
template1=#
> You need to DROP all functions defined in pgpool_recovery. Check pgpool_recovery--1.3.sql.
template1=# drop FUNCTION pgpool_remote_start ;
DROP FUNCTION
template1=# CREATE EXTENSION pgpool_recovery;
CREATE EXTENSION
template1=# select * from pg_available_extensions;
        name | default_version | installed_version | comment
--------------------+-----------------+-------------------+-----------------------------------------------------------
 plpgsql | 1.0 | 1.0 | PL/pgSQL procedural language
 pgcrypto | 1.3 | | cryptographic functions
 pg_stat_statements | 1.6 | | track execution statistics of all SQL statements executed
 pg_buffercache | 1.3 | | examine the shared buffer cache
 pgpool_adm | 1.2 | | Administrative functions for pgPool
 pgpool_recovery | 1.3 | 1.3 | recovery functions for pgpool-II for V4.1 or later
 pgpool_regclass | 1.0 | | replacement for regclass
(7 rows)

Now installed_version shows 1.3 for pgpool_recovery, which was previously blank.

t-ishii

2020-02-20 08:46

developer   ~0003219

Last edited: 2020-02-20 08:49


Ok. Now is the time to try the following test:

Shut down the postmaster on pgpool-poc02.novalocal. You should see it go to down status in "show pool_nodes".
>The next test is whether you can recover the standby node by using pcp_recovery_node. After issuing a pcp_recovery_node command, something like:
>
> pcp_recovery_node -h pgpool-poc01.novalocal 1
>
> you should see that pgpool-poc02.novalocal comes back online.
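(Spelled out with the options used elsewhere in this thread, the test might look like the sketch below; the VIP address and port for show pool_nodes are the ones from note 0003190, and -w assumes PCP authentication via ~/.pcppass.)

# Recover backend node 1 (pgpool-poc02.novalocal) through pgpool's online recovery
pcp_recovery_node -h pgpool-poc01.novalocal -U postgres 1 -w
# then confirm it comes back as up
/usr/local/pgsql11.5/bin/psql -h 10.70.184.29 -p 5433 -U postgres -d mawidstg01 -c 'show pool_nodes;'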

raj.pandey1982@gmail.com

2020-02-23 18:00

reporter   ~0003220

When I shut down the slave and execute pcp_recovery_node, the recovery_1st_stage.sh script below will run; it deletes the slave's data files and restores them with pg_basebackup.
But I have one doubt here:- why do we need to clean up all the slave data? Should the script not just try to sync and fill the gap?
For example: if I take the slave down for 5 minutes, why do I have to clean up all the data and restore from scratch?!

> recovery_1st_stage.sh that I would be using here:-
#!/bin/bash
# This script is executed by "recovery_1st_stage" to recover a Standby node.

set -o xtrace
exec > >(logger -i -p local1.info) 2>&1

PRIMARY_NODE_PGDATA="$1"
DEST_NODE_HOST="$2"
DEST_NODE_PGDATA="$3"
PRIMARY_NODE_PORT="$4"
DEST_NODE_ID="$5"
DEST_NODE_PORT="$6"

PRIMARY_NODE_HOST=$(hostname)
PGHOME=/usr/pgsql-11
ARCHIVEDIR=/var/lib/pgsql/archivedir
REPLUSER=repl

logger -i -p local1.info recovery_1st_stage: start: pg_basebackup for Standby node $DEST_NODE_ID

## Test passwordless SSH
ssh -T -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null postgres@${DEST_NODE_HOST} -i ~/.ssh/id_rsa_pgpool ls /tmp > /dev/null

if [ $? -ne 0 ]; then
    logger -i -p local1.info recovery_1st_stage: passwordless SSH to postgres@${DEST_NODE_HOST} failed. Please set up passwordless SSH.
    exit 1
fi

## Get PostgreSQL major version
PGVERSION=`${PGHOME}/bin/initdb -V | awk '{print $3}' | sed 's/\..*//' | sed 's/\([0-9]*\)[a-zA-Z].*/\1/'`
if [ $PGVERSION -ge 12 ]; then
    RECOVERYCONF=${DEST_NODE_PGDATA}/myrecovery.conf
else
    RECOVERYCONF=${DEST_NODE_PGDATA}/recovery.conf
fi

## Create replication slot "${DEST_NODE_HOST}"
${PGHOME}/bin/psql -p ${PRIMARY_NODE_PORT} << EOQ
SELECT pg_create_physical_replication_slot('${DEST_NODE_HOST}');
EOQ

## Execute pg_basebackup to recover the Standby node
ssh -T -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null postgres@$DEST_NODE_HOST -i ~/.ssh/id_rsa_pgpool "

    set -o errexit

    rm -rf $DEST_NODE_PGDATA
    rm -rf $ARCHIVEDIR/*

    ${PGHOME}/bin/pg_basebackup -h $PRIMARY_NODE_HOST -U $REPLUSER -p $PRIMARY_NODE_PORT -D $DEST_NODE_PGDATA -X stream

    if [ ${PGVERSION} -ge 12 ]; then
        sed -i -e \"\\\$ainclude_if_exists = '$(echo ${RECOVERYCONF} | sed -e 's/\//\\\//g')'\" \
               -e \"/^include_if_exists = '$(echo ${RECOVERYCONF} | sed -e 's/\//\\\//g')'/d\" ${DEST_NODE_PGDATA}/postgresql.conf
    fi

    cat > ${RECOVERYCONF} << EOT
primary_conninfo = 'host=${PRIMARY_NODE_HOST} port=${PRIMARY_NODE_PORT} user=${REPLUSER} application_name=${DEST_NODE_HOST} passfile=''/var/lib/pgsql/.pgpass'''
recovery_target_timeline = 'latest'
restore_command = 'scp ${PRIMARY_NODE_HOST}:${ARCHIVEDIR}/%f %p'
primary_slot_name = '${DEST_NODE_HOST}'
EOT

    if [ ${PGVERSION} -ge 12 ]; then
            touch ${DEST_NODE_PGDATA}/standby.signal
    else
            echo \"standby_mode = 'on'\" >> ${RECOVERYCONF}
    fi

    sed -i \"s/#*port = .*/port = ${DEST_NODE_PORT}/\" ${DEST_NODE_PGDATA}/postgresql.conf
"

if [ $? -ne 0 ]; then

    ${PGHOME}/bin/psql -p ${PRIMARY_NODE_PORT} << EOQ
SELECT pg_drop_replication_slot('${DEST_NODE_HOST}');
EOQ

    logger -i -p local1.error recovery_1st_stage: end: pg_basebackup failed. online recovery failed
    exit 1
fi

logger -i -p local1.info recovery_1st_stage: end: recovery_1st_stage complete
exit 0
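(One thing worth double-checking before running the sample script: its PGHOME and ARCHIVEDIR defaults assume an RPM-style layout. A minimal adjustment sketch for the source-built layout seen in this thread; the archive directory and replication user are assumptions to be verified against the actual setup.)

# Environment-specific overrides for recovery_1st_stage.sh
PGHOME=/usr/local/pgsql11.5                        # binaries used throughout this thread
ARCHIVEDIR=/installer/postgresql-11.5/archivedir   # hypothetical path - point it at the real WAL archive
REPLUSER=repl                                      # replication role referenced in primary_conninfo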

raj.pandey1982@gmail.com

2020-02-24 16:13

reporter   ~0003221

Hello friend, might I have an update on this?

Issue History

Date Modified Username Field Change
2020-01-22 23:09 raj.pandey1982@gmail.com New Issue
2020-01-23 00:23 raj.pandey1982@gmail.com Note Added: 0003083
2020-01-26 17:49 raj.pandey1982@gmail.com Note Added: 0003084
2020-01-26 17:55 raj.pandey1982@gmail.com Note Added: 0003085
2020-01-26 18:20 raj.pandey1982@gmail.com Note Added: 0003086
2020-01-27 17:41 raj.pandey1982@gmail.com Note Added: 0003087
2020-01-27 17:45 raj.pandey1982@gmail.com Note Added: 0003088
2020-01-27 19:21 raj.pandey1982@gmail.com Note Added: 0003089
2020-01-27 19:25 raj.pandey1982@gmail.com Note Added: 0003090
2020-01-28 09:12 t-ishii Assigned To => t-ishii
2020-01-28 09:12 t-ishii Status new => assigned
2020-01-28 18:22 t-ishii Note Added: 0003094
2020-01-28 19:01 raj.pandey1982@gmail.com Note Added: 0003095
2020-01-28 19:04 raj.pandey1982@gmail.com Note Added: 0003096
2020-01-29 06:47 raj.pandey1982@gmail.com Note Added: 0003097
2020-01-29 11:49 t-ishii Note Added: 0003099
2020-01-29 11:49 t-ishii Status assigned => feedback
2020-01-29 12:02 t-ishii Note Edited: 0003099 View Revisions
2020-01-29 14:11 t-ishii Note Added: 0003100
2020-01-29 18:27 raj.pandey1982@gmail.com Note Added: 0003101
2020-01-29 18:27 raj.pandey1982@gmail.com Status feedback => assigned
2020-01-29 18:29 raj.pandey1982@gmail.com Note Added: 0003102
2020-01-29 18:33 raj.pandey1982@gmail.com Note Added: 0003103
2020-01-29 20:55 t-ishii Note Added: 0003104
2020-01-29 22:31 raj.pandey1982@gmail.com Note Added: 0003105
2020-01-29 23:29 t-ishii Note Added: 0003107
2020-01-30 04:48 raj.pandey1982@gmail.com Note Added: 0003108
2020-01-30 10:38 t-ishii Note Added: 0003109
2020-01-30 10:38 t-ishii Status assigned => feedback
2020-01-30 15:10 raj.pandey1982@gmail.com Note Added: 0003110
2020-01-30 15:10 raj.pandey1982@gmail.com Status feedback => assigned
2020-01-30 15:32 t-ishii Note Added: 0003111
2020-01-30 18:27 raj.pandey1982@gmail.com Note Added: 0003112
2020-01-30 19:15 raj.pandey1982@gmail.com Note Added: 0003113
2020-01-31 04:59 raj.pandey1982@gmail.com File Added: pgpool.conf
2020-01-31 04:59 raj.pandey1982@gmail.com Note Added: 0003114
2020-02-02 18:23 raj.pandey1982@gmail.com Note Added: 0003115
2020-02-02 22:03 raj.pandey1982@gmail.com Note Added: 0003116
2020-02-04 13:48 t-ishii Note Added: 0003120
2020-02-04 13:52 t-ishii Note Added: 0003121
2020-02-04 14:30 raj.pandey1982@gmail.com Note Added: 0003122
2020-02-04 14:49 t-ishii Note Added: 0003123
2020-02-04 15:58 raj.pandey1982@gmail.com File Added: failover.sh
2020-02-04 15:58 raj.pandey1982@gmail.com File Added: pg_hba.conf
2020-02-04 15:58 raj.pandey1982@gmail.com File Added: pgpool-2.conf
2020-02-04 15:58 raj.pandey1982@gmail.com File Added: pool_hba.conf
2020-02-04 15:58 raj.pandey1982@gmail.com File Added: postgresql.conf
2020-02-04 15:58 raj.pandey1982@gmail.com File Added: promote.sh
2020-02-04 15:58 raj.pandey1982@gmail.com File Added: recovery.conf
2020-02-04 15:58 raj.pandey1982@gmail.com Note Added: 0003124
2020-02-04 16:34 t-ishii Note Added: 0003125
2020-02-04 16:50 raj.pandey1982@gmail.com Note Added: 0003126
2020-02-04 17:13 t-ishii Note Added: 0003127
2020-02-04 17:46 raj.pandey1982@gmail.com Note Added: 0003128
2020-02-04 17:47 raj.pandey1982@gmail.com Note Added: 0003129
2020-02-04 18:03 raj.pandey1982@gmail.com Note Added: 0003130
2020-02-04 18:37 raj.pandey1982@gmail.com Note Added: 0003131
2020-02-05 16:44 raj.pandey1982@gmail.com Note Added: 0003137
2020-02-05 17:45 t-ishii Note Added: 0003138
2020-02-05 17:47 raj.pandey1982@gmail.com Note Added: 0003139
2020-02-05 17:51 raj.pandey1982@gmail.com Note Added: 0003140
2020-02-05 17:52 raj.pandey1982@gmail.com Note Added: 0003141
2020-02-05 21:54 raj.pandey1982@gmail.com Note Added: 0003142
2020-02-06 00:08 t-ishii Note Added: 0003143
2020-02-06 00:15 t-ishii Note Edited: 0003143 View Revisions
2020-02-06 01:59 raj.pandey1982@gmail.com Note Added: 0003144
2020-02-06 02:02 raj.pandey1982@gmail.com Note Added: 0003145
2020-02-06 08:13 t-ishii Note Added: 0003146
2020-02-06 16:33 raj.pandey1982@gmail.com Note Added: 0003147
2020-02-06 16:35 raj.pandey1982@gmail.com Note Added: 0003148
2020-02-06 17:20 raj.pandey1982@gmail.com Note Added: 0003149
2020-02-06 17:35 raj.pandey1982@gmail.com Note Added: 0003150
2020-02-06 17:56 raj.pandey1982@gmail.com Note Added: 0003151
2020-02-07 07:50 t-ishii Note Added: 0003152
2020-02-07 18:00 raj.pandey1982@gmail.com Note Added: 0003153
2020-02-08 19:30 t-ishii Note Added: 0003154
2020-02-08 23:58 raj.pandey1982@gmail.com Note Added: 0003155
2020-02-09 15:24 t-ishii Note Added: 0003156
2020-02-09 15:34 raj.pandey1982@gmail.com Note Added: 0003157
2020-02-09 15:44 t-ishii Note Added: 0003158
2020-02-09 16:37 raj.pandey1982@gmail.com Note Added: 0003159
2020-02-09 19:31 t-ishii Note Added: 0003160
2020-02-09 19:40 raj.pandey1982@gmail.com File Added: mawidstg01-2020-02-09_000000.log
2020-02-09 19:40 raj.pandey1982@gmail.com File Added: mawidstg01-2020-02-09_131237.log
2020-02-09 19:40 raj.pandey1982@gmail.com File Added: pgpool-3.conf
2020-02-09 19:53 raj.pandey1982@gmail.com Note Added: 0003161
2020-02-09 19:55 raj.pandey1982@gmail.com File Added: pgpool.zip
2020-02-10 07:50 t-ishii Note Added: 0003162
2020-02-10 15:09 raj.pandey1982@gmail.com Note Added: 0003164
2020-02-10 17:14 t-ishii Note Added: 0003165
2020-02-10 17:32 raj.pandey1982@gmail.com Note Added: 0003166
2020-02-10 17:34 raj.pandey1982@gmail.com File Added: pgpool_10feb1126am.log
2020-02-10 17:44 raj.pandey1982@gmail.com Note Added: 0003167
2020-02-10 17:48 t-ishii Note Added: 0003168
2020-02-10 20:50 raj.pandey1982@gmail.com File Added: pgpool.log_1126amLogWithRecoveryError
2020-02-11 16:02 raj.pandey1982@gmail.com Note Added: 0003170
2020-02-11 16:28 t-ishii Note Added: 0003171
2020-02-11 19:50 raj.pandey1982@gmail.com File Added: mawidstg01-2020-02-11_133657.log
2020-02-11 19:50 raj.pandey1982@gmail.com Note Added: 0003172
2020-02-11 20:05 t-ishii Note Added: 0003173
2020-02-11 20:52 raj.pandey1982@gmail.com Note Added: 0003174
2020-02-11 20:56 raj.pandey1982@gmail.com Note Added: 0003175
2020-02-11 21:59 t-ishii Note Added: 0003176
2020-02-11 22:09 raj.pandey1982@gmail.com Note Added: 0003177
2020-02-11 22:12 raj.pandey1982@gmail.com Note Added: 0003178
2020-02-11 22:35 raj.pandey1982@gmail.com Note Added: 0003179
2020-02-11 22:45 t-ishii Note Added: 0003180
2020-02-12 15:45 raj.pandey1982@gmail.com Note Added: 0003185
2020-02-12 15:47 raj.pandey1982@gmail.com Note Added: 0003186
2020-02-12 16:42 t-ishii Note Added: 0003187
2020-02-12 16:57 raj.pandey1982@gmail.com Note Added: 0003188
2020-02-12 19:49 raj.pandey1982@gmail.com Note Added: 0003189
2020-02-13 02:10 raj.pandey1982@gmail.com Note Added: 0003190
2020-02-13 02:15 raj.pandey1982@gmail.com Note Added: 0003191
2020-02-13 02:20 raj.pandey1982@gmail.com File Added: pgpool_log_12feb.zip
2020-02-13 02:20 raj.pandey1982@gmail.com File Added: postgresdblog-2020-02-12_185142.zip
2020-02-13 02:20 raj.pandey1982@gmail.com Note Added: 0003192
2020-02-13 02:23 raj.pandey1982@gmail.com Note Added: 0003193
2020-02-13 02:29 raj.pandey1982@gmail.com Note Added: 0003194
2020-02-13 07:37 t-ishii Note Added: 0003195
2020-02-16 14:14 raj.pandey1982@gmail.com Note Added: 0003199
2020-02-16 14:40 t-ishii Note Added: 0003200
2020-02-16 15:14 raj.pandey1982@gmail.com Note Added: 0003201
2020-02-16 19:15 raj.pandey1982@gmail.com Note Added: 0003202
2020-02-16 20:06 t-ishii Note Added: 0003203
2020-02-17 01:07 raj.pandey1982@gmail.com Note Added: 0003204
2020-02-17 12:49 t-ishii Note Added: 0003205
2020-02-17 15:55 raj.pandey1982@gmail.com Note Added: 0003206
2020-02-17 16:25 t-ishii Note Added: 0003207
2020-02-17 16:28 raj.pandey1982@gmail.com Note Added: 0003208
2020-02-17 16:30 raj.pandey1982@gmail.com Note Added: 0003209
2020-02-17 16:46 raj.pandey1982@gmail.com Note Added: 0003210
2020-02-17 16:53 t-ishii Note Added: 0003211
2020-02-17 17:03 raj.pandey1982@gmail.com Note Added: 0003212
2020-02-18 09:11 t-ishii Note Added: 0003214
2020-02-19 17:31 raj.pandey1982@gmail.com Note Added: 0003218
2020-02-20 08:46 t-ishii Note Added: 0003219
2020-02-20 08:49 t-ishii Note Edited: 0003219 View Revisions
2020-02-23 18:00 raj.pandey1982@gmail.com Note Added: 0003220
2020-02-24 16:13 raj.pandey1982@gmail.com Note Added: 0003221
2020-03-02 11:21 t-ishii Status assigned => feedback