View Issue Details

ID: 0000726
Project: Pgpool-II
Category: Bug
View Status: public
Last Update: 2022-06-28 12:00
Reporter: raj.pandey1982@gmail.com
Assigned To: t-ishii
Priority: high
Severity: major
Reproducibility: sometimes
Status: closed
Resolution: open
Platform: centos
OS: x86_64 GNU/Linux
OS Version: Redhat7
Product Version: 4.1.0
Summary: 0000726: postmaster on DB node 0 was shutdown by administrative command
Description: Sometimes a failover happens even though the master DB is not down, with this pgpool log message: "postmaster on DB node 0 was shutdown by administrative command". The slave became master while the master was never down.

========================Master =Node 1 log================

2021-07-30 09:16:43: pid 424:LOG: fork a new child process with pid: 29330
2021-07-30 09:16:43: pid 424:LOG: child process with pid: 20492 exits with status 256
2021-07-30 09:16:43: pid 424:LOG: fork a new child process with pid: 29331
2021-07-30 09:16:43: pid 424:LOG: child process with pid: 20955 exits with status 256
2021-07-30 09:16:43: pid 424:LOG: fork a new child process with pid: 29332
2021-07-30 10:47:47: pid 434:LOG: read from socket failed, remote end closed the connection
2021-07-30 10:47:47: pid 434:LOG: client socket of mwdp3prdds01.moh.gov.sa:5433 Linux mwdp3prdds01.moh.gov.sa is closed
2021-07-30 10:47:47: pid 434:LOG: remote node "mwdp3prdds01.moh.gov.sa:5433 Linux mwdp3prdds01.moh.gov.sa" is shutting down
2021-07-30 10:47:47: pid 434:LOG: removing watchdog node "mwdp3prdds01.moh.gov.sa:5433 Linux mwdp3prdds01.moh.gov.sa" from the standby list
2021-07-30 10:47:47: pid 424:LOG: Pgpool-II parent process received watchdog quorum change signal from watchdog
2021-07-30 10:47:47: pid 434:LOG: new IPC connection received
2021-07-30 10:47:47: pid 424:LOG: watchdog cluster now holds the quorum
2021-07-30 10:47:47: pid 424:DETAIL: updating the state of quarantine backend nodes
2021-07-30 10:47:47: pid 434:LOG: new IPC connection received
2021-07-30 10:47:51: pid 434:LOG: Watchdog is shutting down
2021-07-30 10:47:51: pid 397:LOG: watchdog: de-escalation started
2021-07-30 10:47:51: pid 397:LOG: successfully released the delegate IP:"10.70.185.66"
2021-07-30 10:47:51: pid 397:DETAIL: 'if_down_cmd' returned with success
2021-07-30 10:48:08: pid 542:WARNING: checking setuid bit of if_up_cmd
2021-07-30 10:48:08: pid 542:DETAIL: ifup[/sbin/ip] doesn't have setuid bit
2021-07-30 10:48:08: pid 542:WARNING: checking setuid bit of if_down_cmd
2021-07-30 10:48:08: pid 542:DETAIL: ifdown[/sbin/ip] doesn't have setuid bit
2021-07-30 10:48:08: pid 542:WARNING: checking setuid bit of arping command
2021-07-30 10:48:08: pid 542:DETAIL: arping[/usr/sbin/arping] doesn't have setuid bit
2021-07-30 10:48:08: pid 542:LOG: Backend status file /var/log/pgpool/pgpool_status discarded
2021-07-30 10:48:08: pid 542:LOG: memory cache initialized
2021-07-30 10:48:08: pid 542:DETAIL: memcache blocks :64
2021-07-30 10:48:08: pid 542:LOG: pool_discard_oid_maps: discarded memqcache oid maps
2021-07-30 10:48:09: pid 542:LOG: waiting for watchdog to initialize
2021-07-30 10:48:09: pid 545:LOG: setting the local watchdog node name to "mwdp3prddm01.moh.gov.sa:5433 Linux mwdp3prddm01.moh.gov.sa"
2021-07-30 10:48:09: pid 545:LOG: watchdog cluster is configured with 1 remote nodes
2021-07-30 10:48:09: pid 545:LOG: watchdog remote node:0 on mwdp3prdds01.moh.gov.sa:9000
2021-07-30 10:48:09: pid 545:LOG: interface monitoring is disabled in watchdog



2021-07-30 10:48:09: pid 545:LOG: watchdog node state changed from [DEAD] to [LOADING]
2021-07-30 10:48:14: pid 545:LOG: watchdog node state changed from [LOADING] to [JOINING]
2021-07-30 10:48:18: pid 545:LOG: watchdog node state changed from [JOINING] to [INITIALIZING]
2021-07-30 10:48:19: pid 545:LOG: I am the only alive node in the watchdog cluster
2021-07-30 10:48:19: pid 545:HINT: skipping stand for coordinator state
2021-07-30 10:48:19: pid 545:LOG: watchdog node state changed from [INITIALIZING] to [MASTER]
2021-07-30 10:48:19: pid 545:LOG: I am announcing my self as master/coordinator watchdog node
2021-07-30 10:48:23: pid 545:LOG: I am the cluster leader node
2021-07-30 10:48:23: pid 545:DETAIL: our declare coordinator message is accepted by all nodes
2021-07-30 10:48:23: pid 545:LOG: setting the local node "mwdp3prddm01.moh.gov.sa:5433 Linux mwdp3prddm01.moh.gov.sa" as watchdog cluster master
2021-07-30 10:48:23: pid 545:LOG: I am the cluster leader node. Starting escalation process
2021-07-30 10:48:23: pid 542:LOG: watchdog process is initialized
2021-07-30 10:48:23: pid 542:DETAIL: watchdog messaging data version: 1.1
2021-07-30 10:48:23: pid 545:LOG: escalation process started with PID:577
2021-07-30 10:48:23: pid 577:LOG: watchdog: escalation started
2021-07-30 10:48:23: pid 545:LOG: new IPC connection received
2021-07-30 10:48:23: pid 545:LOG: new IPC connection received
2021-07-30 10:48:23: pid 578:LOG: 2 watchdog nodes are configured for lifecheck
2021-07-30 10:48:23: pid 578:LOG: watchdog nodes ID:0 Name:"mwdp3prddm01.moh.gov.sa:5433 Linux mwdp3prddm01.moh.gov.sa"
2021-07-30 10:48:23: pid 578:DETAIL: Host:"mwdp3prddm01.moh.gov.sa" WD Port:9000 pgpool-II port:5433
2021-07-30 10:48:23: pid 578:LOG: watchdog nodes ID:1 Name:"Not_Set"
2021-07-30 10:48:23: pid 578:DETAIL: Host:"mwdp3prdds01.moh.gov.sa" WD Port:9000 pgpool-II port:5433
2021-07-30 10:48:23: pid 542:LOG: Setting up socket for 0.0.0.0:5433
2021-07-30 10:48:23: pid 542:LOG: Setting up socket for :::5433
2021-07-30 10:48:23: pid 578:LOG: watchdog lifecheck trusted server "mohvcasdb01.novalocal" added for the availability check
2021-07-30 10:48:23: pid 578:LOG: watchdog lifecheck trusted server "castestdb01.novalocal" added for the availability check
2021-07-30 10:48:24: pid 580:LOG: createing watchdog heartbeat receive socket.
2021-07-30 10:48:24: pid 580:DETAIL: bind receive socket to device: "eth0"
2021-07-30 10:48:24: pid 580:LOG: set SO_REUSEPORT option to the socket
2021-07-30 10:48:24: pid 580:LOG: creating watchdog heartbeat receive socket.
2021-07-30 10:48:24: pid 580:DETAIL: set SO_REUSEPORT
2021-07-30 10:48:24: pid 581:LOG: creating socket for sending heartbeat
2021-07-30 10:48:24: pid 581:DETAIL: bind send socket to device: eth0
2021-07-30 10:48:24: pid 581:LOG: set SO_REUSEPORT option to the socket
2021-07-30 10:48:24: pid 581:LOG: creating socket for sending heartbeat
2021-07-30 10:48:24: pid 581:DETAIL: set SO_REUSEPORT
2021-07-30 10:48:26: pid 542:LOG: find_primary_node_repeatedly: waiting for finding a primary node
2021-07-30 10:48:26: pid 542:LOG: find_primary_node: primary node is 0
2021-07-30 10:48:26: pid 542:LOG: find_primary_node: standby node is 1
2021-07-30 10:48:26: pid 5181:LOG: PCP process: 5181 started
2021-07-30 10:48:27: pid 542:LOG: pgpool-II successfully started. version 4.1.0 (karasukiboshi)
2021-07-30 10:48:27: pid 542:LOG: node status[0]: 1


2021-07-30 10:48:27: pid 542:LOG: node status[1]: 2
2021-07-30 10:48:27: pid 577:LOG: successfully acquired the delegate IP:"10.70.185.66"
2021-07-30 10:48:27: pid 577:DETAIL: 'if_up_cmd' returned with success
2021-07-30 10:48:27: pid 545:LOG: watchdog escalation process with pid: 577 exit with SUCCESS.
2021-07-30 10:49:08: pid 545:LOG: new watchdog node connection is received from "10.70.185.63:44727"
2021-07-30 10:49:08: pid 545:LOG: new node joined the cluster hostname:"mwdp3prdds01.moh.gov.sa" port:9000 pgpool_port:5433
2021-07-30 10:49:08: pid 545:DETAIL: Pgpool-II version:"4.1.0" watchdog messaging version: 1.1
2021-07-30 10:49:08: pid 545:LOG: new outbound connection to mwdp3prdds01.moh.gov.sa:9000
2021-07-30 10:49:14: pid 545:LOG: adding watchdog node "mwdp3prdds01.moh.gov.sa:5433 Linux mwdp3prdds01.moh.gov.sa" to the standby list
2021-07-30 10:49:14: pid 542:LOG: Pgpool-II parent process received watchdog quorum change signal from watchdog
2021-07-30 10:49:14: pid 545:LOG: new IPC connection received
2021-07-30 10:49:14: pid 542:LOG: watchdog cluster now holds the quorum
2021-07-30 10:49:14: pid 542:DETAIL: updating the state of quarantine backend nodes
2021-07-30 10:49:14: pid 545:LOG: new IPC connection received
2021-07-30 10:49:51: pid 5136:LOG: pool_reuse_block: blockid: 0
2021-07-30 10:49:51: pid 5136:CONTEXT: while searching system catalog, When relcache is missed
2021-07-30 10:50:01: pid 2153:LOG: reading and processing packets
2021-07-30 10:50:01: pid 2153:DETAIL: postmaster on DB node 0 was shutdown by administrative command
2021-07-30 10:50:01: pid 2153:LOG: received degenerate backend request for node_id: 0 from pid [2153]
2021-07-30 10:50:01: pid 5136:LOG: reading and processing packets
2021-07-30 10:50:01: pid 5136:DETAIL: postmaster on DB node 0 was shutdown by administrative command
2021-07-30 10:50:01: pid 5136:LOG: received degenerate backend request for node_id: 0 from pid [5136]
2021-07-30 10:50:01: pid 545:LOG: new IPC connection received
2021-07-30 10:50:01: pid 545:LOG: new IPC connection received
2021-07-30 10:50:01: pid 545:LOG: watchdog received the failover command from local pgpool-II on IPC interface
2021-07-30 10:50:01: pid 545:LOG: watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from local pgpool-II on IPC interface
2021-07-30 10:50:01: pid 545:LOG: we have got the consensus to perform the failover
2021-07-30 10:50:01: pid 545:DETAIL: 1 node(s) voted in the favor
2021-07-30 10:50:01: pid 545:LOG: watchdog received the failover command from local pgpool-II on IPC interface
2021-07-30 10:50:01: pid 545:LOG: watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from local pgpool-II on IPC interface
2021-07-30 10:50:01: pid 545:LOG: we have got the consensus to perform the failover
2021-07-30 10:50:01: pid 545:DETAIL: 1 node(s) voted in the favor
2021-07-30 10:50:01: pid 4117:LOG: reading and processing packets
2021-07-30 10:50:01: pid 4117:DETAIL: postmaster on DB node 0 was shutdown by administrative command
2021-07-30 10:50:01: pid 4117:LOG: received degenerate backend request for node_id: 0 from pid [4117]
2021-07-30 10:50:01: pid 545:LOG: new IPC connection received
2021-07-30 10:50:01: pid 5147:LOG: reading and processing packets
2021-07-30 10:50:01: pid 5147:DETAIL: postmaster on DB node 0 was shutdown by administrative command
2021-07-30 10:50:01: pid 5147:LOG: received degenerate backend request for node_id: 0 from pid [5147]
2021-07-30 10:50:01: pid 545:LOG: watchdog received the failover command from local pgpool-II on IPC interface
2021-07-30 10:50:01: pid 542:LOG: Pgpool-II parent process has received failover request
2021-07-30 10:50:01: pid 545:LOG: watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from local pgpool-II on IPC interface


2021-07-30 10:50:01: pid 545:LOG: we have got the consensus to perform the failover
2021-07-30 10:50:01: pid 545:DETAIL: 1 node(s) voted in the favor
2021-07-30 10:50:01: pid 545:LOG: new IPC connection received
2021-07-30 10:50:01: pid 545:LOG: new IPC connection received
2021-07-30 10:50:01: pid 545:LOG: watchdog received the failover command from local pgpool-II on IPC interface
2021-07-30 10:50:01: pid 545:LOG: watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from local pgpool-II on IPC interface
2021-07-30 10:50:01: pid 545:LOG: we have got the consensus to perform the failover
2021-07-30 10:50:01: pid 545:DETAIL: 1 node(s) voted in the favor
2021-07-30 10:50:01: pid 545:LOG: received the failover indication from Pgpool-II on IPC interface
2021-07-30 10:50:01: pid 545:LOG: watchdog is informed of failover start by the main process
2021-07-30 10:50:01: pid 542:LOG: starting degeneration. shutdown host mwdp3prddm01.moh.gov.sa(5432)
2021-07-30 10:50:01: pid 542:LOG: Restart all children
2021-07-30 10:50:01: pid 542:LOG: execute command: /etc/pgpool-II/failover.sh 0 mwdp3prddm01.moh.gov.sa 5432 /installer/postgresql-11.5/data 1 mwdp3prdds01.moh.gov.sa 0 0 5432 /installer/postgresql-11.5/data mwdp3prddm01.moh.gov.sa 5432
+ exec
++ logger -i -p local1.info
2021-07-30 10:50:03: pid 542:LOG: find_primary_node_repeatedly: waiting for finding a primary node
2021-07-30 10:50:03: pid 542:LOG: find_primary_node: primary node is 1
2021-07-30 10:50:03: pid 542:LOG: failover: set new primary node: 1
2021-07-30 10:50:03: pid 542:LOG: failover: set new master node: 1
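
To pick the failover trigger out of a log like the one above, a small script can help. The following is a minimal sketch, assuming the `log_line_prefix = '%t: pid %p:'` format seen in this log; the function name `find_shutdown_reports` and both regular expressions are illustrative and not part of Pgpool-II.

```python
import re

# Assumed line layout (from log_line_prefix = '%t: pid %p:'), for example:
# "2021-07-30 10:50:01: pid 2153:DETAIL: postmaster on DB node 0 was ..."
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}): pid (?P<pid>\d+):\s*"
    r"(?P<level>[A-Z]+):\s*(?P<msg>.*)$"
)
SHUTDOWN_RE = re.compile(
    r"postmaster on DB node (\d+) was shutdown by administrative command"
)


def find_shutdown_reports(lines):
    """Return (timestamp, pid, node_id) for each process that logged the message."""
    reports = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip blank lines and shell trace lines such as "+ exec"
        hit = SHUTDOWN_RE.search(match.group("msg"))
        if hit:
            reports.append(
                (match.group("ts"), int(match.group("pid")), int(hit.group(1)))
            )
    return reports
```

Calling `find_shutdown_reports(open(path))` on a saved log file (the path is whatever file the log was redirected to) lists which child processes reported the message and when, which can then be compared with the time the idle-session cleanup ran.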
Tags: No tags attached.

Activities

raj.pandey1982@gmail.com

2021-07-30 20:15

reporter   ~0003910

Also:
We are frequently getting CPU utilization alerts with this setup. To reduce CPU utilization we are supposed to kill idle sessions. However, when we ran: "select pg_terminate_backend(pid) from pg_stat_activity where pid <> pg_backend_pid() AND state in ('idle') and usename NOT IN ('replication') and state_change >= current_timestamp - interval '10 minutes' "

an automatic failover happens and we get the message below in the pgpool logs:

DETAIL: postmaster on DB node 0 was shutdown by administrative command
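
The Pgpool-II documentation lists a restriction that seems relevant here: PostgreSQL sends the same message for a backend ended by `pg_terminate_backend()` as for a full postmaster shutdown, and since version 3.6 this is mitigated only when the function's argument is a constant process ID. The query above passes the `pid` column from `pg_stat_activity`, which is not a constant. Below is a minimal sketch of one possible approach, assuming the PIDs are collected first and then terminated one statement at a time with literal arguments. The helper `build_terminate_statements` and the constant `LIST_IDLE_PIDS` are hypothetical names, and whether this avoids the failover on 4.1.0 would need to be verified on a test cluster.

```python
# Step 1 (assumption): run this query to collect candidate PIDs.
# The filter is copied from the reporter's statement and left unchanged.
LIST_IDLE_PIDS = (
    "SELECT pid FROM pg_stat_activity "
    "WHERE pid <> pg_backend_pid() AND state IN ('idle') "
    "AND usename NOT IN ('replication') "
    "AND state_change >= current_timestamp - interval '10 minutes'"
)


def build_terminate_statements(pids):
    """Step 2: build one statement per PID, each with a literal integer argument."""
    statements = []
    for pid in pids:
        pid = int(pid)  # raises ValueError for anything that is not an integer
        if pid <= 0:
            raise ValueError(f"not a valid backend PID: {pid}")
        statements.append(f"SELECT pg_terminate_backend({pid})")
    return statements
```

The same documentation notes that the mitigation does not apply in extended protocol mode, so the generated statements would have to be sent with the simple query protocol, for example by pasting them into `psql`.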
pgpool.conf (42,293 bytes)   
# ----------------------------
# pgPool-II configuration file
# ----------------------------
#
# This file consists of lines of the form:
#
#   name = value
#
# Whitespace may be used.  Comments are introduced with "#" anywhere on a line.
# The complete list of parameter names and allowed values can be found in the
# pgPool-II documentation.
#
# This file is read on server startup and when the server receives a SIGHUP
# signal.  If you edit the file on a running system, you have to SIGHUP the
# server for the changes to take effect, or use "pgpool reload".  Some
# parameters, which are marked below, require a server shutdown and restart to
# take effect.
#


#------------------------------------------------------------------------------
# CONNECTIONS
#------------------------------------------------------------------------------

# - pgpool Connection Settings -

listen_addresses = '*'
                                   # Host name or IP address to listen on:
                                   # '*' for all, '' for no TCP/IP connections
                                   # (change requires restart)
port = 5433
                                   # Port number
                                   # (change requires restart)
socket_dir = '/var/run/postgresql'
                                   # Unix domain socket path
                                   # The Debian package defaults to
                                   # /var/run/postgresql
                                   # (change requires restart)
listen_backlog_multiplier = 2
                                   # Set the backlog parameter of listen(2) to
                                   # num_init_children * listen_backlog_multiplier.
                                   # (change requires restart)
serialize_accept = off
                                   # whether to serialize accept() call to avoid thundering herd problem
                                   # (change requires restart)

# - pgpool Communication Manager Connection Settings -

pcp_listen_addresses = '*'
                                   # Host name or IP address for pcp process to listen on:
                                   # '*' for all, '' for no TCP/IP connections
                                   # (change requires restart)
pcp_port = 9898
                                   # Port number for pcp
                                   # (change requires restart)
pcp_socket_dir = '/var/run/postgresql'
                                   # Unix domain socket path for pcp
                                   # The Debian package defaults to
                                   # /var/run/postgresql
                                   # (change requires restart)

# - Backend Connection Settings -

                                   # Host name or IP address to connect to for backend 0
                                   # Port number for backend 0
                                   # Weight for backend 0 (only in load balancing mode)
                                   # Data directory for backend 0
                                   # Controls various backend behavior
                                   # ALLOW_TO_FAILOVER, DISALLOW_TO_FAILOVER
                                   # or ALWAYS_MASTER

# - Authentication -

enable_pool_hba = on               
                                   # Use pool_hba.conf for client authentication
pool_passwd = 'pool_passwd'
                                   # File name of pool_passwd for md5 authentication.
                                   # "" disables pool_passwd.
                                   # (change requires restart)
authentication_timeout = 60
                                   # Delay in seconds to complete client authentication
                                   # 0 means no timeout.

allow_clear_text_frontend_auth = off
                                   # Allow Pgpool-II to use clear text password authentication
                                   # with clients, when pool_passwd does not
                                   # contain the user password


# - SSL Connections -

ssl = off
                                   # Enable SSL support
                                   # (change requires restart)
#ssl_key = './server.key'
                                   # Path to the SSL private key file
                                   # (change requires restart)
#ssl_cert = './server.cert'
                                   # Path to the SSL public certificate file
                                   # (change requires restart)
#ssl_ca_cert = ''
                                   # Path to a single PEM format file
                                   # containing CA root certificate(s)
                                   # (change requires restart)
#ssl_ca_cert_dir = ''
                                   # Directory containing CA root certificate(s)
                                   # (change requires restart)

ssl_ciphers = 'HIGH:MEDIUM:+3DES:!aNULL'
                                   # Allowed SSL ciphers
                                   # (change requires restart)
ssl_prefer_server_ciphers = off
                                   # Use server's SSL cipher preferences,
                                   # rather than the client's
                                   # (change requires restart)
#------------------------------------------------------------------------------
# POOLS
#------------------------------------------------------------------------------

# - Concurrent session and pool size -

num_init_children = 4000
#num_init_children = 2975
#num_init_children = 975
                                   # Number of concurrent sessions allowed
                                   # (change requires restart)
max_pool = 1
#max_pool = 3
#max_pool = 10
#max_pool = 4
                                   # Number of connection pool caches per connection
                                   # (change requires restart)

# - Life time -

child_life_time = 300
                                   # Pool exits after being idle for this many seconds
child_max_connections = 0
                                   # Pool exits after receiving that many connections
                                   # 0 means no exit
#connection_life_time = 0
connection_life_time = 300
                                   # Connection to backend closes after being idle for this many seconds
                                   # 0 means no close
#client_idle_limit = 0
                                   # Client is disconnected after being idle for that many seconds
                                   # (even inside an explicit transactions!)
                                   # 0 means no disconnection

reserved_connections = 1
#------------------------------------------------------------------------------
# LOGS
#------------------------------------------------------------------------------

# - Where to log -

log_destination = 'stderr'
                                   # Where to log
                                   # Valid values are combinations of stderr,
                                   # and syslog. Default to stderr.

# - What to log -

log_line_prefix = '%t: pid %p:'

log_connections = off
                                   # Log connections
log_hostname = off
                                   # Hostname will be shown in ps status
                                   # and in logs if connections are logged
log_statement = off
                                   # Log all statements
log_per_node_statement = off
                                   # Log all statements
                                   # with node and backend informations
log_client_messages = off
                                   # Log any client messages
log_standby_delay = 'none'
                                   # Log standby delay
                                   # Valid values are combinations of always,
                                   # if_over_threshold, none

# - Syslog specific -

syslog_facility = 'LOCAL0'
                                   # Syslog local facility. Default to LOCAL0
syslog_ident = 'pgpool'
                                   # Syslog program identification string
                                   # Default to 'pgpool'

# - Debug -

#log_error_verbosity = default          # terse, default, or verbose messages

#client_min_messages = notice           # values in order of decreasing detail:
                                        #   debug5
                                        #   debug4
                                        #   debug3
                                        #   debug2
                                        #   debug1
                                        #   log
                                        #   notice
                                        #   warning
                                        #   error

#log_min_messages = warning             # values in order of decreasing detail:
                                        #   debug5
                                        #   debug4
                                        #   debug3
                                        #   debug2
                                        #   debug1
                                        #   info
                                        #   notice
                                        #   warning
                                        #   error
                                        #   log
                                        #   fatal
                                        #   panic

#------------------------------------------------------------------------------
# FILE LOCATIONS
#------------------------------------------------------------------------------

pid_file_name = '/var/run/postgresql/pgpool.pid'
                                   # PID file name
                                   # Can be specified as relative to the
                                   # location of pgpool.conf file or
                                   # as an absolute path
                                   # (change requires restart)
logdir = '/var/log/pgpool'
                                   # Directory of pgPool status file
                                   # (change requires restart)


#------------------------------------------------------------------------------
# CONNECTION POOLING
#------------------------------------------------------------------------------

connection_cache = on
                                   # Activate connection pools
                                   # (change requires restart)

                                   # Semicolon separated list of queries
                                   # to be issued at the end of a session
                                   # The default is for 8.3 and later
reset_query_list = 'ABORT; DISCARD ALL'
                                   # The following one is for 8.2 and before
#reset_query_list = 'ABORT; RESET ALL; SET SESSION AUTHORIZATION DEFAULT'


#------------------------------------------------------------------------------
# REPLICATION MODE
#------------------------------------------------------------------------------

replication_mode = off
                                   # Activate replication mode
                                   # (change requires restart)
replicate_select = off
                                   # Replicate SELECT statements
                                   # when in replication mode
                                   # replicate_select is higher priority than
                                   # load_balance_mode.

insert_lock = on
                                   # Automatically locks a dummy row or a table
                                   # with INSERT statements to keep SERIAL data
                                   # consistency
                                   # Without SERIAL, no lock will be issued
lobj_lock_table = ''
                                   # When rewriting lo_creat command in
                                   # replication mode, specify table name to
                                   # lock

# - Degenerate handling -

replication_stop_on_mismatch = off
                                   # On disagreement with the packet kind
                                   # sent from backend, degenerate the node
                                   # which is most likely "minority"
                                   # If off, just force to exit this session

failover_if_affected_tuples_mismatch = off
                                   # On disagreement with the number of affected
                                   # tuples in UPDATE/DELETE queries, then
                                   # degenerate the node which is most likely
                                   # "minority".
                                   # If off, just abort the transaction to
                                   # keep the consistency


#------------------------------------------------------------------------------
# LOAD BALANCING MODE
#------------------------------------------------------------------------------
#load_balance_mode = off
load_balance_mode = on
                                   # Activate load balancing mode
                                   # (change requires restart)
ignore_leading_white_space = on
                                   # Ignore leading white spaces of each query
white_function_list = ''
                                   # Comma separated list of function names
                                   # that don't write to database
                                   # Regexp are accepted
black_function_list = 'currval,lastval,nextval,setval,Get_Appt_code,walkin_appointment_token_no,findAppointmentFacilityTypeIdNew,findAppointmentReferUrgencyType,walkin_appointment_token_no,walkin_appointment_token_no,Reject_Dependent_Request,Delete_Dependent_Request'
#black_function_list = 'currval,lastval,nextval,setval'
                                   # Comma separated list of function names
                                   # that write to database
                                   # Regexp are accepted

black_query_pattern_list = ''
                                   # Semicolon separated list of query patterns
                                   # that should be sent to primary node
                                   # Regexp are accepted
                                   # valid for streaming replication mode only.

database_redirect_preference_list = ''
                                   # comma separated list of pairs of database and node id.
                                   # example: 'postgres:primary,mydb[0-4]:1,mydb[5-9]:2'
                                   # valid for streaming replication mode only.
app_name_redirect_preference_list = 'cas-schedule-mgmt-svc:primary'
#app_name_redirect_preference_list = ''
                                   # comma separated list of pairs of app name and node id.
                                   # example: 'psql:primary,myapp[0-4]:1,myapp[5-9]:standby'
                                   # valid for streaming replication mode only.
allow_sql_comments = off
                                   # if on, ignore SQL comments when judging if load balance or
                                   # query cache is possible.
                                   # If off, SQL comments effectively prevent the judgment
                                   # (pre 3.4 behavior).

disable_load_balance_on_write = 'transaction'
                                   # Load balance behavior when write query is issued
                                   # in an explicit transaction.
                                   # Note that any query not in an explicit transaction
                                   # is not affected by the parameter.
                                   # 'transaction' (the default): if a write query is issued,
                                   # subsequent read queries will not be load balanced
                                   # until the transaction ends.
                                   # 'trans_transaction': if a write query is issued,
                                   # subsequent read queries in an explicit transaction
                                   # will not be load balanced until the session ends.
                                   # 'always': if a write query is issued, read queries will
                                   # not be load balanced until the session ends.

#------------------------------------------------------------------------------
# MASTER/SLAVE MODE
#------------------------------------------------------------------------------

master_slave_mode = on
                                   # Activate master/slave mode
                                   # (change requires restart)
master_slave_sub_mode = 'stream'
                                   # Master/slave sub mode
                                   # Valid values are combinations stream, slony
                                   # or logical. Default is stream.
                                   # (change requires restart)

# - Streaming -

sr_check_period = 5                
                                   # Streaming replication check period
                                   # Disabled (0) by default
#sr_check_user = 'postgres'
sr_check_user = 'replication'
                                   # Streaming replication check user
                                   # This is necessary even if you disable
                                   # streaming replication delay check with
                                   # sr_check_period = 0
#sr_check_password = 'postgrestg'
sr_check_password = 'reppassword'
                                   # Password for streaming replication check user.
                                   # Leaving it empty will make Pgpool-II to first look for the
                                   # Password in pool_passwd file before using the empty password
sr_check_database = 'postgres'
#sr_check_database = 'postgres'
                                   # Database name for streaming replication check
delay_threshold = 0
                                   # Threshold before not dispatching query to standby node
                                   # Unit is in bytes
                                   # Disabled (0) by default

# - Special commands -

follow_master_command = ''
                                   # Executes this command after master failover
                                   # Special values:
                                   #   %d = node id
                                   #   %h = host name
                                   #   %p = port number
                                   #   %D = database cluster path
                                   #   %m = new master node id
                                   #   %H = hostname of the new master node
                                   #   %M = old master node id
                                   #   %P = old primary node id
                                   #   %r = new master port number
                                   #   %R = new master database cluster path
                                   #   %% = '%' character

#------------------------------------------------------------------------------
# HEALTH CHECK GLOBAL PARAMETERS
#------------------------------------------------------------------------------

health_check_period = 5
                                   # Health check period
                                   # Disabled (0) by default
health_check_timeout = 50
#health_check_timeout = 10
#health_check_timeout = 0
                                   # Health check timeout
                                   # 0 means no timeout
health_check_user = 'postgres'
                                   # Health check user
health_check_password = ''
                                   # Password for health check user
                                   # Leaving it empty makes Pgpool-II look for the
                                   # password in the pool_passwd file first, before using the empty password

health_check_database = 'postgres'
                                   # Database name for health check. If '', tries 'postgres' first, then 'template1'
health_check_max_retries = 5
#health_check_max_retries = 3
#health_check_max_retries = 0
                                   # Maximum number of times to retry a failed health check before giving up.
health_check_retry_delay = 1
                                   # Amount of time to wait (in seconds) between retries.
connect_timeout = 50000
#connect_timeout = 10000
                                   # Timeout value in milliseconds before giving up to connect to backend.
                                   # Default is 10000 ms (10 seconds). Users on flaky networks may want to increase
                                   # the value. 0 means no timeout.
                                   # Note that this value is not only used for health check,
                                   # but also for ordinary connections to the backend.

#------------------------------------------------------------------------------
# HEALTH CHECK PER NODE PARAMETERS (OPTIONAL)
#------------------------------------------------------------------------------
#health_check_period0 = 0
#health_check_timeout0 = 20
#health_check_user0 = 'nobody'
#health_check_password0 = ''
#health_check_database0 = ''
#health_check_max_retries0 = 0
#health_check_retry_delay0 = 1
#connect_timeout0 = 10000

#------------------------------------------------------------------------------
# FAILOVER AND FAILBACK
#------------------------------------------------------------------------------
#failover_command = '/usr/share/pgpool/4.1.0/etc/failover.sh %d %P %H postgrestg /installer/postgresql-11.5/data/im_the_master'
failover_command = '/etc/pgpool-II/failover.sh %d %h %p %D %m %H %M %P %r %R %N %S'
#failover_command = '/etc/pgpool-II/failover.sh %d %P %H reppassword /installer/postgresql-11.5/data/im_the_master'
                                   # Executes this command at failover
                                   # Special values:
                                   #   %d = node id
                                   #   %h = host name
                                   #   %p = port number
                                   #   %D = database cluster path
                                   #   %m = new master node id
                                   #   %H = hostname of the new master node
                                   #   %M = old master node id
                                   #   %P = old primary node id
                                   #   %r = new master port number
                                   #   %R = new master database cluster path
                                   #   %% = '%' character
failback_command = ''
                                   # Executes this command at failback.
                                   # Special values:
                                   #   %d = node id
                                   #   %h = host name
                                   #   %p = port number
                                   #   %D = database cluster path
                                   #   %m = new master node id
                                   #   %H = hostname of the new master node
                                   #   %M = old master node id
                                   #   %P = old primary node id
                                   #   %r = new master port number
                                   #   %R = new master database cluster path
                                   #   %% = '%' character

failover_on_backend_error = off
                                   # Initiates failover when reading/writing to the
                                   # backend communication socket fails
                                   # If set to off, pgpool will report an
                                   # error and disconnect the session.
#detach_false_primary = on
detach_false_primary = off
                                   # Detach false primary if on. Only
                                   # valid in streaming replication
                                   # mode and with PostgreSQL 9.6 or
                                   # after.
#search_primary_node_timeout = 10
search_primary_node_timeout = 300
                                   # Timeout in seconds to search for the
                                   # primary node when a failover occurs.
                                   # 0 means no timeout, keep searching
                                   # for a primary node forever.

#------------------------------------------------------------------------------
# ONLINE RECOVERY
#------------------------------------------------------------------------------

recovery_user = 'postgres'
                                   # Online recovery user
recovery_password = ''
                                   # Online recovery password
                                   # Leaving it empty makes Pgpool-II look for the
                                   # password in the pool_passwd file first, before using the empty password

recovery_1st_stage_command = 'recovery_1st_stage.sh'
                                   # Executes a command in first stage
recovery_2nd_stage_command = ''
                                   # Executes a command in second stage
recovery_timeout = 90
                                   # Timeout in seconds to wait for the
                                   # recovering node's postmaster to start up
                                   # 0 means no wait
client_idle_limit_in_recovery = 0
                                   # Client is disconnected after being idle
                                   # for that many seconds in the second stage
                                   # of online recovery
                                   # 0 means no disconnection
                                   # -1 means immediate disconnection


#------------------------------------------------------------------------------
# WATCHDOG
#------------------------------------------------------------------------------

# - Enabling -

use_watchdog = on
                                    # Activates watchdog
                                    # (change requires restart)

# -Connection to up stream servers -
trusted_servers = 'mohvcasdb01.novalocal,castestdb01.novalocal'
                                    # trusted server list which are used
                                    # to confirm network connection
                                    # (hostA,hostB,hostC,...)
                                    # (change requires restart)
ping_path = '/bin'
                                    # ping command path
                                    # (change requires restart)

# - Watchdog communication Settings -

wd_hostname = 'mwdp3prddm01.moh.gov.sa'
                                    # Host name or IP address of this watchdog
                                    # (change requires restart)
wd_port = 9000
                                    # port number for watchdog service
                                    # (change requires restart)
wd_priority = 1
                                    # priority of this watchdog in leader election
                                    # (change requires restart)

wd_authkey = ''
                                    # Authentication key for watchdog communication
                                    # (change requires restart)

wd_ipc_socket_dir = '/var/run/postgresql'
                                    # Unix domain socket path for watchdog IPC socket
                                    # The Debian package defaults to
                                    # /var/run/postgresql
                                    # (change requires restart)


# - Virtual IP control Setting -

delegate_IP = '10.70.185.66'
                                    # delegate IP address
                                    # If this is empty, the virtual IP is never brought up.
                                    # (change requires restart)
if_cmd_path = '/sbin'
                                    # path to the directory where if_up/down_cmd exists 
                                    # (change requires restart)
if_up_cmd = 'ip addr add $_IP_$/24 dev eth0 label eth0:0'
                                    # startup delegate IP command
                                    # (change requires restart)
if_down_cmd = 'ip addr del $_IP_$/24 dev eth0'
                                    # shutdown delegate IP command
                                    # (change requires restart)
arping_path = '/usr/sbin'
                                    # arping command path
                                    # (change requires restart)
arping_cmd = 'arping -U $_IP_$ -w 1 -I eth0'
                                    # arping command
                                    # (change requires restart)

# - Behavior on escalation Setting -

clear_memqcache_on_escalation = on
                                    # Clear all the query cache on shared memory
                                    # when standby pgpool escalates to active pgpool
                                    # (= virtual IP holder).
                                    # This should be off if client connects to pgpool
                                    # not using virtual IP.
                                    # (change requires restart)
wd_escalation_command = ''
                                    # Executes this command at escalation on new active pgpool.
                                    # (change requires restart)
wd_de_escalation_command = ''
                                    # Executes this command when master pgpool resigns from being master.
                                    # (change requires restart)

# - Watchdog consensus settings for failover -

failover_when_quorum_exists = on
                                    # Only perform backend node failover
                                    # when the watchdog cluster holds the quorum
                                    # (change requires restart)

failover_require_consensus = on
                                    # Perform failover when a majority of Pgpool-II nodes
                                    # agrees on the backend node status change
                                    # (change requires restart)

allow_multiple_failover_requests_from_node = off
                                    # A Pgpool-II node can cast multiple votes
                                    # for building the consensus on failover
                                    # (change requires restart)

# - Lifecheck Setting -

# -- common --

wd_monitoring_interfaces_list = ''
                                    # if any interface from the list is active the watchdog will
                                    # consider the network is fine
                                    # 'any' to enable monitoring on all interfaces except loopback
                                    # '' to disable monitoring
                                    # (change requires restart)


wd_lifecheck_method = 'heartbeat'
                                    # Method of watchdog lifecheck ('heartbeat' or 'query' or 'external')
                                    # (change requires restart)
wd_interval = 3                     
                                    # lifecheck interval (sec) > 0
                                    # (change requires restart)

# -- heartbeat mode --

wd_heartbeat_port = 9694
                                    # Port number for receiving heartbeat signal
                                    # (change requires restart)
wd_heartbeat_keepalive = 2
                                    # Interval time of sending heartbeat signal (sec)
                                    # (change requires restart)
wd_heartbeat_deadtime = 30
                                    # Deadtime interval for heartbeat signal (sec)
                                    # (change requires restart)
                                    # Host name or IP address of destination 0
                                    # for sending heartbeat signal.
                                    # (change requires restart)
                                    # Port number of destination 0 for sending
                                    # heartbeat signal. Usually this is the
                                    # same as wd_heartbeat_port.
                                    # (change requires restart)
                                    # Name of NIC device (such as 'eth0')
                                    # used for sending/receiving heartbeat
                                    # signal to/from destination 0.
                                    # This works only when this is not empty
                                    # and pgpool has root privilege.
                                    # (change requires restart)

#heartbeat_destination1 = 'host0_ip2'
#heartbeat_destination_port1 = 9694
#heartbeat_device1 = ''

# -- query mode --

wd_life_point = 3
                                    # lifecheck retry times
                                    # (change requires restart)
wd_lifecheck_query = 'SELECT 1'
                                    # lifecheck query to pgpool from watchdog
                                    # (change requires restart)
wd_lifecheck_dbname = 'template1'
                                    # Database name connected for lifecheck
                                    # (change requires restart)
wd_lifecheck_user = 'nobody'
                                    # watchdog user monitoring pgpools in lifecheck
                                    # (change requires restart)
wd_lifecheck_password = ''
                                    # Password for watchdog user in lifecheck
                                    # Leaving it empty makes Pgpool-II look for the
                                    # password in the pool_passwd file first, before using the empty password
                                    # (change requires restart)

# - Other pgpool Connection Settings -

                                    # Host name or IP address to connect to for other pgpool 0
                                    # (change requires restart)
                                    # Port number for other pgpool 0
                                    # (change requires restart)
                                    # Port number for other watchdog 0
                                    # (change requires restart)
#other_pgpool_hostname1 = 'host1'
#other_pgpool_port1 = 5432
#other_wd_port1 = 9000


#------------------------------------------------------------------------------
# OTHERS
#------------------------------------------------------------------------------
relcache_expire = 0
                                   # Life time of relation cache in seconds.
                                   # 0 means no cache expiration(the default).
                                   # The relation cache is used to cache the
                                   # query result against PostgreSQL system
                                   # catalog to obtain various information
                                   # including table structures or if it's a
                                   # temporary table or not. The cache is
                                   # maintained in a pgpool child local memory
                                   # and is kept as long as it survives.
                                   # If someone modifies the table by using
                                   # ALTER TABLE or some such, the relcache is
                                   # not consistent anymore.
                                   # For this purpose, relcache_expire
                                   # controls the life time of the cache.

relcache_size = 256
                                   # Number of relation cache
                                   # entry. If you see frequently:
                                   # "pool_search_relcache: cache replacement happend"
                                   # in the pgpool log, you might want to increase this number.

check_temp_table = on
                                   # If on, enable temporary table check in SELECT statements.
                                   # This initiates queries against system catalog of primary/master
                                   # thus increases load of master.
                                   # If you are absolutely sure that your system never uses temporary tables
                                   # and you want to save access to primary/master, you could turn this off.
                                   # Default is on.

check_unlogged_table = on
                                   # If on, enable unlogged table check in SELECT statements.
                                   # This initiates queries against system catalog of primary/master
                                   # thus increases load of master.
                                   # If you are absolutely sure that your system never uses unlogged tables
                                   # and you want to save access to primary/master, you could turn this off.
                                   # Default is on.

#------------------------------------------------------------------------------
# IN MEMORY QUERY MEMORY CACHE
#------------------------------------------------------------------------------
memory_cache_enabled = off
                                   # If on, use the memory cache functionality, off by default
                                   # (change requires restart)
memqcache_method = 'shmem'
                                   # Cache storage method. either 'shmem'(shared memory) or
                                   # 'memcached'. 'shmem' by default
                                   # (change requires restart)
memqcache_memcached_host = 'localhost'
                                   # Memcached host name or IP address. Mandatory if
                                   # memqcache_method = 'memcached'.
                                   # Defaults to localhost.
                                   # (change requires restart)
memqcache_memcached_port = 11211
                                   # Memcached port number. Mandatory if memqcache_method = 'memcached'.
                                   # Defaults to 11211.
                                   # (change requires restart)
memqcache_total_size = 67108864
                                   # Total memory size in bytes for storing memory cache.
                                   # Mandatory if memqcache_method = 'shmem'.
                                   # Defaults to 64MB.
                                   # (change requires restart)
memqcache_max_num_cache = 1000000
                                   # Total number of cache entries. Mandatory
                                   # if memqcache_method = 'shmem'.
                                   # Each cache entry consumes 48 bytes on shared memory.
                                   # Defaults to 1,000,000(45.8MB).
                                   # (change requires restart)
memqcache_expire = 0
                                   # Memory cache entry life time specified in seconds.
                                   # 0 means infinite life time. 0 by default.
                                   # (change requires restart)
memqcache_auto_cache_invalidation = on
                                   # If on, invalidation of query cache is triggered by corresponding
                                   # DDL/DML/DCL(and memqcache_expire).  If off, it is only triggered
                                   # by memqcache_expire.  on by default.
                                   # (change requires restart)
memqcache_maxcache = 409600
                                   # Maximum SELECT result size in bytes.
                                   # Must be smaller than memqcache_cache_block_size. Defaults to 400KB.
                                   # (change requires restart)
memqcache_cache_block_size = 1048576
                                   # Cache block size in bytes. Mandatory if memqcache_method = 'shmem'.
                                   # Defaults to 1MB.
                                   # (change requires restart)
memqcache_oiddir = '/var/log/pgpool/oiddir'
                                   # Temporary work directory to record table oids
                                   # (change requires restart)
white_memqcache_table_list = ''
                                   # Comma separated list of table names to memcache
                                   # that don't write to database
                                   # Regexp are accepted
black_memqcache_table_list = ''
                                   # Comma separated list of table names not to memcache
                                   # that don't write to database
                                   # Regexp are accepted
backend_hostname0 = 'mwdp3prddm01.moh.gov.sa'
backend_port0 = 5432
backend_weight0 = 1
backend_data_directory0 = '/installer/postgresql-11.5/data'
backend_flag0 = 'ALLOW_TO_FAILOVER'
#backend_flag0 = 'DISALLOW_TO_FAILOVER'

backend_hostname1 = 'mwdp3prdds01.moh.gov.sa'
backend_port1 = 5432
backend_weight1 = 1
backend_data_directory1 = '/installer/postgresql-11.5/data'
backend_flag1 = 'ALLOW_TO_FAILOVER'
#backend_flag1 = 'DISALLOW_TO_FAILOVER'
heartbeat_destination0 = 'mwdp3prdds01.moh.gov.sa'
heartbeat_destination_port0 = 9694

other_pgpool_hostname0 = 'mwdp3prdds01.moh.gov.sa'
other_pgpool_port0 = 5433
other_wd_port0 = 9000

heartbeat_device0 = 'eth0'
## Added by Raj --> parameters available from Pgpool-II 4.1 onwards

statement_level_load_balance = on

enable_consensus_with_half_votes = on

backend_application_name0 = 'mwdp3prddm01.moh.gov.sa'
backend_application_name1 = 'mwdp3prdds01.moh.gov.sa'

FAILOVER_COMMAND_FINISH_TIMEOUT = 15
pgpool.conf (42,293 bytes)   

raj.pandey1982@gmail.com

2021-07-30 20:18

reporter   ~0003911

Please provide a solution for both:
(1) Why does the message "postmaster on DB node 0 was shutdown by administrative command" appear in the log, and why does failover happen, while the master is up and running fine?

(2) When I kill a DB session as the postgres user, the message "postmaster on DB node 0 was shutdown by administrative command" appears, the master DB terminates, and failover happens.

raj.pandey1982@gmail.com

2021-07-30 20:21

reporter   ~0003912

(3) Is the pcp_attach_node command considered an administrative command when run as root?

t-ishii

2021-08-03 15:39

developer   ~0003915

> (1) Why does the message "postmaster on DB node 0 was shutdown by administrative command" appear in the log, and why does failover happen, while the master is up and running fine?
> (2) When I kill a DB session as the postgres user, the message "postmaster on DB node 0 was shutdown by administrative command" appears, the master DB terminates, and failover happens.

Because you are using the pg_terminate_backend() function with a non-constant argument. See the manual for more details.
https://www.pgpool.net/docs/41/en/html/restrictions.html

t-ishii

2021-08-03 15:40

developer   ~0003916

> (3) Is the pcp_attach_node command considered an administrative command when run as root?
No.

raj.pandey1982@gmail.com

2021-08-03 16:27

reporter   ~0003917

Thanks, but as per the above link, "If the argument to the function (that is a process id) is a constant, you can safely use the function. In extended protocol mode, you cannot use the function though."
I didn't get how to use this.

For example, I use: select pg_terminate_backend(34);
'34' is a pid.

How are you suggesting to use it then?

t-ishii

2021-08-03 16:38

developer   ~0003918

> select pg_terminate_backend(34) ;
This is fine because 34 is a constant.

> select pg_terminate_backend(pid) from pg_stat_activity where pid <> pg_backend_pid() AND state in ('idle') and usename NOT IN ('replication') and state_change >= current_timestamp - interval '10 minutes'
This is not fine because "pid" is not a constant.

Also you have to issue the SQL via pgpool, not directly to PostgreSQL.
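
When several sessions need to be terminated, one way to keep every argument constant is to collect the pids first and then send one statement per pid, with the pid written as a literal. Below is a minimal Python sketch of the statement-building step only. The helper name build_terminate_statements and the sample pids are hypothetical, and the sketch does not send anything: the resulting statements would still have to be issued through pgpool, and not in extended protocol mode.

```python
def build_terminate_statements(pids):
    """Return one SQL statement per pid, with the pid as a literal constant."""
    statements = []
    for pid in pids:
        # int() raises ValueError for a non-numeric string such as "pid",
        # so a column name or expression cannot slip into the statement.
        statements.append(f"SELECT pg_terminate_backend({int(pid)});")
    return statements


if __name__ == "__main__":
    # Hypothetical pids, collected beforehand in a separate query.
    for statement in build_terminate_statements([830, 27712, 21300]):
        print(statement)
```

Each printed line then has the same shape as "select pg_terminate_backend(34)", which is the form described as fine above.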

raj.pandey1982@gmail.com

2021-08-03 16:48

reporter   ~0003919

Yes, I did the single constant kill from postgres directly, not via the pgpool VIP, using select pg_terminate_backend(34). That triggered failover.

t-ishii

2021-08-03 16:55

developer   ~0003920

> Yes, I did the single constant kill from postgres
In this case there is nothing pgpool can do: pgpool cannot distinguish between a postmaster shutdown and pg_terminate_backend() because both produce exactly the same error message.
If pg_terminate_backend() is issued via pgpool, pgpool can remember the pid passed to the function and recognize that the error was caused by the function, not by a postmaster shutdown.
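
The reasoning above can be illustrated with a toy model. This is only a sketch of the idea and not Pgpool-II's actual implementation: the class TerminationTracker and its method names are hypothetical. It assumes the shared error is SQLSTATE 57P01 (admin_shutdown), which PostgreSQL reports to a session both when it is ended by pg_terminate_backend() and when the postmaster is shut down.

```python
ADMIN_SHUTDOWN = "57P01"  # SQLSTATE admin_shutdown


class TerminationTracker:
    """Toy model: remember pids that were passed to pg_terminate_backend()."""

    def __init__(self):
        self._expected_pids = set()

    def note_terminate_call(self, pid):
        # Called when a statement with a constant pid passes through the proxy.
        self._expected_pids.add(pid)

    def should_failover(self, backend_pid, sqlstate):
        if sqlstate != ADMIN_SHUTDOWN:
            return False  # other errors are outside this toy model
        if backend_pid in self._expected_pids:
            # The error was expected: it came from pg_terminate_backend().
            self._expected_pids.discard(backend_pid)
            return False
        # Nothing was recorded for this pid, so the error looks exactly
        # like a postmaster shutdown.
        return True
```

In this model, a pg_terminate_backend(34) sent directly to PostgreSQL never reaches note_terminate_call, so the later 57P01 error on pid 34 is treated like a shutdown.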

raj.pandey1982@gmail.com

2021-08-09 15:49

reporter   ~0003926

OK. And what if I run the kill command at the OS level? Would there be any impact? For example, "kill <pid>" at the Linux command prompt:
kill 830
kill 27712
kill 21300

t-ishii

2021-08-10 08:39

developer   ~0003927

That would trigger failover in most cases. The only case that would not trigger a failover is when no pgpool session is associated with the postgres process, that is, when the postgres process is only being kept for connection pooling.

t-ishii

2022-02-05 17:10

developer   ~0003992

Note that Pgpool-II 4.3 has a new parameter, "failover_on_backend_shutdown". When it is set to off, it prevents failover when an admin shuts down PostgreSQL, kills a PostgreSQL backend process, or issues pg_terminate_backend(). Please consider upgrading to 4.3.

t-ishii

2022-05-19 11:28

developer   ~0004036

May I close this issue?

administrator

2022-06-28 12:00

administrator   ~0004083

Close issue.

Issue History

Date Modified Username Field Change
2021-07-30 20:10 raj.pandey1982@gmail.com New Issue
2021-07-30 20:15 raj.pandey1982@gmail.com Note Added: 0003910
2021-07-30 20:15 raj.pandey1982@gmail.com File Added: pgpool.conf
2021-07-30 20:18 raj.pandey1982@gmail.com Note Added: 0003911
2021-07-30 20:21 raj.pandey1982@gmail.com Note Added: 0003912
2021-08-03 15:36 t-ishii Assigned To => t-ishii
2021-08-03 15:36 t-ishii Status new => assigned
2021-08-03 15:39 t-ishii Note Added: 0003915
2021-08-03 15:39 t-ishii Status assigned => feedback
2021-08-03 15:40 t-ishii Note Added: 0003916
2021-08-03 16:27 raj.pandey1982@gmail.com Note Added: 0003917
2021-08-03 16:27 raj.pandey1982@gmail.com Status feedback => assigned
2021-08-03 16:38 t-ishii Note Added: 0003918
2021-08-03 16:48 raj.pandey1982@gmail.com Note Added: 0003919
2021-08-03 16:55 t-ishii Note Added: 0003920
2021-08-09 15:49 raj.pandey1982@gmail.com Note Added: 0003926
2021-08-10 08:39 t-ishii Note Added: 0003927
2021-08-10 08:39 t-ishii Status assigned => feedback
2022-02-05 17:10 t-ishii Note Added: 0003992
2022-05-19 11:28 t-ishii Note Added: 0004036
2022-06-28 12:00 administrator Status feedback => closed
2022-06-28 12:00 administrator Note Added: 0004083