View Issue Details
| ID | Project | Category | View Status | Date Submitted | Last Update |
|---|---|---|---|---|---|
| 0000726 | Pgpool-II | Bug | public | 2021-07-30 20:10 | 2022-06-28 12:00 |
| Reporter | raj.pandey1982@gmail.com | Assigned To | t-ishii | ||
| Priority | high | Severity | major | Reproducibility | sometimes |
| Status | closed | Resolution | open | ||
| Platform | centos | OS | x86_64 GNU/Linux | OS Version | Redhat7 |
| Product Version | 4.1.0 | ||||
| Summary | 0000726: postmaster on DB node 0 was shutdown by administrative command | ||||
| Description | Sometimes failover happens even though the master DB is not down, with the pgpool log message "postmaster on DB node 0 was shutdown by administrative command". The slave became master while the master was never down. Master (node 1) log below. | ||||
| Tags | No tags attached. | ||||

```
======================== Master = Node 1 log ========================
2021-07-30 09:16:43: pid 424:LOG: fork a new child process with pid: 29330
2021-07-30 09:16:43: pid 424:LOG: child process with pid: 20492 exits with status 256
2021-07-30 09:16:43: pid 424:LOG: fork a new child process with pid: 29331
2021-07-30 09:16:43: pid 424:LOG: child process with pid: 20955 exits with status 256
2021-07-30 09:16:43: pid 424:LOG: fork a new child process with pid: 29332
2021-07-30 10:47:47: pid 434:LOG: read from socket failed, remote end closed the connection
2021-07-30 10:47:47: pid 434:LOG: client socket of mwdp3prdds01.moh.gov.sa:5433 Linux mwdp3prdds01.moh.gov.sa is closed
2021-07-30 10:47:47: pid 434:LOG: remote node "mwdp3prdds01.moh.gov.sa:5433 Linux mwdp3prdds01.moh.gov.sa" is shutting down
2021-07-30 10:47:47: pid 434:LOG: removing watchdog node "mwdp3prdds01.moh.gov.sa:5433 Linux mwdp3prdds01.moh.gov.sa" from the standby list
2021-07-30 10:47:47: pid 424:LOG: Pgpool-II parent process received watchdog quorum change signal from watchdog
2021-07-30 10:47:47: pid 434:LOG: new IPC connection received
2021-07-30 10:47:47: pid 424:LOG: watchdog cluster now holds the quorum
2021-07-30 10:47:47: pid 424:DETAIL: updating the state of quarantine backend nodes
2021-07-30 10:47:47: pid 434:LOG: new IPC connection received
2021-07-30 10:47:51: pid 434:LOG: Watchdog is shutting down
2021-07-30 10:47:51: pid 397:LOG: watchdog: de-escalation started
2021-07-30 10:47:51: pid 397:LOG: successfully released the delegate IP:"10.70.185.66"
2021-07-30 10:47:51: pid 397:DETAIL: 'if_down_cmd' returned with success
2021-07-30 10:48:08: pid 542:WARNING: checking setuid bit of if_up_cmd
2021-07-30 10:48:08: pid 542:DETAIL: ifup[/sbin/ip] doesn't have setuid bit
2021-07-30 10:48:08: pid 542:WARNING: checking setuid bit of if_down_cmd
2021-07-30 10:48:08: pid 542:DETAIL: ifdown[/sbin/ip] doesn't have setuid bit
2021-07-30 10:48:08: pid 542:WARNING: checking setuid bit of arping command
2021-07-30 10:48:08: pid 542:DETAIL: arping[/usr/sbin/arping] doesn't have setuid bit
2021-07-30 10:48:08: pid 542:LOG: Backend status file /var/log/pgpool/pgpool_status discarded
2021-07-30 10:48:08: pid 542:LOG: memory cache initialized
2021-07-30 10:48:08: pid 542:DETAIL: memcache blocks :64
2021-07-30 10:48:08: pid 542:LOG: pool_discard_oid_maps: discarded memqcache oid maps
2021-07-30 10:48:09: pid 542:LOG: waiting for watchdog to initialize
2021-07-30 10:48:09: pid 545:LOG: setting the local watchdog node name to "mwdp3prddm01.moh.gov.sa:5433 Linux mwdp3prddm01.moh.gov.sa"
2021-07-30 10:48:09: pid 545:LOG: watchdog cluster is configured with 1 remote nodes
2021-07-30 10:48:09: pid 545:LOG: watchdog remote node:0 on mwdp3prdds01.moh.gov.sa:9000
2021-07-30 10:48:09: pid 545:LOG: interface monitoring is disabled in watchdog
2021-07-30 10:48:09: pid 545:LOG: watchdog node state changed from [DEAD] to [LOADING]
2021-07-30 10:48:14: pid 545:LOG: watchdog node state changed from [LOADING] to [JOINING]
2021-07-30 10:48:18: pid 545:LOG: watchdog node state changed from [JOINING] to [INITIALIZING]
2021-07-30 10:48:19: pid 545:LOG: I am the only alive node in the watchdog cluster
2021-07-30 10:48:19: pid 545:HINT: skipping stand for coordinator state
2021-07-30 10:48:19: pid 545:LOG: watchdog node state changed from [INITIALIZING] to [MASTER]
2021-07-30 10:48:19: pid 545:LOG: I am announcing my self as master/coordinator watchdog node
2021-07-30 10:48:23: pid 545:LOG: I am the cluster leader node
2021-07-30 10:48:23: pid 545:DETAIL: our declare coordinator message is accepted by all nodes
2021-07-30 10:48:23: pid 545:LOG: setting the local node "mwdp3prddm01.moh.gov.sa:5433 Linux mwdp3prddm01.moh.gov.sa" as watchdog cluster master
2021-07-30 10:48:23: pid 545:LOG: I am the cluster leader node. Starting escalation process
2021-07-30 10:48:23: pid 542:LOG: watchdog process is initialized
2021-07-30 10:48:23: pid 542:DETAIL: watchdog messaging data version: 1.1
2021-07-30 10:48:23: pid 545:LOG: escalation process started with PID:577
2021-07-30 10:48:23: pid 577:LOG: watchdog: escalation started
2021-07-30 10:48:23: pid 545:LOG: new IPC connection received
2021-07-30 10:48:23: pid 545:LOG: new IPC connection received
2021-07-30 10:48:23: pid 578:LOG: 2 watchdog nodes are configured for lifecheck
2021-07-30 10:48:23: pid 578:LOG: watchdog nodes ID:0 Name:"mwdp3prddm01.moh.gov.sa:5433 Linux mwdp3prddm01.moh.gov.sa"
2021-07-30 10:48:23: pid 578:DETAIL: Host:"mwdp3prddm01.moh.gov.sa" WD Port:9000 pgpool-II port:5433
2021-07-30 10:48:23: pid 578:LOG: watchdog nodes ID:1 Name:"Not_Set"
2021-07-30 10:48:23: pid 578:DETAIL: Host:"mwdp3prdds01.moh.gov.sa" WD Port:9000 pgpool-II port:5433
2021-07-30 10:48:23: pid 542:LOG: Setting up socket for 0.0.0.0:5433
2021-07-30 10:48:23: pid 542:LOG: Setting up socket for :::5433
2021-07-30 10:48:23: pid 578:LOG: watchdog lifecheck trusted server "mohvcasdb01.novalocal" added for the availability check
2021-07-30 10:48:23: pid 578:LOG: watchdog lifecheck trusted server "castestdb01.novalocal" added for the availability check
2021-07-30 10:48:24: pid 580:LOG: createing watchdog heartbeat receive socket.
2021-07-30 10:48:24: pid 580:DETAIL: bind receive socket to device: "eth0"
2021-07-30 10:48:24: pid 580:LOG: set SO_REUSEPORT option to the socket
2021-07-30 10:48:24: pid 580:LOG: creating watchdog heartbeat receive socket.
2021-07-30 10:48:24: pid 580:DETAIL: set SO_REUSEPORT
2021-07-30 10:48:24: pid 581:LOG: creating socket for sending heartbeat
2021-07-30 10:48:24: pid 581:DETAIL: bind send socket to device: eth0
2021-07-30 10:48:24: pid 581:LOG: set SO_REUSEPORT option to the socket
2021-07-30 10:48:24: pid 581:LOG: creating socket for sending heartbeat
2021-07-30 10:48:24: pid 581:DETAIL: set SO_REUSEPORT
2021-07-30 10:48:26: pid 542:LOG: find_primary_node_repeatedly: waiting for finding a primary node
2021-07-30 10:48:26: pid 542:LOG: find_primary_node: primary node is 0
2021-07-30 10:48:26: pid 542:LOG: find_primary_node: standby node is 1
2021-07-30 10:48:26: pid 5181:LOG: PCP process: 5181 started
2021-07-30 10:48:27: pid 542:LOG: pgpool-II successfully started. version 4.1.0 (karasukiboshi)
2021-07-30 10:48:27: pid 542:LOG: node status[0]: 1
2021-07-30 10:48:27: pid 542:LOG: node status[1]: 2
2021-07-30 10:48:27: pid 577:LOG: successfully acquired the delegate IP:"10.70.185.66"
2021-07-30 10:48:27: pid 577:DETAIL: 'if_up_cmd' returned with success
2021-07-30 10:48:27: pid 545:LOG: watchdog escalation process with pid: 577 exit with SUCCESS.
2021-07-30 10:49:08: pid 545:LOG: new watchdog node connection is received from "10.70.185.63:44727"
2021-07-30 10:49:08: pid 545:LOG: new node joined the cluster hostname:"mwdp3prdds01.moh.gov.sa" port:9000 pgpool_port:5433
2021-07-30 10:49:08: pid 545:DETAIL: Pgpool-II version:"4.1.0" watchdog messaging version: 1.1
2021-07-30 10:49:08: pid 545:LOG: new outbound connection to mwdp3prdds01.moh.gov.sa:9000
2021-07-30 10:49:14: pid 545:LOG: adding watchdog node "mwdp3prdds01.moh.gov.sa:5433 Linux mwdp3prdds01.moh.gov.sa" to the standby list
2021-07-30 10:49:14: pid 542:LOG: Pgpool-II parent process received watchdog quorum change signal from watchdog
2021-07-30 10:49:14: pid 545:LOG: new IPC connection received
2021-07-30 10:49:14: pid 542:LOG: watchdog cluster now holds the quorum
2021-07-30 10:49:14: pid 542:DETAIL: updating the state of quarantine backend nodes
2021-07-30 10:49:14: pid 545:LOG: new IPC connection received
2021-07-30 10:49:51: pid 5136:LOG: pool_reuse_block: blockid: 0
2021-07-30 10:49:51: pid 5136:CONTEXT: while searching system catalog, When relcache is missed
2021-07-30 10:50:01: pid 2153:LOG: reading and processing packets
2021-07-30 10:50:01: pid 2153:DETAIL: postmaster on DB node 0 was shutdown by administrative command
2021-07-30 10:50:01: pid 2153:LOG: received degenerate backend request for node_id: 0 from pid [2153]
2021-07-30 10:50:01: pid 5136:LOG: reading and processing packets
2021-07-30 10:50:01: pid 5136:DETAIL: postmaster on DB node 0 was shutdown by administrative command
2021-07-30 10:50:01: pid 5136:LOG: received degenerate backend request for node_id: 0 from pid [5136]
2021-07-30 10:50:01: pid 545:LOG: new IPC connection received
2021-07-30 10:50:01: pid 545:LOG: new IPC connection received
2021-07-30 10:50:01: pid 545:LOG: watchdog received the failover command from local pgpool-II on IPC interface
2021-07-30 10:50:01: pid 545:LOG: watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from local pgpool-II on IPC interface
2021-07-30 10:50:01: pid 545:LOG: we have got the consensus to perform the failover
2021-07-30 10:50:01: pid 545:DETAIL: 1 node(s) voted in the favor
2021-07-30 10:50:01: pid 545:LOG: watchdog received the failover command from local pgpool-II on IPC interface
2021-07-30 10:50:01: pid 545:LOG: watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from local pgpool-II on IPC interface
2021-07-30 10:50:01: pid 545:LOG: we have got the consensus to perform the failover
2021-07-30 10:50:01: pid 545:DETAIL: 1 node(s) voted in the favor
2021-07-30 10:50:01: pid 4117:LOG: reading and processing packets
2021-07-30 10:50:01: pid 4117:DETAIL: postmaster on DB node 0 was shutdown by administrative command
2021-07-30 10:50:01: pid 4117:LOG: received degenerate backend request for node_id: 0 from pid [4117]
2021-07-30 10:50:01: pid 545:LOG: new IPC connection received
2021-07-30 10:50:01: pid 5147:LOG: reading and processing packets
2021-07-30 10:50:01: pid 5147:DETAIL: postmaster on DB node 0 was shutdown by administrative command
2021-07-30 10:50:01: pid 5147:LOG: received degenerate backend request for node_id: 0 from pid [5147]
2021-07-30 10:50:01: pid 545:LOG: watchdog received the failover command from local pgpool-II on IPC interface
2021-07-30 10:50:01: pid 542:LOG: Pgpool-II parent process has received failover request
2021-07-30 10:50:01: pid 545:LOG: watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from local pgpool-II on IPC interface
2021-07-30 10:50:01: pid 545:LOG: we have got the consensus to perform the failover
2021-07-30 10:50:01: pid 545:DETAIL: 1 node(s) voted in the favor
2021-07-30 10:50:01: pid 545:LOG: new IPC connection received
2021-07-30 10:50:01: pid 545:LOG: new IPC connection received
2021-07-30 10:50:01: pid 545:LOG: watchdog received the failover command from local pgpool-II on IPC interface
2021-07-30 10:50:01: pid 545:LOG: watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from local pgpool-II on IPC interface
2021-07-30 10:50:01: pid 545:LOG: we have got the consensus to perform the failover
2021-07-30 10:50:01: pid 545:DETAIL: 1 node(s) voted in the favor
2021-07-30 10:50:01: pid 545:LOG: received the failover indication from Pgpool-II on IPC interface
2021-07-30 10:50:01: pid 545:LOG: watchdog is informed of failover start by the main process
2021-07-30 10:50:01: pid 542:LOG: starting degeneration. shutdown host mwdp3prddm01.moh.gov.sa(5432)
2021-07-30 10:50:01: pid 542:LOG: Restart all children
2021-07-30 10:50:01: pid 542:LOG: execute command: /etc/pgpool-II/failover.sh 0 mwdp3prddm01.moh.gov.sa 5432 /installer/postgresql-11.5/data 1 mwdp3prdds01.moh.gov.sa 0 0 5432 /installer/postgresql-11.5/data mwdp3prddm01.moh.gov.sa 5432
+ exec
++ logger -i -p local1.info
2021-07-30 10:50:03: pid 542:LOG: find_primary_node_repeatedly: waiting for finding a primary node
2021-07-30 10:50:03: pid 542:LOG: find_primary_node: primary node is 1
2021-07-30 10:50:03: pid 542:LOG: failover: set new primary node: 1
2021-07-30 10:50:03: pid 542:LOG: failover: set new master node: 1
```
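For reference, the `execute command:` log line can be decoded against the `failover_command` template `'%d %h %p %D %m %H %M %P %r %R %N %S'` from the attached pgpool.conf. A minimal sketch (the variable names are illustrative; the site's actual failover.sh is not shown in this report):

```shell
#!/bin/bash
# Map the positional arguments pgpool passes per the template
# '%d %h %p %D %m %H %M %P %r %R %N %S'. The 'set --' line replays the
# exact argument vector seen in the log, for demonstration only.
set -- 0 mwdp3prddm01.moh.gov.sa 5432 /installer/postgresql-11.5/data \
       1 mwdp3prdds01.moh.gov.sa 0 0 5432 /installer/postgresql-11.5/data \
       mwdp3prddm01.moh.gov.sa 5432

FAILED_NODE_ID="$1"        # %d  failed node id
FAILED_HOST="$2"           # %h  failed host name
FAILED_PORT="$3"           # %p  failed port number
FAILED_PGDATA="$4"         # %D  failed database cluster path
NEW_MASTER_ID="$5"         # %m  new master node id
NEW_MASTER_HOST="$6"       # %H  hostname of the new master node
OLD_MASTER_ID="$7"         # %M  old master node id
OLD_PRIMARY_ID="$8"        # %P  old primary node id
NEW_MASTER_PORT="$9"       # %r  new master port number
NEW_MASTER_PGDATA="${10}"  # %R  new master database cluster path
OLD_PRIMARY_HOST="${11}"   # %N  old primary host name
OLD_PRIMARY_PORT="${12}"   # %S  old primary port number

echo "failover: node ${FAILED_NODE_ID} (${FAILED_HOST}) -> new primary ${NEW_MASTER_ID} (${NEW_MASTER_HOST})"
```

So in the log above, node 0 (mwdp3prddm01) was degenerated and node 1 (mwdp3prdds01) was promoted, even though node 0's postmaster was still running.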
Also: with this we are frequently getting CPU utilization alerts. To reduce CPU utilization we kill idle sessions; however, when we run:

```sql
select pg_terminate_backend(pid) from pg_stat_activity
where pid <> pg_backend_pid()
  AND state in ('idle')
  and usename NOT IN ('replication')
  and state_change >= current_timestamp - interval '10 minutes'
```

an automatic failover happens and the pgpool log shows: DETAIL: postmaster on DB node 0 was shutdown by administrative command

pgpool.conf (42,293 bytes)
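A likely explanation: pg_terminate_backend() makes the terminated backend emit FATAL with SQLSTATE 57P01 ("terminating connection due to administrator command"), which is the same error an actual postmaster shutdown produces. When Pgpool-II 4.1 receives it on one of its pooled backend connections, it reports "postmaster on DB node 0 was shutdown by administrative command" and degenerates the node. One way to avoid terminating pgpool's own backend connections (a sketch, not part of the report) is to let pgpool reap idle clients itself via client_idle_limit, which is present but commented out in the attached pgpool.conf:

```
# pgpool.conf fragment (sketch; 600 s is an assumption matching the
# intended 10-minute policy). pgpool closes the idle frontend itself,
# so no backend receives FATAL 57P01 and no degenerate request is raised.
client_idle_limit = 600
```

Note that this change requires a reload, and that it disconnects clients even inside an explicit transaction, as the stock comment warns.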
# ----------------------------
# pgPool-II configuration file
# ----------------------------
#
# This file consists of lines of the form:
#
# name = value
#
# Whitespace may be used. Comments are introduced with "#" anywhere on a line.
# The complete list of parameter names and allowed values can be found in the
# pgPool-II documentation.
#
# This file is read on server startup and when the server receives a SIGHUP
# signal. If you edit the file on a running system, you have to SIGHUP the
# server for the changes to take effect, or use "pgpool reload". Some
# parameters, which are marked below, require a server shutdown and restart to
# take effect.
#
#------------------------------------------------------------------------------
# CONNECTIONS
#------------------------------------------------------------------------------
# - pgpool Connection Settings -
listen_addresses = '*'
# Host name or IP address to listen on:
# '*' for all, '' for no TCP/IP connections
# (change requires restart)
port = 5433
# Port number
# (change requires restart)
socket_dir = '/var/run/postgresql'
# Unix domain socket path
# The Debian package defaults to
# /var/run/postgresql
# (change requires restart)
listen_backlog_multiplier = 2
# Set the backlog parameter of listen(2) to
# num_init_children * listen_backlog_multiplier.
# (change requires restart)
serialize_accept = off
# whether to serialize accept() call to avoid thundering herd problem
# (change requires restart)
# - pgpool Communication Manager Connection Settings -
pcp_listen_addresses = '*'
# Host name or IP address for pcp process to listen on:
# '*' for all, '' for no TCP/IP connections
# (change requires restart)
pcp_port = 9898
# Port number for pcp
# (change requires restart)
pcp_socket_dir = '/var/run/postgresql'
# Unix domain socket path for pcp
# The Debian package defaults to
# /var/run/postgresql
# (change requires restart)
# - Backend Connection Settings -
# Host name or IP address to connect to for backend 0
# Port number for backend 0
# Weight for backend 0 (only in load balancing mode)
# Data directory for backend 0
# Controls various backend behavior
# ALLOW_TO_FAILOVER, DISALLOW_TO_FAILOVER
# or ALWAYS_MASTER
# - Authentication -
enable_pool_hba = on
# Use pool_hba.conf for client authentication
pool_passwd = 'pool_passwd'
# File name of pool_passwd for md5 authentication.
# "" disables pool_passwd.
# (change requires restart)
authentication_timeout = 60
# Delay in seconds to complete client authentication
# 0 means no timeout.
allow_clear_text_frontend_auth = off
# Allow Pgpool-II to use clear text password authentication
# with clients, when pool_passwd does not
# contain the user password
# - SSL Connections -
ssl = off
# Enable SSL support
# (change requires restart)
#ssl_key = './server.key'
# Path to the SSL private key file
# (change requires restart)
#ssl_cert = './server.cert'
# Path to the SSL public certificate file
# (change requires restart)
#ssl_ca_cert = ''
# Path to a single PEM format file
# containing CA root certificate(s)
# (change requires restart)
#ssl_ca_cert_dir = ''
# Directory containing CA root certificate(s)
# (change requires restart)
ssl_ciphers = 'HIGH:MEDIUM:+3DES:!aNULL'
# Allowed SSL ciphers
# (change requires restart)
ssl_prefer_server_ciphers = off
# Use server's SSL cipher preferences,
# rather than the client's
# (change requires restart)
#------------------------------------------------------------------------------
# POOLS
#------------------------------------------------------------------------------
# - Concurrent session and pool size -
num_init_children = 4000
#num_init_children = 2975
#num_init_children = 975
# Number of concurrent sessions allowed
# (change requires restart)
max_pool = 1
#max_pool = 3
#max_pool = 10
#max_pool = 4
# Number of connection pool caches per connection
# (change requires restart)
# - Life time -
child_life_time = 300
# Pool exits after being idle for this many seconds
child_max_connections = 0
# Pool exits after receiving that many connections
# 0 means no exit
#connection_life_time = 0
connection_life_time = 300
# Connection to backend closes after being idle for this many seconds
# 0 means no close
#client_idle_limit = 0
# Client is disconnected after being idle for that many seconds
# (even inside an explicit transactions!)
# 0 means no disconnection
reserved_connections = 1
#------------------------------------------------------------------------------
# LOGS
#------------------------------------------------------------------------------
# - Where to log -
log_destination = 'stderr'
# Where to log
# Valid values are combinations of stderr,
# and syslog. Default to stderr.
# - What to log -
log_line_prefix = '%t: pid %p:'
log_connections = off
# Log connections
log_hostname = off
# Hostname will be shown in ps status
# and in logs if connections are logged
log_statement = off
# Log all statements
log_per_node_statement = off
# Log all statements
# with node and backend informations
log_client_messages = off
# Log any client messages
log_standby_delay = 'none'
# Log standby delay
# Valid values are combinations of always,
# if_over_threshold, none
# - Syslog specific -
syslog_facility = 'LOCAL0'
# Syslog local facility. Default to LOCAL0
syslog_ident = 'pgpool'
# Syslog program identification string
# Default to 'pgpool'
# - Debug -
#log_error_verbosity = default # terse, default, or verbose messages
#client_min_messages = notice # values in order of decreasing detail:
# debug5
# debug4
# debug3
# debug2
# debug1
# log
# notice
# warning
# error
#log_min_messages = warning # values in order of decreasing detail:
# debug5
# debug4
# debug3
# debug2
# debug1
# info
# notice
# warning
# error
# log
# fatal
# panic
#------------------------------------------------------------------------------
# FILE LOCATIONS
#------------------------------------------------------------------------------
pid_file_name = '/var/run/postgresql/pgpool.pid'
# PID file name
# Can be specified as relative to the
# location of pgpool.conf file or
# as an absolute path
# (change requires restart)
logdir = '/var/log/pgpool'
# Directory of pgPool status file
# (change requires restart)
#------------------------------------------------------------------------------
# CONNECTION POOLING
#------------------------------------------------------------------------------
connection_cache = on
# Activate connection pools
# (change requires restart)
# Semicolon separated list of queries
# to be issued at the end of a session
# The default is for 8.3 and later
reset_query_list = 'ABORT; DISCARD ALL'
# The following one is for 8.2 and before
#reset_query_list = 'ABORT; RESET ALL; SET SESSION AUTHORIZATION DEFAULT'
#------------------------------------------------------------------------------
# REPLICATION MODE
#------------------------------------------------------------------------------
replication_mode = off
# Activate replication mode
# (change requires restart)
replicate_select = off
# Replicate SELECT statements
# when in replication mode
# replicate_select is higher priority than
# load_balance_mode.
insert_lock = on
# Automatically locks a dummy row or a table
# with INSERT statements to keep SERIAL data
# consistency
# Without SERIAL, no lock will be issued
lobj_lock_table = ''
# When rewriting lo_creat command in
# replication mode, specify table name to
# lock
# - Degenerate handling -
replication_stop_on_mismatch = off
# On disagreement with the packet kind
# sent from backend, degenerate the node
# which is most likely "minority"
# If off, just force to exit this session
failover_if_affected_tuples_mismatch = off
# On disagreement with the number of affected
# tuples in UPDATE/DELETE queries, then
# degenerate the node which is most likely
# "minority".
# If off, just abort the transaction to
# keep the consistency
#------------------------------------------------------------------------------
# LOAD BALANCING MODE
#------------------------------------------------------------------------------
#load_balance_mode = off
load_balance_mode = on
# Activate load balancing mode
# (change requires restart)
ignore_leading_white_space = on
# Ignore leading white spaces of each query
white_function_list = ''
# Comma separated list of function names
# that don't write to database
# Regexp are accepted
black_function_list = 'currval,lastval,nextval,setval,Get_Appt_code,walkin_appointment_token_no,findAppointmentFacilityTypeIdNew,findAppointmentReferUrgencyType,walkin_appointment_token_no,walkin_appointment_token_no,Reject_Dependent_Request,Delete_Dependent_Request'
#black_function_list = 'currval,lastval,nextval,setval'
# Comma separated list of function names
# that write to database
# Regexp are accepted
black_query_pattern_list = ''
# Semicolon separated list of query patterns
# that should be sent to primary node
# Regexp are accepted
# valid for streaming replication mode only.
database_redirect_preference_list = ''
# comma separated list of pairs of database and node id.
# example: postgres:primary,mydb[0-4]:1,mydb[5-9]:2'
# valid for streaming replication mode only.
app_name_redirect_preference_list = 'cas-schedule-mgmt-svc:primary'
#app_name_redirect_preference_list = ''
# comma separated list of pairs of app name and node id.
# example: 'psql:primary,myapp[0-4]:1,myapp[5-9]:standby'
# valid for streaming replication mode only.
allow_sql_comments = off
# if on, ignore SQL comments when judging if load balance or
# query cache is possible.
# If off, SQL comments effectively prevent the judgment
# (pre 3.4 behavior).
disable_load_balance_on_write = 'transaction'
# Load balance behavior when write query is issued
# in an explicit transaction.
# Note that any query not in an explicit transaction
# is not affected by the parameter.
# 'transaction' (the default): if a write query is issued,
# subsequent read queries will not be load balanced
# until the transaction ends.
# 'trans_transaction': if a write query is issued,
# subsequent read queries in an explicit transaction
# will not be load balanced until the session ends.
# 'always': if a write query is issued, read queries will
# not be load balanced until the session ends.
#------------------------------------------------------------------------------
# MASTER/SLAVE MODE
#------------------------------------------------------------------------------
master_slave_mode = on
# Activate master/slave mode
# (change requires restart)
master_slave_sub_mode = 'stream'
# Master/slave sub mode
# Valid values are stream, slony
# or logical. Default is stream.
# (change requires restart)
# - Streaming -
sr_check_period = 5
# Streaming replication check period
# Disabled (0) by default
#sr_check_user = 'postgres'
sr_check_user = 'replication'
# Streaming replication check user
# This is necessary even if you disable
# streaming replication delay check with
# sr_check_period = 0
#sr_check_password = 'postgrestg'
sr_check_password = 'reppassword'
# Password for streaming replication check user.
# Leaving it empty will make Pgpool-II first look for the
# password in the pool_passwd file before using the empty password
sr_check_database = 'postgres'
#sr_check_database = 'postgres'
# Database name for streaming replication check
delay_threshold = 0
# Threshold before not dispatching query to standby node
# Unit is in bytes
# Disabled (0) by default
# - Special commands -
follow_master_command = ''
# Executes this command after master failover
# Special values:
# %d = node id
# %h = host name
# %p = port number
# %D = database cluster path
# %m = new master node id
# %H = hostname of the new master node
# %M = old master node id
# %P = old primary node id
# %r = new master port number
# %R = new master database cluster path
# %% = '%' character
#------------------------------------------------------------------------------
# HEALTH CHECK GLOBAL PARAMETERS
#------------------------------------------------------------------------------
health_check_period = 5
# Health check period
# Disabled (0) by default
health_check_timeout = 50
#health_check_timeout = 10
#health_check_timeout = 0
# Health check timeout
# 0 means no timeout
health_check_user = 'postgres'
# Health check user
health_check_password = ''
# Password for health check user
# Leaving it empty will make Pgpool-II first look for the
# password in the pool_passwd file before using the empty password
health_check_database = 'postgres'
# Database name for health check. If '', tries 'postgres' first, then 'template1'
health_check_max_retries = 5
#health_check_max_retries = 3
#health_check_max_retries = 0
# Maximum number of times to retry a failed health check before giving up.
health_check_retry_delay = 1
# Amount of time to wait (in seconds) between retries.
connect_timeout = 50000
#connect_timeout = 10000
# Timeout value in milliseconds before giving up to connect to backend.
# Default is 10000 ms (10 second). Flaky network user may want to increase
# the value. 0 means no timeout.
# Note that this value is not only used for health check,
# but also for ordinary connection to backend.
#------------------------------------------------------------------------------
# HEALTH CHECK PER NODE PARAMETERS (OPTIONAL)
#------------------------------------------------------------------------------
#health_check_period0 = 0
#health_check_timeout0 = 20
#health_check_user0 = 'nobody'
#health_check_password0 = ''
#health_check_database0 = ''
#health_check_max_retries0 = 0
#health_check_retry_delay0 = 1
#connect_timeout0 = 10000
#------------------------------------------------------------------------------
# FAILOVER AND FAILBACK
#------------------------------------------------------------------------------
#failover_command = '/usr/share/pgpool/4.1.0/etc/failover.sh %d %P %H postgrestg /installer/postgresql-11.5/data/im_the_master'
failover_command = '/etc/pgpool-II/failover.sh %d %h %p %D %m %H %M %P %r %R %N %S'
#failover_command = '/etc/pgpool-II/failover.sh %d %P %H reppassword /installer/postgresql-11.5/data/im_the_master'
# Executes this command at failover
# Special values:
# %d = node id
# %h = host name
# %p = port number
# %D = database cluster path
# %m = new master node id
# %H = hostname of the new master node
# %M = old master node id
# %P = old primary node id
# %r = new master port number
# %R = new master database cluster path
# %% = '%' character
failback_command = ''
# Executes this command at failback.
# Special values:
# %d = node id
# %h = host name
# %p = port number
# %D = database cluster path
# %m = new master node id
# %H = hostname of the new master node
# %M = old master node id
# %P = old primary node id
# %r = new master port number
# %R = new master database cluster path
# %% = '%' character
failover_on_backend_error = off
# Initiates failover when reading/writing to the
# backend communication socket fails
# If set to off, pgpool will report an
# error and disconnect the session.
#detach_false_primary = on
detach_false_primary = off
# Detach false primary if on. Only
# valid in streaming replication
# mode and with PostgreSQL 9.6 or
# after.
#search_primary_node_timeout = 10
search_primary_node_timeout = 300
# Timeout in seconds to search for the
# primary node when a failover occurs.
# 0 means no timeout, keep searching
# for a primary node forever.
#------------------------------------------------------------------------------
# ONLINE RECOVERY
#------------------------------------------------------------------------------
recovery_user = 'postgres'
# Online recovery user
recovery_password = ''
# Online recovery password
# Leaving it empty will make Pgpool-II first look for the
# password in the pool_passwd file before using the empty password
recovery_1st_stage_command = 'recovery_1st_stage.sh'
# Executes a command in first stage
recovery_2nd_stage_command = ''
# Executes a command in second stage
recovery_timeout = 90
# Timeout in seconds to wait for the
# recovering node's postmaster to start up
# 0 means no wait
client_idle_limit_in_recovery = 0
# Client is disconnected after being idle
# for that many seconds in the second stage
# of online recovery
# 0 means no disconnection
# -1 means immediate disconnection
#------------------------------------------------------------------------------
# WATCHDOG
#------------------------------------------------------------------------------
# - Enabling -
use_watchdog = on
# Activates watchdog
# (change requires restart)
# -Connection to up stream servers -
trusted_servers = 'mohvcasdb01.novalocal,castestdb01.novalocal'
# trusted server list which are used
# to confirm network connection
# (hostA,hostB,hostC,...)
# (change requires restart)
ping_path = '/bin'
# ping command path
# (change requires restart)
# - Watchdog communication Settings -
wd_hostname = 'mwdp3prddm01.moh.gov.sa'
# Host name or IP address of this watchdog
# (change requires restart)
wd_port = 9000
# port number for watchdog service
# (change requires restart)
wd_priority = 1
# priority of this watchdog in leader election
# (change requires restart)
wd_authkey = ''
# Authentication key for watchdog communication
# (change requires restart)
wd_ipc_socket_dir = '/var/run/postgresql'
# Unix domain socket path for watchdog IPC socket
# The Debian package defaults to
# /var/run/postgresql
# (change requires restart)
# - Virtual IP control Setting -
delegate_IP = '10.70.185.66'
# delegate IP address
# If this is empty, the virtual IP is never brought up.
# (change requires restart)
if_cmd_path = '/sbin'
# path to the directory where if_up/down_cmd exists
# (change requires restart)
if_up_cmd = 'ip addr add $_IP_$/24 dev eth0 label eth0:0'
# startup delegate IP command
# (change requires restart)
if_down_cmd = 'ip addr del $_IP_$/24 dev eth0'
# shutdown delegate IP command
# (change requires restart)
arping_path = '/usr/sbin'
# arping command path
# (change requires restart)
arping_cmd = 'arping -U $_IP_$ -w 1 -I eth0'
# arping command
# (change requires restart)
# - Behavior on escalation Setting -
clear_memqcache_on_escalation = on
# Clear all the query cache on shared memory
# when standby pgpool escalate to active pgpool
# (= virtual IP holder).
# This should be off if client connects to pgpool
# not using virtual IP.
# (change requires restart)
wd_escalation_command = ''
# Executes this command at escalation on new active pgpool.
# (change requires restart)
wd_de_escalation_command = ''
# Executes this command when master pgpool resigns from being master.
# (change requires restart)
# - Watchdog consensus settings for failover -
failover_when_quorum_exists = on
# Only perform backend node failover
# when the watchdog cluster holds the quorum
# (change requires restart)
failover_require_consensus = on
# Perform failover when majority of Pgpool-II nodes
# agrees on the backend node status change
# (change requires restart)
allow_multiple_failover_requests_from_node = off
# A Pgpool-II node can cast multiple votes
# for building the consensus on failover
# (change requires restart)
# - Lifecheck Setting -
# -- common --
wd_monitoring_interfaces_list = ''
# if any interface from the list is active, the watchdog
# considers the network to be fine
# 'any' to enable monitoring on all interfaces except loopback
# '' to disable monitoring
# (change requires restart)
wd_lifecheck_method = 'heartbeat'
# Method of watchdog lifecheck ('heartbeat' or 'query' or 'external')
# (change requires restart)
wd_interval = 3
# lifecheck interval (sec) > 0
# (change requires restart)
# -- heartbeat mode --
wd_heartbeat_port = 9694
# Port number for receiving heartbeat signal
# (change requires restart)
wd_heartbeat_keepalive = 2
# Interval time of sending heartbeat signal (sec)
# (change requires restart)
wd_heartbeat_deadtime = 30
# Deadtime interval for heartbeat signal (sec)
# (change requires restart)
# Host name or IP address of destination 0
# for sending heartbeat signal.
# (change requires restart)
# Port number of destination 0 for sending
# heartbeat signal. Usually this is the
# same as wd_heartbeat_port.
# (change requires restart)
# Name of NIC device (such like 'eth0')
# used for sending/receiving heartbeat
# signal to/from destination 0.
# This works only when this is not empty
# and pgpool has root privilege.
# (change requires restart)
#heartbeat_destination1 = 'host0_ip2'
#heartbeat_destination_port1 = 9694
#heartbeat_device1 = ''
# -- query mode --
wd_life_point = 3
# lifecheck retry times
# (change requires restart)
wd_lifecheck_query = 'SELECT 1'
# lifecheck query to pgpool from watchdog
# (change requires restart)
wd_lifecheck_dbname = 'template1'
# Database name connected for lifecheck
# (change requires restart)
wd_lifecheck_user = 'nobody'
# watchdog user monitoring pgpools in lifecheck
# (change requires restart)
wd_lifecheck_password = ''
# Password for watchdog user in lifecheck
# Leaving it empty makes Pgpool-II first look for the
# password in the pool_passwd file before using the empty password
# (change requires restart)
# - Other pgpool Connection Settings -
# Host name or IP address to connect to for other pgpool 0
# (change requires restart)
# Port number for other pgpool 0
# (change requires restart)
# Port number for other watchdog 0
# (change requires restart)
#other_pgpool_hostname1 = 'host1'
#other_pgpool_port1 = 5432
#other_wd_port1 = 9000
#------------------------------------------------------------------------------
# OTHERS
#------------------------------------------------------------------------------
relcache_expire = 0
# Life time of relation cache in seconds.
# 0 means no cache expiration (the default).
# The relation cache caches the results of
# queries against the PostgreSQL system
# catalog, used to obtain various information
# including table structures and whether a
# table is temporary. The cache is maintained
# in a pgpool child process's local memory
# and is kept as long as the process survives.
# If someone modifies a table with ALTER TABLE
# or the like, the relcache becomes
# inconsistent. For this purpose,
# relcache_expire controls the life time of
# the cache.
relcache_size = 256
# Number of relation cache
# entries. If you frequently see
# "pool_search_relcache: cache replacement happened"
# in the pgpool log, you might want to increase this number.
check_temp_table = on
# If on, enable temporary table check in SELECT statements.
# This initiates queries against the system catalog of the primary/master,
# thus increasing the load on the master.
# If you are absolutely sure that your system never uses temporary tables
# and you want to avoid that access to the primary/master, you can turn this off.
# Default is on.
check_unlogged_table = on
# If on, enable unlogged table check in SELECT statements.
# This initiates queries against the system catalog of the primary/master,
# thus increasing the load on the master.
# If you are absolutely sure that your system never uses unlogged tables
# and you want to avoid that access to the primary/master, you can turn this off.
# Default is on.
#------------------------------------------------------------------------------
# IN MEMORY QUERY MEMORY CACHE
#------------------------------------------------------------------------------
memory_cache_enabled = off
# If on, use the memory cache functionality, off by default
# (change requires restart)
memqcache_method = 'shmem'
# Cache storage method. either 'shmem'(shared memory) or
# 'memcached'. 'shmem' by default
# (change requires restart)
memqcache_memcached_host = 'localhost'
# Memcached host name or IP address. Mandatory if
# memqcache_method = 'memcached'.
# Defaults to localhost.
# (change requires restart)
memqcache_memcached_port = 11211
# Memcached port number. Mandatory if memqcache_method = 'memcached'.
# Defaults to 11211.
# (change requires restart)
memqcache_total_size = 67108864
# Total memory size in bytes for storing memory cache.
# Mandatory if memqcache_method = 'shmem'.
# Defaults to 64MB.
# (change requires restart)
memqcache_max_num_cache = 1000000
# Total number of cache entries. Mandatory
# if memqcache_method = 'shmem'.
# Each cache entry consumes 48 bytes on shared memory.
# Defaults to 1,000,000 (45.8MB).
# (change requires restart)
memqcache_expire = 0
# Memory cache entry life time specified in seconds.
# 0 means infinite life time. 0 by default.
# (change requires restart)
memqcache_auto_cache_invalidation = on
# If on, invalidation of query cache is triggered by corresponding
# DDL/DML/DCL(and memqcache_expire). If off, it is only triggered
# by memqcache_expire. on by default.
# (change requires restart)
memqcache_maxcache = 409600
# Maximum SELECT result size in bytes.
# Must be smaller than memqcache_cache_block_size. Defaults to 400KB.
# (change requires restart)
memqcache_cache_block_size = 1048576
# Cache block size in bytes. Mandatory if memqcache_method = 'shmem'.
# Defaults to 1MB.
# (change requires restart)
memqcache_oiddir = '/var/log/pgpool/oiddir'
# Temporary work directory to record table oids
# (change requires restart)
white_memqcache_table_list = ''
# Comma separated list of table names to memcache
# that don't write to database
# Regexp are accepted
black_memqcache_table_list = ''
# Comma separated list of table names not to memcache
# that don't write to database
# Regexp are accepted
backend_hostname0 = 'mwdp3prddm01.moh.gov.sa'
backend_port0 = 5432
backend_weight0 = 1
backend_data_directory0 = '/installer/postgresql-11.5/data'
backend_flag0 = 'ALLOW_TO_FAILOVER'
#backend_flag0 = 'DISALLOW_TO_FAILOVER'
backend_hostname1 = 'mwdp3prdds01.moh.gov.sa'
backend_port1 = 5432
backend_weight1 = 1
backend_data_directory1 = '/installer/postgresql-11.5/data'
backend_flag1 = 'ALLOW_TO_FAILOVER'
#backend_flag1 = 'DISALLOW_TO_FAILOVER'
heartbeat_destination0 = 'mwdp3prdds01.moh.gov.sa'
heartbeat_destination_port0 = 9694
other_pgpool_hostname0 = 'mwdp3prdds01.moh.gov.sa'
other_pgpool_port0 = 5433
other_wd_port0 = 9000
heartbeat_device0 = 'eth0'
## Added by Raj --> parameters available from Pgpool-II 4.1 onwards
statement_level_load_balance = on
enable_consensus_with_half_votes = on
backend_application_name0 = 'mwdp3prddm01.moh.gov.sa'
backend_application_name1 = 'mwdp3prdds01.moh.gov.sa'
FAILOVER_COMMAND_FINISH_TIMEOUT = 15
|
|
|
Please provide the solution for both: (1) Why does the "postmaster on DB node 0 was shutdown by administrative command" message appear in the log, and why does failover happen while the master is up and running fine? (2) When I kill some DB sessions as the postgres user, the "postmaster on DB node 0 was shutdown by administrative command" message appears, the master DB terminates, and failover happens. |
|
|
(3) Is the pcp_attach_node command considered an administrative command when fired from root? |
|
|
> (1) Why does the "postmaster on DB node 0 was shutdown by administrative command" message appear in the log, and why does failover happen while the master is up and running fine? > (2) When I kill some DB sessions as the postgres user, the "postmaster on DB node 0 was shutdown by administrative command" message appears, the master DB terminates, and failover happens. Because you use the pg_terminate_backend() function with a non-constant argument. See the manual for more details. https://www.pgpool.net/docs/41/en/html/restrictions.html |
|
|
> (3) Is the pcp_attach_node command considered an administrative command when fired from root? No. |
|
|
Thanks, but as per the above link, "If the argument to the function (that is a process id) is a constant, you can safely use the function. In extended protocol mode, you cannot use the function though." I didn't get how to use this. For example, I use: select pg_terminate_backend(34); where '34' is a pid. How are you suggesting to use it then? |
|
|
> select pg_terminate_backend(34) ; This is fine because 34 is a constant. > select pg_terminate_backend(pid) from pg_stat_activity where pid <> pg_backend_pid() AND state in ('idle') and usename NOT IN ('replication') and state_change >= current_timestamp - interval '10 minutes' This is not fine because "pid" is not a constant. Also you have to issue the SQL via pgpool, not directly to PostgreSQL. |
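To illustrate the distinction, a hedged sketch (the session-filtering predicate is adapted from the query quoted in this thread; both statements are assumed to be issued through pgpool, not directly to PostgreSQL):

```sql
-- NOT safe via pgpool: the argument to pg_terminate_backend() is a column
-- reference, not a constant, so pgpool cannot track which pids were
-- terminated and may mistake the terminations for a postmaster shutdown.
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE pid <> pg_backend_pid()
  AND state = 'idle';

-- Safe via pgpool (simple protocol): the argument is a literal constant,
-- so pgpool can remember the pid and recognize the resulting disconnect.
SELECT pg_terminate_backend(34);  -- 34 is an example pid, not a real one
```

One workaround consistent with this restriction is to first SELECT the candidate pids, then issue a separate pg_terminate_backend() call per pid with the value inlined as a constant.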
|
|
Yes, I did the single-constant kill using select pg_terminate_backend(34) from postgres directly, not via the pgpool VIP, and it triggered failover. |
|
|
> Yes, I did the single-constant kill from postgres directly In this case there's nothing pgpool can do, i.e. pgpool cannot distinguish between a postmaster shutdown and pg_terminate_backend(), because both produce exactly the same error messages. If pg_terminate_backend() is issued via pgpool, pgpool can remember the pid passed to the function and recognize that the disconnect was caused by the function, not by a postmaster shutdown. |
|
|
OK, and if I fire a kill command at the OS level, would there be any impact? For example, "kill <pid>" at the Linux command prompt: kill 830 kill 27712 kill 21300 |
|
|
That would trigger failover in most cases. The only case that would not trigger a failover is when no pgpool session is associated with the postgres process, i.e. the postgres process is kept for connection pooling. |
|
|
Note that Pgpool-II 4.3 has a new parameter "failover_on_backend_shutdown" which, when set to off, prevents failover when an admin shuts down PostgreSQL, kills a PostgreSQL backend process, or issues pg_terminate_backend(). Please consider upgrading to 4.3. |
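For reference, the 4.3 mitigation mentioned above would be a one-line change in pgpool.conf (a sketch; the parameter name is as stated in the note, and the value shown is the non-default setting that suppresses this failover):

```
# pgpool.conf, Pgpool-II 4.3 or later
failover_on_backend_shutdown = off
                                   # off: do not trigger failover when a backend
                                   # is shut down by the administrator or a
                                   # backend process is killed/terminated
```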
|
|
May I close this issue? |
|
|
Close issue. |
| Date Modified | Username | Field | Change |
|---|---|---|---|
| 2021-07-30 20:10 | raj.pandey1982@gmail.com | New Issue | |
| 2021-07-30 20:15 | raj.pandey1982@gmail.com | Note Added: 0003910 | |
| 2021-07-30 20:15 | raj.pandey1982@gmail.com | File Added: pgpool.conf | |
| 2021-07-30 20:18 | raj.pandey1982@gmail.com | Note Added: 0003911 | |
| 2021-07-30 20:21 | raj.pandey1982@gmail.com | Note Added: 0003912 | |
| 2021-08-03 15:36 | t-ishii | Assigned To | => t-ishii |
| 2021-08-03 15:36 | t-ishii | Status | new => assigned |
| 2021-08-03 15:39 | t-ishii | Note Added: 0003915 | |
| 2021-08-03 15:39 | t-ishii | Status | assigned => feedback |
| 2021-08-03 15:40 | t-ishii | Note Added: 0003916 | |
| 2021-08-03 16:27 | raj.pandey1982@gmail.com | Note Added: 0003917 | |
| 2021-08-03 16:27 | raj.pandey1982@gmail.com | Status | feedback => assigned |
| 2021-08-03 16:38 | t-ishii | Note Added: 0003918 | |
| 2021-08-03 16:48 | raj.pandey1982@gmail.com | Note Added: 0003919 | |
| 2021-08-03 16:55 | t-ishii | Note Added: 0003920 | |
| 2021-08-09 15:49 | raj.pandey1982@gmail.com | Note Added: 0003926 | |
| 2021-08-10 08:39 | t-ishii | Note Added: 0003927 | |
| 2021-08-10 08:39 | t-ishii | Status | assigned => feedback |
| 2022-02-05 17:10 | t-ishii | Note Added: 0003992 | |
| 2022-05-19 11:28 | t-ishii | Note Added: 0004036 | |
| 2022-06-28 12:00 | administrator | Status | feedback => closed |
| 2022-06-28 12:00 | administrator | Note Added: 0004083 |