View Issue Details

ID: 0000812
Project: Pgpool-II
Category: Bug
View Status: public
Last Update: 2023-09-27 16:12
Reporter: supakit.chavar
Assigned To: pengbo
Priority: normal
Severity: minor
Reproducibility: unable to reproduce
Status: assigned
Resolution: open
Platform: CentOS 7
OS: CentOS 7
OS Version: 7.5
Product Version: 4.2.6
Summary: 0000812: failed to connect to PostgreSQL server on "customerdb-a:5432", getsockopt() failed
Description:
2023-09-26 10:27:28: [4058]: db=[No Connection],user=[No Connection],app=[unknown] LOG: new connection received
2023-09-26 10:27:28: [4058]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL: connecting host=10.102.49.39 port=49394
2023-09-26 10:27:28: [22288]: db=[No Connection],user=[No Connection],app=[unknown] LOG: frontend disconnection: session time: 0:00:33.710 user=customer_pdpa.app database=customer host=10.102.44.103 port=41678
2023-09-26 10:27:28: [22288]: db=[No Connection],user=[No Connection],app=[unknown] LOG: new connection received
2023-09-26 10:27:28: [22288]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL: connecting host=10.102.44.51 port=54394
2023-09-26 10:27:29: [22133]: db=[No Connection],user=[No Connection],app=[unknown] LOG: failed to connect to PostgreSQL server on "customerdb-a:5432", getsockopt() failed
2023-09-26 10:27:29: [22133]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL: Operation already in progress
2023-09-26 10:27:29: [659]: db=[No Connection],user=[No Connection],app=[unknown] LOG: failed to connect to PostgreSQL server on "customerdb-a:5432", getsockopt() failed
2023-09-26 10:27:29: [659]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL: Operation already in progress
2023-09-26 10:27:29: [655]: db=[No Connection],user=[No Connection],app=[unknown] LOG: failed to connect to PostgreSQL server on "customerdb-a:5432", getsockopt() failed
2023-09-26 10:27:29: [655]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL: Operation already in progress
2023-09-26 10:27:29: [1891]: db=[No Connection],user=[No Connection],app=[unknown] LOG: failed to connect to PostgreSQL server on "customerdb-a:5432", getsockopt() failed
2023-09-26 10:27:29: [1891]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL: Operation already in progress
2023-09-26 10:27:29: [659]: db=[No Connection],user=[No Connection],app=[unknown] LOG: received degenerate backend request for node_id: 0 from pid [659]
2023-09-26 10:27:29: [22133]: db=[No Connection],user=[No Connection],app=[unknown] LOG: received degenerate backend request for node_id: 0 from pid [22133]
2023-09-26 10:27:29: [655]: db=[No Connection],user=[No Connection],app=[unknown] LOG: received degenerate backend request for node_id: 0 from pid [655]
2023-09-26 10:27:29: [1891]: db=[No Connection],user=[No Connection],app=[unknown] LOG: received degenerate backend request for node_id: 0 from pid [1891]
2023-09-26 10:27:29: [3156]: db=[No Connection],user=[No Connection],app=[unknown] LOG: failed to connect to PostgreSQL server on "customerdb-a:5432", getsockopt() failed
2023-09-26 10:27:29: [3156]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL: Operation already in progress
2023-09-26 10:27:29: [3156]: db=[No Connection],user=[No Connection],app=[unknown] LOG: received degenerate backend request for node_id: 0 from pid [3156]
2023-09-26 10:27:29: [21120]: db=[No Connection],user=[No Connection],app=watchdog LOG: new IPC connection received
2023-09-26 10:27:29: [21120]: db=[No Connection],user=[No Connection],app=watchdog LOG: new IPC connection received
2023-09-26 10:27:29: [21120]: db=[No Connection],user=[No Connection],app=watchdog LOG: watchdog received the failover command from local pgpool-II on IPC interface
2023-09-26 10:27:29: [21120]: db=[No Connection],user=[No Connection],app=watchdog LOG: watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from local pgpool-II on IPC interface
2023-09-26 10:27:29: [21120]: db=[No Connection],user=[No Connection],app=watchdog LOG: failover requires the majority vote, waiting for consensus
2023-09-26 10:27:29: [21120]: db=[No Connection],user=[No Connection],app=watchdog DETAIL: failover request noted
2023-09-26 10:27:29: [21120]: db=[No Connection],user=[No Connection],app=watchdog LOG: failover command [DEGENERATE_BACKEND_REQUEST] request from pgpool-II node "pgpool-0:9999 Linux T1VMPDDB89" is queued, waiting for the confirmation from other nodes
2023-09-26 10:27:29: [21120]: db=[No Connection],user=[No Connection],app=watchdog LOG: new IPC connection received
2023-09-26 10:27:29: [21120]: db=[No Connection],user=[No Connection],app=watchdog LOG: watchdog received the failover command from local pgpool-II on IPC interface
2023-09-26 10:27:29: [21120]: db=[No Connection],user=[No Connection],app=watchdog LOG: watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from local pgpool-II on IPC interface
Steps To Reproduce: The problem resolved itself.
Additional Information: customerdb-a (PostgreSQL 13.5) was not down during the above period.


Please see the details in the attached files.
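
For readers triaging the log excerpt above, here is a minimal sketch (not part of the original report) that counts, per pgpool child PID, how many connect failures and degenerate backend requests were logged. It assumes the `log_line_prefix` from the attached pgpool.conf (`'%t: [%p]: db=%d,user=%u,app=%a '`). The names `LINE_RE` and `summarize` are hypothetical, and the sample lines are copied from the description.

```python
import re
from collections import Counter

# Assumed line layout, derived from log_line_prefix in the attached pgpool.conf:
#   '%t: [%p]: db=%d,user=%u,app=%a ' followed by 'LEVEL: message'
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}): \[(?P<pid>\d+)\]: "
    r"db=(?P<db>.*?),user=(?P<user>.*?),app=(?P<app>\S+) "
    r"(?P<level>[A-Z]+): (?P<msg>.*)$"
)


def summarize(lines):
    """Return (failures, degenerate): per-PID counts of the two message kinds."""
    failures = Counter()
    degenerate = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # not a prefixed log line; ignore it
        pid = int(m.group("pid"))
        msg = m.group("msg")
        if msg.startswith("failed to connect to PostgreSQL server"):
            failures[pid] += 1
        elif msg.startswith("received degenerate backend request"):
            degenerate[pid] += 1
    return failures, degenerate


# Sample lines taken verbatim from the description above.
sample = [
    '2023-09-26 10:27:29: [22133]: db=[No Connection],user=[No Connection],app=[unknown] LOG: failed to connect to PostgreSQL server on "customerdb-a:5432", getsockopt() failed',
    '2023-09-26 10:27:29: [22133]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL: Operation already in progress',
    '2023-09-26 10:27:29: [659]: db=[No Connection],user=[No Connection],app=[unknown] LOG: failed to connect to PostgreSQL server on "customerdb-a:5432", getsockopt() failed',
    '2023-09-26 10:27:29: [659]: db=[No Connection],user=[No Connection],app=[unknown] LOG: received degenerate backend request for node_id: 0 from pid [659]',
]

failures, degenerate = summarize(sample)
print("connect failures per pid:", dict(failures))
print("degenerate requests per pid:", dict(degenerate))
```

Running the same function over the full log file (for example, `summarize(open(path))` with a path of your choosing) would show whether the failures are confined to a handful of child processes or spread across many of them.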
Tags: No tags attached.

Activities

supakit.chavar

2023-09-26 15:18

reporter  

pool_nodes.txt (824 bytes)   
# psql -h localhost -U pgcheck -d postgres --pset pager=off -c "show pool_nodes"
 node_id |   hostname   | port | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change  
---------+--------------+------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
 0       | customerdb-a | 5432 | up     | 0.500000  | primary | 37720815   | true              | 0                 |                   |                        | 2023-09-26 12:58:24
 1       | customerdb-b | 5432 | down   | 0.500000  | standby | 0          | false             | 0                 |                   |                        | 2023-09-21 00:33:43
(2 rows)
hosts.txt (136 bytes)   
10.102.46.94    customerdb-a
10.102.46.95    customerdb-b
10.102.46.175   pgpool-0
10.102.46.176   pgpool-1
10.102.46.177   pgpool-2
pgpool.conf (48,377 bytes)   
# ----------------------------
# pgPool-II configuration file
# ----------------------------
#
# This file consists of lines of the form:
#
#   name = value
#
# Whitespace may be used.  Comments are introduced with "#" anywhere on a line.
# The complete list of parameter names and allowed values can be found in the
# pgPool-II documentation.
#
# This file is read on server startup and when the server receives a SIGHUP
# signal.  If you edit the file on a running system, you have to SIGHUP the
# server for the changes to take effect, or use "pgpool reload".  Some
# parameters, which are marked below, require a server shutdown and restart to
# take effect.
#

#------------------------------------------------------------------------------
# BACKEND CLUSTERING MODE
# Choose one of: 'streaming_replication', 'native_replication',
#	'logical_replication', 'slony', 'raw' or 'snapshot_isolation'
# (change requires restart)
#------------------------------------------------------------------------------

backend_clustering_mode = 'streaming_replication'

#------------------------------------------------------------------------------
# CONNECTIONS
#------------------------------------------------------------------------------

# - pgpool Connection Settings -

#listen_addresses = 'localhost'
listen_addresses = '*'
                                   # Host name or IP address to listen on:
                                   # '*' for all, '' for no TCP/IP connections
                                   # (change requires restart)
#port = 9999
port = 5432
                                   # Port number
                                   # (change requires restart)
socket_dir = '/var/run/postgresql'
                                   # Unix domain socket path
                                   # The Debian package defaults to
                                   # /var/run/postgresql
                                   # (change requires restart)
reserved_connections = 0
                                   # Number of reserved connections.
                                   # Pgpool-II does not accept connections if over
                                   # num_init_children - reserved_connections.


# - pgpool Communication Manager Connection Settings -

pcp_listen_addresses = '*'
                                   # Host name or IP address for pcp process to listen on:
                                   # '*' for all, '' for no TCP/IP connections
                                   # (change requires restart)
pcp_port = 9898
                                   # Port number for pcp
                                   # (change requires restart)
pcp_socket_dir = '/var/run/postgresql'
                                   # Unix domain socket path for pcp
                                   # The Debian package defaults to
                                   # /var/run/postgresql
                                   # (change requires restart)
listen_backlog_multiplier = 2
                                   # Set the backlog parameter of listen(2) to
                                   # num_init_children * listen_backlog_multiplier.
                                   # (change requires restart)
serialize_accept = off
                                   # whether to serialize accept() call to avoid thundering herd problem
                                   # (change requires restart)

# - Backend Connection Settings -

backend_hostname0 = 'customerdb-a'
                                   # Host name or IP address to connect to for backend 0
backend_port0 = 5432
                                   # Port number for backend 0
backend_weight0 = 1
                                   # Weight for backend 0 (only in load balancing mode)
backend_data_directory0 = '/data/pgdata13'
                                   # Data directory for backend 0
backend_flag0 = 'ALLOW_TO_FAILOVER'
                                   # Controls various backend behavior
                                   # ALLOW_TO_FAILOVER, DISALLOW_TO_FAILOVER
                                   # or ALWAYS_PRIMARY
backend_application_name0 = 'DBserver0'
                                   # walsender's application_name, used for "show pool_nodes" command
backend_hostname1 = 'customerdb-b'
backend_port1 = 5432
backend_weight1 = 1
backend_data_directory1 = '/data/pgdata13'
backend_flag1 = 'ALLOW_TO_FAILOVER'
backend_application_name1 = 'DBserver1'

# - Authentication -

#enable_pool_hba = off
enable_pool_hba = on
                                   # Use pool_hba.conf for client authentication
pool_passwd = 'pool_passwd'
                                   # File name of pool_passwd for md5 authentication.
                                   # "" disables pool_passwd.
                                   # (change requires restart)
authentication_timeout = 1min
                                   # Delay in seconds to complete client authentication
                                   # 0 means no timeout.

allow_clear_text_frontend_auth = off
                                   # Allow Pgpool-II to use clear text password authentication
                                   # with clients, when pool_passwd does not
                                   # contain the user password

# - SSL Connections -

ssl = off
                                   # Enable SSL support
                                   # (change requires restart)
#ssl_key = 'server.key'
                                   # SSL private key file
                                   # (change requires restart)
#ssl_cert = 'server.crt'
                                   # SSL public certificate file
                                   # (change requires restart)
#ssl_ca_cert = ''
                                   # Single PEM format file containing
                                   # CA root certificate(s)
                                   # (change requires restart)
#ssl_ca_cert_dir = ''
                                   # Directory containing CA root certificate(s)
                                   # (change requires restart)
#ssl_crl_file = ''
                                   # SSL certificate revocation list file
                                   # (change requires restart)

ssl_ciphers = 'HIGH:MEDIUM:+3DES:!aNULL'
                                   # Allowed SSL ciphers
                                   # (change requires restart)
ssl_prefer_server_ciphers = off
                                   # Use server's SSL cipher preferences,
                                   # rather than the client's
                                   # (change requires restart)
ssl_ecdh_curve = 'prime256v1'
                                   # Name of the curve to use in ECDH key exchange
ssl_dh_params_file = ''
                                   # Name of the file containing Diffie-Hellman parameters used
                                   # for so-called ephemeral DH family of SSL cipher.
#ssl_passphrase_command=''
                                   # Sets an external command to be invoked when a passphrase
                                   # for decrypting an SSL file needs to be obtained
                                   # (change requires restart)

#------------------------------------------------------------------------------
# POOLS
#------------------------------------------------------------------------------

# - Concurrent session and pool size -

num_init_children = 1200
                                   # Number of concurrent sessions allowed
                                   # (change requires restart)
max_pool = 4
                                   # Number of connection pool caches per connection
                                   # (change requires restart)

# - Life time -

#child_life_time = 5min
child_life_time = 600
                                   # Pool exits after being idle for this many seconds
child_max_connections = 0
                                   # Pool exits after receiving that many connections
                                   # 0 means no exit
connection_life_time = 0
                                   # Connection to backend closes after being idle for this many seconds
                                   # 0 means no close
#client_idle_limit = 0
client_idle_limit = 600
                                   # Client is disconnected after being idle for that many seconds
                                   # (even inside an explicit transactions!)
                                   # 0 means no disconnection


#------------------------------------------------------------------------------
# LOGS
#------------------------------------------------------------------------------

# - Where to log -

log_destination = 'stderr'
                                   # Where to log
                                   # Valid values are combinations of stderr,
                                   # and syslog. Default to stderr.

# - What to log -

#log_line_prefix = '%t: pid %p: '   # printf-style string to output at beginning of each log line.
log_line_prefix = '%t: [%p]: db=%d,user=%u,app=%a '   # printf-style string to output at beginning of each log line.

#log_connections = off
log_connections = on
                                   # Log connections
#log_disconnections = off
log_disconnections = on
                                   # Log disconnections
log_hostname = off
                                   # Hostname will be shown in ps status
                                   # and in logs if connections are logged
log_statement = off
#log_statement = on
                                   # Log all statements
log_per_node_statement = off
                                   # Log all statements
                                   # with node and backend information
log_client_messages = off
                                   # Log any client messages
log_standby_delay = 'if_over_threshold'
                                   # Log standby delay
                                   # Valid values are combinations of always,
                                   # if_over_threshold, none

# - Syslog specific -

syslog_facility = 'LOCAL0'
                                   # Syslog local facility. Default to LOCAL0
syslog_ident = 'pgpool'
                                   # Syslog program identification string
                                   # Default to 'pgpool'

# - Debug -

#log_error_verbosity = default          # terse, default, or verbose messages

#client_min_messages = notice           # values in order of decreasing detail:
                                        #   debug5
                                        #   debug4
                                        #   debug3
                                        #   debug2
                                        #   debug1
                                        #   log
                                        #   notice
                                        #   warning
                                        #   error

#log_min_messages = warning             # values in order of decreasing detail:
                                        #   debug5
                                        #   debug4
                                        #   debug3
                                        #   debug2
                                        #   debug1
                                        #   info
                                        #   notice
                                        #   warning
                                        #   error
                                        #   log
                                        #   fatal
                                        #   panic

# This is used when logging to stderr:
#logging_collector = off
logging_collector = on
                                        # Enable capturing of stderr
                                        # into log files.
                                        # (change requires restart)

# -- Only used if logging_collector is on ---

#log_directory = '/tmp/pgpool_logs'
log_directory = '/var/log/pgpool'
                                        # directory where log files are written,
                                        # can be absolute
#log_filename = 'pgpool-%Y-%m-%d_%H%M%S.log'
log_filename = 'pgpool-%Y-%m-%d_%H%M%S.log'
                                        # log file name pattern,
                                        # can include strftime() escapes

#log_file_mode = 0600
log_file_mode = 0600
                                        # creation mode for log files,
                                        # begin with 0 to use octal notation

#log_truncate_on_rotation = off
log_truncate_on_rotation = on
                                        # If on, an existing log file with the
                                        # same name as the new log file will be
                                        # truncated rather than appended to.
                                        # But such truncation only occurs on
                                        # time-driven rotation, not on restarts
                                        # or size-driven rotation.  Default is
                                        # off, meaning append to existing files
                                        # in all cases.

#log_rotation_age = 1d
log_rotation_age = 1d
                                        # Automatic rotation of logfiles will
                                        # happen after that much time (minutes).
                                        # 0 disables time based rotation.
#log_rotation_size = 10MB
log_rotation_size = 40MB
                                        # Automatic rotation of logfiles will
                                        # happen after that much (KB) log output.
                                        # 0 disables size based rotation.
#------------------------------------------------------------------------------
# FILE LOCATIONS
#------------------------------------------------------------------------------

pid_file_name = '/var/run/pgpool/pgpool.pid'
                                   # PID file name
                                   # Can be specified as relative to the
                                   # location of pgpool.conf file or
                                   # as an absolute path
                                   # (change requires restart)
#logdir = '/tmp'
logdir = '/var/log/pgpool'
                                   # Directory of pgPool status file
                                   # (change requires restart)


#------------------------------------------------------------------------------
# CONNECTION POOLING
#------------------------------------------------------------------------------

connection_cache = on
                                   # Activate connection pools
                                   # (change requires restart)

                                   # Semicolon separated list of queries
                                   # to be issued at the end of a session
                                   # The default is for 8.3 and later
reset_query_list = 'ABORT; DISCARD ALL'
                                   # The following one is for 8.2 and before
#reset_query_list = 'ABORT; RESET ALL; SET SESSION AUTHORIZATION DEFAULT'


#------------------------------------------------------------------------------
# REPLICATION MODE
#------------------------------------------------------------------------------

replicate_select = off
                                   # Replicate SELECT statements
                                   # when in replication mode
                                   # replicate_select is higher priority than
                                   # load_balance_mode.

insert_lock = off
                                   # Automatically locks a dummy row or a table
                                   # with INSERT statements to keep SERIAL data
                                   # consistency
                                   # Without SERIAL, no lock will be issued
lobj_lock_table = ''
                                   # When rewriting lo_creat command in
                                   # replication mode, specify table name to
                                   # lock

# - Degenerate handling -

replication_stop_on_mismatch = off
                                   # On disagreement with the packet kind
                                   # sent from backend, degenerate the node
                                   # which is most likely "minority"
                                   # If off, just force to exit this session

failover_if_affected_tuples_mismatch = off
                                   # On disagreement with the number of affected
                                   # tuples in UPDATE/DELETE queries, then
                                   # degenerate the node which is most likely
                                   # "minority".
                                   # If off, just abort the transaction to
                                   # keep the consistency


#------------------------------------------------------------------------------
# LOAD BALANCING MODE
#------------------------------------------------------------------------------

#load_balance_mode = on
load_balance_mode = off
                                   # Activate load balancing mode
                                   # (change requires restart)
ignore_leading_white_space = on
                                   # Ignore leading white spaces of each query
read_only_function_list = ''
                                   # Comma separated list of function names
                                   # that don't write to database
                                   # Regexp are accepted
write_function_list = ''
                                   # Comma separated list of function names
                                   # that write to database
                                   # Regexp are accepted
                                   # If both read_only_function_list and write_function_list
                                   # are empty, function's volatile property is checked.
                                   # If it's volatile, the function is regarded as a
                                   # writing function.

primary_routing_query_pattern_list = ''
                                   # Semicolon separated list of query patterns
                                   # that should be sent to primary node
                                   # Regexp are accepted
                                   # valid for streaming replication mode only.

database_redirect_preference_list = ''
                                   # comma separated list of pairs of database and node id.
                                   # example: 'postgres:primary,mydb[0-4]:1,mydb[5-9]:2'
                                   # valid for streaming replication mode only.

app_name_redirect_preference_list = ''
                                   # comma separated list of pairs of app name and node id.
                                   # example: 'psql:primary,myapp[0-4]:1,myapp[5-9]:standby'
                                   # valid for streaming replication mode only.
allow_sql_comments = off
                                   # if on, ignore SQL comments when judging if load balance or
                                   # query cache is possible.
                                   # If off, SQL comments effectively prevent the judgment
                                   # (pre 3.4 behavior).

disable_load_balance_on_write = 'transaction'
                                   # Load balance behavior when write query is issued
                                   # in an explicit transaction.
                                   #
                                   # Valid values:
                                   #
                                   # 'transaction' (default):
                                   #     if a write query is issued, subsequent
                                   #     read queries will not be load balanced
                                   #     until the transaction ends.
                                   #
                                   # 'trans_transaction':
                                   #     if a write query is issued, subsequent
                                   #     read queries in an explicit transaction
                                   #     will not be load balanced until the session ends.
                                   #
                                   # 'dml_adaptive':
                                   #     Queries on the tables that have already been
                                   #     modified within the current explicit transaction will
                                   #     not be load balanced until the end of the transaction.
                                   #
                                   # 'always':
                                   #     if a write query is issued, read queries will
                                   #     not be load balanced until the session ends.
                                   #
                                   # Note that any query not in an explicit transaction
                                   # is not affected by the parameter except 'always'.

dml_adaptive_object_relationship_list= ''
                                   # comma separated list of object pairs
                                   # [object]:[dependent-object], to disable load balancing
                                   # of dependent objects within the explicit transaction
                                   # after WRITE statement is issued on (depending-on) object.
                                   #
                                   # example: 'tb_t1:tb_t2,insert_tb_f_func():tb_f,tb_v:my_view'
                                   # Note: function name in this list must also be present in
                                   # the write_function_list
                                   # only valid for disable_load_balance_on_write = 'dml_adaptive'.

statement_level_load_balance = off
                                   # Enables statement level load balancing

#------------------------------------------------------------------------------
# NATIVE REPLICATION MODE
#------------------------------------------------------------------------------

# - Streaming -

sr_check_period = 10
                                   # Streaming replication check period
                                   # Disabled (0) by default
#sr_check_user = 'nobody'
sr_check_user = 'pgcheck'
                                   # Streaming replication check user
                                   # This is necessary even if you disable streaming
                                   # replication delay check by sr_check_period = 0
sr_check_password = ''
                                   # Password for streaming replication check user
                                   # Leaving it empty makes Pgpool-II first look for the
                                   # password in the pool_passwd file before using the empty password

sr_check_database = 'postgres'
                                   # Database name for streaming replication check
delay_threshold = 10000000
                                   # Threshold before not dispatching query to standby node
                                   # Unit is in bytes
                                   # Disabled (0) by default

# - Special commands -

#follow_primary_command = ''
follow_primary_command = '/etc/pgpool-II/follow_primary.sh %d %h %p %D %m %H %M %P %r %R'
                                   # Executes this command after main node failover
                                   # Special values:
                                   #   %d = failed node id
                                   #   %h = failed node host name
                                   #   %p = failed node port number
                                   #   %D = failed node database cluster path
                                   #   %m = new main node id
                                   #   %H = new main node hostname
                                   #   %M = old main node id
                                   #   %P = old primary node id
                                   #   %r = new main port number
                                   #   %R = new main database cluster path
                                   #   %N = old primary node hostname
                                   #   %S = old primary node port number
                                   #   %% = '%' character

#------------------------------------------------------------------------------
# HEALTH CHECK GLOBAL PARAMETERS
#------------------------------------------------------------------------------

#health_check_period = 0
health_check_period = 5
                                   # Health check period
                                   # Disabled (0) by default
#health_check_timeout = 20
health_check_timeout = 30
                                   # Health check timeout
                                   # 0 means no timeout
#health_check_user = 'nobody'
health_check_user = 'pgcheck'
                                   # Health check user
health_check_password = ''
                                   # Password for health check user
                                   # Leaving it empty makes Pgpool-II first look for the
                                   # password in the pool_passwd file before using the empty password

health_check_database = ''
                                   # Database name for health check. If '', tries 'postgres' first, then 'template1'.
#health_check_max_retries = 0
health_check_max_retries = 5
                                   # Maximum number of times to retry a failed health check before giving up.
health_check_retry_delay = 1
                                   # Amount of time to wait (in seconds) between retries.
#connect_timeout = 10000
connect_timeout = 60000
                                   # Timeout value in milliseconds before giving up connecting to the backend.
                                   # Default is 10000 ms (10 seconds). Users on a flaky network may want to increase
                                   # the value. 0 means no timeout.
                                   # Note that this value is used not only for health checks,
                                   # but also for ordinary connections to the backend.

#------------------------------------------------------------------------------
# HEALTH CHECK PER NODE PARAMETERS (OPTIONAL)
#------------------------------------------------------------------------------
#health_check_period0 = 0
#health_check_timeout0 = 20
#health_check_user0 = 'nobody'
#health_check_password0 = ''
#health_check_database0 = ''
#health_check_max_retries0 = 0
#health_check_retry_delay0 = 1
#connect_timeout0 = 10000

#------------------------------------------------------------------------------
# FAILOVER AND FAILBACK
#------------------------------------------------------------------------------

#failover_command = ''
failover_command = '/etc/pgpool-II/failover.sh %d %h %p %D %m %H %M %P %r %R %N %S'
                                   # Executes this command at failover
                                   # Special values:
                                   #   %d = failed node id
                                   #   %h = failed node host name
                                   #   %p = failed node port number
                                   #   %D = failed node database cluster path
                                   #   %m = new main node id
                                   #   %H = new main node hostname
                                   #   %M = old main node id
                                   #   %P = old primary node id
                                   #   %r = new main port number
                                   #   %R = new main database cluster path
                                   #   %N = old primary node hostname
                                   #   %S = old primary node port number
                                   #   %% = '%' character
failback_command = ''
                                   # Executes this command at failback.
                                   # Special values:
                                   #   %d = failed node id
                                   #   %h = failed node host name
                                   #   %p = failed node port number
                                   #   %D = failed node database cluster path
                                   #   %m = new main node id
                                   #   %H = new main node hostname
                                   #   %M = old main node id
                                   #   %P = old primary node id
                                   #   %r = new main port number
                                   #   %R = new main database cluster path
                                   #   %N = old primary node hostname
                                   #   %S = old primary node port number
                                   #   %% = '%' character

failover_on_backend_error = on
                                   # Initiates failover when reading/writing to the
                                   # backend communication socket fails
                                   # If set to off, pgpool will report an
                                   # error and disconnect the session.

detach_false_primary = off
                                   # Detach false primary if on. Only
                                   # valid in streaming replication
                                   # mode and with PostgreSQL 9.6 or
                                   # later.

search_primary_node_timeout = 5min
                                   # Timeout in seconds to search for the
                                   # primary node when a failover occurs.
                                   # 0 means no timeout, keep searching
                                   # for a primary node forever.

#------------------------------------------------------------------------------
# ONLINE RECOVERY
#------------------------------------------------------------------------------

recovery_user = 'nobody'
                                   # Online recovery user
recovery_password = ''
                                   # Online recovery password
                                   # Leaving it empty makes Pgpool-II first look for the
                                   # password in the pool_passwd file before using the empty password

recovery_1st_stage_command = ''
                                   # Executes a command in first stage
recovery_2nd_stage_command = ''
                                   # Executes a command in second stage
recovery_timeout = 90
                                   # Timeout in seconds to wait for the
                                   # recovering node's postmaster to start up
                                   # 0 means no wait
client_idle_limit_in_recovery = 0
                                   # Client is disconnected after being idle
                                   # for that many seconds in the second stage
                                   # of online recovery
                                   # 0 means no disconnection
                                   # -1 means immediate disconnection

auto_failback = off
                                   # A detached backend node is reattached automatically
                                   # if replication_state is 'streaming'.
auto_failback_interval = 1min
                                   # Minimum interval between auto_failback executions,
                                   # in seconds.

#------------------------------------------------------------------------------
# WATCHDOG
#------------------------------------------------------------------------------

# - Enabling -

#use_watchdog = off
use_watchdog = on
                                    # Activates watchdog
                                    # (change requires restart)

# - Connection to upstream servers -

trusted_servers = ''
                                    # trusted server list which are used
                                    # to confirm network connection
                                    # (hostA,hostB,hostC,...)
                                    # (change requires restart)
ping_path = '/bin'
                                    # ping command path
                                    # (change requires restart)

# - Watchdog communication Settings -

hostname0 = 'pgpool-0'
                                    # Host name or IP address of pgpool node
                                    # for watchdog connection
                                    # (change requires restart)
wd_port0 = 9000
                                    # Port number for watchdog service
                                    # (change requires restart)
pgpool_port0 = 9999
                                    # Port number for pgpool
                                    # (change requires restart)

hostname1 = 'pgpool-1'
wd_port1 = 9000
pgpool_port1 = 9999

hostname2 = 'pgpool-2'
wd_port2 = 9000
pgpool_port2 = 9999

wd_priority = 1
                                    # priority of this watchdog in leader election
                                    # (change requires restart)

wd_authkey = ''
                                    # Authentication key for watchdog communication
                                    # (change requires restart)

wd_ipc_socket_dir = '/var/run/postgresql'
                                    # Unix domain socket path for watchdog IPC socket
                                    # The Debian package defaults to
                                    # /var/run/postgresql
                                    # (change requires restart)


# - Virtual IP control Setting -

#delegate_IP = ''
delegate_IP = '10.102.46.100'
                                    # delegate IP address
                                    # If this is empty, the virtual IP is never brought up.
                                    # (change requires restart)
if_cmd_path = '/sbin'
                                    # path to the directory where if_up/down_cmd exists
                                    # If if_up/down_cmd starts with "/", if_cmd_path will be ignored.
                                    # (change requires restart)
#if_up_cmd = '/usr/bin/sudo /sbin/ip addr add $_IP_$/24 dev eth0 label eth0:0'
if_up_cmd = '/bin/sudo /sbin/ip addr add $_IP_$/24 dev ens192 label ens192:0'
                                    # startup delegate IP command
                                    # (change requires restart)
#if_down_cmd = '/usr/bin/sudo /sbin/ip addr del $_IP_$/24 dev eth0'
if_down_cmd = '/bin/sudo /sbin/ip addr del $_IP_$/24 dev ens192'
                                    # shutdown delegate IP command
                                    # (change requires restart)
arping_path = '/usr/sbin'
                                    # arping command path
                                    # If arping_cmd starts with "/", arping_path will be ignored.
                                    # (change requires restart)
#arping_cmd = '/usr/bin/sudo /usr/sbin/arping -U $_IP_$ -w 1 -I eth0'
arping_cmd = '/bin/sudo /usr/sbin/arping -U $_IP_$ -w 1 -I ens192'
                                    # arping command
                                    # (change requires restart)

# - Behavior on escalation Setting -

clear_memqcache_on_escalation = on
                                    # Clear all the query cache on shared memory
                                    # when a standby pgpool escalates to active pgpool
                                    # (= virtual IP holder).
                                    # This should be off if clients connect to pgpool
                                    # without using the virtual IP.
                                    # (change requires restart)
#wd_escalation_command = ''
wd_escalation_command = '/etc/pgpool-II/escalation.sh'
                                    # Executes this command at escalation on new active pgpool.
                                    # (change requires restart)
wd_de_escalation_command = ''
                                    # Executes this command when leader pgpool resigns from being leader.
                                    # (change requires restart)

# - Watchdog consensus settings for failover -

failover_when_quorum_exists = on
                                    # Only perform backend node failover
                                    # when the watchdog cluster holds the quorum
                                    # (change requires restart)

failover_require_consensus = on
                                    # Perform failover when a majority of Pgpool-II nodes
                                    # agree on the backend node status change
                                    # (change requires restart)

allow_multiple_failover_requests_from_node = off
                                    # A Pgpool-II node can cast multiple votes
                                    # for building the consensus on failover
                                    # (change requires restart)


enable_consensus_with_half_votes = off
                                    # Apply the majority rule for consensus and quorum computation
                                    # at 50% of votes in a cluster with an even number of nodes.
                                    # When enabled, the existence of quorum and consensus
                                    # on failover is resolved after receiving half of the
                                    # total votes in the cluster; otherwise both these
                                    # decisions require at least one more vote than
                                    # half of the total votes.
                                    # (change requires restart)

# - Lifecheck Setting -

# -- common --

wd_monitoring_interfaces_list = ''
                                    # Comma separated list of interfaces names to monitor.
                                    # if any interface from the list is active the watchdog will
                                    # consider the network is fine
                                    # 'any' to enable monitoring on all interfaces except loopback
                                    # '' to disable monitoring
                                    # (change requires restart)

wd_lifecheck_method = 'heartbeat'
                                    # Method of watchdog lifecheck ('heartbeat' or 'query' or 'external')
                                    # (change requires restart)
wd_interval = 10
                                    # lifecheck interval (sec) > 0
                                    # (change requires restart)

# -- heartbeat mode --

heartbeat_hostname0 = 'pgpool-0'
                                    # Host name or IP address used
                                    # for sending heartbeat signal.
                                    # (change requires restart)
heartbeat_port0 = 9694
                                    # Port number used for receiving/sending heartbeat signal
                                    # Usually this is the same as heartbeat_portX.
                                    # (change requires restart)
heartbeat_device0 = ''
                                    # Name of NIC device (such as 'eth0')
                                    # used for sending/receiving heartbeat
                                    # signal to/from destination 0.
                                    # This works only when this is not empty
                                    # and pgpool has root privilege.
                                    # (change requires restart)

heartbeat_hostname1 = 'pgpool-1'
heartbeat_port1 = 9694
heartbeat_device1 = ''
heartbeat_hostname2 = 'pgpool-2'
heartbeat_port2 = 9694
heartbeat_device2 = ''

wd_heartbeat_keepalive = 2
                                    # Interval time of sending heartbeat signal (sec)
                                    # (change requires restart)
wd_heartbeat_deadtime = 30
                                    # Deadtime interval for heartbeat signal (sec)
                                    # (change requires restart)

# -- query mode --

wd_life_point = 3
                                    # lifecheck retry times
                                    # (change requires restart)
wd_lifecheck_query = 'SELECT 1'
                                    # lifecheck query to pgpool from watchdog
                                    # (change requires restart)
wd_lifecheck_dbname = 'template1'
                                    # Database name connected for lifecheck
                                    # (change requires restart)
wd_lifecheck_user = 'nobody'
                                    # watchdog user monitoring pgpools in lifecheck
                                    # (change requires restart)
wd_lifecheck_password = ''
                                    # Password for watchdog user in lifecheck
                                    # Leaving it empty makes Pgpool-II first look for the
                                    # password in the pool_passwd file before using the empty password
                                    # (change requires restart)

#------------------------------------------------------------------------------
# OTHERS
#------------------------------------------------------------------------------
relcache_expire = 0
                                   # Life time of relation cache in seconds.
                                   # 0 means no cache expiration (the default).
                                   # The relation cache is used to cache
                                   # query results against the PostgreSQL system
                                   # catalog to obtain various information
                                   # including table structures or whether a table is a
                                   # temporary table or not. The cache is
                                   # maintained in a pgpool child's local memory
                                   # and is kept as long as the child survives.
                                   # If someone modifies the table by using
                                   # ALTER TABLE or some such, the relcache is
                                   # not consistent anymore.
                                   # For this purpose, relcache_expire
                                   # controls the life time of the cache.
relcache_size = 256
                                   # Number of relation cache
                                   # entries. If you frequently see:
                                   # "pool_search_relcache: cache replacement happend"
                                   # in the pgpool log, you might want to increase this number.

check_temp_table = catalog
                                   # Temporary table check method. catalog, trace or none.
                                   # Default is catalog.

check_unlogged_table = on
                                   # If on, enable unlogged table check in SELECT statements.
                                   # This initiates queries against system catalog of primary/main
                                   # thus increases load of primary.
                                   # If you are absolutely sure that your system never uses unlogged tables
                                   # and you want to save access to primary/main, you could turn this off.
                                   # Default is on.
enable_shared_relcache = on
                                   # If on, the relation cache is stored in the memory cache
                                   # and shared among child processes.
                                   # Default is on.
                                   # (change requires restart)

relcache_query_target = primary
                                   # Target node to send relcache queries. Default is primary node.
                                   # If load_balance_node is specified, queries will be sent to load balance node.
#------------------------------------------------------------------------------
# IN MEMORY QUERY MEMORY CACHE
#------------------------------------------------------------------------------
memory_cache_enabled = off
                                   # If on, use the memory cache functionality, off by default
                                   # (change requires restart)
memqcache_method = 'shmem'
                                   # Cache storage method. either 'shmem'(shared memory) or
                                   # 'memcached'. 'shmem' by default
                                   # (change requires restart)
memqcache_memcached_host = 'localhost'
                                   # Memcached host name or IP address. Mandatory if
                                   # memqcache_method = 'memcached'.
                                   # Defaults to localhost.
                                   # (change requires restart)
memqcache_memcached_port = 11211
                                   # Memcached port number. Mandatory if memqcache_method = 'memcached'.
                                   # Defaults to 11211.
                                   # (change requires restart)
memqcache_total_size = 64MB
                                   # Total memory size in bytes for storing memory cache.
                                   # Mandatory if memqcache_method = 'shmem'.
                                   # Defaults to 64MB.
                                   # (change requires restart)
memqcache_max_num_cache = 1000000
                                   # Total number of cache entries. Mandatory
                                   # if memqcache_method = 'shmem'.
                                   # Each cache entry consumes 48 bytes on shared memory.
                                   # Defaults to 1,000,000(45.8MB).
                                   # (change requires restart)
memqcache_expire = 0
                                   # Memory cache entry life time specified in seconds.
                                   # 0 means infinite life time. 0 by default.
                                   # (change requires restart)
memqcache_auto_cache_invalidation = on
                                   # If on, invalidation of query cache is triggered by corresponding
                                   # DDL/DML/DCL(and memqcache_expire).  If off, it is only triggered
                                   # by memqcache_expire.  on by default.
                                   # (change requires restart)
memqcache_maxcache = 400kB
                                   # Maximum SELECT result size in bytes.
                                   # Must be smaller than memqcache_cache_block_size. Defaults to 400KB.
                                   # (change requires restart)
memqcache_cache_block_size = 1MB
                                   # Cache block size in bytes. Mandatory if memqcache_method = 'shmem'.
                                   # Defaults to 1MB.
                                   # (change requires restart)
memqcache_oiddir = '/var/log/pgpool/oiddir'
                                   # Temporary work directory to record table oids
                                   # (change requires restart)
cache_safe_memqcache_table_list = ''
                                   # Comma separated list of table names to memcache
                                   # that don't write to database
                                   # Regexp are accepted
cache_unsafe_memqcache_table_list = ''
                                   # Comma separated list of table names not to memcache
                                   # that don't write to database
                                   # Regexp are accepted
pgpool.conf (48,377 bytes)   
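
For reference, the health check settings in the pgpool.conf above (health_check_timeout = 30, health_check_max_retries = 5, health_check_retry_delay = 1) bound how long a failing backend can go undetected. The sketch below is only a rough upper bound, assuming every attempt blocks for the full health_check_timeout before failing; the function name is hypothetical and this is not how Pgpool-II itself computes anything.

```python
def worst_case_detection_seconds(timeout: int, max_retries: int, retry_delay: int) -> int:
    """Rough upper bound, in seconds, from the start of the first failing
    health check attempt until all retries are exhausted.

    Assumption: every attempt blocks for the full timeout before failing.
    """
    attempts = max_retries + 1  # the initial attempt plus each retry
    return attempts * timeout + max_retries * retry_delay


# Values copied from the pgpool.conf above.
print(worst_case_detection_seconds(timeout=30, max_retries=5, retry_delay=1))  # 185
```

With the commented-out defaults from the same file (health_check_timeout = 20, health_check_max_retries = 0) the same estimate gives 20 seconds.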

pengbo

2023-09-27 12:19

developer   ~0004432

It seems pgpool-0 could not connect to customerdb-a.
Could you make sure you can connect to customerdb-a from pgpool-0?
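
One way to check this from pgpool-0 without involving Pgpool-II is a small script like the following. It is only a sketch, assuming Python 3 is available on a POSIX host and IPv4 is used; try_connect is a hypothetical helper, not part of Pgpool-II. It follows the usual non-blocking pattern (start the connect, wait for the socket to become writable, then read SO_ERROR with getsockopt()), which is the step the "getsockopt() failed" message in the log appears to refer to.

```python
import errno
import os
import select
import socket


def try_connect(host: str, port: int, timeout: float = 5.0) -> str:
    """Attempt one non-blocking TCP connect and report the outcome.

    Returns "ok" on success, "timeout" if the socket never became writable,
    or the OS error text (for example "Connection refused") otherwise.
    """
    try:
        # Assumption: IPv4 only, first address returned by the resolver.
        family, socktype, proto, _, addr = socket.getaddrinfo(
            host, port, socket.AF_INET, socket.SOCK_STREAM
        )[0]
    except socket.gaierror as exc:
        return f"resolve failed: {exc}"

    sock = socket.socket(family, socktype, proto)
    try:
        sock.setblocking(False)
        err = sock.connect_ex(addr)
        if err == 0:
            return "ok"
        if err not in (errno.EINPROGRESS, errno.EWOULDBLOCK):
            return os.strerror(err)
        # Wait until the socket is writable, i.e. the connect has finished.
        _, writable, _ = select.select([], [sock], [], timeout)
        if not writable:
            return "timeout"
        # SO_ERROR holds the final result of the asynchronous connect.
        err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
        return "ok" if err == 0 else os.strerror(err)
    finally:
        sock.close()
```

Calling try_connect("customerdb-a", 5432) repeatedly, for example once per second during the problem window, would show whether plain TCP connects from pgpool-0 fail intermittently as well.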

supakit.chavar

2023-09-27 16:12

reporter   ~0004433

Dear Pengbo,

  I have tried to summarize the events from the logs of the 3 pgpool servers. Please see the summary in the attached file pgpool_log_summary.txt.

  Regarding your question, I found that within the same second some pgpool processes can connect to customerdb-a:5432 while others cannot.

  Please see more detail in the attached file.
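
To make the "same second" observation easier to verify, the log lines can be grouped by timestamp. The following is a minimal sketch, assuming the log line prefix shown in this report; LINE_RE, SAMPLE and failed_connects_per_second are hypothetical names, and the script only counts failures (it does not prove that other processes succeeded in the same second).

```python
import re
from collections import defaultdict

# Matches the log line prefix seen in this report, e.g.
#   2023-09-26 10:27:29: [22133]: db=...,user=...,app=... LOG:  message
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}): \[(?P<pid>\d+)\]: "
    r"db=(?P<db>.*?),user=(?P<user>.*?),app=(?P<app>.*?) "
    r"(?P<level>LOG|ERROR|DETAIL|WARNING|FATAL|HINT|NOTICE|DEBUG):\s+(?P<msg>.*)$"
)

# Three lines copied from the logs in this report.
SAMPLE = [
    '2023-09-26 10:27:28: [4058]: db=[No Connection],user=[No Connection],app=[unknown] LOG: new connection received',
    '2023-09-26 10:27:29: [22133]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  failed to connect to PostgreSQL server on "customerdb-a:5432", getsockopt() failed',
    '2023-09-26 10:27:29: [659]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  failed to connect to PostgreSQL server on "customerdb-a:5432", getsockopt() failed',
]


def failed_connects_per_second(lines):
    """Map each timestamp (1-second resolution) to the set of PIDs that
    logged a failed backend connection in that second."""
    failures = defaultdict(set)
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and m["msg"].startswith("failed to connect to PostgreSQL server"):
            failures[m["ts"]].add(int(m["pid"]))
    return dict(failures)


for ts, pids in sorted(failed_connects_per_second(SAMPLE).items()):
    print(ts, sorted(pids))  # 2023-09-26 10:27:29 [659, 22133]
```

Running it over a whole log file (for example by passing an open file object instead of SAMPLE) gives one line per second in which at least one process failed to connect.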
pgpool_log_summary.txt (46,371 bytes)   
After checking the pgpool logs on the 3 servers (10.102.46.175, 10.102.46.176, 10.102.46.177), we found the following events:

10.102.46.175 is the leader
10.102.46.176 and 10.102.46.177 are members

1. <10.102.46.177> see 10.102.46.177_pgpool-2023-09-26_000000.log.zip
2023-09-26 10:26:18: [30739]: db=[No Connection],user=[No Connection],app=health_check0 ERROR:  health check timed out while waiting for reading data
2023-09-26 10:26:18: [30739]: db=[No Connection],user=[No Connection],app=health_check0 LOG:  health check retrying on DB node: 0 (round:1)
2023-09-26 10:26:25: [30739]: db=[No Connection],user=[No Connection],app=health_check0 LOG:  health check retrying on DB node: 0 succeeded
2023-09-26 10:27:16: [30739]: db=[No Connection],user=[No Connection],app=health_check0 ERROR:  health check timed out while waiting for reading data
2023-09-26 10:27:33: [29515]: db=[No Connection],user=[No Connection],app=main LOG:  leader watchdog has performed failover
2023-09-26 10:29:39: [30739]: db=[No Connection],user=[No Connection],app=health_check0 ERROR:  failed to make persistent db connection
2023-09-26 10:29:39: [30739]: db=[No Connection],user=[No Connection],app=health_check0 DETAIL:  connection to host:"customerdb-a:5432" failed
2023-09-26 10:29:39: [30739]: db=[No Connection],user=[No Connection],app=health_check0 LOG:  health check retrying on DB node: 0 (round:1)
2023-09-26 10:29:43: [30739]: db=[No Connection],user=[No Connection],app=health_check0 LOG:  health check retrying on DB node: 0 succeeded
2023-09-26 10:31:13: [30739]: db=[No Connection],user=[No Connection],app=health_check0 ERROR:  failed to make persistent db connection
2023-09-26 10:31:13: [30739]: db=[No Connection],user=[No Connection],app=health_check0 DETAIL:  connection to host:"customerdb-a:5432" failed
2023-09-26 10:31:13: [30739]: db=[No Connection],user=[No Connection],app=health_check0 LOG:  health check retrying on DB node: 0 (round:1)
2023-09-26 10:31:26: [29518]: db=[No Connection],user=[No Connection],app=watchdog LOG:  leader/coordinator node "pgpool-0:9999 Linux T1VMPDDB89" decided to resigning from leader, probably because of split-brain
2023-09-26 10:31:27: [29518]: db=[No Connection],user=[No Connection],app=watchdog LOG:  setting the remote node "pgpool-1:9999 Linux T1VMPDDB90" as watchdog cluster leader
2023-09-26 10:31:30: [30739]: db=[No Connection],user=[No Connection],app=health_check0 LOG:  health check retrying on DB node: 0 succeeded
2023-09-26 10:33:26: [30739]: db=[No Connection],user=[No Connection],app=health_check0 LOG:  failed to connect to PostgreSQL server on "customerdb-a:5432" using INET socket
2023-09-26 10:33:26: [30739]: db=[No Connection],user=[No Connection],app=health_check0 DETAIL:  health check timer expired
2023-09-26 10:33:26: [30739]: db=[No Connection],user=[No Connection],app=health_check0 ERROR:  failed to make persistent db connection
2023-09-26 10:33:26: [30739]: db=[No Connection],user=[No Connection],app=health_check0 DETAIL:  connection to host:"customerdb-a:5432" failed
2023-09-26 10:33:26: [30739]: db=[No Connection],user=[No Connection],app=health_check0 LOG:  health check retrying on DB node: 0 (round:1)
2023-09-26 10:33:39: [30739]: db=[No Connection],user=[No Connection],app=health_check0 LOG:  health check retrying on DB node: 0 succeeded
2023-09-26 10:34:26: [29518]: db=[No Connection],user=[No Connection],app=watchdog LOG:  new IPC connection received

everything is fine after 2023-09-26 10:34:26

2. <10.102.46.176> see 10.102.46.176_pgpool-2023-09-26_000000.log.zip
2023-09-26 10:27:16: [32514]: db=[No Connection],user=[No Connection],app=health_check0 ERROR:  health check timed out while waiting for reading data
2023-09-26 10:27:16: [32514]: db=[No Connection],user=[No Connection],app=health_check0 LOG:  health check retrying on DB node: 0 (round:1)
2023-09-26 10:27:32: [32514]: db=[No Connection],user=[No Connection],app=health_check0 LOG:  health check retrying on DB node: 0 succeeded
2023-09-26 10:27:33: [31286]: db=[No Connection],user=[No Connection],app=main LOG:  leader watchdog has performed failover
2023-09-26 10:29:39: [32514]: db=[No Connection],user=[No Connection],app=health_check0 LOG:  failed to connect to PostgreSQL server on "customerdb-a:5432" using INET socket
2023-09-26 10:29:39: [32514]: db=[No Connection],user=[No Connection],app=health_check0 DETAIL:  health check timer expired
2023-09-26 10:29:39: [32514]: db=[No Connection],user=[No Connection],app=health_check0 ERROR:  failed to make persistent db connection
2023-09-26 10:29:39: [32514]: db=[No Connection],user=[No Connection],app=health_check0 DETAIL:  connection to host:"customerdb-a:5432" failed
2023-09-26 10:29:39: [32514]: db=[No Connection],user=[No Connection],app=health_check0 LOG:  health check retrying on DB node: 0 (round:1)
2023-09-26 10:29:43: [32514]: db=[No Connection],user=[No Connection],app=health_check0 LOG:  health check retrying on DB node: 0 succeeded
2023-09-26 10:31:13: [32514]: db=[No Connection],user=[No Connection],app=health_check0 LOG:  failed to connect to PostgreSQL server on "customerdb-a:5432" using INET socket
2023-09-26 10:31:13: [32514]: db=[No Connection],user=[No Connection],app=health_check0 DETAIL:  health check timer expired
2023-09-26 10:31:13: [32514]: db=[No Connection],user=[No Connection],app=health_check0 ERROR:  failed to make persistent db connection
2023-09-26 10:31:13: [32514]: db=[No Connection],user=[No Connection],app=health_check0 DETAIL:  connection to host:"customerdb-a:5432" failed
2023-09-26 10:31:13: [32514]: db=[No Connection],user=[No Connection],app=health_check0 LOG:  health check retrying on DB node: 0 (round:1)
2023-09-26 10:31:26: [31289]: db=[No Connection],user=[No Connection],app=watchdog LOG:  leader/coordinator node "pgpool-0:9999 Linux T1VMPDDB89" decided to resigning from leader, probably because of split-brain
2023-09-26 10:31:27: [31289]: db=[No Connection],user=[No Connection],app=watchdog LOG:  setting the local node "pgpool-1:9999 Linux T1VMPDDB90" as watchdog cluster leader
2023-09-26 10:31:29: [3906]: db=[No Connection],user=[No Connection],app=child LOG:  new connection received
2023-09-26 10:31:29: [3906]: db=[No Connection],user=[No Connection],app=child DETAIL:  connecting host=10.102.46.107 port=55046
2023-09-26 10:31:30: [3906]: db=corebiz,user=corebiz.app,app=postgres_fdw LOG:  statement: SET search_path = pg_catalog
2023-09-26 10:31:30: [32514]: db=[No Connection],user=[No Connection],app=health_check0 LOG:  health check retrying on DB node: 0 succeeded
2023-09-26 10:33:43: [32514]: db=[No Connection],user=[No Connection],app=health_check0 ERROR:  failed to make persistent db connection
2023-09-26 10:33:43: [32514]: db=[No Connection],user=[No Connection],app=health_check0 DETAIL:  connection to host:"customerdb-a:5432" failed
2023-09-26 10:33:43: [32514]: db=[No Connection],user=[No Connection],app=health_check0 LOG:  health check retrying on DB node: 0 (round:1)
2023-09-26 10:33:43: [2965]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:33:44: [3022]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:33:44: [3976]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:33:44: [3945]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:33:44: [3688]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:33:44: [3679]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:33:44: [2999]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:33:44: [3496]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:33:44: [3328]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:33:44: [3952]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:33:44: [3219]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:33:44: [3233]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:33:56: [3217]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:33:56: [3044]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:33:57: [3650]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:33:59: [3757]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:34:03: [3090]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket

Everything is fine after 2023-09-26 10:34:22.
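
To make the per-node excerpts easier to compare, the lines can be tallied by severity and message. The following is only a minimal sketch, assuming every line follows the `timestamp: [pid]: db=...,user=...,app=... LEVEL:  message` layout seen in the excerpts in this report (a different `log_line_prefix` would need a different pattern). The names `LINE_RE` and `summarize` are illustrative and are not part of Pgpool-II; the sample lines are copied from the excerpt above.

```python
import re
from collections import Counter

# Line layout assumed from the excerpts in this report; adjust if the
# log prefix is configured differently.
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}): "
    r"\[(?P<pid>\d+)\]: "
    r"db=(?P<db>.*?),user=(?P<user>.*?),app=(?P<app>\S+) "
    r"(?P<level>[A-Z]+):\s+(?P<msg>.*)$"
)


def summarize(lines):
    """Count (level, message) pairs; lines that do not match are skipped."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            counts[(m["level"], m["msg"])] += 1
    return counts


# Sample lines taken verbatim from the excerpt above.
sample = [
    '2023-09-26 10:33:43: [32514]: db=[No Connection],user=[No Connection],app=health_check0 DETAIL:  connection to host:"customerdb-a:5432" failed',
    '2023-09-26 10:33:43: [32514]: db=[No Connection],user=[No Connection],app=health_check0 LOG:  health check retrying on DB node: 0 (round:1)',
    '2023-09-26 10:33:43: [2965]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket',
    '2023-09-26 10:33:44: [3022]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket',
]

if __name__ == "__main__":
    # Print the most frequent messages first.
    for (level, msg), n in summarize(sample).most_common():
        print(n, level, msg)
```

Messages that embed a client host, port, or pid are counted separately by this sketch, so it is mainly useful for the repeated backend-connection and failover messages.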


3. <10.102.46.175> see 10.102.46.175_pgpool-2023-09-26_102619.log.zip
2023-09-26 10:26:24: [22402]: db=[No Connection],user=[No Connection],app=health_check0 LOG:  health check retrying on DB node: 0 succeeded
2023-09-26 10:27:16: [22402]: db=[No Connection],user=[No Connection],app=health_check0 ERROR:  health check timed out while waiting for reading data
2023-09-26 10:27:16: [22402]: db=[No Connection],user=[No Connection],app=health_check0 LOG:  health check retrying on DB node: 0 (round:1)
2023-09-26 10:27:29: [22133]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  failed to connect to PostgreSQL server on "customerdb-a:5432", getsockopt() failed
2023-09-26 10:27:29: [659]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  failed to connect to PostgreSQL server on "customerdb-a:5432", getsockopt() failed
2023-09-26 10:27:29: [655]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  failed to connect to PostgreSQL server on "customerdb-a:5432", getsockopt() failed
2023-09-26 10:27:29: [1891]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  failed to connect to PostgreSQL server on "customerdb-a:5432", getsockopt() failed
2023-09-26 10:27:29: [3156]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  failed to connect to PostgreSQL server on "customerdb-a:5432", getsockopt() failed
2023-09-26 10:27:29: [21120]: db=[No Connection],user=[No Connection],app=watchdog LOG:  watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from local pgpool-II on IPC interface
2023-09-26 10:27:29: [1891]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  executing failover on backend
2023-09-26 10:27:29: [3156]: db=[No Connection],user=[No Connection],app=[unknown] FATAL:  failed to create a backend connection
2023-09-26 10:27:29: [3156]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  executing failover on backend
2023-09-26 10:27:29: [21117]: db=[No Connection],user=[No Connection],app=main LOG:  failover: no valid backend node found
2023-09-26 10:27:29: [21117]: db=[No Connection],user=[No Connection],app=main LOG:  Restart all children
2023-09-26 10:27:30: [8610]: db=[No Connection],user=[No Connection],app=child FATAL:  pgpool is not accepting any new connections
2023-09-26 10:27:30: [8610]: db=[No Connection],user=[No Connection],app=child DETAIL:  all backend nodes are down, pgpool requires at least one valid node
2023-09-26 10:27:30: [8610]: db=[No Connection],user=[No Connection],app=child HINT:  repair the backend nodes and restart pgpool
2023-09-26 10:27:30: [8607]: db=[No Connection],user=[No Connection],app=child FATAL:  pgpool is not accepting any new connections
2023-09-26 10:27:30: [8607]: db=[No Connection],user=[No Connection],app=child DETAIL:  all backend nodes are down, pgpool requires at least one valid node
2023-09-26 10:27:30: [21117]: db=[No Connection],user=[No Connection],app=main LOG:  failover: no valid backend node found
2023-09-26 10:27:30: [21117]: db=[No Connection],user=[No Connection],app=main LOG:  Restart all children
2023-09-26 10:27:31: [11995]: db=[No Connection],user=[No Connection],app=child FATAL:  pgpool is not accepting any new connections
2023-09-26 10:27:31: [11995]: db=[No Connection],user=[No Connection],app=child DETAIL:  all backend nodes are down, pgpool requires at least one valid node
2023-09-26 10:27:31: [11995]: db=[No Connection],user=[No Connection],app=child HINT:  repair the backend nodes and restart pgpool
2023-09-26 10:27:32: [20076]: db=[No Connection],user=[No Connection],app=pcp_main LOG:  restart request received in pcp child process
2023-09-26 10:27:32: [11905]: db=[No Connection],user=[No Connection],app=child FATAL:  pgpool is not accepting any new connections
2023-09-26 10:27:32: [11905]: db=[No Connection],user=[No Connection],app=child DETAIL:  all backend nodes are down, pgpool requires at least one valid node
2023-09-26 10:27:32: [11905]: db=[No Connection],user=[No Connection],app=child HINT:  repair the backend nodes and restart pgpool
2023-09-26 10:27:34: [14406]: db=[No Connection],user=[No Connection],app=child LOG:  new connection received
2023-09-26 10:27:34: [14406]: db=[No Connection],user=[No Connection],app=child DETAIL:  connecting host=10.102.44.103 port=44584
2023-09-26 10:27:34: [14408]: db=[No Connection],user=[No Connection],app=child LOG:  new connection received
2023-09-26 10:27:34: [14408]: db=[No Connection],user=[No Connection],app=child DETAIL:  connecting host=10.102.44.103 port=44586
2023-09-26 10:27:34: [14411]: db=[No Connection],user=[No Connection],app=child LOG:  new connection received
2023-09-26 10:27:34: [14411]: db=[No Connection],user=[No Connection],app=child DETAIL:  connecting host=10.102.44.60 port=33742
2023-09-26 10:27:34: [14377]: db=[No Connection],user=[No Connection],app=child LOG:  new connection received
2023-09-26 10:27:34: [14377]: db=[No Connection],user=[No Connection],app=child DETAIL:  connecting host=10.102.49.38 port=55874
2023-09-26 10:27:34: [14418]: db=[No Connection],user=[No Connection],app=child LOG:  new connection received
2023-09-26 10:27:34: [14418]: db=[No Connection],user=[No Connection],app=child DETAIL:  connecting host=10.102.44.103 port=44590
2023-09-26 10:27:34: [14417]: db=[No Connection],user=[No Connection],app=child LOG:  new connection received
2023-09-26 10:27:34: [14417]: db=[No Connection],user=[No Connection],app=child DETAIL:  connecting host=10.102.44.55 port=59680
2023-09-26 10:27:34: [14398]: db=[No Connection],user=[No Connection],app=child LOG:  new connection received
2023-09-26 10:27:34: [14398]: db=[No Connection],user=[No Connection],app=child DETAIL:  connecting host=10.102.44.22 port=59096
2023-09-26 10:29:12: [14154]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  new connection received
2023-09-26 10:29:12: [14154]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  connecting host=10.102.44.75 port=47168
2023-09-26 10:29:12: [13611]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  new connection received
2023-09-26 10:29:12: [13611]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  connecting host=10.102.44.68 port=52216
2023-09-26 10:29:39: [22402]: db=[No Connection],user=[No Connection],app=health_check0 LOG:  failed to connect to PostgreSQL server on "customerdb-a:5432" using INET socket
2023-09-26 10:29:39: [22402]: db=[No Connection],user=[No Connection],app=health_check0 DETAIL:  health check timer expired
2023-09-26 10:29:39: [22402]: db=[No Connection],user=[No Connection],app=health_check0 ERROR:  failed to make persistent db connection
2023-09-26 10:29:39: [22402]: db=[No Connection],user=[No Connection],app=health_check0 DETAIL:  connection to host:"customerdb-a:5432" failed
2023-09-26 10:29:39: [22402]: db=[No Connection],user=[No Connection],app=health_check0 LOG:  health check retrying on DB node: 0 (round:1)
2023-09-26 10:29:39: [13700]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  new connection received
2023-09-26 10:29:39: [13700]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  connecting host=10.102.44.102 port=43056
2023-09-26 10:29:39: [13491]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  new connection received
2023-09-26 10:29:39: [13491]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  connecting host=10.102.44.51 port=43340
2023-09-26 10:29:39: [13947]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  new connection received
2023-09-26 10:29:39: [13947]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  connecting host=10.102.49.38 port=56660
2023-09-26 10:29:39: [13978]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  new connection received
2023-09-26 10:29:39: [13978]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  connecting host=10.102.49.39 port=50192
2023-09-26 10:29:39: [14108]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  new connection received
2023-09-26 10:29:43: [22402]: db=[No Connection],user=[No Connection],app=health_check0 LOG:  health check retrying on DB node: 0 succeeded
2023-09-26 10:29:50: [13460]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  new connection received
2023-09-26 10:29:50: [13460]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  connecting host=10.102.44.103 port=35790
2023-09-26 10:29:50: [14170]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  new connection received
2023-09-26 10:29:50: [14170]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  connecting host=10.102.44.159 port=43832
2023-09-26 10:29:05: [14146]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  new connection received
2023-09-26 10:29:05: [14146]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  connecting host=10.102.44.22 port=37702
2023-09-26 10:30:50: [14146]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  received degenerate backend request for node_id: 0 from pid [14146]
2023-09-26 10:30:50: [21120]: db=[No Connection],user=[No Connection],app=watchdog LOG:  new IPC connection received
2023-09-26 10:30:50: [21120]: db=[No Connection],user=[No Connection],app=watchdog LOG:  watchdog received the failover command from local pgpool-II on IPC interface
2023-09-26 10:30:50: [21120]: db=[No Connection],user=[No Connection],app=watchdog LOG:  watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from local pgpool-II on IPC interface
2023-09-26 10:30:50: [14146]: db=[No Connection],user=[No Connection],app=[unknown] WARNING:  write on backend 0 failed with error :"Broken pipe"
2023-09-26 10:30:50: [14146]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  while trying to write data from offset: 0 wlen: 5
2023-09-26 10:30:50: [14334]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  new connection received
2023-09-26 10:30:50: [14334]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  connecting host=10.102.44.59 port=55538
2023-09-26 10:30:50: [14334]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  frontend disconnection: session time: 0:00:00.003 user=corebiz_bcp.app database=corebiz host=10.102.44.59 port=55538
2023-09-26 10:30:56: [13751]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:30:56: [13751]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  timed out. retrying...
2023-09-26 10:30:56: [13246]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:30:56: [13246]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  timed out. retrying...
2023-09-26 10:30:56: [14152]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:30:56: [14152]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  timed out. retrying...
2023-09-26 10:30:56: [13413]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:30:56: [13413]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  timed out. retrying...
2023-09-26 10:30:56: [14071]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:30:56: [14071]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  timed out. retrying...
2023-09-26 10:30:56: [13673]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  new connection received
2023-09-26 10:30:56: [13673]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  connecting host=10.102.44.62 port=35218
2023-09-26 10:30:56: [13588]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:30:56: [13588]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  timed out. retrying...
2023-09-26 10:30:56: [14010]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:30:56: [14010]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  timed out. retrying...
2023-09-26 10:30:56: [14257]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  new connection received
2023-09-26 10:30:56: [14257]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  connecting host=10.102.49.39 port=50604
2023-09-26 10:30:56: [13462]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  new connection received
2023-09-26 10:30:56: [13462]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  connecting host=10.102.44.102 port=43350
2023-09-26 10:30:56: [14171]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:30:56: [14171]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  timed out. retrying...
2023-09-26 10:30:56: [13224]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:30:56: [13224]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  timed out. retrying...
2023-09-26 10:30:56: [14294]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:30:56: [14294]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  timed out. retrying...
2023-09-26 10:30:56: [13946]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:30:56: [13946]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  timed out. retrying...
2023-09-26 10:30:56: [14393]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:30:56: [14393]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  timed out. retrying...
2023-09-26 10:30:56: [13370]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:30:56: [13370]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  timed out. retrying...
2023-09-26 10:30:56: [13723]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:31:03: [22402]: db=[No Connection],user=[No Connection],app=health_check0 ERROR:  health check timed out while waiting for reading data
2023-09-26 10:31:03: [22402]: db=[No Connection],user=[No Connection],app=health_check0 LOG:  health check retrying on DB node: 0 (round:1)
2023-09-26 10:31:03: [14155]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:31:09: [13513]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  new connection received
2023-09-26 10:31:09: [13513]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  connecting host=10.102.49.38 port=57104
2023-09-26 10:31:09: [13770]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  new connection received
2023-09-26 10:31:09: [13770]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  connecting host=10.102.49.62 port=42884
2023-09-26 10:31:09: [13914]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:31:09: [13914]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  timed out. retrying...
2023-09-26 10:31:09: [14024]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:31:10: [21120]: db=[No Connection],user=[No Connection],app=watchdog LOG:  new IPC connection received
2023-09-26 10:31:10: [21120]: db=[No Connection],user=[No Connection],app=watchdog LOG:  watchdog received the failover command from local pgpool-II on IPC interface
2023-09-26 10:31:10: [21120]: db=[No Connection],user=[No Connection],app=watchdog LOG:  watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from local pgpool-II on IPC interface
2023-09-26 10:31:10: [21120]: db=[No Connection],user=[No Connection],app=watchdog LOG:  failover requires the majority vote, waiting for consensus
2023-09-26 10:31:10: [21120]: db=[No Connection],user=[No Connection],app=watchdog DETAIL:  failover request noted
2023-09-26 10:31:10: [21120]: db=[No Connection],user=[No Connection],app=watchdog LOG:  failover command [DEGENERATE_BACKEND_REQUEST] request from pgpool-II node "pgpool-0:9999 Linux T1VMPDDB89" is queued, waiting for the confirmation from other nodes
2023-09-26 10:31:10: [14121]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  degenerate backend request for node_id: 0 from pid [14121], will be handled by watchdog, which is building consensus for request
2023-09-26 10:31:12: [13871]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:31:12: [13871]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  timed out. retrying...
2023-09-26 10:31:12: [13343]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:31:12: [13343]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  timed out. retrying...
2023-09-26 10:31:12: [13797]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:31:12: [13797]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  timed out. retrying...
2023-09-26 10:31:12: [13713]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:31:12: [13713]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  timed out. retrying...
2023-09-26 10:31:12: [13666]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:31:12: [13666]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  timed out. retrying...
2023-09-26 10:31:12: [13430]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:31:12: [13430]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  timed out. retrying...
2023-09-26 10:31:12: [13846]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:31:12: [13846]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  timed out. retrying...
2023-09-26 10:31:12: [13839]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:31:12: [13839]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  timed out. retrying...
2023-09-26 10:31:12: [13572]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:31:12: [13572]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  timed out. retrying...
2023-09-26 10:31:12: [13610]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:31:12: [13610]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  timed out. retrying...
2023-09-26 10:31:13: [13560]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:31:13: [13560]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  timed out. retrying...
2023-09-26 10:31:13: [13290]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:31:13: [13290]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  timed out. retrying...
2023-09-26 10:31:13: [14079]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:31:13: [14079]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  timed out. retrying...
2023-09-26 10:31:13: [13557]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:31:13: [13557]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  timed out. retrying...
2023-09-26 10:31:13: [14302]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:31:13: [14302]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  timed out. retrying...
2023-09-26 10:31:13: [13852]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:31:13: [13852]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  timed out. retrying...
2023-09-26 10:31:13: [13641]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:31:13: [13641]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  timed out. retrying...
2023-09-26 10:31:13: [14411]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:31:13: [14411]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  timed out. retrying...
2023-09-26 10:31:13: [13440]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:31:13: [13440]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  timed out. retrying...
2023-09-26 10:31:13: [13863]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:31:13: [13863]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  timed out. retrying...
2023-09-26 10:31:13: [14328]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:31:13: [14328]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  timed out. retrying...
2023-09-26 10:31:13: [13781]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:31:13: [13781]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  timed out. retrying...
2023-09-26 10:31:13: [14374]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:31:13: [14374]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  timed out. retrying...
2023-09-26 10:31:13: [13178]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:31:13: [13178]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  timed out. retrying...
2023-09-26 10:31:13: [14248]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:31:13: [14248]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  timed out. retrying...
2023-09-26 10:31:13: [13190]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:31:13: [13190]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  timed out. retrying...
2023-09-26 10:31:13: [13759]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  trying connecting to PostgreSQL server on "customerdb-a:5432" by INET socket
2023-09-26 10:31:13: [13759]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  timed out. retrying...
2023-09-26 10:31:14: [14075]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  failed to connect to PostgreSQL server on "customerdb-a:5432", getsockopt() failed
2023-09-26 10:31:14: [14075]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  Operation already in progress
2023-09-26 10:31:14: [13516]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  failed to connect to PostgreSQL server on "customerdb-a:5432", getsockopt() failed
2023-09-26 10:31:14: [13516]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  Operation already in progress
2023-09-26 10:31:14: [13707]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  failed to connect to PostgreSQL server on "customerdb-a:5432", getsockopt() failed
2023-09-26 10:31:14: [13707]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  Operation already in progress
2023-09-26 10:31:14: [13306]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  failed to connect to PostgreSQL server on "customerdb-a:5432", getsockopt() failed
2023-09-26 10:31:14: [13306]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  Operation already in progress
2023-09-26 10:31:14: [14075]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  received degenerate backend request for node_id: 0 from pid [14075]
2023-09-26 10:31:14: [13516]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  received degenerate backend request for node_id: 0 from pid [13516]
2023-09-26 10:31:14: [13306]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  received degenerate backend request for node_id: 0 from pid [13306]
2023-09-26 10:31:14: [13707]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  received degenerate backend request for node_id: 0 from pid [13707]
2023-09-26 10:31:14: [21120]: db=[No Connection],user=[No Connection],app=watchdog LOG:  watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from local pgpool-II on IPC interface
2023-09-26 10:31:14: [13306]: db=[No Connection],user=[No Connection],app=[unknown] FATAL:  failed to create a backend connection
2023-09-26 10:31:14: [13306]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  executing failover on backend
2023-09-26 10:31:14: [13707]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  degenerate backend request for 1 node(s) from pid [13707], is changed to quarantine node request by watchdog
2023-09-26 10:31:14: [13707]: db=[No Connection],user=[No Connection],app=[unknown] DETAIL:  watchdog is taking time to build consensus
2023-09-26 10:31:14: [13516]: db=[No Connection],user=[No Connection],app=[unknown] LOG:  degenerate backend request for 1 node(s) from pid [13516], is changed to quarantine node request by watchdog
2023-09-26 10:31:14: [21117]: db=[No Connection],user=[No Connection],app=main LOG:  failover: no valid backend node found
2023-09-26 10:31:14: [21117]: db=[No Connection],user=[No Connection],app=main LOG:  Restart all children
2023-09-26 10:31:14: [15466]: db=[No Connection],user=[No Connection],app=child FATAL:  pgpool is not accepting any new connections
2023-09-26 10:31:14: [15466]: db=[No Connection],user=[No Connection],app=child DETAIL:  all backend nodes are down, pgpool requires at least one valid node
2023-09-26 10:31:14: [15466]: db=[No Connection],user=[No Connection],app=child HINT:  repair the backend nodes and restart pgpool
2023-09-26 10:31:14: [15556]: db=[No Connection],user=[No Connection],app=child HINT:  repair the backend nodes and restart pgpool
2023-09-26 10:31:14: [15558]: db=[No Connection],user=[No Connection],app=child FATAL:  pgpool is not accepting any new connections
2023-09-26 10:31:14: [15558]: db=[No Connection],user=[No Connection],app=child DETAIL:  all backend nodes are down, pgpool requires at least one valid node
2023-09-26 10:31:16: [20274]: db=[No Connection],user=[No Connection],app=pcp_main LOG:  PCP process: 20274 started
2023-09-26 10:31:16: [19551]: db=[No Connection],user=[No Connection],app=child FATAL:  pgpool is not accepting any new connections
2023-09-26 10:31:16: [19551]: db=[No Connection],user=[No Connection],app=child DETAIL:  all backend nodes are down, pgpool requires at least one valid node
2023-09-26 10:31:16: [19551]: db=[No Connection],user=[No Connection],app=child HINT:  repair the backend nodes and restart pgpool
2023-09-26 10:31:26: [21120]: db=[No Connection],user=[No Connection],app=watchdog LOG:  We are not able to build consensus for our primary node failover request, got 1 votes only for failover request ID:46642
2023-09-26 10:31:26: [21120]: db=[No Connection],user=[No Connection],app=watchdog DETAIL:  resigning from the coordinator
2023-09-26 10:31:26: [21120]: db=[No Connection],user=[No Connection],app=watchdog LOG:  watchdog node state changed from [LEADER] to [JOINING]
2023-09-26 10:31:26: [21120]: db=[No Connection],user=[No Connection],app=watchdog LOG:  removing the local node "pgpool-0:9999 Linux T1VMPDDB89" from watchdog cluster leader
2023-09-26 10:31:26: [21729]: db=[No Connection],user=[No Connection],app=watchdog_utility LOG:  watchdog: de-escalation started
2023-09-26 10:31:26: [21120]: db=[No Connection],user=[No Connection],app=watchdog LOG:  watchdog node state changed from [JOINING] to [INITIALIZING]
2023-09-26 10:31:26: [21729]: db=[No Connection],user=[No Connection],app=watchdog_utility LOG:  successfully released the delegate IP:"10.102.46.100"
2023-09-26 10:31:26: [21729]: db=[No Connection],user=[No Connection],app=watchdog_utility DETAIL:  'if_down_cmd' returned with success
2023-09-26 10:31:26: [21120]: db=[No Connection],user=[No Connection],app=watchdog LOG:  watchdog de-escalation process with pid: 21729 exit with SUCCESS.
2023-09-26 10:31:27: [21120]: db=[No Connection],user=[No Connection],app=watchdog LOG:  watchdog node state changed from [INITIALIZING] to [STANDING FOR LEADER]
2023-09-26 10:31:27: [21120]: db=[No Connection],user=[No Connection],app=watchdog LOG:  our stand for coordinator request is rejected by node "pgpool-2:9999 Linux T1VMPDDB91"
2023-09-26 10:31:27: [21120]: db=[No Connection],user=[No Connection],app=watchdog LOG:  watchdog node state changed from [STANDING FOR LEADER] to [PARTICIPATING IN ELECTION]
2023-09-26 10:31:27: [21120]: db=[No Connection],user=[No Connection],app=watchdog LOG:  watchdog node state changed from [PARTICIPATING IN ELECTION] to [INITIALIZING]
2023-09-26 10:31:27: [21120]: db=[No Connection],user=[No Connection],app=watchdog LOG:  setting the remote node "pgpool-1:9999 Linux T1VMPDDB90" as watchdog cluster leader
2023-09-26 10:31:28: [21120]: db=[No Connection],user=[No Connection],app=watchdog LOG:  watchdog node state changed from [INITIALIZING] to [STANDBY]
2023-09-26 10:31:28: [21120]: db=[No Connection],user=[No Connection],app=watchdog LOG:  signal_user1_to_parent_with_reason(1)
2023-09-26 10:31:28: [21120]: db=[No Connection],user=[No Connection],app=watchdog LOG:  successfully joined the watchdog cluster as standby node
2023-09-26 10:31:28: [21120]: db=[No Connection],user=[No Connection],app=watchdog DETAIL:  our join coordinator request is accepted by cluster leader node "pgpool-1:9999 Linux T1VMPDDB90"
2023-09-26 10:31:28: [21120]: db=[No Connection],user=[No Connection],app=watchdog LOG:  new IPC connection received
2023-09-26 10:31:28: [21117]: db=[No Connection],user=[No Connection],app=main LOG:  we have joined the watchdog cluster as STANDBY node
2023-09-26 10:31:28: [21117]: db=[No Connection],user=[No Connection],app=main DETAIL:  syncing the backend states from the LEADER watchdog node
2023-09-26 10:31:28: [21120]: db=[No Connection],user=[No Connection],app=watchdog LOG:  new IPC connection received
2023-09-26 10:31:28: [21120]: db=[No Connection],user=[No Connection],app=watchdog LOG:  received the get data request from local pgpool-II on IPC interface
2023-09-26 10:31:28: [21120]: db=[No Connection],user=[No Connection],app=watchdog LOG:  get data request from local pgpool-II node received on IPC interface is forwarded to leader watchdog node "pgpool-1:9999 Linux T1VMPDDB90"
2023-09-26 10:31:30: [22402]: db=[No Connection],user=[No Connection],app=health_check0 LOG:  health check retrying on DB node: 0 succeeded
2023-09-26 10:31:40: [14339]: db=[No Connection],user=[No Connection],app=sr_check_worker LOG:  worker process received restart request
2023-09-26 10:31:40: [21117]: db=[No Connection],user=[No Connection],app=main LOG:  worker child process with pid: 14339 exits with status 256
2023-09-26 10:31:40: [21117]: db=[No Connection],user=[No Connection],app=main LOG:  fork a new worker child process with pid: 23060
2023-09-26 10:31:40: [23060]: db=[No Connection],user=[No Connection],app=sr_check_worker LOG:  process started
2023-09-26 10:31:40: [21120]: db=[No Connection],user=[No Connection],app=watchdog LOG:  new IPC connection received
2023-09-26 10:31:50: [21120]: db=[No Connection],user=[No Connection],app=watchdog LOG:  new IPC connection received
2023-09-26 10:32:00: [21120]: db=[No Connection],user=[No Connection],app=watchdog LOG:  new IPC connection received
2023-09-26 10:32:14: [21120]: db=[No Connection],user=[No Connection],app=watchdog LOG:  new IPC connection received
2023-09-26 10:32:26: [21120]: db=[No Connection],user=[No Connection],app=watchdog LOG:  new IPC connection received
2023-09-26 10:32:40: [21120]: db=[No Connection],user=[No Connection],app=watchdog LOG:  new IPC connection received
2023-09-26 10:33:00: [21120]: db=[No Connection],user=[No Connection],app=watchdog LOG:  new IPC connection received
2023-09-26 10:33:20: [21120]: db=[No Connection],user=[No Connection],app=watchdog LOG:  new IPC connection received
2023-09-26 10:33:32: [21120]: db=[No Connection],user=[No Connection],app=watchdog LOG:  new IPC connection received
2023-09-26 10:33:43: [22402]: db=[No Connection],user=[No Connection],app=health_check0 LOG:  failed to connect to PostgreSQL server on "customerdb-a:5432" using INET socket
2023-09-26 10:33:43: [22402]: db=[No Connection],user=[No Connection],app=health_check0 DETAIL:  health check timer expired
2023-09-26 10:33:43: [22402]: db=[No Connection],user=[No Connection],app=health_check0 ERROR:  failed to make persistent db connection
2023-09-26 10:33:43: [22402]: db=[No Connection],user=[No Connection],app=health_check0 DETAIL:  connection to host:"customerdb-a:5432" failed
2023-09-26 10:33:43: [22402]: db=[No Connection],user=[No Connection],app=health_check0 LOG:  health check retrying on DB node: 0 (round:1)
2023-09-26 10:33:52: [22402]: db=[No Connection],user=[No Connection],app=health_check0 LOG:  health check retrying on DB node: 0 succeeded
2023-09-26 10:34:18: [21120]: db=[No Connection],user=[No Connection],app=watchdog LOG:  remote node "pgpool-1:9999 Linux T1VMPDDB90" is asking to inform about quarantined backend nodes


Everything was fine after 2023-09-26 10:34:26.
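
For anyone triaging an excerpt like the one above, here is a minimal sketch that splits each log line into fields and counts entries per process type and severity. The regular expression is an assumption inferred from the lines shown here (timestamp, pid, db/user/app prefix, severity, message), not taken from the attached pgpool.conf. The names `parse_line`, `summarize`, and `SAMPLE` are hypothetical helpers for illustration only.

```python
import re
from collections import Counter

# Assumed layout, inferred from the excerpt above:
# "<date> <time>: [<pid>]: db=...,user=...,app=<app> <SEVERITY>:  <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}): "
    r"\[(?P<pid>\d+)\]: "
    r"db=(?P<db>.*?),user=(?P<user>.*?),app=(?P<app>.*?) "
    r"(?P<severity>LOG|DETAIL|HINT|ERROR|FATAL|WARNING|NOTICE|DEBUG|PANIC):\s+"
    r"(?P<message>.*)$"
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


def summarize(lines):
    """Count parsed lines per (app, severity) pair; unparsable lines are skipped."""
    counts = Counter()
    for line in lines:
        record = parse_line(line)
        if record is not None:
            counts[(record["app"], record["severity"])] += 1
    return counts


# Two lines copied from the excerpt above, used only as sample input.
SAMPLE = [
    "2023-09-26 10:31:26: [21120]: db=[No Connection],user=[No Connection],"
    "app=watchdog LOG:  We are not able to build consensus for our primary node "
    "failover request, got 1 votes only for failover request ID:46642",
    "2023-09-26 10:33:43: [22402]: db=[No Connection],user=[No Connection],"
    "app=health_check0 ERROR:  failed to make persistent db connection",
]

if __name__ == "__main__":
    for (app, severity), count in sorted(summarize(SAMPLE).items()):
        print(f"{app:15s} {severity:7s} {count}")
```

Running the same `summarize` over the full log from each pgpool node may make it easier to compare when the health check, watchdog, and child processes began reporting problems.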

pgpool_log_summary.txt (46,371 bytes)   

Issue History

Date Modified Username Field Change
2023-09-26 15:18 supakit.chavar New Issue
2023-09-26 15:18 supakit.chavar File Added: pgpool-2023-09-26_102619.log.gz
2023-09-26 15:18 supakit.chavar File Added: pool_nodes.txt
2023-09-26 15:18 supakit.chavar File Added: hosts.txt
2023-09-26 15:18 supakit.chavar File Added: pgpool.conf
2023-09-26 15:18 supakit.chavar File Added: postgresql-2023-09-26_102612.log.gz
2023-09-26 15:18 supakit.chavar File Added: postgresql-2023-09-26_102904.log.gz
2023-09-26 15:18 supakit.chavar File Added: postgresql-2023-09-26_103142.log.gz
2023-09-27 12:19 pengbo Note Added: 0004432
2023-09-27 12:19 pengbo Assigned To => pengbo
2023-09-27 12:19 pengbo Status new => feedback
2023-09-27 16:12 supakit.chavar Note Added: 0004433
2023-09-27 16:12 supakit.chavar File Added: pgpool_log_summary.txt
2023-09-27 16:12 supakit.chavar File Added: 10.102.46.175_pgpool-2023-09-26_102619.log.zip
2023-09-27 16:12 supakit.chavar File Added: 10.102.46.176_pgpool-2023-09-26_000000.log.zip
2023-09-27 16:12 supakit.chavar File Added: 10.102.46.177_pgpool-2023-09-26_000000.log.zip
2023-09-27 16:12 supakit.chavar Status feedback => assigned