View Issue Details
| ID | Project | Category | View Status | Date Submitted | Last Update |
|---|---|---|---|---|---|
| 0000697 | Pgpool-II | Bug | public | 2021-03-16 06:04 | 2021-04-08 13:56 |
| Reporter | Garreat | Assigned To | pengbo | ||
| Priority | normal | Severity | minor | Reproducibility | always |
| Status | closed | Resolution | open | ||
| Platform | Linux/Docker | ||||
| Product Version | 4.2.2 | ||||
| Summary | 0000697: auto_failback never triggers | ||||
| Description | I'm using pgpool 4.2.2 in a containerized environment. auto_failback feature never works despite setting up sr_check, healthcheck, pg_stat_statements on Postgres nodes, log_hostname etc. I have a standby backend defined: backend_hostname1 = 'mbase_database-replica-1' backend_port1 = '5432' backend_flag1 = 'ALLOW_TO_FAILOVER' I bring the Postgres service down, pgpool detects that and sets backend status 'Down' -- ok. Then I bring the service back up. Pgpool healthcheck detects that, healthchecks are now skipped, but backend status stays 'Down'. In pg_stat_replication view, there are no problems with the replication: # select * from pg_stat_replication; pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | backend_xmin | state | sent_lsn | wr ite_lsn | flush_lsn | replay_lsn | write_lag | flush_lag | replay_lag | sync_priority | sync_state | reply_time ------+----------+---------+------------------+-------------+-------------------------------------------- ----------------+-------------+-------------------------------+--------------+-----------+-----------+--- --------+-----------+------------+-----------+-----------+------------+---------------+------------+----- -------------------------- 1033 | 10 | mbase | walreceiver | 10.0.2.242 | mbase_database-replica-1.1.rlbz732fxblqjk17 lcptok7kb.net2 | 33178 | 2021-03-15 21:29:31.750421+01 | | streaming | 0/48F2238 | 0/ 48F2238 | 0/48F2238 | 0/48F2238 | | | | 0 | async | 2021 -03-15 21:34:01.083468+01 Does this feature only work when using static IP addresses as backend? Docker service gets re-created with different IP, however I define it using a hostname... Maybe the client_hostname in pg_stat_replication view has to exactly match the defined backend name? Docker's reverse DNS returns (hostname).(replica-number).(random-garbage). Docker or not, I believe pgpool should try compare both hostname and FQDN. I'm trying to find a solution for a few days already to no success. Would be a perfectly usable Postgres Swarm stack with this feature... | ||||
| Tags | No tags attached. | ||||
|
|
pgpool.conf (23,062 bytes)
#------------------------------------------------------------------------------
# MAIN OPTIONS
#------------------------------------------------------------------------------
listen_addresses = '*'
port = '5432'
backend_clustering_mode = 'streaming_replication'
backend_hostname0 = 'mbase_database-primary'
backend_port0 = '5432'
backend_flag0 = 'ALWAYS_PRIMARY|DISALLOW_TO_FAILOVER'
backend_hostname1 = 'mbase_database-replica-1'
backend_port1 = '5432'
backend_flag1 = 'ALLOW_TO_FAILOVER'
num_init_children = 300
# Number of concurrent sessions allowed
# (change requires restart)
reserved_connections = 0
# Number of reserved connections.
# Pgpool-II does not accept connections if over
# num_init_chidlren - reserved_connections
connection_life_time = 0
# Connection to backend closes after being idle for this many seconds
# 0 means no close
client_idle_limit = 0
# Client is disconnected after being idle for that many seconds
# (even inside an explicit transactions!)
# 0 means no disconnection
#------------------------------------------------------------------------------
# CONNECTION POOLING
#------------------------------------------------------------------------------
connection_cache = off
# Activate connection pools
# (change requires restart)
max_pool = 1
# Number of connection pool caches per connection
# (change requires restart)
child_life_time = 10
# Pool exits after being idle for this many seconds
child_max_connections = 0
# Pool exits after receiving that many connections
# 0 means no exit
#------------------------------------------------------------------------------
# STREAMING REPLICATION
#------------------------------------------------------------------------------
sr_check_period = 10
# Streaming replication check period
# Disabled (0) by default
sr_check_user = 'mbase'
# Streaming replication check user
# This is necessary even if you disable
# streaming replication delay check with
# sr_check_period = 0
sr_check_password = ''
# Password for streaming replication check user.
# Leaving it empty will make Pgpool-II to first look for the
# Password in pool_passwd file before using the empty password
sr_check_database = 'mbase'
# Database name for streaming replication check
delay_threshold = 0
# Threshold before not dispatching query to standby node
# Unit is in bytes
# Disabled (0) by default
#------------------------------------------------------------------------------
# LOAD BALANCING MODE
#------------------------------------------------------------------------------
load_balance_mode = on
# Activate load balancing mode
ignore_leading_white_space = off
# Ignore leading white spaces of each query
#database_redirect_preference_list = ''
# comma separated list of pairs of database and node id.
# example: postgres:primary,mydb[0-4]:1,mydb[5-9]:2'
# valid for streaming replicaton mode only.
#app_name_redirect_preference_list = ''
# comma separated list of pairs of app name and node id.
# example: 'psql:primary,myapp[0-4]:1,myapp[5-9]:standby'
# valid for streaming replicaton mode only.
#allow_sql_comments = off
# if on, ignore SQL comments when judging if load balance or
# query cache is possible.
# If off, SQL comments effectively prevent the judgment
# (pre 3.4 behavior).
disable_load_balance_on_write = 'trans_transaction'
# Load balance behavior when write query is issued
# in an explicit transaction.
# Note that any query not in an explicit transaction
# is not affected by the parameter.
# 'transaction' (the default): if a write query is issued,
# subsequent read queries will not be load balanced
# until the transaction ends.
# 'trans_transaction': if a write query is issued,
# subsequent read queries in an explicit transaction
# will not be load balanced until the session ends.
# 'always': if a write query is issued, read queries will
# not be load balanced until the session ends.
#statement_level_load_balance = off
# Enables statement level load balancing
read_only_function_list = ''
# Comma separated list of function names
# that don't write to database
# Regexp are accepted
write_function_list = ''
# Comma separated list of function names
# that write to database
# Regexp are accepted
# If both read_only_function_list and write_function_list
# is empty, function's volatile property is checked.
# If it's volatile, the function is regarded as a
# writing function.
#------------------------------------------------------------------------------
# LOGGING
#------------------------------------------------------------------------------
log_destination = 'stderr'
# Where to log
# Valid values are combinations of stderr,
# and syslog. Default to stderr.
log_line_prefix = '%t: pid %p: '
# printf-style string to output at beginning of each log line.
log_connections = off
# Log connections
log_hostname = off
# Hostname will be shown in ps status
# and in logs if connections are logged
log_statement = off
# Log all statements
log_per_node_statement = off
# Log all statements
# with node and backend informations
log_client_messages = off
# Log any client messages
log_standby_delay = 'always'
# Log standby delay
# Valid values are combinations of always,
# if_over_threshold, none
log_error_verbosity = verbose
# terse, default, or verbose messages
#client_min_messages = notice
# values in order of decreasing detail:
# debug5
# debug4
# debug3
# debug2
# debug1
# log
# notice
# warning
# error
log_min_messages = notice # values in order of decreasing detail:
# debug5
# debug4
# debug3
# debug2
# debug1
# info
# notice
# warning
# error
# log
# fatal
# panic
#------------------------------------------------------------------------------
# - pgpool Communication Manager Connection Settings -
#------------------------------------------------------------------------------
#pcp_listen_addresses = '*'
# Host name or IP address for pcp process to listen on:
# '*' for all, '' for no TCP/IP connections
# (change requires restart)
pcp_port = 9898
# Port number for pcp
# (change requires restart)
pcp_socket_dir = '/tmp'
# Unix domain socket path for pcp
# The Debian package defaults to
# /var/run/postgresql
# (change requires restart)
#------------------------------------------------------------------------------
# - Authentication -
#------------------------------------------------------------------------------
enable_pool_hba = on
# Use pool_hba.conf for client authentication
pool_passwd = '/etc/pgpool-II/pool_passwd'
# File name of pool_passwd for md5 authentication.
# "" disables pool_passwd.
# (change requires restart)
#authentication_timeout = 60
# Delay in seconds to complete client authentication
# 0 means no timeout.
#allow_clear_text_frontend_auth = off
# Allow Pgpool-II to use clear text password authentication
# with clients, when pool_passwd does not
# contain the user password
#------------------------------------------------------------------------------
# IN MEMORY QUERY MEMORY CACHE
#------------------------------------------------------------------------------
#memqcache_memcached_host = 'localhost'
# Memcached host name or IP address. Mandatory if
# memqcache_method = 'memcached'.
# Defaults to localhost.
# (change requires restart)
#memqcache_memcached_port = 11211
# Memcached port number. Mondatory if memqcache_method = 'memcached'.
# Defaults to 11211.
# (change requires restart)
#memqcache_total_size = 67108864
# Total memory size in bytes for storing memory cache.
# Mandatory if memqcache_method = 'shmem'.
# Defaults to 64MB.
# (change requires restart)
memory_cache_enabled = off
# If on, use the memory cache functionality, off by default
# (change requires restart)
memqcache_method = 'shmem'
# Cache storage method. either 'shmem'(shared memory) or
# 'memcached'. 'shmem' by default
# (change requires restart)
memqcache_max_num_cache = 1000000
# Total number of cache entries. Mandatory
# if memqcache_method = 'shmem'.
# Each cache entry consumes 48 bytes on shared memory.
# Defaults to 1,000,000(45.8MB).
# (change requires restart)
memqcache_expire = 0
# Memory cache entry life time specified in seconds.
# 0 means infinite life time. 0 by default.
# (change requires restart)
memqcache_auto_cache_invalidation = on
# If on, invalidation of query cache is triggered by corresponding
# DDL/DML/DCL(and memqcache_expire). If off, it is only triggered
# by memqcache_expire. on by default.
# (change requires restart)
memqcache_maxcache = 409600
# Maximum SELECT result size in bytes.
# Must be smaller than memqcache_cache_block_size. Defaults to 400KB.
# (change requires restart)
memqcache_cache_block_size = 1048576
# Cache block size in bytes. Mandatory if memqcache_method = 'shmem'.
# Defaults to 1MB.
# (change requires restart)
memqcache_oiddir = '/var/log/pgpool/oiddir'
# Temporary work directory to record table oids
# (change requires restart)
#white_memqcache_table_list = ''
# Comma separated list of table names to memcache
# that don't write to database
# Regexp are accepted
#black_memqcache_table_list = ''
# Comma separated list of table names not to memcache
# that don't write to database
# Regexp are accepted
#------------------------------------------------------------------------------
# HEALTH CHECK GLOBAL PARAMETERS
#------------------------------------------------------------------------------
health_check_period = 5
# Health check period
# Disabled (0) by default
health_check_timeout = 10
# Health check timeout
# 0 means no timeout
health_check_user = 'mbase'
# Health check user
health_check_password = ''
# Password for health check user
# Leaving it empty will make Pgpool-II to first look for the
# Password in pool_passwd file before using the empty password
health_check_database = 'mbase'
# Database name for health check. If '', tries 'postgres' frist, then 'template1'
health_check_max_retries = 259200
# Maximum number of times to retry a failed health check before giving up.
health_check_retry_delay = 10
# Amount of time to wait (in seconds) between retries.
connect_timeout = 5000
# Timeout value in milliseconds before giving up to connect to backend.
# Default is 10000 ms (10 second). Flaky network user may want to increase
# the value. 0 means no timeout.
# Note that this value is not only used for health check,
# but also for ordinary conection to backend.
#------------------------------------------------------------------------------
# HEALTH CHECK PER NODE PARAMETERS (OPTIONAL)
#------------------------------------------------------------------------------
#health_check_period0 = 0
#health_check_timeout0 = 20
#health_check_user0 = 'nobody'
#health_check_password0 = ''
#health_check_database0 = ''
#health_check_max_retries0 = 0
#health_check_retry_delay0 = 1
#connect_timeout0 = 10000
#------------------------------------------------------------------------------
# FAILOVER AND FAILBACK
#------------------------------------------------------------------------------
failover_command = ''
# Executes this command at failover
# Special values:
# %d = failed node id
# %h = failed node host name
# %p = failed node port number
# %D = failed node database cluster path
# %m = new master node id
# %H = new master node hostname
# %M = old master node id
# %P = old primary node id
# %r = new master port number
# %R = new master database cluster path
# %N = old primary node hostname
# %S = old primary node port number
# %% = '%' character
failback_command = ''
# Executes this command at failback.
# Special values:
# %d = failed node id
# %h = failed node host name
# %p = failed node port number
# %D = failed node database cluster path
# %m = new master node id
# %H = new master node hostname
# %M = old master node id
# %P = old primary node id
# %r = new master port number
# %R = new master database cluster path
# %N = old primary node hostname
# %S = old primary node port number
# %% = '%' character
failover_on_backend_error = on
# Initiates failover when reading/writing to the
# backend communication socket fails
# If set to off, pgpool will report an
# error and disconnect the session.
detach_false_primary = off
# Detach false primary if on. Only
# valid in streaming replicaton
# mode and with PostgreSQL 9.6 or
# after.
search_primary_node_timeout = 20
# Timeout in seconds to search for the
# primary node when a failover occurs.
# 0 means no timeout, keep searching
# for a primary node forever.
auto_failback = on
# Dettached backend node reattach automatically
# if replication_state is 'streaming'.
auto_failback_interval = 5
# Min interval of executing auto_failback in
# seconds.
#------------------------------------------------------------------------------
# OTHERS
#------------------------------------------------------------------------------
relcache_expire = 0
# Life time of relation cache in seconds.
# 0 means no cache expiration(the default).
# The relation cache is used for cache the
# query result against PostgreSQL system
# catalog to obtain various information
# including table structures or if it's a
# temporary table or not. The cache is
# maintained in a pgpool child local memory
# and being kept as long as it survives.
# If someone modify the table by using
# ALTER TABLE or some such, the relcache is
# not consistent anymore.
# For this purpose, cache_expiration
# controls the life time of the cache.
#relcache_size = 256
# Number of relation cache
# entry. If you see frequently:
# "pool_search_relcache: cache replacement happend"
# in the pgpool log, you might want to increate this number.
#check_temp_table = catalog
# Temporary table check method. catalog, trace or none.
# Default is catalog.
#check_unlogged_table = on
# If on, enable unlogged table check in SELECT statements.
# This initiates queries against system catalog of primary/master
# thus increases load of master.
# If you are absolutely sure that your system never uses unlogged tables
# and you want to save access to primary/master, you could turn this off.
# Default is on.
#enable_shared_relcache = on
# If on, relation cache stored in memory cache,
# the cache is shared among child process.
# Default is on.
# (change requires restart)
|
|
|
Postgres 12.3 |
|
|
> Does this feature only work when using static IP addresses as backend? Docker service gets re-created with different IP, however I define it using a hostname... > Maybe the client_hostname in pg_stat_replication view has to exactly match the defined backend name? Docker's reverse DNS returns (hostname).(replica-number).(random-garbage). Docker or not, I believe pgpool should try compare both hostname and FQDN. To use this auto_failback feature, you need to specify "backend_application_name1" which is specified in "primary_conninfo". Could you try again by specifying "backend_application_name"? https://www.pgpool.net/docs/latest/en/html/runtime-config-backend-settings.html#GUC-BACKEND-APPLICATION-NAME Sorry that document doesn't mention that. I will add it to documentation. |
|
|
I can confirm that now it works as intended. Thank you! Sorry for the delay -- I got covid. |
|
|
OK. I'm going to close this issue. Take care! |
| Date Modified | Username | Field | Change |
|---|---|---|---|
| 2021-03-16 06:04 | Garreat | New Issue | |
| 2021-03-16 06:04 | Garreat | File Added: pgpool.conf | |
| 2021-03-16 06:08 | Garreat | Note Added: 0003768 | |
| 2021-03-17 18:00 | pengbo | Assigned To | => pengbo |
| 2021-03-17 18:00 | pengbo | Status | new => assigned |
| 2021-03-17 18:08 | pengbo | Note Added: 0003772 | |
| 2021-03-17 18:09 | pengbo | Note Edited: 0003772 | |
| 2021-03-17 18:09 | pengbo | Status | assigned => feedback |
| 2021-04-08 01:38 | Garreat | Note Added: 0003797 | |
| 2021-04-08 01:38 | Garreat | Status | feedback => assigned |
| 2021-04-08 13:55 | pengbo | Note Added: 0003798 | |
| 2021-04-08 13:56 | pengbo | Status | assigned => closed |