[pgpool-general: 5047] Re: pgpool fail to load balance after database restart
Joffrey Dupire
joffrey.dupire at morning.com
Fri Oct 14 16:37:34 JST 2016
I don't think this is a bug, like a said previously in pgpool.conf you have
an argument named
In pgpool.conf you should have recovery_failed_node, just create a script
that use command detach_node and attach_node. You should have parameters
example in pgpool.conf
2016-10-14 2:17 GMT+02:00 Dylan Luong <Dylan.Luong at unisa.edu.au>:
> Is this a bug of pgpool or is it meant to be like that where we need some
> manual intervention in this situation?
>
>
>
> *From:* Joffrey Dupire [mailto:joffrey.dupire at morning.com]
> *Sent:* Friday, 14 October 2016 12:32 AM
> *To:* Dylan Luong
> *Cc:* pgpool-general at pgpool.net
> *Subject:* Re: [pgpool-general: 5042] Re: pgpool fail to load balance
> after database restart
>
>
>
> Hi i think you can script that. In pgpool.conf you should have something
> like "recover_command" but you have to know the failed node number maybe
> option like %d should exists to get this number.
>
>
>
> 2016-10-13 14:53 GMT+02:00 Dylan Luong <Dylan.Luong at unisa.edu.au>:
>
> Thanks. Does that mean every time where there is temp disconnection (ie
> network issue) we have to manually run a detach_node and attach_node?
>
>
>
> *From:* Joffrey Dupire [mailto:joffrey.dupire at morning.com]
> *Sent:* Thursday, 13 October 2016 9:44 PM
> *To:* Dylan Luong
> *Cc:* pgpool-general at pgpool.net
> *Subject:* Re: [pgpool-general: 5042] Re: pgpool fail to load balance
> after database restart
>
>
>
> Hi,
>
>
>
> You have to use detach_node command and attach_node
>
>
>
>
>
> 2016-10-13 12:52 GMT+02:00 Dylan Luong <Dylan.Luong at unisa.edu.au>:
>
> Hi,
>
> More info:
>
> For scenario 1, I tried restarting pgpool with debug (pgpool -n -d >
> pgpool2.log 2>&1), I can see in the log file that the health check working
> but it reports that the slave node 0 down (status 3) even though it is up.
>
>
>
> 2016-10-13 21:11:42: pid 6618: DEBUG: health check: clearing alarm
>
> 2016-10-13 21:11:42: pid 6618: DEBUG: doing health check against
> database:postgres user:postgres
>
> 2016-10-13 21:11:42: pid 6618: DEBUG: Backend DB node 0 status is 3
>
> 2016-10-13 21:11:42: pid 6618: DEBUG: Backend DB node 1 status is 2
>
> 2016-10-13 21:11:42: pid 6618: DEBUG: Trying to make persistent DB
> connection to backend node 1 having status 2
>
> 2016-10-13 21:11:42: pid 6618: DEBUG: pool_read: read 13 bytes from
> backend 0
>
> 2016-10-13 21:11:42: pid 6618: DEBUG: authenticate kind = 5
>
> 2016-10-13 21:11:42: pid 6618: DEBUG: pool_write: to backend: 0 kind:p
>
> 2016-10-13 21:11:42: pid 6618: DEBUG: pool_read: read 331 bytes from
> backend 0
>
> 2016-10-13 21:11:42: pid 6618: DEBUG: authenticate kind = 0
>
> 2016-10-13 21:11:42: pid 6618: DEBUG: authenticate backend: key data
> received
>
> 2016-10-13 21:11:42: pid 6618: DEBUG: authenticate backend: transaction
> state: I
>
> 2016-10-13 21:11:42: pid 6618: DEBUG: persistent DB connection to backend
> node 1 having status 2 is successful
>
>
>
> Why is it reporting its down. Its definitely up and receiving wal
> replication from the Master node1.
>
>
>
>
>
> *From:* pgpool-general-bounces at pgpool.net [mailto:pgpool-general-
> bounces at pgpool.net] *On Behalf Of *Dylan Luong
> *Sent:* Tuesday, 11 October 2016 6:06 PM
> *To:* pgpool-general at pgpool.net
> *Subject:* [pgpool-general: 5038] pgpool fail to load balance after
> database restart
>
>
>
> Hi,
>
> I am new to pgpool. I have just installed pgpool and configured
> Master/Slave mode with “stream” sub_mode. I have enabled load balance as
> that is the main feature we want test out. I have enabled Health Check. The
> load balance appears to work as I can see connections to both Master and
> Slave database.
>
> pgpool is installed on a separate server. (linux Redhat 6)
>
> ie: server1 = pgpool
>
> server2 = Master
>
> server3= Slave
>
> (wal replication)
>
>
>
> I tested two scenarios and it failing:
>
>
>
> 1. When I stop/start the slave database, pgpool stops load
> balancing. It looks like it doesn’t know that it available again after the
> slave comes backup online. Even after I restart pgpool it doesn’t load
> balance. It just sends all queries to the Master.
>
>
>
> 2. I tried shutting down slave db and then shut down master db (no
> failover) and restart master db and restart slave, pgpool failed to
> reconnect to the master or the slave. I get database unavailable errors.
> However, if I restart pgpool for this scenario, its all fine.
>
>
>
> It appears that pgpool is not doing the healthcheck correctly?? Or have I
> missed something here??
>
>
>
> Here are my main pgpool.conf changes:
>
>
>
> #-----------------------------------------------------------
> -------------------
>
> # CONNECTIONS
>
> #-----------------------------------------------------------
> -------------------
>
>
>
> # - pgpool Connection Settings -
>
>
>
> listen_addresses = '*'
>
> # Host name or IP address to listen on:
>
> # '*' for all, '' for no TCP/IP
> connections
>
> # (change requires restart)
>
> #port = 9999
>
> port = 5432
>
> # Port number
>
> # (change requires restart)
>
> socket_dir = '/var/run/postgresql'
>
> # Unix domain socket path
>
> # The Debian package defaults to
>
> # /var/run/postgresql
>
> # (change requires restart)
>
> listen_backlog_multiplier = 2
>
> # Set the backlog parameter of
> listen(2) to
>
> #
> num_init_children * listen_backlog_multiplier.
>
> # (change requires restart)
>
> serialize_accept = off
>
> # whether to serialize accept() call
> to avoid thundering herd problem
>
> # (change requires restart)
>
>
>
> # - pgpool Communication Manager Connection Settings -
>
>
>
> pcp_listen_addresses = '*'
>
> # Host name or IP address for pcp
> process to listen on:
>
> # '*' for all, '' for no TCP/IP
> connections
>
> # (change requires restart)
>
> pcp_port = 9898
>
> # Port number for pcp
>
> # (change requires restart)
>
> pcp_socket_dir = '/var/run/postgresql'
>
> # Unix domain socket path for pcp
>
> # The Debian package defaults to
>
> # /var/run/postgresql
>
> # (change requires restart)
>
>
>
> # - Backend Connection Settings -
>
>
>
> backend_hostname0 = '10.69.20.11'
>
> # Host name or IP address to connect
> to for backend 0
>
> backend_port0 = 5432
>
> # Port number for backend 0
>
> backend_weight0 = 1
>
> # Weight for backend 0 (only in load
> balancing mode)
>
> backend_data_directory0 = '/var/lib/pgsql/9.5/data'
>
> # Data directory for backend 0
>
> backend_flag0 = 'ALLOW_TO_FAILOVER'
>
> # Controls various backend behavior
>
> # ALLOW_TO_FAILOVER or
> DISALLOW_TO_FAILOVER
>
> backend_hostname1 = '10.68.20.11'
>
> backend_port1 = 5432
>
> backend_weight1 = 1
>
> backend_data_directory1 = '/var/lib/pgsql/9.5/data'
>
> backend_flag1 = 'ALLOW_TO_FAILOVER'
>
>
>
>
>
> #-----------------------------------------------------------
> -------------------
>
> # POOLS
>
> #-----------------------------------------------------------
> -------------------
>
>
>
> # - Concurrent session and pool size -
>
>
>
> num_init_children = 32
>
> # Number of concurrent sessions allowed
>
> # (change requires restart)
>
> max_pool = 4
>
> # Number of connection pool caches per
> connection
>
> # (change requires restart)
>
>
>
> # - Life time -
>
>
>
> child_life_time = 300
>
> # Pool exits after being idle for this
> many seconds
>
> child_max_connections = 0
>
> # Pool exits after receiving that many
> connections
>
> # 0 means no exit
>
> connection_life_time = 0
>
> # Connection to backend closes after
> being idle for this many seconds
>
> # 0 means no close
>
> client_idle_limit = 0
>
> # Client is disconnected after being
> idle for that many seconds
>
> # (even inside an explicit
> transactions!)
>
> # 0 means no disconnection
>
>
>
> #-----------------------------------------------------------
> -------------------
>
> # CONNECTION POOLING
>
> #-----------------------------------------------------------
> -------------------
>
>
>
> connection_cache = on
>
> # Activate connection pools
>
> # (change requires restart)
>
>
>
> # Semicolon separated list of queries
>
> # to be issued at the end of a session
>
> # The default is for 8.3 and later
>
> reset_query_list = 'ABORT; DISCARD ALL'
>
> # The following one is for 8.2 and
> before
>
> #reset_query_list = 'ABORT; RESET ALL; SET SESSION AUTHORIZATION DEFAULT'
>
>
>
>
>
> #-----------------------------------------------------------
> -------------------
>
> # LOAD BALANCING MODE
>
> #-----------------------------------------------------------
> -------------------
>
>
>
> load_balance_mode = on
>
> # Activate load balancing mode
>
> # (change requires restart)
>
> ignore_leading_white_space = on
>
> # Ignore leading white spaces of each
> query
>
> white_function_list = ''
>
> # Comma separated list of function
> names
>
> # that don't write to database
>
> # Regexp are accepted
>
> #black_function_list = 'nextval,setval,nextval,setval'
>
> black_function_list = 'currval,lastval,nextval,setval'
>
> # Comma separated list of function
> names
>
> # that write to database
>
> # Regexp are accepted
>
>
>
> database_redirect_preference_list = ''
>
> #
> comma separated list of pairs of database and node id.
>
> #
> example: postgres:primary,mydb[0-4]:1,mydb[5-9]:2'
>
> #
> valid for streaming replicaton mode only.
>
>
>
> app_name_redirect_preference_list = ''
>
> #
> comma separated list of pairs of app name and node id.
>
> #
> example: 'psql:primary,myapp[0-4]:1,myapp[5-9]:standby'
>
> #
> valid for streaming replicaton mode only.
>
> allow_sql_comments = off
>
> # if
> on, ignore SQL comments when judging if load balance or
>
> #
> query cache is possible.
>
> # If
> off, SQL comments effectively prevent the judgment
>
> # (pre
> 3.4 behavior).
>
>
>
> #-----------------------------------------------------------
> -------------------
>
> # MASTER/SLAVE MODE
>
> #-----------------------------------------------------------
> -------------------
>
>
>
> master_slave_mode = on
>
> # Activate master/slave mode
>
> # (change requires restart)
>
> master_slave_sub_mode = 'stream'
>
> # Master/slave sub mode
>
> # Valid values are combinations slony
> or
>
> # stream. Default is slony.
>
> # (change requires restart)
>
>
>
> # - Streaming -
>
>
>
> sr_check_period = 5
>
> # Streaming replication check period
>
> # Disabled (0) by default
>
> sr_check_user = 'postgres'
>
> # Streaming replication check user
>
> # This is necessary even if you disable
>
> # streaming replication delay check
> with
>
> # sr_check_period = 0
>
> sr_check_password = 'xxxxxxxxxxxxxxxx'
>
> # Password for streaming replication
> check user
>
> sr_check_database = 'postgres'
>
> # Database name for streaming
> replication check
>
> delay_threshold = 10000000
>
> # Threshold before not dispatching
> query to standby node
>
> # Unit is in bytes
>
> # Disabled (0) by default
>
>
>
> # - Special commands -
>
>
>
> follow_master_command = ''
>
> # Executes this command after master
> failover
>
> # Special values:
>
> # %d = node id
>
> # %h = host name
>
> # %p = port number
>
> # %D = database cluster path
>
> # %m = new master node id
>
> # %H = hostname of the new master
> node
>
> # %M = old master node id
>
> # %P = old primary node id
>
> # %r = new master port number
>
> # %R = new master database cluster
> path
>
> # %% = '%' character
>
>
>
> #-----------------------------------------------------------
> -------------------
>
> # HEALTH CHECK
>
> #-----------------------------------------------------------
> -------------------
>
>
>
> health_check_period = 40
>
> # Health check period
>
> # Disabled (0) by default
>
> health_check_timeout = 10
>
> # Health check timeout
>
> # 0 means no timeout
>
> health_check_user = 'postgres'
>
> # Health check user
>
> health_check_password = 'xxxxxxxxxxxxxxxxxx'
>
> # Password for health check user
>
> health_check_database = ''
>
> # Database name for health check. If
> '', tries 'postgres' frist, then 'template1'
>
>
>
> health_check_max_retries = 3
>
> # Maximum number of times to retry a
> failed health check before giving up.
>
> health_check_retry_delay = 1
>
> # Amount of time to wait (in seconds)
> between retries.
>
> connect_timeout = 10000
>
> # Timeout value in milliseconds before
> giving up to connect to backend.
>
> #
> Default is 10000 ms (10 second). Flaky network user may want to increase
>
> # the
> value. 0 means no timeout.
>
> # Note
> that this value is not only used for health check,
>
> # but
> also for ordinary conection to backend.
>
>
>
>
>
> Thanks.
>
> Dylan
>
>
>
>
> _______________________________________________
> pgpool-general mailing list
> pgpool-general at pgpool.net
> http://www.pgpool.net/mailman/listinfo/pgpool-general
>
>
>
>
>
> --
>
> *Joffrey DUPIRE*
>
> Network Guru
>
> *joffrey.dupire at morning.com <prenom.nom at morning.com>*
>
> *morning <http://morning.com/>*| *up <http://up.morning.com/>* | corpo
> <http://corpo.morning.com/>
>
> Twitter <https://twitter.com/morning_com> | Facebook
> <https://www.facebook.com/morning.fr> | Linkedin
> <https://www.linkedin.com/company/payname>
>
>
>
>
>
> --
>
> *Joffrey DUPIRE*
>
> Network Guru
>
> *joffrey.dupire at morning.com <prenom.nom at morning.com>*
>
> *morning <http://morning.com/>*| *up <http://up.morning.com/>* | corpo
> <http://corpo.morning.com/>
>
> Twitter <https://twitter.com/morning_com> | Facebook
> <https://www.facebook.com/morning.fr> | Linkedin
> <https://www.linkedin.com/company/payname>
>
--
*Joffrey DUPIRE*
Network Guru
*joffrey.dupire at morning.com <prenom.nom at morning.com>*
*morning <http://morning.com/>*| *up <http://up.morning.com/>* | corpo
<http://corpo.morning.com/>
Twitter <https://twitter.com/morning_com> | Facebook
<https://www.facebook.com/morning.fr> | Linkedin
<https://www.linkedin.com/company/payname>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.pgpool.net/pipermail/pgpool-general/attachments/20161014/894ba287/attachment.htm>
More information about the pgpool-general
mailing list