[pgpool-general: 3565] Re: pgpool 3.3.1 - unexplained failover without health-check retries

Guy Meler melguy at gmail.com
Mon Mar 23 15:30:54 JST 2015


I have the same problem with 3.4.
Whenever I restart the postgresql service, the node's status immediately
becomes '3' (down) and pgpool runs the failover script, ignoring all the
health check settings. From what I have seen lately, those settings are
relevant only when pgpool starts and a backend is already down; in that
situation, pgpool does check the node periodically.
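
In the quoted log below, the failover request appears to come from a child
process (pid 2306, right after "new_connection: create_cp() failed") rather
than from the health check, and the quoted pgpool.conf has
fail_over_on_backend_error = on. A minimal sketch of the settings I would
experiment with, assuming the documented behavior of
fail_over_on_backend_error applies to these versions; the health check
values are copied from the quoted config and the combination is untested:

```
# Assumption: untested combination, shown only to illustrate the idea.
# With this off, an error while reading from or writing to a backend
# should end only that session with an error instead of requesting
# failover.
fail_over_on_backend_error = off

# The health check is then the path expected to detach a dead node,
# after the retries below are exhausted.
health_check_period = 40
health_check_timeout = 10
health_check_max_retries = 3
health_check_retry_delay = 20
```

As far as I understand the documentation, even with this parameter off
pgpool still fails over when it detects an administrative shutdown of
postmaster, which would match the restart case described above.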

On Fri, 6 Mar 2015 04:16 Tatsuo Ishii <ishii at postgresql.org> wrote:

> pgpool-II 3.3.1 is pretty old. Please update to the latest version
> (3.3.5 at this moment).  If you still have the problem, please let us
> know.
>
> Best regards,
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> English: http://www.sraoss.co.jp/index_en.php
> Japanese:http://www.sraoss.co.jp
>
> > Hello Pgpool support,
> > I am using Pgpool 3.3.1 for our production database, in master-slave
> > mode with health check enabled.
> > Today, and not for the first time, Pgpool "decided" to fail over after
> > losing connectivity to the master node, without the retry logic that
> > health-check mode should apply.
> > Here is the Pgpool log from the time of the failure:
> > Mar  4 16:35:06 pgpool pgpool[22795]: DB node id: 0 backend pid: 44867 statement: C message
> > Mar  4 16:35:06 pgpool pgpool[22795]: DB node id: 0 backend pid: 44867 statement: B message
> > Mar  4 16:35:06 pgpool pgpool[22795]: DB node id: 1 backend pid: 26169 statement: B message
> > Mar  4 16:35:06 pgpool pgpool[22795]: DB node id: 0 backend pid: 44867 statement: Execute: ROLLBACK
> > Mar  4 16:35:06 pgpool pgpool[22795]: DB node id: 1 backend pid: 26169 statement: Execute: ROLLBACK
> > Mar  4 16:35:06 pgpool pgpool[22795]: DB node id: 0 backend pid: 44867 statement: Parse: SELECT 1
> > Mar  4 16:35:06 pgpool pgpool[22795]: DB node id: 0 backend pid: 44867 statement: B message
> > Mar  4 16:35:06 pgpool pgpool[22795]: DB node id: 0 backend pid: 44867 statement: D message
> > Mar  4 16:35:06 pgpool pgpool[22795]: DB node id: 0 backend pid: 44867 statement: Execute: SELECT 1
> > Mar  4 16:35:09 pgpool pgpool[2306]: connect_inet_domain_socket: gethostbyname() failed: Unknown host host: pnmpg3.sj.peer39.com
> > Mar  4 16:35:09 pgpool pgpool[2306]: connection to pnmpg3.sj.peer39.com(5432) failed
> > Mar  4 16:35:09 pgpool pgpool[2306]: new_connection: create_cp() failed
> > Mar  4 16:35:09 pgpool pgpool[2306]: degenerate_backend_set: 0 fail over request from pid 2306
> > Mar  4 16:35:09 pgpool pgpool[32118]: starting degeneration. shutdown host pnmpg3.sj.peer39.com(5432)
> > Mar  4 16:35:09 pgpool pgpool[32118]: Restart all children
> > Mar  4 16:35:09 pgpool pgpool[32118]: execute command: /etc/pgpool-II/failover.sh 0 "pnmpg3.sj.peer39.com" 5432 /var/lib/pgsql/9.2/data 1 0 "pnmpg4.sj.peer39.com" 0
> > Mar  4 16:35:19 pgpool pgpool[32118]: find_primary_node_repeatedly: waiting for finding a primary node
> > Mar  4 16:35:35 pgpool pgpool[32118]: failover: set new primary node: -1
> > Mar  4 16:35:35 pgpool pgpool[32118]: failover: set new master node: 1
> > Mar  4 16:35:35 pgpool pgpool[32233]: worker process received restart request
> > Mar  4 16:35:35 pgpool pgpool[32118]: failover done. shutdown host pnmpg3.sj.peer39.com(5432)
> >
> > Is this normal behavior or is it a bug? If it is normal, is there a
> > way to change it?
> >
> > Pgpool enabled features:
> >
> > [ mode ] Master Slave mode
> >
> > [ healthcheck ] every 40 seconds / retry upto 3 counts
> >
> > Partial pgpool.conf:
> > #------------------------------------------------------------------------------
> > # POOLS
> > #------------------------------------------------------------------------------
> > # - Pool size -
> > num_init_children = 50
> > max_pool = 1
> > # - Life time -
> > child_life_time = 50
> > child_max_connections = 0
> > connection_life_time = 1800
> > client_idle_limit = 0
> > #------------------------------------------------------------------------------
> > # MASTER/SLAVE MODE
> > #------------------------------------------------------------------------------
> > master_slave_mode = on
> > master_slave_sub_mode = 'stream'
> > # - Streaming -
> > sr_check_period = 30
> > sr_check_user = 'username'
> > sr_check_password = 'password'
> > delay_threshold = 0
> > # - Special commands -
> > follow_master_command = ''
> > #------------------------------------------------------------------------------
> > # HEALTH CHECK
> > #------------------------------------------------------------------------------
> > health_check_period = 40
> > health_check_timeout = 10
> > health_check_user = 'username'
> > health_check_password = 'password'
> > health_check_max_retries = 3
> > health_check_retry_delay = 20
> > #------------------------------------------------------------------------------
> > # FAILOVER AND FAILBACK
> > #------------------------------------------------------------------------------
> > failover_command = '/etc/pgpool-II/failover.sh %d "%h" %p %D %m %M "%H" %P'
> > failback_command = ''
> > fail_over_on_backend_error = on
> > search_primary_node_timeout = 10
> > #------------------------------------------------------------------------------
> > # ONLINE RECOVERY
> > #------------------------------------------------------------------------------
> > recovery_user = ''
> > recovery_password = ''
> > recovery_1st_stage_command = 'basebackup.sh'
> > recovery_2nd_stage_command = ''
> > recovery_timeout = 90
> > client_idle_limit_in_recovery = 0
> >
> > Thanks
> >
> > Boaz Goldstein
> > DBA, Deployment DBA Team
> > boaz.goldstein at sizmek.com
> > M +972.524.695731
> > T +972.9.778.2910
> > Israel