[pgpool-general: 5107] Re: pgpool incorrectly thinks backend cluster is down

manoj amanualjolt at gmail.com
Tue Nov 8 01:45:34 JST 2016


> Whenever Pgpool-II thinks a backend is down, there should be a
> log entry in the Pgpool-II log file. Please check.

This is the error in the log file when this happens:

2016-11-02 00:00:07: pid 9217: DETAIL:  postmaster on DB node 0 was shutdown by administrative command
2016-11-02 00:00:07: pid 9217: LOG:  received degenerate backend request for node_id: 0 from pid [9217]
2016-11-02 00:00:07: pid 9188: LOG:  starting degeneration. shutdown host prod1.amazonaws.com(5439)
2016-11-02 00:00:07: pid 9188: LOG:  Restart all children
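
To locate such events in a long log file, a small scan like the following may help. It is a minimal sketch in Python, assuming each entry sits on one line in the actual file and follows the "date time: pid N: LEVEL:  message" layout seen in this excerpt; the function name find_degenerations is made up for illustration.

```python
import re

# Layout assumed from the excerpt in this thread:
# "<date> <time>: pid <pid>: <LEVEL>:  <message>"
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}): "
    r"pid (?P<pid>\d+): (?P<level>[A-Z]+):\s+(?P<msg>.*)$"
)
DEGENERATE = re.compile(r"degenerate backend request for node_id: (\d+)")


def find_degenerations(lines):
    """Return (timestamp, node_id) for each degenerate backend request."""
    events = []
    for line in lines:
        entry = LOG_LINE.match(line.rstrip("\n"))
        if not entry:
            continue  # skip lines that do not match the assumed layout
        request = DEGENERATE.search(entry.group("msg"))
        if request:
            events.append((entry.group("ts"), int(request.group(1))))
    return events


sample = [
    "2016-11-02 00:00:07: pid 9217: DETAIL:  postmaster on DB node 0 was shutdown by administrative command",
    "2016-11-02 00:00:07: pid 9217: LOG:  received degenerate backend request for node_id: 0 from pid [9217]",
    "2016-11-02 00:00:07: pid 9188: LOG:  starting degeneration. shutdown host prod1.amazonaws.com(5439)",
]
print(find_degenerations(sample))  # [('2016-11-02 00:00:07', 0)]
```

The lines just before each reported timestamp are then the place to look for what triggered the request.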

What does "postmaster on DB node 0 was shutdown by administrative command"
mean? I haven't sent any shutdown commands to pgpool. I verify connectivity to
the cluster whenever this happens, and it is always fine. Why does the health
check, which I configured to run every 30 seconds, not detect that the cluster
is back up and update the pgpool_status file? Health check details from the log
are below:

2016-11-01 23:59:54: pid 9188: LOG:  notice_backend_error: called from pgpool main. ignored.
2016-11-01 23:59:54: pid 9188: WARNING:  child_exit: called from invalid process. ignored.
2016-11-01 23:59:54: pid 9188: ERROR:  unable to read data from DB node 0
2016-11-01 23:59:54: pid 9188: DETAIL:  socket read failed with an error "Success"

What does the above log indicate?

> Yes, it randomly routes to backends. You can control the probability
> of the routing.
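
For reference, weighted random selection of the kind the reply describes can be sketched as below. This is an illustration in Python, not Pgpool-II's code: it assumes the chance of picking a backend is its backend_weight divided by the sum of all weights, and the function name pick_backend is made up for the example.

```python
import random
from collections import Counter


def pick_backend(weights, rng):
    """Return a backend index chosen with probability weight / sum(weights)."""
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]


rng = random.Random(0)   # fixed seed so repeated runs behave the same
weights = [1, 1]         # backend_weight0 and backend_weight1 as set in this thread
counts = Counter(pick_backend(weights, rng) for _ in range(10_000))
print(counts)            # close to an even split between node 0 and node 1
```

With weights such as [3, 1], about three quarters of the picks would go to node 0. Each pick is independent, so this is not round robin, and a short run of queries can look uneven even with equal weights.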

Is it possible to control routing with a round-robin approach, or by sending
queries to the least-used cluster? If so, where do I configure this?

Thanks,
- Manoj

On Mon, Nov 7, 2016 at 12:08 AM, Tatsuo Ishii <ishii at sraoss.co.jp> wrote:

> > I have pgpool configured against two Redshift backend clusters to do
> > parallel writes. Seemingly at random, pgpool determines that one or both
> > of the clusters are down and stops accepting connections even when they
> > are not down. I have health check configured every 30 seconds but that
> > does not help as it checks health and still determines they are down in
> > the pgpool_status file. How is health status determined and written to the
> > file /var/log/pgpool/pgpool_status and why does pgpool think the clusters
> > are down when they are not?
>
> Whenever Pgpool-II thinks a backend is down, there should be a
> log entry in the Pgpool-II log file. Please check.
>
> > I also tested read query routing and noticed they were being routed
> > randomly to the backend clusters. Is there a specific algorithm that
> > pgpool uses for read query routing?
>
> Yes, it randomly routes to backends. You can control the probability
> of the routing.
>
> >
> > My config parameters are below:
> >
> > backend_hostname0 = 'cluster1'
> > backend_port0 = 5439
> > backend_weight0 = 1
> > backend_data_directory0 = '/data1'
> > backend_flag0 = 'ALLOW_TO_FAILOVER'
> >
> > backend_hostname1 = 'cluster2'
> > backend_port1 = 5439
> > backend_weight1 = 1
> > backend_data_directory1 = '/data1'
> > backend_flag1 = 'ALLOW_TO_FAILOVER'
> >
> > #------------------------------------------------------------------------------
> > # HEALTH CHECK
> > #------------------------------------------------------------------------------
> >
> > health_check_period = 30
> >                                    # Health check period
> >                                    # Disabled (0) by default
> > health_check_timeout = 20
> >                                    # Health check timeout
> >                                    # 0 means no timeout
> > health_check_user = 'username'
> >                                    # Health check user
> > health_check_password = 'password'
> >                                    # Password for health check user
> > health_check_max_retries = 10
> >                                    # Maximum number of times to retry a failed health check before giving up.
> > health_check_retry_delay = 1
> >                                    # Amount of time to wait (in seconds) between retries.
>
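
How the two retry settings above interact can be sketched roughly as follows. This is a minimal illustration in Python, assuming one initial attempt followed by up to health_check_max_retries retries with health_check_retry_delay seconds between them. It is not Pgpool-II's implementation, the check callable is a hypothetical stand-in for a real connection attempt, and it does not cover re-attaching a node that is already marked down.

```python
import time


def node_is_healthy(check, max_retries=10, retry_delay=1.0, sleep=time.sleep):
    """Run check() once, then retry up to max_retries times on failure.

    check is a hypothetical callable that returns True when the backend responds.
    """
    for attempt in range(max_retries + 1):
        if check():
            return True
        if attempt < max_retries:
            sleep(retry_delay)  # pause before the next retry
    return False


attempts = []


def flaky():
    """Fail twice, then succeed (stands in for a briefly unreachable backend)."""
    attempts.append(1)
    return len(attempts) >= 3


# The sleep is replaced with a no-op so the example finishes immediately.
print(node_is_healthy(flaky, sleep=lambda seconds: None), len(attempts))  # True 3
```

Under these assumptions, a backend would need to fail eleven consecutive attempts in one health check cycle before the check gives up.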