[pgpool-hackers: 4012] Re: Disabling failover when backend goes down or backend process killed

Tatsuo Ishii ishii at sraoss.co.jp
Sat Sep 4 11:56:14 JST 2021


> Hi Usama,
> 
>>>> To overcome the problem, I would like to introduce a new switch called
>>>> "enable_failover_on_backend_shutdown" for upcoming Pgpool-II 4.3.  If
>>>> enable_failover_on_backend_shutdown is on, pgpool will behave as it is
>>>> now. If it is off, pgpool will not trigger failover when an admin
>>>> shuts down the backend node or a backend process is killed. Instead the
>>>> session corresponding to the backend process will be terminated.
>>>> 
>>>> Comments or suggestions are welcome.
>> 
>> I think the default for the new parameter should be "off" because it
>> will help newcomers to Pgpool-II. The current behavior has been
>> confusing such users.
>> For example:
>> https://www.pgpool.net/mantisbt/view.php?id=726
> 
> In yesterday's off-list discussion, you suggested that we
> generalize the feature so that users can specify error codes
> which should not trigger failover, for example the max_connections error.
> 
> I have investigated the code and found that the only error codes which
> trigger failover are:
> 
> #define ADMIN_SHUTDOWN_ERROR_CODE "57P01"
> #define CRASH_SHUTDOWN_ERROR_CODE "57P02"
> 
> i.e. the max_connections error is not handled in this code path. It's
> actually handled by the health check. The health check process just tries
> to connect to the backend and, if it fails for whatever reason, including
> a max_connections error, it triggers failover after the specified number
> of retries. We could enhance this so that a max_connections error does not
> trigger failover in the health check, but that is another story.
> 
> Am I missing something?
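
To make the proposed behavior concrete, here is a rough sketch of how
the decision could look. It is not the actual pgpool source: the enum
and the function decide_action() are made-up names for illustration,
and the switch is passed in as a plain bool. Only the two #defines
come from the existing code.

```c
#include <stdbool.h>
#include <string.h>

#define ADMIN_SHUTDOWN_ERROR_CODE "57P01"
#define CRASH_SHUTDOWN_ERROR_CODE "57P02"

/* Hypothetical result type, for illustration only. */
typedef enum
{
	ACTION_NONE,				/* not handled in this code path */
	ACTION_TERMINATE_SESSION,	/* end only the affected session */
	ACTION_FAILOVER				/* current behavior */
} backend_error_action;

/*
 * Decide what to do with an error reported by a backend, given its
 * SQLSTATE and the proposed enable_failover_on_backend_shutdown switch.
 */
static backend_error_action
decide_action(const char *sqlstate, bool enable_failover_on_backend_shutdown)
{
	bool		is_shutdown =
		strcmp(sqlstate, ADMIN_SHUTDOWN_ERROR_CODE) == 0 ||
		strcmp(sqlstate, CRASH_SHUTDOWN_ERROR_CODE) == 0;

	if (!is_shutdown)
		return ACTION_NONE;		/* e.g. 53300 (too many connections) */

	return enable_failover_on_backend_shutdown
		? ACTION_FAILOVER
		: ACTION_TERMINATE_SESSION;
}
```

With the switch on, "57P01" and "57P02" give ACTION_FAILOVER as they
do today. With it off, they give ACTION_TERMINATE_SESSION. Any other
code, including the max_connections one, falls through untouched.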

Note that if the health check is disabled, or pgpool tries to connect
to the backend and fails because of a max_connections error (while the
health check is still retrying), the client gets the following error
but no failover happens:

FATAL:  sorry, too many clients already

So far so good. However logs are not so good:

2021-09-04 11:49:22.967: child pid 661457: ERROR:  backend authentication failed
2021-09-04 11:49:22.967: child pid 661457: DETAIL:  backend response with kind 'E' when expecting 'R'
2021-09-04 11:49:22.967: child pid 661457: HINT:  This issue can be caused by version mismatch (current version 3)
2021-09-04 11:49:22.968: child pid 661457: ERROR:  backend authentication failed
2021-09-04 11:49:22.968: child pid 661457: DETAIL:  backend response with kind 'E' when expecting 'R'
2021-09-04 11:49:22.968: child pid 661457: HINT:  This issue can be caused by version mismatch (current version 2)

In this case the cause of the error is clearly not a protocol version
mismatch, so the HINT is misleading. We need to improve this message
in the future.
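
One possible direction, as a sketch only: in the version 3 protocol
the 'E' response carries typed fields ('S' severity, 'C' SQLSTATE,
'M' message), so the real cause could be read from the packet instead
of guessing at a version mismatch. The helper below is hypothetical
(error_response_field() is not an existing pgpool function) and
assumes the caller has already consumed the kind byte and the 4-byte
length.

```c
#include <stddef.h>
#include <string.h>

/*
 * Look up one field in the body of a protocol v3 ErrorResponse.
 * The body is a sequence of <1-byte field type><NUL-terminated string>
 * entries, ended by a single zero byte.
 * Returns a pointer to the field value, or NULL if the field is
 * absent or the body is malformed.
 */
static const char *
error_response_field(const char *body, size_t len, char field)
{
	size_t		i = 0;

	while (i < len && body[i] != '\0')
	{
		char		type = body[i++];
		const char *value = body + i;
		const char *end = memchr(value, '\0', len - i);

		if (end == NULL)
			return NULL;		/* unterminated string */
		if (type == field)
			return value;
		i += (size_t) (end - value) + 1;	/* skip value and its NUL */
	}
	return NULL;
}
```

For the "too many clients" case, the 'C' field would be 53300 and the
'M' field would be "sorry, too many clients already", which could go
into DETAIL in place of the version mismatch HINT.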
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp

