[pgpool-committers: 5979] pgpool: Fix for 0000483: online-recovery is blocked after a child proce

Muhammad Usama m.usama at gmail.com
Thu Aug 8 23:44:17 JST 2019

Fix for 0000483: online-recovery is blocked after a child process exits ...

The problem is that if a child process exits abnormally during the second stage
of online recovery, the connection counter that keeps track of exiting
processes is not decremented, and Pgpool-II keeps waiting for the exit of
the already exited process. Eventually, the recovery fails after
client_idle_limit_in_recovery expires.

The fix for this issue is to set the connection counter to zero when
client_idle_limit_in_recovery is enabled and its value is less than
recovery_timeout, since all clients must have been kicked out by the time
client_idle_limit_in_recovery expires.
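
The condition described above can be sketched roughly as follows. This is
only an illustration, not the committed code: the struct, the function names,
and the rule that a positive client_idle_limit_in_recovery means "enabled"
are assumptions made for the example.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two Pgpool-II settings named above. */
typedef struct
{
    int client_idle_limit_in_recovery; /* seconds; assumed enabled when > 0 */
    int recovery_timeout;              /* seconds */
} RecoverySettings;

/*
 * True when the idle limit is enabled and shorter than the recovery
 * timeout, i.e. every client must be gone before recovery times out.
 */
static bool
can_reset_conn_counter(const RecoverySettings *s)
{
    return s->client_idle_limit_in_recovery > 0 &&
           s->client_idle_limit_in_recovery < s->recovery_timeout;
}

/*
 * Returns the connection counter to use after waiting waited_seconds.
 * Once the idle limit has expired, a stale count left behind by an
 * abnormally exited child is discarded by forcing the counter to zero.
 */
static int
conn_counter_after_wait(const RecoverySettings *s, int conn_counter,
                        int waited_seconds)
{
    if (can_reset_conn_counter(s) &&
        waited_seconds >= s->client_idle_limit_in_recovery)
        return 0;
    return conn_counter;
}
```

With an idle limit of 30 seconds and a recovery timeout of 90 seconds, a
counter stuck at 3 would be reset to 0 once 30 seconds have passed. With the
idle limit disabled or larger than recovery_timeout, the counter is left
untouched.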

A similar fix was already committed as part of bug 431 by Tatsuo Ishii, so this
commit basically imports the same logic into the watchdog function that processes
the remote online recovery requests.

Apart from the above-mentioned change, Hoshiai San identified that the watchdog
IPC command timeout for the online recovery start function executed through the
watchdog is set to exactly the same value as recovery_timeout, and it needs to
be increased for the solution to work correctly.
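
The timeout relationship can be illustrated with a small sketch. The margin
value and the function name below are hypothetical; the commit message does
not state by how much the IPC timeout was increased.

```c
/* Hypothetical margin in seconds; the actual increase is not stated above. */
#define WD_IPC_TIMEOUT_MARGIN_SEC 10

/*
 * Returns an IPC command timeout strictly greater than recovery_timeout,
 * so the watchdog side does not give up at the same instant the recovery
 * itself is allowed to time out.
 */
static int
wd_recovery_ipc_timeout(int recovery_timeout)
{
    return recovery_timeout + WD_IPC_TIMEOUT_MARGIN_SEC;
}
```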



Modified Files
src/include/pool.h         |  1 +
src/pcp_con/recovery.c     |  4 ++++
src/watchdog/watchdog.c    | 11 +++++++++--
src/watchdog/wd_commands.c |  4 ++--
4 files changed, 16 insertions(+), 4 deletions(-)
