[pgpool-general: 1302] Re: pgpool stopped accepting client connections after 1 node hung

Lonni J Friedman netllama at gmail.com
Tue Jan 8 09:06:18 JST 2013


On Mon, Jan 7, 2013 at 4:00 PM, Tatsuo Ishii <ishii at postgresql.org> wrote:
>> On Mon, Jan 7, 2013 at 3:05 PM, Tatsuo Ishii <ishii at postgresql.org> wrote:
>>>>>>> I don't understand why pgpool stopped accepting client connections.
>>>>>>> I'd expect that if any single node goes down, pgpool should continue
>>>>>>> to work and accept connections, and simply mark the unresponsive node
>>>>>>> as unavailable.
>>>>>>
>>>>>> That is my question too. Do you see this kind of message in the pgpool log?
>>>>>>
>>>>>>                 degenerate_backend_set: 2 fail over request from pid xxxx
>>>>>>
>>>>>> If you see this, pgpool should initiate the failover and mark cuda-db5 down.
>>>>>
>>>>> Nope, that message was not present at any time.
>>>>
>>>> There was a bug report regarding pgpool-II 3.2 (or higher)'s
>>>> connect_inet_domain_socket():
>>>> http://www.pgpool.net/mantisbt/view.php?id=46
>>>>
>>>> In the report, the error message was the same as yours
>>>> (connect_inet_domain_socket: connect() failed: Connection timed out),
>>>> and I have created a patch to fix it:
>>>> http://www.pgpool.net/mantisbt/file_download.php?file_id=55&type=bug
>>>>
>>>> Can you try it out? I am still investigating why you did not see a
>>>> failover, but I think you will want to try the patch first to avoid
>>>> the error.
>>>
>>> Oh, I think I see the reason why you did not see a failover.
>>> You have this:
>>> fail_over_on_backend_error = off
>>>
>>> In this case new_connection() does not trigger a failover.
>>>
>>>                         /* If fail_over_on_backend_error is true, do failover.
>>>                          * Otherwise, just exit this session.
>>>                          */
>>>                         if (pool_config->fail_over_on_backend_error)
>>>                         {
>>>                                 notice_backend_error(i);
>>>                         }
>>>                         child_exit(1);
>>
>> Thanks.
>>
>> Just so that I'm clear, your recommendation is that I set
>> fail_over_on_backend_error=on *AND* apply the patch you provided, and
>> that will allow pgpool to failover an unresponsive postgres server,
>> such that pgpool does not stop accepting client connections?
>
> Yes, exactly.

OK, thanks. I'll need to test this in my staging environment
(hopefully later this week), and then schedule downtime to apply it to
production (likely not until next month).
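
For reference, a minimal sketch of the configuration half of that plan,
assuming the setting lives in the usual pgpool.conf (the patch from the
bug report is applied to the pgpool source separately):

    # pgpool.conf
    # Trigger failover when connecting to a backend fails, instead of
    # only exiting the affected session.
    fail_over_on_backend_error = on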

