[pgpool-general: 138] Re: Healthcheck timeout not always respected

Tatsuo Ishii ishii at postgresql.org
Wed Jan 11 18:50:30 JST 2012


Ok, I did:

# iptables -A FORWARD -j REJECT --reject-with icmp-port-unreachable

on the host where pgpool is running, and then pulled the network cable
from the backend0 host's network interface. Pgpool detected the host
being down as expected...
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

> The backend is not the destination of this message; the pgpool host is, and
> we don't want it to ever receive it. With the command I sent you, the rule is
> created for any source and destination.
> 
> Regards,
> Stevo.
> 
> On Wed, Jan 11, 2012 at 10:38 AM, Tatsuo Ishii <ishii at postgresql.org> wrote:
> 
>> I did the following on the host where pgpool is running:
>>
>> # iptables -A FORWARD -j REJECT --reject-with icmp-port-unreachable -d 133.137.177.124
>> (133.137.177.124 is the host where the backend is running)
>>
>> Then I pulled the network cable from the backend0 host's network interface.
>> Pgpool detected the host being down as expected. Am I missing something?
>> --
>> Tatsuo Ishii
>> SRA OSS, Inc. Japan
>> English: http://www.sraoss.co.jp/index_en.php
>> Japanese: http://www.sraoss.co.jp
>>
>> > Hello Tatsuo,
>> >
>> > With backend0 on one host, just configure the following rule on the other
>> > host, where pgpool is:
>> >
>> > iptables -A FORWARD -j REJECT --reject-with icmp-port-unreachable
>> >
>> > then start pgpool with health checking and retrying configured, and then
>> > pull the network cable from the backend0 host's network interface.
>> >
>> > Regards,
>> > Stevo.
>> >
>> > On Wed, Jan 11, 2012 at 6:27 AM, Tatsuo Ishii <ishii at postgresql.org> wrote:
>> >
>> >> I want to try to test the situation you described:
>> >>
>> >> >> > When system is configured for security reasons not to return destination
>> >> >> > host unreachable messages, even though health_check_timeout is
>> >>
>> >> But I don't know how to do it. I pulled out the network cable and
>> >> pgpool detected it as expected. Also I configured the server which
>> >> PostgreSQL is running on to disable the 5432 port. In this case
>> >> connect(2) returned EHOSTUNREACH (No route to host) so pgpool detected
>> >> the error as expected.
>> >>
>> >> Could you please instruct me?
>> >> --
>> >> Tatsuo Ishii
>> >> SRA OSS, Inc. Japan
>> >> English: http://www.sraoss.co.jp/index_en.php
>> >> Japanese: http://www.sraoss.co.jp
>> >>
>> >> > Hello Tatsuo,
>> >> >
>> >> > Thank you for replying!
>> >> >
>> >> > I'm not sure what exactly is blocking. From reading the pgpool code, I
>> >> > suspect it is the part where the connection to the database is made,
>> >> > which does not seem to get interrupted by the alarm. I tested health
>> >> > check behaviour thoroughly: it works really well when the host/IP is
>> >> > reachable and only the backend (postgres) is down, but not when the
>> >> > backend host/IP itself is down. In the log I could see that the initial
>> >> > health check and each retry were delayed when the host/IP was not
>> >> > reachable, whereas when only the backend is not listening (is down) on a
>> >> > reachable host/IP, the initial health check and all retries follow the
>> >> > settings in pgpool.conf exactly.
>> >> >
>> >> > PGCONNECT_TIMEOUT is listed as one of the libpq environment variables in
>> >> > the docs (see http://www.postgresql.org/docs/9.1/static/libpq-envars.html ).
>> >> > There is an equivalent connect_timeout parameter for libpq's
>> >> > PQconnectdbParams (see
>> >> > http://www.postgresql.org/docs/9.1/static/libpq-connect.html#LIBPQ-CONNECT-CONNECT-TIMEOUT ).
>> >> > At the beginning of that same page there is some important information
>> >> > on using these functions.
>> >> >
>> >> > psql respects PGCONNECT_TIMEOUT.
>> >> >
>> >> > Regards,
>> >> > Stevo.
>> >> >
>> >> > On Wed, Jan 11, 2012 at 12:13 AM, Tatsuo Ishii <ishii at postgresql.org> wrote:
>> >> >
>> >> >> > Hello pgpool community,
>> >> >> >
>> >> >> > When a system is configured, for security reasons, not to return
>> >> >> > destination host unreachable messages, the socket call blocks and
>> >> >> > the alarm is not raised until the TCP timeout occurs, even though
>> >> >> > health_check_timeout is configured.
>> >> >>
>> >> >> Interesting. So are you saying that read(2) cannot be interrupted by
>> >> >> the alarm signal if the system is configured not to return destination
>> >> >> host unreachable messages? Could you please point me to where I can
>> >> >> find such information? (I'm not a network expert.)
>> >> >>
>> >> >> > I'm not a C programmer, but I found some info that the select call
>> >> >> > could be replaced with select/pselect calls. Maybe it would be best
>> >> >> > if the PGCONNECT_TIMEOUT value could be used here for the connection
>> >> >> > timeout. pgpool has libpq as a dependency, so why isn't it using
>> >> >> > libpq for the healthcheck db connect calls? Then PGCONNECT_TIMEOUT
>> >> >> > would be applied.
>> >> >>
>> >> >> I don't think libpq uses select/pselect for establishing connection,
>> >> >> but using libpq instead of homebrew code seems to be an idea. Let me
>> >> >> think about it.
>> >> >>
>> >> >> One question. Are you sure that libpq can deal with the case (not to
>> >> >> return destination host unreachable messages) by using
>> >> >> PGCONNECT_TIMEOUT?
>> >> >> --
>> >> >> Tatsuo Ishii
>> >> >> SRA OSS, Inc. Japan
>> >> >> English: http://www.sraoss.co.jp/index_en.php
>> >> >> Japanese: http://www.sraoss.co.jp
>> >> >>
>> >>
>>
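
(Illustration of the select()-based connection timeout discussed in this
thread. This is a minimal sketch, not pgpool's actual health check code and
not a description of how libpq implements connect_timeout. The helper name
connect_with_timeout and its IPv4-only interface are assumptions made for the
example. The idea is to put the socket in non-blocking mode so that connect()
returns at once, and then let select() bound the wait, so the timeout does not
depend on an alarm signal or on receiving an ICMP unreachable message.)

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>

/*
 * Hypothetical helper (not part of pgpool or libpq): connect to an IPv4
 * address and port, giving up after timeout_sec seconds.
 * Returns a connected, blocking socket descriptor on success, or -1 with
 * errno set (ETIMEDOUT if the timeout expired).
 */
int connect_with_timeout(const char *ip, int port, int timeout_sec)
{
    struct sockaddr_in addr;
    int fd, flags;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons((unsigned short) port);
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1) {
        errno = EINVAL;             /* not a valid dotted-quad address */
        return -1;
    }

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    /* Non-blocking mode: connect() returns immediately with EINPROGRESS. */
    flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        goto fail;

    if (connect(fd, (struct sockaddr *) &addr, sizeof(addr)) < 0) {
        fd_set wfds;
        struct timeval tv;
        int rc, err = 0;
        socklen_t len = sizeof(err);

        if (errno != EINPROGRESS)
            goto fail;

        /* The socket becomes writable once the attempt succeeds or fails. */
        FD_ZERO(&wfds);
        FD_SET(fd, &wfds);
        tv.tv_sec = timeout_sec;
        tv.tv_usec = 0;

        /* For brevity, EINTR from select() is treated as a failure here. */
        rc = select(fd + 1, NULL, &wfds, NULL, &tv);
        if (rc == 0) {
            errno = ETIMEDOUT;      /* no answer within timeout_sec */
            goto fail;
        }
        if (rc < 0)
            goto fail;

        /* Fetch the real outcome of the asynchronous connect(). */
        if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
            goto fail;
        if (err != 0) {
            errno = err;
            goto fail;
        }
    }

    /* Restore the original (blocking) flags before handing the socket back. */
    if (fcntl(fd, F_SETFL, flags) < 0)
        goto fail;
    return fd;

fail:
    {
        int saved = errno;          /* close() may overwrite errno */
        close(fd);
        errno = saved;
    }
    return -1;
}
```

A health check could call, for example,
connect_with_timeout("133.137.177.124", 5432, 10) and treat a return value of
-1 as a failed check. The 10 second value is only a placeholder for whatever
health_check_timeout is set to.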

