[pgpool-general: 8765] Re: Clients disconnection when slave node is off

Tatsuo Ishii ishii at sraoss.co.jp
Mon May 15 11:14:22 JST 2023


> Ok, thank you for your great work.
> 
> In this case the failover is due to the slave node.
> Given that I have no load balancing, I think clients should be able to
> connect to pgpool during this failover, because the master node is
> unchanged and still alive.

There are several code paths that access the standby node, any of
which can trigger session disconnection:

1. When a client connects to pgpool, pgpool tries to connect to the
   standby PostgreSQL.

2. When a client connects to pgpool, pgpool sends a SET
   application_name command.

3. Pgpool detects the PostgreSQL shutdown event even if it comes from
   the standby, which results in session disconnection.

I am not sure I can eliminate all of the code paths above, but I will
try for the upcoming Pgpool 4.5.
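
Until then, one client-side stopgap is to retry the connection attempt
for a short while. The following is only a minimal sketch in Python and
is not part of pgpool. The name connect_with_retry, its parameters, and
the retry budget (20 tries, 0.1 s apart, loosely based on the roughly
1 second outage reported in this thread) are all assumptions; connect
stands for whatever zero-argument callable opens a session with your
driver.

```python
import time


def connect_with_retry(connect, attempts=20, delay=0.1, retry_on=(Exception,)):
    """Call connect() until it succeeds or the attempts run out.

    connect  -- hypothetical zero-argument callable that opens a session
    attempts -- maximum number of tries (assumed value, tune as needed)
    delay    -- seconds to sleep after each failed try
    retry_on -- exception types treated as transient
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return connect()
        except retry_on as exc:
            # Remember the failure and wait before the next try.
            last_error = exc
            time.sleep(delay)
    # Every try failed: surface the last error to the caller.
    raise last_error
```

A caller would wrap its own driver call, for example
connect_with_retry(lambda: my_driver_connect(...)), where
my_driver_connect is a placeholder. This only hides short outages from
the application; it does not change how pgpool behaves during failover.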

> Is there a plan to solve this limitation in future releases?
> 
> Thanks in advance.
> 
> Best,
> Jesús
> 
> On Sun, 14 May 2023 4:06, Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
> 
>> Hi Jesús,
>>
>> > Hi Tatsuo. Thank you very much.
>> > Should we add it as a bug in the Pgpool-II Bug Tracker?
>>
>> No, you don't need to, as it's a known limitation that pgpool does not
>> accept new connections during failover.
>>
>> > On Sat, 13 May 2023 9:24, Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
>> >
>> >> Ok. The errors were generated while clients tried to connect to
>> >> pgpool.  My patch covers the case when failover happens while
>> >> connections from clients to pgpool are *kept*.  However, the patch
>> >> does not cover the case when clients try to establish a connection
>> >> to pgpool during failover.
>> >>
>> >> I tested my patch using pgbench. If pgbench is given "-C" (create a
>> >> connection for each transaction), I get the same errors you mentioned.
>> >>
>> >> I have to admit my patch does not cover all the cases. I need more
>> >> time to deal with these problems.
>> >>
>> >> > Hi,
>> >> >
>> >> > I think we cannot connect to pgpool. I will show you the output of my
>> >> > dbCheck script.
>> >> >
>> >> >
>> >> >    - *Pgpool without patch and backend1 as slave:*
>> >> >
>> >> > [root at pg_client1 services]# ./dbcheck.sh $VIP_PGPOOL
>> >> >
>> >> > psql: ERROR:  do command failed
>> >> >
>> >> > DETAIL:  backend error: "SFATAL"
>> >> >
>> >> > psql: ERROR:  unable to read data from DB node 1
>> >> >
>> >> > DETAIL:  socket read failed with error "Connection reset by peer"
>> >> >
>> >> > psql: server closed the connection unexpectedly
>> >> >
>> >> >         This probably means the server terminated abnormally
>> >> >
>> >> >         before or while processing the request.
>> >> >
>> >> > psql: server closed the connection unexpectedly
>> >> >
>> >> >         This probably means the server terminated abnormally
>> >> >
>> >> >         before or while processing the request.
>> >> >
>> >> >         This probably means the server terminated abnormally
>> >> >
>> >> >         before or while processing the request.
>> >> >
>> >> >
>> >> >
>> >> >    - *Pgpool with patch and backend1 as slave:*
>> >> >
>> >> > psql: ERROR:  unable to read message kind
>> >> >
>> >> > DETAIL:  kind does not match between main(52)
>> >> >
>> >> >
>> >> >
>> >> >    - *Pgpool with patch and backend1 as master:*
>> >> >
>> >> > psql: ERROR:  unable to read data from DB node
>> >> >
>> >> > DETAIL:  socket read failed with error "Connection reset by peer"
>> >> >
>> >> > server closed the connection unexpectedly
>> >> >
>> >> >         This probably means the server terminated abnormally
>> >> >
>> >> >         before or while processing the request.
>> >> >
>> >> > connection to server was lost
>> >> >
>> >> > server closed the connection unexpectedly
>> >> >
>> >> >         This probably means the server terminated abnormally
>> >> >
>> >> >         before or while processing the request.
>> >> >
>> >> > connection to server was lost
>> >> >
>> >> >
>> >> > Anyway, with a client which uses ODBC, if it tries to access the
>> >> > database during failover (from slave node) the following error is
>> >> > displayed: "Driver Unable to Establish Connection with Data Source".
>> >> >
>> >> > On Fri, 12 May 2023 at 9:40, Tatsuo Ishii (<ishii at sraoss.co.jp>)
>> >> > wrote:
>> >> >
>> >> >> What do you mean by "database is not available"?
>> >> >>
>> >> >> 1. You can connect to pgpool but pgpool does not reply back.
>> >> >>
>> >> >> 2. You can connect to pgpool but pgpool immediately disconnects.
>> >> >>
>> >> >> > Hi Tatsuo,
>> >> >> >
>> >> >> > I'm working with your patch but I am still facing a problem: the
>> >> >> > database is unavailable for approximately 1 second (I have a script
>> >> >> > that runs a SELECT query every 0.1 seconds to measure how long the
>> >> >> > database is unavailable).
>> >> >> >
>> >> >> > I will explain two different cases:
>> >> >> >
>> >> >> > 1. Slave node (backend1 in pgpool.conf) is turned off. With your
>> >> >> > patch the database is always available. Without your patch the
>> >> >> > database is unavailable for 1 second.
>> >> >> > 2. Master node (backend0) is turned off. Failover is done to promote
>> >> >> > backend1. After that, I turn backend0 on again, and it is now the
>> >> >> > slave node. If I turn off this slave node (backend0), the database
>> >> >> > is unavailable for 1 second (with or without your patch).
>> >> >> >
>> >> >> > Do you have any idea what causes this behaviour?
>> >> >> >
>> >> >> > Thanks in advance.
>> >> >> >
>> >> >> > Best,
>> >> >> > Jesús
>> >> >> >
>> >> >> > On Fri, 14 Apr 2023 3:41, Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
>> >> >> >
>> >> >> >> Hi Jesús,
>> >> >> >>
>> >> >> >> > Hi Tatsuo,
>> >> >> >> >
>> >> >> >> > First of all, thank you so much for taking the time to
>> >> >> >> > investigate this issue.
>> >> >> >>
>> >> >> >> No problem.
>> >> >> >>
>> >> >> >> > I have compiled pgpool 4.3.2 with your patch and the problem with
>> >> >> >> > pgbench is solved.
>> >> >> >> > I still need to test it in my environment.
>> >> >> >> >
>> >> >> >> > Anyway, I had a look at your code and saw that the session is
>> >> >> >> > closed only if failover is not completed in 30 seconds.
>> >> >> >> > I have the following question about this change. Is this session
>> >> >> >> > operative during the failover? I mean, if failover takes 20
>> >> >> >> > seconds, is the session blocked during this time, or can it
>> >> >> >> > accept transactions?
>> >> >> >>
>> >> >> >> It is likely the session is blocked. I say "likely" because the
>> >> >> >> function that contains this logic is called frequently during a
>> >> >> >> session, but not always. It is possible that a pgpool process has
>> >> >> >> already called the function by the time failover starts, and then
>> >> >> >> proceeds to send a query to the backend.
>> >> >> >>
>> >> >> >> > Let me ask another question. Should we add this issue as a bug?
>> >> >> >>
>> >> >> >> No, you don't need to. Developers already recognize this as a bug
>> >> >> >> report.
>> >> >> >>
>> >> >> >> > Thanks in advance.
>> >> >> >> >
>> >> >> >> > Best,
>> >> >> >> > Jesús
>> >> >> >> >
>> >> >> >> >
>> >> >> >> > On Wed, 12 Apr 2023 3:33, Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
>> >> >> >> >
>> >> >> >> >> > However, a downside of this is that during failover clients
>> >> >> >> >> > cannot process queries, or at least process them more slowly.
>> >> >> >> >> > Below is the log from pgbench using the "-P 1" option to show
>> >> >> >> >> > progress. As you can see, pgbench starts to slow down at 170 s
>> >> >> >> >> > and recovers at 194 s. That is, the slowdown continued for
>> >> >> >> >> > 24 seconds.
>> >> >> >> >> >
>> >> >> >> >>
>> >> >> >> >> After more research, I suspect the slowdown is due to the
>> >> >> >> >> effect of checkpointing. If I add the "-S" option (select-only
>> >> >> >> >> transactions), I don't see the slowdown anymore.
>> >> >> >> >>
>> >> >> >> >> Best regards,
>> >> >> >> >> --
>> >> >> >> >> Tatsuo Ishii
>> >> >> >> >> SRA OSS LLC
>> >> >> >> >> English: http://www.sraoss.co.jp/index_en/
>> >> >> >> >> Japanese:http://www.sraoss.co.jp

