[pgpool-general: 8733] Re: Clients disconnection when slave node is off

Jesús Campoy jesuscampoy at gmail.com
Wed Apr 12 01:36:14 JST 2023


Thank you for your explanation Tatsuo.

I also reproduced the issue with a script that executes "select 1" via psql
10 times a second (please find attached it).
Besides, when I have pgadmin4 tool connected to the VIP, I can observe how
pgadmin4 is disconnected.

I'm working in a workaround that consists in set DISALLOW_FAILOVER in
backend1 to change it to ALLOW_FAILOVER in the failover script (via sed
command) and reload the pgpool configuration during failover. I don't like
it but I need to avoid client disconnection.

Do you think we should report it as a pgpool bug to work on it in future
releases?

Thanks in advance.

Best,
Jesús

El mar, 11 abr 2023 8:49, Tatsuo Ishii <ishii at sraoss.co.jp> escribió:

> Ok, I think I have found the cause of the issue. Unfortunately I am
> not able to find any solution for this at the moment. Let me explain.
>
> First of all, I was able to reproduce the issue by using pgbench and
> pgpool_setup (a test tool coming with pgpool to create pgbench +
> PostgreSQL cluster).
>
> $ pgpool_setup
> $ echo "load_balance_mode = off" >> etc/pgpool.conf
> $ echo "failover_on_backend_error = off" >> etc/pgpool.conf
> $ ./startall
> $ pgbench -p 11000 -n -c 1 -T 300 test
>
> At this point I hoped a standby node shutdown does not affect pgbench,
> because load balance mode is off, pgpool should only access to primary
> node.
>
> So I entered:
> $ pg_ctl -D data1 stop
>
> Then pgbench aborted shortly:
>
> pgbench: error: client 0 aborted in command 5 (SQL) of script 0; perhaps
> the backend died while processing
> [snip]
>
> After debugging pgpool using gdb, I got stack trace.
>
> #6  0x000056538ef05fc4 in child_exit (code=code at entry=1) at
> protocol/child.c:1336
> #7  0x000056538ef2ba6a in pool_virtual_main_db_node_id () at
> context/pool_query_context.c:341
> #8  0x000056538ef0e42b in read_packets_and_process (frontend=frontend at entry=0x5653900cb4c8,
> backend=backend at entry=0x7f732f21fcd8, reset_request=reset_request at entry=0,
>
>     state=state at entry=0x7ffcef518b6c, num_fields=num_fields at entry=0x7ffcef518b6a,
> cont=cont at entry=0x7ffcef518b74 "\001V") at
> protocol/pool_process_query.c:5098
> #9  0x000056538ef0f1e3 in pool_process_query (frontend=0x5653900cb4c8,
> backend=0x7f732f21fcd8, reset_request=reset_request at entry=0)
>     at protocol/pool_process_query.c:298
> #10 0x000056538ef0734a in do_child (fds=fds at entry=0x565390107ba0) at
> protocol/child.c:455
> #11 0x000056538eedb906 in fork_a_child (fds=0x565390107ba0, id=26) at
> main/pgpool_main.c:842
> #12 0x000056538eee3e2a in PgpoolMain (discard_status=<optimized out>,
> clear_memcache_oidmaps=<optimized out>) at main/pgpool_main.c:541
> #13 0x000056538eed9826 in main (argc=<optimized out>, argv=<optimized
> out>) at main/main.c:365
>
> As the child process exits and the socket connection closed, and
> pgbench complains it. Why pgpool exited? Pgpool maintains information
> including, which node is primary, which node is up or
> down. pool_virtual_main_db_node_id() returns such that
> information. Unfortunately the information is changing while
> failover. If pgpool process uses unstable information, it would cause
> bad results including segfaults (that actually happend in the
> past. See mantis bug track no. 481 and 482). Thus if
> pool_virtual_main_db_node_id() finds that failover is ongoing, it
> exits the process and session is disconnected. One solution for this
> is, locking the information by pgpool process. However, this is not
> only hard to implement, but could result in dead lock. So I hesitate
> to employ it.
>
> BTW, the reson why the issue happens in pgbench, and does not happen
> in psql is, I think pgbench produces extremely busy transactions which
> increases the chance to reproduce the problem.
>
> Best reagards,
> --
> Tatsuo Ishii
> SRA OSS LLC
> English: http://www.sraoss.co.jp/index_en/
> Japanese:http://www.sraoss.co.jp
>
> > Hi,
> >
> > Thanks. I will looking into this.
> >
> > Best reagards,
> > --
> > Tatsuo Ishii
> > SRA OSS LLC
> > English: http://www.sraoss.co.jp/index_en/
> > Japanese:http://www.sraoss.co.jp
> >
> >> Hi,
> >>
> >> Please find attached logs.
> >>
> >> Thank you.
> >>
> >> Best,
> >> Jesús
> >>
> >> El lun, 3 abr 2023 11:06, Tatsuo Ishii <ishii at sraoss.co.jp> escribió:
> >>
> >>> Hi,
> >>>
> >>> Can you share pgpool log?
> >>>
> >>> > Hi Tatsuo,
> >>> >
> >>> > Should I test anything else to try to solve my problem?
> >>> >
> >>> > Thank you.
> >>> >
> >>> > Best,
> >>> > Jesús
> >>> >
> >>> > El vie, 31 mar 2023 12:10, Jesús Campoy <jesuscampoy at gmail.com>
> >>> escribió:
> >>> >
> >>> >> Hi,
> >>> >>
> >>> >> I just performed the same tests with auto_failback to off but the
> >>> >> behaviour is the same (clients are disconnected).
> >>> >>
> >>> >> Thank you Tatsuo.
> >>> >>
> >>> >> Best,
> >>> >> Jesús
> >>> >>
> >>> >> El vie, 31 mar 2023 2:31, Tatsuo Ishii <ishii at sraoss.co.jp>
> escribió:
> >>> >>
> >>> >>> Hi Jesús,
> >>> >>>
> >>> >>> Can you try again after setting "auto_failback = off"? I suspect
> >>> >>> auto_failback confuses pgpool.
> >>> >>>
> >>> >>> Best reagards,
> >>> >>> --
> >>> >>> Tatsuo Ishii
> >>> >>> SRA OSS LLC
> >>> >>> English: http://www.sraoss.co.jp/index_en/
> >>> >>> Japanese:http://www.sraoss.co.jp
> >>> >>>
> >>> >>> > Hi Tatsuo,
> >>> >>> >
> >>> >>> > When I connect to pgpool with psql this session is not
> disconnected.
> >>> >>> > However, I've performed a test with pgbench inserting data with
> 30
> >>> >>> clients
> >>> >>> > in the database and when I shutdown server2 some clients of
> pgbench
> >>> are
> >>> >>> > disconnected.
> >>> >>> > Please find attached a zip file with pgpool logs, pgbench log and
> >>> >>> > configurations of pgpool and postgres.
> >>> >>> >
> >>> >>> > Thank you for your assistance in this matter.
> >>> >>> >
> >>> >>> > Best,
> >>> >>> > Jesús
> >>> >>> >
> >>> >>> > El mié, 22 mar 2023 a las 2:05, Tatsuo Ishii (<
> ishii at sraoss.co.jp>)
> >>> >>> > escribió:
> >>> >>> >
> >>> >>> >> > Please find attached my pgpool config and a log file when the
> >>> standby
> >>> >>> >> > (server2) is powered off.
> >>> >>> >> > Thank you for your help.
> >>> >>> >>
> >>> >>> >> I assume server2 = host B.
> >>> >>> >>
> >>> >>> >> I have looked into the log file but failed to find log lines
> related
> >>> >>> >> to user sessions which were diconnected. I was looking for such
> log
> >>> >>> >> lines because you said:
> >>> >>> >>
> >>> >>> >> > Sometimes I have to power off the host B and then, the clients
> >>> >>> connected
> >>> >>> >>
> >>> >>> >> If such an event occurs, there should be such log lines.
> >>> >>> >> Many log lines like:
> >>> >>> >>
> >>> >>> >> 2023-03-20 10:59:27.725: [unknown] pid 31499: LOG:  failover or
> >>> >>> failback
> >>> >>> >> event detected
> >>> >>> >> 2023-03-20 10:59:27.725: [unknown] pid 31499: DETAIL:
> restarting
> >>> >>> myself
> >>> >>> >> 2023-03-20 10:59:27.726: main pid 30237: LOG:  child process
> with
> >>> pid:
> >>> >>> >> 31499 exits with status 256
> >>> >>> >> 2023-03-20 10:59:27.727: main pid 30237: LOG:  fork a new child
> >>> process
> >>> >>> >> with pid: 29017
> >>> >>> >>
> >>> >>> >> just show that process 31499 is not related to any client
> session
> >>> >>> >> ([unknown] indicates this) and even if the process exited, any
> >>> client
> >>> >>> >> will not be affected.
> >>> >>> >>
> >>> >>> >> Can you connect to pgpool using psql and shutdown server2 so
> that
> >>> log
> >>> >>> >> lines I am expecting are recorded?
> >>> >>> >>
> >>> >>> >> Best reagards,
> >>> >>> >> --
> >>> >>> >> Tatsuo Ishii
> >>> >>> >> SRA OSS LLC
> >>> >>> >> English: http://www.sraoss.co.jp/index_en/
> >>> >>> >> Japanese:http://www.sraoss.co.jp
> >>> >>> >>
> >>> >>> >>
> >>> >>> >> > Best,
> >>> >>> >> > Jesús
> >>> >>> >> >
> >>> >>> >> > El vie, 17 mar 2023 a las 0:13, Tatsuo Ishii (<
> ishii at sraoss.co.jp
> >>> >)
> >>> >>> >> > escribió:
> >>> >>> >> >
> >>> >>> >> >> > Ok, I will send you the log ASAP.
> >>> >>> >> >> > I forget to indicate that we are running two instances of
> >>> pgpool
> >>> >>> using
> >>> >>> >> >> > watchdog and VIP.
> >>> >>> >> >> >
> >>> >>> >> >> > I mean, in host A is running pgpool (active) and primary
> >>> >>> database. In
> >>> >>> >> >> host
> >>> >>> >> >> > B is running the other instance of pgpool and the standby
> >>> >>> database.
> >>> >>> >> >> > Sometimes I have to power off the host B and then, the
> clients
> >>> >>> >> connected
> >>> >>> >> >> to
> >>> >>> >> >> > pgpool in VIP are disconnected.
> >>> >>> >> >> >
> >>> >>> >> >> > I have the same pgpool.conf for both pgpool instances. Do
> you
> >>> >>> need It?
> >>> >>> >> >>
> >>> >>> >> >> No, one pgpool.conf is enough.
> >>> >>> >> >>
> >>> >>> >> >> > Thanks for your help!
> >>> >>> >> >>
> >>> >>> >> >> You are welcome.
> >>> >>> >> >> --
> >>> >>> >> >> Tatsuo Ishii
> >>> >>> >> >> SRA OSS LLC
> >>> >>> >> >> English: http://www.sraoss.co.jp/index_en/
> >>> >>> >> >> Japanese:http://www.sraoss.co.jp
> >>> >>> >> >>
> >>> >>> >>
> >>> >>>
> >>> >>
> >>>
> > _______________________________________________
> > pgpool-general mailing list
> > pgpool-general at pgpool.net
> > http://www.pgpool.net/mailman/listinfo/pgpool-general
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.pgpool.net/pipermail/pgpool-general/attachments/20230411/5a2d7e41/attachment-0001.htm>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: dbCheck.sh
Type: application/x-sh
Size: 1299 bytes
Desc: not available
URL: <http://www.pgpool.net/pipermail/pgpool-general/attachments/20230411/5a2d7e41/attachment-0001.sh>


More information about the pgpool-general mailing list