[pgpool-general: 7040] Closing connection in OmniDB causes degenerate backend request on pgPool

Láznička Vladimír Vladimir.Laznicka at cca.cz
Tue May 26 22:28:45 JST 2020


Hello,

With the last version of pgPool (4.1.2) we have noticed a strange behavior while using the OmniDB client application. Firstly - we are running pgPool in a native replication mode in this configuration:

PgPool servers (coordinated by the Watchdog process):

AISGPGP01 - primary pgpool server with a virtual IP for client connections.
AISGPGP02 - standby pgpool server.

PostgreSQL servers (DML + DDL commands are replicated via pgPool to both, SELECTs are load balanced on one of them):

AISGDBD01 - DB node 0.
AISGDBD02 - DB node 1.

Our problem is following - when closing a window with editable SQL code in OmniDB application (effectively closing the session under which the window is connecting to the database), it causes a failover of the DB node as if the database on one of that node went down. The result is that one DB node is down, while other is still up. If it happens a second time (someone closes the connection from OmniDB), pgPool degenerates the last node and both databases are effectively down (PostgreSQL processes on AIGSDBD servers run fine) until they are manually brought up by PCP commands. The log from pgPool during the event is following:

May 26 12:07:59 AISGPGP01 pgpool[29612]: [2067-1] 2020-05-26 12:07:59 ::: APP - [No Connection] :A: DB - [No Connection] :D: USER - [No Connection] :U: |LOG:  new connection received
May 26 12:07:59 AISGPGP01 pgpool[29612]: [2067-2] 2020-05-26 12:07:59 ::: APP - [No Connection] :A: DB - [No Connection] :D: USER - [No Connection] :U: |DETAIL:  connecting host=nb197.cca.cz port=58065
May 26 12:07:59 AISGPGP01 pgpool[29612]: [2068-1] 2020-05-26 12:07:59 ::: APP - OmniDB :A: DB - aisgdvyv :D: USER - aisg :U: |LOG:  found the pg_terminate_backend request for backend pid:32254 on backend node:0
May 26 12:07:59 AISGPGP01 pgpool[29612]: [2068-2] 2020-05-26 12:07:59 ::: APP - OmniDB :A: DB - aisgdvyv :D: USER - aisg :U: |DETAIL:  setting the connection flag
May 26 12:07:59 AISGPGP01 pgpool[29820]: [2340-1] 2020-05-26 12:07:59 ::: APP - OmniDB :A: DB - aisgdvyv :D: USER - aisg :U: |LOG:  reading and processing packets
May 26 12:07:59 AISGPGP01 pgpool[29820]: [2340-2] 2020-05-26 12:07:59 ::: APP - OmniDB :A: DB - aisgdvyv :D: USER - aisg :U: |DETAIL:  postmaster on DB node 0 was shutdown by administrative command
May 26 12:07:59 AISGPGP01 pgpool[29820]: [2341-1] 2020-05-26 12:07:59 ::: APP - OmniDB :A: DB - aisgdvyv :D: USER - aisg :U: |LOG:  received degenerate backend request for node_id: 0 from pid [29820]
May 26 12:07:59 AISGPGP01 pgpool[22480]: [131-1] 2020-05-26 12:07:59 ::: APP - [No Connection] :A: DB - [No Connection] :D: USER - [No Connection] :U: |LOG:  new IPC connection received
May 26 12:07:59 AISGPGP01 pgpool[22480]: [132-1] 2020-05-26 12:07:59 ::: APP - [No Connection] :A: DB - [No Connection] :D: USER - [No Connection] :U: |LOG:  watchdog received the failover command from local pgpool-II on IPC interface
May 26 12:07:59 AISGPGP01 pgpool[22480]: [133-1] 2020-05-26 12:07:59 ::: APP - [No Connection] :A: DB - [No Connection] :D: USER - [No Connection] :U: |LOG:  watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from local pgpool-II on IPC interface
May 26 12:07:59 AISGPGP01 pgpool[22480]: [134-1] 2020-05-26 12:07:59 ::: APP - [No Connection] :A: DB - [No Connection] :D: USER - [No Connection] :U: |LOG:  we do not need quorum to hold to proceed with failover
May 26 12:07:59 AISGPGP01 pgpool[22480]: [134-2] 2020-05-26 12:07:59 ::: APP - [No Connection] :A: DB - [No Connection] :D: USER - [No Connection] :U: |DETAIL:  proceeding with the failover
May 26 12:07:59 AISGPGP01 pgpool[22480]: [134-3] 2020-05-26 12:07:59 ::: APP - [No Connection] :A: DB - [No Connection] :D: USER - [No Connection] :U: |HINT:  failover_when_quorum_exists is set to false
May 26 12:07:59 AISGPGP01 pgpool[22478]: [2354-1] 2020-05-26 12:07:59 ::: APP - [No Connection] :A: DB - [No Connection] :D: USER - [No Connection] :U: |LOG:  Pgpool-II parent process has received failover request
May 26 12:07:59 AISGPGP01 pgpool[22480]: [135-1] 2020-05-26 12:07:59 ::: APP - [No Connection] :A: DB - [No Connection] :D: USER - [No Connection] :U: |LOG:  new IPC connection received
May 26 12:07:59 AISGPGP01 pgpool[22480]: [136-1] 2020-05-26 12:07:59 ::: APP - [No Connection] :A: DB - [No Connection] :D: USER - [No Connection] :U: |LOG:  received the failover indication from Pgpool-II on IPC interface
May 26 12:07:59 AISGPGP01 pgpool[22480]: [137-1] 2020-05-26 12:07:59 ::: APP - [No Connection] :A: DB - [No Connection] :D: USER - [No Connection] :U: |LOG:  watchdog is informed of failover start by the main process
May 26 12:07:59 AISGPGP01 pgpool[22478]: [2355-1] 2020-05-26 12:07:59 ::: APP - [No Connection] :A: DB - [No Connection] :D: USER - [No Connection] :U: |LOG:  starting degeneration. shutdown host aisgdbd01.cca.cz(5432)
May 26 12:07:59 AISGPGP01 pgpool[22478]: [2356-1] 2020-05-26 12:07:59 ::: APP - [No Connection] :A: DB - [No Connection] :D: USER - [No Connection] :U: |WARNING:  All the DB nodes are in down status and skip writing status file.
May 26 12:07:59 AISGPGP01 pgpool[22478]: [2357-1] 2020-05-26 12:07:59 ::: APP - [No Connection] :A: DB - [No Connection] :D: USER - [No Connection] :U: |LOG:  failover: no valid backend node found
May 26 12:07:59 AISGPGP01 pgpool[22478]: [2358-1] 2020-05-26 12:07:59 ::: APP - [No Connection] :A: DB - [No Connection] :D: USER - [No Connection] :U: |LOG:  Restart all children
May 26 12:07:59 AISGPGP01 pgpool[22478]: [2359-1] 2020-05-26 12:07:59 ::: APP - [No Connection] :A: DB - [No Connection] :D: USER - [No Connection] :U: |LOG:  failover: set new primary node: -1
May 26 12:07:59 AISGPGP01 pgpool[22480]: [138-1] 2020-05-26 12:07:59 ::: APP - [No Connection] :A: DB - [No Connection] :D: USER - [No Connection] :U: |LOG:  new IPC connection received
May 26 12:07:59 AISGPGP01 pgpool[22480]: [139-1] 2020-05-26 12:07:59 ::: APP - [No Connection] :A: DB - [No Connection] :D: USER - [No Connection] :U: |LOG:  received the failover indication from Pgpool-II on IPC interface
May 26 12:07:59 AISGPGP01 pgpool[29662]: [2327-1] 2020-05-26 12:07:59 ::: APP - [No Connection] :A: DB - [No Connection] :D: USER - [No Connection] :U: |LOG:  worker process received restart request
May 26 12:07:59 AISGPGP01 pgpool[22480]: [140-1] 2020-05-26 12:07:59 ::: APP - [No Connection] :A: DB - [No Connection] :D: USER - [No Connection] :U: |LOG:  watchdog is informed of failover end by the main process
May 26 12:07:59 AISGPGP01 pgpool[22478]: [2360-1] 2020-05-26 12:07:59 ::: APP - [No Connection] :A: DB - [No Connection] :D: USER - [No Connection] :U: |LOG:  failover done. shutdown host aisgdbd01.cca.cz(5432)

The log on the DB node 0 (that is being degenerated by pgPool) during that time is following:

2020-05-26 12:07:59.265 CEST ::: APP - OmniDB :A: DB - aisgdvyv :D: USER - aisg :U: HOST - 172.20.15.64 :H: CMD - SELECT :C: |LOG:  AUDIT: SESSION,1,1,READ,SELECT,,,select pg_terminate_backend(32254),<not logged>
2020-05-26 12:07:59.265 CEST ::: APP - OmniDB :A: DB - aisgdvyv :D: USER - aisg :U: HOST - 172.20.15.64 :H: CMD - idle :C: |FATAL:  terminating connection due to administrator command
2020-05-26 12:07:59.355 CEST ::: APP - OmniDB :A: DB - aisgdvyv :D: USER - aisg :U: HOST - 172.20.15.64 :H: CMD - idle in transaction :C: |LOG:  unexpected EOF on client connection with an open transaction

We have not noticed this behavior while running previous versions of the pgPool (4.1.1 and older) and even tested the same scenario on an environment with 4.1.1 without the nodes being degenerated, so it seems to be an issue of the new version, but I obviously cannot confirm that. I have also tried to change the failover_on_backend_error parameter, but it has no effect in this particular situation.

I am sending our pgPool configuration file as an attachment. Please, let me know, if there's a way to find out, what is actually happening between pgPool and OmniDB and how to prevent this behavior, or if you need to some additional information to solve this issue. Thank you for your time.

With regards,
Vladimír Láznička, CCA Group a. s.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.sraoss.jp/pipermail/pgpool-general/attachments/20200526/ce94ef6e/attachment-0001.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pgpool.conf
Type: application/octet-stream
Size: 43471 bytes
Desc: pgpool.conf
URL: <http://www.sraoss.jp/pipermail/pgpool-general/attachments/20200526/ce94ef6e/attachment-0001.obj>


More information about the pgpool-general mailing list