[pgpool-general: 7044] Re: Closing connection in OmniDB causes degenerate backend request on pgPool

Takuma Hoshiai hoshiai at sraoss.co.jp
Wed May 27 17:32:26 JST 2020


Hello,

On Tue, 26 May 2020 13:28:45 +0000
Láznička Vladimír <Vladimir.Laznicka at cca.cz> wrote:

> Hello,
> 
> With the last version of pgPool (4.1.2) we have noticed a strange behavior while using the OmniDB client application. Firstly - we are running pgPool in a native replication mode in this configuration:
> 
> PgPool servers (coordinated by the Watchdog process):
> 
> AISGPGP01 - primary pgpool server with a virtual IP for client connections.
> AISGPGP02 - standby pgpool server.
> 
> PostgreSQL servers (DML + DDL commands are replicated via pgPool to both, SELECTs are load balanced on one of them):
> 
> AISGDBD01 - DB node 0.
> AISGDBD02 - DB node 1.
> 
> Our problem is following - when closing a window with editable SQL code in OmniDB application (effectively closing the session under which the window is connecting to the database), it causes a failover of the DB node as if the database on one of that node went down. The result is that one DB node is down, while other is still up. If it happens a second time (someone closes the connection from OmniDB), pgPool degenerates the last node and both databases are effectively down (PostgreSQL processes on AIGSDBD servers run fine) until they are manually brought up by PCP commands. The log from pgPool during the event is following:
> 
> May 26 12:07:59 AISGPGP01 pgpool[29612]: [2067-1] 2020-05-26 12:07:59 ::: APP - [No Connection] :A: DB - [No Connection] :D: USER - [No Connection] :U: |LOG:  new connection received
> May 26 12:07:59 AISGPGP01 pgpool[29612]: [2067-2] 2020-05-26 12:07:59 ::: APP - [No Connection] :A: DB - [No Connection] :D: USER - [No Connection] :U: |DETAIL:  connecting host=nb197.cca.cz port=58065
> May 26 12:07:59 AISGPGP01 pgpool[29612]: [2068-1] 2020-05-26 12:07:59 ::: APP - OmniDB :A: DB - aisgdvyv :D: USER - aisg :U: |LOG:  found the pg_terminate_backend request for backend pid:32254 on backend node:0
> May 26 12:07:59 AISGPGP01 pgpool[29612]: [2068-2] 2020-05-26 12:07:59 ::: APP - OmniDB :A: DB - aisgdvyv :D: USER - aisg :U: |DETAIL:  setting the connection flag
> May 26 12:07:59 AISGPGP01 pgpool[29820]: [2340-1] 2020-05-26 12:07:59 ::: APP - OmniDB :A: DB - aisgdvyv :D: USER - aisg :U: |LOG:  reading and processing packets
> May 26 12:07:59 AISGPGP01 pgpool[29820]: [2340-2] 2020-05-26 12:07:59 ::: APP - OmniDB :A: DB - aisgdvyv :D: USER - aisg :U: |DETAIL:  postmaster on DB node 0 was shutdown by administrative command
> May 26 12:07:59 AISGPGP01 pgpool[29820]: [2341-1] 2020-05-26 12:07:59 ::: APP - OmniDB :A: DB - aisgdvyv :D: USER - aisg :U: |LOG:  received degenerate backend request for node_id: 0 from pid [29820]
> May 26 12:07:59 AISGPGP01 pgpool[22480]: [131-1] 2020-05-26 12:07:59 ::: APP - [No Connection] :A: DB - [No Connection] :D: USER - [No Connection] :U: |LOG:  new IPC connection received
> May 26 12:07:59 AISGPGP01 pgpool[22480]: [132-1] 2020-05-26 12:07:59 ::: APP - [No Connection] :A: DB - [No Connection] :D: USER - [No Connection] :U: |LOG:  watchdog received the failover command from local pgpool-II on IPC interface
> May 26 12:07:59 AISGPGP01 pgpool[22480]: [133-1] 2020-05-26 12:07:59 ::: APP - [No Connection] :A: DB - [No Connection] :D: USER - [No Connection] :U: |LOG:  watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from local pgpool-II on IPC interface
> May 26 12:07:59 AISGPGP01 pgpool[22480]: [134-1] 2020-05-26 12:07:59 ::: APP - [No Connection] :A: DB - [No Connection] :D: USER - [No Connection] :U: |LOG:  we do not need quorum to hold to proceed with failover
> May 26 12:07:59 AISGPGP01 pgpool[22480]: [134-2] 2020-05-26 12:07:59 ::: APP - [No Connection] :A: DB - [No Connection] :D: USER - [No Connection] :U: |DETAIL:  proceeding with the failover
> May 26 12:07:59 AISGPGP01 pgpool[22480]: [134-3] 2020-05-26 12:07:59 ::: APP - [No Connection] :A: DB - [No Connection] :D: USER - [No Connection] :U: |HINT:  failover_when_quorum_exists is set to false
> May 26 12:07:59 AISGPGP01 pgpool[22478]: [2354-1] 2020-05-26 12:07:59 ::: APP - [No Connection] :A: DB - [No Connection] :D: USER - [No Connection] :U: |LOG:  Pgpool-II parent process has received failover request
> May 26 12:07:59 AISGPGP01 pgpool[22480]: [135-1] 2020-05-26 12:07:59 ::: APP - [No Connection] :A: DB - [No Connection] :D: USER - [No Connection] :U: |LOG:  new IPC connection received
> May 26 12:07:59 AISGPGP01 pgpool[22480]: [136-1] 2020-05-26 12:07:59 ::: APP - [No Connection] :A: DB - [No Connection] :D: USER - [No Connection] :U: |LOG:  received the failover indication from Pgpool-II on IPC interface
> May 26 12:07:59 AISGPGP01 pgpool[22480]: [137-1] 2020-05-26 12:07:59 ::: APP - [No Connection] :A: DB - [No Connection] :D: USER - [No Connection] :U: |LOG:  watchdog is informed of failover start by the main process
> May 26 12:07:59 AISGPGP01 pgpool[22478]: [2355-1] 2020-05-26 12:07:59 ::: APP - [No Connection] :A: DB - [No Connection] :D: USER - [No Connection] :U: |LOG:  starting degeneration. shutdown host aisgdbd01.cca.cz(5432)
> May 26 12:07:59 AISGPGP01 pgpool[22478]: [2356-1] 2020-05-26 12:07:59 ::: APP - [No Connection] :A: DB - [No Connection] :D: USER - [No Connection] :U: |WARNING:  All the DB nodes are in down status and skip writing status file.
> May 26 12:07:59 AISGPGP01 pgpool[22478]: [2357-1] 2020-05-26 12:07:59 ::: APP - [No Connection] :A: DB - [No Connection] :D: USER - [No Connection] :U: |LOG:  failover: no valid backend node found
> May 26 12:07:59 AISGPGP01 pgpool[22478]: [2358-1] 2020-05-26 12:07:59 ::: APP - [No Connection] :A: DB - [No Connection] :D: USER - [No Connection] :U: |LOG:  Restart all children
> May 26 12:07:59 AISGPGP01 pgpool[22478]: [2359-1] 2020-05-26 12:07:59 ::: APP - [No Connection] :A: DB - [No Connection] :D: USER - [No Connection] :U: |LOG:  failover: set new primary node: -1
> May 26 12:07:59 AISGPGP01 pgpool[22480]: [138-1] 2020-05-26 12:07:59 ::: APP - [No Connection] :A: DB - [No Connection] :D: USER - [No Connection] :U: |LOG:  new IPC connection received
> May 26 12:07:59 AISGPGP01 pgpool[22480]: [139-1] 2020-05-26 12:07:59 ::: APP - [No Connection] :A: DB - [No Connection] :D: USER - [No Connection] :U: |LOG:  received the failover indication from Pgpool-II on IPC interface
> May 26 12:07:59 AISGPGP01 pgpool[29662]: [2327-1] 2020-05-26 12:07:59 ::: APP - [No Connection] :A: DB - [No Connection] :D: USER - [No Connection] :U: |LOG:  worker process received restart request
> May 26 12:07:59 AISGPGP01 pgpool[22480]: [140-1] 2020-05-26 12:07:59 ::: APP - [No Connection] :A: DB - [No Connection] :D: USER - [No Connection] :U: |LOG:  watchdog is informed of failover end by the main process
> May 26 12:07:59 AISGPGP01 pgpool[22478]: [2360-1] 2020-05-26 12:07:59 ::: APP - [No Connection] :A: DB - [No Connection] :D: USER - [No Connection] :U: |LOG:  failover done. shutdown host aisgdbd01.cca.cz(5432)
> 
> The log on the DB node 0 (that is being degenerated by pgPool) during that time is following:
> 
> 2020-05-26 12:07:59.265 CEST ::: APP - OmniDB :A: DB - aisgdvyv :D: USER - aisg :U: HOST - 172.20.15.64 :H: CMD - SELECT :C: |LOG:  AUDIT: SESSION,1,1,READ,SELECT,,,select pg_terminate_backend(32254),<not logged>
> 2020-05-26 12:07:59.265 CEST ::: APP - OmniDB :A: DB - aisgdvyv :D: USER - aisg :U: HOST - 172.20.15.64 :H: CMD - idle :C: |FATAL:  terminating connection due to administrator command
> 2020-05-26 12:07:59.355 CEST ::: APP - OmniDB :A: DB - aisgdvyv :D: USER - aisg :U: HOST - 172.20.15.64 :H: CMD - idle in transaction :C: |LOG:  unexpected EOF on client connection with an open transaction
> 
> We have not noticed this behavior while running previous versions of the pgPool (4.1.1 and older) and even tested the same scenario on an environment with 4.1.1 without the nodes being degenerated, so it seems to be an issue of the new version, but I obviously cannot confirm that. I have also tried to change the failover_on_backend_error parameter, but it has no effect in this particular situation.

I think the cause of failover is pg_terminate_backend() command. Usually,
it is happend by executing this command directly, without using Pgpool. 
Because pgpool can not know if postgres shutdown or only terminate connection.

This behavior is dscried in documentation:
  https://www.pgpool.net/docs/latest/en/html/restrictions.html

However I don't understand why it is happend by Pgpool-II 4.1.2 and it 
isn't happend by Pgpool-II 4.1.1 and under. 
Could you Reproduce this work with enabled following paramer in pgpool ?

 log_per_node_statement = on
 log_client_messages = on

> I am sending our pgPool configuration file as an attachment. Please, let me know, if there's a way to find out, what is actually happening between pgPool and OmniDB and how to prevent this behavior, or if you need to some additional information to solve this issue. Thank you for your time.
> 
> With regards,
> Vladimír Láznička, CCA Group a. s.

Best Regarads,

Takuma Hoshiai

-- 
Takuma Hoshiai <hoshiai at sraoss.co.jp>



More information about the pgpool-general mailing list