View Issue Details

IDProjectCategoryView StatusLast Update
0000666Pgpool-IIBugpublic2020-12-04 17:03
Reporterwl_wang2020 Assigned Tot-ishii  
PriorityhighSeverityminorReproducibilityalways
Status closedResolutionopen 
PlatformlinuxOSredhatOS Version6.9
Product Version4.1.0 
Summary0000666: Restoring standby nodes in streaming replication mode causes discarding all connections (DETAIL: failover or failback Event dete
DescriptionWhen recovering the standby node using PCP_attach_node-D-D-P 9898-n 1, the PGPool causes all connections to be dropped and the front-end application to report an error with the connection database
Steps To Reproduce1. Stop the standby node Postgres database service
2. Confirm that the standby node shown on the PGPool is closed
3. Start the standby node Postgres database service
4. Run PCP_attach_node-d-D-p 9898-n 1 on pgPool
5. The front-end application connected to the database and reported an error
Additional Information2020-11-27 22:15:30: pid 29908: LOG: Pgpool-II parent process has received failover request
2020-11-27 22:15:30: pid 29908: LOG: starting fail back. reconnect host 172.31.4.62(5444)
2020-11-27 22:15:30: pid 29908: LOG: Node 0 is not down (status: 2)
2020-11-27 22:15:30: pid 120387: LOG: PCP process with pid: 60809 exit with SUCCESS.
2020-11-27 22:15:30: pid 120387: LOG: PCP process with pid: 60809 exits with status 0
2020-11-27 22:15:30: pid 29908: LOG: Do not restart children because we are failing back node id 1 host: 172.31.4.62 port: 5444 and we are in streaming replication mode and not all backends were down
2020-11-27 22:15:30: pid 29908: LOG: find_primary_node_repeatedly: waiting for finding a primary node
2020-11-27 22:15:30: pid 29908: LOG: find_primary_node: primary node is 0
2020-11-27 22:15:30: pid 29908: LOG: find_primary_node: standby node is 1
2020-11-27 22:15:30: pid 29908: LOG: failover: set new primary node: 0
2020-11-27 22:15:30: pid 29908: LOG: failover: set new master node: 0
2020-11-27 22:15:30: pid 29908: LOG: failback done. reconnect host 172.31.4.62(5444)
2020-11-27 22:15:30: pid 120388: LOG: worker process received restart request
2020-11-27 22:15:31: pid 120387: LOG: restart request received in pcp child process
2020-11-27 22:15:31: pid 29908: LOG: PCP child 120387 exits with status 0 in failover()
2020-11-27 22:15:31: pid 29908: LOG: fork a new PCP child pid 60811 in failover()
2020-11-27 22:15:31: pid 29908: LOG: child process with pid: 119783 exits with status 256
2020-11-27 22:15:31: pid 60811: LOG: PCP process: 60811 started
2020-11-27 22:15:31: pid 29908: LOG: fork a new child process with pid: 60812
2020-11-27 22:15:31: pid 29908: LOG: worker child process with pid: 120388 exits with status 256
2020-11-27 22:15:31: pid 60812: LOG: failover or failback event detected
2020-11-27 22:15:31: pid 60812: DETAIL: restarting myself
2020-11-27 22:15:31: pid 29908: LOG: fork a new worker child process with pid: 60813
2020-11-27 22:15:31: pid 29908: LOG: child process with pid: 60812 exits with status 256
2020-11-27 22:15:31: pid 29908: LOG: fork a new child process with pid: 60814
2020-11-27 22:16:01: pid 45021: LOG: failover or failback event detected
2020-11-27 22:16:01: pid 45021: DETAIL: restarting myself
2020-11-27 22:16:01: pid 43288: LOG: failover or failback event detected
2020-11-27 22:16:01: pid 43288: DETAIL: restarting myself
2020-11-27 22:16:01: pid 48801: LOG: failover or failback event detected
2020-11-27 22:16:01: pid 48801: DETAIL: restarting myself
2020-11-27 22:16:01: pid 59340: LOG: failover or failback event detected
2020-11-27 22:16:01: pid 59340: DETAIL: restarting myself
2020-11-27 22:16:01: pid 50267: LOG: failover or failback event detected
2020-11-27 22:16:01: pid 50267: DETAIL: restarting myself
2020-11-27 22:16:01: pid 29908: LOG: child process with pid: 45021 exits with status 256
2020-11-27 22:16:01: pid 29908: LOG: fork a new child process with pid: 60923
2020-11-27 22:16:01: pid 29908: LOG: child process with pid: 43288 exits with status 256
2020-11-27 22:16:01: pid 29908: LOG: fork a new child process with pid: 60924
2020-11-27 22:16:01: pid 29908: LOG: child process with pid: 48801 exits with status 256
2020-11-27 22:16:01: pid 29908: LOG: fork a new child process with pid: 60925
2020-11-27 22:16:01: pid 29908: LOG: child process with pid: 50267 exits with status 256
2020-11-27 22:16:01: pid 29908: LOG: fork a new child process with pid: 60926
2020-11-27 22:16:01: pid 29908: LOG: child process with pid: 59340 exits with status 256
2020-11-27 22:16:01: pid 29908: LOG: fork a new child process with pid: 60927
2020-11-27 22:16:01: pid 44904: LOG: md5 authentication successful with frontend
2020-11-27 22:16:01: pid 44904: LOG: selecting backend connection
2020-11-27 22:16:01: pid 44904: DETAIL: failover or failback event detected, discarding existing connections
2020-11-27 22:16:12: pid 60336: LOG: failover or failback event detected
2020-11-27 22:16:12: pid 60336: DETAIL: restarting myself
2020-11-27 22:16:12: pid 60335: LOG: failover or failback event detected
2020-11-27 22:16:12: pid 60335: DETAIL: restarting myself
2020-11-27 22:16:12: pid 60334: LOG: failover or failback event detected
2020-11-27 22:16:12: pid 60334: DETAIL: restarting myself
2020-11-27 22:16:12: pid 60333: LOG: failover or failback event detected
2020-11-27 22:16:12: pid 60333: DETAIL: restarting myself
2020-11-27 22:16:12: pid 29908: LOG: child process with pid: 60335 exits with status 256
2020-11-27 22:16:12: pid 29908: LOG: fork a new child process with pid: 60934
2020-11-27 22:16:12: pid 29908: LOG: child process with pid: 60333 exits with status 256
2020-11-27 22:16:12: pid 29908: LOG: fork a new child process with pid: 60935
2020-11-27 22:16:12: pid 29908: LOG: child process with pid: 60334 exits with status 256
2020-11-27 22:16:12: pid 29908: LOG: fork a new child process with pid: 60936
2020-11-27 22:16:12: pid 29908: LOG: child process with pid: 60336 exits with status 256
2020-11-27 22:16:12: pid 29908: LOG: fork a new child process with pid: 60937
2020-11-27 22:20:01: pid 33046: LOG: failover or failback event detected
2020-11-27 22:20:01: pid 33046: DETAIL: restarting myself
2020-11-27 22:20:01: pid 29480: LOG: failover or failback event detected
2020-11-27 22:20:01: pid 29480: DETAIL: restarting myself
2020-11-27 22:20:01: pid 49547: LOG: failover or failback event detected
2020-11-27 22:20:01: pid 49547: DETAIL: restarting myself
2020-11-27 22:20:01: pid 31702: LOG: failover or failback event detected
2020-11-27 22:20:01: pid 31702: DETAIL: restarting myself
2020-11-27 22:20:01: pid 29908: LOG: child process with pid: 29480 exits with status 256
2020-11-27 22:20:01: pid 29908: LOG: fork a new child process with pid: 61365
2020-11-27 22:20:01: pid 29908: LOG: child process with pid: 33046 exits with status 256
2020-11-27 22:20:01: pid 29908: LOG: fork a new child process with pid: 61366
2020-11-27 22:20:01: pid 29908: LOG: child process with pid: 31702 exits with status 256
2020-11-27 22:20:01: pid 29908: LOG: fork a new child process with pid: 61367
2020-11-27 22:20:01: pid 29908: LOG: child process with pid: 49547 exits with status 256
2020-11-27 22:20:01: pid 29908: LOG: fork a new child process with pid: 61368
2020-11-27 22:20:01: pid 52390: LOG: failover or failback event detected
2020-11-27 22:20:01: pid 52390: DETAIL: restarting myself
2020-11-27 22:20:01: pid 45018: LOG: md5 authentication successful with frontend
2020-11-27 22:20:01: pid 45018: LOG: selecting backend connection
2020-11-27 22:20:01: pid 45018: DETAIL: failover or failback event detected, discarding existing connections
2020-11-27 22:20:01: pid 35048: LOG: failover or failback event detected
2020-11-27 22:20:01: pid 35048: DETAIL: restarting myself
2020-11-27 22:20:01: pid 70310: LOG: failover or failback event detected
2020-11-27 22:20:01: pid 70310: DETAIL: restarting myself
2020-11-27 22:20:01: pid 50605: DETAIL: failover or failback event detected, discarding existing connections
2020-11-27 22:20:01: pid 60771: LOG: failover or failback event detected
2020-11-27 22:20:01: pid 60771: DETAIL: restarting myself
2020-11-27 22:25:02: pid 41702: LOG: failover or failback event detected
2020-11-27 22:25:02: pid 41702: DETAIL: restarting myself
2020-11-27 22:25:02: pid 44741: LOG: failover or failback event detected
2020-11-27 22:25:02: pid 44741: DETAIL: restarting myself
2020-11-27 22:25:02: pid 47692: LOG: failover or failback event detected
2020-11-27 22:25:02: pid 47692: DETAIL: restarting myself
2020-11-27 22:25:02: pid 59593: LOG: failover or failback event detected
2020-11-27 22:25:02: pid 59593: DETAIL: restarting myself
2020-11-27 22:25:02: pid 105289: LOG: failover or failback event detected
2020-11-27 22:25:02: pid 105289: DETAIL: restarting myself
2020-11-27 22:25:02: pid 29908: LOG: child process with pid: 47692 exits with status 256
2020-11-27 22:25:02: pid 29908: LOG: fork a new child process with pid: 61822
2020-11-27 22:25:02: pid 29908: LOG: child process with pid: 105289 exits with status 256
2020-11-27 22:25:02: pid 29908: LOG: fork a new child process with pid: 61823
2020-11-27 22:25:02: pid 29908: LOG: child process with pid: 41702 exits with status 256
2020-11-27 22:25:02: pid 29908: LOG: fork a new child process with pid: 61824
2020-11-27 22:25:02: pid 29908: LOG: child process with pid: 59593 exits with status 256
2020-11-27 22:25:02: pid 29908: LOG: fork a new child process with pid: 61825
2020-11-27 22:25:02: pid 29908: LOG: child process with pid: 44741 exits with status 256
2020-11-27 22:25:02: pid 29908: LOG: fork a new child process with pid: 61826
2020-11-27 22:25:02: pid 52588: LOG: md5 authentication successful with frontend
2020-11-27 22:25:02: pid 52588: LOG: selecting backend connection
2020-11-27 22:25:02: pid 52588: DETAIL: failover or failback event detected, discarding existing connections
2020-11-27 22:25:02: pid 29908: LOG: child process with pid: 45018 exits with status 256
2020-11-27 22:25:02: pid 29908: LOG: fork a new child process with pid: 61828
2020-11-27 22:25:02: pid 29908: LOG: child process with pid: 50605 exits with status 256
2020-11-27 22:25:02: pid 29908: LOG: fork a new child process with pid: 61829
2020-11-27 22:30:02: pid 46720: LOG: failover or failback event detected
2020-11-27 22:30:02: pid 46720: DETAIL: restarting myself
2020-11-27 22:30:02: pid 55148: LOG: failover or failback event detected
2020-11-27 22:30:02: pid 55148: DETAIL: restarting myself
2020-11-27 22:30:02: pid 50371: LOG: failover or failback event detected
2020-11-27 22:30:02: pid 50371: DETAIL: restarting myself
2020-11-27 22:30:02: pid 57427: LOG: failover or failback event detected
2020-11-27 22:30:02: pid 57427: DETAIL: restarting myself
2020-11-27 22:30:02: pid 29908: LOG: child process with pid: 46720 exits with status 256
2020-11-27 22:30:02: pid 29908: LOG: fork a new child process with pid: 62337
2020-11-27 22:30:02: pid 29908: LOG: child process with pid: 50371 exits with status 256
2020-11-27 22:30:02: pid 29908: LOG: fork a new child process with pid: 62338
2020-11-27 22:30:02: pid 29908: LOG: child process with pid: 55148 exits with status 256
2020-11-27 22:30:02: pid 29908: LOG: fork a new child process with pid: 62339
2020-11-27 22:30:02: pid 29908: LOG: child process with pid: 57427 exits with status 256
2020-11-27 22:30:02: pid 29908: LOG: fork a new child process with pid: 62340
2020-11-27 22:30:02: pid 50438: LOG: md5 authentication successful with frontend
2020-11-27 22:30:02: pid 50438: LOG: selecting backend connection
2020-11-27 22:30:02: pid 50438: DETAIL: failover or failback event detected, discarding existing connections
TagsNo tags attached.

Activities

wl_wang2020

2020-12-01 11:57

reporter  

pgpool.conf (43,645 bytes)

t-ishii

2020-12-01 16:07

developer   ~0003623

I could not find log lines which suggests the client disconnection. Which line do you refer to?
(also I was not able to reproduce your problem.)

wl_wang2020

2020-12-01 16:44

reporter   ~0003624

How do you do

The front-end application reports an error and exits at the same time.The exit time is consistent with the PGPool log, as follows:
2020-11-27 22:16:01: pid 44904: LOG: selecting backend connection
2020-11-27 22:16:01: pid 44904: DETAIL: failover or failback event detected, discarding existing connections

t-ishii

2020-12-01 17:15

developer   ~0003625

Let me clarify. your front-end application reports what? Something like failed to connect to pgpool? How does your front-end application connect to pgpool? Does it directly connect to pgpool? Or via connection pooling of Tomcat?

wl_wang2020

2020-12-01 17:44

reporter   ~0003626

How do you do
1. The front end application error: selectentdtmngtbBySyorikbn 20 (errcode: 0) SQLException: An I/O error occured while sending to the backend. Com. Edb. Util. PSQLException: An I/O error occured while sending to the backend.
2. Direct PGPOOL for front-end application
3. When the front-end application went wrong, I was restoring the standby node using the command: pcp_attach_node -d -p 9898 -n 1. Once the execution is complete, the front application exits (this has happened twice in the build environment)

wl_wang2020

2020-12-01 17:47

reporter   ~0003627

2020-11-27 22:20:01: pid 52390: LOG: failover or failback event detected
2020-11-27 22:20:01: pid 52390: DETAIL: restarting myself
2020-11-27 22:20:01: pid 45018: LOG: md5 authentication successful with frontend
2020-11-27 22:20:01: pid 45018: LOG: selecting backend connection
2020-11-27 22:20:01: pid 45018: DETAIL: failover or failback event detected, discarding existing connections
What do these messages mean when they last for a day and then disappear? Does it matter?

t-ishii

2020-12-02 12:26

developer   ~0003629

Last edited: 2020-12-02 12:27

When failback event happens, pgpool main process sets a flag to let pgpool child process (they are actually interacting with frontend) know the event. When a child process finds the flag while they are in idle state (no connection from frontend), it suicides to refresh the status about backend because the backend state is cached in the child process's memory. Log messages below indicate it:

2020-11-27 22:20:01: pid 52390: LOG: failover or failback event detected
2020-11-27 22:20:01: pid 52390: DETAIL: restarting myself

Once all the child process do that, the message will not be displayed. Because you have many child process (num_init_children = 1000), it will take while.

The messages:
2020-11-27 22:20:01: pid 45018: LOG: md5 authentication successful with frontend
2020-11-27 22:20:01: pid 45018: LOG: selecting backend connection
2020-11-27 22:20:01: pid 45018: DETAIL: failover or failback event detected, discarding existing connections

are logged in the different situation. A pgpool child process receives new connection request without a chance to do the restarting explained above. So it proceeds with authentication (md5) then notices the flag is set. What it is doing here is, looking for backend connection cache which is expired by connection_life_time. If such connection exists, it closes it. But since you disable the parameter, actually pgpool does nothing here.

My guess is, after this the pgpool child process failed to use the existing connection cache to backend, it closed the connection to frontend. Question is, why pid 45018 did not leave the error log after this?

wl_wang2020

2020-12-02 14:53

reporter   ~0003630

Thank you very much.

The relevant log information has been understood.

Through investigation, when "detail: failure or failback event detected, discarding existing connections" occurs frequently, the front-end application does not report an error or quit.
The recovery node should cause the front-end application to report an error and exit (error: selectentdtmngtbbysyorikbn 20 (error code: 0) sqlexception: an I / O error occurred while sending to the backend). Will pgpool reset all connections when the node is restored?

t-ishii

2020-12-04 07:39

developer   ~0003636

It depends. If primary node is restored, pgpool will reset all connections. If standby node is restored, connections are not reset except front-end access DB while pgpool is doing fail back. This continues until pgpool finishes the fail back process.
When it's done, you should see something like in the log file.

failback done. reconnect host ...

wl_wang2020

2020-12-04 15:49

reporter   ~0003639

Thank you very much. This questionnaire can be closed

t-ishii

2020-12-04 17:03

developer   ~0003640

Thank you for confirmation. I am going to close this issue.

Issue History

Date Modified Username Field Change
2020-12-01 11:57 wl_wang2020 New Issue
2020-12-01 11:57 wl_wang2020 File Added: pgpool.conf
2020-12-01 13:26 t-ishii Assigned To => t-ishii
2020-12-01 13:26 t-ishii Status new => assigned
2020-12-01 16:07 t-ishii Note Added: 0003623
2020-12-01 16:07 t-ishii Status assigned => feedback
2020-12-01 16:07 t-ishii Steps to Reproduce Updated
2020-12-01 16:07 t-ishii Additional Information Updated
2020-12-01 16:44 wl_wang2020 Note Added: 0003624
2020-12-01 16:44 wl_wang2020 Status feedback => assigned
2020-12-01 17:15 t-ishii Note Added: 0003625
2020-12-01 17:15 t-ishii Status assigned => feedback
2020-12-01 17:44 wl_wang2020 Note Added: 0003626
2020-12-01 17:44 wl_wang2020 Status feedback => assigned
2020-12-01 17:47 wl_wang2020 Note Added: 0003627
2020-12-02 12:26 t-ishii Note Added: 0003629
2020-12-02 12:27 t-ishii Note Edited: 0003629
2020-12-02 14:53 wl_wang2020 Note Added: 0003630
2020-12-04 07:39 t-ishii Note Added: 0003636
2020-12-04 07:41 t-ishii Status assigned => feedback
2020-12-04 15:49 wl_wang2020 Note Added: 0003639
2020-12-04 15:49 wl_wang2020 Status feedback => assigned
2020-12-04 17:03 t-ishii Note Added: 0003640
2020-12-04 17:03 t-ishii Status assigned => closed