View Issue Details
| ID | Project | Category | View Status | Date Submitted | Last Update |
|---|---|---|---|---|---|
| 0000666 | Pgpool-II | Bug | public | 2020-12-01 11:57 | 2020-12-04 17:03 |
| Reporter | wl_wang2020 | Assigned To | t-ishii | ||
| Priority | high | Severity | minor | Reproducibility | always |
| Status | closed | Resolution | open | ||
| Platform | linux | OS | redhat | OS Version | 6.9 |
| Product Version | 4.1.0 | ||||
| Summary | 0000666: Restoring standby nodes in streaming replication mode causes discarding all connections (DETAIL: failover or failback event detected) | ||||
| Description | When the standby node is reattached with pcp_attach_node -d -p 9898 -n 1, Pgpool-II drops all existing connections, and the front-end application reports a database connection error. | ||||
| Steps To Reproduce | 1. Stop the PostgreSQL service on the standby node. 2. Confirm that pgpool shows the standby node as down. 3. Start the PostgreSQL service on the standby node. 4. Run pcp_attach_node -d -p 9898 -n 1 against pgpool. 5. The front-end application connected through pgpool reports an error. | ||||
| Additional Information | See the pgpool log excerpt below. | ||||

```
2020-11-27 22:15:30: pid 29908: LOG: Pgpool-II parent process has received failover request
2020-11-27 22:15:30: pid 29908: LOG: starting fail back. reconnect host 172.31.4.62(5444)
2020-11-27 22:15:30: pid 29908: LOG: Node 0 is not down (status: 2)
2020-11-27 22:15:30: pid 120387: LOG: PCP process with pid: 60809 exit with SUCCESS.
2020-11-27 22:15:30: pid 120387: LOG: PCP process with pid: 60809 exits with status 0
2020-11-27 22:15:30: pid 29908: LOG: Do not restart children because we are failing back node id 1 host: 172.31.4.62 port: 5444 and we are in streaming replication mode and not all backends were down
2020-11-27 22:15:30: pid 29908: LOG: find_primary_node_repeatedly: waiting for finding a primary node
2020-11-27 22:15:30: pid 29908: LOG: find_primary_node: primary node is 0
2020-11-27 22:15:30: pid 29908: LOG: find_primary_node: standby node is 1
2020-11-27 22:15:30: pid 29908: LOG: failover: set new primary node: 0
2020-11-27 22:15:30: pid 29908: LOG: failover: set new master node: 0
2020-11-27 22:15:30: pid 29908: LOG: failback done. reconnect host 172.31.4.62(5444)
2020-11-27 22:15:30: pid 120388: LOG: worker process received restart request
2020-11-27 22:15:31: pid 120387: LOG: restart request received in pcp child process
2020-11-27 22:15:31: pid 29908: LOG: PCP child 120387 exits with status 0 in failover()
2020-11-27 22:15:31: pid 29908: LOG: fork a new PCP child pid 60811 in failover()
2020-11-27 22:15:31: pid 29908: LOG: child process with pid: 119783 exits with status 256
2020-11-27 22:15:31: pid 60811: LOG: PCP process: 60811 started
2020-11-27 22:15:31: pid 29908: LOG: fork a new child process with pid: 60812
2020-11-27 22:15:31: pid 29908: LOG: worker child process with pid: 120388 exits with status 256
2020-11-27 22:15:31: pid 60812: LOG: failover or failback event detected
2020-11-27 22:15:31: pid 60812: DETAIL: restarting myself
2020-11-27 22:15:31: pid 29908: LOG: fork a new worker child process with pid: 60813
2020-11-27 22:15:31: pid 29908: LOG: child process with pid: 60812 exits with status 256
2020-11-27 22:15:31: pid 29908: LOG: fork a new child process with pid: 60814
2020-11-27 22:16:01: pid 45021: LOG: failover or failback event detected
2020-11-27 22:16:01: pid 45021: DETAIL: restarting myself
2020-11-27 22:16:01: pid 43288: LOG: failover or failback event detected
2020-11-27 22:16:01: pid 43288: DETAIL: restarting myself
2020-11-27 22:16:01: pid 48801: LOG: failover or failback event detected
2020-11-27 22:16:01: pid 48801: DETAIL: restarting myself
2020-11-27 22:16:01: pid 59340: LOG: failover or failback event detected
2020-11-27 22:16:01: pid 59340: DETAIL: restarting myself
2020-11-27 22:16:01: pid 50267: LOG: failover or failback event detected
2020-11-27 22:16:01: pid 50267: DETAIL: restarting myself
2020-11-27 22:16:01: pid 29908: LOG: child process with pid: 45021 exits with status 256
2020-11-27 22:16:01: pid 29908: LOG: fork a new child process with pid: 60923
2020-11-27 22:16:01: pid 29908: LOG: child process with pid: 43288 exits with status 256
2020-11-27 22:16:01: pid 29908: LOG: fork a new child process with pid: 60924
2020-11-27 22:16:01: pid 29908: LOG: child process with pid: 48801 exits with status 256
2020-11-27 22:16:01: pid 29908: LOG: fork a new child process with pid: 60925
2020-11-27 22:16:01: pid 29908: LOG: child process with pid: 50267 exits with status 256
2020-11-27 22:16:01: pid 29908: LOG: fork a new child process with pid: 60926
2020-11-27 22:16:01: pid 29908: LOG: child process with pid: 59340 exits with status 256
2020-11-27 22:16:01: pid 29908: LOG: fork a new child process with pid: 60927
2020-11-27 22:16:01: pid 44904: LOG: md5 authentication successful with frontend
2020-11-27 22:16:01: pid 44904: LOG: selecting backend connection
2020-11-27 22:16:01: pid 44904: DETAIL: failover or failback event detected, discarding existing connections
2020-11-27 22:16:12: pid 60336: LOG: failover or failback event detected
2020-11-27 22:16:12: pid 60336: DETAIL: restarting myself
2020-11-27 22:16:12: pid 60335: LOG: failover or failback event detected
2020-11-27 22:16:12: pid 60335: DETAIL: restarting myself
2020-11-27 22:16:12: pid 60334: LOG: failover or failback event detected
2020-11-27 22:16:12: pid 60334: DETAIL: restarting myself
2020-11-27 22:16:12: pid 60333: LOG: failover or failback event detected
2020-11-27 22:16:12: pid 60333: DETAIL: restarting myself
2020-11-27 22:16:12: pid 29908: LOG: child process with pid: 60335 exits with status 256
2020-11-27 22:16:12: pid 29908: LOG: fork a new child process with pid: 60934
2020-11-27 22:16:12: pid 29908: LOG: child process with pid: 60333 exits with status 256
2020-11-27 22:16:12: pid 29908: LOG: fork a new child process with pid: 60935
2020-11-27 22:16:12: pid 29908: LOG: child process with pid: 60334 exits with status 256
2020-11-27 22:16:12: pid 29908: LOG: fork a new child process with pid: 60936
2020-11-27 22:16:12: pid 29908: LOG: child process with pid: 60336 exits with status 256
2020-11-27 22:16:12: pid 29908: LOG: fork a new child process with pid: 60937
2020-11-27 22:20:01: pid 33046: LOG: failover or failback event detected
2020-11-27 22:20:01: pid 33046: DETAIL: restarting myself
2020-11-27 22:20:01: pid 29480: LOG: failover or failback event detected
2020-11-27 22:20:01: pid 29480: DETAIL: restarting myself
2020-11-27 22:20:01: pid 49547: LOG: failover or failback event detected
2020-11-27 22:20:01: pid 49547: DETAIL: restarting myself
2020-11-27 22:20:01: pid 31702: LOG: failover or failback event detected
2020-11-27 22:20:01: pid 31702: DETAIL: restarting myself
2020-11-27 22:20:01: pid 29908: LOG: child process with pid: 29480 exits with status 256
2020-11-27 22:20:01: pid 29908: LOG: fork a new child process with pid: 61365
2020-11-27 22:20:01: pid 29908: LOG: child process with pid: 33046 exits with status 256
2020-11-27 22:20:01: pid 29908: LOG: fork a new child process with pid: 61366
2020-11-27 22:20:01: pid 29908: LOG: child process with pid: 31702 exits with status 256
2020-11-27 22:20:01: pid 29908: LOG: fork a new child process with pid: 61367
2020-11-27 22:20:01: pid 29908: LOG: child process with pid: 49547 exits with status 256
2020-11-27 22:20:01: pid 29908: LOG: fork a new child process with pid: 61368
2020-11-27 22:20:01: pid 52390: LOG: failover or failback event detected
2020-11-27 22:20:01: pid 52390: DETAIL: restarting myself
2020-11-27 22:20:01: pid 45018: LOG: md5 authentication successful with frontend
2020-11-27 22:20:01: pid 45018: LOG: selecting backend connection
2020-11-27 22:20:01: pid 45018: DETAIL: failover or failback event detected, discarding existing connections
2020-11-27 22:20:01: pid 35048: LOG: failover or failback event detected
2020-11-27 22:20:01: pid 35048: DETAIL: restarting myself
2020-11-27 22:20:01: pid 70310: LOG: failover or failback event detected
2020-11-27 22:20:01: pid 70310: DETAIL: restarting myself
2020-11-27 22:20:01: pid 50605: DETAIL: failover or failback event detected, discarding existing connections
2020-11-27 22:20:01: pid 60771: LOG: failover or failback event detected
2020-11-27 22:20:01: pid 60771: DETAIL: restarting myself
2020-11-27 22:25:02: pid 41702: LOG: failover or failback event detected
2020-11-27 22:25:02: pid 41702: DETAIL: restarting myself
2020-11-27 22:25:02: pid 44741: LOG: failover or failback event detected
2020-11-27 22:25:02: pid 44741: DETAIL: restarting myself
2020-11-27 22:25:02: pid 47692: LOG: failover or failback event detected
2020-11-27 22:25:02: pid 47692: DETAIL: restarting myself
2020-11-27 22:25:02: pid 59593: LOG: failover or failback event detected
2020-11-27 22:25:02: pid 59593: DETAIL: restarting myself
2020-11-27 22:25:02: pid 105289: LOG: failover or failback event detected
2020-11-27 22:25:02: pid 105289: DETAIL: restarting myself
2020-11-27 22:25:02: pid 29908: LOG: child process with pid: 47692 exits with status 256
2020-11-27 22:25:02: pid 29908: LOG: fork a new child process with pid: 61822
2020-11-27 22:25:02: pid 29908: LOG: child process with pid: 105289 exits with status 256
2020-11-27 22:25:02: pid 29908: LOG: fork a new child process with pid: 61823
2020-11-27 22:25:02: pid 29908: LOG: child process with pid: 41702 exits with status 256
2020-11-27 22:25:02: pid 29908: LOG: fork a new child process with pid: 61824
2020-11-27 22:25:02: pid 29908: LOG: child process with pid: 59593 exits with status 256
2020-11-27 22:25:02: pid 29908: LOG: fork a new child process with pid: 61825
2020-11-27 22:25:02: pid 29908: LOG: child process with pid: 44741 exits with status 256
2020-11-27 22:25:02: pid 29908: LOG: fork a new child process with pid: 61826
2020-11-27 22:25:02: pid 52588: LOG: md5 authentication successful with frontend
2020-11-27 22:25:02: pid 52588: LOG: selecting backend connection
2020-11-27 22:25:02: pid 52588: DETAIL: failover or failback event detected, discarding existing connections
2020-11-27 22:25:02: pid 29908: LOG: child process with pid: 45018 exits with status 256
2020-11-27 22:25:02: pid 29908: LOG: fork a new child process with pid: 61828
2020-11-27 22:25:02: pid 29908: LOG: child process with pid: 50605 exits with status 256
2020-11-27 22:25:02: pid 29908: LOG: fork a new child process with pid: 61829
2020-11-27 22:30:02: pid 46720: LOG: failover or failback event detected
2020-11-27 22:30:02: pid 46720: DETAIL: restarting myself
2020-11-27 22:30:02: pid 55148: LOG: failover or failback event detected
2020-11-27 22:30:02: pid 55148: DETAIL: restarting myself
2020-11-27 22:30:02: pid 50371: LOG: failover or failback event detected
2020-11-27 22:30:02: pid 50371: DETAIL: restarting myself
2020-11-27 22:30:02: pid 57427: LOG: failover or failback event detected
2020-11-27 22:30:02: pid 57427: DETAIL: restarting myself
2020-11-27 22:30:02: pid 29908: LOG: child process with pid: 46720 exits with status 256
2020-11-27 22:30:02: pid 29908: LOG: fork a new child process with pid: 62337
2020-11-27 22:30:02: pid 29908: LOG: child process with pid: 50371 exits with status 256
2020-11-27 22:30:02: pid 29908: LOG: fork a new child process with pid: 62338
2020-11-27 22:30:02: pid 29908: LOG: child process with pid: 55148 exits with status 256
2020-11-27 22:30:02: pid 29908: LOG: fork a new child process with pid: 62339
2020-11-27 22:30:02: pid 29908: LOG: child process with pid: 57427 exits with status 256
2020-11-27 22:30:02: pid 29908: LOG: fork a new child process with pid: 62340
2020-11-27 22:30:02: pid 50438: LOG: md5 authentication successful with frontend
2020-11-27 22:30:02: pid 50438: LOG: selecting backend connection
2020-11-27 22:30:02: pid 50438: DETAIL: failover or failback event detected, discarding existing connections
```
| Tags | No tags attached. | ||||
|
|
I could not find log lines which suggest the client disconnection. Which lines do you refer to? (Also, I was not able to reproduce your problem.) |
|
|
Hello. The front-end application reports an error and exits at the same time. The exit time matches the Pgpool-II log, for example:
2020-11-27 22:16:01: pid 44904: LOG: selecting backend connection
2020-11-27 22:16:01: pid 44904: DETAIL: failover or failback event detected, discarding existing connections |
|
|
Let me clarify: your front-end application reports what? Something like "failed to connect to pgpool"? How does your front-end application connect to pgpool? Does it connect directly to pgpool, or via the connection pooling of Tomcat? |
|
|
Hello.
1. The front-end application error: selectentdtmngtbBySyorikbn 20 (errcode: 0) SQLException: An I/O error occurred while sending to the backend. com.edb.util.PSQLException: An I/O error occurred while sending to the backend.
2. The front-end application connects to pgpool directly.
3. When the front-end application failed, I was restoring the standby node with the command pcp_attach_node -d -p 9898 -n 1. Once the command completed, the front-end application exited (this has happened twice in this environment). |
|
|
2020-11-27 22:20:01: pid 52390: LOG: failover or failback event detected
2020-11-27 22:20:01: pid 52390: DETAIL: restarting myself
2020-11-27 22:20:01: pid 45018: LOG: md5 authentication successful with frontend
2020-11-27 22:20:01: pid 45018: LOG: selecting backend connection
2020-11-27 22:20:01: pid 45018: DETAIL: failover or failback event detected, discarding existing connections
What do these messages mean? They lasted for about a day and then disappeared. Does it matter? |
|
|
When a failback event happens, the pgpool main process sets a flag to let the pgpool child processes (which actually interact with the frontend) know about the event. When a child process finds the flag while it is idle (no connection from a frontend), it restarts itself to refresh its view of the backends, because backend state is cached in each child process's memory. The log messages below indicate this:
2020-11-27 22:20:01: pid 52390: LOG: failover or failback event detected
2020-11-27 22:20:01: pid 52390: DETAIL: restarting myself
Once all the child processes have done that, the message is no longer displayed. Because you have many child processes (num_init_children = 1000), it will take a while. The messages:
2020-11-27 22:20:01: pid 45018: LOG: md5 authentication successful with frontend
2020-11-27 22:20:01: pid 45018: LOG: selecting backend connection
2020-11-27 22:20:01: pid 45018: DETAIL: failover or failback event detected, discarding existing connections
are logged in a different situation. A pgpool child process received a new connection request before it had a chance to do the restart explained above, so it proceeded with authentication (md5) and then noticed that the flag was set. What it is doing here is looking for cached backend connections that have expired per connection_life_time; if such a connection exists, it closes it. But since you have disabled that parameter, pgpool actually does nothing here. My guess is that after this, the pgpool child process failed to use the existing connection cache to the backend and closed the connection to the frontend. The question is why pid 45018 did not leave an error log after this. |
|
|
Thank you very much; the log information is now understood. Through further investigation: during the periods when "DETAIL: failover or failback event detected, discarding existing connections" occurs frequently, the front-end application does not report an error or quit. It is reattaching the node that causes the front-end application to report an error and exit (error: selectentdtmngtbbysyorikbn 20 (error code: 0) sqlexception: an I/O error occurred while sending to the backend). Will pgpool reset all connections when the node is restored? |
|
|
It depends. If the primary node is restored, pgpool resets all connections. If a standby node is restored, connections are not reset, except for frontends that access the DB while pgpool is performing the failback. This continues until pgpool finishes the failback process. When it is done, you should see something like the following in the log file:
failback done. reconnect host ... |
|
|
Thank you very much. This issue can be closed. |
|
|
Thank you for confirmation. I am going to close this issue. |
| Date Modified | Username | Field | Change |
|---|---|---|---|
| 2020-12-01 11:57 | wl_wang2020 | New Issue | |
| 2020-12-01 11:57 | wl_wang2020 | File Added: pgpool.conf | |
| 2020-12-01 13:26 | t-ishii | Assigned To | => t-ishii |
| 2020-12-01 13:26 | t-ishii | Status | new => assigned |
| 2020-12-01 16:07 | t-ishii | Note Added: 0003623 | |
| 2020-12-01 16:07 | t-ishii | Status | assigned => feedback |
| 2020-12-01 16:07 | t-ishii | Steps to Reproduce Updated | |
| 2020-12-01 16:07 | t-ishii | Additional Information Updated | |
| 2020-12-01 16:44 | wl_wang2020 | Note Added: 0003624 | |
| 2020-12-01 16:44 | wl_wang2020 | Status | feedback => assigned |
| 2020-12-01 17:15 | t-ishii | Note Added: 0003625 | |
| 2020-12-01 17:15 | t-ishii | Status | assigned => feedback |
| 2020-12-01 17:44 | wl_wang2020 | Note Added: 0003626 | |
| 2020-12-01 17:44 | wl_wang2020 | Status | feedback => assigned |
| 2020-12-01 17:47 | wl_wang2020 | Note Added: 0003627 | |
| 2020-12-02 12:26 | t-ishii | Note Added: 0003629 | |
| 2020-12-02 12:27 | t-ishii | Note Edited: 0003629 | |
| 2020-12-02 14:53 | wl_wang2020 | Note Added: 0003630 | |
| 2020-12-04 07:39 | t-ishii | Note Added: 0003636 | |
| 2020-12-04 07:41 | t-ishii | Status | assigned => feedback |
| 2020-12-04 15:49 | wl_wang2020 | Note Added: 0003639 | |
| 2020-12-04 15:49 | wl_wang2020 | Status | feedback => assigned |
| 2020-12-04 17:03 | t-ishii | Note Added: 0003640 | |
| 2020-12-04 17:03 | t-ishii | Status | assigned => closed |