View Issue Details
| ID | Project | Category | View Status | Date Submitted | Last Update |
|---|---|---|---|---|---|
| 0000431 | Pgpool-II | Bug | public | 2018-10-10 10:59 | 2018-10-16 15:54 |
| Reporter | nagata | Assigned To | nagata | ||
| Priority | normal | Severity | minor | Reproducibility | always |
| Status | closed | Resolution | fixed | ||
| Product Version | 3.6.12 | ||||
| Summary | 0000431: In native replication mode, online-recovery is blocked after a child process exits abnormally while accepting a connection. | ||||
| Description | In native replication mode, the 2nd stage script of online recovery is executed only after all connections are closed. The connection counter is incremented when a child process accepts a connection and decremented when the session is closed. However, if a child process exits abnormally, for example due to a segfault or kill -9, the counter is never decremented, and this blocks the 2nd stage script forever. | ||||
| Steps To Reproduce | 1. Configure a native replication cluster with pgpool_setup 2. Stop a backend node 3. Connect to Pgpool-II using psql 4. Kill the child process that psql is connected to 5. Run online recovery using pcp_recovery_node -> the 2nd stage script is blocked forever | ||||
| Tags | No tags attached. | ||||
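
The counter leak described above can be illustrated with a small toy model. This is only a sketch of the mechanism, not Pgpool-II code: the class `ConnCounter`, the function `run_child`, and the `dies_abnormally` flag are all hypothetical names invented for the illustration, and a plain Python object stands in for the shared-memory counter.

```python
class ConnCounter:
    """Toy stand-in for a shared connection counter (hypothetical, not Pgpool-II code)."""

    def __init__(self):
        self.value = 0

    def on_accept(self):
        # A child process accepted a client connection.
        self.value += 1

    def on_session_close(self):
        # The session ended normally and the child cleaned up.
        self.value -= 1


def run_child(counter, dies_abnormally):
    """Simulate one child process handling one client session."""
    counter.on_accept()
    if dies_abnormally:
        # Models a segfault or kill -9: the cleanup path never runs,
        # so the decrement below is skipped.
        return
    counter.on_session_close()


counter = ConnCounter()
run_child(counter, dies_abnormally=False)  # normal session: counter returns to 0
run_child(counter, dies_abnormally=True)   # killed child: counter stays at 1
leaked = counter.value
```

In this model the counter never returns to zero after the second child, so anything that waits for "all connections closed" would wait indefinitely.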
|
|
You should enable client_idle_limit_in_recovery. |
|
|
Thank you for your quick response. Yes, enabling client_idle_limit_in_recovery prevents online-recovery from being blocked forever. However, online-recovery itself then fails with the following error: LOG: wait_connection_closed: existing connections did not close in 90 sec. ERROR: node recovery failed, waiting connection closed in the other pgpools timeout I think the only way to make online-recovery work in this situation is to restart Pgpool-II so that Req_info->conn_counter is reset to zero, right? |
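
The timeout behaviour reported in this note can be sketched as a bounded polling loop. This is an assumption about the general shape of such a wait, not the actual `wait_connection_closed` implementation: the function name `wait_connections_closed`, its parameters, and the polling interval are hypothetical, and the timeout in the usage lines is shortened from the 90 seconds seen in the log.

```python
import time


def wait_connections_closed(get_count, timeout_sec, poll_sec=0.01):
    """Poll a connection counter until it reaches zero or the timeout expires.

    get_count is a hypothetical callable returning the current counter value.
    Returns True if the counter reached zero in time, False otherwise.
    """
    deadline = time.monotonic() + timeout_sec
    while time.monotonic() < deadline:
        if get_count() == 0:
            return True
        time.sleep(poll_sec)
    # One last check after the deadline has passed.
    return get_count() == 0


# A counter leaked by a killed child never drops, so the wait times out.
timed_out = not wait_connections_closed(lambda: 1, timeout_sec=0.05)
# With no leaked connection the wait succeeds immediately.
closed_ok = wait_connections_closed(lambda: 0, timeout_sec=0.05)
```

Under this model a timeout only turns the endless block into a recovery failure; the stale count itself remains until the counter is reset.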
|
|
In any case, if a Pgpool-II child process is killed abnormally, there is not much Pgpool-II can do. I recommend restarting Pgpool-II. |
|
|
OK. I take it that it is difficult to resolve this by fixing Pgpool-II. Our client is suffering from segfaults of child processes, and this brought the online-recovery problem to the surface. So the root problem is the segfault, and I am continuing to investigate it now. I'll report if I find something new. |
|
|
The segmentation fault I mentioned in this thread is reported at http://www.pgpool.net/mantisbt/view.php?id=434 so I'll close this thread. Thanks. |
| Date Modified | Username | Field | Change |
|---|---|---|---|
| 2018-10-10 10:59 | nagata | New Issue | |
| 2018-10-10 11:00 | nagata | Description Updated | |
| 2018-10-10 11:20 | t-ishii | Note Added: 0002189 | |
| 2018-10-10 11:47 | nagata | Note Added: 0002190 | |
| 2018-10-10 11:48 | nagata | Note Edited: 0002190 | |
| 2018-10-10 11:56 | t-ishii | Note Added: 0002191 | |
| 2018-10-10 12:14 | nagata | Note Added: 0002192 | |
| 2018-10-10 12:15 | nagata | Note Edited: 0002192 | |
| 2018-10-16 15:53 | nagata | Note Added: 0002203 | |
| 2018-10-16 15:54 | nagata | Assigned To | => nagata |
| 2018-10-16 15:54 | nagata | Status | new => closed |
| 2018-10-16 15:54 | nagata | Resolution | open => fixed |