View Issue Details
| ID | Project | Category | View Status | Date Submitted | Last Update |
|---|---|---|---|---|---|
| 0000319 | Pgpool-II | Bug | public | 2017-07-03 22:40 | 2017-07-19 16:48 |
| Reporter | drrtuy | Assigned To | |||
| Priority | normal | Severity | block | Reproducibility | sometimes |
| Status | closed | Resolution | open | ||
| Platform | x86_64 | OS | Centos | OS Version | 7.1 |
| Product Version | 3.6.4 | ||||
| Summary | 0000319: pgpool hangs in pool_check_fd() | ||||
| Description | Greetings, I have an issue with pgpool 3.6.4 and postgres 9.4.12. Somehow packets of the frontend protocol get lost and pgpool hangs waiting for an answer from postgres. The topology is simple: pgpool(P) ----> postgres(PG). P's IP is 10.69.64.27 and PG's IP is 10.69.64.164. Here are the states of the pgpool process [1] and postgres [2]. [1] https://pastebin.com/UeGLkUYA [2] https://pastebin.com/z2U8Xhfk Would it be possible to call pool_set_timeout() with a reasonable timeout right before pool_check_fd()? | ||||
| Tags | No tags attached. | ||||
|
|
I think there is no reasonable timeout, since each SQL command could take a very long time. It is possible that Pgpool-II waits for an answer from PostgreSQL at the wrong time. To judge that, we need a self-contained test case. |
|
|
I have two questions regarding the issue: 1) What, in your opinion, could cause such lost messages between the pool and a backend? The only explanation I can think of is network-layer packet loss, but in that case TCP retransmission would recover the lost segments. 2) Would it be acceptable to replace the infinite select() with a loop that calls select() with a configurable timeout and then tries to send a Describe protocol message? The command will fail if the remote backend is waiting for the next command, or time out if the backend is really busy. I could write a prototype and try it out while the situation is reproducible in my environment. |
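The loop proposed in question 2 could look roughly like the sketch below. This is only an illustration in plain POSIX C, not Pgpool-II code: `wait_readable`, `idle_probe_fn`, and all parameters are hypothetical names, and the probe (for example, sending a Describe message) is left as a callback because what it should send is exactly the open question.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical callback run after each timed-out round, e.g. to probe the
 * backend. Returns 0 to keep waiting, non-zero to give up. */
typedef int (*idle_probe_fn)(int fd, void *arg);

/*
 * Wait until fd becomes readable, waking up every timeout_sec seconds.
 * max_rounds <= 0 means "no limit on the number of rounds".
 * Returns 1 if fd is readable, 0 if we gave up (probe asked to stop or
 * max_rounds was exhausted), -1 on a select() error.
 */
int wait_readable(int fd, int timeout_sec, int max_rounds,
                  idle_probe_fn probe, void *arg)
{
    for (int round = 0; max_rounds <= 0 || round < max_rounds; round++) {
        fd_set readfds;
        struct timeval tv;

        /* select() may modify both arguments, so rebuild them each round. */
        FD_ZERO(&readfds);
        FD_SET(fd, &readfds);
        tv.tv_sec = timeout_sec;
        tv.tv_usec = 0;

        int rc = select(fd + 1, &readfds, NULL, NULL, &tv);
        if (rc > 0)
            return 1;               /* data (or EOF) is available */
        if (rc < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: next round */
            return -1;
        }

        /* rc == 0: this round timed out; let the caller decide what to do. */
        if (probe != NULL && probe(fd, arg) != 0)
            return 0;
    }
    return 0;
}
```

With `max_rounds <= 0` and no probe, this behaves like the current infinite wait, just with periodic wakeups, so the timeout value and the probe are the only parts that change behavior.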
|
|
I am not sure what you mean by "lost messages". I think Pgpool-II is just waiting in vain for a message from the backend (or frontend).

> 2) Would it be acceptable to replace the infinite select() with a loop that calls select() with a configurable timeout and then tries to send a Describe protocol message? The command will fail if the remote backend is waiting for the next command, or time out if the backend is really busy. I could write a prototype and try it out while the situation is reproducible in my environment.

A failed command will cause lots of other problems: aborted transactions and unwanted error messages coming from the backend. I believe the proper solution is to find out the cause of the hang and fix the bug. That's why I requested a self-contained test case (but you did not respond to my request). Without a test case, I'm not sure what kind of problem you have. Anyway, the attached patch *may* fix your problem; it was created for a different error report (bug317). |
|
|
Thanks for the answer, Tatsuo. I will try the patch. Meanwhile, the issue should be closed, since I can't reproduce the behavior in a controllable fashion. |
|
|
Ok, issue closed. |
| Date Modified | Username | Field | Change |
|---|---|---|---|
| 2017-07-03 22:40 | drrtuy | New Issue | |
| 2017-07-04 10:40 | t-ishii | Note Added: 0001568 | |
| 2017-07-04 22:21 | drrtuy | Note Added: 0001574 | |
| 2017-07-19 14:35 | t-ishii | File Added: pgpool-hung.diff | |
| 2017-07-19 14:35 | t-ishii | Note Added: 0001587 | |
| 2017-07-19 14:35 | t-ishii | Status | new => feedback |
| 2017-07-19 15:33 | drrtuy | Note Added: 0001590 | |
| 2017-07-19 15:33 | drrtuy | Status | feedback => new |
| 2017-07-19 16:48 | t-ishii | Note Added: 0001593 | |
| 2017-07-19 16:48 | t-ishii | Status | new => closed |