View Issue Details
| ID | Project | Category | View Status | Date Submitted | Last Update |
|---|---|---|---|---|---|
| 0000025 | Pgpool-II | Bug | public | 2012-08-28 03:57 | 2012-09-10 21:32 |
| Reporter | singh.gurjeet | Assigned To | t-ishii | ||
| Priority | high | Severity | minor | Reproducibility | always |
| Status | resolved | Resolution | reopened | ||
| OS | HP-UX | ||||
| Summary | 0000025: s_do_auth doesn't handle NoticeResponse (N) message | ||||
| Description | A customer of ours sees these messages in log fie: 2012-08-02 13:10:57 ERROR: pid 12428: pool_read2: EOF encountered with backend 2012-08-02 13:10:57 ERROR: pid 12428: make_persistent_db_connection: s_do_auth failed 2012-08-02 13:10:57 DEBUG: pid 12428: pool_ssl: SSL requested but SSL support is not available 2012-08-02 13:10:57 ERROR: pid 12393: pool_read2: EOF encountered with backend 2012-08-02 13:10:57 ERROR: pid 12393: make_persistent_db_connection: s_do_auth failed 2012-08-02 13:10:57 ERROR: pid 12393: find_primary_node: make_persistent_connection failed 2012-08-02 13:10:57 DEBUG: pid 12393: starting health checking 2012-08-02 13:10:57 DEBUG: pid 12393: health_check: 0 th DB node status: 2 ...... 2012-08-02 13:10:57 ERROR: pid 12428: s_do_auth: unknown response "N" before processing BackendKeyData ...... 2012-08-02 13:10:57 ERROR: pid 12428: s_do_auth: unknown response "J" before processing BackendKeyData ...... 2012-08-02 13:10:57 ERROR: pid 12393: health check failed during read. host 172.16.1.2 at port 5445 is down. reason: Connection reset by peer 2012-08-02 13:10:57 LOG: pid 12393: set 0 th backend down status 2012-08-02 13:10:57 DEBUG: pid 12393: failover_handler called 2012-08-02 13:10:57 DEBUG: pid 12393: failover_handler: starting to select new master node ...... 2012-08-02 13:10:57 LOG: pid 12393: execute command: /opt/PostgresPlus/9.1AS/bin/failover_stream.sh 0 172.16.1.3 /opt/PostgresPlus/9.1AS/data/trigger_file ...... 2012-08-02 13:09:28 ERROR: pid 12428: s_do_auth: unknown response "N" before processing BackendKeyData Robert Hass' analysis of the problem in pgpool code: <snip> I analyzed the following error messages in detail: 2012-08-02 13:10:57 ERROR: pid 12428: s_do_auth: unknown response "N" before processing BackendKeyData 2012-08-02 13:10:57 ERROR: pid 12428: s_do_auth: unknown response "J" before processing BackendKeyData It looks like the function s_do_auth() gets invoked when pgpool descides to try to create a persistent connection to the backend. It has a loop where it reads parameter-status (S) messages while waiting for a backend-key-data (K) message. If it gets any other message type, it bombs out with the above error. Now, a notice-response message (N) is actually valid here, but the code doesn't handle it. Now, you might ask where the complaint about "J" comes from, since AFAICT there's no J message anywhere. But it turns out that after each complaint about an N message, there are three complaints about a message where the message type is a NUL byte followed by a complaint about a "J" message. This happens because the code is s_do_read doesn't actually bail out when it reads an unrecognized message-type byte, nor does it make any effort to skip the message and preserve sync. Instead it just blindly read a byte at a time and hopes something good happens, which of course it doesn't, because now it's reading through the length-word of the protocol message a byte at a time, trying to interpret each byte as the beginning of a new message. Oops. We will need to get this fixed in the community version of pgpool. </snip> | ||||
| Tags | No tags attached. | ||||
|
|
Robert Haas's comment seems correct. I need to know how to reproduce (notice message arriving case). Can you provide? |
|
|
Anyway I have committed fix to master. http://git.postgresql.org/gitweb/?p=pgpool2.git;a=commit;h=7205711e4da0037043dc85c04b0373493798fd1a |
|
|
Hello. We can reproduce it with following setting. Master server 's postgresql.conf client_min_messages = debug5 And I confirmed that the patch fixes this problem on my test machine. But I have no idea where "J" comes from. Regards. |
|
|
Glad to hear that. Thanks for the report! |
|
|
The fix looks good. But I do not agree with this hunk of commit diff: case 'S': /* parameter status */ + case 'N': /* notice response */ + case 'E': /* error response */ pool_debug("s_do_auth: parameter status data received"); That is plain misleading. Emitting the message "parameter status data received" upon receiving an error or notice response will mislead anyone reading the log file. I think each of those cases deserve a message of their own. |
|
|
Will fix. |
| Date Modified | Username | Field | Change |
|---|---|---|---|
| 2012-08-28 03:57 | singh.gurjeet | New Issue | |
| 2012-08-28 09:52 | t-ishii | Assigned To | => t-ishii |
| 2012-08-28 09:52 | t-ishii | Status | new => assigned |
| 2012-08-28 09:54 | t-ishii | Note Added: 0000061 | |
| 2012-08-28 10:52 | t-ishii | Note Edited: 0000061 | |
| 2012-08-28 10:53 | t-ishii | Note Added: 0000062 | |
| 2012-08-28 16:58 | ketanaka | Note Added: 0000063 | |
| 2012-08-28 20:05 | t-ishii | Note Added: 0000065 | |
| 2012-08-28 20:06 | t-ishii | Status | assigned => resolved |
| 2012-08-28 20:06 | t-ishii | Resolution | open => fixed |
| 2012-08-29 01:04 | singh.gurjeet | Note Added: 0000066 | |
| 2012-08-29 01:04 | singh.gurjeet | Status | resolved => feedback |
| 2012-08-29 01:04 | singh.gurjeet | Resolution | fixed => reopened |
| 2012-08-29 07:52 | t-ishii | Note Added: 0000067 | |
| 2012-08-29 11:44 | t-ishii | Status | feedback => resolved |