[pgpool-general: 3622] Re: PGPool sending r/w queries to wrong DB node

Pablo Sanchez pablo at blueoakdb.com
Tue Apr 14 06:10:45 JST 2015


[ Comments below, in-line ]

On 04/12/2015 08:40 PM, Yugo Nagata wrote:
> On Thu, 09 Apr 2015 14:23:07 -0400
> Pablo Sanchez <pablo at blueoakdb.com> wrote:
>
>> On 04/09/2015 01:13 AM, Yugo Nagata wrote:
>>> Hi,
>>
>> Hi Yugo,
>>
>> With your insight I believe I may have pieced together what happened.
>> I think we may have a bug (or two?) with pgpool but I'm not sure.
>> Please see my questions below ("q:").
>>
>> As always, thank you for your time!
>>
>> ::: Details :::
>>
>> pgpool information
>> ------------------
>> Our failover isn't implemented; consequently, we have the following
>> pgpool.conf parameters set:
>>
>>      o failover_command = ''
>
> I think this is the reason why the standby PostgreSQL doesn't
> promote to primary. pgpool-II itself does nothing about
> this promotion and a failover script is needed.

Hi Yugo,

The Primary /never/ crashed.  Neither did the Slave.

In other words, there was no need to do a promotion.

>> Reconstruction
>> --------------
>> Here's what I believe happened, in order of execution with questions
>> on whether we have a bug or not:
>>
>> o Because "fail_over_on_backend_error = on", when we encountered the
>>     error [1], pgpool degenerated
>>
>>     q:  Should we have degenerated because of the failed DELETE
>>         statement?  Is this a bug?
>>
>>         I remember terminating the DELETE statement on PG.
>
> The DELETE statement failure itself doesn't cause the degeneration.
> I think that this is caused by terminating the statement. How and
> why did you terminate the statement?

I terminated the DELETE because it had been running for 30+ minutes.
The users of the system informed me this DELETE should complete in
under a second.

At the time of termination, there was very little other activity on
the DB.  However, I cannot say definitively that terminating the DELETE
caused the degeneration, but the log below [1] suggests it.

Notice the pid of the DELETE is "14902"

    (pid 14902): mydb: LOG:  pool_send_and_wait: Error or notice
    message from backend: : DB node id: 0 backend pid: 31461
    statement: "delete from t_case_file_tag_association where
    case_file_id=$1" message: "terminating

Now notice this entry, which reports the degenerate backend request
as coming from pid "14902":

    (pid 14902): mydb: LOG:  received degenerate backend request for
    node_id: 0 from pid [14902]

Isn't the log saying the degeneration was requested by pid "14902",
which is the process that ran the "DELETE"?
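
To make that correlation less dependent on eyeballing the log, here is
a minimal Python sketch that pairs each "received degenerate backend
request" entry with the statements previously logged by the same pid.
It assumes each log entry sits on a single line (the excerpts in this
mail are wrapped) and that the "(pid N): db: LEVEL:" prefix looks like
the one shown above.  The names `correlate` and `SAMPLE_LOG` and the
regular expressions are mine for illustration, not part of pgpool.

```python
import re
from collections import defaultdict

# Assumed entry prefix, modelled on the excerpts: "(pid N): db: LEVEL:  message"
ENTRY = re.compile(r"^\(pid (?P<pid>\d+)\): [^:]+: [A-Z]+:\s+(?P<msg>.*)$")
# Statement reported alongside a backend error or notice.
STATEMENT = re.compile(r'backend pid: \d+ statement: "(?P<stmt>[^"]*)"')
# Degenerate request and the pid it names as its origin.
DEGENERATE = re.compile(
    r"received degenerate backend request for node_id: (?P<node>\d+) "
    r"from pid \[(?P<from_pid>\d+)\]"
)

def correlate(lines):
    """Return (requesting_pid, node_id, statements_seen_for_that_pid) tuples."""
    statements = defaultdict(list)
    results = []
    for line in lines:
        entry = ENTRY.match(line)
        if entry is None:
            continue  # skip lines without the "(pid N)" prefix
        pid = int(entry["pid"])
        msg = entry["msg"]
        stmt = STATEMENT.search(msg)
        if stmt:
            statements[pid].append(stmt["stmt"])
        deg = DEGENERATE.search(msg)
        if deg:
            from_pid = int(deg["from_pid"])
            results.append(
                (from_pid, int(deg["node"]), list(statements.get(from_pid, [])))
            )
    return results

# Entries from reference [1], joined onto single lines (an assumption about
# how they appear in the real log file).
SAMPLE_LOG = [
    '(pid 14902): mydb: LOG:  pool_send_and_wait: Error or notice message '
    'from backend: : DB node id: 0 backend pid: 31461 statement: "delete from '
    't_case_file_tag_association where case_file_id=$1" message: "terminating '
    'connection due to administrator command"',
    '(pid 14902): mydb: ERROR:  unable to forward message to frontend',
    '(pid 14902): mydb: LOG:  received degenerate backend request for '
    'node_id: 0 from pid [14902]',
]

for pid, node, stmts in correlate(SAMPLE_LOG):
    print(f"pid {pid} requested degeneration of node {node}; statements: {stmts}")
```

Run against the entries in [1], this ties the degenerate request for
node 0 to the same pid that logged the terminated DELETE, which is the
link I am asking about.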

>>
>> References
>> ----------
>> [1] - degenerate
>>
>> (pid 14902): mydb: LOG:  pool_send_and_wait: Error or notice message
>> from backend: : DB node id: 0 backend pid: 31461 statement: "delete from
>> t_case_file_tag_association where case_file_id=$1" message: "terminating
>> connection due to administrator command"
>> (pid 15214): mydb: LOG:  pool_send_and_wait: Error or notice message
>> from backend: : DB node id: 0 backend pid: 578 statement: "delete from
>> t_case_file_tag_association where case_file_id=$1" message: "terminating
>> connection due to administrator command"
>> (pid 14902): mydb: ERROR:  unable to forward message to frontend
>> (pid 14902): mydb: DETAIL:  FATAL error occured on backend
>> (pid 15214): mydb: ERROR:  unable to forward message to frontend
>> (pid 15214): mydb: DETAIL:  FATAL error occured on backend
>> (pid 14902): mydb: LOG:  received degenerate backend request for
>> node_id: 0 from pid [14902]
>>
>> [2] - seeking a primary node
>>
>> LOG:  find_primary_node_repeatedly: waiting for finding a primary node
>> LOG:  find_primary_node: checking backend no 0
>> LOG:  find_primary_node: checking backend no 1
>>
>> [3] - pgpool selects the Slave DB as the Primary
>>
>> LOG:  failover: set new primary node: -1
>> LOG:  failover: set new master node: 1
>>
>> --
>> Pablo Sanchez - Blueoak Database Engineering, Inc
>> Ph:    819.459.1926         Blog:  http://pablo-blog.blueoakdb.com
>> iNum:  883.5100.0990.1054
>
>



--
Pablo Sanchez - Blueoak Database Engineering, Inc
Ph:    819.459.1926         Blog:  http://pablo-blog.blueoakdb.com
iNum:  883.5100.0990.1054


