[pgpool-general: 3611] Re: PGPool sending r/w queries to wrong DB node

Pablo Sanchez pablo at blueoakdb.com
Fri Apr 10 03:23:07 JST 2015


[ Comments below, in-line ]

On 04/09/2015 01:13 AM, Yugo Nagata wrote:
> Hi,

Hi Yugo,

With your insight I believe I may have pieced together what happened.
I think we may have a bug (or two?) with pgpool but I'm not sure.
Please see my questions below (marked "q:").

As always, thank you for your time!

::: Details :::

pgpool information
------------------
We have not implemented failover, so the following pgpool.conf
parameters are set:

    o failover_command = ''
    o failback_command = ''
    o fail_over_on_backend_error = on
    o search_primary_node_timeout = 10

When I start pgpool, I use --discard-status.
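
A minimal sketch (Python, standard library only) for spotting this
combination in a pgpool.conf: backend errors may degenerate a node
while no failover_command is set. The function names and the warning
text are my own illustration, not part of pgpool, and the parser only
handles simple "key = value" lines.

```python
def parse_pgpool_conf(text):
    """Parse simple 'key = value' lines from pgpool.conf-style text.

    Assumption: values contain no '#' characters; anything after '#'
    is treated as a comment. Surrounding single quotes are stripped.
    """
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip().strip("'")
    return settings


def failover_warnings(settings):
    """Return warnings for the combination described in this message."""
    warnings = []
    if (settings.get("fail_over_on_backend_error") == "on"
            and not settings.get("failover_command")):
        warnings.append(
            "fail_over_on_backend_error is on but failover_command is empty"
        )
    return warnings
```

With the four parameters listed above as input, failover_warnings()
returns a single warning.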

Reconstruction
--------------
Here's what I believe happened, in order of execution with questions
on whether we have a bug or not:

o Because "fail_over_on_backend_error = on", when we encountered the
   error [1], pgpool degenerated DB node 0

   q:  Should we have degenerated because of the failed DELETE
       statement?  Is this a bug?

       I remember terminating the DELETE statement on PG.

o pgpool then searches for a primary node [2]

o I believe that once "search_primary_node_timeout" (10 seconds)
   expires, pgpool mistakenly selects the Slave DB as the primary [3]

   q:  Why did pgpool select the Slave DB?  Is this a bug?

       The DB Cluster has not been restarted since the middle of last
       month.

       As a point of reference, to fix the situation, all I did was
       restart pgpool.

References
----------
[1] - degenerate

(pid 14902): mydb: LOG:  pool_send_and_wait: Error or notice message 
from backend: : DB node id: 0 backend pid: 31461 statement: "delete from 
t_case_file_tag_association where case_file_id=$1" message: "terminating 
connection due to administrator command"
(pid 15214): mydb: LOG:  pool_send_and_wait: Error or notice message 
from backend: : DB node id: 0 backend pid: 578 statement: "delete from 
t_case_file_tag_association where case_file_id=$1" message: "terminating 
connection due to administrator command"
(pid 14902): mydb: ERROR:  unable to forward message to frontend
(pid 14902): mydb: DETAIL:  FATAL error occured on backend
(pid 15214): mydb: ERROR:  unable to forward message to frontend
(pid 15214): mydb: DETAIL:  FATAL error occured on backend
(pid 14902): mydb: LOG:  received degenerate backend request for 
node_id: 0 from pid [14902]

[2] - seeking a primary node

LOG:  find_primary_node_repeatedly: waiting for finding a primary node
LOG:  find_primary_node: checking backend no 0
LOG:  find_primary_node: checking backend no 1

[3] - pgpool selects the Slave DB as the Primary

LOG:  failover: set new primary node: -1
LOG:  failover: set new master node: 1
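
For anyone who wants to pull the same sequence out of their own pgpool
log, here is a minimal sketch (Python, standard library only). The
patterns are taken from the log lines quoted above. The function name,
the event labels and the return shape are my own illustration, and
other pgpool versions may word these messages differently.

```python
import re

# Event label -> pattern, based only on the messages quoted in [1]-[3].
PATTERNS = [
    ("degenerate",
     re.compile(r"received degenerate backend request for\s+node_id: (\d+)")),
    ("check",
     re.compile(r"find_primary_node: checking backend no (\d+)")),
    ("new_primary",
     re.compile(r"failover: set new primary node: (-?\d+)")),
    ("new_master",
     re.compile(r"failover: set new master node: (-?\d+)")),
]


def failover_events(log_text):
    """Return (label, node_id) tuples in the order they appear in the log."""
    # Mail clients wrap long log lines, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    found = []
    for label, pattern in PATTERNS:
        for match in pattern.finditer(flat):
            found.append((match.start(), label, int(match.group(1))))
    found.sort()
    return [(label, node) for _, label, node in found]
```

A "new_primary" event with node -1 followed by a "new_master" event
with a non-negative node is the pattern shown in [3].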

--
Pablo Sanchez - Blueoak Database Engineering, Inc
Ph:    819.459.1926         Blog:  http://pablo-blog.blueoakdb.com
iNum:  883.5100.0990.1054

