[Pgpool-hackers] Major bug with pcp_detach_node

Gilles Darold gilles.darold at dalibo.com
Thu Jan 6 22:45:58 UTC 2011


Hi,

I found an annoying problem with the PCP command pcp_detach_node. I have
3 computers, each running a PostgreSQL instance in a streaming
replication chain. pgpool-II is running on the first node, which is the
master. The problem occurs when you pass a node id outside the range of
configured nodes.

As I explained above, I have only 3 nodes, so valid node ids go from 0
to 2. If I use node id 3, which doesn't exist, here are the results:

/usr/bin/pcp_detach_node -d 10 192.168.1.11 9898 postgres postgres 3

DEBUG: send: tos="R", len=46
DEBUG: recv: tos="r", len=21, data=AuthenticationOK
DEBUG: send: tos="D", len=6
DEBUG: recv: tos="d", len=20, data=CommandComplete
DEBUG: send: tos="X", len=4
------------- log file ----------------
LOG: notice_backend_error: node 0 is not valid backend.
LOG: starting degeneration. shutdown host 192.168.1.13(5432)
LOG: execute command: /home/postgres/bin/failover.sh 2 192.168.1.13 192.168.1.11 /home/postgres/data/postgres.trigger
LOG: failover_handler: set new master node: 0
LOG: failover done. shutdown host 192.168.1.13(5432)
LOG: find_primary_node: primary node id is 0
 
[postgres@vm1 ~]$ psql -p 9999 -c "SHOW pool_nodes;"
 node_id |   hostname   | port | status | lb_weight | state
---------+--------------+------+--------+-----------+-------
 0       | 192.168.1.11 | 5432 | 2      | 0.333333  | P
 1       | 192.168.1.12 | 5432 | 2      | 0.333333  | S
 2       | 192.168.1.13 | 5432 | 3      | 0.333333  | S
(3 rows)

As you can see, node 2 (status 3 above) has been detached, whereas the
command should have aborted and displayed an error. I have also seen
cases where the detached node was node 0, which is worse.

I've attached a patch that makes the same command return the following:

/usr/bin/pcp_detach_node -d 10 192.168.1.11 9898 postgres postgres 3
DEBUG: send: tos="R", len=46
DEBUG: recv: tos="r", len=21, data=AuthenticationOK
DEBUG: send: tos="D", len=6
EOFError
DEBUG: send: tos="X", len=4
------------- log file ----------------
LOG: pcp_child: node id 3 is not valid
LOG: PCP child 32232 exits with status 256
LOG: fork a new PCP child pid 32299
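
Since the attachment is not shown inline, here is a rough sketch of the
kind of range check that would reject node id 3 before any backend is
touched. It is not the pgpool-II source and not the attached patch:
parse_node_id and num_backends are hypothetical names used only for
illustration, and the real code may validate the id differently.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not pgpool-II code): parse a node id given as a
 * string and accept it only if it lies in the range [0, num_backends).
 * Returns 0 and stores the id in *out on success; returns -1 and leaves
 * *out untouched otherwise.
 */
int parse_node_id(const char *arg, int num_backends, int *out)
{
    char *end;
    long id;

    if (arg == NULL || *arg == '\0')
        return -1;              /* nothing to parse */

    errno = 0;
    id = strtol(arg, &end, 10);
    if (errno != 0 || *end != '\0')
        return -1;              /* overflow or trailing garbage */

    if (id < 0 || id >= num_backends)
        return -1;              /* outside the configured nodes */

    *out = (int) id;
    return 0;
}
```

With three backends as in my setup, parse_node_id("3", 3, &id) returns
-1, so the caller can log an error and stop instead of detaching some
other node, while parse_node_id("2", 3, &id) returns 0 with id set to 2.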


Regards,

-- 
Gilles Darold
http://dalibo.com - http://dalibo.org

-------------- next part --------------
A non-text attachment was scrubbed...
Name: pgpool-II-detach-bug.diff
Type: text/x-patch
Size: 487 bytes
Desc: not available
URL: <http://pgfoundry.org/pipermail/pgpool-hackers/attachments/20110106/2c3f6e31/attachment.bin>

