[pgpool-general: 1948] tight lockup of main pgpool process

Sean Hogan sean at compusult.net
Sat Jul 27 03:06:18 JST 2013

Hi Tatsuo,

I have been experimenting with a pgpool-II setup spanning three machines: 
two of them run one pgpool-II (port 5430) and one PostgreSQL each, and 
the third runs just PostgreSQL.  This is git master code (3.3.1-RC1), 
plus the small patch for continuing correctly after a SQL parse error.

I have been having a devil of a time getting this configuration to 
function properly, in that queries return inconsistent numbers of 
results among the backends.  That is a serious problem, but not my main 
concern right now.

If I manually take down the backend that disagrees with the other two (I 
had thought pgpool did this automatically, but that doesn't seem to 
happen), then pgpool on one of the nodes gets into a bad state:

2013-07-26 14:54:07 ERROR: pid 27863: connect_inet_domain_socket: getsockopt() detected error: Connection refused
2013-07-26 14:54:07 ERROR: pid 27863: connection to psql-vm2.compusult.net(5432) failed
2013-07-26 14:54:07 ERROR: pid 27863: new_connection: create_cp() failed
2013-07-26 14:54:56 LOG:   pid 27735: wd_create_send_socket: connect() reports failure (Cannot assign requested address). You can safely ignore this while starting up.
2013-07-26 14:54:56 LOG:   pid 27735: send_packet_4_nodes: packet for psql-vm2.compusult.net:9000 is canceled

after which the last two messages repeat indefinitely.  Process 27735 
(the main pgpool process) does not respond to an ordinary kill; kill -9 
is required to stop it.  Of course, if I do that, all of its children 
then have to be killed individually, which is tremendously tedious.  
This command:
     psql -U postgres -p 5430 -c "show pool_nodes"
also locks up and has to be killed with Control-\.
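
As a stopgap for the tedious cleanup, here is a minimal sketch of one way 
to stop the stuck parent and its children in one step.  It assumes pgrep 
(from procps) is available; kill_pgpool_tree is a hypothetical helper 
name, not something pgpool-II provides, and whether the children exit on 
a plain SIGTERM is also an assumption.

```shell
# Hypothetical helper: stop a wedged parent process and its direct children.
# Usage: kill_pgpool_tree <parent_pid>
kill_pgpool_tree() {
    parent_pid="$1"
    [ -n "$parent_pid" ] || return 1

    # Record the children while they still list the parent as their PPID;
    # once the parent dies they are re-parented and harder to find.
    child_pids=$(pgrep -P "$parent_pid")

    # The parent ignores ordinary signals in this state, so force it.
    kill -9 "$parent_pid" 2>/dev/null

    # Ask each child to exit with SIGTERM (assumption: this is enough;
    # escalate to -9 by hand if any of them linger).
    for pid in $child_pids; do
        kill "$pid" 2>/dev/null
        echo "$pid"
    done
}
```

With the PID from the log above this would be invoked as 
"kill_pgpool_tree 27735"; it prints the PID of each child it signalled.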

This happens to be the standby pgpool instance, but I believe I have 
seen it happen with the active one as well.

Any ideas what might be happening here?

