[pgpool-general: 2810] pgpool: stuck connections

Anna Aronova ankaar at gmail.com
Wed May 7 23:38:57 JST 2014


Hi,

We are testing pgpool-II 3.3.3 with PostgreSQL 9.3 before going to production.
pgpool-II is configured for connection pooling and failover of PostgreSQL
with streaming replication (master + 4 slaves).
In addition, we are using PostgreSQL for a multi-tenant application, hence we
expect connections from hundreds of users.

At some point, database connections started to be rejected.

Excerpt from pgpool.conf:

#------------------------------------------------------------------------------
# POOLS
#------------------------------------------------------------------------------

# - Pool size -

num_init_children = 8
                                   # Number of pools
                                   # (change requires restart)
max_pool = 240
                                   # Number of connections per pool
                                   # (change requires restart)

# - Life time -

child_life_time = 300
                                   # Pool exits after being idle for this many seconds
child_max_connections = 0
                                   # Pool exits after receiving that many connections
                                   # 0 means no exit
connection_life_time = 60
                                   # Connection to backend closes after being idle for this many seconds
                                   # 0 means no close
client_idle_limit = 60
                                   # Client is disconnected after being idle for that many seconds
                                   # (even inside an explicit transactions!)
                                   # 0 means no disconnection
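
For context, this is how I understand these two pool-size parameters to
interact (please correct me if I have it wrong):

num_init_children = 8   # at most 8 clients can be connected to pgpool at the
                        # same time; further clients wait in the listen queue
                        # or are refused
max_pool = 240          # each child caches up to 240 backend connections,
                        # keyed by (user, database) pair
                        # worst case per PostgreSQL node: 8 * 240 = 1920
                        # backend connections, which has to fit under
                        # max_connections on the backend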

The pgpool processes looked like this:

root      1987     1  0 Mar04 ?        00:00:00 abrt-dump-oops -d /var/spool/abrt -rwx /var/log/messages
postgres 10053     1  0 Apr29 ?        00:01:00 /usr/bin/pgpool -f /etc/pgpool-II/pgpool.conf -n
postgres 10060 10053  0 Apr29 ?        00:00:00 pgpool: watchdog
postgres 10061 10053  0 Apr29 ?        00:00:01 pgpool: heartbeat receiver
postgres 10062 10053  0 Apr29 ?        00:00:02 pgpool: heartbeat sender
postgres 10063 10053  0 Apr29 ?        00:00:26 pgpool: lifecheck
postgres 10072 10053  0 Apr29 ?        00:00:00 pgpool: PCP: wait for connection request
postgres 10073 10053  0 Apr29 ?        00:01:13 pgpool: worker process
postgres 11544 10053  0 08:14 ?        00:00:11 pgpool: t_921413 tomr_qa 15.22.23.6(60214) idle
postgres 11553 10053  0 08:15 ?        00:00:11 pgpool: t_921413 tomr_qa 15.22.23.6(60196) idle
postgres 11554 10053  0 08:15 ?        00:00:11 pgpool: t_921413 tomr_qa 15.22.23.6(60206) idle
postgres 11557 10053  0 08:15 ?        00:00:11 pgpool: t_921413 tomr_qa 15.22.23.6(60200) idle
postgres 11558 10053  0 08:15 ?        00:00:11 pgpool: t_921413 tomr_qa 15.22.23.7(49434) idle
postgres 11559 10053  0 08:15 ?        00:00:11 pgpool: t_921413 tomr_qa 15.22.23.6(60210) idle
postgres 11697 10053  0 08:23 ?        00:00:11 pgpool: t_921413 tomr_qa 15.22.23.7(49431) idle
postgres 12708 10053  0 09:26 ?        00:00:01 pgpool: t_921413 tomr_qa 15.22.23.7(49427) idle


*pcp_proc_info 10 pool-vip 9898 pguser XXXXXX 11544*

toma_qa tom_admin_lab 2014-04-30 08:15:00 2014-04-30 08:24:02 3 0 6 29783 0
toma_qa tom_admin_lab 2014-04-30 08:15:00 2014-04-30 08:24:02 3 0 6 26487 0
toma_qa tom_admin_lab 2014-04-30 08:15:00 2014-04-30 08:24:02 3 0 6 20197 0
tomr_qa tom_admin_exec 2014-04-30 08:15:00 2014-04-30 08:27:34 3 0 1767 8784 0
tomr_qa tom_admin_exec 2014-04-30 08:15:00 2014-04-30 08:27:34 3 0 1767 22090 0
tomr_qa tom_admin_exec 2014-04-30 08:15:00 2014-04-30 08:27:34 3 0 1767 29893 0
tomr_qa tom_admin_exec 2014-04-30 08:15:00 2014-04-30 08:27:34 3 0 1767 26598 0
tomr_qa tom_admin_exec 2014-04-30 08:15:00 2014-04-30 08:27:34 3 0 1767 20307 0
toma_qa t_103886 2014-04-30 08:15:00 2014-04-30 08:33:23 3 0 16 8964 0
toma_qa t_103886 2014-04-30 08:15:00 2014-04-30 08:33:23 3 0 16 22270 0
toma_qa t_103886 2014-04-30 08:15:00 2014-04-30 08:33:23 3 0 16 30072 0
toma_qa t_103886 2014-04-30 08:15:00 2014-04-30 08:33:23 3 0 16 26777 0
toma_qa t_103886 2014-04-30 08:15:00 2014-04-30 08:33:23 3 0 16 20489 0
tomr_qa tom_admin_lab 2014-04-30 08:15:00 2014-04-30 08:43:55 3 0 6 9295 0
tomr_qa tom_admin_lab 2014-04-30 08:15:00 2014-04-30 08:43:55 3 0 6 22600 0
tomr_qa tom_admin_lab 2014-04-30 08:15:00 2014-04-30 08:43:55 3 0 6 30402 0
tomr_qa tom_admin_lab 2014-04-30 08:15:00 2014-04-30 08:43:55 3 0 6 27106 0
tomr_qa tom_admin_lab 2014-04-30 08:15:00 2014-04-30 08:43:55 3 0 6 20818 0
*tomr_qa t_921413 2014-04-30 08:15:00 2014-04-30 08:56:37 3 0 340 9695 1*
*tomr_qa t_921413 2014-04-30 08:15:00 2014-04-30 08:56:37 3 0 340 23003 1*
*tomr_qa t_921413 2014-04-30 08:15:00 2014-04-30 08:56:37 3 0 340 30802 1*
*tomr_qa t_921413 2014-04-30 08:15:00 2014-04-30 08:56:37 3 0 340 27506 1*
*tomr_qa t_921413 2014-04-30 08:15:00 2014-04-30 08:56:37 3 0 340 21229 1*
*toma_qa t_921413 2014-04-30 08:15:00 2014-04-30 09:13:18 3 0 24 10227 0*
*toma_qa t_921413 2014-04-30 08:15:00 2014-04-30 09:13:18 3 0 24 23535 0*
*toma_qa t_921413 2014-04-30 08:15:00 2014-04-30 09:13:18 3 0 24 31335 0*
*toma_qa t_921413 2014-04-30 08:15:00 2014-04-30 09:13:18 3 0 24 28038 0*
*toma_qa t_921413 2014-04-30 08:15:00 2014-04-30 09:13:18 3 0 24 21760 0*
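
(For reference, as I read the pcp_proc_info documentation, the columns are:
database name, username, process start time, connection creation time,
protocol major version, protocol minor version, connection-reuse counter,
PostgreSQL backend PID, and a flag that is 1 if a frontend is still connected
and 0 if not.)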



In the database we saw several sessions from the same user t_921413 in
different states:



postgres 21218  5469  0 Apr30 ?        00:00:00 postgres: t_921413 tomr_qa 15.22.23.6(33660) UPDATE waiting
postgres 21223  5469  0 Apr30 ?        00:00:00 postgres: t_921413 tomr_qa 15.22.23.6(33685) UPDATE waiting
postgres 21224  5469  0 Apr30 ?        00:00:00 postgres: t_921413 tomr_qa 15.22.23.6(33690) UPDATE waiting
postgres 21226  5469  0 Apr30 ?        00:00:00 postgres: t_921413 tomr_qa 15.22.23.6(33695) UPDATE waiting
postgres 21229  5469  0 Apr30 ?        00:00:00 postgres: t_921413 tomr_qa 15.22.23.6(33705) UPDATE waiting
postgres 21249  5469  0 Apr30 ?        00:00:00 postgres: t_921413 tomr_qa 15.22.23.6(33755) UPDATE waiting
postgres 21250  5469  0 Apr30 ?        00:00:00 postgres: t_921413 tomr_qa 15.22.23.6(33760) UPDATE waiting
postgres 21758  5469  0 Apr30 ?        00:00:00 postgres: t_921413 toma_qa 15.22.23.6(34759) idle
postgres 21759  5469  0 Apr30 ?        00:00:00 postgres: t_921413 toma_qa 15.22.23.6(34760) idle
postgres 21760  5469  0 Apr30 ?        00:00:00 postgres: t_921413 toma_qa 15.22.23.6(34767) idle
postgres 21761  5469  0 Apr30 ?        00:00:00 postgres: t_921413 toma_qa 15.22.23.6(34770) idle
postgres 21767  5469  0 Apr30 ?        00:00:00 postgres: t_921413 toma_qa 15.22.23.6(34785) idle
postgres 21771  5469  0 Apr30 ?        00:00:00 postgres: t_921413 toma_qa 15.22.23.6(34800) idle
postgres 21772  5469  0 Apr30 ?        00:00:00 postgres: t_921413 toma_qa 15.22.23.6(34805) idle
postgres 22161  5469  0 Apr30 ?        00:00:00 postgres: t_921413 tomr_qa 15.22.23.6(35590) UPDATE
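
(For completeness: the ps lines above were taken on the backend database
server. The same sessions can be listed from pg_stat_activity with a query
roughly like the one below; in 9.3 the boolean waiting column is still there,
and the "UPDATE waiting" backends are the ones blocked on locks, i.e.
waiting = true.)

SELECT pid, usename, datname, state, waiting,
       now() - state_change AS in_state_for, query
  FROM pg_stat_activity
 WHERE usename = 't_921413'
 ORDER BY state, pid;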


These are the problems we encountered; we can't understand what is causing
them, whether there is a fix, or whether there is an error in our configuration:

1.  Application server 15.22.23.6 was taken down, but pgpool still showed the
frontend as connected (pcp_proc_info status flag 1).

2.  There were several idle cached connections in the database that were not
closed even after they had been idle for much longer than 60 seconds.

3.  Only after we killed all of user t_921413's database connections (on all
nodes) could other users continue working (a sketch of what I mean is below).
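
To be concrete about item 3, by "killed all connections" I mean terminating
that user's backends on each node, with something along the lines of:

SELECT pg_terminate_backend(pid)
  FROM pg_stat_activity
 WHERE usename = 't_921413'
   AND pid <> pg_backend_pid();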


Please advise!


Thank you in advance


Anna