[Pgpool-hackers] Resource Leak on PGPool-II stop or restart

Tue Jun 28 14:22:46 UTC 2011

Hi!

Looks like I forgot to attach the logfile ... sorry - so once again:

I have the following issue with PGPool-II.
I tried 3.0.4 and 3.1.0-alpha2, and both show the same behaviour.

It sometimes happens that after stopping PGPool some SHM-segments don't get freed.
To be precise, I'm talking about 5 SHM-segments and 1 semaphore (as reported by ipcs).

When I then restart pgpool-II, it fails to start with the log-output: "Address already in use".
This is only a follow-on error, but it is the most noticeable one.

The real failure seems to happen on shutdown, and - as further investigations have shown - probably only if a client is still connected to pgpool.
In that case the shm-resources are not released again.
It is possible to trace this with ipcs. The number of used shm-segments increases after starting pgpool and normally reverts to the old value after stopping it again. But if a client is still busy on this database, the number of used shm-segments doesn't decrease after a shutdown, and the next restart of pgpool fails.
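
To make the ipcs comparison repeatable, something along these lines could be used. This is only a rough sketch: it assumes the Linux (util-linux) ipcs layout, where every resource row starts with a hex key such as 0x00000000, and the function names are made up for illustration. Nothing here is part of pgpool.

```python
import subprocess


def count_ipc_rows(ipcs_output: str) -> int:
    """Count resource rows in ipcs output.

    Assumption: rows begin with a hex key ("0x..."), as printed by the
    util-linux ipcs; headers and blank lines are skipped.
    """
    count = 0
    for line in ipcs_output.splitlines():
        fields = line.split()
        if fields and fields[0].startswith("0x"):
            count += 1
    return count


def snapshot() -> dict:
    """Run ipcs for shared memory (-m) and semaphores (-s) and count the rows."""
    counts = {}
    for name, flag in (("shm", "-m"), ("sem", "-s")):
        result = subprocess.run(
            ["ipcs", flag], capture_output=True, text=True, check=True
        )
        counts[name] = count_ipc_rows(result.stdout)
    return counts


def leaked(before: dict, after: dict) -> dict:
    """Difference between two snapshots; positive values mean leftover resources."""
    return {name: after[name] - before[name] for name in before}
```

Take one snapshot() before starting pgpool and another after stopping it, then pass both to leaked(). With the behaviour described above, the difference stays at zero after a clean shutdown and is non-zero when a client was still connected.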

I have attached this part of the logfile, because it would be too long to have it inline here.

But an indication might be this line:
2011-06-28 13:14:36 DEBUG: pid 13370: child receives smart shutdown request but it's not in idle state
(for the rest please have a look in the attachment).

Well, 6 leaked resources per restart doesn't seem like much, but our system has to run in a high-availability environment, and it might be necessary for the cluster software to restart pgpool whenever a process fails.
It could also be necessary to restart pgpool because of a configuration change on the backends.

And during tests we already reached the point where, after 100 restarts, it was no longer possible to start pgpool because no resources were available.

Example of the startup log after 100 restarts ("could not create semaphores"):

2011-06-28 15:02:43 DEBUG: pid 6000: num_backends: 2 total_weight: 1.000000
2011-06-28 15:02:43 DEBUG: pid 6000: backend 0 weight: 2147483647.000000
2011-06-28 15:02:43 DEBUG: pid 6000: backend 1 weight: 0.000000
pid file found but it seems bogus. Trying to start pgpool anyway...
2011-06-28 15:02:43 ERROR: pid 6000: pool_init_pool_passwd: couldn't open /etc/pgpool-II-90/pool_passwd. reason: Permission denied
2011-06-28 15:02:43 ERROR: pid 6000: could not create 3 semaphores: No space left on device
2011-06-28 15:02:43 ERROR: pid 6000: Unable to create semaphores. Exiting...
2011-06-28 15:02:43 DEBUG: pid 6000: shmem_exit(1)
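
As far as I understand, "No space left on device" here is the ENOSPC that semget() returns when the system limit on semaphore sets (SEMMNI) or on the total number of semaphores (SEMMNS) would be exceeded. The sketch below estimates how many starts are still possible before that happens. It assumes Linux, where /proc/sys/kernel/sem holds SEMMSL, SEMMNS, SEMOPM and SEMMNI in that order, and it assumes that each leaking restart leaves exactly one set of 3 semaphores behind (the 3 is taken from the log line above, the "one set per restart" is my reading of the ipcs output). All names are hypothetical.

```python
from typing import NamedTuple


class SemLimits(NamedTuple):
    semmsl: int  # max semaphores per set
    semmns: int  # max semaphores system-wide
    semopm: int  # max operations per semop() call
    semmni: int  # max semaphore sets system-wide


def parse_sem_limits(text: str) -> SemLimits:
    """Parse the four whitespace-separated numbers from /proc/sys/kernel/sem."""
    semmsl, semmns, semopm, semmni = (int(value) for value in text.split())
    return SemLimits(semmsl, semmns, semopm, semmni)


def read_sem_limits(path: str = "/proc/sys/kernel/sem") -> SemLimits:
    """Read the current limits (Linux only)."""
    with open(path) as handle:
        return parse_sem_limits(handle.read())


def max_successful_starts(
    limits: SemLimits,
    sets_in_use: int,
    sems_in_use: int,
    sems_per_set: int = 3,  # assumption: taken from "could not create 3 semaphores"
) -> int:
    """Starts that still succeed if every start leaks one set of sems_per_set.

    sets_in_use / sems_in_use are the values seen while pgpool is stopped.
    Each start needs one free set slot and sems_per_set free semaphores.
    """
    by_sets = limits.semmni - sets_in_use
    by_sems = (limits.semmns - sems_in_use) // sems_per_set
    return max(0, min(by_sets, by_sems))
```

The sets_in_use and sems_in_use arguments have to be counted with ipcs -s while pgpool is stopped. The point is only that a small per-restart leak runs into a fixed limit after a bounded number of restarts.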

I have attached a logfile which contains a startup, a shutdown and the subsequent failing restart.
Maybe someone can identify the source of this problem.

And BTW: I cannot file this as a bug (or browse the buglist) at http://pgfoundry.org/tracker/?group_id=1000055
it says: Database Error: ERROR: could not open relation "artifact_message": No such file or directory

best regards,
Harald
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pgpool-shm_not_freed.log
Type: text/x-log
Size: 28046 bytes
Desc: not available
URL: <http://pgfoundry.org/pipermail/pgpool-hackers/attachments/20110628/1446e58a/attachment-0001.bin>