[pgpool-general: 87] Re: num_int_children and max_pool

Mario Splivalo mario.splivalo at megafon.hr
Mon Dec 12 16:56:36 JST 2011


On 12/10/2011 04:04 PM, Tatsuo Ishii wrote:
> If you see occasional errors, I think you are getting close to the
> performance limit of the system. According to queueing theory, as the
> request rate (your application) approaches the service rate (pgpool
> and PostgreSQL), waiting time grows very quickly and reaches the
> connect timeout limit (the timeout value varies by platform; on Linux
> it is about 200 seconds). So I guess you need to make the application
> consume less execution time.
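
As a rough illustration of the queueing argument quoted above, here is a
minimal sketch that assumes the simplest textbook model (a single-server
M/M/1 queue). That model is an assumption made only for illustration;
pgpool with many child processes behaves more like a multi-server queue,
and the service rate used below is a made-up number, not a measurement
from this system. It only shows how the mean time in the system,
1 / (service_rate - arrival_rate), grows as the request rate approaches
the service rate.

```python
def mm1_mean_time_in_system(arrival_rate: float, service_rate: float) -> float:
    """Mean time (waiting + service) per request in an M/M/1 queue.

    Both rates are in requests per second. If requests arrive at least
    as fast as they can be served, the queue grows without bound.
    """
    if arrival_rate >= service_rate:
        return float("inf")
    return 1.0 / (service_rate - arrival_rate)


if __name__ == "__main__":
    service_rate = 100.0  # hypothetical capacity, requests/second
    for utilisation in (0.5, 0.9, 0.99, 0.999):
        arrival_rate = utilisation * service_rate
        t = mm1_mean_time_in_system(arrival_rate, service_rate)
        print(f"utilisation {utilisation:.3f}: mean time in system {t:.3f} s")
```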

I will investigate this some more, but I don't think that's the case
here. Before pgpool was introduced, this server was capable of handling
about 500 simultaneous connections, and had issues with poorly written
SELECTs that would sometimes use all available CPU on the box (16
cores). Then postgres was upgraded to 9.1, the SELECTs were improved,
and a slave server was added (to offload SELECTs). Pgpool is set up in
load-balancing mode with all SELECTs sent to the slave server (there
are still SELECTs on the master server, but those are part of
data-modifying transactions, so pgpool keeps them on the master only,
as desired). Now the master server has around 75% of its CPU idle, with
both pgpool and postgres running on the box and plenty of resources
left.

Also, I'm noticing that pgpool kicks the slave server out of the pool.
This is the error I'm getting in the pgpool log:

2011-12-12 01:37:22 ERROR: pid 22369: connect_inet_domain_socket: connect() failed: Connection timed out
2011-12-12 01:37:22 ERROR: pid 22369: connection to 10.21.32.82(5432) failed
2011-12-12 01:37:22 ERROR: pid 22369: new_connection: create_cp() failed
2011-12-12 01:37:22 LOG:   pid 22369: degenerate_backend_set: 1 fail over request from pid 22369
2011-12-12 01:37:22 LOG:   pid 22160: starting degeneration. shutdown host 10.21.32.82(5432)
2011-12-12 01:37:22 LOG:   pid 22160: Restart all children

These are the errors on the slave server during 'connection timed out':

2011-12-12 01:37:22 CST [991]: [1-1] LOG:  unexpected EOF on client connection
2011-12-12 01:37:22 CST [789]: [1-1] LOG:  unexpected EOF on client connection
2011-12-12 01:37:22 CST [755]: [1-1] LOG:  unexpected EOF on client connection
2011-12-12 01:37:23 CST [947]: [1-1] LOG:  could not send data to client: Broken pipe

Then I put the slave back into the pool, and a few minutes later:

2011-12-12 01:45:20 ERROR: pid 32756: connect_inet_domain_socket: connect() failed: Connection timed out
2011-12-12 01:45:20 ERROR: pid 32756: connection to 10.21.32.82(5432) failed
2011-12-12 01:45:20 ERROR: pid 32756: new_connection: create_cp() failed
2011-12-12 01:45:20 LOG:   pid 32756: degenerate_backend_set: 1 fail over request from pid 32756
2011-12-12 01:45:20 LOG:   pid 22160: starting degeneration. shutdown host 10.21.32.82(5432)
2011-12-12 01:45:20 LOG:   pid 22160: Restart all children
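
To see how often this happens over a longer period, the pgpool log
excerpts above can be summarised with a small script. This is only a
sketch: the regular expressions are inferred from the handful of lines
shown in this message, not from any pgpool log format specification, and
the function names are hypothetical. The sample lines are copied from
the excerpts above with the mail client's line wrapping undone.

```python
import re
from collections import Counter

# Pattern inferred from the lines in this message: timestamp, level, pid, message.
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+):\s+pid (?P<pid>\d+): (?P<msg>.*)$"
)
# Matches messages such as "connection to 10.21.32.82(5432) failed".
FAILED_RE = re.compile(r"connection to (\S+)\((\d+)\) failed")

SAMPLE = [
    "2011-12-12 01:37:22 ERROR: pid 22369: connect_inet_domain_socket: connect() failed: Connection timed out",
    "2011-12-12 01:37:22 ERROR: pid 22369: connection to 10.21.32.82(5432) failed",
    "2011-12-12 01:37:22 LOG:   pid 22160: starting degeneration. shutdown host 10.21.32.82(5432)",
    "2011-12-12 01:45:20 ERROR: pid 32756: connection to 10.21.32.82(5432) failed",
]


def parse_line(line):
    """Return a dict with ts, level, pid and msg, or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    return m.groupdict() if m else None


def failed_backends(lines):
    """Count 'connection to HOST(PORT) failed' messages per (host, port)."""
    counts = Counter()
    for line in lines:
        rec = parse_line(line)
        if rec is None:
            continue
        m = FAILED_RE.search(rec["msg"])
        if m:
            counts[(m.group(1), int(m.group(2)))] += 1
    return counts


if __name__ == "__main__":
    for (host, port), n in failed_backends(SAMPLE).items():
        print(f"{host}:{port} failed {n} time(s)")
```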

And the same errors on the slave:

2011-12-12 01:45:21 CST [3876]: [1-1] LOG:  unexpected EOF on client connection
2011-12-12 01:45:21 CST [3901]: [1-1] LOG:  unexpected EOF on client connection

This time there is no 'broken pipe', I assume because the slave was not
sending any data to pgpool at that moment.

I can't figure out why this is happening. The slave server is not loaded
at all, and streaming replication is working all the time without issues.
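
One way to narrow this down might be to time plain TCP connects to the
slave from the pgpool host, independently of pgpool. The sketch below
does only that: it opens a TCP connection with an explicit timeout and
reports how long it took. It does not speak the PostgreSQL protocol, the
function name and the 5 second default are arbitrary choices, and it is
not how pgpool itself connects to the backend.

```python
import socket
import time


def probe_connect(host, port, timeout=5.0):
    """Try one TCP connect; return (ok, elapsed_seconds, error_text_or_None)."""
    start = time.monotonic()
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start, None
    except OSError as exc:
        return False, time.monotonic() - start, str(exc)


# Example use from the pgpool host (address and port taken from the logs above):
#
#     while True:
#         ok, elapsed, err = probe_connect("10.21.32.82", 5432)
#         print(time.strftime("%H:%M:%S"), ok, f"{elapsed:.3f}s", err)
#         time.sleep(1)
```

If the probe shows slow or failing connects at the same moments pgpool
logs 'Connection timed out', the problem is probably below pgpool
(network path or the slave's ability to accept connections); if the
probe is always fast, pgpool's own behaviour is the more likely place
to look.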

	Mario

> 
>> Btw, thank you very much for the clarification on the internal workings
>> of the pgpool.
> 
> You are welcome!
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> English: http://www.sraoss.co.jp/index_en.php
> Japanese: http://www.sraoss.co.jp


