[pgpool-general: 309] Re: Trouble with connection during long/large data queries

Joe Barnett joe.barnett at mr72.com
Tue Mar 27 14:08:39 JST 2012


Thank you for the responses.

We are running pgpool-II 3.1.2 against PostgreSQL 9.1.2.  While we 
employ PostgreSQL's streaming replication, we currently point pgpool 
only to the master server (running a pgpool daemon local to each client 
machine).  In test, however, we point to both the master and the (hot) 
standby server.  The pgpool behavior we are seeing is identical on our 
test and production systems.
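
For reference, the backend portion of our pgpool.conf looks roughly 
like this (hostnames are illustrative; production has only the node 0 
entry, while test adds the standby as node 1):

    backend_hostname0 = 'db-master.example.com'    # master (node 0)
    backend_port0 = 5432
    backend_weight0 = 1
    backend_hostname1 = 'db-standby.example.com'   # hot standby (test only)
    backend_port1 = 5432
    backend_weight1 = 1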

I followed the suggestion of disabling SSL at the pgpool level ("ssl = 
off" in pgpool.conf; previously I had only set PGSSLMODE=disable in the 
shell before running my Ruby script).  This produced some interesting 
results (which I will try to reproduce): the script ran without error, 
and no errors were logged in either the postgres log or the pgpool log, 
but the resulting output contained no data.  The output, in this case, 
consists of shapefiles (ESRI-formatted geographic data), each of which 
comprises several pieces, including a .dbf file.  The .dbf files were 
empty, though everything else looked as expected.  As the logs give no 
clue, I cannot really identify what happened.  Again, I will try to 
reproduce this behavior.
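
For clarity, the two ways of disabling SSL mentioned above are simply:

    # pgpool.conf: disable SSL support at the pgpool level
    ssl = off

    # shell, before running the script: client-side libpq setting only
    export PGSSLMODE=disable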

In a more normal setting (SSL enabled both from the client to pgpool 
and from pgpool to the postgres server), the queries that complete 
before the connection goes away return the expected amount of data.  
When run through pgpool, we get about 1/3 of the way through the job 
before the connection goes away; when we access the database directly, 
we get all of the expected data.  As I think about this, I might be 
able to modify the script to request a new connection for each 
iteration (sketched below), and that might prove to be another 
workaround--though even if it works, it will only buy us some time 
rather than fix the root problem.
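
Roughly, that change would replace the single long-lived connection 
with something like the following (a sketch only, assuming the pg gem; 
the connection details and the write_shapefile step are placeholders):

    require 'pg'

    # illustrative: one mission phase per iteration
    PHASES = ['COMMISSIONING', 'NOMINAL MISSION']
    PHASES.each do |phase|
      # open a fresh connection through pgpool (default port 9999) for
      # each iteration, so no single connection lives long enough to drop
      conn = PG.connect(host: 'localhost', port: 9999, dbname: 'shapes')
      begin
        res = conn.exec_params(
          "SELECT * FROM temp_shapes_daily WHERE mission_phase_name = $1",
          [phase])
        res.each { |row| write_shapefile(row) }  # hypothetical output step
      ensure
        conn.close  # tear the connection down before the next iteration
      end
    end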

More on the issue of (nearly) empty result sets if I can reproduce 
them.  Otherwise, I hope this info helps shed some light on the problem.

Thanks,

Joe

On 2012-03-26 18:03, Matt Wise wrote:
> I believe we may be seeing this problem as well, but with much
> shorter query times. We have a few jobs that run database transactions
> for up to 15 minutes or so. They will do 500-1000 individual database
> queries that range from 1-30 seconds in length. We continue to see
> failures when running through PGPool and end up having to run these
> through our legacy PGBouncer service directly on the database hosts.
>
> We DO use SSL between PGPool and the backend PostgreSQL databases. We
> also use SSL between PGPool and the clients (each of which runs a
> local copy of PGBouncer), but we do that with Stunnel so it's
> transparent to the applications.
>
> —Matt
>
>
>
> On Mar 26, 2012, at 5:07 PM, Tatsuo Ishii wrote:
>
>> What pgpool version are you using?
>>
>> Also if you disable SSL between pgpool and PostgreSQL, do you see any
>> difference?
>> --
>> Tatsuo Ishii
>> SRA OSS, Inc. Japan
>> English: http://www.sraoss.co.jp/index_en.php
>> Japanese: http://www.sraoss.co.jp
>>
>>> We are using pgpool in front of a PostgreSQL 9.1 database and,
>>> generally, we are experiencing tremendous improvement in performance
>>> (and even stability) over accessing the database directly.  We are
>>> seeing one behavior, however, which is troubling: long queries that
>>> return large result sets seem to kill (or are killed by) the pgpool
>>> process handling the connection.  By long/large, I mean connections
>>> lasting ninety minutes or more (often comprising just a handful of
>>> long-running queries) and returning >1GB of data.  Specifically, this
>>> is happening when querying (and returning) large, global sets of
>>> geographic data (which are not conducive to being run as smaller,
>>> independent queries), though we can reproduce this problem with any
>>> query taking sufficient time and returning sufficient data.
>>>
>>> The workaround is to run these queries against the database directly,
>>> bypassing pgpool.  We would, however, like to require nearly all
>>> access to the database to go through pgpool, restricting direct
>>> access to things of an administrative nature.
>>>
>>> Ideas, advice, and criticism welcome, thanks,
>>>
>>> Joe Barnett
>>>
>>> --- Errors and logs follow ---
>>>
>>> ### Error returned when connection goes away (from Ruby script):
>>>
>>> (with SSL enabled)
>>> ./bin/shapes.rb:150:in `exec': SSL SYSCALL error: EOF detected (PG::Error)
>>>
>>> (with SSL disabled)
>>> ./bin/shapes.rb:150:in `exec': PG::Error
>>>
>>> ### Excerpts from pgpool.log:
>>>
>>> 2012-03-26 13:16:35 LOG: pid 14860: statement: SELECT max(octet_length(hist_file_path::text)) from temp_shapes_daily
>>> 2012-03-26 13:16:35 LOG: pid 14860: DB node id: 0 backend pid: 16005 statement: SELECT max(octet_length(hist_file_path::text)) from temp_shapes_daily
>>> 2012-03-26 13:16:36 LOG: pid 14860: statement: SELECT max(octet_length(hist_file_name::text)) from temp_shapes_daily
>>> 2012-03-26 13:16:36 LOG: pid 14860: DB node id: 0 backend pid: 16005 statement: SELECT max(octet_length(hist_file_name::text)) from temp_shapes_daily
>>> 2012-03-26 13:16:37 LOG: pid 14860: statement: SELECT *
>>>                        FROM temp_shapes_daily
>>>                        WHERE mission_phase_name = 'COMMISSIONING'
>>>                        AND original_product_id LIKE 'nac%'
>>>
>>> 2012-03-26 13:16:37 LOG: pid 14860: DB node id: 0 backend pid: 16005 statement: SELECT *
>>>                        FROM temp_shapes_daily
>>>                        WHERE mission_phase_name = 'COMMISSIONING'
>>>                        AND original_product_id LIKE 'nac%'
>>>
>>> 2012-03-26 13:17:28 LOG: pid 14860: statement: SELECT *
>>>                        FROM temp_shapes_daily
>>>                        WHERE mission_phase_name = 'COMMISSIONING'
>>>                        AND original_product_id LIKE 'wac%'
>>>
>>> 2012-03-26 13:17:28 LOG: pid 14860: DB node id: 0 backend pid: 16005 statement: SELECT *
>>>                        FROM temp_shapes_daily
>>>                        WHERE mission_phase_name = 'COMMISSIONING'
>>>                        AND original_product_id LIKE 'wac%'
>>>
>>> 2012-03-26 13:17:33 LOG: pid 14846: connection received:
>>> host=localhost port=39636
>>> 2012-03-26 13:18:08 LOG: pid 14860: statement: SELECT *
>>>                        FROM temp_shapes_daily
>>>                        WHERE mission_phase_name = 'NOMINAL MISSION'
>>>                        AND original_product_id LIKE 'nac%'
>>>
>>> 2012-03-26 13:18:08 LOG: pid 14860: DB node id: 0 backend pid: 16005 statement: SELECT *
>>>                        FROM temp_shapes_daily
>>>                        WHERE mission_phase_name = 'NOMINAL MISSION'
>>>                        AND original_product_id LIKE 'nac%'
>>>
>>> 2012-03-26 13:22:38 LOG: pid 14860: statement: SELECT *
>>>                        FROM temp_shapes_daily
>>>                        WHERE mission_phase_name = 'NOMINAL MISSION'
>>>                        AND original_product_id LIKE 'wac%'
>>>
>>> 2012-03-26 13:22:38 LOG: pid 14860: DB node id: 0 backend pid: 16005 statement: SELECT *
>>>                        FROM temp_shapes_daily
>>>                        WHERE mission_phase_name = 'NOMINAL MISSION'
>>>                        AND original_product_id LIKE 'wac%'
>>>
>>> 2012-03-26 13:22:43 ERROR: pid 14860: pool_read2: EOF encountered with backend
>>> 2012-03-26 13:22:43 LOG: pid 14860: do_child: exits with status 1 due to error
>>>
>>> (I have pgpool logs with DEBUG info as well, but they are very large
>>> (and I am not certain how much useful information they contain).  I
>>> can certainly provide excerpts from those logs if anyone thinks they
>>> are useful--perhaps suggest things for which I might look.)
>>>
>>> ### Excerpts from postgresql log:
>>>
>>> LOG:  could not send data to client: Connection reset by peer
>>> STATEMENT:                      SELECT *
>>>                                FROM temp_shapes_daily
>>>                                WHERE mission_phase_name = 'NOMINAL MISSION'
>>>                                AND original_product_id LIKE 'wac%'
>>>
>>> LOG:  could not send data to client: Broken pipe
>>> STATEMENT:                      SELECT *
>>>                                FROM temp_shapes_daily
>>>                                WHERE mission_phase_name = 'NOMINAL MISSION'
>>>                                AND original_product_id LIKE 'wac%'
>>>
>>> LOG:  SSL error: bad write retry
>>> LOG:  could not send data to client: Connection reset by peer
>>>

-- 
Joe Barnett
623.670.1326

