[Pgpool-general] Current problems in pgpool2.1b2+segfault patch

Nico -telmich- Schottelius nico-pgpool at schottelius.org
Wed Jun 4 10:14:29 UTC 2008


Hello Simone,

Simone Tregnago [Wed, Jun 04, 2008 at 09:31:12AM +0200]:
> Nico -telmich- Schottelius wrote:
>> Hello!
>>
>> Currently I had to disable pgpool, because of the following problems:
>>
>> - no real logging support
>
> uhm? what do you mean by 'real logging'? Isn't the log you've reported
> below enough?

Sorry, I should have explained this for those not familiar with 2.1b2:

- it requires -d (DEBUG) to get any log output. But I am not interested
  in debug output; I only want LOG-level messages
- it requires -n (no fork) to keep on logging after startup. Normally,
  in setups with real init systems and supervisors, I am a big fan of
  this. But on traditional Linux systems it is a mess to have to
  run a process in the foreground.

>> - too often disconnects: 
>>
>> 2008-05-30 09:45:19 LOG:   pid 15208: statement: begin; select data from php_sessions where session_id = 'cd4752abe216370eeec09e8dde9354f7' for update;
>> 2008-05-30 09:45:19 LOG:   pid 15252: statement: begin; select data from php_sessions where session_id = '87df8fc96924eba9999f2aceb3dfbd5b' for update;
>> 2008-05-30 09:45:19 LOG:   pid 14567: statement:  RESET ALL
>> 2008-05-30 09:45:19 LOG:   pid 14567: statement:  SET SESSION AUTHORIZATION DEFAULT
>> 2008-05-30 09:45:19 ERROR: pid 15252: pool_process_query: 1 th kind C does not match with master connection kind D
>> 2008-05-30 09:45:19 LOG:   pid 15252: notice_backend_error: 1 fail over request from pid 15252
>> 2008-05-30 09:45:19 LOG:   pid 5045: starting degeneration. shutdown host 62.65.130.180(6543)
>>
>> There are just too many situations, when a node is disconnected
>
> disconnects rightly occur when there is a data mismatch between backends.
> If you have a lot of data-mismatch disconnects, you probably have big
> troubles with your databases. Fully resync them and try again; if it
> still happens then there's something wrong (e.g. another app connects
> directly to one db and changes data)

I rsync'd it just 4 hours before, including the final PITR.

> Reading your log, it seems that you're using "select ... for update"
> statements.
> Well, from pgpool doc:
>
> '''
> condition for load balance
>
> For the query to be load balanced, all the requirements below must be met:
> ...
>     * it's not SELECT FOR UPDATE
> ...
> '''
>
> So, pay attention to not use load balancing if you need select for  
> update clauses.

That's the problem: those limitations break existing (read:
proprietary, closed-source) applications.

Perhaps this is even a design problem of pgpool, as it would have to be
intelligent to find out which queries do modify data and which don't.
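
To illustrate why this is hard, here is a minimal Python sketch of a purely
textual check. It is not pgpool's implementation; the function name is
hypothetical and the only rule taken from the documentation quoted above is
"it's not SELECT FOR UPDATE". Everything else is an assumption.

```python
import re

def is_load_balance_candidate(sql: str) -> bool:
    """Naive, text-only guess whether a query string is read-only.

    Hypothetical helper for illustration; splitting on ';' ignores
    semicolons inside string literals.
    """
    statements = [s.strip() for s in sql.split(";") if s.strip()]
    if len(statements) != 1:
        # e.g. "begin; select ... for update;" as seen in the log
        return False
    stmt = statements[0].lower()
    if not stmt.startswith("select"):
        return False
    # Rule from the quoted doc: SELECT FOR UPDATE is not load balanced.
    return re.search(r"\bfor\s+update\b", stmt) is None
```

Such a check accepts a query like "SELECT nextval('s')" even though it
modifies state on the backend, which is the kind of case that needs real
intelligence about the query.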


>> - problems on syncing, even with PITR: As the applications do not release the connection
>>   from pgpool, the pgp_recovery_node only works if I restart pgpool and insert
>>   an iptables -j REJECT to the right port before the whole sync
>>    -> more downtime than just for PITR
>
> sorry, I haven't used PITR

The idea is nice, what I am somehow missing is a clean disconnect
process like:

- begin PITR
- do not allow new connections
- begin to close idle connections
- wait $hard_timeout for the running queries to terminate
- kill all still running queries
- sync database
- accept connections again
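
The sequence above could be sketched roughly like this in Python. It is only
a simulation under my own assumptions: the classes, the function name and the
"tick" clock are hypothetical and not part of pgpool.

```python
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class Connection:
    name: str
    remaining: float = 0.0  # simulated seconds of query time left; 0 means idle
    open: bool = True

@dataclass
class Pooler:
    connections: List[Connection]
    accepting: bool = True
    events: List[str] = field(default_factory=list)

def recover_with_drain(pooler: Pooler, hard_timeout: float,
                       sync: Callable[[], None], tick: float = 1.0) -> List[str]:
    pooler.events.append("begin PITR")
    pooler.accepting = False                      # do not allow new connections
    pooler.events.append("refuse new connections")
    waited = 0.0
    while True:
        for c in pooler.connections:              # close connections that are idle
            if c.open and c.remaining <= 0:
                c.open = False
                pooler.events.append(f"closed idle {c.name}")
        busy = [c for c in pooler.connections if c.open]
        if not busy or waited >= hard_timeout:    # drained, or $hard_timeout reached
            break
        for c in busy:                            # simulated time passing
            c.remaining -= tick
        waited += tick
    for c in pooler.connections:                  # kill all still running queries
        if c.open:
            c.open = False
            pooler.events.append(f"killed {c.name}")
    sync()                                        # sync database
    pooler.events.append("sync database")
    pooler.accepting = True                       # accept connections again
    pooler.events.append("accept connections")
    return pooler.events
```

The point of the sketch is the ordering: new connections are refused first,
so the set of open connections can only shrink while waiting.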

>> - connection problems when one backend is down / the available connections
>>   at the backends are full: no answer from pgpool
>
> Sure, if you've filled the number of connections you can't connect  
> anymore. It's a config problem.

So if it is, can you tell me the correct values for num_init_children
and max_pool for the following situation:

- 2 postgres backends, both accepting up to 400 connections
- 3 webservers (one connect per site access, almost always the same
  connection parameters) and one streaming server (one permanent connection)

So I would set it to:

num_init_children = 800
max_pool = 1

The website says (http://pgpool.projects.postgresql.org/):
"If you want to ensure that queries can be cancelled, set this value to
twice the expected connections"

This would make

num_init_children = 1600
max_pool = 1

Now comes the first problem: What happens, if the three webservers open
801 connections?

The next problem: What happens if one database server is disconnected
(which is quite often the case here)?

Then 800 or 1600 is 400 or 800 too many (depending on whether you count
it twice or not, see above).
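
Spelled out in Python, the arithmetic above looks like this. It only
reproduces my estimate; the function name is hypothetical, and whether pgpool
counts the limit per backend or summed over all backends is an assumption
here, not something taken from the documentation.

```python
def pool_budget(backends_up: int, max_conn_per_backend: int,
                doubled: bool = False) -> int:
    """Estimate from this mail: connections summed over all reachable
    backends, optionally doubled as the website suggests for cancels."""
    total = backends_up * max_conn_per_backend
    return total * 2 if doubled else total

configured = pool_budget(2, 400)                  # both backends up
configured_x2 = pool_budget(2, 400, doubled=True)
degraded = pool_budget(1, 400)                    # one backend disconnected
degraded_x2 = pool_budget(1, 400, doubled=True)

# How many configured children exceed what the remaining backend offers
print(configured - degraded, configured_x2 - degraded_x2)
```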


>> - no / wrong reuse of existing connections: pgpool opens up to the maximum number
>>   of connections persistent to the backends, but does NOT reuse them until all
>>   are opened (afaics) -> leads to strange behaviour if the last made connections
>>   (see above)
>
> read above

Why is the lack of reuse related to a configuration error?

If the following happens:

 webserver1 [user=test,db=test] -> pgpool
 pgpool_child_1 -> backend (persistent)
 webserver1 closes the connection after some seconds.

And now the following happens:
 webserver1 [user=test,db=test] -> pgpool
 pgpool_child_2 -> backend (persistent)
 webserver1 closes the connection after some seconds.

Why does pgpool not reuse the first child with the first connection?
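
What I would expect is something like the following Python sketch: idle
backend connections are kept in one place, keyed by (user, db), and handed out
again before a new one is opened. This describes the behaviour I expected, not
how pgpool works internally; the class and method names are hypothetical.

```python
from collections import defaultdict

class SharedPool:
    """Toy pool that reuses idle backend connections per (user, db)."""

    def __init__(self):
        self._idle = defaultdict(list)  # (user, db) -> idle connections
        self.opened = 0                 # backend connections opened so far

    def acquire(self, user: str, db: str) -> str:
        idle = self._idle[(user, db)]
        if idle:
            return idle.pop()           # reuse before opening a new one
        self.opened += 1
        return f"backend-conn-{self.opened}"

    def release(self, user: str, db: str, conn: str) -> None:
        # Keep the backend connection open for the next matching client
        self._idle[(user, db)].append(conn)
```

With this, the two webserver1 sessions above would share one persistent
backend connection instead of occupying two.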

>> If I see improvements in pgpool I'll give it a try again, but the pgpool-II 2.1b2
>> version is not usable yet.
>>
>
> pgpool 2.1b2, as the name says: it's a beta.

That's clear and that's why I give feedback to the devs to incorporate
into future work.

> So it's intended for  
> testing purposes only. If you need a stable version you could try a  
> previous one.

There was a reason why I couldn't use it, but I'm not sure which feature
prevented me from using it.

I think maybe I chose the wrong tool for the right job, as I wanted
something that neither needs query rewriting nor allows data
inconsistency. I know that this is a non-trivial requirement, especially
when trying to do load balancing.

I also had a look at Slony-I, but as it replicates asynchronously it is
not an option for me.

Sincerely

Nico

-- 
Think about Free and Open Source Software (FOSS).
http://nico.schottelius.org/documentations/foss/the-term-foss/

PGP: BFE4 C736 ABE5 406F 8F42  F7CF B8BE F92A 9885 188C