[Pgpool-general] Several questions about pgpool

Mon Aug 14 08:51:09 UTC 2006

> > Odd. If the number of concurrent transactions becomes larger,
> > then the distribution of connections should become more uniform, no?
> 
> Hm, what do you mean by more uniform? I'm not worried about the distribution of connections, but
> about the number of connections. I'll give you an example. Imagine you have a server farm of,
> let's say, 50 servers that are "workers" or "nodes" of a cluster setup. Each node has an identical
> setup and each node needs to have, let's say, 3 permanent connections to the database for
> several daemons running on it. Again, each node uses the usernames "user1", "user2" and
> "user3" for its DB connections. In this case we will have 150 connections to pgpool. This is unavoidable,
> but we will also have 150 database connections, which is what I'm trying to avoid.
> Suppose that the application I'm running is not very database intensive and those connections are
> "idle" most of the time. For my particular application I'd estimate that at most 20 backend connections
> would be able to serve all the needs if client connections could be multiplexed onto DB connections.
> Do you think I'm talking about something marginal?

Why don't you ask the programmers to release database connections as
soon as the database processing has finished, then?
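
A minimal sketch of that pattern in Python, assuming a hypothetical
client-side pool. `TinyPool` and `FakeConnection` are made-up stand-ins,
not part of pgpool or of any PostgreSQL driver. Each unit of work borrows
a connection, runs its query, and hands the connection back immediately,
so the 150 mostly idle clients from the example can share 20 backend
connections. A borrowed connection is not visible to other clients until
it is released.

```python
import queue
import threading
from contextlib import contextmanager


class FakeConnection:
    """Stand-in for a real database connection (hypothetical)."""

    def __init__(self, ident):
        self.ident = ident
        self.queries = 0

    def execute(self, sql):
        self.queries += 1
        return f"conn {self.ident}: {sql}"


class TinyPool:
    """Fixed-size pool; a borrowed connection is invisible to other clients."""

    def __init__(self, size):
        self._idle = queue.Queue()
        for i in range(size):
            self._idle.put(FakeConnection(i))

    @contextmanager
    def connection(self):
        conn = self._idle.get()      # blocks while every connection is busy
        try:
            yield conn
        finally:
            self._idle.put(conn)     # release as soon as the work is done

    def idle_count(self):
        return self._idle.qsize()


def daemon_work(pool, name, results):
    # Hold the connection only for the duration of the query.
    with pool.connection() as conn:
        results.append(conn.execute(f"SELECT 1 /* {name} */"))


def run_demo(clients=150, backends=20):
    pool = TinyPool(backends)
    results = []
    threads = [
        threading.Thread(target=daemon_work, args=(pool, f"client{i}", results))
        for i in range(clients)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return pool, results
```

Calling `run_demo()` serves all 150 simulated clients while never having
more than 20 connections in existence; the pool size, not the client
count, bounds the load on the database.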

> > Your arguments do not sound very strong to me. It seems you are saying
> > that "the big names do it like this, so we should do it like them too".
> 
> This is not what I meant to say. What I meant to say was that I think they've done it for a reason.
> If you can suggest a solution better than the "connection manager" I described, I'd very much appreciate it.

Looks like a circular argument :-) I think you should first tell me
the reason why the big names do it that way.

> > That's exactly what I'm worrying about. I don't think the workaround is
> > trivial. Do you know how these big names handle this problem?
> 
> I don't know, but I can propose several workarounds off the top of my head. The first and simplest
> is to require clients that use these commands to reset them after each query, similar to the way you now require
> that calls to functions with side effects be issued in a manner that prevents them from being load-balanced.

My guess is they do exactly what we are doing now. Probably the only
difference is that they use a thread pool while we use a process pool.
I don't think they allow a not-yet-finished session to be stolen
(probably the thread would be marked as "busy").

> In other words, you can make it clear in the documentation and push the responsibility onto the users.
> Another way that comes to mind would be to use dedicated backend connections for such clients, i.e. to work
> with them the same way pgpool works now.
> And finally, you can watch which SET commands a client issues and attach them to the session properties, so no other
> clients (unless they issued the same SET commands) would match.
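
The last proposal above can be sketched in Python as follows. This is
only an illustration under stated assumptions: `session_key` and
`PropertyAwarePool` are hypothetical names, not pgpool code, and a
backend connection is represented by a plain dictionary. An idle backend
is reused only by a client whose user, database, and accumulated SET
settings match exactly.

```python
from collections import defaultdict


def session_key(user, database, settings):
    """Build a hashable key from the login and the SET commands seen so far."""
    return (user, database, frozenset(settings.items()))


class PropertyAwarePool:
    """Hypothetical pool that files idle backends under their session key."""

    def __init__(self):
        self._idle = defaultdict(list)
        self.created = 0

    def acquire(self, user, database, settings):
        key = session_key(user, database, settings)
        if self._idle[key]:
            return self._idle[key].pop()   # exact match: safe to reuse
        self.created += 1                  # no match: open a new backend
        return {"id": self.created, "key": key}

    def release(self, backend):
        # File the backend under the key it was created with.
        self._idle[backend["key"]].append(backend)
```

A client that issued different SET commands never matches an existing
idle backend and gets a fresh one, at the cost of holding more backend
connections when settings vary widely between clients.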

Again, I think you had better teach the programmers to release the
database connection earlier. I think pgpool is designed for a large
number of short sessions, something like web-based systems.

> > > You mentioned pgpool-II several times on the list. Could you please point me
> > > to where I can read about it and watch it progress?
> > > Thanks.
> > 
> > The progress of the project is not open at the moment but it will be
> > finished by the end of this August anyway. pgpool-II will be
> > distributed as open source software via pgfoundry in September. Also
> > the presentation material from the PostgreSQL Anniversary Summit is
> > available at:
> > 
> > http://www.sraoss.co.jp/event_seminar/2006/pgpool_feat_and_devel.pdf
> > 
> > If you want to obtain the source code before September, please let me
> > know.
> 
> Thanks for the info. I'll check that link. At the moment I'd just like to see where it's going.

You are welcome. Please feel free to ask questions regarding
pgpool-II.
--
Tatsuo Ishii
SRA OSS, Inc. Japan