[Pgpool-hackers] Transforming SQL to prepared statements inside pgpool-II
Anssi Kääriäinen
anssi.kaariainen at thl.fi
Wed May 18 07:01:13 UTC 2011
A quick update about transforming simple queries to extended protocol
queries inside pgpool-II.
I managed to do static transformation of a query (select 1 from foo join
bar on bar.id = foo.id where xyzzy = 2;) inside pgpool. Transforming the
query inside pgpool isn't that hard, there is just a few extended
protocol message types which needs to be handled. The performance is as
expected - you do get rid of planning overhead. Also, the data returned
by pgpool to the frontend is identical when passing the simple query and
when transforming to extended protocol query, so there should be no
problem in that part.
Now, there are two hard parts, as far as I can see.
The first one is parsing the query and matching it against the potential
prepare targets. For this, I would have to create a parse tree walker
that can handle any node possible in select statements. And that is
quite a list. This should be doable, but requires a lot of work.
The second problem is handling prepared statement state across
connections. For this feature to be useful (at least for my use case) I
would need to persist the prepared statements for each backend
connection in the pool. Typical use for me would be a web application
which makes a lot of short connections. So if the transformation to
prepared statements is done per frontend connection, there is not much
to gain. On the other hand, if prepare is done just once for backend
connection, then the problem is how to handle "deallocate all" and
"discard all". Trying to bind to non-existing prepared statement will
result in aborted transaction, so "bind and if fail prepare" is not
possible.
If the plans are kept for the duration of the backend connection, then a
frontend using prepared statements could have problems. At connection
start there are prepared statements from old connections still hanging
around. And as said, if not kept, there is not much gain from this
feature. Also, if the frontend manually deallocates all plans (possibly
inside a plpgsql function), then we are screwed. There is no cheap way
to check if a given prepared statement exists if I am not mistaken. A
potential simple solution for this is to document that "deallocate all"
and "discard all" should not be used when using this feature and using
other than unnamed prepared statements in frontend connections can lead
to name clashes.
I am not exactly sure why use of extended query protocol is wanted. The
other option is to use "prepare _pgpool_plan_x as (query);" and "execute
_pgpool_plan_x;" using simple query protocol. This would be a bit easier
to do and allow for more use cases. For example function arguments are
not allowed in extended query protocol, but "execute
fooplan(some_func())" is allowed.
Unfortunately, at the moment I do not have too much time to devote to
this. I will get back to this when I have time. This could be an useful
feature, and at least I can learn a lot while hacking.
- Anssi
More information about the Pgpool-hackers
mailing list