[Pgpool-general] Parallel query order by / limit

Drew Robb drewrobb at gmail.com
Thu Jun 2 16:17:34 UTC 2011


Hello, I'm fairly new to pgpool (enjoying it so far) and I'm trying to
understand parallel query behavior. I'm using pgpool 3.0.4 and pqsql 9.0.

I have a parallel table USERS distributed via the primary key user_id.  When
I execute the following query

> select user_id, date_created from users order by date_created limit 10;
>

I get the following output from pgpool debug:

> statement2: SELECT pool_parallel("SELECT user_id, users.date_created FROM
> users")
>

My understanding is that the parallel engine is unable to translate the
order by and limit clause to into the query executed on the worker nodes?
This creates a big performance problem since all records in the table are
seemingly being processed.  I read the documentation, but was unable to
figure out why this is the case. It seems to me that the order by and limit
clauses could be applied to both the workers (returning upto N*10 records)
and then reapplied to filter these results down to just 10 records. Is
the behavior i'm seeing expected, does the optimization I had in mind make
sense?

Either way, is there a way that I could specify exactly the parallel query
to be executing on the workers to avoid this problem? I know that creating a
particular view in and registering it in the dist_def table might work, but
this isn't very dynamic.

Thanks,
Drew
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://pgfoundry.org/pipermail/pgpool-general/attachments/20110602/e31bf7f7/attachment.html>


More information about the Pgpool-general mailing list