[pgpool-general: 2008] is sharding possible with pgpool?
Torsten Förtsch
torsten.foertsch at gmx.net
Sat Aug 10 03:42:21 JST 2013
Hi,
I want to set up from a structural point of view identical backend
databases. Each backend then is to hold the data of a certain range of
customers. I was told pgpool can help here.
So, I started experimenting and by now I have the impression that pgpool
is the wrong tool. But maybe I am too stupid.
I set up 4 backends each with a "pgptest" table with 2 columns "id" and
"txt". The system database contains one record in
pgpool_catalog.dist_def. pgpool_catalog.replicate_def is empty.
pgpool=> select * from pgpool_catalog.dist_def ;
-[ RECORD 1 ]-+----------------------------
dbname | pgpool
schema_name | public
table_name | pgptest
col_name | id
col_list | {id,txt}
type_list | {int,text}
dist_def_func | pgpool_catalog.dist_pgptest
The pgpool_catalog.dist_pgptest function is defined as follows:
pgpool=> \df+ pgpool_catalog.dist_pgptest
List of functions
-[ RECORD 1 ]-------+---------------
Schema | pgpool_catalog
Name | dist_pgptest
Result data type | integer
Argument data types | i integer
Type | normal
Volatility | immutable
Owner | pgpool
Language | sql
Source code |
| select $1%4
Pgpool is version 3.1.3 and configured with
load_balance_mode = on
replication_mode = off
parallel_mode = on
Now, if I insert data into the table it works as expected:
insert into pgptest(id, txt) values( 125, 'test125');
The query is sent to only one backend and on the system db I see this
statement being executed with $1=125:
SELECT pgpool_catalog.dist_pgptest($1::int)
But the query
select * from pgptest where id=125
is sent to all backends, although pgpool could know that id=125 can be
only in backend 1. Also, a command to update the record with id=125 is
sent to all backends.
At first I thought that *all* read statements are sent to all backends
and only writes are distributed. But then I tried:
select substr(txt, 1, 1) letter, count(*) count
from pgptest group by letter
and was surprised to see that it was sent only to *one* backend. Though,
the equivalent
select substr(txt, 1, 1) letter, count(*) count
from pgptest group by substr(txt, 1, 1)
is sent to all backends.
Also
with xx as (select * from pgptest) select * from xx
is sent to only one backend.
Very confusing...
I have also tried turning replication_mode on and load_balance_mode off
and both of them on. That didn't help. A google search came up with
similar questions but I couldn't find any answer.
What do I wrong?
Torsten
More information about the pgpool-general
mailing list