[Pgpool-general] Parallel Query Mode - SELECT, Phppgadmin, Psql]

calimlimvl at nationalbookstore.com.ph calimlimvl at nationalbookstore.com.ph
Tue Jul 15 10:02:54 UTC 2008


Hello,

> Hi,
>
> From: calimlimvl at nationalbookstore.com.ph
> Subject: Re: [Pgpool-general] Parallel Query Mode - SELECT, Phppgadmin,
> Psql
> Date: Tue, 15 Jul 2008 13:06:05 +0800 (PHT)
>
>> It's now working in pgpool-II 2.1 RC1. No more problems. And yes I use
>> PHP
>> PDO and AdoDB. My previous post has more details. So what's the
>> replicate_def table and replicate_select for? Thanks again. :)
>
> replicate_def is to replicate data, not to distribute.
>
> ex) partitioning data
>   node1: dist1 -> (1, 2, 3)  dist2 -> (4, 5, 6)
>   node2: dist1 -> (4, 5, 6)  dist2 -> (7, 8, 9)
>   node3: dist1 -> (7, 8, 9)  dist2 -> (10, 11, 12)
>
> When pgpool executes the follwing query, the query is not parallel
> execution. Because we need to join dist1 table on node2 and dist2
> table on node1.
>
>   SELECT * FROM dist1 INNER JOIN dist2 on dist1.a = dist2.a;
>
>
> ex) partitioning data
>   node1: dist1 -> (1, 2, 3)  rep1 -> (4, 5, 6, 7, 8, 9, 10, 11, 12)
>   node2: dist1 -> (4, 5, 6)  rep1 -> (4, 5, 6, 7, 8, 9, 10, 11, 12)
>   node3: dist1 -> (7, 8, 9)  rep1 -> (4, 5, 6, 7, 8, 9, 10, 11, 12)
>
> pgpool can execute the following query concurrently. Because rep1
> table is the same on all nodes.
>
>   SELECT * FROM dist1 INNER JOIN rep1 ON dist1.a = rep1.a;
>
> I think small tables are defined in replicate_def.

If small tables are ideal in replicate_def then, it may not be applicable
to our situation because here's the structure of the whole thing as
planned:

database: loadbal
schema: public
tables: salesdetail
        salesheader

Tables salesdetail and salesheader both contain transaction date and
branchcode columns. We decided to partition the data by transaction date
because, as calculated, it partitions data more evenly as compared to by
branchcode partitioning. We then extract the MONTH part of the transaction
date then use it as an argument on the SELECT CASE function. Therefore,
possible values are 1-12. The 2 tables are very very large (millions of
records per table because we have like 100+ branches and
hundreds/thousands of transactions per day). So that is the reason why my
function looks like this:

*** Note: $1 = transaction date

SELECT CASE WHEN EXTRACT(MONTH FROM $1) => 1 AND EXTRACT(MONTH FROM $1) <=
4 THEN 0

*** Months 1-4 goes to node0

Question: Is it safe to say that it is not advisable to use replicate_def
because the tables are very large or should I use it to replicate the
records on all nodes?

Whenever I enter a value in the transactiondate field, ex. 2008-07-15, it
goes to node1 alone and no copy on others. Is this the right behavior? Are
we going to be able to take advantage of parallel query if this is going
to be the setup? If not then what would possibly be the best partitioning
scheme to take full advantage of parallel query given the structure/setup
above? Our goal again is to prove that pgpool-II parallel query can cut
the SELECT query time shorter as compared to a single MySQL server with
same amount of data (millions). Ideally, time taken should divided by 3
because of 3 backends.

Thank you very much for all your help.

Regards,
Viril Calimlim

>
> replicate_select is a parameter for replication mode. So you don't
> need to set it.
>
> I'm not good at English. So if you cannot understand, feel free to ask
> more.
>
> Regards,
> --
> Yoshiyuki Asaba
> y-asaba at sraoss.co.jp
>
>>
>> Regards,
>> Viril Calimlim
>>
>> > Hi,
>> >
>> > From: calimlimvl at nationalbookstore.com.ph
>> > Subject: Re: [Pgpool-general] Parallel Query Mode - SELECT,
>> Phppgadmin,
>> > Psql
>> > Date: Thu, 10 Jul 2008 17:25:41 +0800 (PHT)
>> >
>> >> This is really serious because we code using PHP and we wouldn't be
>> able
>> >> to make use of pgpool-II if we couldn't solve this problem.
>> >
>> > Do you use PHP-PDO?
>> >
>> >  By the way,
>> >> where can I get more information regarding other parameters like
>> >> replicate_select (yes it replicates the SELECT on all nodes but
>> what's
>> >> the
>> >> advantage? did I miss anything on the manual?), replicate_def table,
>> >> etc.
>> >> I am more than willing to further test the software for you to
>> eliminate
>> >> more bugs. Thank you so much for the speedy response.
>> >
>> > Could you try pgpool-II 2.1 RC1?
>> >
>> > Regards,
>> > --
>> > Yoshiyuki Asaba
>> > y-asaba at sraoss.co.jp
>> >
>> >
>> >
>> >>
>> >> Best regards,
>> >> Viril Calimlim
>> >>
>> >> > Hi,
>> >> >
>> >> > Do you set parallel_mode to true?
>> >> >
>> >> > Regards,
>> >> > --
>> >> > Yoshiyuki Asaba
>> >> > y-asaba at sraoss.co.jp
>> >> >
>> >> >
>> >> > From: calimlimvl at nationalbookstore.com.ph
>> >> > Subject: [Pgpool-general] Parallel Query Mode - SELECT, Phppgadmin,
>> >> Psql
>> >> > Date: Thu, 10 Jul 2008 15:45:11 +0800 (PHT)
>> >> >
>> >> >> Hello!
>> >> >>
>> >> >> First of all, thank you for taking the time to reply and excellent
>> >> tip
>> >> >> regarding my last question about Parallel Query. However, I've got
>> a
>> >> new
>> >> >> problem.
>> >> >>
>> >> >> In Parallel Query Mode, as per the Tutorial guidelines, I created
>> the
>> >> >> following:
>> >> >>
>> >> >> - Database "test"
>> >> >> - Schema "testschema"
>> >> >> - Table "testtable"
>> >> >> - Fields "id integer", "name varchar(20)"
>> >> >> - Function pgpool_catalog.dist_def_test with same ranges
>> (1..100000
>> >> etc)
>> >> >> - Inserted a row in dist_def table with key "id"
>> >> >>
>> >> >> I have 3 backend servers 0, 1, 2 and 1 pgpool-II dedicated server.
>> I
>> >> did
>> >> >> not insert a row in replicate_def because I still don't understand
>> >> that
>> >> >> part (sorry).
>> >> >>
>> >> >> My question is that whenever I issue the command:
>> >> >>
>> >> >> postgres at pgpool-server:~$ psql -p 9999 -c "INSERT INTO
>> >> >> testschema.testtable VALUES (100000,'testname')" test
>> >> >>
>> >> >> ... I checked all the backends and it send the data to node0 ALONE
>> >> and
>> >> >> when I do a SELECT *, it shows only 1 row.
>> >> >>
>> >> >> But when I use Phppgadmin or even run a PHP script (using PDO) on
>> >> port
>> >> >> 9999 of the pgpool-II server to insert data, I checked all the
>> >> backends
>> >> >> and all of them has the copy of the inserted row. When I do SELECT
>> *,
>> >> I
>> >> >> get 3 rows of the same data. Another thing about Phppgadmin, when
>> I
>> >> >> click
>> >> >> on the BROWSE button on testtable, it just keeps on Loading...
>> >> meaning,
>> >> >> it
>> >> >> stuck forever for some reason and I had to kill all pgpool
>> processes.
>> >> >>
>> >> >> tail postgresql log: lots of "LOG:  unexpected EOF on client
>> >> connection"
>> >> >> tail pgpool log:
>> >> >>
>> >> >> 2008-07-10 15:41:28 DEBUG: pid 6621: read_kind_from_backend: read
>> >> kind
>> >> >> from 0 th backend Z NUM_BACKENDS: 3
>> >> >> 2008-07-10 15:41:28 DEBUG: pid 6621: read_kind_from_backend: read
>> >> kind
>> >> >> from 1 th backend Z NUM_BACKENDS: 3
>> >> >> 2008-07-10 15:41:28 DEBUG: pid 6621: read_kind_from_backend: read
>> >> kind
>> >> >> from 2 th backend Z NUM_BACKENDS: 3
>> >> >> 2008-07-10 15:41:28 DEBUG: pid 6621: pool_process_query: kind from
>> >> >> backend: Z
>> >> >> 2008-07-10 15:41:28 DEBUG: pid 6621: pool_read_message_length:
>> slot:
>> >> 0
>> >> >> length: 5
>> >> >> 2008-07-10 15:41:28 DEBUG: pid 6621: pool_read_message_length:
>> slot:
>> >> 1
>> >> >> length: 5
>> >> >> 2008-07-10 15:41:28 DEBUG: pid 6621: pool_read_message_length:
>> slot:
>> >> 2
>> >> >> length: 5
>> >> >> 2008-07-10 15:41:28 DEBUG: pid 6621: ReadyForQuery: message
>> length: 5
>> >> >> 2008-07-10 15:41:28 DEBUG: pid 6621: ReadyForQuery: transaction
>> >> state: I
>> >> >> 2008-07-10 15:41:28 DEBUG: pid 6621: pool_connection_pool_timer:
>> set
>> >> >> close
>> >> >> time 1215675688
>> >> >>
>> >> >> Version: pgpool-II 2.1beta2
>> >> >> OS: Ubuntu 8.04 Server
>> >> >> PostgreSQL: 8.3.1
>> >> >>
>> >> >> Please help. I really want this to work. Ive been after this for 3
>> >> >> months
>> >> >> now.
>> >> >>
>> >> >> Thank you very much for your help.
>> >> >>
>> >> >> Regards,
>> >> >> Viril
>> >> >>
>> >> >> _______________________________________________
>> >> >> Pgpool-general mailing list
>> >> >> Pgpool-general at pgfoundry.org
>> >> >> http://pgfoundry.org/mailman/listinfo/pgpool-general
>> >> > _______________________________________________
>> >> > Pgpool-general mailing list
>> >> > Pgpool-general at pgfoundry.org
>> >> > http://pgfoundry.org/mailman/listinfo/pgpool-general
>> >> >
>> >> >
>> >>
>> >>
>> >
>> >
>>
>>
>
>





More information about the Pgpool-general mailing list