[Pgpool-general] pgpool II first experiences

Mon Sep 18 12:27:41 UTC 2006

Hi,

Thank you for trying pgpool-II.

From: "Milen Kulev" <makulev at gmx.net>
Subject: RE: pgpool II first experiences
Date: Sun, 17 Sep 2006 16:09:39 +0200
Message-ID: <000e01c6da62$ed5fb340$0a00a8c0 at trivadis.com>

> Hello Mr. Ishii, 
> At the end (after my holiday) I have found a time to try PgpoolII.
> The installation went smoothly-> I have just followed you installation manual.
> Then I use pgbench to load data.
> I hava change a little bit the functions since I do not have 3 nodes, but only two.
> The functions dist_def_accounts and dist_def_tellers are returning node2, if the the values are not in a specified
> range. After that I was able to generate (INSERT) the data
> With  "./pgbench -p 9999  -i pgpool"  command. Pgpool is listeneing on port 9999:
> 
> bizadmin at node5#[s1] /u00/app/bizgres/product/bizgres/src/contrib]$ netstat  -antp | grep -i listen
> (Es konnten nicht alle Prozesse identifiziert werden; Informationen über
> nicht-eigene Processe werden nicht angezeigt; Root kann sie anzeigen.)
> tcp        0      0 127.0.0.1:5601              0.0.0.0:*                   LISTEN      12068/postmaster
> tcp        0      0 127.0.0.1:5602              0.0.0.0:*                   LISTEN      30017/postmaster
> tcp        0      0 127.0.0.1:5603              0.0.0.0:*                   LISTEN      29997/postmaster
> tcp        0      0 0.0.0.0:9898                0.0.0.0:*                   LISTEN      30622/pgpool
> tcp        0      0 127.0.0.1:9999              0.0.0.0:*                   LISTEN      30622/pgpool
> 
> 
> And now the problems
> 1)  I am connecting to pgpool  with the command:
> 	 psql  -p 9999  pgpool
> If I issue :
>  SELECT  aid from  accounts  where aid < 100;
> Then BOTH od PG databases (with service names S1 and S2) are queried???
> ==> /u01/bizdata/s2/pg_log/s2.log <==
> S2 - 30712 LOG:  statement:  SELECT "aid" FROM "accounts" WHERE "aid"<100
> S2 - 30712 LOG:  duration: 1.953 ms
> S2 - 30712 LOG:  duration: 1.953 ms  statement:  SELECT "aid" FROM "accounts" WHERE "aid"<100
> 
> ==> /u01/bizdata/s3/pg_log/s3.log <==
> S3 - 30713 LOG:  statement:  SELECT "aid" FROM "accounts" WHERE "aid"<100
> S3 - 30713 LOG:  duration: 0.979 ms
> S3 - 30713 LOG:  duration: 0.979 ms  statement:  SELECT "aid" FROM "accounts" WHERE "aid"<100
> 
> I thought that the function dist_def_accounts is used to rewrite the SQL statement.
> And I had actually expected only S2 PG database to be queried ! I am wrong I my undestanding 
> How pgpool II works ? I was expecxting that gpool II would have detected (from  its repository)
> That data with aid< 100  will be found only on  s2 (node1), and  would have issued the SELECT "aid" FROM "accounts"
> WHERE "aid"<100 ONLY on this node!
> 
> The definition of the function:
> CREATE OR REPLACE FUNCTION pgpool_catalog.dist_def_accounts (val ANYELEMENT) RETURNS INTEGER AS '
>   SELECT CASE WHEN $1 >= 1 and $1 <= 30000 THEN 0
> 	      WHEN $1 > 30000 and $1 <= 60000 THEN 1
> 	      ELSE 0 
>   END' LANGUAGE SQL;
> 
> The pgpoool II log looks like this:
> 2006-09-15 22:31:48 LOG:   pid 30665: statement: SELECT  aid from  accounts  where aid < 101 ;
> 2006-09-15 22:31:48 DEBUG: pid 30665: can pool_parallel_exec  SELECT "aid" FROM "accounts" WHERE "aid"<101

The data partitioning rule is used only for INSERT. Other queries
including SELECT do not use it. So the behavior you are watching is
expected one.

> 2) I am not able to use dblink, although I am sure that I a have installes db_links fintions
> >From "contrib"  directory on all database clusters:
> 
> pgpool=#  select  * from  accounts where aid < 1000 order by aid ;
> ERROR:  function dblink("unknown", "unknown") does not exist
> HINT:  No function matches the given name and argument types. You may need to add explicit type casts.
> pgpool=#
> 
> Excerpt from PgPool II log :
> 
> 2006-09-15 22:32:57 DEBUG: pid 30665: read kind from frontend Q(51)
> 2006-09-15 22:32:57 LOG:   pid 30665: statement: select  * from  accounts where aid < 1000 order by aid ;
> 2006-09-15 22:32:57 DEBUG: pid 30665: OneNode_do_command: Query:  SELECT *  FROM dblink('host=localhost dbname=pgpool
> port=9999 user=bizadmin','SELECT pool_parallel("select * from accounts WHERE aid <1000")') AS accounts(aid integer,bid
> integer,abalance integer,filler character(84)) ORDER BY aid
> 2006-09-15 22:32:57 DEBUG: pid 30665: pool_rewrite_stmt: XXX message_code 7
> 2006-09-15 22:32:57 DEBUG: pid 30665: pool_rewrite_stmt: XXX rule 704
> 2006-09-15 22:32:57 DEBUG: pid 30665: pool_process_query: num_fds: 12
> 
> Why I am doing here wrong ?

You need to install dblink() in system DB. In your case localhost port
5601.

> I hope I will get answer soon.
> 
> Best  Regards. 
> Milen Kulev
> Trivadis GmbH, Munich. Germany
>
> P.S. My  pgpool.conf is (only the changes lines):
> # if non 0, run in parallel query mode
> #parallel_mode = false
> parallel_mode = true
> 
> # if non 0, use query cache
> enable_query_cache = false
> 
> #set pgpool2 hostname
> pgpool2_hostname = 'localhost'
> 
> # system DB info
> system_db_hostname = 'localhost'
> system_db_port = 5601
> system_db_dbname = 'pgpool'
> system_db_schema = 'pgpool_catalog'
> system_db_user = 'pgpool'
> system_db_password = 'pgpool'
> 
> # backend_hostname, backend_port, backend_weight
> # here are examples
> backend_hostname0 = 'localhost'
> backend_port0 = 5602
> backend_weight0 = 1
> backend_hostname1 = 'localhost'
> backend_port1 = 5603
> backend_weight1 = 1
> 
> Where  does the backend_weight parameter is used for ?

backend_weight is not used in parallel_mode.

> 
> 
> 
> 
> 
> 
> -----Original Message-----
> From: Tatsuo Ishii [mailto:ishii at sraoss.co.jp] 
> Sent: Monday, July 31, 2006 2:30 AM
> To: makulev at gmx.net
> Cc: ipa-pgpool2 at sranhm.sra.co.jp
> Subject: Re: pgpool II infomation/sources
> 
> 
> Hi,
> 
> We have decided to provide you an eary version of pgpool-II (alpha 0.1). Please find the attached tar ball.
> 
> Currently no English documentations are available. Here is a brief instruction how to install and run pgpool-II.
> 
> 1) requirements
> 
> You should install PostgreSQL 7.4 or later client libraries (libpq) on your machine to build/running pgpool-II. We are
> testing pgpool-II against 8.1.4 and recommend to use same version to avoid any confusion.
> 
> 2) configure, make, make install
> 
> We are using standard configure system and just typing configure, make, make install should be fine. There are few
> options for configure:
> 
> --prefix=dir install pgpool-II onto dir, rather than /usr/local --with-pgsql=dir look for libpq and staffs in dir rather
> than --/usr/local/pgsql
> 
> >From now on we refer the pgpool-II install directory as $PGPOOL.
> 
> 3) edit pgpool.conf
> 
> pgpool.conf is the main configuration file for pgpool-II.
> 
> cp $PGPOOL/etc/pgpool.conf.sample $PGPOOL/etc/pgpool.conf
> 
> and edit pgpool.conf. Here are places you need to edit to run pgpool-II in parallel query mode.
> 
> # if non 0, run in parallel query mode
> parallel_mode = true
> 
> # system DB info
> system_db_hostname = 'localhost'
> system_db_port = 5432
> system_db_dbname = 'pgpool'
> system_db_schema = 'pgpool_catalog'
> system_db_user = 'pgpool'
> system_db_password = ''
> 
> # backend_hostname, backend_port, backend_weight
> # here are examples
> #backend_hostname0 = 'host1'
> #backend_port0 = 5432
> #backend_weight0 = 1
> #backend_hostname1 = 'host2'
> #backend_port1 = 5433
> #backend_weight1 = 1
> 
> 4) edit pcp.conf
> 
> cp $PGPOOL/etc/pcp.conf.sample $PGPOOL/etc/pcp.conf
> 
> pcp is a pgpool-II management subsystem and it requires username and password to logged on. The username and password
> pair should be written in pcp.conf. The password must be encrypted using  MD5 hash. For your convenience, a command to
> generate MD5 hashed password.
> Type:
> 
> $PGPOOL/bin/pg_md5 bar
> 37b51d194a7513e45b56f6524f2d51f2
> 
> if your password is "bar". Now you could add username=foo and its password in to pcp.conf.
> 
> foo: 37b51d194a7513e45b56f6524f2d51f2
> 
> 5) install system DB
> 
> Run following SQL script to initialize system DB on pgpool DB (you need to tweak this if you modify system DB part of
> pgpoo.conf).
> 
> -------------------------------------------------------
> DROP TABLE pgpool_catalog.dist_def;
> DROP TABLE pgpool_catalog.query_cache;
> DROP SCHEMA pgpool_catalog;
> 
> 
> CREATE SCHEMA pgpool_catalog;
> 
> CREATE TABLE pgpool_catalog.dist_def(
> 	dbname TEXT,
> 	schema_name TEXT,
> 	table_name TEXT,
> 	col_name TEXT NOT NULL CHECK (col_name = ANY (col_list)),
> 	col_list TEXT[] NOT NULL,
> 	type_list TEXT[] NOT NULL,
> 	dist_def_func TEXT NOT NULL,
> 	PRIMARY KEY (dbname,schema_name,table_name)
> );
> 
> CREATE TABLE pgpool_catalog.query_cache (
> 	hash TEXT,
> 	query TEXT,
> 	value bytea,
> 	dbname TEXT,
> 	create_time TIMESTAMP WITH TIME ZONE,
> 	PRIMARY KEY(hash, dbname)
> );
> -------------------------------------------------------
> 
> 6) create your own table partitioning rule
> 
> It is user's task to create table partitioning rules. Here is an example for pgbench's tables (accounts, branches,
> tellers). "history" table is not defined below and it will be created in the system DB host.
> 
> -------------------------------------------------------
> INSERT INTO pgpool_catalog.dist_def VALUES (
> 	'pgpool',
> 	'public',
> 	'accounts',
> 	'aid',
> 	ARRAY['aid','bid','abalance','filler'],
> 	ARRAY['integer','integer','integer','character(84)'],
> 	'pgpool_catalog.dist_def_accounts'
> );
> 
> INSERT INTO pgpool_catalog.dist_def VALUES (
> 	'pgpool',
> 	'public',
> 	'branches',
> 	'bid',
> 	ARRAY['bid','bbalance','filler'],
> 	ARRAY['integer','integer','character(84)'],
> 	'pgpool_catalog.dist_def_branches'
> );
> INSERT INTO pgpool_catalog.dist_def VALUES (
> 	'pgpool',
> 	'public',
> 	'tellers',
> 	'tid',
> 	ARRAY['tid','bid','tbalance','filler'],
> 	ARRAY['integer','integer','integer','character(84)'],
> 	'pgpool_catalog.dist_def_tellers'
> );
> 
> 
> CREATE OR REPLACE FUNCTION pgpool_catalog.dist_def_accounts (val ANYELEMENT) RETURNS INTEGER AS '
>   SELECT CASE WHEN $1 >= 1 and $1 <= 30000 THEN 0
> 	      WHEN $1 > 30000 and $1 <= 60000 THEN 1
> 	      ELSE 2
>   END' LANGUAGE SQL;
>  
> CREATE OR REPLACE FUNCTION pgpool_catalog.dist_def_branches (val ANYELEMENT) RETURNS INTEGER AS '
>   SELECT 0
>   ' LANGUAGE SQL;
> 	 
> CREATE OR REPLACE FUNCTION pgpool_catalog.dist_def_tellers (val ANYELEMENT) RETURNS INTEGER AS '
>   SELECT CASE WHEN $1 >= 1 and $1 <= 3 THEN 0
>           WHEN $1 > 3 and $1 <= 6 THEN 1
> 	  ELSE 2
>   END' LANGUAGE SQL;
> -------------------------------------------------------
> 
> 7) run pgpool
> 
> Finally you can start up pgpool.
> 
> $PGPOOL/bin/pgpool -d -n >& somewhere
> 
> will produce lots of debugging messages which should help you if you have any trouble.
> 
> regards,
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> 
> > Hello Milen Kulev,
> > 
> > Thank you for interested in pgpool-II.
> > I'm going to discuss with our pgpool-II development team and reply 
> > back to you soon.
> > 
> > regards,
> > --
> > Tatsuo Ishii
> > SRA OSS, Inc. Japan
> > 
> > > Hello Mr. Ishii,
> > > recently I have found on
> > > "http://pgpool.projects.postgresql.org/pgpool-II/en/" that the new version of pgpool is planned to be released in
> September. 
> > > 
> > > Currently I have become a task to perform an evaluation of Postgres 
> > > capabilities in DWH and BI(Business Intelligence) field.
> > > 
> > > I has found the architecture of pgpool II pretty intriguing and 
> > > promising.
> > > I wuould like to  compare pgpool II with extendDB (roughly the same idea pgpool II, but implemented in Java  ), and
> bizgres MPP.
> > > 
> > > I would like to ask you to send me /or point me out where I could 
> > > find / the sources of pgpool II, so that I can test pgpool II.
> > > 
> > > I intend to install pgpool II on RHEL4 and/or SLES9 .
> > > 
> > > I know that release date for pgpool II is planned to be in Sep 2006, 
> > > but I would like to be able to test pgpool II a little bit more 
> > > (Yes, I know that this will not be the final version).
> > > 
> > > I would be glad if I can become a pgpool II sources und thus to be 
> > > able to
> > > perform some beta - testing /and give you feedback, of course/.
> > > 
> > > Best Regards.
> > > Milen Kulev, Oracle DBA 
> > > Munich, Germany
> > > 
> > > 
> > > 
> > > --
> > > 
> > > 
> > > Echte DSL-Flatrate dauerhaft für 0,- Euro*. Nur noch kurze Zeit! 
> > > "Feel free" mit GMX DSL: http://www.gmx.net/de/go/dsl