[Pgpool-general] Parallel SELECT now works but is very slow...but dblink is parallel?

Sandeep Thakkar sandeeptt at yahoo.com
Thu Feb 3 10:00:37 UTC 2011


I see that the complete data is available on all the nodes. I guess, in the 
pgbench example, the data is distributed based on aid, bid, tid values for 
accounts, branches and tellers accounts resp. Am I correct? If yes, then the 
following query should return the value through only one port and with pgpool 
port. Because, the row where aid is "3", is present on only one of the nodes.

psql -p 5432 -c "select abalance from pgbench_accounts where aid=3" 
bench_parallel

But, I see that the query returns correct value when I specify the port of other 
nodes as well. (5433, 5434). Why? Does it mean that parallel query mode was not 
enabled? Or am I misunderstanding something here? Also, do we see anything in 
pgpool.log regarding parallel query process? Please help here. 


Thank You.





________________________________
From: Alessandro Candini <candini at meeo.it>
To: pgpool-general at pgfoundry.org
Sent: Tue, February 1, 2011 7:57:41 PM
Subject: [Pgpool-general] Parallel SELECT now works but is very slow...but 
dblink is parallel?

Finally I was able to properly configure pgpool in order to perform a parallel
query.

But it is so slow that it seems that the query on my splitted database is
performed sequentially rather than in a parallel way.

I have on the same machine pgpool-II-3.0.1 and 4 istances of postgresql-9.0.2
(respectovely on port 5433, 5434, 5435, 5436).
Every istance has a different piece (no replication!) of the same database.

A test query that I prepared which retrieve a big amount of data, takes 6
seconds with pgpool,
but if I launch the same query with 4 different but
contemporary threads directly on database ports (using a python script),
it takes only 0.9 seconds per thread.
Ok, the results are not merged together, but what a difference!

I think that pgpool splits the queries through the instancies, but launch them
sequentially.
I guess "1 sec * db + result_merge_time" = 6 seconds more or less...

Is that possible and there is a way to fix it?
Is the dblink function launched in a parallel way (contemporarily) on the 4 db
instancies?

My configuration is the follows, thanks in advance...
listen_addresses = '*'
port = 9999
pcp_port = 9898
socket_dir = '/tmp'
pcp_socket_dir = '/tmp'
backend_socket_dir = '/tmp'
pcp_timeout = 10
num_init_children = 32
max_pool = 4
child_life_time = 0
connection_life_time = 0
child_max_connections = 0
client_idle_limit = 0
authentication_timeout = 60
logdir = '/var/log/pgpool'
pid_file_name = '/var/log/pgpool/pgpool.pid'
replication_mode = true
load_balance_mode = false
replication_stop_on_mismatch = false
failover_if_affected_tuples_mismatch = false
replicate_select = false
reset_query_list = 'ABORT; DISCARD ALL'
white_function_list = ''
black_function_list = 'nextval,setval'
print_timestamp = true
master_slave_mode = false
master_slave_sub_mode = 'stream'
delay_threshold = 100
log_standby_delay = 'if_over_threshold'
connection_cache = true
health_check_timeout = 10
health_check_period = 10
health_check_user = 'postgis'
failover_command = '/usr/lib/pgpool-II/3.0.1/bin/failover.sh %h %H
/tmp/trigger_file0'
failback_command = '/usr/lib/pgpool-II/3.0.1/bin/failback.sh %h %M
/tmp/trigger_file0 %m'
fail_over_on_backend_error = false
insert_lock = true
ignore_leading_white_space = false
log_statement = false
log_per_node_statement = false
log_connections = false
log_hostname = false
parallel_mode = true
enable_query_cache = false
pgpool2_hostname = ''
system_db_hostname = 'localhost'
system_db_port = 5433
system_db_dbname = 'pgpool'
system_db_schema = 'pgpool_catalog'
system_db_user = 'pgpool'
system_db_password = 'gispost'
backend_hostname0 = 'localhost'
backend_port0 = 5433
backend_weight0 = 1
backend_data_directory0 = '/home/database/9.0.2/db_0/data'
backend_hostname1 = 'localhost'
backend_port1 = 5434
backend_weight1 = 1
backend_data_directory1 = '/home/database/9.0.2/db_1/data'
backend_hostname2 = 'localhost'
backend_port2 = 5435
backend_weight2 = 1
backend_data_directory2 = '/home/database/9.0.2/db_2/data'
backend_hostname3 = 'localhost'
backend_port3 = 5436
backend_weight3 = 1
backend_data_directory3 = '/home/database/9.0.2/db_3/data'
enable_pool_hba = true
recovery_user = 'postgis'
recovery_password = 'gispost'
recovery_1st_stage_command = 'basebackup.sh'
recovery_2nd_stage_command = ''
recovery_timeout = 60
client_idle_limit_in_recovery = 0
lobj_lock_table = 'pgpool_lobj_lock'
ssl = false
debug_level = 0

_______________________________________________
Pgpool-general mailing list
Pgpool-general at pgfoundry.org
http://pgfoundry.org/mailman/listinfo/pgpool-general



      
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://pgfoundry.org/pipermail/pgpool-general/attachments/20110203/4f60c8e7/attachment.html>


More information about the Pgpool-general mailing list