[Pgpool-hackers] pgpool-II ideas

Jeff Davis jdavis.pgpool at yahoo.com
Sun Jun 10 19:25:09 UTC 2007


I have a few ideas about possible improvements to
pgpool-II. These are just some thoughts, and I'd like
to know how feasible or useful they would be.

1. Adding support for 2PC in replication mode

Transform all transactions to be of the form:
BEGIN;
-- statements go here

-- pgpool.transaction is a table that exists on all nodes
INSERT INTO pgpool.transaction(id) VALUES('some_unique_id');
PREPARE TRANSACTION 'some_unique_id';
-- wait for all nodes to prepare

COMMIT PREPARED 'some_unique_id';

executed on all nodes. If any one node fails to
prepare, it can be dropped and an alert raised. If
the pgpool server crashes, we can bring up all the
nodes and find any prepared (but not yet committed)
transactions. If a prepared transaction's ID matches a
record in pgpool.transaction on any node, that node
has already committed it, so we issue a COMMIT
PREPARED 'ID' on the nodes where it is still pending.
If no node has a matching committed record, we issue
ROLLBACK PREPARED 'ID'. This should bring all the
nodes into sync, and pgpool can resume normal
operation without any data loss.
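
To make the recovery rule concrete, here is a rough sketch
of the decision step in Python. It is only an illustration,
not pgpool code: plan_recovery and its two arguments are
names I made up. I am assuming that, for each node, we have
already collected the gids of its prepared transactions
(PostgreSQL lists these in the pg_prepared_xacts view) and
the ids visible in pgpool.transaction, and that every node
is reachable when the plan is computed.

```python
from typing import Dict, Set


def plan_recovery(
    prepared: Dict[str, Set[str]],
    committed: Dict[str, Set[str]],
) -> Dict[str, Dict[str, str]]:
    """Decide what to do with each pending prepared transaction.

    prepared:  node name -> gids still prepared on that node
               (hypothetical input, e.g. read from pg_prepared_xacts)
    committed: node name -> ids visible in pgpool.transaction on that node
               (a visible row means that node already committed it)
    Returns:   node name -> {gid: SQL command to issue on that node}
    """
    # An id is "committed somewhere" if any node shows its marker row.
    committed_anywhere = set().union(*committed.values())

    plan: Dict[str, Dict[str, str]] = {}
    for node, gids in prepared.items():
        plan[node] = {
            gid: (
                "COMMIT PREPARED"
                if gid in committed_anywhere
                else "ROLLBACK PREPARED"
            )
            for gid in gids
        }
    return plan
```

For example, if node2 already shows 'tx42' in
pgpool.transaction while node1 still has it prepared, the
plan tells node1 to COMMIT PREPARED 'tx42'. A gid that no
node has committed gets ROLLBACK PREPARED everywhere it is
pending.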

Currently, pgpool-II is safe from any of the nodes
failing, but unsafe if pgpool-II itself crashes, if I
understand correctly. This would fix that problem, I
think.

This could be a configurable option, since it will
hurt write performance. I think it would need to be
combined with strict-enough write ordering to prevent
deadlocks.

2. More strict write ordering in replication mode

Currently, pgpool-II can suffer from inconsistencies
between nodes due to complex transactions getting
different snapshots on different nodes:

http://lists.pgfoundry.org/pipermail/pgpool-general/2007-May/000641.html

The simple solution is to serialize all writing
transactions completely, which would seriously hurt
write performance. 

I think the better solution is to transform all
writing transactions into explicit transactions with a
BEGIN ... COMMIT. Then, make sure all statements (even
from different connections) are executed in the same
order on each node and in the same order across nodes
(to prevent deadlock). I think this can be done in a
safe way with a shared counter for the pgpool
processes. 
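
A rough sketch of the shared-counter idea, in Python for
brevity. Everything here is hypothetical: SequenceCounter
and OrderedApplier are names I made up, and a threading
lock stands in for whatever shared-memory counter the
pgpool processes would really use. Each writing statement
is stamped with a global sequence number, and each node
applies statements strictly in that order, buffering any
that arrive early.

```python
import heapq
import threading


class SequenceCounter:
    """Hands out strictly increasing sequence numbers.

    Stand-in for a counter shared between pgpool processes.
    """

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._next = 0

    def next(self) -> int:
        with self._lock:
            n = self._next
            self._next += 1
            return n


class OrderedApplier:
    """Applies statements to one node in sequence-number order."""

    def __init__(self) -> None:
        self._pending = []   # min-heap of (seq, statement)
        self._expected = 0   # next sequence number this node may run
        self.applied = []    # stand-in for "sent to the backend"

    def submit(self, seq: int, statement: str) -> None:
        heapq.heappush(self._pending, (seq, statement))
        # Drain every statement that is now next in line.
        while self._pending and self._pending[0][0] == self._expected:
            _, stmt = heapq.heappop(self._pending)
            self.applied.append(stmt)
            self._expected += 1
```

With this, two nodes that receive the same stamped
statements in different arrival orders still execute them
in the same order. The cost is that a statement waits until
all lower-numbered statements have been applied on that
node.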

This should be configurable, or perhaps replace
replication_strict.

3. Unification of replication mode and parallel query
mode

It doesn't seem like these modes are mutually
exclusive. It would be nice if both modes could be
used together; even having replicated partitions, etc.
Is there a reason this won't work or is it just
challenging?

Thoughts? Have these things been discussed already? If
one of these things seems promising, I'll take a look
into the code.

Regards,
        Jeff Davis




       