[Pgpool-hackers] pgpool-II ideas

Jeff Davis jdavis.pgpool at yahoo.com
Fri Jun 22 18:29:55 UTC 2007


Hi,

--- Yoshiyuki Asaba <y-asaba at sraoss.co.jp> wrote:
> From: Jeff Davis <jdavis.pgpool at yahoo.com>
> Subject: Re: [Pgpool-hackers] pgpool-II ideas
> Date: Tue, 19 Jun 2007 09:33:56 -0700 (PDT)
> 
> > I think that there's still a problem if the INSERT/UPDATE/DELETE
> > only lock the destination table, and not the source table.
> 
> 
> I think your solution needs to serialize all writing queries. That
> is a performance issue.
> 

I think you're right: my idea has to serialize all statements (which is
only slightly better than serializing all writing transactions).

Is there a way to detect whether a query being executed asynchronously
has already obtained its snapshot, without waiting for the entire query
to complete? That would allow much better concurrency.

Perhaps we could force all transactions to be SERIALIZABLE. Then we'd
know that the snapshot has been taken after the "BEGIN TRANSACTION
ISOLATION LEVEL SERIALIZABLE" has been sent. Then, we just order the
begin statements (which would be serializable) and the commit
statements, and postgres would handle the rest. Of course, this
introduces a new kind of failure, and we may need to keep an indefinite
number of statements in memory in case we need to retry the transaction
without the client knowing. I haven't considered this idea before; I
would need to think about it more.
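The ordering idea could be sketched roughly as follows. This is a toy
simulation, not pgpool code; the class and method names are mine. Only
the BEGIN and COMMIT steps are globally ordered, while statement bodies
are free to interleave, which is exactly what SERIALIZABLE isolation
would then have to police on each node.

```python
import threading

# Toy sketch of the proposed ordering: serialize only snapshot
# acquisition (BEGIN) and COMMIT across all nodes, letting statement
# bodies run concurrently. All names here are illustrative.

class OrderedReplicator:
    def __init__(self, num_nodes):
        self.order_lock = threading.Lock()  # serializes BEGIN and COMMIT
        self.logs = [[] for _ in range(num_nodes)]  # per-node statement log

    def run_transaction(self, txn_id, statements):
        # BEGIN goes to every node under the lock, so every node takes
        # its snapshot in the same global order.
        with self.order_lock:
            for log in self.logs:
                log.append((txn_id, "BEGIN SERIALIZABLE"))
        # Statement bodies may interleave freely between transactions.
        for stmt in statements:
            for log in self.logs:
                log.append((txn_id, stmt))
        # COMMIT is likewise globally ordered.
        with self.order_lock:
            for log in self.logs:
                log.append((txn_id, "COMMIT"))
```

Run two transactions from two threads and every node sees the BEGINs in
the same relative order, and likewise the COMMITs, whatever the
statement interleaving in between.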

Thoughts?

> So I'm planning to implement the following specification:
> 
>   pgpool checks the CommandComplete tags of INSERT/UPDATE/DELETE.
>   They include the number of updated rows. If the counts differ, the
>   transaction is aborted by pgpool.
> 
> pgpool would enable or disable this behavior via a new parameter.
> 
> What do you think?

Do you mean using PQcmdTuples() to find the number of rows processed?
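If I understand the proposal, the check itself would be something like
this sketch (illustrative only; check_command_tags is my name, not a
pgpool function):

```python
# Illustrative sketch of the proposed consistency check (not pgpool
# code): compare the affected-row counts that PQcmdTuples() would
# report from each node, and abort the transaction if they disagree.

def check_command_tags(row_counts):
    """row_counts: per-node affected-row counts for one
    INSERT/UPDATE/DELETE statement."""
    if len(set(row_counts)) > 1:
        # The nodes disagree -> pgpool would abort the transaction.
        raise RuntimeError("row count mismatch across nodes")
    return row_counts[0]
```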

The inconsistency may have already been committed, so you can't simply
roll back in all cases.

For instance:

Let t1 be an empty relation, and let t2 be a relation with 5M records.

Client1=> insert into t1 select i from t2; -- statement1
Client2=> update t2 set i = -1 where i = 0; -- statement2

(1) statement1 is started on node0, getting a snapshot of t2 that does
not include the value -1.
(2) statement2 is executed on node0.
(3) statement2 is executed on node1.
(4) The PQcmdTuples() counts match for statement2, so statement2
commits on both node0 and node1.
(5) statement1 is started on node1, getting a different snapshot of t2
that does include the value -1.
(6) The PQcmdTuples() counts match for statement1, so it is committed
on both node0 and node1.

Now, node0 and node1 have the same number of tuples in t1, but the
contents don't match.
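The interleaving above can be replayed as a toy simulation (plain
Python, not pgpool; the tables are shrunk to five rows). Both nodes
report the same affected-row count for every statement, so a
row-count check passes, yet t1 diverges.

```python
# Toy MVCC-style replay of the interleaving above: each "node" takes
# statement1's snapshot of t2 at a different point, so row counts
# match but contents diverge.

def run_node(update_commits_before_snapshot):
    t2 = list(range(5))    # small stand-in for the 5M-row table
    t1 = []
    if update_commits_before_snapshot:
        # node1: statement2 commits before statement1's snapshot
        t2 = [-1 if i == 0 else i for i in t2]
        snapshot = list(t2)            # statement1 sees the -1
    else:
        # node0: statement1's snapshot is taken first
        snapshot = list(t2)            # statement1 sees the 0
        t2 = [-1 if i == 0 else i for i in t2]
    t1.extend(snapshot)                # INSERT INTO t1 SELECT i FROM t2
    return t1, t2

t1_node0, t2_node0 = run_node(False)
t1_node1, t2_node1 = run_node(True)
```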

    Regards,
        Jeff Davis


 

