[Pgpool-hackers] pgpool-II ideas

Mon Jun 25 07:55:19 UTC 2007

Hi,

Thank you for your comment.

From: Jeff Davis <jdavis.pgpool at yahoo.com>
Subject: Re: [Pgpool-hackers] pgpool-II ideas
Date: Fri, 22 Jun 2007 11:29:55 -0700 (PDT)

> > I think your solution needs to serialize all writing queries. There is
> > a performance issue.
> > 
> 
> I think you're right: my idea has to serialize all statements (which is
> only slightly better than serializing all writing transactions).
> 
> Is there a way to detect if a query being executed asynchronously has
> already obtained its snapshot without waiting for the entire query to
> complete? That would allow us to have much better concurrency.
> 
> Perhaps we could force all transactions to be SERIALIZABLE. Then we'd
> know that the snapshot has been taken after the "BEGIN TRANSACTION
> ISOLATION LEVEL SERIALIZABLE" has been sent. Then, we just order the
> begin statements (which would be serializable) and the commit
> statements, and postgres would handle the rest. Of course, this
> introduces a new kind of failure, and we may need to keep an indefinite
> number of statements in memory in case we need to retry the transaction
> without the client knowing. I haven't considered this idea before; I
> would need to think about it more.
> 
> Thoughts?

The SERIALIZABLE snapshot is created by GetTransactionSnapshot(). The
BEGIN command does not call it, so we need to investigate which queries
call GetTransactionSnapshot().

* pgsql/src/backend/utils/time/tqual.c:GetTransactionSnapshot()
Snapshot
GetTransactionSnapshot(void)
{
	/* First call in transaction? */
	if (SerializableSnapshot == NULL)
	{
		SerializableSnapshot = GetSnapshotData(&SerializableSnapshotData, true);  /* <-- *** HERE *** */
		return SerializableSnapshot;
	}

	if (IsXactIsoLevelSerializable)
		return SerializableSnapshot;

	LatestSnapshot = GetSnapshotData(&LatestSnapshotData, false);

	return LatestSnapshot;
}
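
To make the consequence concrete, here is a toy model of that first
branch. It is only a sketch, not PostgreSQL code: ToySession,
toy_begin() and toy_query() are hypothetical names, and a "snapshot" is
reduced to a count of commits visible so far. It shows that a commit
which lands between BEGIN and the first query is still visible to the
transaction, so ordering only the BEGIN statements would not fix the
snapshot.

```c
/* Toy model only; none of these names exist in PostgreSQL or pgpool. */
typedef struct ToySession
{
	int		has_snapshot;	/* 0 until the first query of the transaction */
	long	snapshot;		/* number of commits visible to the transaction */
} ToySession;

/* BEGIN only opens the transaction; it takes no snapshot. */
void
toy_begin(ToySession *s)
{
	s->has_snapshot = 0;
	s->snapshot = -1;
}

/*
 * The first query of a SERIALIZABLE transaction takes the snapshot and
 * later queries reuse it, mirroring the SerializableSnapshot == NULL
 * test in GetTransactionSnapshot().
 */
long
toy_query(ToySession *s, long commits_so_far)
{
	if (!s->has_snapshot)
	{
		s->snapshot = commits_so_far;
		s->has_snapshot = 1;
	}
	return s->snapshot;
}
```

If one commit happens after toy_begin() but before the first
toy_query(), the session's snapshot includes it; a second commit after
that first query is not seen.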

So if pgpool serializes some queries, I think there is not only a
performance issue but also a deadlock problem in pgpool.

Query serialization would probably be implemented by the following
process:

  enter critical section by acquiring a lock
    write to node0
    read result from node0
    write to node1
    read result from node1
    ....
  leave critical section by releasing a lock

If pgpool blocks inside the critical section, a deadlock occurs.
What do you think?
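
One way such a deadlock could arise, as an assumed scenario rather than
something I have observed: client A's pgpool process holds the
serialization lock while its UPDATE waits on a row lock held by client
B's open transaction, and client B's pgpool process needs the same
serialization lock before it can send COMMIT. The sketch below models
this as a wait-for chain. TOY_NOBODY and toy_is_stuck() are
hypothetical names, not pgpool code. PostgreSQL's own deadlock detector
would presumably not see this cycle, because one of the waits is on a
lock held outside the database.

```c
#define TOY_NOBODY (-1)

/*
 * waits_for[i] is the index of the party that party i is blocked on,
 * or TOY_NOBODY if party i can run. Returns 1 if "start" can never
 * proceed because its wait-for chain runs into a cycle, 0 otherwise.
 */
int
toy_is_stuck(const int *waits_for, int n, int start)
{
	int		cur = start;
	int		steps;

	/* A chain without a cycle ends within n steps. */
	for (steps = 0; steps < n; steps++)
	{
		cur = waits_for[cur];
		if (cur == TOY_NOBODY)
			return 0;
	}
	return 1;
}
```

With party 0 as client A's process and party 1 as client B's process,
the chain {1, 0} is stuck, while {1, TOY_NOBODY} (B is free to commit,
as it is without the critical section) is not.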

> > So I'm planning to implement the following specification:
> > 
> >   pgpool checks the CommandComplete tags of INSERT/UPDATE/DELETE.
> >   They include the number of updated rows. If they differ, the
> >   transaction is aborted by pgpool.
> > 
> > A new parameter switches this behavior on or off.
> > 
> > What do you think?
> 
> Do you mean using PQcmdTuples() to find the number of rows
> processed?

Yes. 
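
A minimal sketch of the comparison, assuming pgpool has the
CommandComplete tag string from each node in hand. The function names
command_tag_rows() and row_counts_match() are hypothetical and not
existing pgpool code. The tag formats are "INSERT oid rows",
"UPDATE rows" and "DELETE rows", so the row count is always the last
field.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Extract the affected-row count from a CommandComplete tag such as
 * "INSERT 0 5", "UPDATE 3" or "DELETE 2". Returns -1 for any other tag
 * or if the count cannot be parsed.
 */
long
command_tag_rows(const char *tag)
{
	const char *p;
	char	   *end;
	long		rows;

	if (strncmp(tag, "INSERT ", 7) != 0 &&
		strncmp(tag, "UPDATE ", 7) != 0 &&
		strncmp(tag, "DELETE ", 7) != 0)
		return -1;

	/* The prefix check guarantees at least one space in the tag. */
	p = strrchr(tag, ' ');
	rows = strtol(p + 1, &end, 10);
	if (end == p + 1 || *end != '\0' || rows < 0)
		return -1;
	return rows;
}

/*
 * Return 1 if every node reported the same row count, 0 otherwise.
 * Tags without a row count all map to -1 and therefore compare equal.
 */
int
row_counts_match(const char *const *tags, int num_nodes)
{
	long	first;
	int		i;

	if (num_nodes <= 0)
		return 1;

	first = command_tag_rows(tags[0]);
	for (i = 1; i < num_nodes; i++)
	{
		if (command_tag_rows(tags[i]) != first)
			return 0;
	}
	return 1;
}
```

pgpool would abort the transaction when row_counts_match() returns 0
and the new parameter is enabled.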

> The inconsistency may have already been committed and you can't just
> rollback in all cases.
> 
> For instance:
> 
> Let t1 be an empty relation, and let t2 be a relation with 5M records.
> 
> Client1=> insert into t1 select i from t2; -- statement1
> Client2=> update t2 set i = -1 where i = 0; -- statement2
> 
> (1) statement1 is started on node0 getting a snapshot of t2 that does
> not include the value -1. 
> (2) statement2 is executed on node0
> (3) statement2 is executed on node1
> (4) PQcmdTuples() match for statement2, so statement2 commits on both
> node0 and node1
> (5) statement1 is started on node1 getting a different snapshot of t2
> that does include the value -1.
> (6) PQcmdTuples() match for statement1, so it is committed on both
> node0 and node1.
> 
> Now, node0 and node1 have the same number of tuples in t1, but they
> don't match.

Yes, an UPDATE command does not change the number of tuples, so pgpool
cannot detect the inconsistency. However, I think this is the only
such case. If that is true, I think we only need to add it to the list
of restrictions... There is a trade-off between consistency and
scalability.

Any comment?
Regards,
--
Yoshiyuki Asaba
y-asaba at sraoss.co.jp