[Pgpool-hackers] Concurrent BackendInfo read/write access

Fri Sep 5 01:52:36 UTC 2008

It seems health_check() does not update Req_info.

As for the main loop, in code such as:

	Req_info->kind = NODE_DOWN_REQUEST;
	Req_info->node_id[0] = sts;

it's probably OK, since the only races come from child processes
that have detected errors on another backend. In that case they set
NODE_DOWN_REQUEST anyway.

> There are places like health_check() where backend_status is being set
> without holding any semaphores and also in the main loop after getting the
> health status, the Req_info is updated without holding onto the
> REQUEST_INFO_SEM.  There are other cases where the semaphore isn't held
> while updating shared memory data structures, like send_failback_request()
> which is called by the PCP process.  Is there something I'm missing?

The question is the case where NODE_UP_REQUEST is set by
send_failback_request(), as you mentioned. What if Req_info->kind is
overwritten with NODE_UP_REQUEST while NODE_DOWN_REQUEST is being
processed? (Protection using a semaphore is useless in that case anyway.)
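
To illustrate the concern, here is a minimal sketch. All names in it
(request_slot, post_request, post_request_if_idle, REQ_*) are
hypothetical stand-ins, not taken from the pgpool source. It assumes a
single shared request slot, and shows that a later request replaces an
earlier one that has not been fully handled, even if each individual
store were lock-protected. One possible mitigation, shown only as an
assumption, is to refuse a new request while one is pending:

```c
#include <assert.h>

/* Hypothetical stand-ins for the request kinds; not pgpool's definitions. */
typedef enum { REQ_NONE, REQ_NODE_DOWN, REQ_NODE_UP } req_kind;

/* A single shared request slot, loosely modelled on Req_info. */
typedef struct {
    req_kind kind;
    int      node_id;
} request_slot;

/*
 * Unconditional post. Even if these two stores were wrapped in a
 * semaphore, a second caller would still replace a request that the
 * main process has not finished handling.
 */
static void post_request(request_slot *slot, req_kind kind, int node_id)
{
    assert(node_id >= 0);
    slot->kind = kind;
    slot->node_id = node_id;
}

/*
 * Post only if no request is pending. Returns 1 if accepted, 0 if the
 * slot is busy. In a multi-process setting the check and the stores
 * would have to happen under one lock, and the slot would have to be
 * reset to REQ_NONE only after the request has been processed.
 */
static int post_request_if_idle(request_slot *slot, req_kind kind, int node_id)
{
    assert(node_id >= 0);
    if (slot->kind != REQ_NONE)
        return 0;
    slot->kind = kind;
    slot->node_id = node_id;
    return 1;
}
```

With post_request(), a NODE_DOWN request for node 1 followed by a
NODE_UP request leaves only the NODE_UP request in the slot. With
post_request_if_idle(), the second call returns 0 and the caller would
have to retry or report an error.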

NODE_UP_REQUEST is issued either by PCP commands or by the online
recovery process. The former is invoked by a human, so we probably don't
need to worry about it very much. The latter is possible: a backend
could fail during the first stage of recovery. I need to look
into this more...
--
Tatsuo Ishii
SRA OSS, Inc. Japan

> Thanks.
> 
> On 9/3/08, Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
> >
> > > Hello,
> > >
> > > I'm looking at the source code for PGPool II 2.1. And I have a question
> > > regarding concurrent read/write access to the BackendInfo structure in
> > > shared memory. There are calls like VALID_BACKEND(i) or
> > > BACKEND_INFO(i).backend_status = CON_DOWN, which seem to be called
> > > concurrently without holding any mutexes. It seems that setting the
> > > backend_status is ok, since it's an enum and is an atomic operation.
> > > However, VALID_BACKEND() isn't really an atomic operation, but may still be
> > > ok:
> > >
> > > ((BACKEND_INFO(backend_id).backend_status == CON_UP) || \
> > > (BACKEND_INFO(backend_id).backend_status == CON_CONNECT_WAIT))
> > >
> > > A bigger issue seems to be that without the use of mutexes or semaphores,
> > > how does PGPool guarantee the memory state for the backend status is
> > > consistent across different processor cores?
> >
> >
> > Actually semaphores are used. Try to grep pool_semaphore_lock etc.
> >
> > --
> > Tatsuo Ishii
> > SRA OSS, Inc. Japan
> >