[pgpool-hackers: 3279] Re: Segfault in a race condition

Tue Mar 26 21:29:44 JST 2019

On Thu, 28 Feb 2019 10:46:41 +0900
Yugo Nagata <nagata at sraoss.co.jp> wrote:

> On Thu, 28 Feb 2019 08:14:58 +0900 (JST)
> Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
> 
> > > On Wed, 27 Feb 2019 08:25:51 +0900 (JST)
> > > Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
> > > 
> > >> > Hi,
> > >> > 
> > >> > I found another race condition in 3.6.15 causing a segfault, which is
> > >> > reported by our customer.
> > >> > 
> > >> > On Tue, 08 Jan 2019 17:04:00 +0900 (JST)
> > >> > Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
> > >> > 
> > >> >> I found a segfault could happen in a race condition:
> > >> >> 
> > >> >> 1) frontend tries to connect to Pgpool-II
> > >> >> 
> > >> >> 2) there's no existing connection cache
> > >> >> 
> > >> >> 3) try to create new backend connections by calling connect_backend()
> > >> >> 
> > >> >> 4) inside connect_backend(), pool_create_cp() gets called
> > >> >> 
> > >> >> 5) pool_create_cp() calls new_connection()
> > >> >> 
> > >> >> 6) failover occurs and the global backend status is set to down, but
> > >> >>    the pgpool main does not send kill signal to the child process yet
> > >> >> 
> > >> >> 7) inside new_connection() after checking VALID_BACKEND, it checks the
> > >> >>    global backend status and finds it is set to down status, so that
> > >> >>    it returns without creating new connection slot
> > >> >> 
> > >> >> 8) connect_backend() continues and accesses the downed connection slot
> > >> >>    because local status says it's alive, which results in a segfault.
> > >> >  
> > >> > The situation is almost the same to above except that the segfault
> > >> > occurs in pool_do_auth().  (See backtrace and log below)
> > >> > 
> > >> > I guess pool_do_auth was called before Req_info->master_node_id was updated
> > >> > in failover(), so MASTER_CONNECTION(cp) was referring the downed connection
> > >> > and MASTER_CONNECTION(cp)->sp caused the segfault.
> > >> 
> > >> The situation is different in that the segfault explained in
> > >> [pgpool-hackers: 3214] was caused by local node status was too old
> > >> (the global status was up-to-date), while in this case the global
> > >> status is not yet updated. So we cannot employ the same fix as before.
> > > 
> > > Yes, I agree. The problem is same in the sense that there is inconsistency
> > > between global and private status, but the way to fix should be different.
> > > 
> > >> I think the possible fix would be, checking Req_info->switching = true
> > >> before referring MASTER_CONNECTION macro. If it's true, refuse to
> > >> accept new connection.
> > >> 
> > >> What do you think?
> > > 
> > > This way will be able to avoid the segfault, but I think a better way
> > > is raising a error only when Req_info->master_node_id (or primary_node_id)
> > > is changes due to a failover during accepting a new connection. Attached
> > > is a patch in this way, and I confirmed the segfault disappeared after
> > > applying this ** in native replication mode**.  
> > 
> > Why is this way better than checking Req_info->switching?
> 
> I think that your proposal refuses a new connection whenever
> Req_info->switching is true even when the degenerated node is
> not a master, while my proposal is to refuse a new connection
> only when a master/primary node is changed during a failover
> and this will minimalise possibility of connection errors.

How about this?

Maybe my proposed fix is a bit complicated, and your fix may be
simpler and safer than this, so I am fine with your solution.

I made a mantis ticket for this:
https://www.pgpool.net/mantisbt/view.php?id=481
> 
> > 
> > > # I can reproduce the segfault by executing pcp_detach_node and pcp_attach_node
> > > # for node 0 repeatedly during running pgbench with -C option.  
> > > # I useed child_max_connections = 5 to make pgpool create new connections frequently.
> > > 
> > > However, in streaming-repliction mode, I found another segfault by the
> > > same way to reproduce!  The location was MASTER_CONNECTION or TSTATE, and
> > > the segfault occured after pcp_attach_node instead of pcp_dettach_node.
> > > 
> > > My guess is as below:
> > > 
> > > 1. When a new connection was accepted, the backend of node 0 was in down, so slot[0] is null.
> > > 2. During processing a query, failback happened and node 0 became a new primary.
> > > 3. Then, when accessing slots[MASTER_NODE_ID (=0], this caused the segfault.
> > > 
> > > I don't have good idea for fixing this for now...  Just a report.
> > 
> > Thanks. Looks to be a hard problem...

And for this:
https://www.pgpool.net/mantisbt/view.php?id=482

> 
> I can reproduce this withoug using pgbench.
> 
> 1. deatch the primary node (node 0) by pcp_detach_node to set the primary node id to 1.
> 2. connect to a pgpool and execute pg_sleep(10).
> 3. during executing the query, attach node 0 to get back the primary node id to 0.
> 
> Maybe, the following assumption need some modifications, that is,
> it might not be sufficient if the attached node become a new master
> or a new primary....
> 
> 1925         /*
> 1926          * On 2011/5/2 Tatsuo Ishii says: if mode is streaming replication and
> 1927          * request is NODE_UP_REQUEST (failback case) we don't need to restart
> 1928          * all children. Existing session will not use newly attached node,
> 1929          * but load balanced node is not changed until this session ends, so
> 1930          * it's harmless anyway.
> 1931          */
> 1932 
> 1933         /*
> 1934          * On 2015/9/21 Tatsuo Ishii says: this judgment is not sufficient if
> 1935          * all backends were down. Child process has local status in which all
> 1936          * backends are down. In this case even if new connection arrives from
> 1937          * frontend, the child will not accept it because the local status
> 1938          * shows all backends are down. For this purpose we refer to
> 1939          * "all_backend_down" variable, which was set before updating backend
> 1940          * status.
> 1941          *
> 1942          * See bug 248 for more details.
> 1943          */
> 1944 
> 1945         if (STREAM && reqkind == NODE_UP_REQUEST && all_backend_down == false)
> 1946         {
> 
> Or, we might have to sustain a kind of private status of a primary node id
> and use this for MASTER_CONNECTINO or TSTATE, and so on.... Anyway, the
> master/primary node used in a child process should not change during a
> query processing.
> 
> > 
> > > The backtrace is here:
> > > =======
> > > (gdb) bt
> > > #0  0x00005651d9733872 in backend_cleanup (backend=<optimized out>, frontend_invalid=frontend_invalid at entry=0 '\000', frontend=0x5651d9a7b960 <child_frontend>) at protocol/child.c:468
> > > #1  0x00005651d97366bd in do_child (fds=fds at entry=0x5651dac73340) at protocol/child.c:417
> > > #2  0x00005651d97105a7 in fork_a_child (fds=0x5651dac73340, id=1) at main/pgpool_main.c:659
> > > #3  0x00005651d97110dd in reaper () at main/pgpool_main.c:2690
> > > #4  0x00005651d9717f8f in PgpoolMain (discard_status=<optimized out>, clear_memcache_oidmaps=<optimized out>) at main/pgpool_main.c:451
> > > #5  0x00005651d970eb32 in main (argc=<optimized out>, argv=0x7ffd862f9da8) at main/main.c:349
> > > (gdb) l
> > > 463		bool		cache_connection = false;
> > > 464	
> > > 465		if (backend == NULL)
> > > 466			return false;
> > > 467	
> > > 468		sp = MASTER_CONNECTION(backend)->sp;
> > > 469	
> > > 470		/*
> > > 471		 * cach connection if connection cache configuration parameter is enabled
> > > 472		 * and frontend connection is not invalid
> > > ======
> > > 
> > > ... and another case:
> > > ===================
> > > gdb) bt
> > > #0  0x000055e08d299097 in ReadyForQuery (frontend=frontend at entry=0x55e08e961508, backend=backend at entry=0x7fe0fefdda18, send_ready=send_ready at entry=1 '\001', 
> > >     cache_commit=cache_commit at entry=1 '\001') at protocol/pool_proto_modules.c:1909
> > > #1  0x000055e08d29ae70 in ProcessBackendResponse (frontend=frontend at entry=0x55e08e961508, backend=backend at entry=0x7fe0fefdda18, state=state at entry=0x7ffe62767fdc, 
> > >     num_fields=num_fields at entry=0x7ffe62767fda) at protocol/pool_proto_modules.c:2904
> > > #2  0x000055e08d28cef9 in pool_process_query (frontend=0x55e08e961508, backend=0x7fe0fefdda18, reset_request=reset_request at entry=0) at protocol/pool_process_query.c:321
> > > #3  0x000055e08d2876aa in do_child (fds=fds at entry=0x55e08e960340) at protocol/child.c:414
> > > #4  0x000055e08d2615a7 in fork_a_child (fds=0x55e08e960340, id=5) at main/pgpool_main.c:659
> > > #5  0x000055e08d2620dd in reaper () at main/pgpool_main.c:2690
> > > #6  0x000055e08d268f8f in PgpoolMain (discard_status=<optimized out>, clear_memcache_oidmaps=<optimized out>) at main/pgpool_main.c:451
> > > #7  0x000055e08d25fb32 in main (argc=<optimized out>, argv=0x7ffe62775508) at main/main.c:349
> > > (gdb) l
> > > 1904				return POOL_END;
> > > 1905	
> > > 1906			/*
> > > 1907			 * Set transaction state for each node
> > > 1908			 */
> > > 1909			state = TSTATE(backend,
> > > 1910						   MASTER_SLAVE ? PRIMARY_NODE_ID : REAL_MASTER_NODE_ID);
> > > 1911	
> > > 1912			for (i = 0; i < NUM_BACKENDS; i++)
> > > 1913			{
> > > =========
> > > 
> > > 
> > > 
> > > 
> > >> > Here is the backtrace from core:
> > >> > =================================
> > >> > Core was generated by `pgpool: accept connection                       '.
> > >> > Program terminated with signal 11, Segmentation fault.
> > >> > #0  0x000000000041b993 in pool_do_auth (frontend=0x1678f28, cp=0x1668f18)
> > >> >     at auth/pool_auth.c:77
> > >> > 77		protoMajor = MASTER_CONNECTION(cp)->sp->major;
> > >> > Missing separate debuginfos, use: debuginfo-install libmemcached-0.31-1.1.el6.x86_64
> > >> > (gdb) bt
> > >> > #0  0x000000000041b993 in pool_do_auth (frontend=0x1678f28, cp=0x1668f18)
> > >> >     at auth/pool_auth.c:77
> > >> > #1  0x000000000042377f in connect_backend (sp=0x167ae78, frontend=0x1678f28)
> > >> >     at protocol/child.c:954
> > >> > #2  0x0000000000423fdd in get_backend_connection (frontend=0x1678f28)
> > >> >     at protocol/child.c:2396
> > >> > #3  0x0000000000424b94 in do_child (fds=0x16584f0) at protocol/child.c:337
> > >> > #4  0x000000000040682d in fork_a_child (fds=0x16584f0, id=372)
> > >> >     at main/pgpool_main.c:758
> > >> > #5  0x0000000000409941 in failover () at main/pgpool_main.c:2102
> > >> > #6  0x000000000040cb40 in PgpoolMain (discard_status=<value optimized out>, 
> > >> >     clear_memcache_oidmaps=<value optimized out>) at main/pgpool_main.c:476
> > >> > #7  0x0000000000405c44 in main (argc=<value optimized out>, 
> > >> >     argv=<value optimized out>) at main/main.c:317
> > >> > (gdb) l
> > >> > 72		int authkind;
> > >> > 73		int i;
> > >> > 74		StartupPacket *sp;
> > >> > 75		
> > >> > 76	
> > >> > 77		protoMajor = MASTER_CONNECTION(cp)->sp->major;
> > >> > 78	
> > >> > 79		kind = pool_read_kind(cp);
> > >> > 80		if (kind < 0)
> > >> > 81			ereport(ERROR,
> > >> > =======================================-
> > >> > 
> > >> > Here is a snippet of the pgpool log. PID 5067 has a segfault.
> > >> > ==================
> > >> > (snip)
> > >> > 2019-02-23 18:41:35:MAIN(2743):[No Connection]:[No Connection]: LOG:  starting degeneration. shutdown host xxxxxxxx(xxxx)
> > >> > 2019-02-23 18:41:35:MAIN(2743):[No Connection]:[No Connection]: LOG:  Restart all children
> > >> > 2019-02-23 18:41:35:CHILD(5067):[No Connection]:[No Connection]: LOG:  new connection received
> > >> > 2019-02-23 18:41:35:CHILD(5067):[No Connection]:[No Connection]: DETAIL:  connecting host=xxxxxx port=xxxx
> > >> > (snip)
> > >> > 2019-02-23 18:41:37:MAIN(2743):[No Connection]:[No Connection]: LOG:  child process with pid: 5066 exits with status 0
> > >> > 2019-02-23 18:41:37:MAIN(2743):[No Connection]:[No Connection]: LOG:  child process with pid: 5066 exited with success and will not be restarted
> > >> > 2019-02-23 18:41:37:MAIN(2743):[No Connection]:[No Connection]: WARNING:  child process with pid: 5067 was terminated by segmentation fault
> > >> > 2019-02-23 18:41:37:MAIN(2743):[No Connection]:[No Connection]: LOG:  child process with pid: 5067 exited with success and will not be restarted
> > >> > 2019-02-23 18:41:37:MAIN(2743):[No Connection]:[No Connection]: LOG:  child process with pid: 5068 exits with status 0
> > >> > 2019-02-23 18:41:37:MAIN(2743):[No Connection]:[No Connection]: LOG:  child process with pid: 5068 exited with success and will not be restarted
> > >> > (snip)
> > >> > ===================
> > >> > 
> > >> > 
> > >> > 
> > >> > Regards,
> > >> > -- 
> > >> > Yugo Nagata <nagata at sraoss.co.jp>
> > > 
> > > 
> > > -- 
> > > Yugo Nagata <nagata at sraoss.co.jp>
> 
> 
> -- 
> Yugo Nagata <nagata at sraoss.co.jp>
> _______________________________________________
> pgpool-hackers mailing list
> pgpool-hackers at pgpool.net
> http://www.pgpool.net/mailman/listinfo/pgpool-hackers

-- 
Yugo Nagata <nagata at sraoss.co.jp>