[pgpool-general: 5679] Re: Segfault in child process

Jeremiah Penery jpenery at redskytech.com
Thu Aug 3 00:50:10 JST 2017


> -----Original Message-----
> From: Tatsuo Ishii [mailto:ishii at sraoss.co.jp]
> Sent: Tuesday, August 01, 2017 5:42 PM
> To: Jeremiah Penery
> Cc: pgpool-general at pgpool.net
> Subject: Re: [pgpool-general: 5670] Re: Segfault in child process
> 
> Thanks for the detailed report. Around line 309 of
> context/pool_query_context.c:309 is like this:
> 
> 	if (sc->query_context)
> 	{
> 		int node_id = sc->query_context->virtual_master_node_id;
> 
> My guess is sc->query_context has some garbage pointer. In fact
> sc->query_context is only meaningful when a query is in progress
> state. The 3.6 stable head has added additional check here:
> 
> 	if (sc->in_progress && sc->query_context)
> 	{
> 		int node_id = sc->query_context->virtual_master_node_id;
> 
> Can you please try this change?
 
Thanks for the reply.

I haven't tried updating to HEAD yet, but I updated to 3.6.5 and I'm not seeing the problem anymore.
In the log from 3.6.4, you can see that it was setting query in progress before the segfault:

> > 2017-08-01 11:52:00: pid 11834: DEBUG:  session context: setting query in progress. DONE
> > 2017-08-01 11:52:00: pid 11834: LOCATION:  pool_session_context.c:226
> > 2017-08-01 11:52:00: pid 10825: DEBUG:  reaper handler
> > 2017-08-01 11:52:00: pid 10825: LOCATION:  pgpool_main.c:2391
> > 2017-08-01 11:52:00: pid 10825: WARNING:  child process with pid: 11834 was terminated by segmentation fault

In 3.6.5, that behavior seems to be different, but I haven't been able to figure out exactly why it changed, or which commit would have done that.

I'm going to try building and testing HEAD later today, but for our production use we will have to use only released versions.

Thanks,
Jeremiah

> > I attached gdb to some of the child processes, and captured a couple of
> segfaults, which look like this:
> >
> > Program received signal SIGSEGV, Segmentation fault.
> > pool_virtual_master_db_node_id () at context/pool_query_context.c:309
> > 309                     int node_id = sc->query_context->virtual_master_node_id;
> >
> > (gdb) bt
> > #0  pool_virtual_master_db_node_id () at
> context/pool_query_context.c:309
> > #1  0x000000000042cbb5 in read_kind_from_backend
> (frontend=0x26343f0, backend=0x2632d50,
> >     decided_kind=0x7ffdd9c3908f "") at protocol/pool_process_query.c:3205
> > #2  0x0000000000431a76 in ProcessBackendResponse (frontend=0x26343f0,
> backend=0x2632d50,
> >     state=0x7ffdd9c39108, num_fields=0x7ffdd9c3910e) at
> protocol/pool_proto_modules.c:2534
> > #3  0x000000000042b93d in pool_process_query (frontend=<value
> optimized out>,
> >     backend=0x2632d50, reset_request=0) at
> protocol/pool_process_query.c:303
> > #4  0x0000000000424630 in do_child (fds=0x262ae90) at protocol/child.c:377
> > #5  0x000000000040659d in fork_a_child (fds=0x262ae90, id=6) at
> main/pgpool_main.c:755
> > #6  0x0000000000408023 in reaper () at main/pgpool_main.c:2525
> > #7  0x000000000040c96d in PgpoolMain (discard_status=<value optimized
> out>,
> >     clear_memcache_oidmaps=<value optimized out>) at
> main/pgpool_main.c:479
> > #8  0x00000000004058eb in main (argc=<value optimized out>, argv=<value
> optimized out>)
> >     at main/main.c:300
> >
> >
> > Thanks,
> > Jeremiah
> >
> >
> > -----Original Message-----
> > From: pgpool-general-bounces at pgpool.net [mailto:pgpool-general-
> bounces at pgpool.net] On Behalf Of Jeremiah Penery
> > Sent: Tuesday, August 01, 2017 12:06 PM
> > To: pgpool-general at pgpool.net
> > Subject: [pgpool-general: 5669] Segfault in child process
> >
> > Pgpool-II 3.6.4, Postgres 9.6.3
> >
> > We're occasionally seeing segfaults in pgpool child processes, but it's not
> clear why.  It's causing us some major problems, because it often happens in
> a transaction, and eventually the application connection pool hangs with all
> connections in use (stuck in transaction).
> >
> > I've attached some logs from the process in question.  Is there some better
> way to get information on why this is happening?
> >
> > 2017-08-01 11:52:00: pid 11834: LOCATION:  pool_process_query.c:3196
> > 2017-08-01 11:52:00: pid 11834: DEBUG:  session context: setting query in
> progress. DONE
> > 2017-08-01 11:52:00: pid 11834: LOCATION:  pool_session_context.c:226
> > 2017-08-01 11:52:00: pid 11834: DEBUG:  pool_virtual_master_db_node_id:
> virtual_master_node_id:0 load_balance_node_id:0 PRIMARY_NODE_ID:0
> > 2017-08-01 11:52:00: pid 11834: LOCATION:  pool_query_context.c:330
> > 2017-08-01 11:52:00: pid 11834: DEBUG:  reading backend data packet kind
> > 2017-08-01 11:52:00: pid 11834: DETAIL:  master node id: 0
> > 2017-08-01 11:52:00: pid 11834: LOCATION:  pool_process_query.c:3207
> > 2017-08-01 11:52:00: pid 11834: DEBUG:  pool_virtual_master_db_node_id:
> virtual_master_node_id:0 load_balance_node_id:0 PRIMARY_NODE_ID:0
> > 2017-08-01 11:52:00: pid 11834: LOCATION:  pool_query_context.c:330
> > 2017-08-01 11:52:00: pid 11834: DEBUG:  pool_read: read 100 bytes from
> backend 0
> > 2017-08-01 11:52:00: pid 11834: LOCATION:  pool_stream.c:190
> > 2017-08-01 11:52:00: pid 11834: DEBUG:  pool_virtual_master_db_node_id:
> virtual_master_node_id:0 load_balance_node_id:0 PRIMARY_NODE_ID:0
> > 2017-08-01 11:52:00: pid 11834: LOCATION:  pool_query_context.c:330
> > 2017-08-01 11:52:00: pid 11834: DEBUG:  reading backend data packet kind
> > 2017-08-01 11:52:00: pid 11834: DETAIL:  backend:0 kind:'2'
> > 2017-08-01 11:52:00: pid 11834: LOCATION:  pool_process_query.c:3267
> > 2017-08-01 11:52:00: pid 11834: DEBUG:  reading backend data packet kind
> > 2017-08-01 11:52:00: pid 11834: DETAIL:  backend:0 of 1 kind = '2'
> > 2017-08-01 11:52:00: pid 11834: LOCATION:  pool_process_query.c:3315
> > 2017-08-01 11:52:00: pid 11834: DEBUG:  read_kind_from_backend
> max_count:1.000000 num_executed_nodes:1
> > 2017-08-01 11:52:00: pid 11834: LOCATION:  pool_process_query.c:3331
> > 2017-08-01 11:52:00: pid 11834: DEBUG:  read_kind_from_backend:
> pending message was pulled out
> > 2017-08-01 11:52:00: pid 11834: LOCATION:  pool_process_query.c:3621
> > 2017-08-01 11:52:00: pid 11834: DEBUG:  pool_pending_message_pull_out:
> message type:2 message len:5 query:SELECT * FROM
> quartz.qrtz_SCHEDULER_STATE WHERE SCHED_NAME = 'scheduler'
> statement: portal: node_ids[0]:0 node_ids[1]:-1
> > 2017-08-01 11:52:00: pid 11834: LOCATION:  pool_session_context.c:1218
> > 2017-08-01 11:52:00: pid 11834: DEBUG:  processing backend response
> > 2017-08-01 11:52:00: pid 11834: DETAIL:  received kind '2'(32) from backend
> > 2017-08-01 11:52:00: pid 11834: LOCATION:  pool_proto_modules.c:2548
> > 2017-08-01 11:52:00: pid 11834: DEBUG:  pool_virtual_master_db_node_id:
> virtual_master_node_id:0 load_balance_node_id:0 PRIMARY_NODE_ID:0
> > 2017-08-01 11:52:00: pid 11834: LOCATION:  pool_query_context.c:330
> > 2017-08-01 11:52:00: pid 11834: DEBUG:  pool_virtual_master_db_node_id:
> virtual_master_node_id:0 load_balance_node_id:0 PRIMARY_NODE_ID:0
> > 2017-08-01 11:52:00: pid 11834: LOCATION:  pool_query_context.c:330
> > 2017-08-01 11:52:00: pid 11834: DEBUG:  pool_virtual_master_db_node_id:
> virtual_master_node_id:0 load_balance_node_id:0 PRIMARY_NODE_ID:0
> > 2017-08-01 11:52:00: pid 11834: LOCATION:  pool_query_context.c:330
> > 2017-08-01 11:52:00: pid 11834: DEBUG:  SimpleForwardToFrontend:
> packet:2 length:0
> > 2017-08-01 11:52:00: pid 11834: LOCATION:  pool_process_query.c:818
> > 2017-08-01 11:52:00: pid 11834: DEBUG:  session context: setting command
> success. DONE
> > 2017-08-01 11:52:00: pid 11834: LOCATION:  pool_session_context.c:773
> > 2017-08-01 11:52:00: pid 11834: DEBUG:  session context: unsetting query in
> progress. DONE
> > 2017-08-01 11:52:00: pid 11834: LOCATION:  pool_session_context.c:237
> > 2017-08-01 11:52:00: pid 11834: DEBUG:  read_kind_from_backend: no
> pending message
> > 2017-08-01 11:52:00: pid 11834: LOCATION:  pool_process_query.c:3143
> > 2017-08-01 11:52:00: pid 11834: DEBUG:  read_kind_from_backend: no
> pending message, previous message exists, rows returning
> > 2017-08-01 11:52:00: pid 11834: LOCATION:  pool_process_query.c:3167
> > 2017-08-01 11:52:00: pid 11834: DEBUG:  session context: setting query in
> progress. DONE
> > 2017-08-01 11:52:00: pid 11834: LOCATION:  pool_session_context.c:226
> > 2017-08-01 11:52:00: pid 10825: DEBUG:  reaper handler
> > 2017-08-01 11:52:00: pid 10825: LOCATION:  pgpool_main.c:2391
> > 2017-08-01 11:52:00: pid 10825: WARNING:  child process with pid: 11834
> was terminated by segmentation fault
> >
> > Thanks,
> > Jeremiah
> > _______________________________________________
> > pgpool-general mailing list
> > pgpool-general at pgpool.net
> > http://www.pgpool.net/mailman/listinfo/pgpool-general
> > _______________________________________________
> > pgpool-general mailing list
> > pgpool-general at pgpool.net
> > http://www.pgpool.net/mailman/listinfo/pgpool-general


More information about the pgpool-general mailing list