[pgpool-general: 5671] Re: Segfault in child process

Tatsuo Ishii ishii at sraoss.co.jp
Wed Aug 2 07:42:05 JST 2017


Thanks for the detailed report. The code around line 309 of
context/pool_query_context.c looks like this:

	if (sc->query_context)
	{
		int node_id = sc->query_context->virtual_master_node_id;

My guess is that sc->query_context holds a garbage pointer. In fact,
sc->query_context is only meaningful while a query is in progress.
The 3.6 stable head adds an additional check here:

	if (sc->in_progress && sc->query_context)
	{
		int node_id = sc->query_context->virtual_master_node_id;

Can you please try this change?
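
To illustrate why the extra check should help, here is a minimal
standalone sketch. It is not the actual pgpool source: the struct
layouts, the function name guarded_virtual_master_node_id and the
FALLBACK_NODE_ID value are all hypothetical stand-ins. The only point is
that the pointer is never dereferenced unless the in-progress flag is
set.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the real pgpool structures. */
typedef struct
{
	int		virtual_master_node_id;
} QueryContext;

typedef struct
{
	bool		in_progress;	/* true only while a query is being processed */
	QueryContext *query_context;	/* may be left over once in_progress is false */
} SessionContext;

/* Hypothetical value returned when no query is in progress. */
#define FALLBACK_NODE_ID (-1)

/*
 * Read virtual_master_node_id only when the session says a query is in
 * progress. Because && short-circuits, query_context is not dereferenced
 * when in_progress is false, even if the pointer is non-NULL.
 */
static int
guarded_virtual_master_node_id(const SessionContext *sc)
{
	if (sc && sc->in_progress && sc->query_context)
		return sc->query_context->virtual_master_node_id;

	return FALLBACK_NODE_ID;
}
```

With only the original test (sc->query_context alone), a non-NULL but
invalid pointer would still be dereferenced, which would match the
SIGSEGV at line 309 in your backtrace.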

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

> I attached gdb to some of the child processes, and captured a couple of segfaults, which look like this:
> 
> Program received signal SIGSEGV, Segmentation fault.
> pool_virtual_master_db_node_id () at context/pool_query_context.c:309
> 309                     int node_id = sc->query_context->virtual_master_node_id;
> 
> (gdb) bt
> #0  pool_virtual_master_db_node_id () at context/pool_query_context.c:309
> #1  0x000000000042cbb5 in read_kind_from_backend (frontend=0x26343f0, backend=0x2632d50,
>     decided_kind=0x7ffdd9c3908f "") at protocol/pool_process_query.c:3205
> #2  0x0000000000431a76 in ProcessBackendResponse (frontend=0x26343f0, backend=0x2632d50,
>     state=0x7ffdd9c39108, num_fields=0x7ffdd9c3910e) at protocol/pool_proto_modules.c:2534
> #3  0x000000000042b93d in pool_process_query (frontend=<value optimized out>,
>     backend=0x2632d50, reset_request=0) at protocol/pool_process_query.c:303
> #4  0x0000000000424630 in do_child (fds=0x262ae90) at protocol/child.c:377
> #5  0x000000000040659d in fork_a_child (fds=0x262ae90, id=6) at main/pgpool_main.c:755
> #6  0x0000000000408023 in reaper () at main/pgpool_main.c:2525
> #7  0x000000000040c96d in PgpoolMain (discard_status=<value optimized out>,
>     clear_memcache_oidmaps=<value optimized out>) at main/pgpool_main.c:479
> #8  0x00000000004058eb in main (argc=<value optimized out>, argv=<value optimized out>)
>     at main/main.c:300
> 
> 
> Thanks,
> Jeremiah
> 
> 
> -----Original Message-----
> From: pgpool-general-bounces at pgpool.net [mailto:pgpool-general-bounces at pgpool.net] On Behalf Of Jeremiah Penery
> Sent: Tuesday, August 01, 2017 12:06 PM
> To: pgpool-general at pgpool.net
> Subject: [pgpool-general: 5669] Segfault in child process
> 
> Pgpool-II 3.6.4, Postgres 9.6.3
> 
> We're occasionally seeing segfaults in pgpool child processes, but it's not clear why.  It's causing us some major problems, because it often happens in a transaction, and eventually the application connection pool hangs with all connections in use (stuck in transaction).
> 
> I've attached some logs from the process in question.  Is there some better way to get information on why this is happening?
> 
> 2017-08-01 11:52:00: pid 11834: LOCATION:  pool_process_query.c:3196
> 2017-08-01 11:52:00: pid 11834: DEBUG:  session context: setting query in progress. DONE
> 2017-08-01 11:52:00: pid 11834: LOCATION:  pool_session_context.c:226
> 2017-08-01 11:52:00: pid 11834: DEBUG:  pool_virtual_master_db_node_id: virtual_master_node_id:0 load_balance_node_id:0 PRIMARY_NODE_ID:0
> 2017-08-01 11:52:00: pid 11834: LOCATION:  pool_query_context.c:330
> 2017-08-01 11:52:00: pid 11834: DEBUG:  reading backend data packet kind
> 2017-08-01 11:52:00: pid 11834: DETAIL:  master node id: 0
> 2017-08-01 11:52:00: pid 11834: LOCATION:  pool_process_query.c:3207
> 2017-08-01 11:52:00: pid 11834: DEBUG:  pool_virtual_master_db_node_id: virtual_master_node_id:0 load_balance_node_id:0 PRIMARY_NODE_ID:0
> 2017-08-01 11:52:00: pid 11834: LOCATION:  pool_query_context.c:330
> 2017-08-01 11:52:00: pid 11834: DEBUG:  pool_read: read 100 bytes from backend 0
> 2017-08-01 11:52:00: pid 11834: LOCATION:  pool_stream.c:190
> 2017-08-01 11:52:00: pid 11834: DEBUG:  pool_virtual_master_db_node_id: virtual_master_node_id:0 load_balance_node_id:0 PRIMARY_NODE_ID:0
> 2017-08-01 11:52:00: pid 11834: LOCATION:  pool_query_context.c:330
> 2017-08-01 11:52:00: pid 11834: DEBUG:  reading backend data packet kind
> 2017-08-01 11:52:00: pid 11834: DETAIL:  backend:0 kind:'2'
> 2017-08-01 11:52:00: pid 11834: LOCATION:  pool_process_query.c:3267
> 2017-08-01 11:52:00: pid 11834: DEBUG:  reading backend data packet kind
> 2017-08-01 11:52:00: pid 11834: DETAIL:  backend:0 of 1 kind = '2'
> 2017-08-01 11:52:00: pid 11834: LOCATION:  pool_process_query.c:3315
> 2017-08-01 11:52:00: pid 11834: DEBUG:  read_kind_from_backend max_count:1.000000 num_executed_nodes:1
> 2017-08-01 11:52:00: pid 11834: LOCATION:  pool_process_query.c:3331
> 2017-08-01 11:52:00: pid 11834: DEBUG:  read_kind_from_backend: pending message was pulled out
> 2017-08-01 11:52:00: pid 11834: LOCATION:  pool_process_query.c:3621
> 2017-08-01 11:52:00: pid 11834: DEBUG:  pool_pending_message_pull_out: message type:2 message len:5 query:SELECT * FROM quartz.qrtz_SCHEDULER_STATE WHERE SCHED_NAME = 'scheduler' statement: portal: node_ids[0]:0 node_ids[1]:-1
> 2017-08-01 11:52:00: pid 11834: LOCATION:  pool_session_context.c:1218
> 2017-08-01 11:52:00: pid 11834: DEBUG:  processing backend response
> 2017-08-01 11:52:00: pid 11834: DETAIL:  received kind '2'(32) from backend
> 2017-08-01 11:52:00: pid 11834: LOCATION:  pool_proto_modules.c:2548
> 2017-08-01 11:52:00: pid 11834: DEBUG:  pool_virtual_master_db_node_id: virtual_master_node_id:0 load_balance_node_id:0 PRIMARY_NODE_ID:0
> 2017-08-01 11:52:00: pid 11834: LOCATION:  pool_query_context.c:330
> 2017-08-01 11:52:00: pid 11834: DEBUG:  pool_virtual_master_db_node_id: virtual_master_node_id:0 load_balance_node_id:0 PRIMARY_NODE_ID:0
> 2017-08-01 11:52:00: pid 11834: LOCATION:  pool_query_context.c:330
> 2017-08-01 11:52:00: pid 11834: DEBUG:  pool_virtual_master_db_node_id: virtual_master_node_id:0 load_balance_node_id:0 PRIMARY_NODE_ID:0
> 2017-08-01 11:52:00: pid 11834: LOCATION:  pool_query_context.c:330
> 2017-08-01 11:52:00: pid 11834: DEBUG:  SimpleForwardToFrontend: packet:2 length:0
> 2017-08-01 11:52:00: pid 11834: LOCATION:  pool_process_query.c:818
> 2017-08-01 11:52:00: pid 11834: DEBUG:  session context: setting command success. DONE
> 2017-08-01 11:52:00: pid 11834: LOCATION:  pool_session_context.c:773
> 2017-08-01 11:52:00: pid 11834: DEBUG:  session context: unsetting query in progress. DONE
> 2017-08-01 11:52:00: pid 11834: LOCATION:  pool_session_context.c:237
> 2017-08-01 11:52:00: pid 11834: DEBUG:  read_kind_from_backend: no pending message
> 2017-08-01 11:52:00: pid 11834: LOCATION:  pool_process_query.c:3143
> 2017-08-01 11:52:00: pid 11834: DEBUG:  read_kind_from_backend: no pending message, previous message exists, rows returning
> 2017-08-01 11:52:00: pid 11834: LOCATION:  pool_process_query.c:3167
> 2017-08-01 11:52:00: pid 11834: DEBUG:  session context: setting query in progress. DONE
> 2017-08-01 11:52:00: pid 11834: LOCATION:  pool_session_context.c:226
> 2017-08-01 11:52:00: pid 10825: DEBUG:  reaper handler
> 2017-08-01 11:52:00: pid 10825: LOCATION:  pgpool_main.c:2391
> 2017-08-01 11:52:00: pid 10825: WARNING:  child process with pid: 11834 was terminated by segmentation fault
> 
> Thanks,
> Jeremiah
> _______________________________________________
> pgpool-general mailing list
> pgpool-general at pgpool.net
> http://www.pgpool.net/mailman/listinfo/pgpool-general