[pgpool-general: 5671] Re: Segfault in child process
Tatsuo Ishii
ishii at sraoss.co.jp
Wed Aug 2 07:42:05 JST 2017
Thanks for the detailed report. Around line 309 of
context/pool_query_context.c:309 is like this:
if (sc->query_context)
{
int node_id = sc->query_context->virtual_master_node_id;
My guess is sc->query_context has some garbage pointer. In fact
sc->query_context is only meaningful when a query is in progress
state. The 3.6 stable head has added additional check here:
if (sc->in_progress && sc->query_context)
{
int node_id = sc->query_context->virtual_master_node_id;
Can you please try this change?
Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
> I attached gdb to some of the child processes, and captured a couple of segfaults, which look like this:
>
> Program received signal SIGSEGV, Segmentation fault.
> pool_virtual_master_db_node_id () at context/pool_query_context.c:309
> 309 int node_id = sc->query_context->virtual_master_node_id;
>
> (gdb) bt
> #0 pool_virtual_master_db_node_id () at context/pool_query_context.c:309
> #1 0x000000000042cbb5 in read_kind_from_backend (frontend=0x26343f0, backend=0x2632d50,
> decided_kind=0x7ffdd9c3908f "") at protocol/pool_process_query.c:3205
> #2 0x0000000000431a76 in ProcessBackendResponse (frontend=0x26343f0, backend=0x2632d50,
> state=0x7ffdd9c39108, num_fields=0x7ffdd9c3910e) at protocol/pool_proto_modules.c:2534
> #3 0x000000000042b93d in pool_process_query (frontend=<value optimized out>,
> backend=0x2632d50, reset_request=0) at protocol/pool_process_query.c:303
> #4 0x0000000000424630 in do_child (fds=0x262ae90) at protocol/child.c:377
> #5 0x000000000040659d in fork_a_child (fds=0x262ae90, id=6) at main/pgpool_main.c:755
> #6 0x0000000000408023 in reaper () at main/pgpool_main.c:2525
> #7 0x000000000040c96d in PgpoolMain (discard_status=<value optimized out>,
> clear_memcache_oidmaps=<value optimized out>) at main/pgpool_main.c:479
> #8 0x00000000004058eb in main (argc=<value optimized out>, argv=<value optimized out>)
> at main/main.c:300
>
>
> Thanks,
> Jeremiah
>
>
> -----Original Message-----
> From: pgpool-general-bounces at pgpool.net [mailto:pgpool-general-bounces at pgpool.net] On Behalf Of Jeremiah Penery
> Sent: Tuesday, August 01, 2017 12:06 PM
> To: pgpool-general at pgpool.net
> Subject: [pgpool-general: 5669] Segfault in child process
>
> Pgpool-II 3.6.4, Postgres 9.6.3
>
> We're occasionally seeing segfaults in pgpool child processes, but it's not clear why. It's causing us some major problems, because it often happens in a transaction, and eventually the application connection pool hangs with all connections in use (stuck in transaction).
>
> I've attached some logs from the process in question. Is there some better way to get information on why this is happening?
>
> 2017-08-01 11:52:00: pid 11834: LOCATION: pool_process_query.c:3196
> 2017-08-01 11:52:00: pid 11834: DEBUG: session context: setting query in progress. DONE
> 2017-08-01 11:52:00: pid 11834: LOCATION: pool_session_context.c:226
> 2017-08-01 11:52:00: pid 11834: DEBUG: pool_virtual_master_db_node_id: virtual_master_node_id:0 load_balance_node_id:0 PRIMARY_NODE_ID:0
> 2017-08-01 11:52:00: pid 11834: LOCATION: pool_query_context.c:330
> 2017-08-01 11:52:00: pid 11834: DEBUG: reading backend data packet kind
> 2017-08-01 11:52:00: pid 11834: DETAIL: master node id: 0
> 2017-08-01 11:52:00: pid 11834: LOCATION: pool_process_query.c:3207
> 2017-08-01 11:52:00: pid 11834: DEBUG: pool_virtual_master_db_node_id: virtual_master_node_id:0 load_balance_node_id:0 PRIMARY_NODE_ID:0
> 2017-08-01 11:52:00: pid 11834: LOCATION: pool_query_context.c:330
> 2017-08-01 11:52:00: pid 11834: DEBUG: pool_read: read 100 bytes from backend 0
> 2017-08-01 11:52:00: pid 11834: LOCATION: pool_stream.c:190
> 2017-08-01 11:52:00: pid 11834: DEBUG: pool_virtual_master_db_node_id: virtual_master_node_id:0 load_balance_node_id:0 PRIMARY_NODE_ID:0
> 2017-08-01 11:52:00: pid 11834: LOCATION: pool_query_context.c:330
> 2017-08-01 11:52:00: pid 11834: DEBUG: reading backend data packet kind
> 2017-08-01 11:52:00: pid 11834: DETAIL: backend:0 kind:'2'
> 2017-08-01 11:52:00: pid 11834: LOCATION: pool_process_query.c:3267
> 2017-08-01 11:52:00: pid 11834: DEBUG: reading backend data packet kind
> 2017-08-01 11:52:00: pid 11834: DETAIL: backend:0 of 1 kind = '2'
> 2017-08-01 11:52:00: pid 11834: LOCATION: pool_process_query.c:3315
> 2017-08-01 11:52:00: pid 11834: DEBUG: read_kind_from_backend max_count:1.000000 num_executed_nodes:1
> 2017-08-01 11:52:00: pid 11834: LOCATION: pool_process_query.c:3331
> 2017-08-01 11:52:00: pid 11834: DEBUG: read_kind_from_backend: pending message was pulled out
> 2017-08-01 11:52:00: pid 11834: LOCATION: pool_process_query.c:3621
> 2017-08-01 11:52:00: pid 11834: DEBUG: pool_pending_message_pull_out: message type:2 message len:5 query:SELECT * FROM quartz.qrtz_SCHEDULER_STATE WHERE SCHED_NAME = 'scheduler' statement: portal: node_ids[0]:0 node_ids[1]:-1
> 2017-08-01 11:52:00: pid 11834: LOCATION: pool_session_context.c:1218
> 2017-08-01 11:52:00: pid 11834: DEBUG: processing backend response
> 2017-08-01 11:52:00: pid 11834: DETAIL: received kind '2'(32) from backend
> 2017-08-01 11:52:00: pid 11834: LOCATION: pool_proto_modules.c:2548
> 2017-08-01 11:52:00: pid 11834: DEBUG: pool_virtual_master_db_node_id: virtual_master_node_id:0 load_balance_node_id:0 PRIMARY_NODE_ID:0
> 2017-08-01 11:52:00: pid 11834: LOCATION: pool_query_context.c:330
> 2017-08-01 11:52:00: pid 11834: DEBUG: pool_virtual_master_db_node_id: virtual_master_node_id:0 load_balance_node_id:0 PRIMARY_NODE_ID:0
> 2017-08-01 11:52:00: pid 11834: LOCATION: pool_query_context.c:330
> 2017-08-01 11:52:00: pid 11834: DEBUG: pool_virtual_master_db_node_id: virtual_master_node_id:0 load_balance_node_id:0 PRIMARY_NODE_ID:0
> 2017-08-01 11:52:00: pid 11834: LOCATION: pool_query_context.c:330
> 2017-08-01 11:52:00: pid 11834: DEBUG: SimpleForwardToFrontend: packet:2 length:0
> 2017-08-01 11:52:00: pid 11834: LOCATION: pool_process_query.c:818
> 2017-08-01 11:52:00: pid 11834: DEBUG: session context: setting command success. DONE
> 2017-08-01 11:52:00: pid 11834: LOCATION: pool_session_context.c:773
> 2017-08-01 11:52:00: pid 11834: DEBUG: session context: unsetting query in progress. DONE
> 2017-08-01 11:52:00: pid 11834: LOCATION: pool_session_context.c:237
> 2017-08-01 11:52:00: pid 11834: DEBUG: read_kind_from_backend: no pending message
> 2017-08-01 11:52:00: pid 11834: LOCATION: pool_process_query.c:3143
> 2017-08-01 11:52:00: pid 11834: DEBUG: read_kind_from_backend: no pending message, previous message exists, rows returning
> 2017-08-01 11:52:00: pid 11834: LOCATION: pool_process_query.c:3167
> 2017-08-01 11:52:00: pid 11834: DEBUG: session context: setting query in progress. DONE
> 2017-08-01 11:52:00: pid 11834: LOCATION: pool_session_context.c:226
> 2017-08-01 11:52:00: pid 10825: DEBUG: reaper handler
> 2017-08-01 11:52:00: pid 10825: LOCATION: pgpool_main.c:2391
> 2017-08-01 11:52:00: pid 10825: WARNING: child process with pid: 11834 was terminated by segmentation fault
>
> Thanks,
> Jeremiah
> _______________________________________________
> pgpool-general mailing list
> pgpool-general at pgpool.net
> http://www.pgpool.net/mailman/listinfo/pgpool-general
> _______________________________________________
> pgpool-general mailing list
> pgpool-general at pgpool.net
> http://www.pgpool.net/mailman/listinfo/pgpool-general
More information about the pgpool-general
mailing list