[pgpool-hackers: 2907] Re: segmentation fault in pgpool child

Tatsuo Ishii ishii at sraoss.co.jp
Thu Aug 2 11:53:09 JST 2018


The cause of the segfault is cp == 0.

MASTER_CONNECTION refers to the connection to the "master"
node. "Master" means the first live backend appearing in
pgpool.conf. The master node is determined at the time of failover.
Unfortunately, when both health check and fail_over_on_backend_error
are disabled, there's no chance of failover, which means the master
node id remains at the default value 0. So MASTER_CONNECTION
refers to node 0, and the connection is NULL (cp == 0).
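
To illustrate the failure mode, here is a minimal, self-contained
model. It is not pgpool code: the types, MODEL_MASTER_CONNECTION and
master_protocol_major are hypothetical stand-ins, and the slot layout
is an assumption made only for this sketch. It shows why a master
node id stuck at 0 leads to a NULL dereference, and what a guarded
access looks like.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_BACKENDS 2  /* hypothetical: two backends, as in the report */

/* Hypothetical stand-ins for the per-backend connection structures. */
typedef struct { int major; } StartupInfo;
typedef struct { StartupInfo *sp; } BackendSlot;
typedef struct { BackendSlot *slots[NUM_BACKENDS]; } ConnectionPool;

/* Models the master node id in shared memory. It stays at the
 * default 0 when no failover ever runs. */
static int master_node_id = 0;

/* Simplified model of the macro: pick the slot of the master node. */
#define MODEL_MASTER_CONNECTION(p) ((p)->slots[master_node_id])

/* Returns the protocol major version of the master connection, or -1
 * when the master slot is NULL. Without the NULL check, the access
 * master->sp->major is the kind of dereference that crashes. */
static int master_protocol_major(const ConnectionPool *pool)
{
    assert(pool != NULL);
    BackendSlot *master = MODEL_MASTER_CONNECTION(pool);
    if (master == NULL)
        return -1;
    return master->sp->major;
}
```

In this model, a pool whose slot 0 is NULL while slot 1 is connected
makes master_protocol_major return -1 as long as master_node_id is 0.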

Attached is the proposed patch to fix the problem.

What it does is:

If an attempt to connect to a backend fails, check the master node id
in the shared memory.  If the master node id is the failed node, then
look for a new master node using get_next_master_node (this was a
static function, but it is now public) and store that node id as the
master node id in the shared memory area.
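
The logic described above can be sketched roughly as follows. This is
a simplified model, not the attached patch: SharedState,
find_next_master and on_backend_connect_failure are hypothetical
names, and tracking liveness with a node_up flag is an assumption of
the sketch. find_next_master only plays the role that
get_next_master_node plays in the description.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_BACKENDS 2  /* hypothetical: two backends, as in the report */

/* Hypothetical model of the state kept in the shared memory area. */
typedef struct {
    int  master_node_id;          /* defaults to 0 */
    bool node_up[NUM_BACKENDS];   /* assumed liveness flag per node */
} SharedState;

/* Returns the first node marked up, or -1 if no node is up. */
static int find_next_master(const SharedState *st)
{
    for (int i = 0; i < NUM_BACKENDS; i++) {
        if (st->node_up[i])
            return i;
    }
    return -1;
}

/* Called when connecting to failed_node did not succeed. If that node
 * is the current master, move the master node id to the next live
 * node; otherwise leave the master node id untouched. */
static void on_backend_connect_failure(SharedState *st, int failed_node)
{
    assert(failed_node >= 0 && failed_node < NUM_BACKENDS);
    st->node_up[failed_node] = false;  /* assumption of this model */

    if (st->master_node_id == failed_node) {
        int next = find_next_master(st);
        if (next >= 0)
            st->master_node_id = next;
    }
}
```

With two nodes up and the master node id at 0, a failed connection to
node 0 moves the master node id to 1 in this model, so a later lookup
of the master connection no longer lands on the empty slot.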

All the regression tests passed.

> Hi Ishii-San
> 
> The Pgpool-II child process is terminating with a segmentation fault, and
> the issue seems to be similar to the one you fixed recently with
> always_master mode.
> 
> The crash is happening in the MASTER_CONNECTION() macro
> 
> pool_do_auth() line 78
> protoMajor = MASTER_CONNECTION(cp)->sp->major;
> 
> 
> Steps to reproduce
> ====
> 1- point backend0 to the slave PostgreSQL node and backend1 to the master
> PostgreSQL node
> 
> backend_hostname0 = slave_PG
> backend_hostname1 = master_PG
> 
> 2- disable health check
> health_check_period = 0
> 
> 3- disable failover on backend error
> fail_over_on_backend_error = off
> 
> 4- shutdown the slave PostgreSQL server.
> 
> 5- start Pgpool
> 
> 6- Now try connecting to Pgpool
> 
> ]$ bin/psql postgres -p 9999 -U pgpool
> psql: server closed the connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
> 
> 
> Related pgpool logs
> ===========
> 2018-08-01 17:17:46: pid 4422: DEBUG:  I am 4422 accept fd 7
> 2018-08-01 17:17:46: pid 4422: LOCATION:  child.c:2187
> 2018-08-01 17:17:46: pid 4422: DEBUG:  reading startup packet
> 2018-08-01 17:17:46: pid 4422: DETAIL:  application_name: psql
> 2018-08-01 17:17:46: pid 4422: LOCATION:  child.c:588
> 2018-08-01 17:17:46: pid 4422: DEBUG:  reading startup packet
> 2018-08-01 17:17:46: pid 4422: DETAIL:  Protocol Major: 3 Minor: 0
> database: postgres user: pgpool
> 2018-08-01 17:17:46: pid 4422: LOCATION:  child.c:630
> 2018-08-01 17:17:46: pid 4422: DEBUG:  creating new connection to backend
> 2018-08-01 17:17:46: pid 4422: DETAIL:  connecting 0 backend
> 2018-08-01 17:17:46: pid 4422: LOCATION:  pool_connection_pool.c:837
> 2018-08-01 17:17:46: pid 4422: LOG:  failed to connect to PostgreSQL server
> on "127.0.0.1:5432", getsockopt() detected error "Connection refused"
> 2018-08-01 17:17:46: pid 4422: LOCATION:  pool_connection_pool.c:660
> 2018-08-01 17:17:46: pid 4422: LOG:  failed to create a backend 0 connection
> 2018-08-01 17:17:46: pid 4422: DETAIL:  skip this backend because because
> fail_over_on_backend_error is off and we are in streaming replication mode
> and node is standby node
> 2018-08-01 17:17:46: pid 4422: LOCATION:  pool_connection_pool.c:888
> 2018-08-01 17:17:46: pid 4422: DEBUG:  creating new connection to backend
> 2018-08-01 17:17:46: pid 4422: DETAIL:  connecting 1 backend
> 2018-08-01 17:17:46: pid 4422: LOCATION:  pool_connection_pool.c:837
> 2018-08-01 17:17:46: pid 4422: DEBUG:  attempting to negotiate a secure
> connection
> 2018-08-01 17:17:46: pid 4422: DETAIL:  sending client->server SSL request
> 2018-08-01 17:17:46: pid 4422: LOCATION:  pool_ssl.c:77
> 2018-08-01 17:17:46: pid 4422: DEBUG:  pool_flush_it: flush size: 8
> 2018-08-01 17:17:46: pid 4422: LOCATION:  pool_stream.c:614
> 2018-08-01 17:17:46: pid 4422: DEBUG:  pool_read: read 1 bytes from backend
> 1
> 2018-08-01 17:17:46: pid 4422: LOCATION:  pool_stream.c:191
> 2018-08-01 17:17:46: pid 4422: DEBUG:  attempting to negotiate a secure
> connection
> 2018-08-01 17:17:46: pid 4422: DETAIL:  client->server SSL response: N
> 2018-08-01 17:17:46: pid 4422: LOCATION:  pool_ssl.c:89
> 2018-08-01 17:17:46: pid 4422: DEBUG:  attempting to negotiate a secure
> connection
> 2018-08-01 17:17:46: pid 4422: DETAIL:  server doesn't want to talk SSL
> 2018-08-01 17:17:46: pid 4422: LOCATION:  pool_ssl.c:105
> 2018-08-01 17:17:46: pid 4422: DEBUG:  pool_flush_it: flush size: 82
> 2018-08-01 17:17:46: pid 4422: LOCATION:  pool_stream.c:614
> 2018-08-01 17:17:46: pid 4421: DEBUG:  reaper handler
> 2018-08-01 17:17:46: pid 4421: LOCATION:  pgpool_main.c:2411
> 2018-08-01 17:17:46: pid 4421: WARNING:  child process with pid: 4422 was
> terminated by segmentation fault
> 2018-08-01 17:17:46: pid 4421: LOCATION:  pgpool_main.c:2462
> 
> backtrace (generated from V3_7_STABLE head)
> =========
> #0  0x000000000041d683 in pool_do_auth (frontend=frontend@entry=0x22dd818,
> cp=cp@entry=0x7f2b29382358) at auth/pool_auth.c:78
> #1  0x0000000000424ccf in connect_backend (sp=sp@entry=0x22df768,
> frontend=frontend@entry=0x22dd818) at protocol/child.c:911
> #2  0x0000000000426c9f in get_backend_connection (frontend=0x22dd818) at
> protocol/child.c:2352
> #3  do_child (fds=fds@entry=0x22d5710) at protocol/child.c:338
> #4  0x0000000000408b85 in fork_a_child (fds=0x22d5710, id=0) at
> main/pgpool_main.c:611
> #5  0x000000000040e88b in PgpoolMain (discard_status=discard_status@entry=1
> '\001', clear_memcache_oidmaps=clear_memcache_oidmaps@entry=0 '\000')
>     at main/pgpool_main.c:367
> #6  0x0000000000407071 in main (argc=<optimized out>, argv=<optimized out>)
> at main/main.c:318
> 
> 
> 
> Thanks
> Best Regards
> Muhammad Usama
-------------- next part --------------
A non-text attachment was scrubbed...
Name: fix_master_node_down.diff
Type: text/x-patch
Size: 2109 bytes
Desc: not available
URL: <http://www.sraoss.jp/pipermail/pgpool-hackers/attachments/20180802/5f5066fc/attachment.bin>

