[pgpool-general: 1371] Re: 3.2.1 segfaults at startup on Fedora17

Yugo Nagata nagata at sraoss.co.jp
Wed Feb 6 10:40:21 JST 2013


On Tue, 5 Feb 2013 13:18:50 -0800
Lonni J Friedman <netllama at gmail.com> wrote:

> On Thu, Jan 31, 2013 at 6:32 PM, Yugo Nagata <nagata at sraoss.co.jp> wrote:
> > Hello Lonni, Tomas,
> >
> > A fix was submitted to the bug tracking system.
> > http://www.pgpool.net/mantisbt/view.php?id=48
> >
> > I confirmed that it resolves the problem.
> > I have attached the same patch. Could you please try it?
> 
> Confirmed, the patch eliminates the segfault when use_watchdog=on with
> Fedora17.  Thanks for fixing it.
> 
> Can I assume that this patch will be applied to the next official 3.2.x release?

Yes. This fix is included in 3.2.2, which will be released this weekend.

> 
> >
> > Yugo Nagata.
> >
> > On Tue, 8 Jan 2013 12:41:34 -0800
> > Lonni J Friedman <netllama at gmail.com> wrote:
> >
> >> Tatsuo,
> >> Is there any estimate on when this bug will be resolved?
> >>
> >> thanks
> >>
> >> On Sat, Dec 1, 2012 at 5:15 AM,  <maniac at localhost.sk> wrote:
> >> > Hello Tatsuo,
> >> >
> >> > there is an additional error reported about /bin/ping failing to execute, although I can see it worked OK (it returned normal results):
> >> >
> >> > 2012-12-01 13:47:55 LOG:   pid 9875: wd_chk_sticy: all commands have sticky bit
> >> > 2012-12-01 13:47:55 LOG:   pid 9875: watchdog might call network commands which using sticky bit.
> >> > 2012-12-01 13:47:57 DEBUG: pid 9875: get_result: ping data: PING x.y.2.1 (x.y.2.1) 56(84) bytes of data.
> >> >
> >> > --- x.y.2.1 ping statistics ---
> >> > 3 packets transmitted, 3 received, 0% packet loss, time 1999ms
> >> > rtt min/avg/max/mdev = 0.307/0.329/0.343/0.026 ms
> >> >
> >> > 2012-12-01 13:47:57 LOG:   pid 9875: wd_create_send_socket: connect() reports failure (Connection refused). You can safely ignore this while starting up.
> >> > 2012-12-01 13:48:00 ERROR: pid 9875: exec_ping: /bin/ping exited abnormaly
> >> > Segmentation fault
> >> >
> >> >
> >> > I am attaching the full stdout+stderr output, and the same for a run under strace.
> >> >
> >> > strace was run as user root (not as user pgsql) because I think there is an issue with strace and the suid binary /bin/ping:
> >> > # ls -l /bin/ping
> >> > -rwsr-x--- 1 root pgsql 33384 Jul 18 18:59 /bin/ping
> >> >
> >> >
> >> > But I do not think ping is the problem: once at least one other pgpool instance is up, pgpool starts OK. It segfaults only when no other pgpool instance is running.
> >> >
> >> >
> >> > Tomas
> >> >
> >> >
> >> >
> >> >
> >> > On Sat, Dec 01, 2012 at 09:30:44AM +0900, Tatsuo Ishii wrote:
> >> >> Did it show any error message before segfault?
> >> >> --
> >> >> Tatsuo Ishii
> >> >> SRA OSS, Inc. Japan
> >> >> English: http://www.sraoss.co.jp/index_en.php
> >> >> Japanese: http://www.sraoss.co.jp
> >> >>
> >> >> > I grabbed pgpool2-860cb3e.tar.gz from your gitweb server, rebuilt on
> >> >> > Fedora17, and attempted to start pgpool.  The segfault still happens at
> >> >> > startup, so it's not fixed.
> >> >> >
> >> >> >
> >> >> >
> >> >> > On Wed, Nov 28, 2012 at 11:39 PM, Tatsuo Ishii <ishii at postgresql.org> wrote:
> >> >> >> I have taken a look at the watchdog code and made some fixes to it. The
> >> >> >> file was wd_ping.c, which might or might not be related to the problem
> >> >> >> you have. Can you grab the git master head and try it out? I hope the
> >> >> >> error reporting code added to wd_ping.c will at least reveal the cause
> >> >> >> of the problem.
> >> >> >> --
> >> >> >> Tatsuo Ishii
> >> >> >> SRA OSS, Inc. Japan
> >> >> >> English: http://www.sraoss.co.jp/index_en.php
> >> >> >> Japanese: http://www.sraoss.co.jp
> >> >> >>
> >> >> >>> Hello,
> >> >> >>>
> >> >> >>> I have a similar problem on Slackware64 v14 with glibc 2.15 and gcc 4.7.1.
> >> >> >>> I have also tried the latest git version of pgpool, with the same result.
> >> >> >>>
> >> >> >>> Symptoms:
> >> >> >>> Once I enable the watchdog and start pgpool on one server, it segfaults when it
> >> >> >>> can't find another pgpool watchdog on the network. Sometimes it eventually starts,
> >> >> >>> and then all is OK: pgpool on the other server starts fine, because its watchdog
> >> >> >>> can connect to the watchdog on the first server.
> >> >> >>> "Sometimes" means after X attempts, where X is random; it usually starts on the
> >> >> >>> 5th to 30th try and works fine from then on.
> >> >> >>> Of course I have to clean up semaphores and shared memory with ipcrm, as
> >> >> >>> those are never released after a pgpool segfault.
> >> >> >>>
> >> >> >>> When I compile pgpool on an older system (Slackware64 v13.1.0) with
> >> >> >>> glibc 2.11 and gcc 4.4.4 and copy the pgpool binary to Slackware64 v14, it
> >> >> >>> just works on Slackware64 v14.
> >> >> >>>
> >> >> >>> Here is the backtrace of the binary compiled on Slackware64 v14:
> >> >> >>>
> >> >> >>> pgsql at wwwpri:~$ /usr/local/bin/pgpool -f /usr/local/etc/pgpool.conf -n -D
> >> >> >>> 2012-11-20 19:57:12 LOG:   pid 10639: wd_chk_sticy: all commands have sticky bit
> >> >> >>> 2012-11-20 19:57:12 LOG:   pid 10639: watchdog might call network commands which using sticky bit.
> >> >> >>> 2012-11-20 19:57:12 LOG:   pid 10639: Backend status file /tmp/pgpool_status discarded
> >> >> >>> 2012-11-20 19:57:14 LOG:   pid 10639: wd_create_send_socket: connect() reports failure (Connection refused). You can safely ignore this while starting up.
> >> >> >>> Segmentation fault (core dumped)
> >> >> >>> pgsql at wwwpri:~$
> >> >> >>>
> >> >> >>> pgsql at wwwpri:~$ gdb /usr/local/bin/pgpool core
> >> >> >>> GNU gdb (GDB) 7.5
> >> >> >>> Copyright (C) 2012 Free Software Foundation, Inc.
> >> >> >>> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> >> >> >>> This is free software: you are free to change and redistribute it.
> >> >> >>> There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
> >> >> >>> and "show warranty" for details.
> >> >> >>> This GDB was configured as "x86_64-slackware-linux".
> >> >> >>> For bug reporting instructions, please see:
> >> >> >>> <http://www.gnu.org/software/gdb/bugs/>...
> >> >> >>> Reading symbols from /usr/local/bin/pgpool...done.
> >> >> >>> [New LWP 10639]
> >> >> >>>
> >> >> >>> warning: Could not load shared library symbols for linux-vdso.so.1.
> >> >> >>> Do you need "set solib-search-path" or "set sysroot"?
> >> >> >>> [Thread debugging using libthread_db enabled]
> >> >> >>> Using host libthread_db library "/lib64/libthread_db.so.1".
> >> >> >>> Core was generated by `/usr/local/bin/pgpool -f /usr/local/etc/pgpool.conf -n -D'.
> >> >> >>> Program terminated with signal 11, Segmentation fault.
> >> >> >>> #0  0x00007fc0882db040 in pthread_detach () from /lib64/libpthread.so.0
> >> >> >>> (gdb) backtrace
> >> >> >>> #0  0x00007fc0882db040 in pthread_detach () from /lib64/libpthread.so.0
> >> >> >>> #1  0x0000000000478577 in wd_is_unused_ip (ip=<optimized out>) at wd_ping.c:159
> >> >> >>> #2  0x00000000004764fd in wd_init () at wd_init.c:91
> >> >> >>> #3  0x0000000000475817 in wd_main (fork_wait_time=1) at watchdog.c:117
> >> >> >>> #4  0x000000000040634f in main (argc=<optimized out>, argv=<optimized out>) at main.c:642
> >> >> >>> (gdb)
> >> >> >>>
> >> >> >>> When I disable the watchdog in pgpool.conf, pgpool works OK, but I need a virtual IP, so I have to use the watchdog.
> >> >> >>>
> >> >> >>> To me it seems that pgpool-II 3.2.1 (and the git version) is not compatible with the latest systems (either glibc/pthread or gcc).
> >> >> >>>
> >> >> >>> I can provide additional data if required, but unfortunately I can't find the root cause on my own :(
> >> >> >>>
> >> >> >>> Thank you,
> >> >> >>>
> >> >> >>> Tomas
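
For readers following the backtrace above: the crash is inside
pthread_detach(), called from wd_is_unused_ip() at wd_ping.c:159. This
thread does not say what the patch changes. One common way to crash there
is to pass pthread_detach() a thread ID that is no longer valid, for
example after the thread has already been joined or after pthread_create()
failed. The C sketch below only illustrates that pattern under this
assumption; check_worker and run_check_joined are hypothetical names, not
pgpool code.

```c
#include <pthread.h>

/* Hypothetical worker: stands in for a thread that runs a ping check. */
static void *check_worker(void *arg)
{
    int *result = arg;
    *result = 0;            /* pretend the check succeeded */
    return NULL;
}

/* Runs the worker in a thread and waits for it.
 * Returns 0 on success, or a pthread error code. */
int run_check_joined(int *result)
{
    pthread_t tid;
    int rc = pthread_create(&tid, NULL, check_worker, result);

    if (rc != 0)
        return rc;          /* tid is not valid: do not join or detach it */

    rc = pthread_join(tid, NULL);

    /* After a successful join the thread's resources are released.
     * Calling pthread_detach(tid) at this point would be undefined
     * behavior, so it is deliberately not called here. */
    return rc;
}
```

Build with -pthread. A thread should be either joined or detached, never
both, and the return value of pthread_create() should be checked before
the thread ID is used.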


-- 
Yugo Nagata <nagata at sraoss.co.jp>

