[pgpool-general: 9505] Re: Weird error in pgpool 4.6 logs: WARNING: failed to lock semaphore

TV tvfan2014 at gmail.com
Wed Jun 4 02:25:14 JST 2025


Hi,
Gave it a try, although we went with option #2 (RemoveIPC=no) as it was
quicker to do.  After 20+ minutes, the semaphores are still present and
there are no errors in the log.  Tomorrow I will discuss with our sysadmins
how we can implement option #1, as the UID of postgres user is above
SYS_UID_MAX.  Maybe we won't go through changing postgres user's UID,
changing the SYS_UID_MAX to higher number might be ok as we only have a
couple users above that number.

Also, I forgot to mention - our repo showed the newer (4.6.2) version as
available so I upgraded the packages and restarted pgpool, but that ipv6
error at start up is still present.  It's more of a nuisance more than
anything else, but figured I would let you know:

2025-06-03 18:56:39: pid 1950700: WARNING:  failed to create watchdog
heartbeat receive socket as much intended
2025-06-03 18:56:39: pid 1950700: DETAIL:  only 1 out of 2 socket(s) had
been created
2025-06-03 18:56:39: pid 1950700: LOCATION:  wd_heartbeat.c:428
2025-06-03 18:56:39: pid 1950704: LOG:  failed to create watchdog heartbeat
receive socket
2025-06-03 18:56:39: pid 1950704: LOCATION:  wd_heartbeat.c:408
2025-06-03 18:56:39: pid 1950704: WARNING:  failed to create watchdog
heartbeat receive socket as much intended
2025-06-03 18:56:39: pid 1950704: DETAIL:  only 1 out of 2 socket(s) had
been created
2025-06-03 18:56:39: pid 1950704: LOCATION:  wd_heartbeat.c:428

Earlier, when I had debugging enabled, it was still showing unable to
create this socket:

2025-06-03 11:57:24.833: heart_beat_receiver pid 4005: LOG:  creating
watchdog heartbeat receive socket.

2025-06-03 11:57:24.833: heart_beat_receiver pid 4005: DETAIL:  creating
socket on 127.0.0.1:9694

2025-06-03 11:57:24.833: heart_beat_receiver pid 4005: LOCATION:
wd_heartbeat.c:385

2025-06-03 11:57:24.833: heart_beat_receiver pid 4005: LOG:  set
SO_REUSEPORT option to the socket

2025-06-03 11:57:24.833: heart_beat_receiver pid 4005: LOCATION:
wd_heartbeat.c:891

2025-06-03 11:57:24.833: heart_beat_receiver pid 4005: LOG:  creating
watchdog heartbeat receive socket.

2025-06-03 11:57:24.833: heart_beat_receiver pid 4005: DETAIL:  *creating
socket on ::1:9694*

2025-06-03 11:57:24.833: heart_beat_receiver pid 4005: LOCATION:
wd_heartbeat.c:385

2025-06-03 11:57:24.833: heart_beat_receiver pid 4005: LOG:  failed to
create watchdog heartbeat receive socket. retrying...

2025-06-03 11:57:24.833: heart_beat_receiver pid 4005: DETAIL:  bind failed
with reason: "Cannot assign requested address"
Doesn't seem to affect the pgpool functionality, I'd prefer to disable it
altogether if possible to have 'clean' logs, if possible.  So far, tried to
change listen_addresses and pcp_listen_addresses in pgpool.conf from "*" to
"0.0.0.0" but that didn't resolve it.  Just curious if there's a way to
skip ipv6 functionality?


On Tue, Jun 3, 2025 at 1:57 PM Tatsuo Ishii <ishii at postgresql.org> wrote:

> > Hi,
> >
> > Thank you for the great information!
> >
> >> Hi,
> >> We've made some progress here and I'd like to get your feedback
> regarding
> >> the fix/workaround.  Basically, after some research (incl.
> google/chatgpt
> >> etc), we came across this information:
> >>
> >> 🧠 1. systemd-logind manages user sessions
> >>
> >> In Ubuntu 24.04 (and since ~15.04), systemd manages user sessions. When
> a
> >> user session ends (e.g. when a background process or login shell
> finishes),
> >> the following happens:
> >>
> >>    -
> >>
> >>    The session scope (session-XXXX.scope) is stopped.
> >>    -
> >>
> >>    The user manager (user at UID.service) is also stopped *unless
> lingering is
> >>    enabled*.
> >>
> >> 🧨 If no processes remain and the user is not lingering:
> >>
> >>    -
> >>
> >>    Shared memory and *System V IPC resources* (like semaphores) *may be
> >>    destroyed* automatically if no one is using them.
> >>    -
> >>
> >>    If a process (like pgpool2 or PostgreSQL) relied on those semaphores,
> >>    you may later see:
> >>
> >>    failed to lock semaphore
> >>
> >>
> >> This is a strong sign that the user session ended and IPC resources
> (like
> >> semaphores) were destroyed.
> >>
> >> This does seem to be happening here, as I see these in /var/log/syslog
> >> around that time we discussed in last posting, just prior to semaphore
> >> errors showing in pgpool logs:
> >> 2025-06-02T12:45:17.173486+02:00 db-replica3 systemd[1]:
> user at 3000.service:
> >> Deactivated successfully.
> >> 2025-06-02T12:45:17.173997+02:00 db-replica3 systemd[1]: Stopped
> >> user at 3000.service - User Manager for UID 3000.
> >> 2025-06-02T12:45:17.240314+02:00 db-replica3 systemd[1]: Stopping
> >> user-runtime-dir at 3000.service - User Runtime Directory
> /run/user/3000...
> >> 2025-06-02T12:45:17.248017+02:00 db-replica3 systemd[1]:
> >> run-user-3000.mount: Deactivated successfully.
> >> 2025-06-02T12:45:17.249107+02:00 db-replica3 systemd[1]:
> >> user-runtime-dir at 3000.service: Deactivated successfully.
> >> 2025-06-02T12:45:17.249353+02:00 db-replica3 systemd[1]: Stopped
> >> user-runtime-dir at 3000.service - User Runtime Directory /run/user/3000.
> >> 2025-06-02T12:45:17.251672+02:00 db-replica3 systemd[1]: Removed slice
> >> user-3000.slice - User Slice of UID 3000.
> >> 2025-06-02T12:45:23.581108+02:00 db-replica3 strace[1692927]: [pid
> 1693495]
> >> 12:45:23.580360 semtimedop(41, [{sem_num=6, sem_op=-1,
> sem_flg=SEM_UNDO}],
> >> 1, NULL) = -1 EINVAL (Invalid argument)
> >>
> >> It also advises to add Type=forking to the [Service] section of the
> >> pgpool.service file, which we don't currently have, I think I posted it
> >> earlier, but just in case here is relevant part of our service file:
> >>
> >> [Unit]
> >> Description=Pgpool-II
> >> After=syslog.target network.target postgresql.service
> >> Wants=postgresql.service
> >>
> >> [Service]
> >> User=postgres
> >> Group=postgres
> >>
> >> EnvironmentFile=-/etc/default/pgpool2
> >>
> >> ExecStart=/usr/sbin/pgpool -f /etc/pgpool2/pgpool.conf $OPTS
> >> ExecStop=/usr/sbin/pgpool -f /etc/pgpool2/pgpool.conf $STOP_OPTS stop
> >> ExecReload=/usr/sbin/pgpool -f /etc/pgpool2/pgpool.conf reload
> >>
> >> -------------------
> >> One bit I read that is different is the Type=forking and adding -n is
> >> required (as you can see above, we don't have that right now:
> >> -
> >>
> >> User=postgres: Assumes pgpool should run as the postgres user. Adjust if
> >> different.
> >> -
> >>
> >> ExecStart=/usr/sbin/pgpool -n: Use the *correct path and arguments* for
> >> your installation.
> >> -
> >>
> >> pgpool must be configured to *not daemonize* (-n flag), or systemd will
> >> lose track of it.
> >> -
> >>
> >> You can add environment variables for config files, ports, etc.
> >> -----------------------
> >>
> >> Anyways, our sysadmins suggested changing the user postgres to have
> Linger
> >> option on:
> >> $ sudo loginctl show-user postgres -p Linger
> >> Linger=yes
> >> Once we did that, the semaphore issue disappeared, it's been running for
> >> over an hour now and ipcs shows them:
> >> $ ipcs -s
> >>
> >> ------ Semaphore Arrays --------
> >> key        semid      owner      perms      nsems
> >> 0x00000000 6          postgres   600        8
> >>
> >>
> >> So, what do you think about this fix? I read that letting the user have
> >> linger on is not generally recommended, but that seems to be the only
> way
> >> to keep the semaphores from disappearing right now...
> >
> > I am not familiar with this area. I will discuss with other
> > pgpool/PostgreSQL developers and reply back to you.
>
> I found this in the PostgreSQL manual:
>
> https://www.postgresql.org/docs/current/kernel-resources.html#SYSTEMD-REMOVEIPC
>
> If my reading is correct, we can do one of:
>
> 1) register the user to start pgpool as a "system user".
>
> 2) change the line "RemoveIPC=no" in /etc/systemd/logind.conf .
>
> For me, #1 is better, since #2 affects to other servers in the system.
>
> BTW, I think the doc does not suggest to use Linger option.
>
> Best regards,
> --
> Tatsuo Ishii
> SRA OSS K.K.
> English: http://www.sraoss.co.jp/index_en/
> Japanese:http://www.sraoss.co.jp
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.pgpool.net/pipermail/pgpool-general/attachments/20250603/85c2fffe/attachment-0001.htm>


More information about the pgpool-general mailing list