<div dir="ltr"><div>Hi,</div><div>Gave it a try, although we went with option #2 (RemoveIPC=no) as it was quicker to do. After 20+ minutes, the semaphores are still present and there are no errors in the log. Tomorrow I will discuss with our sysadmins how we can implement option #1, as the UID of postgres user is above <code class="gmail-symbol">SYS_UID_MAX</code>. Maybe we won't go through changing postgres user's UID, changing the SYS_UID_MAX to higher number might be ok as we only have a couple users above that number.</div><div><br></div><div>Also, I forgot to mention - our repo showed the newer (4.6.2) version as available so I upgraded the packages and restarted pgpool, but that ipv6 error at start up is still present. It's more of a nuisance more than anything else, but figured I would let you know:</div><div><br></div><div>2025-06-03 18:56:39: pid 1950700: WARNING: failed to create watchdog heartbeat receive socket as much intended<br>2025-06-03 18:56:39: pid 1950700: DETAIL: only 1 out of 2 socket(s) had been created<br>2025-06-03 18:56:39: pid 1950700: LOCATION: wd_heartbeat.c:428<br>2025-06-03 18:56:39: pid 1950704: LOG: failed to create watchdog heartbeat receive socket<br>2025-06-03 18:56:39: pid 1950704: LOCATION: wd_heartbeat.c:408<br>2025-06-03 18:56:39: pid 1950704: WARNING: failed to create watchdog heartbeat receive socket as much intended<br>2025-06-03 18:56:39: pid 1950704: DETAIL: only 1 out of 2 socket(s) had been created<br>2025-06-03 18:56:39: pid 1950704: LOCATION: wd_heartbeat.c:428<br></div><div><br></div><div>Earlier, when I had debugging enabled, it was still showing unable to create this socket:</div><div><span><p>2025-06-03 11:57:24.833: heart_beat_receiver pid 4005: LOG: creating watchdog heartbeat receive socket.</p><p>2025-06-03 11:57:24.833: heart_beat_receiver pid 4005: DETAIL: creating socket on <a href="http://127.0.0.1:9694">127.0.0.1:9694</a></p><p>2025-06-03 11:57:24.833: heart_beat_receiver pid 4005: LOCATION: wd_heartbeat.c:385</p><p>2025-06-03 11:57:24.833: heart_beat_receiver pid 4005: LOG: set SO_REUSEPORT option to the socket</p><p>2025-06-03 11:57:24.833: heart_beat_receiver pid 4005: LOCATION: wd_heartbeat.c:891</p><p>2025-06-03 11:57:24.833: heart_beat_receiver pid 4005: LOG: creating watchdog heartbeat receive socket.</p><p>2025-06-03 11:57:24.833: heart_beat_receiver pid 4005: DETAIL: <b>creating socket on ::1:9694</b></p><p>2025-06-03 11:57:24.833: heart_beat_receiver pid 4005: LOCATION: wd_heartbeat.c:385</p><p>2025-06-03 11:57:24.833: heart_beat_receiver pid 4005: LOG: failed to create watchdog heartbeat receive socket. retrying...</p><p>2025-06-03 11:57:24.833: heart_beat_receiver pid 4005: DETAIL: bind failed with reason: "Cannot assign requested address"</p></span>Doesn't seem to affect the pgpool functionality, I'd prefer to disable it altogether if possible to have 'clean' logs, if possible. So far, tried to change listen_addresses and pcp_listen_addresses in pgpool.conf from "*" to "0.0.0.0" but that didn't resolve it. Just curious if there's a way to skip ipv6 functionality?</div><div><br></div></div><br><div class="gmail_quote gmail_quote_container"><div dir="ltr" class="gmail_attr">On Tue, Jun 3, 2025 at 1:57 PM Tatsuo Ishii <<a href="mailto:ishii@postgresql.org">ishii@postgresql.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">> Hi,<br>
> <br>
> Thank you for the great information!<br>
> <br>
>> Hi,<br>
>> We've made some progress here and I'd like to get your feedback regarding<br>
>> the fix/workaround. Basically, after some research (incl. google/chatgpt<br>
>> etc), we came across this information:<br>
>> <br>
>> 🧠 1. systemd-logind manages user sessions<br>
>> <br>
>> In Ubuntu 24.04 (and since ~15.04), systemd manages user sessions. When a<br>
>> user session ends (e.g. when a background process or login shell finishes),<br>
>> the following happens:<br>
>> <br>
>> -<br>
>> <br>
>> The session scope (session-XXXX.scope) is stopped.<br>
>> -<br>
>> <br>
>> The user manager (user@UID.service) is also stopped *unless lingering is<br>
>> enabled*.<br>
>> <br>
>> 🧨 If no processes remain and the user is not lingering:<br>
>> <br>
>> -<br>
>> <br>
>> Shared memory and *System V IPC resources* (like semaphores) *may be<br>
>> destroyed* automatically if no one is using them.<br>
>> -<br>
>> <br>
>> If a process (like pgpool2 or PostgreSQL) relied on those semaphores,<br>
>> you may later see:<br>
>> <br>
>> failed to lock semaphore<br>
>> <br>
>> <br>
>> This is a strong sign that the user session ended and IPC resources (like<br>
>> semaphores) were destroyed.<br>
>> <br>
>> This does seem to be happening here, as I see these in /var/log/syslog<br>
>> around that time we discussed in last posting, just prior to semaphore<br>
>> errors showing in pgpool logs:<br>
>> 2025-06-02T12:45:17.173486+02:00 db-replica3 systemd[1]: user@3000.service:<br>
>> Deactivated successfully.<br>
>> 2025-06-02T12:45:17.173997+02:00 db-replica3 systemd[1]: Stopped<br>
>> user@3000.service - User Manager for UID 3000.<br>
>> 2025-06-02T12:45:17.240314+02:00 db-replica3 systemd[1]: Stopping<br>
>> user-runtime-dir@3000.service - User Runtime Directory /run/user/3000...<br>
>> 2025-06-02T12:45:17.248017+02:00 db-replica3 systemd[1]:<br>
>> run-user-3000.mount: Deactivated successfully.<br>
>> 2025-06-02T12:45:17.249107+02:00 db-replica3 systemd[1]:<br>
>> user-runtime-dir@3000.service: Deactivated successfully.<br>
>> 2025-06-02T12:45:17.249353+02:00 db-replica3 systemd[1]: Stopped<br>
>> user-runtime-dir@3000.service - User Runtime Directory /run/user/3000.<br>
>> 2025-06-02T12:45:17.251672+02:00 db-replica3 systemd[1]: Removed slice<br>
>> user-3000.slice - User Slice of UID 3000.<br>
>> 2025-06-02T12:45:23.581108+02:00 db-replica3 strace[1692927]: [pid 1693495]<br>
>> 12:45:23.580360 semtimedop(41, [{sem_num=6, sem_op=-1, sem_flg=SEM_UNDO}],<br>
>> 1, NULL) = -1 EINVAL (Invalid argument)<br>
>> <br>
>> It also advises to add Type=forking to the [Service] section of the<br>
>> pgpool.service file, which we don't currently have, I think I posted it<br>
>> earlier, but just in case here is relevant part of our service file:<br>
>> <br>
>> [Unit]<br>
>> Description=Pgpool-II<br>
>> After=syslog.target network.target postgresql.service<br>
>> Wants=postgresql.service<br>
>> <br>
>> [Service]<br>
>> User=postgres<br>
>> Group=postgres<br>
>> <br>
>> EnvironmentFile=-/etc/default/pgpool2<br>
>> <br>
>> ExecStart=/usr/sbin/pgpool -f /etc/pgpool2/pgpool.conf $OPTS<br>
>> ExecStop=/usr/sbin/pgpool -f /etc/pgpool2/pgpool.conf $STOP_OPTS stop<br>
>> ExecReload=/usr/sbin/pgpool -f /etc/pgpool2/pgpool.conf reload<br>
>> <br>
>> -------------------<br>
>> One bit I read that is different is the Type=forking and adding -n is<br>
>> required (as you can see above, we don't have that right now:<br>
>> -<br>
>> <br>
>> User=postgres: Assumes pgpool should run as the postgres user. Adjust if<br>
>> different.<br>
>> -<br>
>> <br>
>> ExecStart=/usr/sbin/pgpool -n: Use the *correct path and arguments* for<br>
>> your installation.<br>
>> -<br>
>> <br>
>> pgpool must be configured to *not daemonize* (-n flag), or systemd will<br>
>> lose track of it.<br>
>> -<br>
>> <br>
>> You can add environment variables for config files, ports, etc.<br>
>> -----------------------<br>
>> <br>
>> Anyways, our sysadmins suggested changing the user postgres to have Linger<br>
>> option on:<br>
>> $ sudo loginctl show-user postgres -p Linger<br>
>> Linger=yes<br>
>> Once we did that, the semaphore issue disappeared, it's been running for<br>
>> over an hour now and ipcs shows them:<br>
>> $ ipcs -s<br>
>> <br>
>> ------ Semaphore Arrays --------<br>
>> key semid owner perms nsems<br>
>> 0x00000000 6 postgres 600 8<br>
>> <br>
>> <br>
>> So, what do you think about this fix? I read that letting the user have<br>
>> linger on is not generally recommended, but that seems to be the only way<br>
>> to keep the semaphores from disappearing right now...<br>
> <br>
> I am not familiar with this area. I will discuss with other<br>
> pgpool/PostgreSQL developers and reply back to you.<br>
<br>
I found this in the PostgreSQL manual:<br>
<a href="https://www.postgresql.org/docs/current/kernel-resources.html#SYSTEMD-REMOVEIPC" rel="noreferrer" target="_blank">https://www.postgresql.org/docs/current/kernel-resources.html#SYSTEMD-REMOVEIPC</a><br>
<br>
If my reading is correct, we can do one of:<br>
<br>
1) register the user to start pgpool as a "system user".<br>
<br>
2) change the line "RemoveIPC=no" in /etc/systemd/logind.conf .<br>
<br>
For me, #1 is better, since #2 affects to other servers in the system.<br>
<br>
BTW, I think the doc does not suggest to use Linger option.<br>
<br>
Best regards,<br>
--<br>
Tatsuo Ishii<br>
SRA OSS K.K.<br>
English: <a href="http://www.sraoss.co.jp/index_en/" rel="noreferrer" target="_blank">http://www.sraoss.co.jp/index_en/</a><br>
Japanese:<a href="http://www.sraoss.co.jp" rel="noreferrer" target="_blank">http://www.sraoss.co.jp</a><br>
</blockquote></div>