[pgpool-general: 9504] Re: Weird error in pgpool 4.6 logs: WARNING: failed to lock semaphore

Tatsuo Ishii ishii at postgresql.org
Tue Jun 3 20:57:31 JST 2025


> Hi,
> 
> Thank you for the great information!
> 
>> Hi,
>> We've made some progress here and I'd like to get your feedback regarding
>> the fix/workaround.  Basically, after some research (incl. google/chatgpt
>> etc), we came across this information:
>> 
>> 🧠 1. systemd-logind manages user sessions
>> 
>> In Ubuntu 24.04 (and since ~15.04), systemd manages user sessions. When a
>> user session ends (e.g. when a background process or login shell finishes),
>> the following happens:
>> 
>>    - The session scope (session-XXXX.scope) is stopped.
>>    - The user manager (user@UID.service) is also stopped *unless lingering is
>>      enabled*.
>> 
>> 🧨 If no processes remain and the user is not lingering:
>> 
>>    - Shared memory and *System V IPC resources* (like semaphores) *may be
>>      destroyed* automatically if no one is using them.
>>    - If a process (like pgpool2 or PostgreSQL) relied on those semaphores,
>>      you may later see:
>> 
>>      failed to lock semaphore
>> 
>> 
>> This is a strong sign that the user session ended and IPC resources (like
>> semaphores) were destroyed.
>> 
>> This does seem to be happening here, as I see these in /var/log/syslog
>> around that time we discussed in last posting, just prior to semaphore
>> errors showing in pgpool logs:
>> 2025-06-02T12:45:17.173486+02:00 db-replica3 systemd[1]: user@3000.service:
>> Deactivated successfully.
>> 2025-06-02T12:45:17.173997+02:00 db-replica3 systemd[1]: Stopped
>> user@3000.service - User Manager for UID 3000.
>> 2025-06-02T12:45:17.240314+02:00 db-replica3 systemd[1]: Stopping
>> user-runtime-dir@3000.service - User Runtime Directory /run/user/3000...
>> 2025-06-02T12:45:17.248017+02:00 db-replica3 systemd[1]:
>> run-user-3000.mount: Deactivated successfully.
>> 2025-06-02T12:45:17.249107+02:00 db-replica3 systemd[1]:
>> user-runtime-dir@3000.service: Deactivated successfully.
>> 2025-06-02T12:45:17.249353+02:00 db-replica3 systemd[1]: Stopped
>> user-runtime-dir@3000.service - User Runtime Directory /run/user/3000.
>> 2025-06-02T12:45:17.251672+02:00 db-replica3 systemd[1]: Removed slice
>> user-3000.slice - User Slice of UID 3000.
>> 2025-06-02T12:45:23.581108+02:00 db-replica3 strace[1692927]: [pid 1693495]
>> 12:45:23.580360 semtimedop(41, [{sem_num=6, sem_op=-1, sem_flg=SEM_UNDO}],
>> 1, NULL) = -1 EINVAL (Invalid argument)
>> 
>> It also advises adding Type=forking to the [Service] section of the
>> pgpool.service file, which we don't currently have. I think I posted it
>> earlier, but just in case, here is the relevant part of our service file:
>> 
>> [Unit]
>> Description=Pgpool-II
>> After=syslog.target network.target postgresql.service
>> Wants=postgresql.service
>> 
>> [Service]
>> User=postgres
>> Group=postgres
>> 
>> EnvironmentFile=-/etc/default/pgpool2
>> 
>> ExecStart=/usr/sbin/pgpool -f /etc/pgpool2/pgpool.conf $OPTS
>> ExecStop=/usr/sbin/pgpool -f /etc/pgpool2/pgpool.conf $STOP_OPTS stop
>> ExecReload=/usr/sbin/pgpool -f /etc/pgpool2/pgpool.conf reload
>> 
>> -------------------
>> One bit I read that is different is that Type=forking and adding -n are
>> required (as you can see above, we don't have those right now):
>> 
>> - User=postgres: Assumes pgpool should run as the postgres user. Adjust if
>>   different.
>> - ExecStart=/usr/sbin/pgpool -n: Use the *correct path and arguments* for
>>   your installation.
>> - pgpool must be configured to *not daemonize* (-n flag), or systemd will
>>   lose track of it.
>> - You can add environment variables for config files, ports, etc.
>> -----------------------
>> 
>> Anyways, our sysadmins suggested changing the user postgres to have Linger
>> option on:
>> $ sudo loginctl show-user postgres -p Linger
>> Linger=yes
>> Once we did that, the semaphore issue disappeared. It has been running for
>> over an hour now and ipcs still shows the semaphores:
>> $ ipcs -s
>> 
>> ------ Semaphore Arrays --------
>> key        semid      owner      perms      nsems
>> 0x00000000 6          postgres   600        8
>> 
>> 
>> So, what do you think about this fix? I read that letting the user have
>> linger on is not generally recommended, but that seems to be the only way
>> to keep the semaphores from disappearing right now...
> 
> I am not familiar with this area. I will discuss with other
> pgpool/PostgreSQL developers and reply back to you.

I found this in the PostgreSQL manual:
https://www.postgresql.org/docs/current/kernel-resources.html#SYSTEMD-REMOVEIPC

If my reading is correct, we can do one of:

1) register the user that starts pgpool as a "system user".

2) set "RemoveIPC=no" in /etc/systemd/logind.conf.

For me, #1 is better, since #2 affects other servers on the system.
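
To see what option #2 would change, the current value can be read from the file. This is only a rough sketch: the helper name remove_ipc_setting is hypothetical, and it looks at a single file, so it ignores drop-in files that may override the setting and cannot show the default that logind was built with.

```shell
#!/bin/sh
# Sketch only: remove_ipc_setting is a hypothetical helper, not a pgpool or
# systemd tool. It reads one file and ignores drop-in overrides.

# remove_ipc_setting FILE: print the last uncommented RemoveIPC= value in FILE,
# or "unset" if there is none (logind then uses its built-in default).
remove_ipc_setting() {
    val=$(sed -n 's/^[[:space:]]*RemoveIPC[[:space:]]*=[[:space:]]*//p' "$1" 2>/dev/null | tail -n 1)
    echo "${val:-unset}"
}

remove_ipc_setting /etc/systemd/logind.conf
```

If this prints "unset" or "yes", IPC objects of non-system users may be removed when their last session ends, which matches the syslog excerpt quoted above.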

BTW, I think the doc does not suggest using the Linger option.
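
For option #1, the manual page linked above says that what counts as a "system user" is determined at systemd compile time from the SYS_UID_MAX setting in /etc/login.defs. A rough check might look like the sketch below. The helper names are hypothetical, and reading login.defs at run time with a fallback of 999 is an assumption of this sketch, so the result is only an approximation of what systemd actually uses.

```shell
#!/bin/sh
# Sketch only: sys_uid_max and is_system_uid are hypothetical helpers.
# Assumption: SYS_UID_MAX in /etc/login.defs (999 if not set) approximates
# the threshold that systemd was compiled with.

# sys_uid_max FILE: print SYS_UID_MAX from a login.defs-style file, default 999.
sys_uid_max() {
    max=$(awk '$1 == "SYS_UID_MAX" { print $2; exit }' "$1" 2>/dev/null) || true
    echo "${max:-999}"
}

# is_system_uid UID MAX: succeed if UID is within the system range.
is_system_uid() {
    [ "$1" -le "$2" ]
}

user=${1:-postgres}
if uid=$(id -u "$user" 2>/dev/null); then
    max=$(sys_uid_max /etc/login.defs)
    if is_system_uid "$uid" "$max"; then
        echo "$user (UID $uid) looks like a system user (<= $max)"
    else
        echo "$user (UID $uid) is above $max, so RemoveIPC may apply"
    fi
else
    echo "user $user not found"
fi
```

With the UID 3000 seen in the quoted syslog lines and a threshold of 999, this would report that the user is outside the system range.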

Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp

