View Issue Details
| ID | Project | Category | View Status | Date Submitted | Last Update |
|---|---|---|---|---|---|
| 0000799 | Pgpool-II | Bug | public | 2023-05-11 00:28 | 2023-05-16 23:28 |
| Reporter | bt9201340128 | Assigned To | pengbo | | |
| Priority | normal | Severity | minor | Reproducibility | always |
| Status | assigned | Resolution | open | | |
| Product Version | 4.3.5 | | | | |
| Summary | 0000799: health_check_period and health_check_max_retries parameters not respected consistently | | | | |
**Description**

Hello, I noticed that the health_check_period and health_check_max_retries parameters in pgpool.conf appear to be respected accurately only if the PostgreSQL server being checked, running on a different host, times out (which implies the host is unreachable), as opposed to replying with "Connection refused" (which implies PostgreSQL is down but the host is up). Here are the two replies associated with the two different Pgpool-II behaviors.

The first behavior can be reproduced by shutting down PostgreSQL on a remote target server. In this case health_check_period and health_check_max_retries are *not* respected and failover is initiated very quickly:

```
# PostgreSQL service is stopped (running on a different host than pgpool)
root@pgpoolserver:~# telnet 10.1.1.10 5432
Trying 10.1.1.10...
telnet: Unable to connect to remote host: Connection refused
root@pgpoolserver:~#
```

The second behavior can be reproduced by blocking port 5432 on the target server, or by shutting down the target server itself (but not while pgpool is actively checking). In this scenario, the health_check_period and health_check_max_retries parameters in pgpool.conf *are* respected correctly:

```
# Port 5432 blocked on the PostgreSQL server, on a different host than pgpool
root@pgpoolserver:~# telnet 10.1.1.10 5432
Trying 10.1.1.10...
telnet: Unable to connect to remote host: Connection timed out
root@pgpoolserver:~#
```

All hosts in this environment run Debian 11. The standalone pgpool server runs "pgpool2/bullseye-pgdg,now 4.3.5-1.pgdg110+1 amd64 [installed]". The issue can also be reproduced on RHEL 8 and Rocky Linux 8.

Thanks, BT

**Steps To Reproduce**: please see description.

**Tags**: no tags attached.
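For reference, the health-check settings being discussed live in pgpool.conf. A minimal fragment using the values the reporter states were in effect during these tests; health_check_retry_delay is shown at what I believe is its default of 1 second (an assumption, since the report does not state it):

```
# Health check settings under discussion in this report
health_check_period = 10        # seconds between periodic health checks
health_check_max_retries = 6    # failed retries before the node is degenerated
health_check_retry_delay = 1    # assumed default: seconds between retries
failover_on_backend_error = off # changed from 'on' for the 2023-05-15 re-test
```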

---

I think you are setting "failover_on_backend_error = on", and after you shut down PostgreSQL you tried to connect to Pgpool-II? If "failover_on_backend_error = on", Pgpool-II treats read/write errors on the PostgreSQL backend connection as a backend node failure and triggers failover immediately.

---

So, to answer your questions:

1. failover_on_backend_error was unchanged from its default of on.
2. No, I did not connect to the Pgpool-II port during this test at all.

Here is what I did this morning:

1. Based on your comments, I set failover_on_backend_error = off and restarted Pgpool-II.
2. I reproduced the test where the service was shut off. It still did not respect the health_check_period and health_check_max_retries parameters.

I would be happy to demonstrate this via screen share sometime if you would like; let me know. In the meantime, I am attaching a log file from this morning covering the steps above. You will be able to observe from the health check timestamps that the parameters are not respected.

Thank you

20230515_set_failover_on_backend_error_to_off_and_re-test.txt (16,147 bytes)
```
root@debpgstreama:~# systemctl stop pgpool2
root@debpgstreama:~# ps -ef |grep pool
root 728 637 0 08:55 pts/0 00:00:00 grep pool
root@debpgstreama:~# systemctl start pgpool2
root@debpgstreama:~# ps -ef |grep pool
postgres 731 1 0 08:55 ? 00:00:00 /usr/sbin/pgpool -n
postgres 734 731 0 08:55 ? 00:00:00 pgpool: wait for connection request
postgres 735 731 0 08:55 ? 00:00:00 pgpool: wait for connection request
postgres 736 731 0 08:55 ? 00:00:00 pgpool: wait for connection request
postgres 737 731 0 08:55 ? 00:00:00 pgpool: wait for connection request
postgres 738 731 0 08:55 ? 00:00:00 pgpool: wait for connection request
postgres 739 731 0 08:55 ? 00:00:00 pgpool: wait for connection request
postgres 740 731 0 08:55 ? 00:00:00 pgpool: wait for connection request
postgres 741 731 0 08:55 ? 00:00:00 pgpool: wait for connection request
postgres 742 731 0 08:55 ? 00:00:00 pgpool: wait for connection request
postgres 743 731 0 08:55 ? 00:00:00 pgpool: wait for connection request
postgres 744 731 0 08:55 ? 00:00:00 pgpool: wait for connection request
postgres 745 731 0 08:55 ? 00:00:00 pgpool: wait for connection request
postgres 746 731 0 08:55 ? 00:00:00 pgpool: wait for connection request
postgres 747 731 0 08:55 ? 00:00:00 pgpool: wait for connection request
postgres 748 731 0 08:55 ? 00:00:00 pgpool: wait for connection request
postgres 749 731 0 08:55 ? 00:00:00 pgpool: wait for connection request
postgres 750 731 0 08:55 ? 00:00:00 pgpool: wait for connection request
postgres 751 731 0 08:55 ? 00:00:00 pgpool: wait for connection request
postgres 752 731 0 08:55 ? 00:00:00 pgpool: wait for connection request
postgres 753 731 0 08:55 ? 00:00:00 pgpool: wait for connection request
postgres 754 731 0 08:55 ? 00:00:00 pgpool: wait for connection request
postgres 755 731 0 08:55 ? 00:00:00 pgpool: wait for connection request
postgres 756 731 0 08:55 ? 00:00:00 pgpool: wait for connection request
postgres 757 731 0 08:55 ? 00:00:00 pgpool: wait for connection request
postgres 758 731 0 08:55 ? 00:00:00 pgpool: wait for connection request
postgres 759 731 0 08:55 ? 00:00:00 pgpool: wait for connection request
postgres 760 731 0 08:55 ? 00:00:00 pgpool: wait for connection request
postgres 761 731 0 08:55 ? 00:00:00 pgpool: wait for connection request
postgres 762 731 0 08:55 ? 00:00:00 pgpool: wait for connection request
postgres 763 731 0 08:55 ? 00:00:00 pgpool: wait for connection request
postgres 764 731 0 08:55 ? 00:00:00 pgpool: wait for connection request
postgres 765 731 0 08:55 ? 00:00:00 pgpool: wait for connection request
postgres 766 731 0 08:55 ? 00:00:00 pgpool: PCP: wait for connection request
postgres 767 731 0 08:55 ? 00:00:00 pgpool: worker process
postgres 768 731 0 08:55 ? 00:00:00 pgpool: health check process(0)
postgres 769 731 0 08:55 ? 00:00:00 pgpool: health check process(1)
root 771 637 0 08:56 pts/0 00:00:00 grep pool
root@debpgstreama:~#

May 15 08:55:30 debpgstreama pgpool[731]: 2023-05-15 08:55:30.460: main pid 731: LOG: node status[0]: 1
May 15 08:55:30 debpgstreama pgpool[731]: 2023-05-15 08:55:30.460: main pid 731: LOG: node status[1]: 2
May 15 08:57:12 debpgstreama pgpool[766]: 2023-05-15 08:57:12.438: pcp_main pid 766: LOG: forked new pcp worker, pid=773 socket=7
May 15 08:57:12 debpgstreama pgpool[773]: 2023-05-15 08:57:12.440: pcp_child pid 773: FATAL: authentication failed for user "pcpuser"
May 15 08:57:12 debpgstreama pgpool[773]: 2023-05-15 08:57:12.440: pcp_child pid 773: DETAIL: username and/or password does not match
May 15 08:57:12 debpgstreama pgpool[766]: 2023-05-15 08:57:12.451: pcp_main pid 766: LOG: PCP process with pid: 773 exit with SUCCESS.
May 15 08:57:12 debpgstreama pgpool[766]: 2023-05-15 08:57:12.451: pcp_main pid 766: LOG: PCP process with pid: 773 exits with status 256
May 15 08:57:17 debpgstreama pgpool[766]: 2023-05-15 08:57:17.204: pcp_main pid 766: LOG: forked new pcp worker, pid=778 socket=7
May 15 08:57:17 debpgstreama pgpool[766]: 2023-05-15 08:57:17.238: pcp_main pid 766: LOG: PCP process with pid: 778 exit with SUCCESS.
May 15 08:57:17 debpgstreama pgpool[766]: 2023-05-15 08:57:17.238: pcp_main pid 766: LOG: PCP process with pid: 778 exits with status 0
May 15 09:03:00 debpgstreama pgpool[768]: 2023-05-15 09:03:00.901: health_check0 pid 768: LOG: failed to connect to PostgreSQL server on "10.0.0.224:5432", getsockopt() failed
May 15 09:03:00 debpgstreama pgpool[768]: 2023-05-15 09:03:00.901: health_check0 pid 768: DETAIL: Operation now in progress
May 15 09:03:00 debpgstreama pgpool[768]: 2023-05-15 09:03:00.901: health_check0 pid 768: LOG: health check retrying on DB node: 0 (round:1)
May 15 09:03:01 debpgstreama pgpool[767]: 2023-05-15 09:03:01.367: sr_check_worker pid 767: LOG: failed to connect to PostgreSQL server on "10.0.0.224:5432", getsockopt() failed
May 15 09:03:01 debpgstreama pgpool[767]: 2023-05-15 09:03:01.367: sr_check_worker pid 767: DETAIL: Operation now in progress
May 15 09:03:01 debpgstreama pgpool[768]: 2023-05-15 09:03:01.903: health_check0 pid 768: LOG: failed to connect to PostgreSQL server on "10.0.0.224:5432", getsockopt() failed
May 15 09:03:01 debpgstreama pgpool[768]: 2023-05-15 09:03:01.903: health_check0 pid 768: DETAIL: Operation now in progress
May 15 09:03:01 debpgstreama pgpool[768]: 2023-05-15 09:03:01.903: health_check0 pid 768: LOG: health check retrying on DB node: 0 (round:2)
May 15 09:03:02 debpgstreama pgpool[768]: 2023-05-15 09:03:02.905: health_check0 pid 768: LOG: failed to connect to PostgreSQL server on "10.0.0.224:5432", getsockopt() failed
May 15 09:03:02 debpgstreama pgpool[768]: 2023-05-15 09:03:02.905: health_check0 pid 768: DETAIL: Operation now in progress
May 15 09:03:02 debpgstreama pgpool[768]: 2023-05-15 09:03:02.905: health_check0 pid 768: LOG: health check retrying on DB node: 0 (round:3)
May 15 09:03:03 debpgstreama pgpool[768]: 2023-05-15 09:03:03.907: health_check0 pid 768: LOG: failed to connect to PostgreSQL server on "10.0.0.224:5432", getsockopt() failed
May 15 09:03:03 debpgstreama pgpool[768]: 2023-05-15 09:03:03.907: health_check0 pid 768: DETAIL: Operation now in progress
May 15 09:03:03 debpgstreama pgpool[768]: 2023-05-15 09:03:03.907: health_check0 pid 768: LOG: health check retrying on DB node: 0 (round:4)
May 15 09:03:04 debpgstreama pgpool[768]: 2023-05-15 09:03:04.909: health_check0 pid 768: LOG: failed to connect to PostgreSQL server on "10.0.0.224:5432", getsockopt() failed
May 15 09:03:04 debpgstreama pgpool[768]: 2023-05-15 09:03:04.909: health_check0 pid 768: DETAIL: Operation now in progress
May 15 09:03:04 debpgstreama pgpool[768]: 2023-05-15 09:03:04.909: health_check0 pid 768: LOG: health check retrying on DB node: 0 (round:5)
May 15 09:03:05 debpgstreama pgpool[768]: 2023-05-15 09:03:05.911: health_check0 pid 768: LOG: failed to connect to PostgreSQL server on "10.0.0.224:5432", getsockopt() failed
May 15 09:03:05 debpgstreama pgpool[768]: 2023-05-15 09:03:05.911: health_check0 pid 768: DETAIL: Operation now in progress
May 15 09:03:05 debpgstreama pgpool[768]: 2023-05-15 09:03:05.911: health_check0 pid 768: LOG: health check retrying on DB node: 0 (round:6)
May 15 09:03:06 debpgstreama pgpool[768]: 2023-05-15 09:03:06.913: health_check0 pid 768: LOG: failed to connect to PostgreSQL server on "10.0.0.224:5432", getsockopt() failed
May 15 09:03:06 debpgstreama pgpool[768]: 2023-05-15 09:03:06.913: health_check0 pid 768: DETAIL: Operation now in progress
May 15 09:03:06 debpgstreama pgpool[768]: 2023-05-15 09:03:06.913: health_check0 pid 768: LOG: health check failed on node 0 (timeout:0)
May 15 09:03:06 debpgstreama pgpool[768]: 2023-05-15 09:03:06.913: health_check0 pid 768: LOG: received degenerate backend request for node_id: 0 from pid [768]
May 15 09:03:06 debpgstreama pgpool[768]: 2023-05-15 09:03:06.913: health_check0 pid 768: LOG: signal_user1_to_parent_with_reason(0)
May 15 09:03:06 debpgstreama pgpool[731]: 2023-05-15 09:03:06.914: main pid 731: LOG: Pgpool-II parent process received SIGUSR1
May 15 09:03:06 debpgstreama pgpool[731]: 2023-05-15 09:03:06.914: main pid 731: LOG: Pgpool-II parent process has received failover request
May 15 09:03:06 debpgstreama pgpool[731]: 2023-05-15 09:03:06.914: main pid 731: LOG: starting degeneration. shutdown host 10.0.0.224(5432)
May 15 09:03:06 debpgstreama pgpool[731]: 2023-05-15 09:03:06.927: main pid 731: LOG: Restart all children
May 15 09:03:06 debpgstreama pgpool[731]: 2023-05-15 09:03:06.927: main pid 731: LOG: execute command: /etc/pgpool2/scripts/failover.sh 0 10.0.0.224 5432 /data/pgcluster01 1 10.0.0.225 0 0 5432 /data/pgcluster01 10.0.0.224 5432 > /etc/pgpool2/scripts/failover.sh.log 2>&1
May 15 09:03:11 debpgstreama pgpool[767]: 2023-05-15 09:03:11.382: sr_check_worker pid 767: ERROR: Failed to check replication time lag
May 15 09:03:11 debpgstreama pgpool[767]: 2023-05-15 09:03:11.382: sr_check_worker pid 767: DETAIL: No persistent db connection for the node 0
May 15 09:03:11 debpgstreama pgpool[767]: 2023-05-15 09:03:11.382: sr_check_worker pid 767: HINT: check sr_check_user and sr_check_password
May 15 09:03:11 debpgstreama pgpool[767]: 2023-05-15 09:03:11.382: sr_check_worker pid 767: CONTEXT: while checking replication time lag
May 15 09:03:31 debpgstreama pgpool[731]: 2023-05-15 09:03:31.936: main pid 731: LOG: find_primary_node_repeatedly: waiting for finding a primary node
May 15 09:03:31 debpgstreama pgpool[731]: 2023-05-15 09:03:31.944: main pid 731: LOG: find_primary_node: primary node is 1
May 15 09:03:31 debpgstreama pgpool[731]: 2023-05-15 09:03:31.944: main pid 731: LOG: failover: set new primary node: 1
May 15 09:03:31 debpgstreama pgpool[731]: 2023-05-15 09:03:31.944: main pid 731: LOG: failover: set new main node: 1
May 15 09:03:31 debpgstreama pgpool[767]: 2023-05-15 09:03:31.950: sr_check_worker pid 767: LOG: worker process received restart request
May 15 09:03:31 debpgstreama pgpool[731]: failover done. shutdown host 10.0.0.224(5432)2023-05-15 09:03:31.951: main pid 731: LOG: failover done. shutdown host 10.0.0.224(5432)
May 15 09:03:32 debpgstreama pgpool[766]: 2023-05-15 09:03:32.951: pcp_main pid 766: LOG: restart request received in pcp child process
May 15 09:03:32 debpgstreama pgpool[731]: 2023-05-15 09:03:32.954: main pid 731: LOG: PCP child 766 exits with status 0 in failover()
May 15 09:03:32 debpgstreama pgpool[731]: 2023-05-15 09:03:32.955: main pid 731: LOG: fork a new PCP child pid 857 in failover()
May 15 09:03:32 debpgstreama pgpool[731]: 2023-05-15 09:03:32.955: main pid 731: LOG: child process with pid: 734 exits with status 256
May 15 09:03:32 debpgstreama pgpool[731]: 2023-05-15 09:03:32.955: main pid 731: LOG: child process with pid: 735 exits with status 256
May 15 09:03:32 debpgstreama pgpool[731]: 2023-05-15 09:03:32.955: main pid 731: LOG: child process with pid: 736 exits with status 256
May 15 09:03:32 debpgstreama pgpool[731]: 2023-05-15 09:03:32.955: main pid 731: LOG: child process with pid: 737 exits with status 256
May 15 09:03:32 debpgstreama pgpool[731]: 2023-05-15 09:03:32.955: main pid 731: LOG: child process with pid: 738 exits with status 256
May 15 09:03:32 debpgstreama pgpool[731]: 2023-05-15 09:03:32.955: main pid 731: LOG: child process with pid: 739 exits with status 256
May 15 09:03:32 debpgstreama pgpool[731]: 2023-05-15 09:03:32.955: main pid 731: LOG: child process with pid: 740 exits with status 256
May 15 09:03:32 debpgstreama pgpool[731]: 2023-05-15 09:03:32.955: main pid 731: LOG: child process with pid: 741 exits with status 256
May 15 09:03:32 debpgstreama pgpool[731]: 2023-05-15 09:03:32.955: main pid 731: LOG: child process with pid: 742 exits with status 256
May 15 09:03:32 debpgstreama pgpool[731]: 2023-05-15 09:03:32.955: main pid 731: LOG: child process with pid: 743 exits with status 256
May 15 09:03:32 debpgstreama pgpool[731]: 2023-05-15 09:03:32.955: main pid 731: LOG: child process with pid: 744 exits with status 256
May 15 09:03:32 debpgstreama pgpool[731]: 2023-05-15 09:03:32.955: main pid 731: LOG: child process with pid: 745 exits with status 256
May 15 09:03:32 debpgstreama pgpool[731]: 2023-05-15 09:03:32.955: main pid 731: LOG: child process with pid: 746 exits with status 256
May 15 09:03:32 debpgstreama pgpool[731]: 2023-05-15 09:03:32.955: main pid 731: LOG: child process with pid: 747 exits with status 256
May 15 09:03:32 debpgstreama pgpool[731]: 2023-05-15 09:03:32.955: main pid 731: LOG: child process with pid: 748 exits with status 256
May 15 09:03:32 debpgstreama pgpool[731]: 2023-05-15 09:03:32.956: main pid 731: LOG: child process with pid: 749 exits with status 256
May 15 09:03:32 debpgstreama pgpool[731]: 2023-05-15 09:03:32.956: main pid 731: LOG: child process with pid: 750 exits with status 256
May 15 09:03:32 debpgstreama pgpool[731]: 2023-05-15 09:03:32.956: main pid 731: LOG: child process with pid: 751 exits with status 256
May 15 09:03:32 debpgstreama pgpool[857]: 2023-05-15 09:03:32.956: pcp_main pid 857: LOG: PCP process: 857 started
May 15 09:03:32 debpgstreama pgpool[731]: 2023-05-15 09:03:32.956: main pid 731: LOG: child process with pid: 752 exits with status 256
May 15 09:03:32 debpgstreama pgpool[731]: 2023-05-15 09:03:32.956: main pid 731: LOG: child process with pid: 753 exits with status 256
May 15 09:03:32 debpgstreama pgpool[731]: 2023-05-15 09:03:32.956: main pid 731: LOG: child process with pid: 754 exits with status 256
May 15 09:03:32 debpgstreama pgpool[731]: 2023-05-15 09:03:32.956: main pid 731: LOG: child process with pid: 755 exits with status 256
May 15 09:03:32 debpgstreama pgpool[731]: 2023-05-15 09:03:32.956: main pid 731: LOG: child process with pid: 756 exits with status 256
May 15 09:03:32 debpgstreama pgpool[731]: 2023-05-15 09:03:32.956: main pid 731: LOG: child process with pid: 757 exits with status 256
May 15 09:03:32 debpgstreama pgpool[731]: 2023-05-15 09:03:32.956: main pid 731: LOG: child process with pid: 758 exits with status 256
May 15 09:03:32 debpgstreama pgpool[731]: 2023-05-15 09:03:32.956: main pid 731: LOG: child process with pid: 759 exits with status 256
May 15 09:03:32 debpgstreama pgpool[731]: 2023-05-15 09:03:32.956: main pid 731: LOG: child process with pid: 760 exits with status 256
May 15 09:03:32 debpgstreama pgpool[731]: 2023-05-15 09:03:32.956: main pid 731: LOG: child process with pid: 761 exits with status 256
May 15 09:03:32 debpgstreama pgpool[731]: 2023-05-15 09:03:32.956: main pid 731: LOG: child process with pid: 762 exits with status 256
May 15 09:03:32 debpgstreama pgpool[731]: 2023-05-15 09:03:32.956: main pid 731: LOG: child process with pid: 763 exits with status 256
May 15 09:03:32 debpgstreama pgpool[731]: 2023-05-15 09:03:32.956: main pid 731: LOG: child process with pid: 764 exits with status 256
May 15 09:03:32 debpgstreama pgpool[731]: 2023-05-15 09:03:32.956: main pid 731: LOG: child process with pid: 765 exits with status 256
May 15 09:03:32 debpgstreama pgpool[731]: 2023-05-15 09:03:32.956: main pid 731: LOG: worker child process with pid: 767 exits with status 256
May 15 09:03:32 debpgstreama pgpool[731]: 2023-05-15 09:03:32.957: main pid 731: LOG: fork a new worker child process with pid: 858
May 15 09:03:32 debpgstreama pgpool[858]: 2023-05-15 09:03:32.957: sr_check_worker pid 858: LOG: process started
```

---

According to your log, I think you are setting:

```
health_check_max_retries = 6
health_check_retry_delay = 1
```

```
May 15 09:03:00 debpgstreama pgpool[768]: 2023-05-15 09:03:00.901: health_check0 pid 768: LOG: failed to connect to PostgreSQL server on "10.0.0.224:5432", getsockopt() failed
May 15 09:03:00 debpgstreama pgpool[768]: 2023-05-15 09:03:00.901: health_check0 pid 768: DETAIL: Operation now in progress
May 15 09:03:00 debpgstreama pgpool[768]: 2023-05-15 09:03:00.901: health_check0 pid 768: LOG: health check retrying on DB node: 0 (round:1)
May 15 09:03:01 debpgstreama pgpool[767]: 2023-05-15 09:03:01.367: sr_check_worker pid 767: LOG: failed to connect to PostgreSQL server on "10.0.0.224:5432", getsockopt() failed
May 15 09:03:01 debpgstreama pgpool[767]: 2023-05-15 09:03:01.367: sr_check_worker pid 767: DETAIL: Operation now in progress
May 15 09:03:01 debpgstreama pgpool[768]: 2023-05-15 09:03:01.903: health_check0 pid 768: LOG: failed to connect to PostgreSQL server on "10.0.0.224:5432", getsockopt() failed
May 15 09:03:01 debpgstreama pgpool[768]: 2023-05-15 09:03:01.903: health_check0 pid 768: DETAIL: Operation now in progress
May 15 09:03:01 debpgstreama pgpool[768]: 2023-05-15 09:03:01.903: health_check0 pid 768: LOG: health check retrying on DB node: 0 (round:2)
May 15 09:03:02 debpgstreama pgpool[768]: 2023-05-15 09:03:02.905: health_check0 pid 768: LOG: failed to connect to PostgreSQL server on "10.0.0.224:5432", getsockopt() failed
May 15 09:03:02 debpgstreama pgpool[768]: 2023-05-15 09:03:02.905: health_check0 pid 768: DETAIL: Operation now in progress
May 15 09:03:02 debpgstreama pgpool[768]: 2023-05-15 09:03:02.905: health_check0 pid 768: LOG: health check retrying on DB node: 0 (round:3)
May 15 09:03:03 debpgstreama pgpool[768]: 2023-05-15 09:03:03.907: health_check0 pid 768: LOG: failed to connect to PostgreSQL server on "10.0.0.224:5432", getsockopt() failed
May 15 09:03:03 debpgstreama pgpool[768]: 2023-05-15 09:03:03.907: health_check0 pid 768: DETAIL: Operation now in progress
May 15 09:03:03 debpgstreama pgpool[768]: 2023-05-15 09:03:03.907: health_check0 pid 768: LOG: health check retrying on DB node: 0 (round:4)
May 15 09:03:04 debpgstreama pgpool[768]: 2023-05-15 09:03:04.909: health_check0 pid 768: LOG: failed to connect to PostgreSQL server on "10.0.0.224:5432", getsockopt() failed
May 15 09:03:04 debpgstreama pgpool[768]: 2023-05-15 09:03:04.909: health_check0 pid 768: DETAIL: Operation now in progress
May 15 09:03:04 debpgstreama pgpool[768]: 2023-05-15 09:03:04.909: health_check0 pid 768: LOG: health check retrying on DB node: 0 (round:5)
May 15 09:03:05 debpgstreama pgpool[768]: 2023-05-15 09:03:05.911: health_check0 pid 768: LOG: failed to connect to PostgreSQL server on "10.0.0.224:5432", getsockopt() failed
May 15 09:03:05 debpgstreama pgpool[768]: 2023-05-15 09:03:05.911: health_check0 pid 768: DETAIL: Operation now in progress
May 15 09:03:05 debpgstreama pgpool[768]: 2023-05-15 09:03:05.911: health_check0 pid 768: LOG: health check retrying on DB node: 0 (round:6)
May 15 09:03:06 debpgstreama pgpool[768]: 2023-05-15 09:03:06.913: health_check0 pid 768: LOG: failed to connect to PostgreSQL server on "10.0.0.224:5432", getsockopt() failed
May 15 09:03:06 debpgstreama pgpool[768]: 2023-05-15 09:03:06.913: health_check0 pid 768: DETAIL: Operation now in progress
May 15 09:03:06 debpgstreama pgpool[768]: 2023-05-15 09:03:06.913: health_check0 pid 768: LOG: health check failed on node 0 (timeout:0)
```

If pgpool detects a node failure, pgpool will retry every 1 second (the default value of health_check_retry_delay). Pgpool-II uses the *health_check_retry_delay* setting during retries, not health_check_period. I think the above is the expected behavior.
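The retry spacing described in this note can be sanity-checked against the quoted log. A small illustrative sketch (assuming, as the note says, that retries are scheduled health_check_retry_delay seconds apart, with health_check_period only governing the interval between checks while the node is healthy):

```python
from datetime import datetime, timedelta

retry_delay = timedelta(seconds=1)  # health_check_retry_delay = 1
first_failure = datetime(2023, 5, 15, 9, 3, 0)  # first failed check in the log

# Expected retry times if retries are spaced by retry_delay,
# not by health_check_period.
expected = [first_failure + (i + 1) * retry_delay for i in range(6)]
print([t.strftime("%H:%M:%S") for t in expected])
# ['09:03:01', '09:03:02', '09:03:03', '09:03:04', '09:03:05', '09:03:06']
```

These computed times line up with the round:1 through round:6 timestamps in the log above.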

---

No, that is NOT correct. That is exactly the problem/bug. Here are the actual parameters that were set during *all* tests:

```
health_check_period = 10
health_check_max_retries = 6
```

I am also attaching another log from this morning showing how these work ***correctly*** when traffic is blocked instead of postgres being shut down. Please review the details in the original bug description to fully understand the two test scenarios: health_check_period and health_check_max_retries work correctly when traffic is blocked (today's log), and they ARE NOT RESPECTED when the PostgreSQL service is stopped instead of traffic being blocked (yesterday's log).

Thanks, BT

20230516_health_checks_working_correctly_when_traffic_is_blocked.txt (14,905 bytes)
```
May 16 08:53:34 debpgstreama pgpool[456]: 2023-05-16 08:53:34.037: main pid 456: LOG: health_check_stats_shared_memory_size: requested size: 12288
May 16 08:53:34 debpgstreama pgpool[456]: 2023-05-16 08:53:34.037: main pid 456: LOG: health_check_stats_shared_memory_size: requested size: 12288
May 16 08:53:34 debpgstreama pgpool[456]: 2023-05-16 08:53:34.037: main pid 456: LOG: memory cache initialized
May 16 08:53:34 debpgstreama pgpool[456]: 2023-05-16 08:53:34.037: main pid 456: DETAIL: memcache blocks :64
May 16 08:53:34 debpgstreama pgpool[456]: 2023-05-16 08:53:34.039: main pid 456: LOG: pool_discard_oid_maps: discarded memqcache oid maps
May 16 08:53:34 debpgstreama pgpool[456]: 2023-05-16 08:53:34.043: main pid 456: LOG: Setting up socket for 0.0.0.0:9999
May 16 08:53:34 debpgstreama pgpool[456]: 2023-05-16 08:53:34.043: main pid 456: LOG: Setting up socket for :::9999
May 16 08:53:34 debpgstreama pgpool[456]: 2023-05-16 08:53:34.048: main pid 456: LOG: find_primary_node_repeatedly: waiting for finding a primary node
May 16 08:53:34 debpgstreama pgpool[456]: 2023-05-16 08:53:34.093: main pid 456: LOG: find_primary_node: primary node is 0
May 16 08:53:34 debpgstreama pgpool[456]: 2023-05-16 08:53:34.093: main pid 456: LOG: find_primary_node: standby node is 1
May 16 08:53:34 debpgstreama pgpool[556]: 2023-05-16 08:53:34.094: pcp_main pid 556: LOG: PCP process: 556 started
May 16 08:53:34 debpgstreama pgpool[557]: 2023-05-16 08:53:34.094: sr_check_worker pid 557: LOG: process started
May 16 08:53:34 debpgstreama pgpool[558]: 2023-05-16 08:53:34.095: health_check pid 558: LOG: process started
May 16 08:53:34 debpgstreama pgpool[559]: 2023-05-16 08:53:34.095: health_check pid 559: LOG: process started
May 16 08:53:34 debpgstreama pgpool[456]: 2023-05-16 08:53:34.096: main pid 456: LOG: pgpool-II successfully started. version 4.3.5 (tamahomeboshi)
May 16 08:53:34 debpgstreama pgpool[456]: 2023-05-16 08:53:34.096: main pid 456: LOG: node status[0]: 1
May 16 08:53:34 debpgstreama pgpool[456]: 2023-05-16 08:53:34.096: main pid 456: LOG: node status[1]: 2
May 16 08:54:43 debpgstreama pgpool[556]: 2023-05-16 08:54:43.994: pcp_main pid 556: LOG: forked new pcp worker, pid=641 socket=7
May 16 08:54:44 debpgstreama pgpool[556]: 2023-05-16 08:54:44.023: pcp_main pid 556: LOG: PCP process with pid: 641 exit with SUCCESS.
May 16 08:54:44 debpgstreama pgpool[556]: 2023-05-16 08:54:44.023: pcp_main pid 556: LOG: PCP process with pid: 641 exits with status 0
May 16 08:56:22 debpgstreama pgpool[558]: 2023-05-16 08:56:22.050: health_check0 pid 558: LOG: failed to connect to PostgreSQL server on "10.0.0.224:5432", timed out
May 16 08:56:22 debpgstreama pgpool[558]: 2023-05-16 08:56:22.050: health_check0 pid 558: LOG: health check retrying on DB node: 0 (round:1)
May 16 08:56:22 debpgstreama pgpool[557]: 2023-05-16 08:56:22.197: sr_check_worker pid 557: LOG: failed to connect to PostgreSQL server on "10.0.0.224:5432", timed out
May 16 08:56:32 debpgstreama pgpool[557]: 2023-05-16 08:56:32.210: sr_check_worker pid 557: ERROR: Failed to check replication time lag
May 16 08:56:32 debpgstreama pgpool[557]: 2023-05-16 08:56:32.210: sr_check_worker pid 557: DETAIL: No persistent db connection for the node 0
May 16 08:56:32 debpgstreama pgpool[557]: 2023-05-16 08:56:32.210: sr_check_worker pid 557: HINT: check sr_check_user and sr_check_password
May 16 08:56:32 debpgstreama pgpool[557]: 2023-05-16 08:56:32.210: sr_check_worker pid 557: CONTEXT: while checking replication time lag
May 16 08:56:33 debpgstreama pgpool[558]: 2023-05-16 08:56:33.061: health_check0 pid 558: LOG: failed to connect to PostgreSQL server on "10.0.0.224:5432", timed out
May 16 08:56:33 debpgstreama pgpool[558]: 2023-05-16 08:56:33.062: health_check0 pid 558: LOG: health check retrying on DB node: 0 (round:2)
May 16 08:56:42 debpgstreama pgpool[557]: 2023-05-16 08:56:42.220: sr_check_worker pid 557: LOG: failed to connect to PostgreSQL server on "10.0.0.224:5432", timed out
May 16 08:56:44 debpgstreama pgpool[558]: 2023-05-16 08:56:44.072: health_check0 pid 558: LOG: failed to connect to PostgreSQL server on "10.0.0.224:5432", timed out
May 16 08:56:44 debpgstreama pgpool[558]: 2023-05-16 08:56:44.072: health_check0 pid 558: LOG: health check retrying on DB node: 0 (round:3)
May 16 08:56:52 debpgstreama pgpool[557]: 2023-05-16 08:56:52.226: sr_check_worker pid 557: ERROR: Failed to check replication time lag
May 16 08:56:52 debpgstreama pgpool[557]: 2023-05-16 08:56:52.226: sr_check_worker pid 557: DETAIL: No persistent db connection for the node 0
May 16 08:56:52 debpgstreama pgpool[557]: 2023-05-16 08:56:52.226: sr_check_worker pid 557: HINT: check sr_check_user and sr_check_password
May 16 08:56:52 debpgstreama pgpool[557]: 2023-05-16 08:56:52.226: sr_check_worker pid 557: CONTEXT: while checking replication time lag
May 16 08:56:55 debpgstreama pgpool[558]: 2023-05-16 08:56:55.083: health_check0 pid 558: LOG: failed to connect to PostgreSQL server on "10.0.0.224:5432", timed out
May 16 08:56:55 debpgstreama pgpool[558]: 2023-05-16 08:56:55.083: health_check0 pid 558: LOG: health check retrying on DB node: 0 (round:4)
May 16 08:57:02 debpgstreama pgpool[557]: 2023-05-16 08:57:02.236: sr_check_worker pid 557: LOG: failed to connect to PostgreSQL server on "10.0.0.224:5432", timed out
May 16 08:57:06 debpgstreama pgpool[558]: 2023-05-16 08:57:06.094: health_check0 pid 558: LOG: failed to connect to PostgreSQL server on "10.0.0.224:5432", timed out
May 16 08:57:06 debpgstreama pgpool[558]: 2023-05-16 08:57:06.094: health_check0 pid 558: LOG: health check retrying on DB node: 0 (round:5)
May 16 08:57:12 debpgstreama pgpool[557]: 2023-05-16 08:57:12.251: sr_check_worker pid 557: ERROR: Failed to check replication time lag
May 16 08:57:12 debpgstreama pgpool[557]: 2023-05-16 08:57:12.251: sr_check_worker pid 557: DETAIL: No persistent db connection for the node 0
May 16 08:57:12 debpgstreama pgpool[557]: 2023-05-16 08:57:12.251: sr_check_worker pid 557: HINT: check sr_check_user and sr_check_password
May 16 08:57:12 debpgstreama pgpool[557]: 2023-05-16 08:57:12.251: sr_check_worker pid 557: CONTEXT: while checking replication time lag
May 16 08:57:17 debpgstreama pgpool[558]: 2023-05-16 08:57:17.105: health_check0 pid 558: LOG: failed to connect to PostgreSQL server on "10.0.0.224:5432", timed out
May 16 08:57:17 debpgstreama pgpool[558]: 2023-05-16 08:57:17.105: health_check0 pid 558: LOG: health check retrying on DB node: 0 (round:6)
May 16 08:57:22 debpgstreama pgpool[557]: 2023-05-16 08:57:22.262: sr_check_worker pid 557: LOG: failed to connect to PostgreSQL server on "10.0.0.224:5432", timed out
May 16 08:57:28 debpgstreama pgpool[558]: 2023-05-16 08:57:28.116: health_check0 pid 558: LOG: failed to connect to PostgreSQL server on "10.0.0.224:5432", timed out
May 16 08:57:28 debpgstreama pgpool[558]: 2023-05-16 08:57:28.116: health_check0 pid 558: LOG: health check failed on node 0 (timeout:0)
May 16 08:57:28 debpgstreama pgpool[558]: 2023-05-16 08:57:28.116: health_check0 pid 558: LOG: received degenerate backend request for node_id: 0 from pid [558]
May 16 08:57:28 debpgstreama pgpool[558]: 2023-05-16 08:57:28.116: health_check0 pid 558: LOG: signal_user1_to_parent_with_reason(0)
May 16 08:57:28 debpgstreama pgpool[456]: 2023-05-16 08:57:28.117: main pid 456: LOG: Pgpool-II parent process received SIGUSR1
May 16 08:57:28 debpgstreama pgpool[456]: 2023-05-16 08:57:28.117: main pid 456: LOG: Pgpool-II parent process has received failover request
May 16 08:57:28 debpgstreama pgpool[456]: 2023-05-16 08:57:28.117: main pid 456: LOG: starting degeneration. shutdown host 10.0.0.224(5432)
May 16 08:57:28 debpgstreama pgpool[456]: 2023-05-16 08:57:28.125: main pid 456: LOG: Restart all children
May 16 08:57:28 debpgstreama pgpool[456]: 2023-05-16 08:57:28.125: main pid 456: LOG: execute command: /etc/pgpool2/scripts/failover.sh 0 10.0.0.224 5432 /data/pgcluster01 1 10.0.0.225 0 0 5432 /data/pgcluster01 10.0.0.224 5432 > /etc/pgpool2/scripts/failover.sh.log 2>&1
May 16 08:57:32 debpgstreama pgpool[557]: 2023-05-16 08:57:32.268: sr_check_worker pid 557: ERROR: Failed to check replication time lag
May 16 08:57:32 debpgstreama pgpool[557]: 2023-05-16 08:57:32.268: sr_check_worker pid 557: DETAIL: No persistent db connection for the node 0
May 16 08:57:32 debpgstreama pgpool[557]: 2023-05-16 08:57:32.268: sr_check_worker pid 557: HINT: check sr_check_user and sr_check_password
May 16 08:57:32 debpgstreama pgpool[557]: 2023-05-16 08:57:32.268: sr_check_worker pid 557: CONTEXT: while checking replication time lag
May 16 08:58:00 debpgstreama pgpool[456]: 2023-05-16 08:58:00.193: main pid 456: LOG: find_primary_node_repeatedly: waiting for finding a primary node
May 16 08:58:00 debpgstreama pgpool[456]: 2023-05-16 08:58:00.199: main pid 456: LOG: find_primary_node: primary node is 1
May 16 08:58:00 debpgstreama pgpool[456]: 2023-05-16 08:58:00.199: main pid 456: LOG: failover: set new primary node: 1
May 16 08:58:00 debpgstreama pgpool[456]: 2023-05-16 08:58:00.199: main pid 456: LOG: failover: set new main node: 1
May 16 08:58:00 debpgstreama pgpool[557]: 2023-05-16 08:58:00.205: sr_check_worker pid 557: LOG: worker process received restart request
May 16 08:58:00 debpgstreama pgpool[456]: failover done. shutdown host 10.0.0.224(5432)2023-05-16 08:58:00.205: main pid 456: LOG: failover done. shutdown host 10.0.0.224(5432)
May 16 08:58:01 debpgstreama pgpool[556]: 2023-05-16 08:58:01.206: pcp_main pid 556: LOG: restart request received in pcp child process
May 16 08:58:01 debpgstreama pgpool[456]: 2023-05-16 08:58:01.207: main pid 456: LOG: PCP child 556 exits with status 0 in failover()
May 16 08:58:01 debpgstreama pgpool[456]: 2023-05-16 08:58:01.207: main pid 456: LOG: fork a new PCP child pid 757 in failover()
May 16 08:58:01 debpgstreama pgpool[456]: 2023-05-16 08:58:01.207: main pid 456: LOG: child process with pid: 480 exits with status 256
May 16 08:58:01 debpgstreama pgpool[456]: 2023-05-16 08:58:01.207: main pid 456: LOG: child process with pid: 481 exits with status 256
May 16 08:58:01 debpgstreama pgpool[456]: 2023-05-16 08:58:01.207: main pid 456: LOG: child process with pid: 482 exits with status 256
May 16 08:58:01 debpgstreama pgpool[456]: 2023-05-16 08:58:01.207: main pid 456: LOG: child process with pid: 483 exits with status 256
May 16 08:58:01 debpgstreama pgpool[456]: 2023-05-16 08:58:01.207: main pid 456: LOG: child process with pid: 484 exits with status 256
May 16 08:58:01 debpgstreama pgpool[456]: 2023-05-16 08:58:01.207: main pid 456: LOG: child process with pid: 485 exits with status 256
May 16 08:58:01 debpgstreama pgpool[456]: 2023-05-16 08:58:01.207: main pid 456: LOG: child process with pid: 486 exits with status 256
May 16 08:58:01 debpgstreama pgpool[456]: 2023-05-16 08:58:01.207: main pid 456: LOG: child process with pid: 487 exits with status 256
May 16 08:58:01 debpgstreama pgpool[456]: 2023-05-16 08:58:01.207: main pid 456: LOG: child process with pid: 488 exits with status 256
May 16 08:58:01 debpgstreama pgpool[456]: 2023-05-16 08:58:01.207: main pid 456: LOG: child process with pid: 489 exits with status 256
May 16 08:58:01 debpgstreama pgpool[456]: 2023-05-16 08:58:01.207: main pid 456: LOG: child process with pid: 490 exits with status 256
May 16 08:58:01 debpgstreama pgpool[456]: 2023-05-16 08:58:01.207: main pid 456: LOG: child process with pid: 491 exits with status 256
May 16 08:58:01 debpgstreama pgpool[456]: 2023-05-16 08:58:01.207: main pid 456: LOG: child process with pid: 492 exits with status 256
May 16 08:58:01 debpgstreama pgpool[456]: 2023-05-16 08:58:01.207: main pid 456: LOG: child process with pid: 493 exits with status 256
May 16 08:58:01 debpgstreama pgpool[456]: 2023-05-16 08:58:01.207: main pid 456: LOG: child process with pid: 494 exits with status 256
May 16 08:58:01 debpgstreama pgpool[456]: 2023-05-16 08:58:01.207: main pid 456: LOG: child process with pid: 495 exits with status 256
May 16 08:58:01 debpgstreama pgpool[456]: 2023-05-16 08:58:01.207: main pid 456: LOG: child process with pid: 496 exits with status 256
May 16 08:58:01 debpgstreama pgpool[456]: 2023-05-16 08:58:01.207: main pid 456: LOG: child process with pid: 497 exits with status 256
May 16 08:58:01 debpgstreama pgpool[456]: 2023-05-16 08:58:01.207: main pid 456: LOG: child process with pid: 498 exits with status 256
May 16 08:58:01 debpgstreama pgpool[456]: 2023-05-16 08:58:01.207: main pid 456: LOG: child process with pid: 499 exits with status 256
May 16 08:58:01 debpgstreama pgpool[456]: 2023-05-16 08:58:01.207: main pid 456: LOG: child process with pid: 500 exits with status 256
May 16 08:58:01 debpgstreama pgpool[456]: 2023-05-16 08:58:01.207: main pid 456: LOG: child process with pid: 501 exits with status 256
May 16 08:58:01 debpgstreama pgpool[456]: 2023-05-16 08:58:01.207: main pid 456: LOG: child process with pid: 502 exits with status 256
May 16 08:58:01 debpgstreama pgpool[456]: 2023-05-16 08:58:01.207: main pid 456: LOG: child process with pid: 503 exits with status 256
May 16 08:58:01 debpgstreama pgpool[456]: 2023-05-16 08:58:01.207: main pid 456: LOG: child process with pid: 504 exits with status 256
May 16 08:58:01 debpgstreama pgpool[456]: 2023-05-16 08:58:01.207: main pid 456: LOG: child process with pid: 505 exits with status 256
May 16 08:58:01 debpgstreama 
```
pgpool[456]: 2023-05-16 08:58:01.207: main pid 456: LOG: child process with pid: 506 exits with status 256 May 16 08:58:01 debpgstreama pgpool[456]: 2023-05-16 08:58:01.207: main pid 456: LOG: child process with pid: 507 exits with status 256 May 16 08:58:01 debpgstreama pgpool[456]: 2023-05-16 08:58:01.207: main pid 456: LOG: child process with pid: 508 exits with status 256 May 16 08:58:01 debpgstreama pgpool[456]: 2023-05-16 08:58:01.207: main pid 456: LOG: child process with pid: 509 exits with status 256 May 16 08:58:01 debpgstreama pgpool[456]: 2023-05-16 08:58:01.207: main pid 456: LOG: child process with pid: 510 exits with status 256 May 16 08:58:01 debpgstreama pgpool[456]: 2023-05-16 08:58:01.207: main pid 456: LOG: child process with pid: 511 exits with status 256 May 16 08:58:01 debpgstreama pgpool[456]: 2023-05-16 08:58:01.207: main pid 456: LOG: worker child process with pid: 557 exits with status 256 May 16 08:58:01 debpgstreama pgpool[757]: 2023-05-16 08:58:01.207: pcp_main pid 757: LOG: PCP process: 757 started May 16 08:58:01 debpgstreama pgpool[456]: 2023-05-16 08:58:01.208: main pid 456: LOG: fork a new worker child process with pid: 758 May 16 08:58:01 debpgstreama pgpool[758]: 2023-05-16 08:58:01.208: sr_check_worker pid 758: LOG: process started |
So after a second look today, I implemented the health_check_retry_delay parameter you mentioned, with a value of 10, and that creates a reasonable delay for the two cases as follows:

```
health_check_period = 10
health_check_max_retries = 6
health_check_retry_delay = 10
```

| Scenario | Effective delay |
|---|---|
| PostgreSQL is shut down | 10 seconds per check |
| Traffic is blocked | 20 seconds per check |

This behavior is much better than the 1 second and 11 seconds from before. The whole point of us implementing these delays is to avoid an unnecessary failover caused by network issues, brief PostgreSQL maintenance that requires a shutdown/startup, or a server reboot for maintenance, for example.

It's up to you whether you want to pursue this further. I think the behavior of different delays depending on the type of primary failure is definitely happening and reproducible; however, as I mentioned, with health_check_retry_delay = 10 the behavior becomes very manageable. I am happy to help with any questions/logs/testing either way - let me know.

Thanks, BT

```
May 16 10:12:18 debpgstreama pgpool[507]: 2023-05-16 10:12:18.504: health_check0 pid 507: LOG: failed to connect to PostgreSQL server on "10.0.0.224:5432", timed out
May 16 10:12:18 debpgstreama pgpool[507]: 2023-05-16 10:12:18.504: health_check0 pid 507: LOG: health check retrying on DB node: 0 (round:1)
May 16 10:12:18 debpgstreama pgpool[506]: 2023-05-16 10:12:18.904: sr_check_worker pid 506: LOG: failed to connect to PostgreSQL server on "10.0.0.224:5432", timed out
May 16 10:12:28 debpgstreama pgpool[506]: 2023-05-16 10:12:28.918: sr_check_worker pid 506: ERROR: Failed to check replication time lag
May 16 10:12:28 debpgstreama pgpool[506]: 2023-05-16 10:12:28.918: sr_check_worker pid 506: DETAIL: No persistent db connection for the node 0
May 16 10:12:28 debpgstreama pgpool[506]: 2023-05-16 10:12:28.918: sr_check_worker pid 506: HINT: check sr_check_user and sr_check_password
May 16 10:12:28 debpgstreama pgpool[506]: 2023-05-16 10:12:28.918: sr_check_worker pid 506: CONTEXT: while checking replication time lag
May 16 10:12:38 debpgstreama pgpool[507]: 2023-05-16 10:12:38.514: health_check0 pid 507: LOG: failed to connect to PostgreSQL server on "10.0.0.224:5432", timed out
May 16 10:12:38 debpgstreama pgpool[507]: 2023-05-16 10:12:38.514: health_check0 pid 507: LOG: health check retrying on DB node: 0 (round:2)
May 16 10:12:38 debpgstreama pgpool[506]: 2023-05-16 10:12:38.929: sr_check_worker pid 506: LOG: failed to connect to PostgreSQL server on "10.0.0.224:5432", timed out
May 16 10:12:48 debpgstreama pgpool[506]: 2023-05-16 10:12:48.942: sr_check_worker pid 506: ERROR: Failed to check replication time lag
May 16 10:12:48 debpgstreama pgpool[506]: 2023-05-16 10:12:48.942: sr_check_worker pid 506: DETAIL: No persistent db connection for the node 0
May 16 10:12:48 debpgstreama pgpool[506]: 2023-05-16 10:12:48.942: sr_check_worker pid 506: HINT: check sr_check_user and sr_check_password
May 16 10:12:48 debpgstreama pgpool[506]: 2023-05-16 10:12:48.942: sr_check_worker pid 506: CONTEXT: while checking replication time lag
May 16 10:12:58 debpgstreama pgpool[507]: 2023-05-16 10:12:58.525: health_check0 pid 507: LOG: failed to connect to PostgreSQL server on "10.0.0.224:5432", timed out
May 16 10:12:58 debpgstreama pgpool[507]: 2023-05-16 10:12:58.525: health_check0 pid 507: LOG: health check retrying on DB node: 0 (round:3)
May 16 10:12:58 debpgstreama pgpool[506]: 2023-05-16 10:12:58.953: sr_check_worker pid 506: LOG: failed to connect to PostgreSQL server on "10.0.0.224:5432", timed out
May 16 10:13:08 debpgstreama pgpool[506]: 2023-05-16 10:13:08.960: sr_check_worker pid 506: ERROR: Failed to check replication time lag
May 16 10:13:08 debpgstreama pgpool[506]: 2023-05-16 10:13:08.960: sr_check_worker pid 506: DETAIL: No persistent db connection for the node 0
May 16 10:13:08 debpgstreama pgpool[506]: 2023-05-16 10:13:08.960: sr_check_worker pid 506: HINT: check sr_check_user and sr_check_password
May 16 10:13:08 debpgstreama pgpool[506]: 2023-05-16 10:13:08.960: sr_check_worker pid 506: CONTEXT: while checking replication time lag
May 16 10:13:18 debpgstreama pgpool[507]: 2023-05-16 10:13:18.536: health_check0 pid 507: LOG: failed to connect to PostgreSQL server on "10.0.0.224:5432", timed out
May 16 10:13:18 debpgstreama pgpool[507]: 2023-05-16 10:13:18.536: health_check0 pid 507: LOG: health check retrying on DB node: 0 (round:4)
May 16 10:13:18 debpgstreama pgpool[506]: 2023-05-16 10:13:18.970: sr_check_worker pid 506: LOG: failed to connect to PostgreSQL server on "10.0.0.224:5432", timed out
May 16 10:13:28 debpgstreama pgpool[506]: 2023-05-16 10:13:28.986: sr_check_worker pid 506: ERROR: Failed to check replication time lag
May 16 10:13:28 debpgstreama pgpool[506]: 2023-05-16 10:13:28.986: sr_check_worker pid 506: DETAIL: No persistent db connection for the node 0
May 16 10:13:28 debpgstreama pgpool[506]: 2023-05-16 10:13:28.986: sr_check_worker pid 506: HINT: check sr_check_user and sr_check_password
May 16 10:13:28 debpgstreama pgpool[506]: 2023-05-16 10:13:28.986: sr_check_worker pid 506: CONTEXT: while checking replication time lag
May 16 10:13:38 debpgstreama pgpool[507]: 2023-05-16 10:13:38.547: health_check0 pid 507: LOG: failed to connect to PostgreSQL server on "10.0.0.224:5432", timed out
```
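For what it's worth, the arithmetic behind the two effective delays can be sketched like this. This is a back-of-the-envelope estimate, not pgpool's actual scheduling code; `probe_seconds` and `failover_delay` are names I made up here, standing for how long one failed connection attempt takes (roughly 0 for "Connection refused", up to the connect timeout when traffic is silently dropped):

```python
# Hypothetical estimate of the time between the first failed health check and
# the failover decision, assuming pgpool sleeps health_check_retry_delay
# between retries and each probe itself blocks for probe_seconds.

def failover_delay(max_retries: int, retry_delay: int, probe_seconds: float) -> float:
    """Approximate seconds of failed probing before the node is degenerated."""
    # one initial failed probe, then max_retries retry cycles
    return probe_seconds + max_retries * (retry_delay + probe_seconds)

# Values from the note above: health_check_max_retries = 6, retry_delay = 10
print(failover_delay(6, 10, 0))    # "Connection refused": probe fails instantly -> 60.0 s
print(failover_delay(6, 10, 10))   # blocked traffic: each probe blocks ~10 s -> 130.0 s
```

If this model is right, it would explain the observed difference: a refused connection contributes nothing per attempt, so only the retry delay counts (~10 s per check), while a silently dropped connection adds the probe timeout on top of it (~20 s per check).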
| Date Modified | Username | Field | Change |
|---|---|---|---|
| 2023-05-11 00:28 | bt9201340128 | New Issue | |
| 2023-05-11 11:21 | pengbo | Assigned To | => pengbo |
| 2023-05-11 11:21 | pengbo | Status | new => assigned |
| 2023-05-15 16:48 | pengbo | Note Added: 0004377 | |
| 2023-05-15 22:36 | bt9201340128 | Note Added: 0004379 | |
| 2023-05-15 22:36 | bt9201340128 | File Added: 20230515_set_failover_on_backend_error_to_off_and_re-test.txt | |
| 2023-05-16 13:42 | pengbo | Note Added: 0004384 | |
| 2023-05-16 13:43 | pengbo | Status | assigned => feedback |
| 2023-05-16 22:12 | bt9201340128 | Note Added: 0004385 | |
| 2023-05-16 22:12 | bt9201340128 | File Added: 20230516_health_checks_working_correctly_when_traffic_is_blocked.txt | |
| 2023-05-16 22:12 | bt9201340128 | Status | feedback => assigned |
| 2023-05-16 23:28 | bt9201340128 | Note Added: 0004386 |