View Issue Details
| ID | Project | Category | View Status | Date Submitted | Last Update |
|---|---|---|---|---|---|
| 0000825 | Pgpool-II | General | public | 2024-01-17 15:30 | 2024-02-06 12:09 |

| Field | Value | Field | Value | Field | Value |
|---|---|---|---|---|---|
| Reporter | Srini Balakrishnan | Assigned To | pengbo | | |
| Priority | normal | Severity | minor | Reproducibility | have not tried |
| Status | assigned | Resolution | open | | |
| Product Version | 4.3.5 | | | | |
| Summary | 0000825: Auto Failover and Recovery options | | | | |
Description:

Hi, I have set up Pgpool-II 4.3.5 on a 3-node PostgreSQL (14.10) cluster; PostgreSQL and Pgpool-II are installed on the same nodes. One requirement for our POC to move to production is to enable automatic failover and recovery in case one of the nodes fails.

1) During my testing, the Pgpool-II VIP moves to another node when the service is stopped, but there is a lag of about 10 seconds before the VIP is created on the other node. Is there any setting I can tweak to reduce this time?

2) For automatic failover and recovery of PostgreSQL, I am not able to make Pgpool-II's auto-failover method work successfully; the commands do not behave as expected. For example, if I shut down the primary PostgreSQL node, one of the standbys gets promoted to primary, but the cluster breaks down: I have to manually delete the data directory on the original primary and resync it from the new primary before it can start as a standby. I am not able to rejoin the old primary to the cluster automatically as a standby node.

3) Also, if I need to promote one of the standbys without stopping the PostgreSQL service and demote the current primary to a standby, i.e. an automatic switchover — is this feature supported by Pgpool-II? I could not find any reference to it in the documentation. If it is not supported, can you recommend an open-source tool (such as pg_auto_failover, Patroni, or repmgr) that would help us set up automatic failover and switchover in addition to Pgpool-II (which is important in our setup to separate the read/write calls)? Any steps or documents to achieve this would be very helpful.

Tags: No tags attached.
Note 0004474 (pengbo, 2024-01-22 23:49):
> 1) as of now, during my testing, the pgpool VIP moves to another node when the service is stopped. However there is a lag of 10 secs for this movement and VIP creation in another node. is there any other setting that i can tweak to reduce this time?

You can try decreasing the following parameters:

- wd_interval
- wd_heartbeat_deadtime

https://www.pgpool.net/docs/43/en/html/runtime-watchdog-config.html#CONFIG-WATCHDOG-LIFECHECK-HEARTBEAT

> 2) for the auto failover and recovery of postgresql, i am not able to make it work successfully on pgpool autofail method. the commands are not working as expected. eg., if i shutdown the primary node of postgresql, one of the standby gets promoted as primary however the cluster breaks down. i have to manually delete the data folder on original primary and do sync from new primary to start that as standby. i am not able to join the old primary as a standby node automatically to the cluster.

Sorry, Pgpool-II does not have a feature to automatically recover the old primary as a standby node. You need to restore it as a standby manually and then attach it to the cluster.

> 3) also, if i need to promote one of the standby without stopping the postgresql service and demote the current primary to standby - ie, auto switchover. is this feature supported in pgpool?

Yes. You can run "pcp_promote_node --switchover". Please note that to use this feature, "follow_primary_command" must be configured. For more information, please see the following documentation:

https://www.pgpool.net/docs/43/en/html/pcp-promote-node.html
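To make the suggested tuning concrete, a pgpool.conf fragment might look like the following. The values are illustrative assumptions for a low-latency LAN, not recommended defaults; shorter intervals detect failure faster but increase the risk of false positives under load:

```
# pgpool.conf -- watchdog lifecheck tuning (illustrative values, not defaults)
wd_interval = 3               # lifecheck interval in seconds (default: 10)
wd_heartbeat_deadtime = 10    # seconds without a heartbeat before a watchdog
                              # node is considered dead (default: 30)
```

For the switchover in item 3, once follow_primary_command is configured, the call would be along the lines of `pcp_promote_node --switchover -h <pcp_host> -p <pcp_port> -U <pcp_user> -n <node_id>` (host, port, user, and node id are placeholders for your PCP setup).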
Note 0004476 (Srini Balakrishnan, 2024-01-30 14:13):
Hi, thanks for the note and details. I tested the following:

1) I reduced wd_interval to 1 second and wd_heartbeat_deadtime to 2 seconds, down from the defaults of 10 and 30 seconds. I see no significant difference in VIP movement between nodes; it still takes around 10-15 seconds.

2) I have set up Patroni in my POC environment, with Patroni managing the PostgreSQL cluster for the failover/switchover features and Pgpool-II used only for load balancing and connection pooling. The integration works well, except that I see some glitches when I test failover/switchover on PostgreSQL via Patroni. To avoid conflicts with Pgpool-II handling these events, I disabled its failover steps and slightly modified the follow_primary script to recycle the Pgpool-II services so that new connections are established with the new primary and standby nodes. Sometimes the node status in Pgpool-II stays down and I have to run pcp_attach_node a few times to make it work.

Since I am completely stopping and restarting the pgpool2 service, why does the pgpool2 node status not synchronize automatically? I have also included -D and -n in the OPTS of my pgpool2 service so that the status file is ignored on restart, but it keeps the old reference; the status synchronization is not seamless, especially whenever the primary/standby roles change in PostgreSQL. Any recommendations or inputs on handling this scenario?

I am using version 4.3.7. Why does Pgpool-II, when the service starts, not fully reinitialize the status field and take the current state of PostgreSQL as-is, instead of triggering follow_primary again? Despite the -D flag being set at startup to ignore the status file, it still thinks the PostgreSQL nodes have changed and triggers the follow_primary script. This disturbs the status flag, and in a few scenarios I noticed it goes from UP to down.
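For reference, the manual re-attach described above would be something like the following; host, port, user, and node id are placeholders for the local PCP setup:

```
# Check what Pgpool-II currently believes about backend 0, then re-attach it.
pcp_node_info -h localhost -p 9898 -U pgpool -n 0
pcp_attach_node -h localhost -p 9898 -U pgpool -n 0
```

As far as I understand, `-D` only discards the pgpool_status file at startup, after which Pgpool-II probes the backends afresh; if the probed roles differ from its expectations, the follow_primary logic can still fire, which would match the behaviour described above.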
Note 0004477 (Srini Balakrishnan, 2024-01-30 17:44):
Sometimes when I stop the pgpool2 service it does not stop immediately, and I can see the processes below. Any reason why, and how can this be prevented?

```
postgres@tst2jdc17:~$ ps -ef | grep pgpool
postgres 1483773       1  0 08:31 ?  00:00:00 /usr/sbin/pgpool -n
postgres 1483774 1483773  0 08:31 ?  00:00:00 pgpool: PgpoolLogger
postgres 1483778 1483773  0 08:31 ?  00:00:00 [pgpool] <defunct>
postgres 1483807 1483773  0 08:31 ?  00:00:00 [pgpool] <defunct>
postgres 1483848 1483773  0 08:31 ?  00:00:00 [pgpool] <defunct>
postgres 1483849 1483773  0 08:31 ?  00:00:00 [pgpool] <defunct>
postgres 1483850 1483773  0 08:31 ?  00:00:00 [pgpool] <defunct>
postgres 1485533 1483773  0 08:36 ?  00:00:00 [pgpool] <defunct>
postgres 1485534 1483773  0 08:36 ?  00:00:00 [pgpool] <defunct>
postgres 1485535 1483773  0 08:36 ?  00:00:00 [pgpool] <defunct>
postgres 1485536 1483773  0 08:36 ?  00:00:00 [pgpool] <defunct>
postgres 1485537 1483773  0 08:36 ?  00:00:00 [pgpool] <defunct>
postgres 1485538 1483773  0 08:36 ?  00:00:00 [pgpool] <defunct>
postgres 1485539 1483773  0 08:36 ?  00:00:00 [pgpool] <defunct>
postgres 1485540 1483773  0 08:36 ?  00:00:00 [pgpool] <defunct>
postgres 1485541 1483773  0 08:36 ?  00:00:00 [pgpool] <defunct>
postgres 1485542 1483773  0 08:36 ?  00:00:00 [pgpool] <defunct>
postgres 1485543 1483773  0 08:36 ?  00:00:00 [pgpool] <defunct>
postgres 1485544 1483773  0 08:36 ?  00:00:00 [pgpool] <defunct>
```
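Background on the listing above: `<defunct>` entries are zombie processes, i.e. children that have already exited but whose exit status the parent pgpool process has not yet collected with wait(). They consume no memory, only a process-table slot, and disappear once the parent reaps them or exits itself. The mechanism can be reproduced outside pgpool with a short sketch (Python here, purely illustrative):

```python
import os
import subprocess
import time

# Fork a child that exits immediately; the parent deliberately delays
# reaping it, so the child lingers as a zombie (<defunct> in ps output).
pid = os.fork()
if pid == 0:
    os._exit(0)  # child: exit at once

time.sleep(0.2)  # parent: let the child exit, but do not wait() yet
state = subprocess.run(
    ["ps", "-o", "stat=", "-p", str(pid)],
    capture_output=True, text=True,
).stdout.strip()
print(state)  # process state is 'Z' (zombie) while the child is unreaped

os.waitpid(pid, 0)  # reaping removes the <defunct> entry
```

If the parent pgpool process is still alive and accumulating such entries, it usually means it has not yet reached the point in its shutdown sequence where it reaps its workers; once the parent terminates, init adopts and reaps them.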
Note 0004481 (Srini Balakrishnan, 2024-02-06 12:09):
Hi, is there any alternative for a 3-node pgpool2 cluster to have a common IP without using the delegate IP feature? Will it work if I use HAProxy to provide a common IP and let clients reach all three pgpool2 nodes via the proxy? With that setup, will Pgpool-II load balancing still work properly?
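For discussion purposes, fronting the three Pgpool-II nodes with a TCP load balancer instead of the watchdog's delegate IP might look like the haproxy.cfg fragment below. Hostnames and ports are placeholder assumptions; this is a sketch of the idea, not a vetted production configuration:

```
# haproxy.cfg fragment -- forward client connections to the pgpool nodes.
listen pgpool
    bind *:9999
    mode tcp
    balance roundrobin
    option tcp-check
    server pgpool1 node1:9999 check
    server pgpool2 node2:9999 check
    server pgpool3 node3:9999 check
```

Each Pgpool-II instance would still perform its own read/write splitting behind the proxy; whether session-level state (e.g. cached connections) behaves acceptably when clients are spread across all three instances is something to verify in testing.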
| Date Modified | Username | Field | Change |
|---|---|---|---|
| 2024-01-17 15:30 | Srini Balakrishnan | New Issue | |
| 2024-01-22 22:33 | pengbo | Assigned To | => pengbo |
| 2024-01-22 22:33 | pengbo | Status | new => assigned |
| 2024-01-22 23:49 | pengbo | Note Added: 0004474 | |
| 2024-01-22 23:49 | pengbo | Status | assigned => feedback |
| 2024-01-30 14:13 | Srini Balakrishnan | Note Added: 0004476 | |
| 2024-01-30 14:13 | Srini Balakrishnan | Status | feedback => assigned |
| 2024-01-30 17:44 | Srini Balakrishnan | Note Added: 0004477 | |
| 2024-02-06 12:09 | Srini Balakrishnan | Note Added: 0004481 |