5.9. Failover and Failback

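Failover means automatically detaching a PostgreSQL backend node that Pgpool-II can no longer access. It happens regardless of the configuration parameter settings and is therefore called the automatic failover process. Pgpool-II can confirm that a PostgreSQL backend node is inaccessible through the backend health check (see Section 5.8), through read/write errors on a backend connection (see fail_over_on_backend_error), and by detecting an administrative shutdown of the PostgreSQL backend server.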

If failover_command is configured and a failover happens, failover_command is executed. The command must be provided by the user. Its main role is typically to choose a new primary server from the existing standby servers and promote it. Another example is notifying the administrator by mail that a failover has happened.

Although a failover normally happens when a failure occurs, it can also be triggered by hand. This is called a switch over. For instance, you could switch over a PostgreSQL node in order to take a backup of it. Note that a switch over only sets the node's status to down in Pgpool-II; it never brings PostgreSQL itself down. A switch over can be triggered with the pcp_detach_node command.

A PostgreSQL node detached by failover or switch over never returns to the previous (attached) state automatically. Restarting Pgpool-II with the -D option or running pcp_attach_node returns it to the attached state.
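
The switch over and re-attach cycle described above could be scripted roughly as follows. This is a minimal sketch, not an official procedure: the helper name pcp_command, the PCP host localhost, port 9898 and user pgpool are assumptions made for illustration, and the -h, -p, -U, -w and -n options should be checked against the PCP command reference for your Pgpool-II version. The sketch only builds and prints the argument lists; it does not execute them.

```python
import shlex


def pcp_command(tool, node_id, host="localhost", port=9898, user="pgpool"):
    """Build the argument list for a PCP command acting on one backend node.

    host, port and user are illustrative defaults, not values from the manual.
    """
    if tool not in ("pcp_detach_node", "pcp_attach_node"):
        raise ValueError("unsupported PCP tool: " + tool)
    return [tool, "-h", host, "-p", str(port), "-U", user, "-w", "-n", str(node_id)]


# Switch over: mark node 1 as down in Pgpool-II (PostgreSQL itself keeps running).
detach = pcp_command("pcp_detach_node", 1)
# Later, return node 1 to the attached state.
attach = pcp_command("pcp_attach_node", 1)

print(shlex.join(detach))
print(shlex.join(attach))
```

To actually run a command, the list could be passed to subprocess.run(detach, check=True).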

5.9.1. Failover and Failback Settings

failover_command (string)

Specifies a user command to run when a PostgreSQL backend node gets detached. Pgpool-II replaces the following special characters with the backend specific information.

Table 5-6. failover command options

Special character   Description
%d                  DB node ID of the detached node
%h                  Hostname of the detached node
%p                  Port number of the detached node
%D                  Database cluster directory of the detached node
%m                  New master node ID
%H                  Hostname of the new master node
%M                  Old master node ID
%P                  Old primary node ID
%r                  Port number of the new master node
%R                  Database cluster directory of the new master node
%%                  '%' character
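
To make the substitution concrete, the following sketch expands such special characters in a command template. It is an illustration only and not Pgpool-II's own implementation: the function name expand_command, the script path and the host names are hypothetical, and leaving unknown placeholders untouched is an assumption of this sketch.

```python
def expand_command(template, values):
    """Expand %-placeholders in a command template.

    `values` maps a placeholder letter (for example "d") to its replacement.
    "%%" yields a literal "%". Unknown placeholders are left untouched,
    which is a choice made for this sketch.
    """
    out = []
    i = 0
    while i < len(template):
        ch = template[i]
        if ch == "%" and i + 1 < len(template):
            key = template[i + 1]
            if key == "%":
                out.append("%")
            elif key in values:
                out.append(str(values[key]))
            else:
                out.append("%" + key)
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


# Hypothetical template and values: node 1 (the old primary) was detached.
example = expand_command(
    "/path/to/failover_script %d %h %P %H",
    {"d": 1, "h": "db1.example.com", "P": 1, "H": "db0.example.com"},
)
print(example)
```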

Note: The "master node" refers to the node with the youngest (smallest) node ID among the live database nodes. In streaming replication mode, this may be different from the primary node. In Table 5-6, %m is the new master node chosen by Pgpool-II, that is, the live node with the youngest (smallest) node ID. For example, suppose you have three nodes (node 0, 1 and 2), node 1 is the primary, and all of them are healthy (no node is down). If node 1 fails, failover_command is called with %m = 0. If all standby nodes are down and a failover of the primary node happens, failover_command is called with %m = -1 and %H, %R, %r = "".

Note: When a failover is performed, Pgpool-II basically kills all its child processes, which in turn terminates all active sessions to Pgpool-II. Pgpool-II then invokes failover_command and, after the command completes, starts new child processes so that it is ready to accept client connections again.

However, from Pgpool-II 3.6, in streaming replication mode, client sessions are no longer disconnected when a failover occurs if the session does not use the failed standby server. Note that if a query is sent while the failover is in progress, the session is disconnected. If the primary server goes down, all sessions are still disconnected. A health check timeout also causes all sessions to be disconnected. Other health check errors, including the case where the retries are exhausted, do not trigger a full session disconnection.
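
The rule for choosing the new master node can be expressed in a few lines. This is a sketch of the rule as stated in the note, not Pgpool-II source code; the function name new_master_node_id and the dictionary representation of node status are assumptions.

```python
def new_master_node_id(node_is_up):
    """Return the smallest node ID that is still alive, or -1 if none is.

    `node_is_up` maps a node ID to True (alive) or False (down).
    """
    live = [node_id for node_id, up in sorted(node_is_up.items()) if up]
    return live[0] if live else -1


# Three nodes; node 1 (the primary) has just failed.
print(new_master_node_id({0: True, 1: False, 2: True}))     # 0
# All standby nodes are down and the primary fails as well.
print(new_master_node_id({0: False, 1: False, 2: False}))   # -1
```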

Note: You can run psql (or whatever command) against backend to retrieve some information in the script, but you cannot run psql against Pgpool-II itself, since the script is called from Pgpool-II and it needs to run while Pgpool-II is working on failover.

This parameter can be changed by reloading the Pgpool-II configurations.
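
As an illustration of the promotion role of failover_command, the following sketch decides whether a promotion is needed from the values of %d, %P, %H and %R. Everything outside those placeholders is an assumption: the script would be configured with something like failover_command = '/path/to/failover.py %d %P %H %R' (a hypothetical path), and promoting over ssh as the postgres user with pg_ctl promote is only one possible approach. The sketch prints the command instead of running it.

```python
import sys


def plan_failover(detached_id, old_primary_id, new_master_host, new_master_pgdata):
    """Return the promotion command to run, or None if nothing needs promoting."""
    if detached_id != old_primary_id:
        # A standby was detached; the primary is still in place.
        return None
    if not new_master_host:
        # %H is empty when no live node remains, so there is nothing to promote.
        return None
    # Assumption: promotion is done over ssh with pg_ctl; adapt to your setup.
    return ["ssh", "postgres@" + new_master_host,
            "pg_ctl", "-D", new_master_pgdata, "promote"]


def main(argv):
    """argv holds the expanded values of %d, %P, %H and %R, in that order."""
    command = plan_failover(int(argv[0]), int(argv[1]), argv[2], argv[3])
    if command is None:
        print("no promotion needed")
    else:
        # Dry run: print the command rather than executing it.
        print("would run: " + " ".join(command))
    return 0


# Only act when all four placeholder values were passed on the command line.
if __name__ == "__main__" and len(sys.argv) == 5:
    main(sys.argv[1:])
```

Keeping the decision in a pure function such as plan_failover makes the script easy to test without touching any server.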

failback_command (string)

Specifies a user command to run when a PostgreSQL backend node gets attached to Pgpool-II. Pgpool-II replaces the following special characters with the backend specific information before executing the command.

Table 5-7. failback command options

Special character   Description
%d                  DB node ID of the attached node
%h                  Hostname of the attached node
%p                  Port number of the attached node
%D                  Database cluster directory of the attached node
%m                  New master node ID
%H                  Hostname of the new master node
%M                  Old master node ID
%P                  Old primary node ID
%r                  Port number of the new master node
%R                  Database cluster directory of the new master node
%%                  '%' character

Note: You can run psql (or whatever command) against backend to retrieve some information in the script, but you cannot run psql against Pgpool-II itself, since the script is called from Pgpool-II and it needs to run while Pgpool-II is working on failover.

This parameter can be changed by reloading the Pgpool-II configurations.

follow_master_command (string)

Specifies a user command to run after a failover of the primary node. This works only in Master Slave mode with streaming replication. Pgpool-II replaces the following special characters with the backend specific information before executing the command.

Table 5-8. follow master command options

Special character   Description
%d                  DB node ID of the detached node
%h                  Hostname of the detached node
%p                  Port number of the detached node
%D                  Database cluster directory of the detached node
%M                  Old master node ID
%m                  New primary node ID
%H                  Hostname of the new primary node
%P                  Old primary node ID
%r                  Port number of the new primary node
%R                  Database cluster directory of the new primary node
%%                  '%' character

Note: If follow_master_command is not empty, then after a failover of the primary node completes in Master Slave mode with streaming replication, Pgpool-II degenerates all nodes except the new primary and starts new child processes so that it is ready to accept client connections again. After this, Pgpool-II executes the command configured in follow_master_command for each degenerated backend node.

Typically, follow_master_command is used to recover the slave from the new primary by calling the pcp_recovery_node command.

This parameter can be changed by reloading the Pgpool-II configurations.
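
A follow_master_command script that recovers each degenerated node could be sketched as follows. The function name plan_follow_master, the PCP host, port 9898 and user pgpool are assumptions for illustration, as is the configuration line follow_master_command = '/path/to/follow_master.py %d %m' (a hypothetical path). The pcp_recovery_node options shown should be verified against the PCP reference for your version. The sketch only builds the command.

```python
def plan_follow_master(node_id, new_primary_id,
                       pcp_host="localhost", pcp_port=9898, pcp_user="pgpool"):
    """Return the pcp_recovery_node command for one degenerated node.

    node_id is the expanded %d and new_primary_id the expanded %m.
    Returns None for the new primary itself, which must not be recovered.
    """
    if node_id == new_primary_id:
        # Defensive check: the new primary is not a degenerated node.
        return None
    return ["pcp_recovery_node", "-h", pcp_host, "-p", str(pcp_port),
            "-U", pcp_user, "-w", "-n", str(node_id)]


# Node 0 became the new primary; nodes 1 and 2 were degenerated.
for degenerated in (1, 2):
    print(" ".join(plan_follow_master(degenerated, new_primary_id=0)))
```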

fail_over_on_backend_error (boolean)

When set to on, Pgpool-II treats read/write errors on a PostgreSQL backend connection as a failure of the backend node and triggers a failover of that node after disconnecting the current session. When set to off, Pgpool-II only reports an error and disconnects the session when such errors occur.

Note: It is recommended to turn on backend health checking (see Section 5.8) when fail_over_on_backend_error is set to off. Note, however, that Pgpool-II still triggers a failover when it detects an administrative shutdown of the PostgreSQL backend server. If you want to avoid a failover even in this case, you need to specify DISALLOW_TO_FAILOVER in backend_flag.

This parameter can be changed by reloading the Pgpool-II configurations.
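
The interaction between fail_over_on_backend_error, an administrative shutdown and DISALLOW_TO_FAILOVER can be summarized as a small decision function. This is a reading of the description above, not Pgpool-II's actual code; the function name and argument names are hypothetical.

```python
def triggers_failover(fail_over_on_backend_error, admin_shutdown=False,
                      disallow_to_failover=False):
    """Return True if a backend connection error leads to a failover.

    fail_over_on_backend_error: value of the configuration parameter.
    admin_shutdown: Pgpool-II detected an administrative shutdown of PostgreSQL.
    disallow_to_failover: DISALLOW_TO_FAILOVER is set for the node.
    """
    if disallow_to_failover:
        return False          # failover is suppressed for this node
    if admin_shutdown:
        return True           # failover happens even when the parameter is off
    return fail_over_on_backend_error


print(triggers_failover(True))                                            # True
print(triggers_failover(False))                                           # False
print(triggers_failover(False, admin_shutdown=True))                      # True
print(triggers_failover(False, admin_shutdown=True,
                        disallow_to_failover=True))                       # False
```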

search_primary_node_timeout (integer)

Specifies the maximum amount of time in seconds to search for the primary node when a failover scenario occurs. Pgpool-II gives up looking for the primary node if it is not found within this time. The default is 300. Setting this parameter to 0 means keep trying forever.

This parameter is only applicable in the streaming replication mode.

This parameter can be changed by reloading the Pgpool-II configurations.
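
The timeout semantics, including the special meaning of 0, can be illustrated with a polling loop. This is a sketch under assumptions: the function name search_primary, the polling interval and the return value of -1 when the search is abandoned are illustrative choices and do not describe how Pgpool-II implements the search internally.

```python
import time


def search_primary(find_primary, timeout, interval=1.0,
                   clock=time.monotonic, sleep=time.sleep):
    """Poll find_primary() until it returns a node ID or the timeout expires.

    timeout is in seconds; 0 means keep trying forever.
    Returns the node ID found, or -1 if the search is given up.
    """
    deadline = None if timeout == 0 else clock() + timeout
    while True:
        node_id = find_primary()
        if node_id is not None:
            return node_id
        if deadline is not None and clock() >= deadline:
            return -1
        sleep(interval)


# Simulated probe: the primary (node 1) is found on the second attempt.
attempts = iter([None, 1])
print(search_primary(lambda: next(attempts), timeout=300, interval=0.01))
```

The clock and sleep parameters exist only so that the loop can be exercised without real waiting.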

5.9.2. Failover in the raw Mode

Failover can be performed in raw mode if multiple backend servers are defined. During normal operation, Pgpool-II usually accesses the backend specified by backend_hostname0. If that backend fails for some reason, Pgpool-II tries to access the backend specified by backend_hostname1. If that fails too, Pgpool-II tries backend_hostname2, backend_hostname3 and so on.
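
The ordered fallback described above amounts to picking the first reachable backend. The following sketch shows the idea; the function name first_available_backend, the host names and the reachability check are hypothetical and stand in for whatever connection test is actually used.

```python
def first_available_backend(backends, is_reachable):
    """Return (index, host) of the first reachable backend, or None if all fail.

    `backends` lists hosts in the order backend_hostname0, backend_hostname1, ...
    """
    for index, host in enumerate(backends):
        if is_reachable(host):
            return index, host
    return None


backends = ["db0.example.com", "db1.example.com", "db2.example.com"]
down = {"db0.example.com"}  # pretend backend 0 has failed
print(first_available_backend(backends, lambda host: host not in down))
```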