5.15. Watchdog

Watchdog configuration parameters are set in pgpool.conf. A sample configuration is provided in the WATCHDOG section of the pgpool.conf.sample file. All of the following options must be specified for the watchdog process.

5.15.1. Enable watchdog

use_watchdog (boolean)

If on, activates the watchdog. Default is off.

This parameter can only be set at server start.

In Pgpool-II 4.1 or earlier, each pgpool node had to specify its own node information and the destination pgpool nodes' information, so the settings differed per pgpool node. Since Pgpool-II 4.2, all configuration parameters are identical on all hosts. If the watchdog feature is enabled, a pgpool_node_id file is required to distinguish which host is which. Create a pgpool_node_id file and specify the pgpool (watchdog) node number (e.g. 0, 1, 2 ...) to identify the pgpool (watchdog) host.

Example 5-10. pgpool_node_id configuration

If you have 3 pgpool nodes with hostnames server1, server2 and server3, create the pgpool_node_id file on each host as follows. When Pgpool-II is installed using RPM, pgpool.conf is installed under /etc/pgpool-II/.

  • server1

    [server1]# cat /etc/pgpool-II/pgpool_node_id
    0
            
  • server2

    [server2]# cat /etc/pgpool-II/pgpool_node_id
    1
            
  • server3

    [server3]# cat /etc/pgpool-II/pgpool_node_id
    2
            

5.15.2. Watchdog communication

hostnameX (string)

Specifies the hostname or IP address of the Pgpool-II server. This is used for sending/receiving queries and packets, and also as an identifier of the watchdog node. The number at the end of the parameter name is referred to as the "pgpool node id", and it starts from 0 (e.g. hostname0).

This parameter can only be set at server start.

wd_portX (integer)

Specifies the port number on which the watchdog process listens for connections. Default is 9000. The number at the end of the parameter name is referred to as the "pgpool node id", and it starts from 0 (e.g. wd_port0).

This parameter can only be set at server start.

pgpool_portX (integer)

Specifies the Pgpool-II port number. Default is 9999. The number at the end of the parameter name is referred to as the "pgpool node id", and it starts from 0 (e.g. pgpool_port0).

This parameter can only be set at server start.

Example 5-11. Watchdog configuration

If you have 3 pgpool nodes with hostnames server1, server2 and server3, you can configure hostname, wd_port and pgpool_port as follows:

hostname0 = 'server1'
wd_port0 = 9000
pgpool_port0 = 9999

hostname1 = 'server2'
wd_port1 = 9000
pgpool_port1 = 9999

hostname2 = 'server3'
wd_port2 = 9000
pgpool_port2 = 9999
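
As a rough illustration of how the numbered suffixes and the pgpool_node_id file fit together, the following Python sketch picks the local and remote node entries out of one shared set of settings. It is not part of Pgpool-II: the function name split_nodes, the dictionary layout, and the rule of reading entries until the first missing hostnameX are all assumptions made for this example.

```python
# Hypothetical helper (not Pgpool-II code): split identical per-node settings
# into "local" and "remote" entries using the number stored in pgpool_node_id.

def split_nodes(settings, node_id_text):
    """Return (local, remotes); each entry is (hostname, wd_port, pgpool_port)."""
    node_id = int(node_id_text.strip())
    nodes = {}
    i = 0
    # Assumption: node ids start from 0 and are contiguous, so we stop at the
    # first missing hostnameX.
    while f"hostname{i}" in settings:
        nodes[i] = (
            settings[f"hostname{i}"],
            settings.get(f"wd_port{i}", 9000),      # documented default
            settings.get(f"pgpool_port{i}", 9999),  # documented default
        )
        i += 1
    if node_id not in nodes:
        raise ValueError(f"pgpool_node_id {node_id} has no hostname{node_id} entry")
    local = nodes[node_id]
    remotes = [nodes[k] for k in sorted(nodes) if k != node_id]
    return local, remotes


settings = {
    "hostname0": "server1", "wd_port0": 9000, "pgpool_port0": 9999,
    "hostname1": "server2", "wd_port1": 9000, "pgpool_port1": 9999,
    "hostname2": "server3", "wd_port2": 9000, "pgpool_port2": 9999,
}
# Contents of pgpool_node_id on server2, as in Example 5-10.
local, remotes = split_nodes(settings, "1\n")
print(local)    # ('server2', 9000, 9999)
print(remotes)  # [('server1', 9000, 9999), ('server3', 9000, 9999)]
```

The same settings dictionary is used on every host; only the node id text differs, which mirrors the Pgpool-II 4.2 and later layout described in Section 5.15.1.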
    

wd_authkey (string)

Specifies the authentication key used for all watchdog communications. All Pgpool-II nodes must have the same key. Packets from a watchdog with a different key are rejected. This authentication is also applied to the heartbeat signals when heartbeat mode is used as the lifecheck method.

Since Pgpool-II V3.5, wd_authkey is also used to authenticate the watchdog IPC clients, so all clients communicating with the Pgpool-II watchdog process need to provide this wd_authkey value for the "IPCAuthKey" key in the JSON data of the command.

Default is '' (empty), which disables the watchdog authentication.

This parameter can only be set at server start.
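
For IPC clients, the requirement above amounts to adding one field to the JSON data of a command. The Python sketch below shows only that step. The helper name with_auth_key and the "ExampleField" entry are hypothetical; the only name taken from this section is the "IPCAuthKey" key, and omitting the key when wd_authkey is empty is an assumption based on the statement that an empty key disables authentication.

```python
import json

def with_auth_key(command_data, wd_authkey):
    """Return the JSON text of command_data with the IPCAuthKey field added."""
    payload = dict(command_data)  # copy so the caller's dict is not modified
    if wd_authkey:
        # Assumption: an empty wd_authkey means authentication is disabled,
        # so the field is only added for a non-empty key.
        payload["IPCAuthKey"] = wd_authkey
    return json.dumps(payload)

# "ExampleField" is a placeholder, not a real watchdog IPC field.
print(with_auth_key({"ExampleField": "value"}, "secret"))
```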

5.15.3. Upstream server connection

trusted_servers (string)

Specifies the list of trusted servers used to check the upstream connections. Each server in the list is required to respond to ping. Specify a comma-separated list of servers such as "hostA,hostB,hostC". If none of the servers is reachable, watchdog regards it as a failure of the Pgpool-II. Therefore, it is recommended to specify multiple servers. Please note that you should not assign PostgreSQL servers to this parameter.

This parameter can only be set at server start.

trusted_server_command (string)

Specifies a user command to run when Pgpool-II checks whether the trusted servers respond to ping. Any %h in the string is replaced by each host name specified in trusted_servers. Default is ping -q -c3 %h.

This parameter can only be set at server start.
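
The interaction between trusted_servers and trusted_server_command can be sketched in Python as follows. This is an illustrative model, not Pgpool-II's implementation: the function names are made up for the example, and the reachability results are passed in as plain booleans rather than obtained by running the commands.

```python
def build_ping_commands(trusted_servers, trusted_server_command="ping -q -c3 %h"):
    """Expand %h once per host in the comma-separated trusted_servers list."""
    hosts = [h.strip() for h in trusted_servers.split(",") if h.strip()]
    return [trusted_server_command.replace("%h", h) for h in hosts]

def upstream_ok(results):
    """results: one boolean per trusted server (True = it responded).

    The upstream connection is treated as failed only when none of the
    servers responded. Intended for a non-empty trusted_servers list.
    """
    return any(results)

print(build_ping_commands("hostA,hostB,hostC"))
# ['ping -q -c3 hostA', 'ping -q -c3 hostB', 'ping -q -c3 hostC']
print(upstream_ok([False, True, False]))  # True
print(upstream_ok([False, False, False]))  # False
```

A single reachable server is enough to pass the check, which is why listing several servers reduces the chance of a false failure.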

5.15.4. Virtual IP control

delegate_ip (string)

Specifies the virtual IP address (VIP) of Pgpool-II to which client servers (application servers etc.) connect. When a Pgpool-II is switched from standby to active, it takes over this VIP. The VIP will not be brought up if the quorum does not exist. Default is '' (empty), which means the virtual IP will never be brought up.

This parameter can only be set at server start.

if_cmd_path (string)

Specifies the path to the command that Pgpool-II will use to switch the virtual IP on the system. Set only the path of the directory containing the binary, such as "/sbin". If if_up_cmd or if_down_cmd starts with "/", this parameter is ignored.

This parameter can only be set at server start.

if_up_cmd (string)

Specifies the command to bring up the virtual IP. Set the command and parameters such as "ip addr add $_IP_$/24 dev eth0 label eth0:0". Since root privilege is required to execute this command, either set the setuid bit on the ip command or allow the Pgpool-II startup user (postgres user by default) to run sudo without a password, and specify the command such as "/usr/bin/sudo /sbin/ip addr add $_IP_$/24 dev eth0 label eth0:0". $_IP_$ is replaced by the IP address specified in delegate_ip.

This parameter can only be set at server start.

if_down_cmd (string)

Specifies the command to bring down the virtual IP. Set the command and parameters such as "ip addr del $_IP_$/24 dev eth0". Since root privilege is required to execute this command, either set the setuid bit on the ip command or allow the Pgpool-II startup user (postgres user by default) to run sudo without a password, and specify the command such as "/usr/bin/sudo /sbin/ip addr del $_IP_$/24 dev eth0". $_IP_$ is replaced by the IP address specified in delegate_ip.

This parameter can only be set at server start.
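
The way if_cmd_path, if_up_cmd, if_down_cmd and delegate_ip combine can be modeled with a few lines of Python. This is a sketch under stated assumptions, not Pgpool-II's code: resolve_vip_command is a hypothetical name, and the address 192.168.1.100 is an arbitrary example value for delegate_ip.

```python
def resolve_vip_command(cmd, cmd_path, delegate_ip):
    """Expand $_IP_$ and prepend cmd_path unless cmd already starts with '/'."""
    expanded = cmd.replace("$_IP_$", delegate_ip)
    if expanded.startswith("/"):
        return expanded  # absolute command: the path parameter is ignored
    return cmd_path.rstrip("/") + "/" + expanded

delegate_ip = "192.168.1.100"  # example value only

print(resolve_vip_command(
    "ip addr add $_IP_$/24 dev eth0 label eth0:0", "/sbin", delegate_ip))
# /sbin/ip addr add 192.168.1.100/24 dev eth0 label eth0:0

print(resolve_vip_command(
    "/usr/bin/sudo /sbin/ip addr del $_IP_$/24 dev eth0", "/sbin", delegate_ip))
# /usr/bin/sudo /sbin/ip addr del 192.168.1.100/24 dev eth0
```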

arping_path (string)

Specifies the path to the command that Pgpool-II will use to send the ARP requests after the virtual IP switch. Set only the path of the directory containing the binary, such as "/usr/sbin". If arping_cmd starts with "/", this parameter is ignored.

This parameter can only be set at server start.

arping_cmd (string)

Specifies the command to use for sending the ARP requests after the virtual IP switch. Set the command and parameters such as "arping -U $_IP_$ -w 1 -I eth0". Since root privilege is required to execute this command, either set the setuid bit on the arping command or allow the Pgpool-II startup user (postgres user by default) to run sudo without a password, and specify the command such as "/usr/bin/sudo /usr/sbin/arping -U $_IP_$ -w 1 -I eth0". $_IP_$ is replaced by the IP address specified in delegate_ip.

This parameter can only be set at server start.

ping_path (string)

Specifies the path of the ping command used to check startup of the virtual IP. Set only the path of the directory containing the ping utility, such as "/bin".

This parameter can only be set at server start.

5.15.5. Behaviour on escalation and de-escalation

Configuration of the behavior when Pgpool-II escalates to active (virtual IP holder).

clear_memqcache_on_escalation (boolean)

When set to on, watchdog clears all the query cache in the shared memory when Pgpool-II escalates to active. This prevents the new active Pgpool-II from using old query caches that are inconsistent with the old active.

Default is on.

This works only if memqcache_method is 'shmem'.

This parameter can only be set at server start.

wd_escalation_command (string)

Watchdog executes this command on the node that is escalated to the leader watchdog.

This command is executed just before bringing up the virtual IP if that is configured on the node.

This parameter can only be set at server start.

wd_de_escalation_command (string)

Watchdog executes this command on the leader Pgpool-II watchdog node when that node resigns from the leader node responsibilities. A leader watchdog node can resign from being a leader node when the leader node Pgpool-II shuts down, detects a network blackout, or detects the loss of quorum.

This command is executed before bringing down the virtual/floating IP address if it is configured on the watchdog node.

wd_de_escalation_command is not available prior to Pgpool-II V3.5.

This parameter can only be set at server start.

5.15.6. Controlling the Failover behavior

These settings are used to control the behavior of backend node failover when the watchdog is enabled. The effect of these configurations is limited to the failover/degenerate requests initiated by Pgpool-II internally, while user-initiated detach backend requests (using PCP command) bypass these configuration settings.

failover_when_quorum_exists (boolean)

When enabled, Pgpool-II will consider quorum when it performs the degenerate/failover on backend node.

We can say that "quorum exists" if the number of live watchdog nodes (that is, the number of Pgpool-II nodes) is a majority of the total number of watchdog nodes. For example, suppose the number of watchdog nodes is 5. If the number of live nodes is greater than or equal to 3, then quorum exists. On the other hand, if the number of live nodes is 2 or lower, quorum does not exist since it can never be a majority.
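
The majority rule in the five node example can be written as a one-line check. The Python sketch below is only an illustration of the arithmetic with the default settings (enable_consensus_with_half_votes off); the function name is an assumption for this example.

```python
def quorum_exists(total_nodes, live_nodes):
    """True when the live nodes form a strict majority of all watchdog nodes."""
    return live_nodes * 2 > total_nodes

# Five configured watchdog nodes, as in the example above.
for live in range(6):
    print(live, quorum_exists(5, live))
```

With five nodes the check is false for 0, 1 and 2 live nodes and true for 3, 4 and 5.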

If the quorum exists, Pgpool-II can work better on failure detection because even if a watchdog node mistakenly detects a failure of a backend node, it will be denied by the majority of the other watchdog nodes. Pgpool-II works that way when failover_require_consensus is on (the default), but you can change it so that immediate failover happens when a failure is detected. A Pgpool-II node which mistakenly detects failure of a backend node will quarantine the backend node.

The existence of quorum can be shown by invoking pcp_watchdog_info command with --verbose option. If Quorum state is QUORUM EXIST or QUORUM IS ON THE EDGE, then the quorum exists. If Quorum state is QUORUM ABSENT, then the quorum does not exist.

In the absence of the quorum, Pgpool-II node that detects the backend failure will quarantine the failed backend node until the quorum exists again.

Although it is possible to force detaching the quarantined node by using the pcp_detach_node command, it is not possible to attach the node again by using the pcp_attach_node command.

Quarantined nodes behave similarly to detached backend nodes, but unlike failed/degenerated backends the quarantine status is not propagated to the other Pgpool-II nodes in the watchdog cluster. So even if the backend node is in the quarantine state on one Pgpool-II node, other Pgpool-II nodes may still continue to use that backend.

Although there are many similarities between quarantine and failover operations, they differ in a very fundamental way. The quarantine operation does not execute the failover_command and silently detaches the problematic node. So when the primary backend node is quarantined, Pgpool-II will not promote a standby to take over the primary node responsibilities, and as long as the primary node is quarantined, Pgpool-II will not have any usable primary backend node.

Moreover, unlike for failed nodes, Pgpool-II keeps the health check running on the quarantined nodes, and as soon as a quarantined node becomes reachable again it gets automatically re-attached. Note that this only applies to Pgpool-II V4.1 or greater. If you are using a previous version, you need to re-attach the quarantined node by using pcp_attach_node when the connectivity issue is solved.

From Pgpool-II V4.1 onward, if the watchdog-leader node fails to build the consensus for primary backend node failover and the primary backend node gets into a quarantine state, then it resigns from its leader/coordinator responsibilities, lowers its wd_priority for the next leader election, and lets the cluster elect a different leader.

Note: When the leader node fails to build the consensus for standby backend node failure, it takes no action and similarly quarantined standby backend nodes on watchdog-leader do not trigger a new leader election.

If this parameter is off, failover will be triggered even if quorum does not exist.

Default is on.

failover_when_quorum_exists is not available prior to Pgpool-II V3.7.

This parameter can only be set at server start.

Note: enabling failover_when_quorum_exists is not allowed in native replication mode.

failover_require_consensus (boolean)

When enabled, Pgpool-II will perform the degenerate/failover on a backend node if the watchdog quorum exists and at least the minimum number of nodes necessary for the quorum vote for the failover.

For example, in a three node watchdog cluster, the failover will only be performed when at least two nodes ask for the failover on the particular backend node.

If this parameter is off, failover will be triggered even if there's no consensus.

Default is on.

Caution

When failover_require_consensus is enabled, Pgpool-II does not execute the failover until it gets enough votes from other Pgpool-II nodes. So it is strongly recommended to enable the backend health check on all Pgpool-II nodes to ensure proper detection of backend node failures. For more details of health check, see Section 5.9.

Note: enabling failover_require_consensus is not allowed in native replication mode.

failover_require_consensus is not available prior to Pgpool-II V3.7, and it is only effective when failover_when_quorum_exists is enabled.

This parameter can only be set at server start.
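
Taken together, failover_when_quorum_exists and failover_require_consensus decide whether an internally detected backend failure leads to a failover or to a quarantine. The Python sketch below is a simplified reading of the rules in this section, not Pgpool-II's actual decision code; the function name and the two result strings are assumptions for the example.

```python
def backend_failure_action(quorum, consensus,
                           failover_when_quorum_exists=True,
                           failover_require_consensus=True):
    """Return "failover" or "quarantine" for an internally detected failure."""
    if not failover_when_quorum_exists:
        # Failover is triggered even if the quorum does not exist.
        # failover_require_consensus has no effect in this case.
        return "failover"
    if not quorum:
        # Without quorum the detecting node only quarantines the backend.
        return "quarantine"
    if failover_require_consensus and not consensus:
        # Quorum exists, but not enough nodes voted for the failover.
        return "quarantine"
    return "failover"

print(backend_failure_action(quorum=True, consensus=True))    # failover
print(backend_failure_action(quorum=True, consensus=False))   # quarantine
print(backend_failure_action(quorum=False, consensus=False))  # quarantine
print(backend_failure_action(quorum=False, consensus=False,
                             failover_when_quorum_exists=False))  # failover
```

User-initiated detach requests through PCP commands bypass these settings, so they are outside the scope of this model.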

allow_multiple_failover_requests_from_node (boolean)

This parameter works in connection with the failover_require_consensus. When enabled, a single Pgpool-II node can cast multiple votes for the failover.

For example, in a three node watchdog cluster, if one Pgpool-II node sends two failover requests for a particular backend node failover, both requests will be counted as separate votes in favor of the failover, and Pgpool-II will execute the failover even if it does not get a vote from any other Pgpool-II node.

For example, if an error found in a health check round does not get enough votes and the error still persists, the next round of health check will give one more vote. This parameter is useful if you want to detect a persistent error which might not be found by other watchdog nodes.

Default is off.

allow_multiple_failover_requests_from_node is not available prior to Pgpool-II V3.7, and it is only effective when both failover_when_quorum_exists and failover_require_consensus are enabled.

This parameter can only be set at server start.
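
The effect of allow_multiple_failover_requests_from_node on vote counting can be illustrated with a small Python model. It assumes the default enable_consensus_with_half_votes = off, represents each failover request simply by the name of the requesting node, and uses hypothetical node names; it is not how Pgpool-II stores or counts requests internally.

```python
def votes_for_failover(requests, allow_multiple=False):
    """requests: list of node names, one entry per failover request received."""
    if allow_multiple:
        return len(requests)       # every request counts as a separate vote
    return len(set(requests))      # at most one vote per node

def consensus_reached(requests, total_nodes, allow_multiple=False):
    needed = total_nodes // 2 + 1  # one more vote than half of the total
    return votes_for_failover(requests, allow_multiple) >= needed

# Three node cluster: the same node reports the failure in two rounds.
requests = ["pgpool0", "pgpool0"]
print(consensus_reached(requests, 3, allow_multiple=False))  # False
print(consensus_reached(requests, 3, allow_multiple=True))   # True
```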

enable_consensus_with_half_votes (boolean)

This parameter configures how the majority rule computation is made by Pgpool-II for calculating the quorum and resolving the consensus for failover.

Note: This parameter affects not only the failover behavior of the backend but the quorum and the failover behavior of Pgpool-II itself.

When enabled, the existence of quorum and consensus on failover requires only half of the total number of votes configured in the cluster. Otherwise, both of these decisions require at least one more vote than half of the total number of votes. For failover, this parameter works in conjunction with failover_require_consensus. In both cases, whether deciding on quorum existence or building the consensus on failover, this parameter only comes into play when the watchdog cluster is configured with an even number of Pgpool-II nodes. The majority rule decision in a watchdog cluster with an odd number of participants is not affected by the value of this configuration parameter.

For example, when this parameter is enabled in a two node watchdog cluster, one Pgpool-II node needs to be alive to make the quorum exist. If the parameter is off, two nodes need to be alive to make the quorum exist.

When this parameter is enabled in a four node watchdog cluster, two Pgpool-II nodes need to be alive to make the quorum exist. If the parameter is off, three nodes need to be alive to make the quorum exist.

By enabling this parameter, you should be aware that you take the risk of a split-brain. For example, in a four node cluster consisting of nodes A, B, C and D, it is possible that the cluster is split into two separate networks (A, B) and (C, D). For both (A, B) and (C, D) the quorum still exists, since each group has two live nodes out of 4. The two groups choose their own leader watchdog, which is a split-brain.
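
The two node and four node examples, and the split-brain scenario, follow from a simple threshold calculation. The Python sketch below models only that arithmetic; votes_needed is a name chosen for this illustration and does not correspond to a Pgpool-II function.

```python
def votes_needed(total_nodes, enable_consensus_with_half_votes=False):
    """Minimum number of live nodes (or votes) required by the majority rule."""
    if enable_consensus_with_half_votes and total_nodes % 2 == 0:
        return total_nodes // 2        # exactly half is enough
    return total_nodes // 2 + 1        # one more than half

for total in (2, 3, 4, 5):
    print(total, votes_needed(total, True), votes_needed(total, False))

# Four node cluster split into two partitions of two nodes each.
partitions = (2, 2)
print([size >= votes_needed(4, True) for size in partitions])   # [True, True]
print([size >= votes_needed(4, False) for size in partitions])  # [False, False]
```

With the parameter on, both halves of the partitioned cluster satisfy the quorum at the same time, which is the split-brain case; with it off, neither half does.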

Default is off.

enable_consensus_with_half_votes is not available prior to Pgpool-II V4.1. The prior versions work as if the parameter is on.

This parameter can only be set at server start.

5.15.7. Controlling the watchdog cluster membership

By default the watchdog cluster consists of all watchdog nodes that are defined in the pgpool.conf file, irrespective of the current state of each node. Whether a node is LOST, SHUTDOWN or never started, it is considered a member of the watchdog cluster definition as long as it is configured in the configuration file. All the majority rule computations for identifying the existence of a quorum and resolving the consensus are based on the number of watchdog nodes that make up the watchdog cluster.

Pgpool-II V4.3 enables dynamic cluster definition by introducing the concept of Member and Nonmember. If a node's membership gets revoked from the watchdog cluster, then the cluster re-calibrates itself dynamically to adjust all subsequent majority rule computations.

All majority rule computations are done based on the number of member watchdog nodes instead of the total number of configured nodes.

For example, in a five node cluster (pgpool.conf has five watchdog nodes defined) at least three nodes need to be alive to make the quorum. With the dynamic cluster membership mechanism, the cluster can re-adjust itself to count only the MEMBER nodes (a Member node doesn't necessarily need to be alive). That means effectively a single alive node can also fulfill the quorum requirements (depending on the membership criteria settings) if at some point in time the cluster only has one or two member nodes.

Caution

Using the dynamic cluster membership has an associated risk of causing a split-brain. So it is strongly recommended to carefully review if the setup requires the dynamic cluster membership and consider using conservative values for related settings.

These settings configure when a node is marked as Nonmember. Leaving all these settings at their default values retains the pre-V4.3 behaviour.

wd_remove_shutdown_nodes (boolean)

When enabled, the SHUTDOWN nodes are immediately marked as Nonmember and removed from the cluster. If the previously shutdown node starts again, it gets added to cluster automatically.

Default is off.

wd_lost_node_removal_timeout (integer)

Timeout in seconds after which a LOST watchdog node is marked as Nonmember and removed from the cluster. When the LOST node re-connects to the cluster, its cluster membership is restored.

Setting the timeout to 0 disables the removal of LOST nodes from cluster.

Default is 0.

wd_no_show_node_removal_timeout (integer)

Timeout in seconds to mark the node as Nonmember if it doesn't show up at cluster initialisation. Nonmember node becomes the cluster Member as soon as it starts up and connects to the cluster.

Setting the timeout to 0 disables the removal of NO-SHOW nodes from cluster.

Default is 0.
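
How the three membership settings change the quorum calculation can be sketched in Python. This is a simplified model rather than Pgpool-II's implementation: nodes are given as (state, seconds_in_state) tuples, the "ALIVE" label and the function names are assumptions, and treating a timeout as expired once the elapsed time reaches it is also an assumption.

```python
def member_count(nodes, remove_shutdown=False, lost_timeout=0, no_show_timeout=0):
    """Count Member nodes. nodes: list of (state, seconds_in_state) tuples."""
    members = 0
    for state, elapsed in nodes:
        if state == "SHUTDOWN" and remove_shutdown:
            continue  # wd_remove_shutdown_nodes = on
        if state == "LOST" and lost_timeout > 0 and elapsed >= lost_timeout:
            continue  # wd_lost_node_removal_timeout expired
        if state == "NO-SHOW" and no_show_timeout > 0 and elapsed >= no_show_timeout:
            continue  # wd_no_show_node_removal_timeout expired
        members += 1
    return members

def quorum_among_members(live_nodes, members):
    """Majority rule computed against Member nodes only."""
    return live_nodes * 2 > members

# Five configured nodes, only one of them alive.
nodes = [("ALIVE", 0), ("SHUTDOWN", 300), ("SHUTDOWN", 300),
         ("LOST", 120), ("LOST", 120)]

defaults = member_count(nodes)
dynamic = member_count(nodes, remove_shutdown=True, lost_timeout=60)
print(defaults, quorum_among_members(1, defaults))  # 5 False
print(dynamic, quorum_among_members(1, dynamic))    # 1 True
```

With the default values all five nodes stay members and one live node cannot form a quorum; with the removal settings enabled the single live node does, which is the situation the caution above warns about.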

5.15.8. Life checking Pgpool-II

Watchdog checks Pgpool-II status periodically. This is called "life check".

wd_lifecheck_method (string)

Specifies the method of life check. This can be either of 'heartbeat' (default), 'query' or 'external'.

heartbeat: In this mode, watchdog sends the heartbeat signals (UDP packets) periodically to other Pgpool-II. Similarly, watchdog also receives the signals from other Pgpool-II. If no signal is received for a certain period, watchdog regards it as a failure of the Pgpool-II.

query: In this mode, watchdog sends the monitoring queries to other Pgpool-II and checks the response. Query mode may be useful when the Pgpool-II servers are installed far apart from each other.

Caution

In query mode, you need to set num_init_children large enough if you plan to use watchdog. This is because the watchdog process connects to Pgpool-II as a client.

external: This mode disables the built-in lifecheck of Pgpool-II watchdog and relies on an external system to provide node health checking of local and remote watchdog nodes.

external mode is not available in versions prior to Pgpool-II V3.5.

This parameter can only be set at server start.

wd_monitoring_interfaces_list (string)

Specifies a comma-separated list of network device names to be monitored by the watchdog process for the network link state. If all network interfaces in the list become inactive (disabled or cable unplugged), the watchdog will consider it a complete network failure and the Pgpool-II node will commit suicide. Specifying an empty list ('') disables the network interface monitoring. Setting it to 'any' enables monitoring on all available network interfaces except the loopback. Default is '' (empty list, monitoring disabled).

wd_monitoring_interfaces_list is not available in versions prior to Pgpool-II V3.5.

This parameter can only be set at server start.
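
The rules for wd_monitoring_interfaces_list can be expressed as a small decision function. The Python sketch below is illustrative only: link states are supplied as a dictionary instead of being read from the system, the function name is hypothetical, and the loopback device is assumed to be named "lo" as on Linux.

```python
def complete_network_failure(wd_monitoring_interfaces_list, link_up):
    """link_up: dict mapping interface name -> True if the link is active."""
    setting = wd_monitoring_interfaces_list.strip()
    if setting == "":
        return False  # empty list: monitoring is disabled
    if setting == "any":
        # Assumption: the loopback interface is called "lo".
        watched = [name for name in link_up if name != "lo"]
    else:
        watched = [name.strip() for name in setting.split(",") if name.strip()]
    # Failure only when every watched interface is inactive.
    return bool(watched) and not any(link_up.get(name, False) for name in watched)

links = {"lo": True, "eth0": False, "eth1": True}
print(complete_network_failure("eth0,eth1", links))  # False
print(complete_network_failure("eth0", links))       # True
print(complete_network_failure("", links))           # False
```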

wd_interval (integer)

Specifies the interval between life checks of Pgpool-II in seconds. The value must be greater than or equal to 1. Default is 10.

This parameter can only be set at server start.

wd_priority (integer)

This parameter can be used to elevate the local watchdog node priority in the elections to select the leader watchdog node. The node with the higher wd_priority value is selected as the leader watchdog node when the cluster elects its new leader in the event of the old leader watchdog node failure. wd_priority is also valid at the time of cluster startup. When some watchdog nodes start up at the same time, the node with the higher wd_priority value is selected as the leader node. So watchdog nodes should be started in order of wd_priority to prevent unintended nodes from being selected as leader. If all nodes have the same wd_priority value, the leader node is decided based on the starting time of the watchdog.

wd_priority is not available in versions prior to Pgpool-II V3.5.

This parameter can only be set at server start.

wd_ipc_socket_dir (string)

The directory where the UNIX domain socket accepting Pgpool-II watchdog IPC connections will be created. Default is '/tmp'. Be aware that this socket might be deleted by a cron job. We recommend setting this value to '/var/run' or a similar directory.

wd_ipc_socket_dir is not available in versions prior to Pgpool-II V3.5.

This parameter can only be set at server start.

5.15.9. Lifecheck Heartbeat mode configuration

heartbeat_hostnameX (string)

Specifies the IP address or hostname for sending and receiving the heartbeat signals. The number at the end of the parameter name is referred to as the "pgpool node id", and it starts from 0 (e.g. heartbeat_hostname0). You can specify multiple IP addresses or hostnames by separating them with a semicolon (;).

heartbeat_hostnameX is only applicable if the wd_lifecheck_method is set to 'heartbeat'.

This parameter can only be set at server start.

heartbeat_portX (integer)

Specifies the port number for sending and receiving the heartbeat signals. Specify only one port number here. Default is 9694. The number at the end of the parameter name is referred to as the "pgpool node id", and it starts from 0 (e.g. heartbeat_port0).

heartbeat_portX is only applicable if the wd_lifecheck_method is set to 'heartbeat'.

This parameter can only be set at server start.

heartbeat_deviceX (string)

Specifies the network device name for sending and receiving the heartbeat signals. The number at the end of the parameter name is referred to as the "pgpool node id", and it starts from 0 (e.g. heartbeat_device0). You can specify multiple network devices by separating them with a semicolon (;).

heartbeat_deviceX is only applicable if Pgpool-II is started with root privilege. If not, leave it as an empty string ('').

heartbeat_deviceX is only applicable if the wd_lifecheck_method is set to 'heartbeat'.

This parameter can only be set at server start.

Example 5-12. Heartbeat configuration

If you have 3 pgpool nodes with hostnames server1, server2 and server3, you can configure heartbeat_hostname, heartbeat_port and heartbeat_device as follows:

     heartbeat_hostname0 = 'server1'
     heartbeat_port0 = 9694
     heartbeat_device0 = ''

     heartbeat_hostname1 = 'server2'
     heartbeat_port1 = 9694
     heartbeat_device1 = ''

     heartbeat_hostname2 = 'server3'
     heartbeat_port2 = 9694
     heartbeat_device2 = ''
	

wd_heartbeat_keepalive (integer)

Specifies the interval in seconds between sending the heartbeat signals. Default is 2. wd_heartbeat_keepalive is only applicable if the wd_lifecheck_method is set to 'heartbeat'.

This parameter can only be set at server start.

wd_heartbeat_deadtime (integer)

Specifies the time in seconds before marking the remote watchdog node as a failed/dead node if no heartbeat signal is received within that time. Default is 30. wd_heartbeat_deadtime is only applicable if the wd_lifecheck_method is set to 'heartbeat'.

This parameter can only be set at server start.
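
The relationship between wd_heartbeat_keepalive and wd_heartbeat_deadtime can be illustrated with a short Python sketch. It is a model of the timing rule only, with timestamps passed in as plain numbers of seconds; the function names are hypothetical, and treating the node as dead only once the silence strictly exceeds the deadtime is an assumption.

```python
def heartbeat_timed_out(last_heartbeat, now, wd_heartbeat_deadtime=30):
    """True when no heartbeat has arrived within the deadtime window."""
    return (now - last_heartbeat) > wd_heartbeat_deadtime

def heartbeats_per_deadtime(wd_heartbeat_keepalive=2, wd_heartbeat_deadtime=30):
    """How many heartbeat intervals fit into one deadtime window."""
    return wd_heartbeat_deadtime // wd_heartbeat_keepalive

print(heartbeat_timed_out(last_heartbeat=100, now=120))  # False
print(heartbeat_timed_out(last_heartbeat=100, now=131))  # True
print(heartbeats_per_deadtime())                         # 15
```

With the default values, roughly 15 consecutive heartbeats have to be missed before the remote node is marked as failed.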

5.15.10. Lifecheck Query mode configuration

wd_life_point (integer)

Specifies the number of times to retry a failed life check of Pgpool-II. The value must be greater than or equal to 1. Default is 3.

wd_life_point is only applicable if the wd_lifecheck_method is set to 'query'.

This parameter can only be set at server start.

wd_lifecheck_query (string)

Specifies the query to use for the life check of remote Pgpool-II. Default is "SELECT 1".

wd_lifecheck_query is only applicable if the wd_lifecheck_method is set to 'query'.

This parameter can only be set at server start.

wd_lifecheck_dbname (string)

Specifies the database name for the connection used for the life check of remote Pgpool-II. Default is "template1".

wd_lifecheck_dbname is only applicable if the wd_lifecheck_method is set to 'query'.

This parameter can only be set at server start.

wd_lifecheck_user (string)

Specifies the user name for the connection used for the life check of remote Pgpool-II. Default is "nobody".

wd_lifecheck_user is only applicable if the wd_lifecheck_method is set to 'query'.

This parameter can only be set at server start.

wd_lifecheck_password (string)

Specifies the password for the user used for the life check of remote Pgpool-II.

If wd_lifecheck_password is left blank, Pgpool-II will first try to get the password for wd_lifecheck_user from the pool_passwd file before using the empty password.

You can also specify an AES256-CBC encrypted password in the wd_lifecheck_password field. To specify the AES encrypted password, the password string must be prefixed with AES after encrypting (using the aes-256-cbc algorithm) and encoding to base64.

To specify an unencrypted clear text password, prefix the password string with TEXT. For example, if you want to set mypass as the password, you should specify TEXTmypass in the password field. In the absence of a valid prefix, Pgpool-II will consider the string a plain text password.

You can also use pg_enc utility to create the correctly formatted AES encrypted password strings.
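
The prefix rules for wd_lifecheck_password can be summarized as a small classifier. The Python sketch below only separates the prefix from the rest of the string; it does not decrypt anything, the function name and the returned labels are assumptions for this example, and the value "AESabc==" is a made-up placeholder rather than a real encrypted password.

```python
def classify_password(value):
    """Return (kind, payload) according to the documented AES/TEXT prefixes."""
    if value.startswith("AES"):
        # Payload is expected to be the base64 text of the encrypted password.
        return ("aes", value[len("AES"):])
    if value.startswith("TEXT"):
        return ("text", value[len("TEXT"):])
    # No valid prefix: the whole string is treated as a plain text password.
    return ("text", value)

print(classify_password("TEXTmypass"))  # ('text', 'mypass')
print(classify_password("mypass"))      # ('text', 'mypass')
print(classify_password("AESabc=="))    # ('aes', 'abc==')
```

Under this reading, a clear text password that itself begins with AES or TEXT would need the explicit TEXT prefix to avoid being misinterpreted.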

Note: Pgpool-II requires a valid decryption key at startup to use the encrypted passwords. See Section 6.4.2 for more details on providing the decryption key to Pgpool-II.

wd_lifecheck_password is only applicable if the wd_lifecheck_method is set to 'query'.

This parameter can only be set at server start.

Default is '' (empty).