3.3. Configuring Pgpool-II

3.3.1. Configuring pgpool.conf

pgpool.conf is the main configuration file of Pgpool-II. You need to specify the path to the file when starting Pgpool-II using -f option. pgpool.conf is located at $prefix/etc/pgpool.conf by default, if it installed from source code.

To specify the Pgpool-II clustering mode, set backend_clustering_mode parameter to the value explained below.

Table 3-1. backend_clustering_mode value in pgpool.conf

Clustering modevalue
Streaming replication modestreaming_replication
Replication modenative_replication
Logical replication modelogical_replication
Slony modeslony
Snapshot isolation modesnapshot_isolation
Raw moderaw

These configuration files are located at /usr/local/etc with default installation from source code. You can copy one of them as pgpool.conf. (probably you need root privilege for this)

# cd /usr/local/etc
# cp pgpool.conf.sample pgpool.conf
    

3.3.2. Clustering mode of Pgpool-II

There are six different clustering modes in Pgpool-II: streaming replication mode, logical replication mode, main replica mode (slony mode), native replication mode, raw mode and snapshot isolation mode. In any mode, Pgpool-II provides connection pooling, and automatic fail over. Online recovery can be used only with streaming replication mode, snapshot isolation mode and native replication mode. See Section 5.11 for more details of online recovery.

Those modes are exclusive each other and cannot be changed after starting the server. You should make a decision which to use in the early stage of designing the system. If you are not sure, it is recommended to use the streaming replication mode or the snapshot isolation mode.

The streaming replication mode can be used with PostgreSQL servers operating streaming replication. In this mode, PostgreSQL is responsible for synchronizing databases. This mode is widely used and most recommended way to use Pgpool-II. Load balancing is possible in the mode. visibility consistency among nodes is not guaranteed.

In the snapshot isolation mode Pgpool-II is responsible for synchronizing databases. The advantage for the mode is the synchronization is done in synchronous way: writing to the database does not return until all of PostgreSQL servers finish the write operation. Also it guarantees the visibility consistency among nodes. To put it simply, it means that the visibility rule of transactions on single server is applied to a cluster consisting of multiple servers as well. This is a remarkable feature of the snapshot isolation mode in Pgpool-II. In fact, the snapshot isolation mode in Pgpool-II is the only system which guarantees the visibility consistency among nodes without modifications to PostgreSQL at the moment. Because of this, applications do not need to recognize that they are using a cluster consisting of PostgreSQL servers, rather than a single PostgreSQL system. However in this mode the transaction isolation level must be REPEATABLE READ. You need to set postgresql.conf like this:

default_transaction_isolation = 'repeatable read'
    

Also you need to aware that performance in the mode may be worse than the streaming replication mode and native replication mode due to the overhead to keep the consistency in transactions.

In the native replication mode, Pgpool-II is responsible for synchronizing databases. The advantage for the mode is the synchronization is done in synchronous way: writing to the database does not return until all of PostgreSQL servers finish the write operation. As visibility consistency among nodes is not guaranteed, it is recommended to use the snapshot isolation mode except you want to use other than REPEATABLE READ isolation mode. Load balancing is possible in the mode.

The logical replication mode can be used with PostgreSQL servers operating logical replication. In this mode, PostgreSQL is responsible for synchronizing tables. Load balancing is possible in the mode. Since logical replication does not replicate all tables, it's user's responsibility to replicate the table which could be load balanced. Pgpool-II load balances all tables. This means that if a table is not replicated, Pgpool-II may lookup outdated tables in the subscriber side.

The main replica mode (slony mode) can be used with PostgreSQL servers operating Slony. In this mode, Slony/PostgreSQL is responsible for synchronizing databases. Since Slony-I is being obsoleted by streaming replication, we do not recommend to use this mode unless you have specific reason to use Slony. Load balancing is possible in the mode.

In the raw mode, Pgpool-II does not care about the database synchronization. It's user's responsibility to make the whole system does a meaningful thing. Load balancing is not possible in the mode.

3.3.3. Process management modes

Pgpool-II implements a multi-process architecture where each child process can handle exactly one client connection at any time. The total number of concurrent client connections Pgpool-II can handle is configured by the num_init_children config parameter. Pgpool-II supports two child process management modes. Dynamic and Static. In static process management mode, Pgpool-II pre-forks the num_init_children number of child process at startup, and each child process keeps listening for incoming client connection. While with dynamic process management mode, Pgpool-II keeps track of idle processes and forks or kills processes to keep this number within the specified boundaries.

process_management_mode is not available prior to Pgpool-II V4.4.