Release Date: 2019-10-31
This version implements long awaited features including statement_level_load_balance and auto_failback. Also it enhances number of areas related to performance. Finally it imports PostgreSQL 12's new SQL parser.
Major enhancements in Pgpool-II 4.1 include:
Statement level load balancing. Previously Pgpool-II only allows session level load balancing. This version allows to use statement level load balancing, which is useful for frontends permanently connecting to Pgpool-II but want to use existing standby server resources.
Auto failback allows to automatically attach streaming replication standby servers, which are considered safe enough to failback.
Enhance performance in number of areas.
Shared relation cache allows to reuse relation cache among sessions to reduce internal queries against PostgreSQL system catalogs.
Have separate SQL parser for DML statements to eliminate unnecessary parsing effort.
Load balancing control for specific queries.
Import PostgreSQL 12 SQL parser.
Version 4.1 contains some changes that may affect compatibility with previous releases. Observe the following incompatibilities:
Add replication_state and replication_sync_state columns of SHOW POOL NODES and friends. (Tatsuo Ishii)
This allows to show important information from pg_stat_replication, which is available from PostgreSQL 9.1 (also with replication_state_sync. it's only available since 9.2 however). For this purpose new backend_application_name parameter is added to each backend_host configuration parameters. pg_stat_replication is called from streaming replication delay checking process. So if sr_check_period is 0, those new columns are not available.
Add parameter enable_consensus_with_half_votes to configure majority rule calculations. (Muhammad Usama, Tatsuo Ishii)
This changes the behavior of the decision of quorum existence and failover consensus on even number (i.e. 2, 4, 6...) of watchdog clusters. Odd number of clusters (3, 5, 7...) are not affected. When this parameter is off (the default), a 2 node watchdog cluster needs to have both 2 nodes are alive to have a quorum. If the quorum does not exist and 1 node goes down, then 1) VIP will be lost, 2) failover script is not executed and 3) no watchdog master exists. Especially #2 could be troublesome because no new primary PostgreSQL exists if existing primary goes down. Probably 2 node watchdog cluster users want to turn on this parameter to keep the existing behavior. On the other hand 4 or more even number of watchdog cluster users will benefit from this parameter is off because now it prevents possible split brain when a half of watchdog nodes go down.
If installing from RPMs, by default Pgpool-II is started by postgres user. (Bo Peng)
Because of the security reason, the Pgpool-II default startup user is changed to postgres user.
If installing from RPMs, postrges user will be allowed to run if_up/down_cmd and arping_cmd with sudo without a password. If if_up/down_cmd or arping_cmd starts with "/", the setting specified in if_cmd_path or arping_path will be ignored.
Down grade LOG to DEBUG5 in sent message module. (Tatsuo Ishii)
Allow to use statement level load balancing. (Bo Peng)
This feature enables selecting load balancing node per statement. The current feature for load balancing, the load balancing node is decided at the session start time and will not be changed until the session ends. When set to statement_level_load_balance = on, the load balancing node is decided for each read query. For example, in applications that use connection pooling remain connections open to the backend server, because the session may be held for a long time, the load balancing node does not change until the session ends. In such applications, when statement_level_load_balance is enabled, it is possible to decide load balancing node per query, not per session.
Add auto_failback (Takuma Hoshiai).
This allows to reattach backend node automatically that is in DOWN status but actually it is running normally.
To use this feature it is required that PostgreSQL is 9.1 or later and new configuration variable auto_failback is enabled. Also Pgpool-II must be operating in streaming-replication mode, with sr_check and health_check are enabled. Pgpool-II calls pg_stat_replication on the PostgreSQL primary server to make sure that the standby node in question is running and receiving replication stream from the primary server.
This feature is useful in the case that a standby server fails over due to a temporary network failure.
Add new enable_shared_relcache parameter. (Takuma Hoshiai)
The relation cache were stored in local cache of child processes, so all child processes executed same query to get relation cache. If enable_shared_relcache is on, the relation cache is stored in memory cache and all child process share it. It will expect to reduce the load that same query is executed.
Add new parameter check_temp_table to check temporary tables. (Tatsuo Ishii)
Checking temporary tables is slow because it needs to lookup system catalogs. To eliminate the lookup, new method to trace CREATE TEMP TABLE/DROP TABLE is added. To use the new method, set check_temp_table to trace.
Note that it is impossible to trace tables created in functions and triggers. In this case existing method should be used.
Reduce internal queries against system catalogs. (Tatsuo Ishii)
Currently the relcache module issues 7+ queries to obtain various info from PostgreSQL system catalogs. Some of them are necessary for Pgpool-II to work with multiple version of PostgreSQL. To reduce such internal queries, get PostgreSQL version to know what kind of queries are needed. For example, we need to know if pg_namespace exists and for this purpose we send a query against pg_class. But if we know that pg_namespace was introduced in PostgreSQL 7.3, we do not need to inquire pg_class.
Performance enhancements for the large INSERT and UPDATE statements. (Muhammad Usama)
Pgpool-II only needs very little information, especially for the INSERT and UPDATE statements to decide where it needs to send the query. For example: In master-slave mode, for the INSERT statements Pgpool-II only requires the relation name referenced in the statement while it doesn't care much about the column values and other parameters. But since the parser we use in Pgpool-II is taken from PostgreSQL source which parses the complete query including the value lists which seems harmless for smaller statements but in case of INSERT and UPDATE with lots of column values and large data in value items, consumes significant time.
So the idea here is to short circuit the INSERT and UPDATE statement parsing as soon as we have the required information. For that purpose, the commit adds the second minimal parser that gets invoked in master-slave mode and tries to extract the performance for large INSERT and UPDATE statements.
Apart from the second parser addition, changes aiming towards the performance enhancements are also part of the commit. See the commit log for more details.
Import PostgreSQL 12 beta2 new parser. (Bo Peng)
Major changes of PostgreSQL 12 parser include:
Add new VACUUM options:SKIP_LOCKED, INDEX_CLEANUP and TRUNCATE.
Add COMMIT AND CHAIN and ROLLBACK AND CHAIN commands.
Add a WHERE clause to COPY FROM.
Allow to use CREATE OR REPLACE AGGREGATE command.
Allow to use mcv (most-common-value) in CREATE STATISTICS.
ADD REINDEX option CONCURRENTLY.
Add EXPLAIN option SETTINGS.
Allow to route relcache queries to load balance node. (Tatsuo Ishii)
Queries to build relcache entries were always sent to master (primary) node. This is usually good because we could eliminate the bad effect of replication delay. However if users want to lower the load of master node, it would be nice if we could route the queries to other than master node. This patch introduces new parameter relcache_query_target. If it is set to load_balance_node, relcache queries will be routed to load balance node. If it is set to master, the queries are routed to master node, which is same as before (this is the default).
Disable load balance after a SELECT having functions specified in black function list or not specified in white function list. (Bo Peng)
In Pgpool-II 4.0 or earlier, if we set disable_load_balance_on_write = transaction, when a write query is issued inside an explicit truncation, subsequent queries should be sent to primary only until the end of this transaction in order to avoid the replication delay. However, the SELECTs having write functions specified in black_function_list or not specified in white_function_list are not regarded as a write query and the subsequent read queries are still load balanced. This commit will disable load balance after a SELECT having functions specified in black function list or not specified in white function list.
Implement new feature to not accept incoming connections. (Tatsuo Ishii)
Pgpool-II accepts up to num_init_children frontends and queues up more connection requests until one of child process becomes free. This mostly works well, but if each session takes long time, the queue grows longer and the whole system does not work smoothly. To overcome the problem, a new way to deal with many connection requests from frontend is implemented: When reserved_connections is set to 1 or greater, incoming connections from clients are not accepted with error message "Sorry, too many clients already", rather than blocked if the number of current connections from clients is more than (num_init_children - reserved_connections). This is exactly the same behavior as PostgreSQL.
Enhance performance by eliminating select(2) system calls when they are not necessary. (Tatsuo Ishii, Jesper Pedersen)
Enhance performance while sending message to frontend. (Tatsuo Ishii)
SimpleForwardToFrontend(), which is responsible for sending message to frontend, does write buffering only if it is either 'D' (DataRow) or 'd' (CopyData). Other message types were immediately written to socket. But actually this was not necessary. So if the messages are not critical, just write to buffer. With this 10-17% performance enhance was observed.
Avoid error or notice message analysis if it's not necessary. (Tatsuo Ishii)
After sending query to backend, Pgpool-II always calls pool_extract_error_message() via per_node_error_log(). In the function memory allocation is performed even if error or notice message is returned from backend. To avoid the waste of CPU cycle, check message kind and avoid calling pool_extract_error_message() if it's not error or notice message.
Enhance performance of CopyData message handling. (Tatsuo Ishii)
When COPY XX FROM STDIN gets executed (typical client is pg_dump), each copy row data is sent from Pgpool-II to frontend using CopyData message. Previously, one CopyData message was followed by a flush, which costed a lot. Instead, now flush is done in subsequent Command Complete, Notice message or Error message. A quick test reveals that this change brings x2.5 speed up.
Support ECDH key exchange with SSL (Takuma Hoshiai)
Add backend_application_name to "pgpool show backend" group. (Tatsuo Ishii)
Deal with PostgreSQL 12. (Tatsuo Ishii)
recovery.conf cannot be used anymore. Standby's recovery configuration is now in postgresql.conf. Also "standby.signal" file is needed in PostgreSQL database cluster directory to start postmaster as a standby server.
HeapTupleGetOid() is not available any more in PostgreSQL 12. Use GETSTRUCT() and refer to oid column of Form_pg_proc.
Change pgpool_adm extension. Now that oid is gone, the signature of CreateTemplateTupleDesc() has been changed.
Speed up failover when all of backends are down. (Tatsuo Ishii)
Pgpool-II tries to find primary node till search_primary_node_timeout expires even if all of the backend are in down status. This is not only a waste of time but makes Pgpool-II looked like hanged because while searching primary node failover process is suspended and all of the Pgpool-II child process are in defunct state, thus there's no process which accepts connection requests from clients. Since the default value of searching primary is 300 seconds, typically this keeps on for 300 seconds. This is not comfortable for users.
To fix this immediately give up finding primary node regardless search_primary_node_timeout and promptly finish the failover process if all of the backend are in down status.
Resign the master watchdog node from master responsibilities if the primary backend node gets into quarantine state on that. (Muhammad Usama)
By doing this, we could avoid the situation on which there's no primary PostgreSQL server exists. To implement this, make the master/coordinator watchdog node resign from its status if it fails to get the consensus for the quarantined primary node failover, within FAILOVER_COMMAND_FINISH_TIMEOUT(15) seconds.
When the watchdog master resigns, because of quarantined primary node its wd_priority is decreased to (-1), so that it should get the least preference in the next election for the master/coordinator node selection. And once the election is concluded the wd_priority for the node gets restored to the original configured value.
In case of failed consensus for standby node failover no action is taken.
Add parameter enable_consensus_with_half_votes to configure majority rule calculations. (Muhammad Usama, Tatsuo Ishii)
Pgpool-II takes the decision of quorum existence and failover consensus after receiving the exact 50% of votes when the watchdog cluster is configured with an even number of nodes. With enable_consensus_with_half_votes parameter, users can tell Pgpool-II, whether the distributed consensus in an even number of nodes cluster requires (n/2) or ((n/2) +1) votes to decide the majority. Odd number of clusters (3, 5, 7...) are not affected. Extra caution is needed for 2 node watchdog cluster users. See Section A.13.2 for more details.
Allow to specify absolute path in pool_passwd. (Bo Peng)
Patch is provided by Danylo Hlynskyi.
Add various sample scripts. (Bo Peng)
Allow failover.sh.sample, follow_master.sh.sample, recovery_1st_stage.sample, recovery_2nd_stage.sample, pgpool_remote_start.sample scripts to be included in distributions.
Add performance chapter (Chapter 7). (Tatsuo Ishii)
Enhance 'getting started' of 'tutorial' chapter, 'watchdog' of 'tutorial' and some sections of 'server administration'(takuma hoshiai)
Update configuration example "Pgpool-II + watchdog setup example". (bo peng)
Mention that schema qualifications cannot be used in white/black_function_list. (tatsuo Ishii)
Add note to detach_false_primary configuration parameter. (tatsuo ishii)
Add more explanation to follow_master_command. (tatsuo ishii)
Enhance watchdog/pgpool-ii example so that it mentions about pg_monitor role. (tatsuo ishii)
Mention that multi-statement queries are sent to primary node only. (tatsuo ishii)
Add load balancing description. (tatsuo ishii)
Add useful link how to create pcp.conf in the pcp reference page. (tatsuo ishii)
Add more description to pcp_node_info manual. (tatsuo ishii)
Add description to pg_md5 man page how to show pool_passwd ready string. (tatsuo ishii)
Enhance client authentication docs. (tatsuo ishii)
Enhance watchdog documents regarding quorum failover. (tatsuo ishii)
Mention that in raw mode or load_balance_mode = off case for relation cache. (tatsuo ishii)
Add general description about failover. (tatsuo ishii)