[pgpool-hackers: 3907] Re: Proposal: If replication delay exceeds delay_threshold, elect a new load balance node with less delay

Tatsuo Ishii ishii at sraoss.co.jp
Wed May 26 19:22:59 JST 2021


>> Hi Peng.
>> 
>> I modified my patch.
>> 
>> The select_lower_delay_load_balance_node.patch_r3 includes documentation update.
> 
> It seems your coding style is against our standard. We inherit the
> coding style of PostgreSQL:
> https://www.postgresql.org/docs/13/source-format.html

Sorry, I was looking into the previous patch. Please disregard my comment.

- Comments on the docs

> +      This parameter is only valid when delay_threshold is set to larger than 0.

+      This parameter is valid only when delay_threshold is set to a value greater than 0.

> +      When set to on, if load balancing node is delayed over delay_threshold
> +      Pgpool-II send the query to not the primary node but the least delay standby which is set backend_weight to larger than 0.

+      When set to on, if the delay of the load balancing node is greater than delay_threshold,
+      <productname>Pgpool-II</productname> sends read queries not to the primary node but to the least delayed standby node whose backend_weight is greater than 0.

> +      If all standby nodes are delayed over delay_threshold, Pgpool-II send to the primary. Default is off.

+      If the delay of all standby nodes is greater than delay_threshold, <productname>Pgpool-II</productname> sends read queries to the primary. Default is off.


>> The test.sh is the regression test script.

The test script works only with PostgreSQL 12 or later because the
test relies on a feature that exists only in PostgreSQL 12 or later:
postgresql.conf's include_if_exists directive.
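
For example, the script could skip itself on older servers with a
guard like the one below. This is only a sketch and not part of the
submitted test.sh: the function names are made up here, and it assumes
a version string of the form printed by "pg_config --version", such as
"PostgreSQL 12.7".

```shell
#!/bin/sh
# Hypothetical helper (not in test.sh): extract the major version from
# a string such as "PostgreSQL 12.7" or "PostgreSQL 13beta1".
pg_major_version() {
    # Take the second field, then keep only its leading run of digits.
    echo "$1" | awk '{ print $2 }' | sed 's/^\([0-9][0-9]*\).*/\1/'
}

# Hypothetical helper: succeed if the version string is 12 or later,
# which is the requirement stated in this mail.
pg_is_12_or_later() {
    [ "$(pg_major_version "$1")" -ge 12 ]
}
```

A test script might then start with something like:
pg_is_12_or_later "$(pg_config --version)" || { echo "skipped: needs PostgreSQL 12 or later"; exit 0; }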

You also need to patch src/util/pool_process_reporting.c so that "show
pool_status" includes prefer_lower_delay_standby.
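
Once that is done, a check along these lines could confirm that the
parameter is reported. It is a sketch only: the function name is made
up, and it assumes unaligned psql output (-A -t) in which the item
name is the first "|"-separated field.

```shell
#!/bin/sh
# Hypothetical helper: read "show pool_status" output from stdin and
# succeed if the named item appears in the first column.
# Assumes psql was run with -A -t, so fields are separated by "|".
pool_status_has_item() {
    awk -F'|' -v item="$1" '$1 == item { found = 1 } END { exit !found }'
}
```

With the pgpool_setup ports used below, it could be run as:
psql -p 11000 -A -t -c "show pool_status" test | pool_status_has_item prefer_lower_delay_standby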

Finally, I briefly checked the feature using pgbench.

$ pgpool_setup -n 3

$ echo "prefer_lower_delay_standby = on" >> etc/pgpool.conf 
$ echo "statement_level_load_balance = on" >> etc/pgpool.conf 
$ ./startall
$ psql -p 11003 -c "select pg_wal_replay_pause()" test
$ pgbench -i -p 11000 test
$ psql -p 11000 -c "show pool_nodes" test
 node_id | hostname | port  | status | pg_status | lb_weight |  role   | pg_role | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change  
---------+----------+-------+--------+-----------+-----------+---------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
 0       | /tmp     | 11002 | up     | up        | 0.333333  | primary | primary | 0          | false             | 0                 |                   |                        | 2021-05-26 18:55:13
 1       | /tmp     | 11003 | up     | up        | 0.333333  | standby | standby | 0          | false             | 13188800          | streaming         | async                  | 2021-05-26 18:55:13
 2       | /tmp     | 11004 | up     | up        | 0.333333  | standby | standby | 0          | true              | 0                 | streaming         | async                  | 2021-05-26 18:55:13
(3 rows)

$ pgbench -p 11000 -n -S -t 100 test

$ psql -p 11000 -c "show pool_nodes" test
 node_id | hostname | port  | status | pg_status | lb_weight |  role   | pg_role | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change  
---------+----------+-------+--------+-----------+-----------+---------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
 0       | /tmp     | 11002 | up     | up        | 0.333333  | primary | primary | 29         | true              | 0                 |                   |                        | 2021-05-26 18:55:13
 1       | /tmp     | 11003 | up     | up        | 0.333333  | standby | standby | 0          | false             | 13188800          | streaming         | async                  | 2021-05-26 18:55:13
 2       | /tmp     | 11004 | up     | up        | 0.333333  | standby | standby | 73         | false             | 0                 | streaming         | async                  | 2021-05-26 18:55:13
(3 rows)

The good news is that, with prefer_lower_delay_standby on, SELECTs are
not sent to standby node 1 because its replication delay (13188800)
exceeds delay_threshold (10000000). However, the select_cnt values of
the primary and standby node 2 look strange, since the lb_weight of
both nodes is identical. Because pgbench issues 100 SELECTs, the
select_cnt of each node should be close to 50, no?
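
To make the expectation concrete, here is the arithmetic as a small
script. It is a sketch under an assumption: nodes whose delay exceeds
the threshold are skipped and the SELECTs are spread over the
remaining nodes in proportion to lb_weight. That is the expectation
described in this mail, not necessarily what the patch implements, and
the function name is made up.

```shell
#!/bin/sh
# Hypothetical helper: print the expected select_cnt per node when $1
# SELECTs are spread over eligible nodes in proportion to lb_weight.
# $2 is the delay threshold. Input on stdin, one line per node:
#   node_id lb_weight replication_delay
# Assumption: a node is excluded when its delay exceeds the threshold.
expected_select_cnt() {
    awk -v n="$1" -v threshold="$2" '
        { id[NR] = $1; w[NR] = ($3 > threshold) ? 0 : $2; total += w[NR] }
        END {
            if (total == 0) exit 1    # no eligible node at all
            for (i = 1; i <= NR; i++)
                printf "%s %.0f\n", id[i], n * w[i] / total
        }'
}
```

Fed with the values from the "show pool_nodes" output above:
printf '0 0.333333 0\n1 0.333333 13188800\n2 0.333333 0\n' | expected_select_cnt 100 10000000
it prints 50 for node 0, 0 for node 1, and 50 for node 2.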

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp

