[pgpool-hackers: 4389] statement_level_load_balance doesn't consider delay_threshold

NINGWEI CHEN chen at sraoss.co.jp
Tue Aug 29 13:49:02 JST 2023


Dear pgpool-hackers,

In Pgpool-II 4.2 and earlier, it appears that the feature 
statement_level_load_balance does not take delay_threshold into consideration.

When statement_level_load_balance is off, a standby whose replication delay 
exceeds delay_threshold is no longer selected as a load balance target. 
However, when statement_level_load_balance is on, standbys that exceed delay_threshold 
are still chosen, which results in errors for read queries sent to them.

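The error can be reproduced on versions 4.1 and 4.2 as follows. 
WAL replay is paused on node 1 (port 11003) before pgbench -i runs, so pgbench_accounts 
does not exist on that node and any SELECT routed there fails. 
The issue no longer seems to occur from version 4.3 onward.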

I am going to look into this.

* settings in pgpool.conf
======
statement_level_load_balance = on
delay_threshold = 10000000
======

* reproduction steps
======
$ pgpool_setup -n 4
$ ./startall
$ psql -p 11003 -c "select pg_wal_replay_pause()" test
$ pgbench -i -p 11000 test

$ psql -p 11000 -c "show pool_nodes" test
 node_id | hostname  | port  | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change  
---------+-----------+-------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
 0       | localhost | 11002 | up     | 0.250000  | primary | 6          | false             | 0                 |                   |                        | 2023-08-29 12:56:14
 1       | localhost | 11003 | up     | 0.250000  | standby | 1          | false             | 13270144          | streaming         | async                  | 2023-08-29 12:56:14
 2       | localhost | 11004 | up     | 0.250000  | standby | 1          | false             | 0                 | streaming         | async                  | 2023-08-29 12:56:14
 3       | localhost | 11005 | up     | 0.250000  | standby | 0          | true              | 0                 | streaming         | async                  | 2023-08-29 12:56:14
(4 rows)

$ pgbench -p 11000 -n -S -t 100 test
pgbench (16beta3)
pgbench: error: client 0 script 0 aborted in command 1 query 0: ERROR:  relation "pgbench_accounts" does not exist
LINE 1: SELECT abalance FROM pgbench_accounts WHERE aid = 53948;
                             ^
transaction type: <builtin: select only>
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
maximum number of tries: 1
number of transactions per client: 100
number of transactions actually processed: 3/100
number of failed transactions: 0 (0.000%)
latency average = 1.987 ms
initial connection time = 0.724 ms
tps = 503.355705 (without initial connection time)
pgbench: error: Run was aborted; the above results are incomplete.
======
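
To check which standbys are over delay_threshold before running pgbench, something 
like the following could be used. This is only a rough sketch: lagging_standbys is a 
hypothetical helper name, and it assumes the column order of the "show pool_nodes" 
output shown above (role in column 6, replication_delay in column 9), which may 
differ in other Pgpool-II versions.

```shell
# Hypothetical helper: list standbys whose replication_delay exceeds a limit.
# Reads unaligned "show pool_nodes" rows ('|' separated) from stdin.
# Assumes role is field 6 and replication_delay is field 9, as in the output above.
# Prints "node_id replication_delay" for each lagging standby.
lagging_standbys() {
    awk -F'|' -v limit="$1" \
        '$6 == "standby" && $9 + 0 > limit + 0 { print $1, $9 }'
}
```

It would be fed with unaligned, tuples-only psql output, for example: 
psql -p 11000 -At -c "show pool_nodes" test | lagging_standbys 10000000 
With the node states above, only node 1 should be listed.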


Best Regards.
-- 
SRA OSS LLC
Chen Ningwei<chen at sraoss.co.jp>


