[pgpool-general: 6912] Re: Inconsistency between DB nodes with native replication

Thu Mar 5 18:37:36 JST 2020

Hello,

> Hello,
> 
> After some further testing it seems that the source of our issue is somewhere else than in one of the nodes becoming quarantined. After applying the latest patch, the test script managed to create inconsistency after a few days of running. Based on that I decided to change the configuration in pgpool.conf so it will degenerate one of the nodes in case inconsistency is found by turning on these parameters:
> 
> replication_stop_on_mismatch = on
> failover_if_affected_tuples_mismatch = on
> 
> Since some of our developers are using tools, which send SELECT queries with comments (such as pgAdmin), I also had to turn on this parameter in load-balancing section, so it doesn't cause a backend failover in the case it queries something like a DB object by OID (which can differ between the nodes):
> 
> allow_sql_comments = on
> 
> So far it seems to be doing what I would expect - inconsistency happens, next UPDATE or DELETE discovers the fact, pgPool degenerates the second (non-master) node and next commands proceed normally on the remaining DB node. I can then run Online Recovery on the degenerated node (which seems to be working without an issue). I have a few questions though:
> 
> The  replication_stop_on_mismatch covers all the situations as failover_if_affected_tuples_mismatch does plus some others or do I need to have both of these parameters turned on, if I want to cover any possible case of inconsistency? 

Those parameters cover different
situation. replication_stop_on_mismatch covers the case the packets
and their kind are not same on DB nodes. For example, if SELECT is
sent to all nodes and some of returns different number of rows,
replication_stop_on_mismatch will detect it since the least row
returning node will then send "Command Complete" packet which is
different from the packet for row data.

failover_if_affected_tuples_mismatch covers when DML
(update/insert/delete) returns different number of affected rows. In
this case the number of packets and kind is same but the contents
(the number of affected rows is deffer).

So if you want the maximum protection, you would want to turn on both
parameters.

> From what I saw so far, it seems that the pgPool always degenerates the second (non-master) node, which actually works well for our case, since it seems the command, which causes the inconsistency is always skipping the second node. Can we always expect this behavior in the case we have two DB nodes? 

Pgpool does voting when those situations above happens to choose the
majority nodes. However if there are even number of DB nodes (e.g. 2,
4 ...) it is possible that no one can get the majority. In this case
Pgpool regards the master node is correct. If there are only 2 DB
nodes, apparently no one gets the majority and always the master wins.

> As for the testing of a possible inconsistency with my script...
> 
> Since the change in the pgpool.conf I changed the script, so that it doesn't send the control SELECT with a comment to the pgPool service, since it would just get load-balanced now, but I connect directly to each of the DB nodes and compare the result of the SELECT between them. Last case of inconsistency happened yesterday, I am sending the specific chunk of the log as an attachment. From what I can tell, it happens the same way as before - one of the INSERT commands gets performed on the master node (node 0) only, so each node has a different number of rows in the table, which gets picked up by the next UPDATE command. However, what I also noticed - in the case that said INSERT is performed on both nodes, there is also a SELECT on pg_catalog.pg_attrdef and pg_catalog.pg_class beforehand (on master node):
> 
> SELECT count(*) FROM pg_catalog.pg_attrdef AS d, pg_catalog.pg_class AS c WHERE d.adrelid = c.oid AND pg_get_expr(d.adbin, d.adrelid) ~ 'nextval' AND c.oid = pg_catalog.to_regclass('"replication_test"')
> 
> This SELECT, however, is not present in the case the INSERT is not performed on the second (non-master) node. Do you think this could be some kind of lead to where the issue comes from? Please, let me know in the case you have any idea where to look next for the source of the issue. I am also sending the latest pgpool.conf as an attachment in case there is some parameter setup that could be the cause of the behavior after the latest patch.

Not sure at this point. However I noticed that the pgpool process
(29684) sends DISCARD command to only 1 node. This is unusual since
pgpool should send DISCARD at the end of session to all DB nodes. The
only explanation is either 1) process 29684 believed that there is
only 1 DB node 2) node 1 is dead. I still think about the possibilty
that nodes 1 status was dead (or qurantine) at the time when the
process started to process clients request (INSERT) from process
29684's view. Do you have more log of process 29684 before this?

Mar  2 13:29:01 AISGPGP01 pgpool[29684]: [102356-1] 2020-03-02 13:29:01 ::: APP - [No Connection] :A: DB - [No Connection] :D: USER - [No Connection] :U: |LOG:  new connection received
Mar  2 13:29:01 AISGPGP01 pgpool[29684]: [102356-2] 2020-03-02 13:29:01 ::: APP - [No Connection] :A: DB - [No Connection] :D: USER - [No Connection] :U: |DETAIL:  connecting host=dbstest.cca.cz port=57116
Mar  2 13:29:02 AISGPGP01 pgpool[29684]: [102357-1] 2020-03-02 13:29:02 ::: APP - psql :A: DB - aisgdvyv :D: USER - aisg :U: |LOG:  DB node id: 0 backend pid: 27333 statement: SELECT attname, pg_get_expr(d.adbin, d.adrelid), coalesce((pg_get_expr(d.adbin, d.adrelid) LIKE '%now()%' OR pg_get_expr(d.adbin, d.adrelid) LIKE '%''now''::text%' OR pg_get_expr(d.adbin, d.adrelid) LIKE '%CURRENT_TIMESTAMP%' OR pg_get_expr(d.adbin, d.adrelid) LIKE '%CURRENT_TIME%' OR pg_get_expr(d.adbin, d.adrelid) LIKE '%CURRENT_DATE%' OR pg_get_expr(d.adbin, d.adrelid) LIKE '%LOCALTIME%' OR pg_get_expr(d.adbin, d.adrelid) LIKE '%LOCALTIMESTAMP%') AND (a.atttypid = 'timestamp'::regtype::oid OR a.atttypid = 'timestamp with time zone'::regtype::oid OR a.atttypid = 'date'::regtype::oid OR a.atttypid = 'time'::regtype::oid OR a.atttypid = 'time with time zone'::regtype::oid) , false) FROM pg_catalog.pg_class c, pg_catalog.pg_attribute a  LEFT JOIN
Mar  2 13:29:02 AISGPGP01 pgpool[29684]: [102357-2]  pg_catalog.pg_attrdef d ON (a.attrelid = d.adrelid AND a.attnum = d.adnum) WHERE c.oid = a.attrelid AND a.attnum >= 1 AND a.attisdropped = 'f' AND c.oid = to_regclass('"replication_test"') ORDER BY a.attnum
Mar  2 13:29:02 AISGPGP01 pgpool[29684]: [102358-1] 2020-03-02 13:29:02 ::: APP - psql :A: DB - aisgdvyv :D: USER - aisg :U: |LOG:  DB node id: 0 backend pid: 27333 statement: INSERT INTO "replication_test"("pk_id", "int_number", "custom_text", "custom_date") VALUES (2,"pg_catalog"."date_part"('dow',"pg_catalog"."timestamptz"('2020-03-02 13:29:02.016628+01'::text)), ('Current date/time is '||"pg_catalog"."timestamptz"('2020-03-02 13:29:02.016628+01'::text) ),"pg_catalog"."timestamptz"('2020-03-02 13:29:02.016628+01'::text))
Mar  2 13:29:02 AISGPGP01 pgpool[29684]: [102359-1] 2020-03-02 13:29:02 ::: APP - psql :A: DB - aisgdvyv :D: USER - aisg :U: |LOG:  DB node id: 0 backend pid: 27333 statement:  DISCARD ALL

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp