View Issue Details

ID: 0000284
Project: Pgpool-II
Category: Bug
View Status: public
Last Update: 2017-03-15 16:26
Reporter: gmartinez
Assigned To: Muhammad Usama
Priority: normal
Severity: major
Reproducibility: random
Status: closed
Resolution: no change required
Platform: x86_64
OS: Oracle Linux
OS Version: 6.8
Product Version: 3.6.1
Summary: 0000284: pgpool-II native replication loses sync every 1-2 days
Description: I've got a 3-node pgpool cluster running with watchdog and using pgpool's replication feature. I've noticed that every 1-3 days, one of the nodes will fall out of sync with a tuple mismatch. None of the nodes went offline during this period or failed healthchecks.

But shouldn't they be in sync with replication enabled?

2017-02-05 20:28:23: pid 3984111: LOG: pgpool detected difference of the number of inserted, updated or deleted tuples. Possible last query was: "DELETE FROM stats WHERE node_id = $1"
2017-02-05 20:28:23: pid 3984111: LOG: processing command complete
2017-02-05 20:28:23: pid 3984111: DETAIL: CommandComplete: Number of affected tuples are: 0 1 0
2017-02-05 20:28:23: pid 3984111: LOG: processing ready for query message
2017-02-05 20:28:23: pid 3984111: DETAIL: ReadyForQuery: Degenerate backends: 1
2017-02-05 20:28:23: pid 3984111: LOG: processing ready for query message
2017-02-05 20:28:23: pid 3984111: DETAIL: ReadyForQuery: Number of affected tuples are: 0 1 0
2017-02-05 20:28:23: pid 3984111: LOG: received degenerate backend request for node_id: 1 from pid [3984111]
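
The DETAIL lines above list one affected-tuple count per backend, in node order (here `0 1 0`, after which node 1 was degenerated). A minimal Python sketch for pulling those counts out of a log line and flagging the node that disagrees is shown below. The helper names `affected_tuples` and `outlier_nodes` are hypothetical, and the majority rule is a simplifying assumption for illustration, not a copy of Pgpool-II's internal logic.

```python
import re
from collections import Counter

# Matches the per-backend counts at the end of a pgpool DETAIL line, e.g.
# "... Number of affected tuples are: 0 1 0"
TUPLES_RE = re.compile(r"Number of affected tuples are:\s*([\d\s]+)$")


def affected_tuples(line):
    """Return the per-node tuple counts from a log line, or None if absent."""
    match = TUPLES_RE.search(line.strip())
    if match is None:
        return None
    return [int(token) for token in match.group(1).split()]


def outlier_nodes(counts):
    """Return node ids whose count differs from the most common count.

    Assumption: the most frequent value is treated as the correct one.
    """
    most_common_value, _ = Counter(counts).most_common(1)[0]
    return [node_id for node_id, n in enumerate(counts) if n != most_common_value]


if __name__ == "__main__":
    sample = ("2017-02-05 20:28:23: pid 3984111: DETAIL: CommandComplete: "
              "Number of affected tuples are: 0 1 0")
    counts = affected_tuples(sample)
    print(counts, outlier_nodes(counts))
```

Run against a full pgpool log, this makes it easy to see whether the same node and the same statement are involved each time.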

I use pgpool's replication because it is fully synchronous and I can take advantage of load_balance that way. Streaming replication + load_balance did not work for me - the app writes too often.
Steps To Reproduce:
* Enable pgpool replication in a cluster + failover_if_affected_tuples_mismatch
* Send a lot of traffic to it
* Wait for one of the nodes to eventually degenerate
Additional Information: pgpool lives on the same hardware as the db backend. These are real machines - not VMs.
Tags: failover, load balancing, replication

Activities

gmartinez

2017-02-06 06:55

reporter  

pgpool.conf (6,792 bytes)

gmartinez

2017-02-06 06:55

reporter  

postgresql.conf (15,350 bytes)

gmartinez

2017-02-06 06:55

reporter  

pgpool_recovery_pitr (1,021 bytes)   

gmartinez

2017-02-06 06:55

reporter  

pgpool_remote_start (531 bytes)   

gmartinez

2017-02-06 06:55

reporter  

pgpool_copy_base_backup (1,252 bytes)   

t-ishii

2017-02-06 08:20

developer   ~0001327

Separately from this problem, you could use streaming replication with "synchronous_commit = remote_apply" in PostgreSQL 9.6, which has the same effect as Pgpool-II's native replication.

gmartinez

2017-02-06 16:01

reporter   ~0001328

Oh, thank you for the information. I'll have to investigate that as an alternative! I didn't know that flag was added to 9.6. :)

Still, it would be nice to get pgpool replication working as well.

Muhammad Usama

2017-02-15 22:30

developer   ~0001342

Hi

I have been trying to reproduce the issue, but so far with no luck.
Do you always get the "difference in the number of rows" error on the stats table, or is it random?
Also, can you verify whether all writes to the tables go through Pgpool-II as plain INSERT/UPDATE statements, or whether you also have functions or triggers that can alter the data in the stats table (the one that is producing the error)?

gmartinez

2017-03-01 06:58

reporter   ~0001372

It seems to always be this table. This may just be something the application does internally that's incompatible with pgpool's replication, so I ended up switching to streaming replication instead. It's a third-party application, so changing the code would be difficult.

We can probably just close this ticket.

Issue History

Date Modified Username Field Change
2017-02-06 06:55 gmartinez New Issue
2017-02-06 06:55 gmartinez File Added: pgpool.conf
2017-02-06 06:55 gmartinez Tag Attached: failover
2017-02-06 06:55 gmartinez Tag Attached: load balancing
2017-02-06 06:55 gmartinez Tag Attached: replication
2017-02-06 06:55 gmartinez File Added: postgresql.conf
2017-02-06 06:55 gmartinez File Added: pgpool_recovery_pitr
2017-02-06 06:55 gmartinez File Added: pgpool_remote_start
2017-02-06 06:55 gmartinez File Added: pgpool_copy_base_backup
2017-02-06 08:20 t-ishii Note Added: 0001327
2017-02-06 16:01 gmartinez Note Added: 0001328
2017-02-08 14:26 t-ishii Assigned To => Muhammad Usama
2017-02-08 14:26 t-ishii Status new => assigned
2017-02-15 22:30 Muhammad Usama Note Added: 0001342
2017-03-01 06:58 gmartinez Note Added: 0001372
2017-03-15 16:26 Muhammad Usama Status assigned => closed
2017-03-15 16:26 Muhammad Usama Resolution open => no change required