[pgpool-general: 3886] Re: issue with master-slave streaming replication

Piotr Synak piotr.synak at infobright.com
Wed Jul 22 18:18:45 JST 2015


Unfortunately, my workaround doesn't work. Calling pool_read for the second time fails.
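
One possible explanation, offered as an assumption rather than something confirmed in this thread: in the PostgreSQL frontend/backend protocol (version 3) each backend message is a 1-byte kind followed by a 4-byte length and the payload. If the standby has already answered with an ErrorResponse ('E'), reading one more byte from the same connection does not fetch a fresh kind; it returns the first byte of that message's length field. The sketch below simulates this with an in-memory buffer. fake_conn and fake_read are hypothetical stand-ins for a backend connection and pool_read, not pgpool-II code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory stand-in for one backend connection. */
typedef struct {
    const unsigned char *buf;
    size_t len;
    size_t pos;
} fake_conn;

/* Simplified analogue of pool_read(): copy n bytes out of the stream.
 * Returns 0 on success, -1 if fewer than n bytes remain. */
static int fake_read(fake_conn *c, void *out, size_t n)
{
    if (c->len - c->pos < n)
        return -1;
    memcpy(out, c->buf + c->pos, n);
    c->pos += n;
    return 0;
}

/* Read the kind byte, then "retry" by reading one more byte from the same
 * stream, as the proposed workaround does. Returns 0 on success. */
int read_kind_twice(unsigned char *first, unsigned char *second)
{
    /* ErrorResponse: kind 'E', int32 length 12 (4 + 8 payload bytes),
     * one field "S" "FATAL\0", then the terminating zero byte. */
    static const unsigned char msg[] = {
        'E', 0, 0, 0, 12, 'S', 'F', 'A', 'T', 'A', 'L', 0, 0
    };
    fake_conn c = { msg, sizeof(msg), 0 };

    if (fake_read(&c, first, 1) != 0)
        return -1;
    /* The second read does not see a new message kind: it consumes the
     * high byte of the length field of the message already in flight. */
    if (fake_read(&c, second, 1) != 0)
        return -1;
    return 0;
}
```

Under this assumption the second read can never yield the master's kind, and it also leaves the stream out of sync with the message boundaries, which would be consistent with the failure I see.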

Regards
Piotr

-----Original Message-----
From: pgpool-general-bounces at pgpool.net [mailto:pgpool-general-bounces at pgpool.net] On Behalf Of Piotr Synak
Sent: Monday, July 20, 2015 10:11 AM
To: Tatsuo Ishii
Cc: pgpool-general at pgpool.net
Subject: [pgpool-general: 3880] Re: issue with master-slave streaming replication

Hi,

> BTW if you disallow load balancing, maybe pgpool-II does not need to
> connect to standbys at all and your problem could be gone. Let me think
> about it.

I'm looking forward to seeing such a change.

In the meantime I am thinking about a workaround. What comes to my mind is to modify the function pool_read_kind so that, if it sees that the "kind" returned from a backend differs from the master's "kind0", it sleeps for a second and tries again, up to a few times. So just after the master's kind is read, i.e., after
		if (IS_MASTER_NODE_ID(i))
		{
			kind0 = kind;
		}
		else
		{

I plan to add the following:

			/* Retry while this backend's kind differs from the master's kind0. */
			int no_trials = 5;
			int trial;
			for (trial = 0; trial < no_trials && kind != kind0; trial++)
			{
				sleep(1);
				pool_read(CONNECTION(cp, i), &kind, sizeof(kind));
			}

I assume that calling pool_read for the same backend a few times should have no drawbacks.

My initial experiments show that it works fine. What do you think about it?

Thanks,
Piotr

-----Original Message-----
From: Tatsuo Ishii [mailto:ishii at postgresql.org] 
Sent: Friday, July 17, 2015 6:26 PM
To: Piotr Synak
Cc: pgpool-general at pgpool.net
Subject: Re: [pgpool-general: 3874] issue with master-slave streaming replication

> Hi
> 
> Thanks for the answer.
> 
>> Yes. Because PostgreSQL streaming replication does not care about
>> replication delay. pgpool-II needs to care about entire cluster, not
>> only the replication master. Even without pgpool-II, you need to think
>> about the replication delay problem if your app wants to connect to
>> standby right after DDL is executed.
> 
> Hmm, that would be true for synchronous replication, but streaming replication in PG is asynchronous by default. If I wanted to connect to the standby right after DDL is executed on the primary, I would choose the synchronous option. Actually, in my setup, since I don't use load balancing, I connect to the primary only and the standby is solely for HA purposes.

Even if you enable synchronous replication in PostgreSQL, the
situation would not change, because it does not guarantee that redo
of the WAL of the DDL is completed when the DDL returns commit status on
the primary (remember that PostgreSQL's synchronous replication only
guarantees that the WAL has been successfully sent and safely saved on
disk).
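
To make the distinction between "WAL saved on disk" and "WAL replayed" concrete, here is a minimal sketch of how a client could compare positions. It assumes the caller has already fetched the two positions as text, for example with pg_current_xlog_location() on the primary and pg_last_xlog_replay_location() on the standby (these functions were renamed pg_current_wal_lsn() and pg_last_wal_replay_lsn() in PostgreSQL 10). The helper names parse_lsn and standby_caught_up are hypothetical and are not part of pgpool-II or libpq.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Parse a WAL position in its text form "X/Y" (two hexadecimal numbers:
 * high 32 bits, then low 32 bits) into a single 64-bit value.
 * Returns 0 on success, -1 on malformed input. */
int parse_lsn(const char *text, uint64_t *lsn)
{
    uint32_t hi, lo;
    char trailing;

    if (text == NULL || lsn == NULL)
        return -1;
    /* Exactly two conversions must succeed; a third means trailing junk. */
    if (sscanf(text, "%" SCNx32 "/%" SCNx32 "%c", &hi, &lo, &trailing) != 2)
        return -1;
    *lsn = ((uint64_t) hi << 32) | lo;
    return 0;
}

/* Returns 1 if the standby has replayed at least up to the master's
 * position, 0 if it is still behind, -1 if either value is malformed. */
int standby_caught_up(const char *master_lsn, const char *standby_replay_lsn)
{
    uint64_t m, s;

    if (parse_lsn(master_lsn, &m) != 0 || parse_lsn(standby_replay_lsn, &s) != 0)
        return -1;
    return s >= m ? 1 : 0;
}
```

A caller would record the primary's position right after the DDL commits and poll the standby's replay position until standby_caught_up() returns 1. Only then is a database created on the primary certain to be visible on the standby.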

BTW if you disallow load balancing, maybe pgpool-II does not need to
connect to standbys at all and your problem could be gone. Let me think
about it.

>> Maybe we could mitigate the problem by adding a switch to wait until
>> the standby catches up with the master, but this will require
>> non-trivial development.
> 
> That would be very helpful. I believe that even if we assume that CREATE DATABASE (or anything) should wait for standby to be synchronized then it should be handled internally, i.e., the operation should not finish until all gets synchronized.

That's not easy because currently there's no way in PostgreSQL to make
sure that a particular DDL is applied on the standby.

> Thanks,
> Piotr
> 
> -----Original Message-----
> From: Tatsuo Ishii [mailto:ishii at postgresql.org] 
> Sent: Friday, July 17, 2015 2:51 AM
> To: Piotr Synak
> Cc: pgpool-general at pgpool.net
> Subject: Re: [pgpool-general: 3874] issue with master-slave streaming replication
> 
>> Hi,
>> 
>> I have a simple pgpool configuration with two nodes, each hosting pgpool and a backend. It is a master-slave configuration with watchdog and postgres streaming replication. No load balancing.
>> I am doing a simple test in a script:
>> 
>> drop database test;
>> create database test;
>> \c test;
>> 
>> When running the above script directly on the PG backend it works fine.
>> 
>> When running it against pgpool via the delegate IP I get an error when connecting to database "test":
>> DROP DATABASE
>> CREATE DATABASE
>> psql:./test.sql:3: \connect: ERROR:  unable to read message kind
>> DETAIL:  kind does not match between master(53) slot[1] (45)
>> 
>> I noticed that when I introduce some delay between "create database" and "connect", it works fine. So it looks like pgpool is waiting for replication to finish or for some kind of confirmation from the standby node (or anything else?). To me this contradicts the idea of asynchronous replication (which streaming replication is), where we don't wait for any confirmation but execute immediately. The fact that the script runs well when executed directly on the backend proves that the issue is not in streaming replication but in pgpool. How can I overcome this problem? Is it a known issue?
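
As a reading aid for the DETAIL line above, here is a small sketch that decodes the two numbers. It assumes pgpool-II prints the message kinds in hexadecimal, which fits the values shown (0x53 is 'S' and 0x45 is 'E', both valid backend message types, whereas the decimal readings are not). The function names are hypothetical; the kind-to-name mapping follows the PostgreSQL frontend/backend protocol, version 3.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Convert a kind as printed in the log (assumed hexadecimal, e.g. "53")
 * into the protocol's one-byte message type. Returns -1 on bad input. */
int kind_from_log(const char *hex)
{
    char *end;
    long v;

    if (hex == NULL)
        return -1;
    v = strtol(hex, &end, 16);
    if (end == hex || *end != '\0' || v < 0 || v > 0x7f)
        return -1;
    return (int) v;
}

/* Name a few backend message types that can appear during connection
 * startup; everything else is reported as "unknown". */
const char *backend_kind_name(int kind)
{
    switch (kind) {
        case 'R': return "Authentication";
        case 'S': return "ParameterStatus";
        case 'K': return "BackendKeyData";
        case 'Z': return "ReadyForQuery";
        case 'N': return "NoticeResponse";
        case 'E': return "ErrorResponse";
        default:  return "unknown";
    }
}
```

Read this way, the master answered the connection attempt with ParameterStatus while the standby in slot 1 answered with ErrorResponse, which is what one would expect if database "test" did not yet exist on the standby.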
> 
> Yes. Because PostgreSQL streaming replication does not care about
> replication delay. pgpool-II needs to care about entire cluster, not
> only the replication master. Even without pgpool-II, you need to think
> about the replication delay problem if your app wants to connect to
> standby right after DDL is executed.
> 
>> Adding "sleep" after each DML statement is not a solution for me.
> 
> Maybe we could mitigate the problem by adding a switch to wait until
> the standby catches up with the master, but this will require
> non-trivial development.
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> English: http://www.sraoss.co.jp/index_en.php
> Japanese:http://www.sraoss.co.jp