[pgpool-general: 1483] Re: [Benchmarks] Replication mode VS Master/Slave mode

Thomas Martin tmartincpp at gmail.com
Thu Mar 14 01:06:56 JST 2013


Hey! It's me again.

I just realized (shame on me) that my SELECT-only pgbench runs show
the same result: master/slave is roughly twice as fast as
replication.
Since SELECTs are load balanced in replication mode too, I don't
understand why I get such results in that case.


2013/3/13 Thomas Martin <tmartincpp at gmail.com>:
> Thanks a lot for your time and your answers; they are explaining my
> benchmarks results.
>
> Thomas
>
>
> 2013/3/13 Tatsuo Ishii <ishii at sraoss.co.jp>:
>>> Hi Tatsuo.
>>>
>>> Sorry for the delay.
>>>
>>> 2013/3/10 Tatsuo Ishii <ishii at postgresql.org>:
>>>> Not sure what numbers you got, but I would think the difference
>>>> between those modes comes from the way the database gets replicated.  (I
>>>> assume you are using the Master/Slave mode with streaming replication).
>>>
>>> Good point, the benchmarks were done with two nodes only.
>>>
>>>
>>>> Streaming replication (SR) is essentially an asynchronous replication
>>>> system. That means that when the client is notified that an UPDATE
>>>> command has completed, the standbys may not have been updated yet. So
>>>> SR is fast. The price for this is that if you send a SELECT to a
>>>> standby, you may get older results.
>>>
>>> You are right, I will have to try with PostgreSQL 9.2 someday (which
>>> seems to have a real synchronous method).
>>
>> Unfortunately no. It only confirms that the standbys have finished
>> writing the data to WAL. Data updates are applied after WAL redo, so you
>> still have a chance to get older results.
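
The stale-read window described above can be illustrated with a toy model. This is only a sketch, not PostgreSQL code: the `Primary` and `Standby` classes and their methods are hypothetical, and the WAL is modeled as a plain list of (key, value) records. It shows that a standby can already hold the WAL record while a read on it still returns the old value until redo has run.

```python
class Standby:
    """Toy standby: stores received WAL records, applies them only on redo()."""

    def __init__(self):
        self.wal = []      # records received but possibly not yet replayed
        self.applied = 0   # number of WAL records already replayed
        self.data = {}     # what a SELECT on the standby can see

    def receive(self, record):
        self.wal.append(record)

    def redo(self):
        # Replay every record that has not been applied yet.
        for key, value in self.wal[self.applied:]:
            self.data[key] = value
        self.applied = len(self.wal)


class Primary:
    """Toy primary: applies the change locally and ships the WAL record."""

    def __init__(self, standby):
        self.data = {}
        self.standby = standby

    def update(self, key, value):
        self.data[key] = value
        self.standby.receive((key, value))  # the standby now has the record
        return "UPDATE 1"                   # the client is told it completed


standby = Standby()
primary = Primary(standby)
standby.data["balance"] = primary.data["balance"] = 100

ack = primary.update("balance", 50)
stale = standby.data["balance"]  # read before redo: still the old value
standby.redo()
fresh = standby.data["balance"]  # read after redo: the new value
```

In this model the client already holds `ack` while `stale` is still 100; only after `redo()` does the standby return 50.
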
>>
>>>> On the other hand, with pgpool's replication mode, when the client is
>>>> notified that an UPDATE command has completed, it is guaranteed that
>>>> the slaves have been updated. The price is speed, of course. If you
>>>> have two DB nodes, because pgpool-II needs to write to the first node
>>>> and then the second node, the performance is expected to be 1/2 of a
>>>> single node. If you have a third and a fourth node, they are written in
>>>> parallel. The included diagram (write-query-performance.png) shows the
>>>> relationship between performance (Y axis) and number of nodes (X
>>>> axis). I also include another diagram (read-query-performance.png),
>>>> which is similar but for read queries.
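
The "1/2 of a single node" estimate can be reproduced with simple arithmetic. The sketch below rests on assumptions that are not in the original message: every node takes the same time for a write, and pgpool's own overhead is zero. The function names are made up for illustration, and the attached diagram is not reproduced here, so real measurements may look different.

```python
def write_latency(nodes, per_node_time):
    """Time for one replicated write: first node alone, then the rest in parallel."""
    if nodes < 1:
        raise ValueError("need at least one node")
    if nodes == 1:
        return per_node_time
    # The first node takes per_node_time; all remaining nodes run at the
    # same time, so together they add one more per_node_time.
    return 2 * per_node_time


def relative_write_throughput(nodes):
    """Write throughput relative to a single node (1.0 means no slowdown)."""
    return write_latency(1, 1.0) / write_latency(nodes, 1.0)


for n in (1, 2, 3, 4):
    print(n, relative_write_throughput(n))
```

Under these assumptions the ratio drops to 0.5 at two nodes and stays there for three and four nodes, because the extra nodes are written in parallel.
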
>>>
>>> I'm a bit confused about the difference between two nodes and more
>>> than two nodes.
>>>
>>> With two nodes, is pgpool really waiting for the first node to
>>> complete the UPDATE before sending it to the second node?
>>
>> Yes.
>>
>>> Before your explanation I thought that pgpool started
>>> the UPDATE on both nodes at the same time (and then had to wait
>>> for the slower response before processing the next requests).
>>
>> No, this would very easily result in a cross-session deadlock.
>> (This is a well-known classical problem.)
>>
>> Note that this only happens when there are multiple concurrent
>> sessions.  In other words, if there is only one session, pgpool can do
>> the UPDATE on both nodes at the same time (in fact pgpool does it this
>> way when num_init_children = 1).
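
The deadlock mentioned above can be illustrated with a small wait-for model. This is a simplified sketch, not pgpool code: `find_deadlock` is a hypothetical helper, each node is assumed to have a single contested row lock, and a session is assumed to keep its lock until it has finished on every node. If two sessions send the same UPDATE to both nodes at once, session A may win the lock on node 0 while session B wins it on node 1, and each then waits for the other.

```python
def find_deadlock(arrivals):
    """Return True if the sessions end up waiting on each other.

    arrivals: list of (session, node) pairs, in the order in which the
    UPDATEs reach the nodes. The first session to reach a node holds the
    row lock there; later sessions wait behind the holder.
    """
    holder = {}     # node -> session holding the row lock
    waits_for = {}  # session -> session it is blocked behind
    for session, node in arrivals:
        if node not in holder:
            holder[node] = session
        elif holder[node] != session:
            waits_for[session] = holder[node]

    # Follow the wait-for edges; coming back to a visited session is a cycle.
    for start in waits_for:
        seen = set()
        current = start
        while current in waits_for:
            if current in seen:
                return True
            seen.add(current)
            current = waits_for[current]
    return False


# Both sessions send to both nodes at once; the arrival order differs per node.
parallel = [("A", 0), ("B", 1), ("A", 1), ("B", 0)]

# Both sessions go to node 0 first; B blocks there and never reaches node 1
# while A is still working.
first_node_first = [("A", 0), ("B", 0), ("A", 1)]

print(find_deadlock(parallel))          # True
print(find_deadlock(first_node_first))  # False
```

Neither node sees a local deadlock in the parallel case, since each node has only one waiter; the cycle exists only across nodes. Writing to the first node alone makes it act as the arbiter for lock order.
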
>>
>>> Could you please confirm which method is actually used and explain
>>> why this is different with three or four nodes?
>>
>> Ok, I will explain the four-node case.
>>
>> 1) send update command to the first node and wait for its completion.
>>
>> 2) send update command to the second, the third and the fourth node at
>>    the same time.
>>
>> 3) wait for completion of the update on all of the second, the third
>>    and the fourth node.
>>
>> 4) reply back to client.
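
The four steps above can be sketched as follows. This is a minimal illustration of the ordering only, not pgpool-II's implementation: `replicate_update` and `make_backend` are hypothetical names, and each backend is modeled as a plain Python callable rather than a real PostgreSQL connection.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def replicate_update(backends, sql):
    """Run sql on every backend: the first one alone, then the rest in parallel."""
    # 1) Send to the first node and wait for its completion.
    results = [backends[0](sql)]
    rest = backends[1:]
    if rest:
        # 2) Send to all remaining nodes at the same time.
        with ThreadPoolExecutor(max_workers=len(rest)) as pool:
            # 3) Wait for completion on all of them (map keeps input order).
            results += list(pool.map(lambda run: run(sql), rest))
    # 4) Reply back to the client.
    return results


def make_backend(node_id, calls, lock):
    """Fake backend that records the order in which nodes are reached."""
    def run(sql):
        with lock:
            calls.append(node_id)
        return f"node {node_id}: ok"
    return run


calls = []
lock = threading.Lock()
backends = [make_backend(i, calls, lock) for i in range(4)]
replies = replicate_update(backends, "UPDATE t SET x = 1")
print(calls[0], sorted(calls[1:]))
```

Node 0 is always reached first; the order among nodes 1 to 3 may vary from run to run because they are contacted concurrently.
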
>>
>>>> If you issue read/write queries, total performance would be roughly
>>>> a combination of those two diagrams.
>>>>
>>>> Hope this helps,
>>>> --
>>>> Tatsuo Ishii
>>>> SRA OSS, Inc. Japan
>>>> English: http://www.sraoss.co.jp/index_en.php
>>>> Japanese: http://www.sraoss.co.jp
>>>
>>>
>>> Thanks a lot!
>>>
>>> Thomas

