[Pgpool-general] [PgPool-II] Node degenerating

Tue Jun 28 07:23:24 UTC 2011

Tatsuo,
As I understand to get option
replication_stop_on_mismatch = true
to work properly PostgreSQL clusters in pgpool-II should all have the same
log level if logging to stdout.
Why there is no information about it in pgpool-II docs?

On Tue, Jun 28, 2011 at 1:56 AM, Anton Koldaev <koldaevav at gmail.com> wrote:

> All that I needed to do is to DISABLE DEBUG LOG on the 2th node! :)
>
> On Tue, Jun 28, 2011 at 1:29 AM, Anton Koldaev <koldaevav at gmail.com>wrote:
>
>> Even 'BEGIN;' statement:
>> *2011-06-28 01:28:19 LOG:   pid 4146: statement: BEGIN;*
>> *2011-06-28 01:28:19 LOG:   pid 4146: DB node id: 0 backend pid: 4184
>> statement: BEGIN;*
>> *2011-06-28 01:28:19 LOG:   pid 4146: DB node id: 1 backend pid: 10117
>> statement: BEGIN;*
>> *2011-06-28 01:28:19 LOG:   pid 4146: DB node id: 2 backend pid: 29861
>> statement: BEGIN;*
>> *2011-06-28 01:28:19 ERROR: pid 4146: read_kind_from_backend: 2 th kind N
>> does not match with master or majority connection kind Z*
>> *2011-06-28 01:28:19 ERROR: pid 4146: kind mismatch among backends.
>> Possible last query was: "BEGIN;" kind details are: 0[Z] 1[Z] 2[N: duration:
>> 0.000 ms  statement: BEGIN;]*
>>
>> On Tue, Jun 28, 2011 at 1:06 AM, Anton Koldaev <koldaevav at gmail.com>wrote:
>>
>>> I reconfigured something and now I'm getting kind mismatch on any query
>>> right after online-recovery:
>>> *> 2011-06-28 01:00:18 LOG:   pid 696: DB node id: 0 backend pid: 1414
>>> statement: SET NAMES 'windows-1251'*
>>> *> 2011-06-28 01:00:18 LOG:   pid 696: DB node id: 1 backend pid: 8610
>>> statement: SET NAMES 'windows-1251'*
>>> *> 2011-06-28 01:00:18 LOG:   pid 696: DB node id: 2 backend pid: 28953
>>> statement: SET NAMES 'windows-1251'*
>>> *> 2011-06-28 01:00:18 ERROR: pid 696: read_kind_from_backend: 2 th kind
>>> N does not match with master or majority connection kind Z*
>>> *> 2011-06-28 01:00:18 ERROR: pid 696: kind mismatch among backends.
>>> Possible last query was: "SET NAMES 'windows-1251'" kind details are: 0[Z]
>>> 1[Z] 2[N: duration: 0.000 ms  statement: SET NAMES 'windows-1251']*
>>>
>>> postgresql.log on 2th backend:
>>> *> 2011-06-28 01:03:00 MSD [29038]: [1-1] LOG:  00000: duration: 0.000
>>> ms  statement: BEGIN*
>>> *> 2011-06-28 01:03:00 MSD [29038]: [2-1] LOCATION:  exec_simple_query,
>>> postgres.c:1068*
>>> *> 2011-06-28 01:03:00 MSD [29038]: [3-1] LOG:  00000: duration: 0.000
>>> ms  statement: SET NAMES 'windows-1251'*
>>> *> 2011-06-28 01:03:00 MSD [29038]: [4-1] LOCATION:  exec_simple_query,
>>> postgres.c:1068*
>>> *> 2011-06-28 01:03:00 MSD [28690]: [1093-1] DEBUG:  00000: server
>>> process (PID 29038) exited with exit code 0*
>>> *> 2011-06-28 01:03:00 MSD [28690]: [1094-1] LOCATION:  LogChildExit,
>>> postmaster.c:2720*
>>>
>>> What's wrong? Where can I find this [N]-message? What does [Z]-messages
>>> mean? Where to take a list of message types?
>>>
>>>
>>> On Sun, Jun 26, 2011 at 1:53 PM, Tatsuo Ishii <ishii at sraoss.co.jp>wrote:
>>>
>>>> I was unable to reproduce your problem. What I did was:
>>>>
>>>> - create 3 database clusters
>>>> - run the same query (global3=# VACUUM ANALYZEmetadata."RC_Metadata";)
>>>>  via psql->pgpool
>>>>
>>>> No degeneration occurred. Just pgpool reported syntax error.
>>>>
>>>> Is there any specific condition to reproduce your problem?
>>>> --
>>>> Tatsuo Ishii
>>>> SRA OSS, Inc. Japan
>>>> English: http://www.sraoss.co.jp/index_en.php
>>>> Japanese: http://www.sraoss.co.jp
>>>>
>>>> > Hello
>>>> > I've got three nodes in pgpool-II:
>>>> > sO21 - id:0
>>>> > sO22 - id:1
>>>> > db00 - id:2
>>>> > replication_mode = true
>>>> > load_balance_mode = true
>>>> > replication_stop_on_mismatch = true
>>>> > failover_if_affected_tuples_mismatch = true
>>>> >
>>>> > I'm getting an error after somebody of developers enters wrong syntax
>>>> query:
>>>> >
>>>> >
>>>> >    1.  2011-06-23 15:24:26 LOG:   pid 13894: SimpleQuery: Unable TO
>>>> parse
>>>> >    the query: global3=# VACUUM ANALYZEmetadata."RC_Metadata";
>>>> >    2.  2011-06-23 15:24:26 LOG:   pid 13894: DB node id: 0 backend
>>>> pid:
>>>> >    13906 STATEMENT: BEGIN
>>>> >    3.  2011-06-23 15:24:26 LOG:   pid 13894: DB node id: 1 backend
>>>> pid:
>>>> >    13788 STATEMENT: BEGIN
>>>> >    4.  2011-06-23 15:24:26 LOG:   pid 13894: DB node id: 0 backend
>>>> pid:
>>>> >    13906 STATEMENT: global3=# VACUUMANALYZE metadata."RC_Metadata";
>>>> >    5.  2011-06-23 15:24:26 LOG:   pid 13894: pool_send_and_wait: Error
>>>> > OR notice
>>>> >    message FROM backend: : DB node id: 0 backend pid: 13906 STATEMENT:
>>>> >    global3=# vacu
>>>> >    6.  um ANALYZE metadata."RC_Metadata"; message: syntax error AT OR
>>>> near
>>>> >    "global3"
>>>> >
>>>> >
>>>> > As you can see somebody copy-pasted vacuum analyze query with
>>>> "database=#
>>>> > ...".
>>>> > After this pgpool degrades node with id=2. Why is this happening?
>>>> > I know that:
>>>> >
>>>> > *"you should not execute VACUUM via pgpool. Connect*>>>* PostgreSQL
>>>> > directry and execute VACUUM".*
>>>> >
>>>> > But it is the second problem when syntax error causes degenerating of
>>>> > the second node.
>>>> >
>>>> >
>>>> > Here is the log from pgpool-II:
>>>> > http://pastebin.com/N3uKAG7k
>>>> >
>>>> > I will appreciate any help.
>>>> >
>>>> > PS: Tatsuo, maybe you have some scripts with test queries to check if
>>>> > replication works correctly after online recovery?
>>>> >
>>>> > --
>>>> > Best regards,
>>>> > Koldaev Anton
>>>>
>>>
>>>
>>>
>>> --
>>> Best regards,
>>> Koldaev Anton
>>>
>>
>>
>>
>> --
>> Best regards,
>> Koldaev Anton
>>
>
>
>
> --
> Best regards,
> Koldaev Anton
>

-- 
Best regards,
Koldaev Anton
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://pgfoundry.org/pipermail/pgpool-general/attachments/20110628/a6e8eedf/attachment.html>