[pgpool-general: 113] Re: [Pgpool-general] pgpool limitations

Sandeep Thakkar sandeeptt at yahoo.com
Thu Dec 22 15:55:07 JST 2011


An interesting thing happened today: I found that only 5 of the 32 PIDs changed after attaching the new node. Here is the pgpool.log:

---------
......

2011-12-22 08:16:31 LOG:   pid 511: failback done. reconnect host localhost(5447)
2011-12-22 08:16:32 LOG:   pid 703: pcp child process received restart request
2011-12-22 08:16:32 LOG:   pid 511: worker child 669 exits with status 256
2011-12-22 08:16:32 LOG:   pid 511: fork a new worker child pid 786
2011-12-22 08:20:56 LOG:   pid 670: do_child: failback event found. restart myself.
2011-12-22 08:20:56 LOG:   pid 671: do_child: failback event found. restart myself.
2011-12-22 08:20:56 LOG:   pid 672: do_child: failback event found. restart myself.
2011-12-22 08:20:56 LOG:   pid 674: do_child: failback event found. restart myself.
2011-12-22 08:20:56 LOG:   pid 673: do_child: failback event found. restart myself.
2011-12-22 08:20:56 LOG:   pid 511: PCP child 703 exits with status 256
2011-12-22 08:20:56 LOG:   pid 675: do_child: failback event found. restart myself.
2011-12-22 08:20:56 LOG:   pid 511: fork a new PCP child pid 1154
2011-12-22 08:20:56 LOG:   pid 676: do_child: failback event found. restart myself.
2011-12-22 08:20:56 LOG:   pid 677: do_child: failback event found. restart myself.
....
....

2011-12-22 08:20:56 LOG:   pid 702: do_child: failback event found. restart myself.
2011-12-22 08:20:56 LOG:   pid 699: do_child: failback event found. restart myself.
2011-12-22 08:20:56 LOG:   pid 701: do_child: failback event found. restart myself.
2011-12-22 08:20:56 LOG:   pid 700: do_child: failback event found. restart myself.
2011-12-22 08:20:56 DEBUG: pid 1159: key: listen_addresses
2011-12-22 08:20:56 DEBUG: pid 1159: value: '*' kind: 4
.....
-----------


The processes with PIDs 670 to 674 got restarted and were assigned new PIDs; the rest of the processes, with PIDs 675 to 702, kept the same PIDs. Can you see the catch here?
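To make this before/after comparison repeatable, a small helper can diff the two PID lists that pcp_proc_count prints. This is only a sketch: the function name and the sample usage are mine, and the argument form shown in the comments (timeout, host, port, user, password) is the old-style pcp syntax of pgpool-II 3.1, with placeholder credentials.

```shell
#!/bin/bash
# restarted_pids: given "before" and "after" whitespace-separated PID
# lists, print the PIDs that disappeared, i.e. the children that were
# restarted and reassigned new PIDs. comm(1) needs sorted input.
restarted_pids() {
    comm -23 <(tr -s ' ' '\n' <<<"$1" | sort) \
             <(tr -s ' ' '\n' <<<"$2" | sort)
}

# Intended usage around the attach (hypothetical host/port/credentials):
#   before=$(pcp_proc_count 10 localhost 9898 postgres secret)
#   pcp_attach_node 10 localhost 9898 postgres secret 1
#   after=$(pcp_proc_count 10 localhost 9898 postgres secret)
#   restarted_pids "$before" "$after"
```

Any PID printed by the helper belonged to a child that exited after the attach, which is exactly the set the thread is trying to pin down.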

The pgpool.log when there are no restarts looks like this:
-----------

.....

2011-12-22 10:39:49 LOG:   pid 28224: failback done. reconnect host localhost(5447)
2011-12-22 10:39:49 LOG:   pid 28372: worker process received restart request
2011-12-22 10:39:50 LOG:   pid 28224: worker child 28372 exits with status 256
2011-12-22 10:39:50 LOG:   pid 28405: pcp child process received restart request
2011-12-22 10:39:50 LOG:   pid 28224: fork a new worker child pid 28485
2011-12-22 10:39:50 LOG:   pid 28224: PCP child 28405 exits with status 256
2011-12-22 10:39:50 LOG:   pid 28224: fork a new PCP child pid 28486
2011-12-22 10:39:52 DEBUG: pid 28490: key: listen_addresses
2011-12-22 10:39:52 DEBUG: pid 28490: value: '*' kind: 4
....
--------------

Thanks.
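For reference, the node-attach sequence discussed in this thread (base backup, recovery.conf, start the new server, reload pgpool.conf, pcp_attach_node) can be sketched roughly as below. Every host name, port, path, and password is a placeholder of mine, not a value from this thread; the pcp commands use the old-style argument form (timeout host port user password) of pgpool-II 3.1.

```shell
#!/bin/bash
# Rough sketch of the attach procedure; all values are placeholders.
PCP_ARGS="10 localhost 9898 postgres secret"   # timeout host port user password
NEWDATA=/var/lib/pgsql/node1

pcp_proc_count $PCP_ARGS                  # record child PIDs before

pg_basebackup -h master -D "$NEWDATA"     # 1. take a base backup
cat > "$NEWDATA/recovery.conf" <<'EOF'    # 2. create recovery.conf
standby_mode = 'on'
primary_conninfo = 'host=master port=5432'
EOF
pg_ctl -D "$NEWDATA" start                # 3. start the new server
$EDITOR /etc/pgpool-II/pgpool.conf        # 4. add the new backend_* entries
pgpool reload                             # 5. reload pgpool.conf
pcp_attach_node $PCP_ARGS 1               # 6. attach the new node (id 1)

pcp_proc_count $PCP_ARGS                  # compare child PIDs after
```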


________________________________
 From: Sandeep Thakkar <sandeeptt at yahoo.com>
To: Tatsuo Ishii <ishii at postgresql.org> 
Cc: "pgpool-general at pgpool.net" <pgpool-general at pgpool.net> 
Sent: Monday, December 5, 2011 4:00 PM
Subject: [pgpool-general: 32] Re: [Pgpool-general] pgpool limitations
 

Sorry for the confusion. Yes, they are both different test cases. 



________________________________
 From: Tatsuo Ishii <ishii at postgresql.org>
To: sandeeptt at yahoo.com 
Cc: pgpool-general at pgpool.net 
Sent: Monday, December 5, 2011 3:28 PM
Subject: Re: [pgpool-general: 28] Re: [Pgpool-general] pgpool limitations
 
I want to clarify. You said you have one connected psql client, but
now you are saying that there is no connected psql client. Are these
different test cases?
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

> Well, what I have seen is that when there is no established connection and all pgpool child processes are idle, then after attaching the new node, sometimes all pgpool children are restarted and sometimes none are at all. Why this random behaviour?
> 
> 
> 
> ________________________________
>  From: Sandeep Thakkar <sandeeptt at yahoo.com>
> To: Tatsuo Ishii <ishii at postgresql.org> 
> Cc: "pgpool-general at pgpool.net" <pgpool-general at pgpool.net> 
> Sent: Monday, December 5, 2011 2:39 PM
> Subject: [pgpool-general: 28] Re: [Pgpool-general] pgpool limitations
>  
> 
> Oh, I see. Why does it then behave randomly?
> 
> 
> I did the following test:
> The number of pgpool child processes is 32, and only one of them is connected to a psql client (one session). I add a new node (take a base backup, create recovery.conf, start the new server, get the child PIDs using pcp_proc_count, edit pgpool.conf, reload pgpool.conf, run pcp_attach_node, get the child PIDs again using pcp_proc_count). I found that when the psql client exits, only that one pgpool child gets restarted and now has a new PID; the rest of the idle pgpool child processes had the same PIDs after attaching the node.
> 
> 
> 
> 
> ________________________________
>  From: Tatsuo Ishii <ishii at postgresql.org>
> To: sandeeptt at yahoo.com 
> Cc: pgpool-general at pgpool.net 
> Sent: Monday, December 5, 2011 11:59 AM
> Subject: Re: [pgpool-general: 8] Re: [Pgpool-general] pgpool limitations
>  
> Good catch. I forgot about this. From pgpool-II 3.1, in streaming
> replication mode, after a failback event, existing sessions are no
> longer disconnected. However, after the session exits, the pgpool child
> restarts to pick up the failback node info and, for example, use the
> node for load balancing.
>
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> English: http://www.sraoss.co.jp/index_en.php
> Japanese: http://www.sraoss.co.jp
> 
>> I just see some additional statements like "failback event found. restart myself"...
>> 
>> 2011-11-30 10:39:46 LOG:   pid 7398: find_primary_node_repeatedly: waiting for finding a primary node
>> 2011-11-30 10:39:46 LOG:   pid 7398: find_primary_node: primary node id is 1
>> 2011-11-30 10:39:46 LOG:   pid 7398: failover: set new primary node: 1
>> 2011-11-30 10:39:46 LOG:   pid 7398: failover: set new master node: 0
>> 2011-11-30 10:39:46
>  LOG:   pid 7398: failback done. reconnect host localhost(5447)
>> 2011-11-30 10:39:46 LOG:   pid
 7532: worker process received restart request
>> 2011-11-30 10:39:47 LOG:   pid 7565: pcp child process received restart request
>> 2011-11-30 10:39:47 LOG:   pid 7398: worker child 7532 exits with status 256
>> 2011-11-30 10:39:47 LOG:   pid 7398: fork a new worker child pid 7648
>> 2011-11-30 10:44:10 LOG:   pid 7533: do_child: failback event found. restart myself.
>> 2011-11-30 10:44:10 LOG:   pid 7534: do_child: failback event found. restart myself.
>> ....
>> ....
>> 
>>  
>> 
>> 
>> ________________________________
>>  From: Tatsuo Ishii <ishii at postgresql.org>
>> To: sandeeptt at yahoo.com 
>> Cc: pgpool-general at pgpool.net 
>> Sent: Tuesday, November 29, 2011 3:24 PM
>> Subject: Re: [pgpool-general: 8] Re: [Pgpool-general] pgpool limitations
>>  
>> I can't think of any other reasons. Can you find anything special in
>> the pgpool log when pgpool child exits?
>> --
>> Tatsuo Ishii
>> SRA OSS, Inc. Japan
>> English: http://www.sraoss.co.jp/index_en.php
>> Japanese: http://www.sraoss.co.jp
>> 
>>> client_idle_limit is set to '0'. Here is the other related settings:
>>> ....
>>> pcp_timeout = 10
>>> num_init_children = 32
>>> max_pool = 4
>>> child_life_time = 300
>>> connection_life_time = 0
>>> child_max_connections = 0
>>> client_idle_limit = 0
>>> ....
>>>  
>>> 
>>> 
>>> ________________________________
>>>  From: Tatsuo Ishii <ishii at sraoss.co.jp>
>>> To: sandeeptt at yahoo.com 
>>> Cc: singh.gurjeet at gmail.com; pgpool-general at pgfoundry.org; pgpool-hackers at pgfoundry.org 
>>> Sent: Wednesday, November 23, 2011 8:21 PM
>>> Subject: Re: [Pgpool-general] pgpool limitations
>>>  
>>> One possibility is client_idle_limit.
>>> --
>>> Tatsuo Ishii
>>> SRA OSS, Inc. Japan
>>> English: http://www.sraoss.co.jp/index_en.php
>>> Japanese: http://www.sraoss.co.jp
>>> 
>>>> I have found that sometimes the client connections get disconnected and new ones are established. What I do is get the PIDs using "pcp_proc_count" before running "pcp_attach_node", and then run "pcp_proc_count" again to check whether the PIDs remain the same. I found that the behaviour is random. When can this happen?
>>>> 
>>>> 
>>>> ________________________________
>>>>  From: Tatsuo Ishii <ishii at sraoss.co.jp>
>>>> To: singh.gurjeet at gmail.com 
>>>> Cc: pgpool-general at pgfoundry.org; pgpool-hackers at pgfoundry.org 
>>>> Sent: Thursday, August 11, 2011 6:11 AM
>>>> Subject: Re: [Pgpool-general] pgpool limitations
>>>>  
>>>>>> > > Is there something in the works to enable this, or is this feature still in
>>>>>> > > design phase? If it is already being/been developed, I wish to know if this
>>>>>> > > can be back-patched to a point release of pgpool 3.0.x.
>>>>>> >
>>>>>> > It is already in the pgpool-II 3.1 alpha version.
>>>>>> > Currently there's no plan to back-patch it to 3.0.x.
>>>>>>
>>>>>> I certainly hope we won't backpatch a new feature. That would be insane.
>>>>>>
>>>>> 
>>>>> I don't consider this a new feature. I'd say this is an unexpected side-effect
>>>>> (a.k.a. a bug) of pcp_attach_node, since nowhere in the docs does it say that
>>>>> invoking pcp_attach_node would drop all client connections.
>>>> 
>>>> This behavior has not been changed since pcp_attach_node was born in
>>>> 2006. Moreover, the enhancement in 3.1 is only for streaming
>>>> replication mode. Other modes, including replication mode, do not take
>>>> advantage of this.
>>>> --
>>>> Tatsuo Ishii
>>>> SRA OSS, Inc. Japan
>>>> English: http://www.sraoss.co.jp/index_en.php
>>>> Japanese: http://www.sraoss.co.jp
>>>> _______________________________________________
>>>> Pgpool-general mailing list
>>>> Pgpool-general at pgfoundry.org
>>>> http://pgfoundry.org/mailman/listinfo/pgpool-general
> 
> 
> 
> _______________________________________________
> pgpool-general mailing list
> pgpool-general at pgpool.net
> http://www.pgpool.net/mailman/listinfo/pgpool-general


