[pgpool-general: 1610] Re: Failover in raw mode

Бородин Владимир root at simply.name
Fri Apr 12 03:22:24 JST 2013


I've built 3.3.0-alpha1 (tokakiboshi) from master and it works fine with two and three backends. Great! But there is one more thing I expect from raw mode. I have the following parameters in my config:

health_check_period = 1
health_check_timeout = 5
health_check_max_retries = 100
health_check_retry_delay = 5
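
To make the arithmetic behind these four values concrete, here is a rough sketch of how long the health check alone could keep retrying before giving up on a node. It assumes the retry loop is "attempt, sleep health_check_retry_delay, attempt again", which is a reading of the parameter names and has not been checked against the pgpool-II source. The function name health_check_window is made up for this illustration.

```python
def health_check_window(timeout, max_retries, retry_delay):
    """Return (fastest, slowest) seconds before retries are exhausted.

    Assumed loop (not verified against pgpool-II): attempt, sleep
    retry_delay, attempt again, up to max_retries retries after the
    first failed attempt.
    """
    attempts = max_retries + 1
    # Fastest case: every attempt fails at once (e.g. connection refused),
    # so only the sleeps between attempts count.
    fastest = max_retries * retry_delay
    # Slowest case: every attempt hangs for the full timeout.
    slowest = attempts * timeout + max_retries * retry_delay
    return fastest, slowest


fastest, slowest = health_check_window(timeout=5, max_retries=100, retry_delay=5)
print(f"health check gives up after {fastest}-{slowest} seconds")
```

Under these assumptions the window is roughly 8 to 17 minutes, far longer than a 10-15 second outage. The health check is not the only trigger, though: in the 3.2.0 log quoted further down, degeneration is requested by a child process right after a failed connect().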

If backend0 goes down for a short period (10-15 seconds) and then comes back, shouldn't queries go to backend1 for those 10-15 seconds and then return to backend0? Currently they keep going to backend1 even after backend0 comes back up. If backend1 fails, they go to backend2. If backend0 and backend1 come back up and backend2 then fails, clients can't connect until pgpool is restarted. Is that the expected behavior?
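
The behavior described above can be reproduced with a toy model in which a backend, once detached, stays detached until a restart. This is only a sketch of what was observed, not pgpool-II code; the class RawModePool and its methods are hypothetical names.

```python
class RawModePool:
    """Toy model of the observed raw-mode behavior (not pgpool-II code)."""

    def __init__(self, n_backends):
        # True means the node is still considered usable by the pool.
        self.attached = [True] * n_backends

    def route(self, alive):
        """Return the index of the backend a new client would reach, or None.

        `alive` lists the real state of each backend. A node that is found
        dead is detached and is not tried again, even if it recovers.
        """
        for node, is_attached in enumerate(self.attached):
            if not is_attached:
                continue
            if alive[node]:
                return node
            self.attached[node] = False  # degenerate the node; this is sticky
        return None

    def restart(self):
        """Model a pgpool restart: every node is considered usable again."""
        self.attached = [True] * len(self.attached)


pool = RawModePool(3)
observed = [
    pool.route([True, True, True]),    # all up: backend0
    pool.route([False, True, True]),   # backend0 down: backend1
    pool.route([True, True, True]),    # backend0 back: still backend1
    pool.route([True, False, True]),   # backend1 down: backend2
    pool.route([True, True, False]),   # backend2 down: nobody left
]
pool.restart()
observed.append(pool.route([True, True, False]))  # after restart: backend0
print(observed)
```

The expectation in the question corresponds to a different model, one where route() re-tests detached nodes before skipping them.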

On 11.04.2013, at 19:25, Tatsuo Ishii <ishii at postgresql.org> wrote:

> One thing I noticed is that you are using 3.2.0. There are known issues
> with connecting to the backend in 3.2.0, and even in 3.2.3. They have
> already been fixed in the 3.2-stable tree. Could you try the 3.2-stable tree?
> 
> http://git.postgresql.org/gitweb/?p=pgpool2.git;a=shortlog;h=refs/heads/V3_2_STABLE
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> English: http://www.sraoss.co.jp/index_en.php
> Japanese: http://www.sraoss.co.jp
> 
>> I have left two backends in the config and restarted pgpool. It still doesn't work for me :( But after turning debug off I saw the following (full log is below):
>> All ERROR lines appear when I try to establish a new connection to pgpool, and every time I do, it starts degeneration again. But still nothing works. It seems that after backend0 was shut down pgpool noticed it (according to the "failover done" lines), but all new connections still tried to connect to backend0.
>> Any ideas? Should I give you more detailed diagnostics?
>> 
>> 2013-04-11 18:47:29 LOG:   pid 485: pgpool-II successfully started. version 3.2.0 (namameboshi)
>> 2013-04-11 18:47:41 LOG:   pid 518: postmaster on DB node 0 was shutdown by administrative command
>> 2013-04-11 18:47:41 LOG:   pid 518: degenerate_backend_set: 0 fail over request from pid 518
>> 2013-04-11 18:47:41 LOG:   pid 485: starting degeneration. shutdown host loadtest01g.domain.com(5432)
>> 2013-04-11 18:47:41 LOG:   pid 485: Restart all children
>> 2013-04-11 18:47:41 LOG:   pid 485: failover: set new primary node: -1
>> 2013-04-11 18:47:41 LOG:   pid 485: failover: set new master node: 0
>> 2013-04-11 18:47:41 LOG:   pid 485: failover done. shutdown host loadtest01g.domain.com(5432)
>> 2013-04-11 18:47:41 LOG:   pid 520: worker process received restart request
>> 2013-04-11 18:47:42 LOG:   pid 519: pcp child process received restart request
>> 2013-04-11 18:47:42 LOG:   pid 485: worker child 520 exits with status 256
>> 2013-04-11 18:47:42 LOG:   pid 485: fork a new worker child pid 597
>> 2013-04-11 18:47:42 LOG:   pid 485: PCP child 519 exits with status 256
>> 2013-04-11 18:47:42 LOG:   pid 485: fork a new PCP child pid 598
>> 2013-04-11 18:47:44 ERROR: pid 563: connect_inet_domain_socket: connect() failed: Connection refused
>> 2013-04-11 18:47:44 ERROR: pid 563: connection to loadtest01g.domain.com(5432) failed
>> 2013-04-11 18:47:44 ERROR: pid 563: new_connection: create_cp() failed
>> 2013-04-11 18:47:44 LOG:   pid 563: degenerate_backend_set: 0 fail over request from pid 563
>> 2013-04-11 18:47:44 LOG:   pid 485: starting degeneration. shutdown host loadtest01g.domain.com(5432)
>> 2013-04-11 18:47:44 LOG:   pid 485: Restart all children
>> 2013-04-11 18:47:44 LOG:   pid 485: failover: set new primary node: -1
>> 2013-04-11 18:47:44 LOG:   pid 485: failover: set new master node: 0
>> 2013-04-11 18:47:44 LOG:   pid 485: failover done. shutdown host loadtest01g.domain.com(5432)
>> 2013-04-11 18:47:44 LOG:   pid 597: worker process received restart request
>> 2013-04-11 18:47:45 LOG:   pid 598: pcp child process received restart request
>> 2013-04-11 18:47:45 LOG:   pid 485: worker child 597 exits with status 256
>> 2013-04-11 18:47:45 LOG:   pid 485: fork a new worker child pid 637
>> 2013-04-11 18:47:45 LOG:   pid 485: PCP child 598 exits with status 256
>> 2013-04-11 18:47:45 LOG:   pid 485: fork a new PCP child pid 638
>> 2013-04-11 18:47:55 ERROR: pid 603: connect_inet_domain_socket: connect() failed: Connection refused
>> 2013-04-11 18:47:55 ERROR: pid 603: connection to loadtest01g.domain.com(5432) failed
>> 2013-04-11 18:47:55 ERROR: pid 603: new_connection: create_cp() failed
>> 2013-04-11 18:47:55 LOG:   pid 603: degenerate_backend_set: 0 fail over request from pid 603
>> 2013-04-11 18:47:55 LOG:   pid 485: starting degeneration. shutdown host loadtest01g.domain.com(5432)
>> 2013-04-11 18:47:55 LOG:   pid 485: Restart all children
>> 2013-04-11 18:47:55 LOG:   pid 485: failover: set new primary node: -1
>> 2013-04-11 18:47:55 LOG:   pid 485: failover: set new master node: 0
>> 2013-04-11 18:47:55 LOG:   pid 485: failover done. shutdown host loadtest01g.domain.com(5432)
>> 2013-04-11 18:47:55 LOG:   pid 637: worker process received restart request
>> 2013-04-11 18:47:56 LOG:   pid 638: pcp child process received restart request
>> 2013-04-11 18:47:56 LOG:   pid 485: worker child 637 exits with status 256
>> 2013-04-11 18:47:56 LOG:   pid 485: fork a new worker child pid 675
>> 2013-04-11 18:47:56 LOG:   pid 485: PCP child 638 exits with status 256
>> 2013-04-11 18:47:56 LOG:   pid 485: fork a new PCP child pid 676
>> 2013-04-11 18:48:41 ERROR: pid 674: connect_inet_domain_socket: connect() failed: Connection refused
>> 2013-04-11 18:48:41 ERROR: pid 674: connection to loadtest01g.domain.com(5432) failed
>> 2013-04-11 18:48:41 ERROR: pid 674: new_connection: create_cp() failed
>> 2013-04-11 18:48:41 LOG:   pid 674: degenerate_backend_set: 0 fail over request from pid 674
>> 2013-04-11 18:48:41 LOG:   pid 485: starting degeneration. shutdown host loadtest01g.domain.com(5432)
>> 2013-04-11 18:48:41 LOG:   pid 485: Restart all children
>> 2013-04-11 18:48:41 LOG:   pid 485: failover: set new primary node: -1
>> 2013-04-11 18:48:41 LOG:   pid 485: failover: set new master node: 0
>> 2013-04-11 18:48:41 LOG:   pid 485: failover done. shutdown host loadtest01g.domain.com(5432)
>> 2013-04-11 18:48:41 LOG:   pid 675: worker process received restart request
>> 2013-04-11 18:48:42 LOG:   pid 676: pcp child process received restart request
>> 2013-04-11 18:48:42 LOG:   pid 485: worker child 675 exits with status 256
>> 2013-04-11 18:48:42 LOG:   pid 485: fork a new worker child pid 720
>> 2013-04-11 18:48:42 LOG:   pid 485: PCP child 676 exits with status 256
>> 2013-04-11 18:48:42 LOG:   pid 485: fork a new PCP child pid 721
>> 2013-04-11 18:48:46 ERROR: pid 718: connect_inet_domain_socket: connect() failed: Connection refused
>> 2013-04-11 18:48:46 ERROR: pid 718: connection to loadtest01g.domain.com(5432) failed
>> 2013-04-11 18:48:46 ERROR: pid 718: new_connection: create_cp() failed
>> 2013-04-11 18:48:46 LOG:   pid 718: degenerate_backend_set: 0 fail over request from pid 718
>> 2013-04-11 18:48:46 LOG:   pid 485: starting degeneration. shutdown host loadtest01g.domain.com(5432)
>> 2013-04-11 18:48:46 LOG:   pid 485: Restart all children
>> 2013-04-11 18:48:46 LOG:   pid 485: failover: set new primary node: -1
>> 2013-04-11 18:48:46 LOG:   pid 485: failover: set new master node: 0
>> 2013-04-11 18:48:46 LOG:   pid 485: failover done. shutdown host loadtest01g.domain.com(5432)
>> 2013-04-11 18:48:46 LOG:   pid 720: worker process received restart request
>> 2013-04-11 18:48:47 LOG:   pid 721: pcp child process received restart request
>> 2013-04-11 18:48:47 LOG:   pid 485: worker child 720 exits with status 256
>> 2013-04-11 18:48:47 LOG:   pid 485: fork a new worker child pid 781
>> 2013-04-11 18:48:47 LOG:   pid 485: PCP child 721 exits with status 256
>> 2013-04-11 18:48:47 LOG:   pid 485: fork a new PCP child pid 782
>> 
>> On 11.04.2013, at 18:31, Tatsuo Ishii <ishii at postgresql.org> wrote:
>> 
>>> I didn't notice you are using raw mode. Sorry.
>>> 
>>>> Thanks for the reply. I found the following in the documentation [1]:
>>>> 
>>>> "Failover can be performed in raw mode if multiple servers are defined. pgpool-II usually accesses the backend specified by backend_hostname0 during normal operation. If the backend_hostname0 fails for some reason, pgpool-II tries to access the backend specified by backend_hostname1. If that fails, pgpool-II tries the backend_hostname2, 3 and so on."
>>>> 
>>>> That is the behavior I want to see. Could you give an example of what I should put in failover_command in that case?
>>> 
>>> No, in raw mode, you don't need to set it. I just tried a raw mode,
>>> 2-node configuration and it seems to work as expected: if node 0 is
>>> down, then pgpool allows access to node 1. If you try with 2
>>> backends (not 3 backends), does it work?
>>> --
>>> Tatsuo Ishii
>>> SRA OSS, Inc. Japan
>>> English: http://www.sraoss.co.jp/index_en.php
>>> Japanese: http://www.sraoss.co.jp
>>> 
>>>> [1] http://pgpool.projects.pgfoundry.org/pgpool-II/doc/pgpool-en.html
>>>> 
>>>> On 11.04.2013, at 17:51, Tatsuo Ishii <ishii at postgresql.org> wrote:
>>>> 
>>>>> You need to set failover_command. Yours is empty, so nothing will
>>>>> happen.
>>>>> --
>>>>> Tatsuo Ishii
>>>>> SRA OSS, Inc. Japan
>>>>> English: http://www.sraoss.co.jp/index_en.php
>>>>> Japanese: http://www.sraoss.co.jp
>>>>> 
>>>>>> ping. Can anybody help?
>>>>>> 
>>>>>> On 09.04.2013, at 15:24, Бородин Владимир <root at simply.name> wrote:
>>>>>> 
>>>>>>> Hi all.
>>>>>>> 
>>>>>>> I've read a lot about failover in raw mode but I haven't found a solution yet.
>>>>>>> 
>>>>>>> I have 3 PostgreSQL nodes running in streaming replication mode: loadtest01g.domain.com (master), loadtest02g.domain.com and loadtest04g.domain.com (two replicas in hot_standby mode). All of them are listening on 0.0.0.0:5432. All of them have the line below in pg_hba.conf (a terrible idea, but these hosts are for testing only):
>>>>>>> host    all             all             0.0.0.0/0               trust
>>>>>>> I can connect to any of them with psql and issue select queries.
>>>>>>> 
>>>>>>> On loadtest01g I have installed pgpool-II-3.2 and taken the sample config with some modifications (backend addresses, debug_level and timeouts). Config - http://simply.name/tmp/pgpool.conf and log - http://simply.name/tmp/pgpool.log.
>>>>>>> 
>>>>>>> I get the following problem:
>>>>>>> pgpool does not fail over to backend1 (loadtest02g) if backend0 (loadtest01g) goes down. According to the log it detects the primary failure but doesn't send queries to either of the replicas.
>>>>>>> 
>>>>>>> The question is: what am I doing wrong?
>>>>>>> 
>>>>>>> --
>>>>>>> Vladimir
>>>>>> 
>>>>>> 
>>>>>> --
>>>>>> Vladimir
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>> 
>>>> 
>>>> --
>>>> Vladimir
>>>> 
>> 
>> 
>> --
>> Vladimir
>> 
>> 
>> 
>> 


--
Vladimir



