[pgpool-general: 232] Re: VM nodes marked down after snapshot

Tatsuo Ishii ishii at postgresql.org
Thu Feb 16 10:59:12 JST 2012


Sounds like a bug in VMware. Pgpool does nothing special when
issuing the connect(2) system call. connect() sends a SYN to the peer,
and the peer should reply with SYN+ACK. If no SYN+ACK comes back, the
local TCP/IP stack keeps retransmitting the SYN until the timeout
expires, at which point connect() fails with a "Connection timed out"
error. As far as I know, the timeout value is 189 seconds on Linux.
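
For what it's worth, the 189-second figure is consistent with a stack
that waits 3 seconds after the first SYN, doubles the wait after each
retransmission, and gives up after 5 retransmissions. Those three
parameters are assumptions for illustration, not something stated
above, and they vary with kernel version and settings such as
tcp_syn_retries. A minimal Python sketch of the arithmetic (the
function names are hypothetical):

```python
def syn_connect_timeout(initial_rto=3.0, retries=5):
    """Seconds until connect() gives up, under an assumed backoff model.

    Assumed model: the initial SYN is followed by `retries`
    retransmissions, and the wait after each SYN doubles,
    starting at `initial_rto` seconds.
    """
    # One wait after the initial SYN, plus one after each retransmission.
    waits = [initial_rto * 2 ** i for i in range(retries + 1)]
    return sum(waits)


def syn_send_times(initial_rto=3.0, retries=5):
    """Offsets in seconds at which each SYN is sent, same assumed model."""
    times, elapsed, wait = [0.0], 0.0, initial_rto
    for _ in range(retries):
        elapsed += wait
        times.append(elapsed)
        wait *= 2
    return times


if __name__ == "__main__":
    print(syn_send_times())       # when each SYN goes out
    print(syn_connect_timeout())  # total time before the error
```

With the assumed defaults the waits are 3 + 6 + 12 + 24 + 48 + 96, which
sums to 189 seconds. A smaller initial wait or fewer retries shortens
the window before "Connection timed out" is reported.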
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

> Hi,
> 
> As promised, I'm reviving this thread now that I've found something.
> 
> I managed to reproduce my problem by deleting a snapshot of the VM hosting postgresql; pgpool runs on another machine.
> 
> To summarize my problem: pgpool loses its connection to a postgresql instance on a VM when a snapshot is created or deleted. We're using VMware, by the way. An odd part of this problem is that it doesn't always occur; it happens roughly once in every 3-4 snapshots created or deleted. I thought that modifying the health check settings would help, but nothing changed.
> 
> Here's what I've found in my logs:
> 
> 2012-02-15 16:07:05 ERROR: pid 7768: connect_inet_domain_socket: connect() failed: Connection timed out
> 2012-02-15 16:07:05 ERROR: pid 7768: connection to 192.168.0.5(5432) failed
> 2012-02-15 16:07:05 ERROR: pid 7768: new_connection: create_cp() failed
> 2012-02-15 16:07:05 LOG:   pid 7768: notice_backend_error: 1 fail over request from pid 7768
> 2012-02-15 16:07:05 LOG:   pid 20836: starting degeneration. shutdown host 192.168.0.5 (5432)
> 
> The only way I found to work around this is to run a small script after the snapshot that checks whether the node is still up. But that's not a solution, it's a workaround.
> 
> Has anybody stumbled on this kind of problem before?
> 
> ____________________________________________________
> Guillaume Douté
> Administrateur Activités Transversales
> ----------------------------------------------------
> LINKBYNET
> Columbia 
> 32 boulevard Vincent Gâche - 44000 Nantes
> Tel direct : +33 (0)2 40 71 61 64
> Tel : +33 (0)1 48 13 00 00 - Fax : +33 (0)1 48 13 31 21
> Email : g.doute at linkbynet.com - Web : www.linkbynet.com
> _____________________________________________________
> On-call: http://www.linkbynet.com/astreinte/
> 
> Please consider the environment before printing this e-mail.
> 
> -----Original Message-----
> From: pgpool-general-bounces at pgpool.net [mailto:pgpool-general-bounces at pgpool.net] On behalf of Guillaume DOUTE
> Sent: Wednesday, January 25, 2012 11:26
> To: Guillaume Lelarge
> Cc: pgpool-general at pgpool.net
> Subject: [pgpool-general: 195] Re: VM nodes marked down after snapshot
> 
> Hello,
> 
> Sorry for the late reply.
> 
> You were right, I missed that option and it was set to 1. I set it to 0 and things went better. Needless to say, I felt silly.
> 
> For some odd reason, pgpool stopped logging at a certain point last Friday, and my problem happened again over the weekend. So unfortunately, I still have no logs.
> I will post again when I have something.
> 
> Thanks again for your help.
> 
> -----Original Message-----
> From: Guillaume Lelarge [mailto:guillaume at lelarge.info]
> Sent: Sunday, January 22, 2012 15:21
> To: Guillaume DOUTE
> Cc: pgpool-general at pgpool.net
> Subject: Re: [pgpool-general: 174] Re: VM nodes marked down after snapshot
> 
> On Tue, 2012-01-17 at 17:58 +0100, Guillaume DOUTE wrote:
>> Thanks for your reply and your explanations,
>> 
>> I can't understand why, but I can't reproduce my problem. Things seem
>> quite stable, fortunately. I will reply with logs when I encounter
>> the problem again.
>> 
>> A side question: I don't understand why I keep getting "DEBUG" lines in my logs although I didn't launch pgpool with "-d". The logs are too verbose and get too big, so I can't enable logging all the time. Is there a particular reason why pgpool behaves this way?
>> 
> 
> You surely have debug_level set to a value higher than 0.
> 
> 
> --
> Guillaume
> http://blog.guillaume.lelarge.info
> http://www.dalibo.com
> PostgreSQL Sessions #3: http://www.postgresql-sessions.org
> 
> _______________________________________________
> pgpool-general mailing list
> pgpool-general at pgpool.net
> http://www.pgpool.net/mailman/listinfo/pgpool-general

