[pgpool-general: 765] Re: pgpool dropping backends too much

Tatsuo Ishii ishii at postgresql.org
Thu Jul 19 12:44:34 JST 2012


When you see this:
> pool_process_query: discard E packet from backend 1

Is there any error reported in backend 1's log?
I'm asking because an E packet corresponds to an ERROR message from PostgreSQL.
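For reference, every backend-to-frontend message in the PostgreSQL protocol starts with a one-byte type code followed by an Int32 length; type 'E' is ErrorResponse, which is what pgpool reports as an "E packet". A minimal sketch of that framing (the field contents below are illustrative, not a real capture):

```python
import struct

def parse_message(buf):
    """Split one PostgreSQL backend protocol message into (type, body).

    Every backend message is: a 1-byte type code, then an Int32 length
    that counts itself plus the body.  Type 'E' is ErrorResponse --
    the "E packet" that pgpool logs.
    """
    mtype = chr(buf[0])
    (length,) = struct.unpack("!I", buf[1:5])
    body = buf[5:1 + length]
    return mtype, body

# Illustrative ErrorResponse body: a severity field 'S' with value
# "ERROR" (NUL-terminated), then the final NUL ending the field list.
payload = b"SERROR\x00\x00"
raw = b"E" + struct.pack("!I", 4 + len(payload)) + payload

mtype, body = parse_message(raw)  # mtype == "E"
```

Here pgpool has read such an ErrorResponse from backend 1 outside a query context and discarded it, which is why checking that backend's log for the matching ERROR is the natural next step.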
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

> We are running pgpool with 3 backend servers (9.0, streaming
> replication). Connections are non-SSL between clients and pgpool, and
> SSL between pgpool and the servers (my previous email was in error; we
> do not appear to be using SSL between clients and pgpool).
> 
> I have set the primary server not to fail over, and that works: it
> doesn't fail over. However, our slaves fail over once or twice a day
> even though the slave has not in fact failed. I have to reattach the
> node, and then it continues happily.
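
For reference, per-backend failover behaviour is controlled in pgpool.conf with the backend_flag parameter (available since pgpool-II 3.1). A sketch, with illustrative host names and node numbering:

```
# pgpool.conf (sketch; host names and node numbering are illustrative)
backend_hostname0 = 'db1'                     # primary
backend_flag0     = 'DISALLOW_TO_FAILOVER'    # never detach node 0

backend_hostname1 = 'db2'                     # standby
backend_flag1     = 'ALLOW_TO_FAILOVER'       # default behaviour
```
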
> 
> The syslog always contains this note about E packets:
> 
> Jul 17 00:11:03 app2 pgpool: 2012-07-17 00:11:03 LOG: pid 31871:
> pool_process_query: discard E packet from backend 1
> Jul 17 00:11:03 app2 pgpool: 2012-07-17 00:11:03 ERROR: pid 31871:
> pool_ssl: SSL_read: no SSL error reported
> Jul 17 00:11:03 app2 pgpool: 2012-07-17 00:11:03 ERROR: pid 31871:
> pool_read: read failed (Success)
> Jul 17 00:11:03 app2 pgpool: 2012-07-17 00:11:03 LOG: pid 31871:
> degenerate_backend_set: 1 fail over request from pid 31871
> Jul 17 00:11:03 app2 pgpool: 2012-07-17 00:11:03 LOG: pid 30346:
> starting degeneration. shutdown host db2(5432)
> Jul 17 00:11:03 app2 pgpool: 2012-07-17 00:11:03 LOG: pid 30346:
> Restart all children
> Jul 17 00:11:03 app2 pgpool: 2012-07-17 00:11:03 LOG: pid 30346:
> find_primary_node_repeatedly: waiting for finding a primary node
> Jul 17 00:11:03 app2 pgpool: 2012-07-17 00:11:03 LOG: pid 30346:
> find_primary_node: primary node id is 0
> Jul 17 00:11:03 app2 pgpool: 2012-07-17 00:11:03 LOG: pid 30346:
> failover: set new primary node: 0
> Jul 17 00:11:03 app2 pgpool: 2012-07-17 00:11:03 LOG: pid 30346:
> failover: set new master node: 0
> Jul 17 00:11:03 app2 pgpool: 2012-07-17 00:11:03 LOG: pid 923: worker
> process received restart request
> Jul 17 00:11:03 app2 pgpool: 2012-07-17 00:11:03 LOG: pid 30346:
> failover done. shutdown host db2(5432)
> Jul 17 00:11:04 app2 pgpool: 2012-07-17 00:11:04 LOG: pid 30346:
> worker child 923 exits with status 256
> Jul 17 00:11:04 app2 pgpool: 2012-07-17 00:11:04 LOG: pid 924: pcp
> child process received restart request
> Jul 17 00:11:04 app2 pgpool: 2012-07-17 00:11:04 LOG: pid 30346: fork
> a new worker child pid 9434
> Jul 17 00:11:04 app2 pgpool: 2012-07-17 00:11:04 LOG: pid 30346: PCP
> child 924 exits with status 256
> Jul 17 00:11:04 app2 pgpool: 2012-07-17 00:11:04 LOG: pid 30346: fork
> a new PCP child pid 9435
> 
> Sometimes the preceding syslog entry is a LOG notice about a
> statement that failed, e.g.:
> Jul 17 00:11:03 app2 pgpool: 2012-07-17 00:11:03 LOG: pid 26664:
> pool_send_and_wait: Error or notice message from backend: : DB node
> id: 1 backend pid: 15682 statement: <SNIP> message: canceling
> statement due to conflict with recovery
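
Note that "canceling statement due to conflict with recovery" is a hot-standby query conflict, not a node failure. On 9.0 it is usually mitigated by letting queries delay replay on the standby, or by deferring cleanup on the primary (9.0 has no hot_standby_feedback; that arrives in 9.1). A sketch, values illustrative:

```
# postgresql.conf on the standby (sketch; values illustrative)
max_standby_streaming_delay = 300s   # let queries block replay up to 5 min

# postgresql.conf on the primary (alternative approach)
vacuum_defer_cleanup_age = 10000     # defer cleanup of recently dead rows
```
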
> 
> I don't want to mark our slaves as no-failover, but it seems that
> pgpool is either hitting an internal fault and interpreting it as a
> backend failure, or is a bit too sensitive. I'm happy to test patches!
> 
> Best regards,
> Karl
> _______________________________________________
> pgpool-general mailing list
> pgpool-general at pgpool.net
> http://www.pgpool.net/mailman/listinfo/pgpool-general

