[Pgpool-general] Cannot add node after failure

Fernando Morgenstern fernando at consultorpc.com
Sat Dec 12 00:34:43 UTC 2009


Hello Marcos,

Thanks for your answer. Here is some additional info about my
problem:

pgpool log:

# cat /tmp/pgpool.log
2009-12-11 17:21:19 DEBUG: pid 3262: key: listen_addresses
2009-12-11 17:21:19 DEBUG: pid 3262: value: '*' kind: 4
2009-12-11 17:21:19 DEBUG: pid 3262: key: port
2009-12-11 17:21:19 DEBUG: pid 3262: value: 5432 kind: 2
2009-12-11 17:21:19 DEBUG: pid 3262: key: pcp_port
2009-12-11 17:21:19 DEBUG: pid 3262: value: 9898 kind: 2
2009-12-11 17:21:19 DEBUG: pid 3262: key: socket_dir
2009-12-11 17:21:19 DEBUG: pid 3262: value: '/tmp' kind: 4
2009-12-11 17:21:19 DEBUG: pid 3262: key: pcp_socket_dir
2009-12-11 17:21:19 DEBUG: pid 3262: value: '/tmp' kind: 4
2009-12-11 17:21:19 DEBUG: pid 3262: key: backend_socket_dir
2009-12-11 17:21:19 DEBUG: pid 3262: value: '/tmp' kind: 4
2009-12-11 17:21:19 DEBUG: pid 3262: key: pcp_timeout
2009-12-11 17:21:19 DEBUG: pid 3262: value: 10 kind: 2
2009-12-11 17:21:19 DEBUG: pid 3262: key: num_init_children
2009-12-11 17:21:19 DEBUG: pid 3262: value: 300 kind: 2
2009-12-11 17:21:19 DEBUG: pid 3262: key: max_pool
2009-12-11 17:21:19 DEBUG: pid 3262: value: 2 kind: 2
2009-12-11 17:21:19 DEBUG: pid 3262: key: child_life_time
2009-12-11 17:21:19 DEBUG: pid 3262: value: 300 kind: 2
2009-12-11 17:21:19 DEBUG: pid 3262: key: connection_life_time
2009-12-11 17:21:19 DEBUG: pid 3262: value: 0 kind: 2
2009-12-11 17:21:19 DEBUG: pid 3262: key: child_max_connections
2009-12-11 17:21:19 DEBUG: pid 3262: value: 0 kind: 2
2009-12-11 17:21:19 DEBUG: pid 3262: key: client_idle_limit
2009-12-11 17:21:19 DEBUG: pid 3262: value: 0 kind: 2
2009-12-11 17:21:19 DEBUG: pid 3262: key: authentication_timeout
2009-12-11 17:21:19 DEBUG: pid 3262: value: 60 kind: 2
2009-12-11 17:21:19 DEBUG: pid 3262: key: logdir
2009-12-11 17:21:19 DEBUG: pid 3262: value: '/tmp' kind: 4
2009-12-11 17:21:19 DEBUG: pid 3262: key: pid_file_name
2009-12-11 17:21:19 DEBUG: pid 3262: value: '/var/run/pgpool/pgpool.pid' kind: 4
2009-12-11 17:21:19 DEBUG: pid 3262: key: replication_mode
2009-12-11 17:21:19 DEBUG: pid 3262: value: true kind: 1
2009-12-11 17:21:19 DEBUG: pid 3262: key: load_balance_mode
2009-12-11 17:21:19 DEBUG: pid 3262: value: true kind: 1
2009-12-11 17:21:19 DEBUG: pid 3262: key: replication_stop_on_mismatch
2009-12-11 17:21:19 DEBUG: pid 3262: value: false kind: 1
2009-12-11 17:21:19 DEBUG: pid 3262: replication_stop_on_mismatch: 0
2009-12-11 17:21:19 DEBUG: pid 3262: key: replicate_select
2009-12-11 17:21:19 DEBUG: pid 3262: value: false kind: 1
2009-12-11 17:21:19 DEBUG: pid 3262: replicate_select: 0
2009-12-11 17:21:19 DEBUG: pid 3262: key: reset_query_list
2009-12-11 17:21:19 DEBUG: pid 3262: value: 'ABORT; DISCARD ALL' kind: 4
2009-12-11 17:21:19 DEBUG: pid 3262: extract_string_tokens: token: ABORT
2009-12-11 17:21:19 DEBUG: pid 3262: extract_string_tokens: token:  DISCARD ALL
2009-12-11 17:21:19 DEBUG: pid 3262: key: print_timestamp
2009-12-11 17:21:19 DEBUG: pid 3262: value: true kind: 1
2009-12-11 17:21:19 DEBUG: pid 3262: key: master_slave_mode
2009-12-11 17:21:19 DEBUG: pid 3262: value: false kind: 1
2009-12-11 17:21:19 DEBUG: pid 3262: key: connection_cache
2009-12-11 17:21:19 DEBUG: pid 3262: value: true kind: 1
2009-12-11 17:21:19 DEBUG: pid 3262: key: health_check_timeout
2009-12-11 17:21:19 DEBUG: pid 3262: value: 20 kind: 2
2009-12-11 17:21:19 DEBUG: pid 3262: key: health_check_period
2009-12-11 17:21:19 DEBUG: pid 3262: value: 30 kind: 2
2009-12-11 17:21:19 DEBUG: pid 3262: key: health_check_user
2009-12-11 17:21:19 DEBUG: pid 3262: value: 'postgres' kind: 4
2009-12-11 17:21:19 DEBUG: pid 3262: key: failover_command
2009-12-11 17:21:19 DEBUG: pid 3262: value: 'echo host:%h, new master id:%m, old master id:%M > /tmp/failover.log' kind: 4
2009-12-11 17:21:19 DEBUG: pid 3262: key: failback_command
2009-12-11 17:21:19 DEBUG: pid 3262: value: 'echo host:%h, new master id:%m, old master id:%M > /tmp/failback.log' kind: 4
2009-12-11 17:21:19 DEBUG: pid 3262: key: fail_over_on_backend_error
2009-12-11 17:21:19 DEBUG: pid 3262: value: true kind: 1
2009-12-11 17:21:19 DEBUG: pid 3262: key: insert_lock
2009-12-11 17:21:19 DEBUG: pid 3262: value: true kind: 1
2009-12-11 17:21:19 DEBUG: pid 3262: key: ignore_leading_white_space
2009-12-11 17:21:19 DEBUG: pid 3262: value: true kind: 1
2009-12-11 17:21:19 DEBUG: pid 3262: key: log_statement
2009-12-11 17:21:19 DEBUG: pid 3262: value: false kind: 1
2009-12-11 17:21:19 DEBUG: pid 3262: key: log_per_node_statement
2009-12-11 17:21:19 DEBUG: pid 3262: value: false kind: 1
2009-12-11 17:21:19 DEBUG: pid 3262: key: log_connections
2009-12-11 17:21:19 DEBUG: pid 3262: value: false kind: 1
2009-12-11 17:21:19 DEBUG: pid 3262: key: log_hostname
2009-12-11 17:21:19 DEBUG: pid 3262: value: false kind: 1
2009-12-11 17:21:19 DEBUG: pid 3262: key: parallel_mode
2009-12-11 17:21:19 DEBUG: pid 3262: value: false kind: 1
2009-12-11 17:21:19 DEBUG: pid 3262: key: enable_query_cache
2009-12-11 17:21:19 DEBUG: pid 3262: value: false kind: 1
2009-12-11 17:21:19 DEBUG: pid 3262: key: pgpool2_hostname
2009-12-11 17:21:19 DEBUG: pid 3262: value: '' kind: 4
2009-12-11 17:21:19 DEBUG: pid 3262: key: system_db_hostname
2009-12-11 17:21:19 DEBUG: pid 3262: value: 'localhost' kind: 4
2009-12-11 17:21:19 DEBUG: pid 3262: key: system_db_port
2009-12-11 17:21:19 DEBUG: pid 3262: value: 4002 kind: 2
2009-12-11 17:21:19 DEBUG: pid 3262: key: system_db_dbname
2009-12-11 17:21:19 DEBUG: pid 3262: value: 'pgpool' kind: 4
2009-12-11 17:21:19 DEBUG: pid 3262: key: system_db_schema
2009-12-11 17:21:19 DEBUG: pid 3262: value: 'public' kind: 4
2009-12-11 17:21:19 DEBUG: pid 3262: key: system_db_user
2009-12-11 17:21:19 DEBUG: pid 3262: value: 'postgres' kind: 4
2009-12-11 17:21:19 DEBUG: pid 3262: key: system_db_password
2009-12-11 17:21:19 DEBUG: pid 3262: value: '' kind: 4
2009-12-11 17:21:19 DEBUG: pid 3262: key: enable_pool_hba
2009-12-11 17:21:19 DEBUG: pid 3262: value: true kind: 1
2009-12-11 17:21:19 DEBUG: pid 3262: key: client_idle_limit_in_recovery
2009-12-11 17:21:19 DEBUG: pid 3262: value: 0 kind: 2
2009-12-11 17:21:19 DEBUG: pid 3262: key: replication_timeout
2009-12-11 17:21:19 DEBUG: pid 3262: value: 5000 kind: 2
2009-12-11 17:21:19 DEBUG: pid 3262: key: backend_hostname0
2009-12-11 17:21:19 DEBUG: pid 3262: value: 'localhost' kind: 4
2009-12-11 17:21:19 DEBUG: pid 3262: key: backend_port0
2009-12-11 17:21:19 DEBUG: pid 3262: value: 4002 kind: 2
2009-12-11 17:21:19 DEBUG: pid 3262: pool_config: port slot number 0
2009-12-11 17:21:19 DEBUG: pid 3262: key: backend_weight0
2009-12-11 17:21:19 DEBUG: pid 3262: value: 10 kind: 2
2009-12-11 17:21:19 DEBUG: pid 3262: pool_config: weight slot number 0 weight: 10.000000
2009-12-11 17:21:19 DEBUG: pid 3262: key: backend_data_directory0
2009-12-11 17:21:19 DEBUG: pid 3262: value: '/usr/local/pgpool/data-pp2' kind: 4
2009-12-11 17:21:19 DEBUG: pid 3262: key: backend_hostname1
2009-12-11 17:21:19 DEBUG: pid 3262: value: 'im-pp3' kind: 4
2009-12-11 17:21:19 DEBUG: pid 3262: key: backend_port1
2009-12-11 17:21:19 DEBUG: pid 3262: value: 4003 kind: 2
2009-12-11 17:21:19 DEBUG: pid 3262: pool_config: port slot number 1
2009-12-11 17:21:19 DEBUG: pid 3262: key: backend_weight1
2009-12-11 17:21:19 DEBUG: pid 3262: value: 10 kind: 2
2009-12-11 17:21:19 DEBUG: pid 3262: pool_config: weight slot number 1 weight: 10.000000
2009-12-11 17:21:19 DEBUG: pid 3262: key: backend_data_directory1
2009-12-11 17:21:19 DEBUG: pid 3262: value: '/usr/local/pgpool/data-pp3' kind: 4
2009-12-11 17:21:19 DEBUG: pid 3262: key: backend_hostname2
2009-12-11 17:21:19 DEBUG: pid 3262: value: 'im-pp1' kind: 4
2009-12-11 17:21:19 DEBUG: pid 3262: key: backend_port2
2009-12-11 17:21:19 DEBUG: pid 3262: value: 4001 kind: 2
2009-12-11 17:21:19 DEBUG: pid 3262: pool_config: port slot number 2
2009-12-11 17:21:19 DEBUG: pid 3262: key: backend_weight2
2009-12-11 17:21:19 DEBUG: pid 3262: value: 10 kind: 2
2009-12-11 17:21:19 DEBUG: pid 3262: pool_config: weight slot number 2 weight: 10.000000
2009-12-11 17:21:19 DEBUG: pid 3262: key: backend_data_directory2
2009-12-11 17:21:19 DEBUG: pid 3262: value: '/usr/local/pgpool/data-pp1' kind: 4
2009-12-11 17:21:19 DEBUG: pid 3262: key: backend_hostname3
2009-12-11 17:21:19 DEBUG: pid 3262: value: 'im-pp4' kind: 4
2009-12-11 17:21:19 DEBUG: pid 3262: key: backend_port3
2009-12-11 17:21:19 DEBUG: pid 3262: value: 4004 kind: 2
2009-12-11 17:21:19 DEBUG: pid 3262: pool_config: port slot number 3
2009-12-11 17:21:19 DEBUG: pid 3262: key: backend_weight3
2009-12-11 17:21:19 DEBUG: pid 3262: value: 10 kind: 2
2009-12-11 17:21:19 DEBUG: pid 3262: pool_config: weight slot number 3 weight: 10.000000
2009-12-11 17:21:19 DEBUG: pid 3262: key: backend_data_directory3
2009-12-11 17:21:19 DEBUG: pid 3262: value: '/usr/local/pgpool/data-pp4' kind: 4
2009-12-11 17:21:19 DEBUG: pid 3262: num_backends: 4 num_backends: 4 total_weight: 40.000000
2009-12-11 17:21:19 DEBUG: pid 3262: backend 0 weight: 536870911.750000
2009-12-11 17:21:19 DEBUG: pid 3262: backend 1 weight: 536870911.750000
2009-12-11 17:21:19 DEBUG: pid 3262: backend 2 weight: 536870911.750000
2009-12-11 17:21:19 DEBUG: pid 3262: backend 3 weight: 536870911.750000
2009-12-11 17:21:19 DEBUG: pid 3262: loading "/usr/local/etc/pool_hba.conf" for client authentication configuration file
2009-12-11 17:21:19 DEBUG: pid 3263: I am 3263
2009-12-11 17:21:19 DEBUG: pid 3264: I am 3264
... duplicated messages, but for different pids
2009-12-11 17:21:19 LOG:   pid 3262: pgpool successfully started
2009-12-11 17:21:19 DEBUG: pid 3563: I am PCP 3563
2009-12-11 17:21:19 DEBUG: pid 3262: starting health checking
2009-12-11 17:21:19 DEBUG: pid 3262: health_check: 0 th DB node status: 3
2009-12-11 17:21:19 DEBUG: pid 3262: health_check: 1 th DB node status: 3
2009-12-11 17:21:19 DEBUG: pid 3262: health_check: 2 th DB node status: 3
2009-12-11 17:21:19 DEBUG: pid 3262: health_check: 3 th DB node status: 3

Then it just keeps repeating the health_check messages.
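
For readers who want to summarise the per-node state from a log like the one above, here is a minimal Python sketch. It is not part of pgpool; the helper name `latest_node_status` and the regular expression are assumptions based only on the `health_check: N th DB node status: S` lines shown in this log. It tolerates the line wrapping that mail clients introduce between "node" and "status".

```python
import re

# Assumed line shape, taken from the log in this message:
#   ... DEBUG: pid 3262: health_check: 0 th DB node status: 3
# \s+ also matches a line break, so mail-wrapped lines are still recognised.
HEALTH_RE = re.compile(r"health_check: (\d+) th DB node\s+status: (\d+)")


def latest_node_status(log_text):
    """Return {node_id: status}, keeping the last status seen for each node."""
    status = {}
    for match in HEALTH_RE.finditer(log_text):
        node_id = int(match.group(1))
        status[node_id] = int(match.group(2))  # later entries overwrite earlier ones
    return status
```

Calling `latest_node_status(open("/tmp/pgpool.log").read())` on the log above would be expected to report status 3 for all four nodes, which is the state described in this thread.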


postgres running on 0th db:

[postgres at im-pp2 ~]$ psql  -p 4002
psql (8.4.1)
Type "help" for help.

postgres=# \q


pgpool.conf:

#
# pgpool-II configuration file sample
# $Header: /cvsroot/pgpool/pgpool-II/pgpool.conf.sample,v 1.29 2009/12/06 08:46:34 t-ishii Exp $

# Host name or IP address to listen on: '*' for all, '' for no TCP/IP
# connections
listen_addresses = '*'

# Port number for pgpool
port = 5432

# Port number for pgpool communication manager
pcp_port = 9898

# Unix domain socket path.  (The Debian package defaults to
# /var/run/postgresql.)
socket_dir = '/tmp'

# Unix domain socket path for pgpool communication manager.
# (Debian package defaults to /var/run/postgresql)
pcp_socket_dir = '/tmp'

# Unix domain socket path for the backend. Debian package defaults to
# /var/run/postgresql!
backend_socket_dir = '/tmp'

# pgpool communication manager timeout. 0 means no timeout, but
# strongly not recommended!
pcp_timeout = 10

# number of pre-forked child process
num_init_children = 300

# Number of connection pools allowed for a child process
max_pool = 2

# If idle for this many seconds, child exits.  0 means no timeout.
child_life_time = 300

# If idle for this many seconds, connection to PostgreSQL closes.
# 0 means no timeout.
connection_life_time = 0

# If child_max_connections connections were received, child exits.
# 0 means no exit.
child_max_connections = 0

# If client_idle_limit is n (n > 0), the client is forced to be
# disconnected whenever after n seconds idle (even inside an explicit
# transactions!)
# 0 means no disconnect.
client_idle_limit = 0

# Maximum time in seconds to complete client authentication.
# 0 means no timeout.
authentication_timeout = 60

# Logging directory
logdir = '/tmp'

# pid file name
pid_file_name = '/var/run/pgpool/pgpool.pid'

# Replication mode
replication_mode = true

# Load balancing mode, i.e., all SELECTs are load balanced.
# This is ignored if replication_mode is false.
load_balance_mode = true

# if there's a data mismatch between master and secondary
# start degeneration to stop replication mode
replication_stop_on_mismatch = false

# If true, replicate SELECT statement when load balancing is disabled.
# If false, it is only sent to the master node.
replicate_select = false

# Semicolon separated list of queries to be issued at the end of a
# session
reset_query_list = 'ABORT; DISCARD ALL'
# for 8.2 or older this should be as follows.
#reset_query_list = 'ABORT; RESET ALL; SET SESSION AUTHORIZATION DEFAULT'

# If true print timestamp on each log line.
print_timestamp = true

# If true, operate in master/slave mode.
master_slave_mode = false

# If true, cache connection pool.
connection_cache = true

# Health check timeout.  0 means no timeout.
health_check_timeout = 20

# Health check period.  0 means no health check.
health_check_period = 30

# Health check user
health_check_user = 'postgres'

# Execute command by failover.
# special values:  %d = node id
#                  %h = host name
#                  %p = port number
#                  %D = database cluster path
#                  %m = new master node id
#                  %M = old master node id
#                  %% = '%' character
#
failover_command = 'echo host:%h, new master id:%m, old master id:%M > /tmp/failover.log'

# Execute command by failback.
# special values:  %d = node id
#                  %h = host name
#                  %p = port number
#                  %D = database cluster path
#                  %m = new master node id
#                  %M = old master node id
#                  %% = '%' character
#
failback_command = 'echo host:%h, new master id:%m, old master id:%M > /tmp/failback.log'

# If true, trigger fail over when writing to the backend communication
# socket fails. This is the same behavior of pgpool-II 2.2.x or
# earlier. If set to false, pgpool will report an error and disconnect
# the session.
fail_over_on_backend_error = true

# If true, automatically locks a table with INSERT statements to keep
# SERIAL data consistency.  If the data does not have SERIAL data
# type, no lock will be issued. An /*INSERT LOCK*/ comment has the
# same effect.  A /*NO INSERT LOCK*/ comment disables the effect.
insert_lock = true

# If true, ignore leading white spaces of each query while pgpool judges
# whether the query is a SELECT so that it can be load balanced.  This
# is useful for certain APIs such as DBI/DBD which is known to adding an
# extra leading white space.
ignore_leading_white_space = true

# If true, print all statements to the log.  Like the log_statement option
# to PostgreSQL, this allows for observing queries without engaging in full
# debugging.
log_statement = false

# If true, print all statements to the log. Similar to log_statement except
# that prints DB node id and backend process id info.
log_per_node_statement = false

# If true, incoming connections will be printed to the log.
log_connections = false

# If true, hostname will be shown in ps status. Also shown in
# connection log if log_connections = true.
# Be warned that this feature will add overhead to look up hostname.
log_hostname = false

# if non 0, run in parallel query mode
parallel_mode = false

# if non 0, use query cache
enable_query_cache = false

#set pgpool2 hostname
pgpool2_hostname = ''

# system DB info
system_db_hostname = 'localhost'
system_db_port = 4002
system_db_dbname = 'pgpool'
system_db_schema = 'public'
system_db_user = 'postgres'
system_db_password = ''

# backend_hostname, backend_port, backend_weight


# - HBA -

# If true, use pool_hba.conf for client authentication. In pgpool-II
# 1.1, the default value is false. The default value will be true in
# 1.2.
enable_pool_hba = true

# - online recovery -
# online recovery user
#recovery_user = 'postgres'

# online recovery password
#recovery_password = ''

# execute a command in first stage.
#recovery_1st_stage_command = 'copy_base_backup'

# execute a command in second stage.
#recovery_2nd_stage_command = 'pgpool_recovery_pitr'

# maximum time in seconds to wait for the recovering node's postmaster
# start-up. 0 means no wait.
# this is also used as a timer waiting for clients disconnected before
# starting 2nd stage
#recovery_timeout = 90

# If client_idle_limit_in_recovery is n (n > 0), the client is forced
# to be disconnected whenever after n seconds idle (even inside an
# explicit transactions!)  0 means no disconnect. This parameter only
# takes effect in recovery 2nd stage.
client_idle_limit_in_recovery = 0

replication_timeout = 5000
backend_hostname0 = 'localhost'
backend_port0 = 4002
backend_weight0 = 10
backend_data_directory0 = '/usr/local/pgpool/data-pp2'
backend_hostname1 = 'im-pp3'
backend_port1 = 4003
backend_weight1 = 10
backend_data_directory1 = '/usr/local/pgpool/data-pp3'
backend_hostname2 = 'im-pp1'
backend_port2 = 4001
backend_weight2 = 10
backend_data_directory2 = '/usr/local/pgpool/data-pp1'
backend_hostname3 = 'im-pp4'
backend_port3 = 4004
backend_weight3 = 10
backend_data_directory3 = '/usr/local/pgpool/data-pp4'
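
As a quick way to confirm that every backend listed above is reachable from the pgpool host, the following Python sketch reads the `backend_hostnameN` / `backend_portN` pairs and tries a plain TCP connection to each. It is only an illustration: the helper names `parse_backends` and `can_connect` and the 3-second timeout are assumptions, and a successful TCP connect shows only that the port accepts connections, not that pgpool considers the node attached.

```python
import re
import socket

# Matches uncommented lines such as:
#   backend_hostname0 = 'localhost'
#   backend_port0 = 4002
BACKEND_RE = re.compile(
    r"^\s*backend_(hostname|port)(\d+)\s*=\s*'?([^'\s#]+)'?", re.MULTILINE
)


def parse_backends(conf_text):
    """Return {node_id: (hostname, port)} for nodes that define both settings."""
    hosts, ports = {}, {}
    for kind, node, value in BACKEND_RE.findall(conf_text):
        if kind == "hostname":
            hosts[int(node)] = value
        else:
            ports[int(node)] = int(value)
    return {n: (hosts[n], ports[n]) for n in sorted(hosts) if n in ports}


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused connections, timeouts and DNS failures
        return False
```

A possible use is to loop over `parse_backends(open(path).read()).items()`, where `path` is wherever pgpool.conf lives on the host, and print `can_connect(host, port)` for each node.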

Thanks again!

---

Fernando Marcelo
www.consultorpc.com
fernando at consultorpc.com

On 11/12/2009, at 16:41, Marcos Davi Reis wrote:

> Please send more log info to help us to support you!
>
>
> Regards,
> Marcos Davi Reis
> www.movaomundo.com
>
>
>
> On Fri, Dec 11, 2009 at 3:37 PM, Fernando Morgenstern <fernando at consultorpc.com> wrote:
> Hello,
>
> I started to use pgpool a few days ago and, while testing it, I am
> having some issues bringing nodes back.
>
> I have 4 nodes running, so I decided to stop one of them. I saw in the
> logs that its status changed to 3. OK, perfect.
>
> Then I started the failed node again, but its status was still 3.
>
> I decided to stop and start pgpool again; now all nodes have status 3:
>
> 2009-12-11 11:29:28 DEBUG: pid 1107: health_check: 0 th DB node status: 3
> 2009-12-11 11:29:28 DEBUG: pid 1107: health_check: 1 th DB node status: 3
> 2009-12-11 11:29:28 DEBUG: pid 1107: health_check: 2 th DB node status: 3
> 2009-12-11 11:29:28 DEBUG: pid 1107: health_check: 3 th DB node status: 3
>
> Am I doing something wrong here?
>
> I have checked, and I can connect from the pgpool host to all postgres
> servers.
>
> Best Regards,
> ---
>
> Fernando Marcelo
> www.consultorpc.com
> fernando at consultorpc.com
>
> _______________________________________________
> Pgpool-general mailing list
> Pgpool-general at pgfoundry.org
> http://pgfoundry.org/mailman/listinfo/pgpool-general
>
>
>
> -- 
> Marcos Davi Reis
> Mova
> www.movaomundo.com
> +55 21 3553-1511
> +55 21 9923-8319
