[pgpool-general: 487] Re: abnormal behavior pgpool on drop

Tatsuo Ishii ishii at postgresql.org
Tue May 22 21:25:41 JST 2012


In both cases apparently postmaster received shutdown signal. There's
no code in pgpool which login to the server where PostgreSQL is
running on then sends shutdown signal to it. It seems someone other
than pgpoool does it, isn't it?
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

> 1)
> I have 2 servers where installed pgpool with heartbeat+pacemaker HA
> cluster. Pgpool always run on one machine at the same time.
> Version: 1.0.8-042548a451fce8400660f6031f4da6f0223dd5dd
> 
> HW of this pgpool servers is identical:
>  HP ProLiant BL460c G1
> 
> I have 3 servers where started postgresql in streaming replication mode
> One master and two slave, one in sync replication and another one potential
> 
> HW of this postgresql servers is identical:
> HP ProLiant BL460c G1
> 
> 2)
> listen_addresses = '*'
> port = 5433
> socket_dir = '/tmp'
> pcp_port = 9898
> pcp_socket_dir = '/tmp'
> backend_hostname0 = 'dbip1'
> backend_port0 = 5432
> backend_weight0 = 0.499
> backend_data_directory0 = '/var/lib/postgresql/9.1/main'
> backend_hostname1 = 'dbip2'
> backend_port1 = 5432
> backend_weight1 = 0.499
> backend_data_directory1 = '/var/lib/postgresql/9.1/main'
> backend_hostname2 = 'dbip3'
> backend_port2 = 5432
> backend_weight2 = 0.002
> backend_data_directory2 = '/var/lib/postgresql/9.1/main'
> enable_pool_hba = off
> authentication_timeout = 60
> ssl = off
> num_init_children = 120
> max_pool = 2
> child_life_time = 64800
> child_max_connections = 1000
> connection_life_time = 300
> client_idle_limit = 64800
> log_destination = 'stderr'
> print_timestamp = on
> log_connections = off
> log_hostname = on
> log_statement = off
> log_per_node_statement = off
> log_standby_delay = 'if_over_threshold'
> syslog_facility = 'LOCAL0'
> syslog_ident = 'pgpool'
> debug_level = 0
> pid_file_name = '/var/run/pgpool/pgpool.pid'
> logdir = '/tmp'
> connection_cache = on
> reset_query_list = 'ABORT; DISCARD ALL'
> replication_mode = off
> replicate_select = off
> insert_lock = off
> lobj_lock_table = ''
> replication_stop_on_mismatch = off
> failover_if_affected_tuples_mismatch = off
> load_balance_mode = on
> ignore_leading_white_space = off
> white_function_list = ''
> black_function_list = 'nextval,setval,set.*,drop.*'
> master_slave_mode = on
> master_slave_sub_mode = 'stream'
> sr_check_period = 10
> sr_check_user = 'sr_check'
> sr_check_password = 'xxx'
> delay_threshold = 100
> follow_master_command = '/etc/pgpool.d/follow_master_command.sh %d %h %p %D
> %m %H %M %P'
> parallel_mode = off
> enable_query_cache = off
> pgpool2_hostname = ''
> system_db_hostname  = 'localhost'
> system_db_port = 5432
> system_db_dbname = 'pgpool'
> system_db_schema = 'pgpool_catalog'
> system_db_user = 'pgpool'
> system_db_password = ''
> health_check_period = 5
> health_check_timeout = 20
> health_check_user = 'health_check'
> health_check_password = 'xxx'
> failover_command = '/etc/pgpool.d/failover.sh %d %h %p %D %m %H %M %P'
> failback_command = '/etc/pgpool.d/failback_command.sh %d %h %p %D %m %H %M
> %P'
> fail_over_on_backend_error = on
> recovery_user = 'postgres'
> recovery_password = 'postgres'
> recovery_1st_stage_command = 'copy_base_backup'
> recovery_2nd_stage_command = ''
> recovery_timeout = 90
> client_idle_limit_in_recovery = 0
> relcache_expire = 5
> 
> If need i can give you my failover, follow and failback scripts.
> 3)
> I stumbled with this problem twice when try to drop database with size
> approximately
> 1.5G.  I connect to pgpool by pgAdmin - 1.14.2 and try to drop database and
> then my session to pgpool hang i go directly to balancer and to the primary
> postgresql server and try find out what happens.
> 
> I write you about my second case, i find logs about my first case they are
> identical.
> pgpool logs:
> 2012-05-16 10:42:20 LOG:   pid 14558: pool_send_and_wait: Error or notice
> message from backend: : DB node id: 0 backend pid: 31490 statement: DROP
> DATABASE prod_context_copy; message: database "dbname" is being accessed by
> other users
> 2012-05-16 10:43:11 LOG:   pid 14159: postmaster on DB node 0 was shutdown
> by administrative command
> 2012-05-16 10:43:11 LOG:   pid 14159: degenerate_backend_set: 0 fail over
> request from pid 14159
> 2012-05-16 10:43:11 LOG:   pid 4975: starting degeneration. shutdown host
> dbmaster(5432)
> 2012-05-16 10:43:11 LOG:   pid 4975: Restart all children
> 2012-05-16 10:43:11 LOG:   pid 4975: execute command:
> /etc/pgpool.d/failover.sh 0 dbmaster 5432 /var/lib/postgresql/9.1/main 1
> dbslave1 0 0
> 2012-05-16_10:43:11 failover: Master failed at dbmaster (0). New master is
> dbslave1 (1)
> 2012-05-16_10:43:12 failover: New master start triggered.
> 
> postgresql logs:
> 2012-05-16 10:43:13 MSK @  LOG:  received fast shutdown request
> 2012-05-16 10:43:13 MSK @  LOG:  aborting any active transactions
> 
> 
> 
> 
> 2012/5/22 Tatsuo Ishii <ishii at postgresql.org>
> 
>> Can you please provide:
>>
>> 1) pgpool, PostgreSQL which server are they running on?  Note that I
>>   don't need actual hostnames and IP addresses. I just want to know
>>   the "topology" of the server configuration.
>>
>> 2) pgpool.conf
>>
>> 3) How to reproduce the problem (if you know)
>> --
>> Tatsuo Ishii
>> SRA OSS, Inc. Japan
>> English: http://www.sraoss.co.jp/index_en.php
>> Japanese: http://www.sraoss.co.jp
>>
>> > Tatsuo what information from me, can help you to diagnose the problem? I
>> am
>> > ready to describe to you my configuration for pgpool and postgresql.
>> >
>> > 2012/5/22 Tatsuo Ishii <ishii at postgresql.org>
>> >
>> >> I don't know about your configurations where each server is
>> >> residing. So I cannot say anything from what you showed below...
>> >> --
>> >> Tatsuo Ishii
>> >> SRA OSS, Inc. Japan
>> >> English: http://www.sraoss.co.jp/index_en.php
>> >> Japanese: http://www.sraoss.co.jp
>> >>
>> >> > I see it in log for example:
>> >> >
>> >> > On master db:
>> >> > in i see this /var/log/postgresql/postgresql-9.1-main.log
>> >> > 2012-05-21 15:02:21 MSK @  LOG:  received fast shutdown request
>> >> > 2012-05-21 15:02:21 MSK @  LOG:  aborting any active transactions
>> >> >
>> >> > in /var/log/auth.log i see this:
>> >> > May 21 15:02:21 pre-db2 sshd[6805]: debug1: PAM: initializing for
>> >> "postgres"
>> >> > May 21 15:02:21 pre-db2 sshd[6805]: debug1: PAM: setting PAM_RHOST to
>> >> > "dbbalancer1"
>> >> > May 21 15:02:21 pre-db2 sshd[6805]: debug1: PAM: setting PAM_TTY to
>> "ssh"
>> >> > May 21 15:02:21 pre-db2 sshd[6805]: Failed none for postgres from
>> >> > 10.2.141.51 port 33031 ssh2
>> >> > May 21 15:02:21 pre-db2 sshd[6805]: debug1: Checking blacklist file
>> >> > /usr/share/ssh/blacklist.RSA-2048
>> >> > May 21 15:02:21 pre-db2 sshd[6805]: debug1: Checking blacklist file
>> >> > /etc/ssh/blacklist.RSA-2048
>> >> > May 21 15:02:21 pre-db2 sshd[6805]: debug1: temporarily_use_uid:
>> 105/109
>> >> > (e=0/0)
>> >> > May 21 15:02:21 pre-db2 sshd[6805]: debug1: trying public key file
>> >> > /var/lib/postgresql/.ssh/authorized_keys
>> >> > May 21 15:02:21 pre-db2 sshd[6805]: debug1: fd 4 clearing O_NONBLOCK
>> >> > May 21 15:02:21 pre-db2 sshd[6805]: debug1: matching key found: file
>> >> > /var/lib/postgresql/.ssh/authorized_keys, line 2
>> >> >
>> >> > I run pgpool on host dbbalancer1 under user postgres.  And configure
>> >> pgpool
>> >> > script work under postgres and used ssh keys.
>> >> >
>> >> > And this situation repeated twice.
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >
>> >> > 2012/5/21 Tatsuo Ishii <ishii at postgresql.org>
>> >> >
>> >> >> > I have pgpool-II version 3.1.3 installed and three db postgresql
>> >> 9.1.3 in
>> >> >> > streaming replication. When i try to drop database i see this in
>> >> pgpool
>> >> >> log:
>> >> >> >
>> >> >> > 2012-05-21 15:01:48 LOG:   pid 15484: pool_send_and_wait: Error or
>> >> notice
>> >> >> > message from backend: : DB node id: 1 backend pid: 6769 statement:
>> >> DROP
>> >> >> > DATABASE xxx; message: database "xxx" is being accessed by other
>> users
>> >> >> > 2012-05-21 15:02:20 LOG:   pid 26326: postmaster on DB node 1 was
>> >> >> shutdown
>> >> >> > by administrative command
>> >> >> > 2012-05-21 15:02:20 LOG:   pid 26326: degenerate_backend_set: 1
>> fail
>> >> over
>> >> >> > request from pid 26326
>> >> >> > 2012-05-21 15:02:20 LOG:   pid 4975: starting degeneration.
>> shutdown
>> >> host
>> >> >> > 10.141.2.54(5432)
>> >> >> > 2012-05-21 15:02:20 LOG:   pid 4975: Restart all children
>> >> >> > 2012-05-21 15:02:20 LOG:   pid 4975: execute command:
>> >> >> > /etc/pgpool.d/failover.sh 1 10.141.2.54 5432
>> >> >> /var/lib/postgresql/9.1/main 0
>> >> >> > 10.141.2.53 0 1
>> >> >> > 2012-05-21_15:02:20 failover: Master failed at 10.141.2.54 (0). New
>> >> >> master
>> >> >> > is 10.141.2.53 (0)
>> >> >> > 2012-05-21_15:02:20 failover: New master start triggered.
>> >> >> >
>> >> >> > Why pgpool initiated failover procedure on statement drop
>> database? I
>> >> add
>> >> >> > drop statement in black_function_list may be this helps me.
>> >> >>
>> >> >> Why do you think drop database initiated failover? The failover
>> >> >> happend 32 seconds after drop database error.
>> >> >> --
>> >> >> Tatsuo Ishii
>> >> >> SRA OSS, Inc. Japan
>> >> >> English: http://www.sraoss.co.jp/index_en.php
>> >> >> Japanese: http://www.sraoss.co.jp
>> >> >>
>> >> >
>> >> >
>> >> >
>> >> > --
>> >> > С уважением Селявка Евгений
>> >>
>> >
>> >
>> >
>> > --
>> > С уважением Селявка Евгений
>>
> 
> 
> 
> -- 
> С уважением Селявка Евгений


More information about the pgpool-general mailing list