[pgpool-general: 486] Re: abnormal behavior pgpool on drop

Евгений Селявка evg.selyavka at gmail.com
Tue May 22 20:53:31 JST 2012


1)
I have 2 servers where installed pgpool with heartbeat+pacemaker HA
cluster. Pgpool always run on one machine at the same time.
Version: 1.0.8-042548a451fce8400660f6031f4da6f0223dd5dd

HW of this pgpool servers is identical:
 HP ProLiant BL460c G1

I have 3 servers where started postgresql in streaming replication mode
One master and two slave, one in sync replication and another one potential

HW of this postgresql servers is identical:
HP ProLiant BL460c G1

2)
listen_addresses = '*'
port = 5433
socket_dir = '/tmp'
pcp_port = 9898
pcp_socket_dir = '/tmp'
backend_hostname0 = 'dbip1'
backend_port0 = 5432
backend_weight0 = 0.499
backend_data_directory0 = '/var/lib/postgresql/9.1/main'
backend_hostname1 = 'dbip2'
backend_port1 = 5432
backend_weight1 = 0.499
backend_data_directory1 = '/var/lib/postgresql/9.1/main'
backend_hostname2 = 'dbip3'
backend_port2 = 5432
backend_weight2 = 0.002
backend_data_directory2 = '/var/lib/postgresql/9.1/main'
enable_pool_hba = off
authentication_timeout = 60
ssl = off
num_init_children = 120
max_pool = 2
child_life_time = 64800
child_max_connections = 1000
connection_life_time = 300
client_idle_limit = 64800
log_destination = 'stderr'
print_timestamp = on
log_connections = off
log_hostname = on
log_statement = off
log_per_node_statement = off
log_standby_delay = 'if_over_threshold'
syslog_facility = 'LOCAL0'
syslog_ident = 'pgpool'
debug_level = 0
pid_file_name = '/var/run/pgpool/pgpool.pid'
logdir = '/tmp'
connection_cache = on
reset_query_list = 'ABORT; DISCARD ALL'
replication_mode = off
replicate_select = off
insert_lock = off
lobj_lock_table = ''
replication_stop_on_mismatch = off
failover_if_affected_tuples_mismatch = off
load_balance_mode = on
ignore_leading_white_space = off
white_function_list = ''
black_function_list = 'nextval,setval,set.*,drop.*'
master_slave_mode = on
master_slave_sub_mode = 'stream'
sr_check_period = 10
sr_check_user = 'sr_check'
sr_check_password = 'xxx'
delay_threshold = 100
follow_master_command = '/etc/pgpool.d/follow_master_command.sh %d %h %p %D
%m %H %M %P'
parallel_mode = off
enable_query_cache = off
pgpool2_hostname = ''
system_db_hostname  = 'localhost'
system_db_port = 5432
system_db_dbname = 'pgpool'
system_db_schema = 'pgpool_catalog'
system_db_user = 'pgpool'
system_db_password = ''
health_check_period = 5
health_check_timeout = 20
health_check_user = 'health_check'
health_check_password = 'xxx'
failover_command = '/etc/pgpool.d/failover.sh %d %h %p %D %m %H %M %P'
failback_command = '/etc/pgpool.d/failback_command.sh %d %h %p %D %m %H %M
%P'
fail_over_on_backend_error = on
recovery_user = 'postgres'
recovery_password = 'postgres'
recovery_1st_stage_command = 'copy_base_backup'
recovery_2nd_stage_command = ''
recovery_timeout = 90
client_idle_limit_in_recovery = 0
relcache_expire = 5

If need i can give you my failover, follow and failback scripts.
3)
I stumbled with this problem twice when try to drop database with size
approximately
1.5G.  I connect to pgpool by pgAdmin - 1.14.2 and try to drop database and
then my session to pgpool hang i go directly to balancer and to the primary
postgresql server and try find out what happens.

I write you about my second case, i find logs about my first case they are
identical.
pgpool logs:
2012-05-16 10:42:20 LOG:   pid 14558: pool_send_and_wait: Error or notice
message from backend: : DB node id: 0 backend pid: 31490 statement: DROP
DATABASE prod_context_copy; message: database "dbname" is being accessed by
other users
2012-05-16 10:43:11 LOG:   pid 14159: postmaster on DB node 0 was shutdown
by administrative command
2012-05-16 10:43:11 LOG:   pid 14159: degenerate_backend_set: 0 fail over
request from pid 14159
2012-05-16 10:43:11 LOG:   pid 4975: starting degeneration. shutdown host
dbmaster(5432)
2012-05-16 10:43:11 LOG:   pid 4975: Restart all children
2012-05-16 10:43:11 LOG:   pid 4975: execute command:
/etc/pgpool.d/failover.sh 0 dbmaster 5432 /var/lib/postgresql/9.1/main 1
dbslave1 0 0
2012-05-16_10:43:11 failover: Master failed at dbmaster (0). New master is
dbslave1 (1)
2012-05-16_10:43:12 failover: New master start triggered.

postgresql logs:
2012-05-16 10:43:13 MSK @  LOG:  received fast shutdown request
2012-05-16 10:43:13 MSK @  LOG:  aborting any active transactions




2012/5/22 Tatsuo Ishii <ishii at postgresql.org>

> Can you please provide:
>
> 1) pgpool, PostgreSQL which server are they running on?  Note that I
>   don't need actual hostnames and IP addresses. I just want to know
>   the "topology" of the server configuration.
>
> 2) pgpool.conf
>
> 3) How to reproduce the problem (if you know)
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> English: http://www.sraoss.co.jp/index_en.php
> Japanese: http://www.sraoss.co.jp
>
> > Tatsuo what information from me, can help you to diagnose the problem? I
> am
> > ready to describe to you my configuration for pgpool and postgresql.
> >
> > 2012/5/22 Tatsuo Ishii <ishii at postgresql.org>
> >
> >> I don't know about your configurations where each server is
> >> residing. So I cannot say anything from what you showed below...
> >> --
> >> Tatsuo Ishii
> >> SRA OSS, Inc. Japan
> >> English: http://www.sraoss.co.jp/index_en.php
> >> Japanese: http://www.sraoss.co.jp
> >>
> >> > I see it in log for example:
> >> >
> >> > On master db:
> >> > in i see this /var/log/postgresql/postgresql-9.1-main.log
> >> > 2012-05-21 15:02:21 MSK @  LOG:  received fast shutdown request
> >> > 2012-05-21 15:02:21 MSK @  LOG:  aborting any active transactions
> >> >
> >> > in /var/log/auth.log i see this:
> >> > May 21 15:02:21 pre-db2 sshd[6805]: debug1: PAM: initializing for
> >> "postgres"
> >> > May 21 15:02:21 pre-db2 sshd[6805]: debug1: PAM: setting PAM_RHOST to
> >> > "dbbalancer1"
> >> > May 21 15:02:21 pre-db2 sshd[6805]: debug1: PAM: setting PAM_TTY to
> "ssh"
> >> > May 21 15:02:21 pre-db2 sshd[6805]: Failed none for postgres from
> >> > 10.2.141.51 port 33031 ssh2
> >> > May 21 15:02:21 pre-db2 sshd[6805]: debug1: Checking blacklist file
> >> > /usr/share/ssh/blacklist.RSA-2048
> >> > May 21 15:02:21 pre-db2 sshd[6805]: debug1: Checking blacklist file
> >> > /etc/ssh/blacklist.RSA-2048
> >> > May 21 15:02:21 pre-db2 sshd[6805]: debug1: temporarily_use_uid:
> 105/109
> >> > (e=0/0)
> >> > May 21 15:02:21 pre-db2 sshd[6805]: debug1: trying public key file
> >> > /var/lib/postgresql/.ssh/authorized_keys
> >> > May 21 15:02:21 pre-db2 sshd[6805]: debug1: fd 4 clearing O_NONBLOCK
> >> > May 21 15:02:21 pre-db2 sshd[6805]: debug1: matching key found: file
> >> > /var/lib/postgresql/.ssh/authorized_keys, line 2
> >> >
> >> > I run pgpool on host dbbalancer1 under user postgres.  And configure
> >> pgpool
> >> > script work under postgres and used ssh keys.
> >> >
> >> > And this situation repeated twice.
> >> >
> >> >
> >> >
> >> >
> >> >
> >> >
> >> > 2012/5/21 Tatsuo Ishii <ishii at postgresql.org>
> >> >
> >> >> > I have pgpool-II version 3.1.3 installed and three db postgresql
> >> 9.1.3 in
> >> >> > streaming replication. When i try to drop database i see this in
> >> pgpool
> >> >> log:
> >> >> >
> >> >> > 2012-05-21 15:01:48 LOG:   pid 15484: pool_send_and_wait: Error or
> >> notice
> >> >> > message from backend: : DB node id: 1 backend pid: 6769 statement:
> >> DROP
> >> >> > DATABASE xxx; message: database "xxx" is being accessed by other
> users
> >> >> > 2012-05-21 15:02:20 LOG:   pid 26326: postmaster on DB node 1 was
> >> >> shutdown
> >> >> > by administrative command
> >> >> > 2012-05-21 15:02:20 LOG:   pid 26326: degenerate_backend_set: 1
> fail
> >> over
> >> >> > request from pid 26326
> >> >> > 2012-05-21 15:02:20 LOG:   pid 4975: starting degeneration.
> shutdown
> >> host
> >> >> > 10.141.2.54(5432)
> >> >> > 2012-05-21 15:02:20 LOG:   pid 4975: Restart all children
> >> >> > 2012-05-21 15:02:20 LOG:   pid 4975: execute command:
> >> >> > /etc/pgpool.d/failover.sh 1 10.141.2.54 5432
> >> >> /var/lib/postgresql/9.1/main 0
> >> >> > 10.141.2.53 0 1
> >> >> > 2012-05-21_15:02:20 failover: Master failed at 10.141.2.54 (0). New
> >> >> master
> >> >> > is 10.141.2.53 (0)
> >> >> > 2012-05-21_15:02:20 failover: New master start triggered.
> >> >> >
> >> >> > Why pgpool initiated failover procedure on statement drop
> database? I
> >> add
> >> >> > drop statement in black_function_list may be this helps me.
> >> >>
> >> >> Why do you think drop database initiated failover? The failover
> >> >> happend 32 seconds after drop database error.
> >> >> --
> >> >> Tatsuo Ishii
> >> >> SRA OSS, Inc. Japan
> >> >> English: http://www.sraoss.co.jp/index_en.php
> >> >> Japanese: http://www.sraoss.co.jp
> >> >>
> >> >
> >> >
> >> >
> >> > --
> >> > С уважением Селявка Евгений
> >>
> >
> >
> >
> > --
> > С уважением Селявка Евгений
>



-- 
С уважением Селявка Евгений
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.sraoss.jp/pipermail/pgpool-general/attachments/20120522/8601967c/attachment-0001.html>


More information about the pgpool-general mailing list