[Pgpool-general] pgpool-II 2.2 RC1 released

Wed Feb 18 16:21:44 UTC 2009

On Sun, Feb 15, 2009 at 6:43 AM, Tatsuo Ishii <ishii at sraoss.co.jp> wrote:

> pgpool-II 2.2 RC1 is released. Changes from beta2:

While trying to reproduce the problem with reloading PostgreSQL
(changing log_connections and log_statement) under load, I decided
to do an online recovery. I was using version 2.2 beta2.

And I got this nasty error in the syslog of the node where pgpool-II
is running (the master node):

Feb 18 14:52:45 pgsql1 pgpool[11708]: Rsyncing directory pg_xlog
Feb 18 14:52:56 pgsql1 pgpool[11712]: Rsyncing file recovery.conf
(with source deletion)
Feb 18 14:52:56 pgsql1 pgpool[11715]: Executing pg_stop_backup
Feb 18 14:52:56 pgsql1 pgpool: LOG:   pid 12706: 1st stage is done
Feb 18 14:52:56 pgsql1 pgpool: LOG:   pid 12706: starting 2nd stage
Feb 18 14:53:26 pgsql1 pgpool: LOG:   pid 3307: pool_process_query:
child connection forced to terminate due to
client_idle_limit_in_recovery(30) reached
Feb 18 14:53:26 pgsql1 pgpool: LOG:   pid 24993: pool_process_query:
child connection forced to terminate due to
client_idle_limit_in_recovery(30) reached
Feb 18 14:53:26 pgsql1 pgpool: LOG:   pid 3314: pool_process_query:
child connection forced to terminate due to
client_idle_limit_in_recovery(30) reached
Feb 18 14:53:29 pgsql1 pgpool: LOG:   pid 12706: all connections from
clients have been closed
Feb 18 14:53:29 pgsql1 pgpool: LOG:   pid 12706: CHECKPOINT in the 2nd
stage done
Feb 18 14:53:29 pgsql1 pgpool: LOG:   pid 12706: starting recovery
command: "SELECT pgpool_recovery('pgpool-recovery-pitr',
'pgsql2.freyatest.domain', '/var/lib/postgresql/8.3/main')"
Feb 18 14:53:29 pgsql1 pgpool[11725]: Executing pgpool-recovery-pitr
as user postgres
Feb 18 14:53:29 pgsql1 pgpool[11726]: Executing pg_switch_xlog
Feb 18 14:53:29 pgsql1 pgpool[11730]: pg_switch_xlog executed successfully.
Feb 18 14:53:29 pgsql1 pgpool[11734]: Executing pgpool_remote_start as
user postgres
Feb 18 14:53:29 pgsql1 pgpool[11735]: Starting remote PostgreSQL server
Feb 18 14:53:34 pgsql1 pgpool: LOG:   pid 12706: 1 node restarted
Feb 18 14:53:34 pgsql1 pgpool: LOG:   pid 12706:
send_failback_request: fail back 1 th node request from pid 12706
Feb 18 14:53:34 pgsql1 pgpool: LOG:   pid 12669: starting fail back.
reconnect host pgsql2.freyatest.domain(5432)
Feb 18 14:53:34 pgsql1 pgpool: LOG:   pid 12669: execute command:
/var/lib/postgresql/8.3/main/pgpool-failback 1 pgsql2.freyatest.domain
5432 /var/lib/postgresql/8.3/main 0 0
Feb 18 14:53:34 pgsql1 pgpool[11768]: Executing pgpool-failback as
user postgres
Feb 18 14:53:34 pgsql1 pgpool[11769]: Failback of node 1 at hostname
pgsql2.freyatest.domain. New master node is 0. Old master node was 0.
Feb 18 14:53:34 pgsql1 kernel: [2427427.254118] pgpool[3307] segfault
at 10 ip 4137ac sp 7fff12fb3930 error 4 in pgpool[400000+b0000]
Feb 18 14:53:34 pgsql1 pgpool: LOG:   pid 12669: failover_handler: set
new master node: 0
Feb 18 14:53:34 pgsql1 kernel: [2427427.267249] pgpool[3314] segfault
at 10 ip 4137ac sp 7fff12fb3930 error 4 in pgpool[400000+b0000]
Feb 18 14:53:34 pgsql1 pgpool: LOG:   pid 12669: failback done.
reconnect host pgsql2.freyatest.domain(5432)
Feb 18 14:53:34 pgsql1 pgpool: LOG:   pid 12706: recovery done
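
One detail that is easy to miss in the log above: the two children that
segfaulted (3307 and 3314) are among the three that were forced to
terminate because of client_idle_limit_in_recovery a few seconds
earlier. A minimal sketch for checking a syslog for that pattern follows.
The function name correlate and the SAMPLE list are only illustrations
(SAMPLE holds unwrapped copies of lines from the log above), and the
script assumes each syslog entry is on a single line. It only shows that
the pids coincide; it does not prove that one event caused the other.

```python
import re

# Children killed by the idle limit during the 2nd stage of recovery.
IDLE_RE = re.compile(
    r"pid (\d+): pool_process_query:.*client_idle_limit_in_recovery"
)
# Kernel segfault reports for pgpool processes.
SEGV_RE = re.compile(r"kernel: .*pgpool\[(\d+)\] segfault")


def correlate(lines):
    """Return (idle_killed_pids, segfaulted_pids) found in the given lines."""
    idle_killed, segfaulted = set(), set()
    for line in lines:
        m = IDLE_RE.search(line)
        if m:
            idle_killed.add(int(m.group(1)))
        m = SEGV_RE.search(line)
        if m:
            segfaulted.add(int(m.group(1)))
    return idle_killed, segfaulted


# Unwrapped copies of lines from the log above (illustrative input).
SAMPLE = [
    "Feb 18 14:53:26 pgsql1 pgpool: LOG:   pid 3307: pool_process_query: "
    "child connection forced to terminate due to "
    "client_idle_limit_in_recovery(30) reached",
    "Feb 18 14:53:26 pgsql1 pgpool: LOG:   pid 24993: pool_process_query: "
    "child connection forced to terminate due to "
    "client_idle_limit_in_recovery(30) reached",
    "Feb 18 14:53:26 pgsql1 pgpool: LOG:   pid 3314: pool_process_query: "
    "child connection forced to terminate due to "
    "client_idle_limit_in_recovery(30) reached",
    "Feb 18 14:53:34 pgsql1 kernel: [2427427.254118] pgpool[3307] segfault "
    "at 10 ip 4137ac sp 7fff12fb3930 error 4 in pgpool[400000+b0000]",
    "Feb 18 14:53:34 pgsql1 kernel: [2427427.267249] pgpool[3314] segfault "
    "at 10 ip 4137ac sp 7fff12fb3930 error 4 in pgpool[400000+b0000]",
]

if __name__ == "__main__":
    idle, segv = correlate(SAMPLE)
    print("idle-killed:", sorted(idle))
    print("segfaulted: ", sorted(segv))
    print("both:       ", sorted(idle & segv))
```

Child 24993 was also terminated by the idle limit but does not show up
in a segfault line, so the overlap is not complete.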

The thing is that pgpool-II stayed active and both nodes seemed to be
online, but the segfaults gave me the creeps. I got so scared that I
am now using RC1 with the createindex patch you submitted and redoing
all the tests from scratch. Recovering again (RC1 with the patch) did
not produce the same error, although it was a quick operation since it
did not have to transfer many gigabytes of data (they had already been
transferred during the run that gave the error).
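
In case it helps with tracking this down, here is a rough sketch that
pulls the fields out of one of the kernel segfault lines. The function
name decode and the returned dictionary keys are made up for this
example. The bit meanings are the usual x86 page-fault error code bits
(bit 0: page was present, bit 1: write access, bit 2: user mode), and I
am assuming the kernel prints the error code in hexadecimal.

```python
import re

SEGV_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\] segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
    r"(?: in (?P<obj>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\])?"
)


def decode(line):
    """Decode a kernel 'segfault at ...' line; return None if it does not match."""
    m = SEGV_RE.search(line)
    if m is None:
        return None
    err = int(m.group("err"), 16)
    ip = int(m.group("ip"), 16)
    info = {
        "comm": m.group("comm"),
        "pid": int(m.group("pid")),
        "fault_addr": int(m.group("addr"), 16),
        "ip": ip,
        # x86 page-fault error code bits (assumed layout, see lead-in).
        "page_present": bool(err & 1),
        "access": "write" if err & 2 else "read",
        "mode": "user" if err & 4 else "kernel",
        "offset": None,
    }
    if m.group("base") is not None:
        # Offset of the faulting instruction inside the mapped object.
        info["offset"] = ip - int(m.group("base"), 16)
    return info


LINE = (
    "Feb 18 14:53:34 pgsql1 kernel: [2427427.254118] pgpool[3307] segfault "
    "at 10 ip 4137ac sp 7fff12fb3930 error 4 in pgpool[400000+b0000]"
)

if __name__ == "__main__":
    info = decode(LINE)
    for key in ("comm", "pid", "access", "mode", "page_present"):
        print(f"{key}: {info[key]}")
    for key in ("fault_addr", "ip", "offset"):
        print(f"{key}: {info[key]:#x}")
```

For the line above this gives a user-mode read of a page that was not
present, at address 0x10, with the instruction pointer 0x137ac bytes
into the pgpool mapping. A fault address that close to zero would be
consistent with reading a field through a NULL pointer, but that is only
a guess until someone maps the instruction pointer to a source line
using a binary with debug symbols.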

I'll submit more as soon as I can. Too damn busy these days and can't
really test as much as I would like :'(

-- 
Jaume Sabater
http://linuxsilo.net/

"Ubi sapientas ibi libertas"