[pgpool-general: 1407] Re: pgPool Online Recovery Streaming Replication

ning chan ninchan8328 at gmail.com
Wed Feb 20 06:02:59 JST 2013


Hi Tatsuo,
I traced the problem to a scripting error in pgpool_remote_start: it had
the incorrect path to pg_ctl.
As soon as I corrected it, I could recover the failed server and bring it
online.
However, I am now facing another problem.
After I brought the failed master back into pgpool as a standby server,
I shut down the current primary server, expecting the standby server to
be promoted to the new primary; however, that did not happen.
I checked the pgpool log and can see that the failover command is called,
but it does not appear to execute. I checked and confirmed that my failover
script works just fine when run on its own.

Here is the log:
Feb 19 14:51:40 server0 pgpool[3519]: set 1 th backend down status
Feb 19 14:51:43 server0 pgpool[3554]: wd_create_send_socket: connect()
reports failure (No route to host). You can safely ignore this while
starting up.
Feb 19 14:51:43 server0 pgpool[3522]: wd_lifecheck: lifecheck failed 3
times. pgpool seems not to be working
Feb 19 14:51:43 server0 pgpool[3522]: wd_IP_down: ifconfig down succeeded
Feb 19 14:51:43 server0 pgpool[3519]: starting degeneration. shutdown host
server1(5432)
Feb 19 14:51:43 server0 pgpool[3519]: Restart all children
Feb 19 14:51:43 server0 pgpool[3519]: execute command:
/home/pgpool/failover.py -d 1 -h server1 -p 5432 -D /opt/postgres/9.2/data
-m 0 -H server0 -M 0 -P 1 -r 5432 -R /opt/postgres/9.2/data
Feb 19 14:51:43 server0 postgres[3939]: [2-1] LOG:  incomplete startup
packet
Feb 19 14:51:43 server0 postgres[3931]: [2-1] LOG:  incomplete startup
packet
Feb 19 14:51:43 server0 postgres[3935]: [2-1] LOG:  incomplete startup
packet
Feb 19 14:51:43 server0 pgpool[3519]: find_primary_node_repeatedly: waiting
for finding a primary node
Feb 19 14:51:46 server0 pgpool[3522]: wd_create_send_socket: connect()
reports failure (No route to host). You can safely ignore this while
starting up.
Feb 19 14:51:46 server0 pgpool[3522]: wd_lifecheck: lifecheck failed 3
times. pgpool seems not to be working
Feb 19 14:51:54 server0 pgpool[3556]: connect_inet_domain_socket: connect()
failed: Connection timed out
Feb 19 14:51:54 server0 pgpool[3556]: make_persistent_db_connection:
connection to server1(5432) failed
Feb 19 14:51:54 server0 pgpool[3556]: check_replication_time_lag: could not
connect to DB node 1, check sr_check_user and sr_check_password
Feb 19 14:52:05 server0 pgpool[3522]: wd_lifecheck: lifecheck failed 3
times. pgpool seems not to be working
Feb 19 14:52:05 server0 pgpool[3556]: connect_inet_domain_socket: connect()
failed: No route to host
Feb 19 14:52:05 server0 pgpool[3556]: make_persistent_db_connection:
connection to server1(5432) failed
Feb 19 14:52:05 server0 pgpool[3556]: check_replication_time_lag: could not
connect to DB node 1, check sr_check_user and sr_check_password
Feb 19 14:52:05 server0 pgpool[3522]: wd_lifecheck: lifecheck failed 3
times. pgpool seems not to be working
Feb 19 14:52:14 server0 pgpool[3519]: failover: no follow backends are
degenerated
Feb 19 14:52:14 server0 pgpool[3519]: failover: set new primary node: -1
Feb 19 14:52:14 server0 pgpool[3519]: failover: set new master node: 0
Feb 19 14:52:14 server0 pgpool[3980]: connection received:
host=server0.local port=45361
Feb 19 14:52:14 server0 pgpool[3556]: worker process received restart
request
Feb 19 14:52:14 server0 pgpool[3519]: failover done. shutdown host
server1(5432)

As you can see, PostgreSQL did not restart.
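
To make the logged command line easier to reason about, here is a minimal
sketch of how a failover script might parse those arguments and decide
whether a promotion is needed. The meaning attached to each flag (-d failed
node id, -P old primary node id, -H new master host, and so on) is an
assumption inferred from the values in the log above, and the names
parse_failover_args and needs_promotion are hypothetical; this is not the
actual content of failover.py.

```python
import argparse
import shlex


def parse_failover_args(argv):
    """Parse the flags seen in the logged failover command.

    The flag meanings below are assumptions inferred from the log.
    """
    # add_help=False because -h is used for the failed host, not for help.
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("-d", dest="failed_node_id", type=int)
    parser.add_argument("-h", dest="failed_host")
    parser.add_argument("-p", dest="failed_port", type=int)
    parser.add_argument("-D", dest="failed_data_dir")
    parser.add_argument("-m", dest="new_master_id", type=int)
    parser.add_argument("-H", dest="new_master_host")
    parser.add_argument("-M", dest="old_master_id", type=int)
    parser.add_argument("-P", dest="old_primary_id", type=int)
    parser.add_argument("-r", dest="new_master_port", type=int)
    parser.add_argument("-R", dest="new_master_data_dir")
    return parser.parse_args(argv)


def needs_promotion(args):
    """A standby only needs promoting when the failed node was the primary."""
    return args.failed_node_id == args.old_primary_id


# Argument string copied from the "execute command" log entry.
logged = (
    "-d 1 -h server1 -p 5432 -D /opt/postgres/9.2/data "
    "-m 0 -H server0 -M 0 -P 1 -r 5432 -R /opt/postgres/9.2/data"
)
args = parse_failover_args(shlex.split(logged))
print(needs_promotion(args), args.new_master_host)
```

With the arguments from the log, the failed node (1) is also the old
primary (1), so under these assumptions a script like this would be
expected to promote the standby on server0.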

