<html><body><div style="color:#000; background-color:#fff; font-family:times new roman, new york, times, serif;font-size:14pt"><div><span>Thanks a million, Sir. You have been of great help to me, and a quick suggestion from my team lead finally solved this long-pending issue. Cheers!</span></div><div style="color: rgb(0, 0, 0); font-size: 18.6667px; font-family: times new roman,new york,times,serif; background-color: transparent; font-style: normal;"><br><span></span></div><div style="color: rgb(0, 0, 0); font-size: 18.6667px; font-family: times new roman,new york,times,serif; background-color: transparent; font-style: normal;"><span>I did a fresh install of the latest postgres-9.3.2 from source, along with the latest pgpool-II-3.3.2 and pgpoolAdmin-3.3.0. <br></span></div><div><br>The e1012 error was resolved by adding one line just before the rsync command in the "basebackup.sh" files on both postgres servers. That was all it took, and my pgpool with streaming replication worked like

 a charm. <br><br>This is the extra "basebackup.sh" line, added at the suggestion of my team lead:<br><br>***************************************************<br><br>ssh -T $recovery_node_host_name rm -rf $recovery_db_cluster/pg_xlog <br><br>*****************************************************<br>As far as I can tell, this helps because the rsync step excludes pg_xlog, so the old pg_xlog directory was being left behind on the recovery node, which is why the log showed mkdir failing with "File exists".<br><br>I will definitely need your expert guidance and support in the future testing and performance tuning phases of my postgres and pgpool setup.<br><br></div><div>Thanks and Regards,</div><div>Syed Irfan.</div>Sr. Developer<br><div><br></div><div style="display: block;" class="yahoo_quoted"> <br> <br> <div style="font-family: times new roman, new york, times, serif; font-size: 14pt;"> <div style="font-family: times new roman, new york, times, serif; font-size: 12pt;"> <div dir="ltr"> <font face="Arial" size="2"> On Thursday, 30 January 2014 5:21 AM, Tatsuo Ishii &lt;ishii@postgresql.org&gt; wrote:<br> </font> </div>  <div class="y_msg_container">It seems the cause of your

 problem is this:<br clear="none"><br clear="none">> /usr/local/pgsql/data/basebackup.sh: line 33: psql: command not found<br clear="none"><br clear="none">Please fix it.<br clear="none"><br clear="none">Best regards,<br clear="none">--<br clear="none">Tatsuo Ishii<br clear="none">SRA OSS, Inc. Japan<br clear="none">English: <a shape="rect" href="http://www.sraoss.co.jp/index_en.php" target="_blank">http://www.sraoss.co.jp/index_en.php</a><br clear="none">Japanese: <a shape="rect" href="http://www.sraoss.co.jp/" target="_blank">http://www.sraoss.co.jp</a><div class="yqt8730170964" id="yqtfd77991"><br clear="none"><br clear="none">> Dear Tatsuo,<br clear="none">> <br clear="none">>       Thanks for your reply. I have followed your recommendations, and this time the error code e1012 pops up not on the third try but on the second, within 2 seconds of clicking the recovery button. <br clear="none">> <br

 clear="none">> First try of the Recovery button, when the primary was down<br clear="none">> Recovery succeeded on 172.16.80.49 (after it was taken down manually); the backup log of 172.16.80.47 is as follows:<br clear="none">> <br clear="none">> ******************************************************************<br clear="none">>  pg_start_backup <br clear="none">> -----------------<br clear="none">>  1/3C000020<br clear="none">> (1 row)<br clear="none">> <br clear="none">> mkdir: cannot create directory `/usr/local/pgsql/data/pg_xlog': File exists<br clear="none">> NOTICE:  WAL archiving is not enabled; you must ensure that all required WAL segments are copied through other means to complete the backup<br clear="none">>  pg_stop_backup <br clear="none">> ----------------<br clear="none">>  1/3C0000D8<br clear="none">> (1 row)<br clear="none">>

 ******************************************************************<br clear="none">> <br clear="none">> Second try of the Recovery button, when the new primary was taken down manually<br clear="none">> Recovery failed on 172.16.80.47 (after it was taken down manually); the backup log of 172.16.80.49 is as follows:<br clear="none">> <br clear="none">> ******************************************************************<br clear="none">> /usr/local/pgsql/data/basebackup.sh: line 12: psql: command not found<br clear="none">> mkdir: cannot create directory `/usr/local/pgsql/data/pg_xlog': File exists<br clear="none">> /usr/local/pgsql/data/basebackup.sh: line 33: psql: command not found<br clear="none">> ******************************************************************<br clear="none">> <br clear="none">> The basebackup.sh on both servers is as follows, with the added line that writes basebackup.log and with recovery_target_timeline = 'latest' uncommented:<br

 clear="none">> <br clear="none">> *******************************************************************<br clear="none">> #/bin/sh -x<br clear="none">> exec > /tmp/basebackup.log 2>&1<br clear="none">> # XXX We assume master and recovery host uses the same port number<br clear="none">> PORT=5432<br clear="none">> master_node_host_name=`hostname`<br clear="none">> master_db_cluster=$1<br clear="none">> recovery_node_host_name=$2<br clear="none">> recovery_db_cluster=$3<br clear="none">> tmp=/tmp/mytemp$$<br clear="none">> trap "rm -f $tmp" 0 1 2 3 15<br clear="none">> <br clear="none">> psql -p $PORT -c "SELECT pg_start_backup('Streaming Replication', true)" postgres<br clear="none">> <br clear="none">> rsync -C -a -c --delete --exclude postgresql.conf --exclude postmaster.pid \<br clear="none">> --exclude postmaster.opts --exclude pg_log \<br clear="none">> --exclude recovery.conf --exclude

 recovery.done \<br clear="none">> --exclude pg_xlog \<br clear="none">> $master_db_cluster/ $recovery_node_host_name:$recovery_db_cluster<br clear="none">> <br clear="none">> ssh -T $recovery_node_host_name mkdir $recovery_db_cluster/pg_xlog<br clear="none">> ssh -T $recovery_node_host_name chmod 700 $recovery_db_cluster/pg_xlog<br clear="none">> ssh -T $recovery_node_host_name rm -f $recovery_db_cluster/recovery.done<br clear="none">> <br clear="none">> cat > $tmp <<EOF<br clear="none">> recovery_target_timeline = 'latest'<br clear="none">> standby_mode          = 'on'<br clear="none">> primary_conninfo      = 'host=$master_node_host_name port=$PORT user=postgres'<br clear="none">> trigger_file = '/var/log/pgpool/trigger/trigger_file1'<br clear="none">> EOF<br clear="none">> <br clear="none">> scp $tmp

 $recovery_node_host_name:$recovery_db_cluster/recovery.conf<br clear="none">> <br clear="none">> psql -p $PORT -c "SELECT pg_stop_backup()" postgres<br clear="none">> *******************************************************************<br clear="none">> <br clear="none">>        <br clear="none">>      Also, the reason for commenting out (recovery_target_timeline = 'latest') was that it is not mentioned in your "Simple Streaming replication setting with pgpool-II(multiple servers version)" <a shape="rect" href="http://www.pgpool.net/pgpool-web/contrib_docs/simple_sr_setting2_3.0/" target="_blank">http://www.pgpool.net/pgpool-web/contrib_docs/simple_sr_setting2_3.0/ </a>page. <br clear="none">>      But after a long search on the net I found someone adding the line (recovery_target_timeline = 'latest'), so I added it as a test, and when it did not solve the problem I commented it out.<br clear="none">> <br clear="none">>      I request your help with this issue as soon as possible.<br clear="none">> <br clear="none">> Best Regards,<br clear="none">> Syed Irfan<br clear="none">> Sr Developer<br clear="none">> <br clear="none">>  <br clear="none">> Thanks and Regards,<br clear="none">> Syed Irfan.<br clear="none">> <br clear="none">> Sr. Developer<br clear="none">> <br clear="none">> <br clear="none">> <br clear="none">> <br clear="none">> <br clear="none">> On Tuesday, 28 January 2014 4:52 AM, Tatsuo Ishii <<a shape="rect" ymailto="mailto:ishii@postgresql.org" href="mailto:ishii@postgresql.org">ishii@postgresql.org</a>> wrote:<br clear="none">>  <br clear="none">>> This is what the recovery.conf looks like.<br clear="none">>> *******************************************<br clear="none">>> #recovery_target_timeline = 'latest'<br

 clear="none">>> standby_mode          = 'on'<br clear="none">>> primary_conninfo      = 'host=postgres-p.rolta.com port=5432 user=postgres'<br clear="none">>> trigger_file = '/var/log/pgpool/trigger/trigger_file1'<br clear="none">>> ********************************************************<br clear="none">> <br clear="none">> Why did you remove "recovery_target_timeline = 'latest'"?<br clear="none">> <br clear="none">> I suggest taking an execution log of the script. Change the very<br clear="none">> beginning of the script from:<br clear="none">> <br clear="none">> #/bin/sh -x<br clear="none">> <br clear="none">> to:<br clear="none">> <br clear="none">> #/bin/sh -x<br clear="none">> exec > /tmp/basebackup.log 2>&1<br clear="none">> <br clear="none">> and please show us the content of /tmp/basebackup.log after executing pcp_recovery_node.<br

 clear="none">> <br clear="none">> Best regards,<br clear="none">> --<br clear="none">> Tatsuo Ishii<br clear="none">> SRA OSS, Inc. Japan<br clear="none">> English: <a shape="rect" href="http://www.sraoss.co.jp/index_en.php" target="_blank">http://www.sraoss.co.jp/index_en.php</a><br clear="none">> Japanese: <a shape="rect" href="http://www.sraoss.co.jp/" target="_blank">http://www.sraoss.co.jp</a><br clear="none">> <br clear="none">> <br clear="none">> <br clear="none">>> The basebackup.sh on both postgres databases is as follows<br clear="none">>> **************************************************<br clear="none">>> <br clear="none">>> #/bin/sh -x<br clear="none">>> #<br clear="none">>> # XXX We assume master and recovery host uses the same port number<br clear="none">>> PORT=5432<br clear="none">>> master_node_host_name=`hostname`<br clear="none">>>

 master_db_cluster=$1<br clear="none">>> recovery_node_host_name=$2<br clear="none">>> recovery_db_cluster=$3<br clear="none">>> tmp=/tmp/mytemp$$<br clear="none">>> trap "rm -f $tmp" 0 1 2 3 15<br clear="none">>> <br clear="none">>> psql -p $PORT -c "SELECT pg_start_backup('Streaming Replication', true)" postgres<br clear="none">>> <br clear="none">>> rsync -C -a -c --delete --exclude postgresql.conf --exclude postmaster.pid \<br clear="none">>> --exclude postmaster.opts --exclude pg_log \<br clear="none">>> --exclude recovery.conf --exclude recovery.done \<br clear="none">>> --exclude pg_xlog \<br clear="none">>> $master_db_cluster/ $recovery_node_host_name:$recovery_db_cluster<br clear="none">>> <br clear="none">>> ssh -T $recovery_node_host_name mkdir $recovery_db_cluster/pg_xlog<br clear="none">>> ssh -T $recovery_node_host_name chmod 700

 $recovery_db_cluster/pg_xlog<br clear="none">>> ssh -T $recovery_node_host_name rm -f $recovery_db_cluster/recovery.done<br clear="none">>> <br clear="none">>> cat > $tmp <<EOF<br clear="none">>> #recovery_target_timeline = 'latest'<br clear="none">>> standby_mode          = 'on'<br clear="none">>> primary_conninfo      = 'host=$master_node_host_name port=$PORT user=postgres'<br clear="none">>> trigger_file = '/var/log/pgpool/trigger/trigger_file1'<br clear="none">>> EOF<br clear="none">>> <br clear="none">>> scp $tmp $recovery_node_host_name:$recovery_db_cluster/recovery.conf<br clear="none">>> <br clear="none">>> psql -p $PORT -c "SELECT pg_stop_backup()" postgres<br clear="none">>> ***********************************************<br clear="none">>>  <br clear="none">>> Thanks and Regards,<br

 clear="none">>> Syed Irfan.<br clear="none">>> <br clear="none">>> Sr. Developer<br clear="none">>> <br clear="none">>> <br clear="none">>> <br clear="none">>> <br clear="none">>> <br clear="none">>> On Thursday, 23 January 2014 11:07 PM, Jeff Frost <<a shape="rect" ymailto="mailto:jeff@pgexperts.com" href="mailto:jeff@pgexperts.com">jeff@pgexperts.com</a>> wrote:<br clear="none">>>  <br clear="none">>> <br clear="none">>> <br clear="none">>> On Jan 23, 2014, at 9:32 AM, Syed Irfan <<a shape="rect" ymailto="mailto:syedirfan_77@yahoo.com" href="mailto:syedirfan_77@yahoo.com">syedirfan_77@yahoo.com</a>> wrote:<br clear="none">>> <br clear="none">>> Dear Tatsuo Ishii,<br clear="none">>>><br clear="none">>>><br clear="none">>>>       I am still awaiting your reply on this issue. I have tried your suggestions, but I am still unable to run the Recovery process successfully the third time. It surprises me that it works the first time but the same thing fails on the third attempt.<br clear="none">>>><br clear="none">>>><br clear="none">>>>The Postgres log shows the following:<br clear="none">>>><br clear="none">>>><br clear="none">>>><br clear="none">>>><br clear="none">>>>>> 28038 2014-01-09 21:28:33 BDT FATAL:  timeline 35 of the primary does not match recovery target timeline 36<br clear="none">>>>>>> 28039 2014-01-09 21:28:38 BDT FATAL:  timeline 35 of the primary does not match recovery target timeline 36<br clear="none">>>><br clear="none">>>><br clear="none">>>><br clear="none">>>>I urgently request your help with this pending issue.<br clear="none">>>><br clear="none">>> <br

 clear="none">>> This is usually caused by postgres trying to replay WAL files from the wrong source.  Did you clean out the pg_xlog directory on the replica before taking the base backup?<br clear="none">>> <br clear="none">>> What does your recovery.conf look like?</div><br><br></div>  </div> </div>  </div> </div></body></html>