[pgpool-general: 4738] Re: Failover_command is not executed on standby server
Tatsuo Ishii
ishii at postgresql.org
Sat Jun 25 09:37:44 JST 2016
> Hello.
>
> I am using "load_balance_mode = off".
I think load_balance_mode is irrelevant here.
> When postgresql service on standby node is down, pgpool log file on both
> nodes shows every 10 second:
>
> 2016-06-24 15:12:52: pid 5256: ERROR: Failed to check replication time lag
> 2016-06-24 15:12:52: pid 5256: DETAIL: No persistent db connection for the
> node 1
> 2016-06-24 15:12:52: pid 5256: HINT: check sr_check_user and
> sr_check_password
> 2016-06-24 15:12:52: pid 5256: CONTEXT: while checking replication time lag
> 2016-06-24 15:12:52: pid 5256: LOG: failed to connect to PostgreSQL server
> on "172.16.0.2:5432", getsockopt() detected error "Connection refused"
> 2016-06-24 15:12:52: pid 5256: ERROR: failed to make persistent db
> connection
> 2016-06-24 15:12:52: pid 5256: DETAIL: connection to host:"172.16.0.2:5432"
> failed
It's normal because:
> 2016-06-24 15:12:52: pid 5256: DETAIL: No persistent db connection for the
> node 1
> 2016-06-24 15:12:52: pid 5256: HINT: check sr_check_user and
means the replication delay check is failing (of course it fails, since the standby is down).
Have you enabled health checking? Without it, pgpool does not notice
that PostgreSQL has gone down.
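For reference, a minimal sketch of the health check settings in
pgpool.conf. The parameter names are the standard ones; the values and
the user name are illustrative assumptions, not taken from your setup:

```
# pgpool.conf (sketch; all values below are assumptions)
health_check_period = 10        # seconds between health checks; 0 disables health checking
health_check_timeout = 20       # seconds before a health check attempt is given up
health_check_user = 'pgpool'    # hypothetical PostgreSQL user used for the check
health_check_password = ''      # password for that user, if one is required
health_check_max_retries = 3    # retries before the node is degenerated
health_check_retry_delay = 1    # seconds between retries
```

With health_check_period left at 0, pgpool only learns that a backend
is down when something else tries to use it.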
> If I execute the SQL statement "show pool_nodes", then failover command is
> executed on standby node.
Even without health checking, executing an SQL command can trigger
failover, because pgpool then fails to create a backend connection
(see the log below).
> 2016-06-24 15:14:48: pid 5329: LOG: received degenerate backend request
> for node_id: 1 from pid [5329]
> 2016-06-24 15:14:48: pid 5329: FATAL: failed to create a backend connection
> 2016-06-24 15:14:48: pid 5329: DETAIL: executing failover on backend
> 2016-06-24 15:14:48: pid 1096: LOG: watchdog notifying to start
> interlocking
> 2016-06-24 15:14:48: pid 1096: LOG: watchdog became a new lock holder
> 2016-06-24 15:14:48: pid 1099: LOG: sending watchdog response
> 2016-06-24 15:14:48: pid 1099: DETAIL: WD_STAND_FOR_LOCK_HOLDER received
> but lock holder already exists
> 2016-06-24 15:14:49: pid 1096: LOG: starting degeneration. shutdown host
> 172.16.0.2(5432)
> 2016-06-24 15:14:49: pid 1096: LOG: Restart all children
> 2016-06-24 15:14:49: pid 1096: LOG: execute command:
> /etc/pgpool-II/failover_stream.sh 1 172.16.0.1 /tmp/trigger_file0
> 2016-06-24 15:14:49: pid 5307: LOG: child process received shutdown
> request signal 3
> 2016-06-24 15:14:49: pid 5363: LOG: child process received shutdown
> request signal 3
> 2016-06-24 15:14:49: pid 5314: LOG: child process received shutdown
> request signal 3
>
> (... more similar lines in log file ...)
>
> 2016-06-24 15:14:49: pid 5328: LOG: child process received shutdown
> request signal 3
> 2016-06-24 15:14:49: pid 5093: LOG: child process received shutdown
> request signal 3
> 2016-06-24 15:14:49: pid 1096: LOG: watchdog notifying to end interlocking
> 2016-06-24 15:14:50: pid 1096: LOG: failover: set new primary node: 0
> 2016-06-24 15:14:50: pid 1096: LOG: failover: set new master node: 0
> 2016-06-24 15:14:50: pid 5412: LOG: failback event detected
> 2016-06-24 15:14:50: pid 5412: DETAIL: restarting myself
> failover done. shutdown host 172.16.0.2(5432)2016-06-24 15:14:50: pid 1096:
> LOG: failover done. shutdown host 172.16.0.2(5432)
> 2016-06-24 15:14:50: pid 5256: ERROR: Failed to check replication time lag
> 2016-06-24 15:14:50: pid 5256: DETAIL: No persistent db connection for the
> node 1
> 2016-06-24 15:14:50: pid 5256: HINT: check sr_check_user and
> sr_check_password
> 2016-06-24 15:14:50: pid 5256: CONTEXT: while checking replication time lag
>
>
> I disable watchdog ("use_watchdog = off"), then failover command is
> executed on standby server when postgresql service is down.
>
> Is this working as expected?
>
>
> On Fri, Jun 24, 2016 at 3:52 PM, Tatsuo Ishii <ishii at postgresql.org> wrote:
>
>> Works for me (without watchdog). When I shut down the standby node, it
>> triggers failover.
>>
>> 2016-06-24 22:46:28: pid 24757: LOG: reading and processing packets
>> 2016-06-24 22:46:28: pid 24757: DETAIL: postmaster on DB node 1 was
>> shutdown by administrative command
>> 2016-06-24 22:46:28: pid 24757: LOG: received degenerate backend request
>> for node_id: 1 from pid [24757]
>> 2016-06-24 22:46:28: pid 24740: LOG: starting degeneration. shutdown host
>> /tmp(11003)
>> 2016-06-24 22:46:28: pid 24740: LOG: Restart all children
>> 2016-06-24 22:46:28: pid 24740: LOG: execute command:
>> /home/t-ishii/work/pgpool-II/current/aaa/etc/failover.sh 1 /tmp 11003
>> /home/t-ishii/work/pgpool-II/current/aaa/data1 0 0 /tmp 0 11002
>> /home/t-ishii/work/pgpool-II/current/aaa/da
>>
>> Best regards,
>> --
>> Tatsuo Ishii
>> SRA OSS, Inc. Japan
>> English: http://www.sraoss.co.jp/index_en.php
>> Japanese:http://www.sraoss.co.jp
>>
>> > Hello.
>> > I have two nodes with postgresql with streaming replication and
>> > pgpool with watchdog.
>> >
>> > Using pgpool version 3.4.7 (tataraboshi):
>> > - If postgres service of standby server is down, then failover_command is
>> > not executed (no lines showed in pgpool log file).
>> > - If postgres service of primary server is down, then failover_command is
>> > executed (it is showed in pgpool log file).
>> >
>> > Using pgpool version 3.4.6 (tataraboshi):
>> > - If postgres service of standby server is down, then failover_command is
>> > executed (it is showed in pgpool log file).
>> > - If postgres service of primary server is down, then failover_command is
>> > executed (it is showed in pgpool log file).
>> >
>> > Why is failover_command not executed with version 3.4.7 when the
>> > postgres service on the standby server is down?
>> >
>> > Thanks in advance.
>>