[pgpool-hackers: 4380] Re: Load balancing after failover not working in 3-node HA PostgreSQL cluster

Salman Ahmed salman.ahmed at stormatics.tech
Sun Aug 20 02:53:05 JST 2023


Sorry, I don't think I understand that. Could you please be more specific
about this '' solution?
Just for reference, here are our
follow_primary_command:
https://github.com/stormatics/pg_cirrus/blob/589cc51a12339a13b2a64dc25d7d7e0952e3a6a2/3-node-cluster/ansible/playbooks/setup-pgpool.yml#L45
follow_primary_script file:
https://github.com/stormatics/pg_cirrus/blob/main/3-node-cluster/pgpool-internals/follow_primary.sh

>
> > NOTE, I attempted to resolve the situation by restarting pgpool2 using the
> > -d switch. After the restart, everything seemed to work fine, and standby2
> > node was correctly marked as standby again. Is restart of pgpool really
> > necessary?
>
> No. You just should have waited a little bit longer so that the follow
> primary command completed the job: recovering node 2.

I waited almost an hour after the failover, but I am still getting the above
log saying that there is no standby node.
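
If the '' suggestion means leaving the setting empty, then my reading (an
assumption on my part, not something confirmed in this thread) is that the
relevant pgpool.conf line would look like the sketch below:

```
# pgpool.conf (sketch only; assumes streaming replication mode)
# Assumption: an empty value means no follow primary command is run, so
# pgpool does not detach the remaining standbys after a failover.
follow_primary_command = ''
```

With the command empty, nothing would re-point standby2 at the new primary
automatically, so it would have to be able to follow the new primary on its
own or be recovered by hand.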

On Sat, Aug 19, 2023 at 5:28 PM Dev BotX <devbotx5 at gmail.com> wrote:

>
>
> On Sat, 19 Aug 2023 at 08:03, Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
>
>> > Hi Tatsuo
>> > I've observed some logs following an auto failover that I'd like to discuss
>> >
>> > 2023-08-19 00:44:07.277: main pid 62145: LOG: find_primary_node: primary node is 1
>> > 2023-08-19 00:44:07.277: main pid 62145: LOG: find_primary_node: standby node is 2
>> > 2023-08-19 00:44:07.278: main pid 62145: LOG: starting follow degeneration. shutdown host 172.16.14.165(5432)
>> > 2023-08-19 00:44:07.279: main pid 62145: LOG: starting follow degeneration. shutdown host 172.16.14.163(5432)
>> > 2023-08-19 00:44:07.279: main pid 62145: LOG: failover: 2 follow backends have been degenerated
>> > 2023-08-19 00:44:07.280: main pid 62145: LOG: failover: set new primary node: 1
>> >
>> > do_query: extended:0 query:"SELECT pg_is_in_recovery()"
>> > 2023-08-19 00:44:08.305: sr_check_worker pid 62243: DEBUG: verify_backend_node_status: there's no standby node
>> > 2023-08-19 00:44:08.305: sr_check_worker pid 62243: DEBUG: node status[0]: 0
>> > 2023-08-19 00:44:08.305: sr_check_worker pid 62243: DEBUG: node status[1]: 1
>> > 2023-08-19 00:44:08.305: sr_check_worker pid 62243: DEBUG: node status[2]: 0
>> >
>> >
>> > In the above logs, it's evident that the initial primary node at IP address
>> > 172.16.14.165 (which is currently down) and the standby2 node at
>> > 172.16.14.163 were involved in the auto failover process.
>> > My concern is: why did pgpool initiate the shutdown of the standby2 node as
>> > part of the auto failover process?
>>
>> Because in general standby servers cannot connect to the new primary
>> without obtaining a copy of the new primary database.
>>
>> > And how can we prevent such a degeneration
>> > from happening to standby2 in our case?
>>
>> You can set follow_primary_command to '' (an empty string) to prevent
>> standbys from being killed by pgpool.
>>
> Sorry, I don't think I understand that. Could you please be more specific
> about this '' solution?
> Just for reference, here are our
> follow_primary_command:
> https://github.com/stormatics/pg_cirrus/blob/589cc51a12339a13b2a64dc25d7d7e0952e3a6a2/3-node-cluster/ansible/playbooks/setup-pgpool.yml#L45
> follow_primary_script file:
> https://github.com/stormatics/pg_cirrus/blob/main/3-node-cluster/pgpool-internals/follow_primary.sh
>
>>
>> > NOTE, I attempted to resolve the situation by restarting pgpool2 using the
>> > -d switch. After the restart, everything seemed to work fine, and standby2
>> > node was correctly marked as standby again. Is restart of pgpool really
>> > necessary?
>>
>> No. You just should have waited a little bit longer so that the follow
>> primary command completed the job: recovering node 2.
>
> I waited almost an hour after the failover, but I am still getting the above
> log saying that there is no standby node.
>
>> Please remember that the follow primary command keeps on running after the
>> failover completed.
>>
>> Best regards,
>> --
>> Tatsuo Ishii
>> SRA OSS LLC
>> English: http://www.sraoss.co.jp/index_en/
>> Japanese:http://www.sraoss.co.jp
>>
>