[pgpool-general: 6511] Re: how does pgpool manage the /tmp/pgpool_status file or status changes in general?

Tue Apr 16 06:18:18 JST 2019

Hi

On Tue, Apr 16, 2019 at 12:18 AM Rob Reinhardt <rreinhardt at eitccorp.com>
wrote:

> I performed a simple test today with everything up in normal state.
> Shutdown the primary server's db instance, and pgpool detected it (that's
> fine). At the moment I don't have any automatic actions taking place
> failover or otherwise. And I didn't do any manual ones.
>
> I simply restarted my primary instance, and repmgr shows a nice green
> cluster.
>
> At that point, pgpool was confused and incorrect. It still thought the
> primary node was down, so I guess it doesn't check again. show pool_nodes
> won't connect at this point either and pgpool logs continually retrying to
> find a primary node.
>

When pgpool detects a backend node failure, it ejects the failed node from
the pool. This is done to protect against split-brain and possible data
corruption. Even if the failed node comes back, it is not automatically
reattached to pgpool; you have to attach it again manually.
>

> Then if I restart pgpool like the hint in pgpool's log says to do, it
> comes back and still can't see that there is a primary.  show pool_nodes
> also still can't connect.
>
> So then, I troubleshoot and I take a look at /tmp/pgpool_status, it says:
> down
> up
> up
>
> On a lark, I try shutting down pgpool again, then deleting that file, and
> then restarted it.
>

You can start pgpool with -D or --discard-status to discard the status
file, so there is no need to delete it by hand.
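
To see which nodes pgpool will treat as down on its next start, you can
inspect the status file before deciding whether to discard it. Below is a
minimal Python sketch, assuming the plain-text layout shown in your message
(one status word per line, in backend node order). The helper names
parse_pool_status and down_nodes are hypothetical and not part of pgpool.

```python
def parse_pool_status(text):
    """Map backend node id -> status word.

    Assumes one status word per non-empty line, in node order,
    as in the quoted /tmp/pgpool_status contents (down / up / up).
    """
    lines = [line.strip().lower() for line in text.splitlines() if line.strip()]
    return {node_id: status for node_id, status in enumerate(lines)}


def down_nodes(statuses):
    """Return the node ids whose saved status is 'down', in ascending order."""
    return sorted(node_id for node_id, status in statuses.items() if status == "down")


# Example with the contents quoted in this thread:
# statuses = parse_pool_status("down\nup\nup\n")
# down_nodes(statuses) lists the nodes pgpool would keep detached on restart.
```

If node 0 shows up as down here while PostgreSQL is actually healthy again,
that is the stale state that -D throws away on startup.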

>
> This time it comes up and does check and show pool_nodes can connect and
> has the correct cluster status.
>
> So...
> I'm too new and naive to pgpool to assume anything is meant to be or not,
> so here are some stupid questions:
>
> 1) Shouldn't I expect pgpool to just handle this without requiring a
> restart? Why isn't it re-checking and re-updating status?  I mean it knows
> an incident just occurred, why wouldn't it keep rechecking the real status
> and update itself automatically? Or is it supposed to and this is just a
> bug?
>

As described above, this is by design, to guard against data corruption and
split-brain scenarios.
Pgpool provides two mechanisms to attach or reattach a backend node without
restarting pgpool. One is the *pcp_attach_node* command, which is useful for
attaching a node after you have restored it manually. The other is pgpool's
online recovery.
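
As a rough illustration of the first option, here is a small Python sketch
that builds and runs a pcp_attach_node call for a given node id. The host,
port (9898) and PCP user name below are assumptions and must match your own
pcp settings; the helper names are hypothetical. Unless a .pcppass file is
set up, the command will prompt for the PCP password.

```python
import subprocess


def pcp_attach_command(node_id, host="localhost", port=9898, user="pgpool"):
    """Build the argument list for pcp_attach_node.

    host, port and user are placeholder defaults (assumptions);
    replace them with the values from your own pcp configuration.
    """
    return [
        "pcp_attach_node",
        "-h", host,
        "-p", str(port),
        "-U", user,
        "-n", str(node_id),
    ]


def attach_node(node_id, **kwargs):
    """Run pcp_attach_node and return its exit code (0 means success)."""
    result = subprocess.run(pcp_attach_command(node_id, **kwargs), check=False)
    return result.returncode


# Example: reattach backend node 0 after confirming PostgreSQL is healthy.
# exit_code = attach_node(0)
```

Only attach a node once you have verified that it is in the correct role
and in sync, since reattaching is exactly the step pgpool refuses to do on
its own.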

> 2) Even if I assume that manually restarting pgpool has to happen every
> time my cluster status changes (for whatever reason, planned or otherwise).
> then why do I have to manually remove that file to get pgpool to see the
> right state of things when it comes back up? Or am I not supposed to have to and
> this is just a bug?
> 3) If all of that is true and just the way it has to be, why doesn't the
> systemd start script for the pgpool.service that comes with it, have code
> to always remove that file on startup so that it CAN get a clean start? I'd
> have to assume that it is not intended that this file need to be manually
> managed or scripted in the startup. Or is that a service script bug?
> 4) And finally, if none of that is bugs and is just the way it is designed
> to work, do you foresee any problems with me adding code to the service
> script to remove that file each time it starts up in order to force it to
> automatically check out the real cluster status when it comes up.
>

Thanks
Best Regards
Muhammad Usama

>
> --
> Thanks,
> Rob
> _______________________________________________
> pgpool-general mailing list
> pgpool-general at pgpool.net
> http://www.pgpool.net/mailman/listinfo/pgpool-general
>