[pgpool-hackers: 3935] Re: Problem with detach_false_primary/follow_primary_command

Muhammad Usama m.usama at gmail.com
Thu Jun 17 20:05:30 JST 2021


Hi Ishii-San



On Wed, Jun 16, 2021 at 3:15 PM Tatsuo Ishii <ishii at sraoss.co.jp> wrote:

> Hi Usama,
>
> Unfortunately the patch did not apply cleanly on current master
> branch:
>

Sorry, apparently I hadn't created the patch from the current master head.
Attached is the rebased version.

>
> $ git apply ~/wd_coordinating_follow_and_detach__primary.patch
> error: patch failed: src/include/pool.h:426
> error: src/include/pool.h: patch does not apply
>
> So I have not actually tested the patch, but the idea of locking at
> the watchdog level seems more robust than my idea (executing the false
> primary check only on the coordinator node).
>

Thanks for the confirmation. I have confirmed that the regression tests
pass with the patch, but I think I need some more testing before I can
commit it.

Best regards
Muhammad Usama


> Best regards,
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> English: http://www.sraoss.co.jp/index_en.php
> Japanese:http://www.sraoss.co.jp
>
> > Hi Ishii-San
> >
> > As discussed over Slack, I have cooked up a POC patch for implementing
> > the follow_primary locking over the watchdog channel.
> >
> > The idea is that, just before executing follow_primary during the
> > failover process, we direct all standby watchdog nodes to acquire the
> > same lock on their respective nodes, so that they stop false primary
> > detection while follow_primary is being executed on the watchdog
> > coordinator node.
> >
> > Moreover, to keep the watchdog process from blocking while waiting for
> > the lock, I have introduced a pending remote lock mechanism, so that
> > remote locks can be acquired in the background after the in-flight
> > replication checks complete.
> >
> > Finally, I have removed the REQ_DETAIL_CONFIRMED flag from the
> > degenerate_backend_set() request that gets issued to detach the false
> > primary. That means all quorum and consensus rules will need to be
> > satisfied for the detach to happen.
> >
> > I haven't done rigorous testing or run the regression tests with the
> > patch; I am sharing the initial version with you to get your consensus
> > on the basic idea and design.
> >
> > Could you kindly take a look and see whether you agree with the approach?
> >
> > Thanks
> > Best regards
> > Muhammad Usama
> >
> >
> > On Fri, May 7, 2021 at 9:47 AM Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
> >
> >> I am going to commit/push the patches to master down to 4.0 stable
> >> (detach_false_primary was introduced in 4.0) branches if there's no
> >> objection.
> >>
> >> Best regards,
> >> --
> >> Tatsuo Ishii
> >> SRA OSS, Inc. Japan
> >> English: http://www.sraoss.co.jp/index_en.php
> >> Japanese:http://www.sraoss.co.jp
> >>
> >> From: Tatsuo Ishii <ishii at sraoss.co.jp>
> >> Subject: [pgpool-hackers: 3893] Re: Problem with
> >> detach_false_primary/follow_primary_command
> >> Date: Tue, 04 May 2021 13:09:23 +0900 (JST)
> >> Message-ID: <20210504.130923.644768896074013686.t-ishii at gmail.com>
> >>
> >> > In the previous mail I have explained the problem and proposed a patch
> >> > for the issue.
> >> >
> >> > However, the original reporter also said the problem occurs in a more
> >> > complex way if watchdog is enabled.
> >> >
> >> > https://www.pgpool.net/pipermail/pgpool-general/2021-April/007590.html
> >> >
> >> > In summary it seems multiple pgpool nodes perform detach_false_primary
> >> > concurrently and this is the cause of the problem. I think there's no
> >> > reason to perform detach_false_primary in multiple pgpool nodes
> >> > concurrently. Rather we should perform detach_false_primary only on
> >> > the leader node. If this is correct, we also should not perform
> >> > detach_false_primary if the quorum is absent because there's no leader
> >> > if the quorum is absent. Attached is the patch to introduce the check
> >> > in addition to the v2 patch.
> >> >
> >> > I would like to hear opinions from other pgpool developers on
> >> > whether we should apply the v3 patch to existing branches. I am asking
> >> > because currently we perform detach_false_primary even if the quorum
> >> > is absent, and the change may be a "change of user visible behavior",
> >> > which we usually avoid on stable branches. However, since the current
> >> > detach_false_primary apparently does not work in an environment where
> >> > watchdog is enabled, I think patching the back branches is,
> >> > exceptionally, a reasonable choice.
> >> >
> >> > Also I have added the regression test patch.
> >> >
> >> >> In the posting:
> >> >>
> >> >> [pgpool-general: 7525] Strange behavior on switchover with detach_false_primary enabled
> >> >>
> >> >> it is reported that detach_false_primary and follow_primary_command
> >> >> can conflict with each other, leaving pgpool in an unwanted state. We
> >> >> can reproduce the issue by using pgpool_setup to create a 3-node
> >> >> configuration.
> >> >>
> >> >> $ pgpool_setup -n 3
> >> >>
> >> >> echo "detach_false_primary = on" >> etc/pgpool.conf
> >> >> echo "sr_check_period = 1" >> etc/pgpool.conf
> >> >>
> >> >> The latter may not be mandatory, but running the streaming replication
> >> >> check frequently reproduces the problem reliably, because
> >> >> detach_false_primary is executed in the streaming replication check
> >> >> process.
> >> >>
> >> >> The initial state is as follows:
> >> >>
> >> >> psql -p 11000 -c "show pool_nodes" test
> >> >>  node_id | hostname | port  | status | pg_status | lb_weight |  role   | pg_role | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
> >> >> ---------+----------+-------+--------+-----------+-----------+---------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
> >> >>  0       | /tmp     | 11002 | up     | up        | 0.333333  | primary | primary | 0          | true              | 0                 |                   |                        | 2021-05-04 11:12:01
> >> >>  1       | /tmp     | 11003 | up     | up        | 0.333333  | standby | standby | 0          | false             | 0                 | streaming         | async                  | 2021-05-04 11:12:01
> >> >>  2       | /tmp     | 11004 | up     | up        | 0.333333  | standby | standby | 0          | false             | 0                 | streaming         | async                  | 2021-05-04 11:12:01
> >> >> (3 rows)
> >> >>
> >> >> Execute pcp_detach_node against node 0.
> >> >>
> >> >> $ pcp_detach_node -w -p 11001 0
> >> >>
> >> >> This puts the primary into down status and promotes node 1.
> >> >>
> >> >> 2021-05-04 12:12:14: pcp_child pid 31449: LOG:  received degenerate backend request for node_id: 0 from pid [31449]
> >> >> 2021-05-04 12:12:14: main pid 31221: LOG:  Pgpool-II parent process has received failover request
> >> >> 2021-05-04 12:12:14: main pid 31221: LOG:  starting degeneration. shutdown host /tmp(11002)
> >> >> 2021-05-04 12:12:14: pcp_main pid 31260: LOG:  PCP process with pid: 31449 exit with SUCCESS.
> >> >> 2021-05-04 12:12:14: pcp_main pid 31260: LOG:  PCP process with pid: 31449 exits with status 0
> >> >> 2021-05-04 12:12:14: main pid 31221: LOG:  Restart all children
> >> >> 2021-05-04 12:12:14: main pid 31221: LOG:  execute command: /home/t-ishii/work/Pgpool-II/current/x/etc/failover.sh 0 /tmp 11002 /home/t-ishii/work/Pgpool-II/current/x/data0 1 0 /tmp 0 11003 /home/t-ishii/work/Pgpool-II/current/x/data1
> >> >>
> >> >> However, detach_false_primary found that the just-promoted node 1 is
> >> >> not good, because it does not have any follower standby node yet:
> >> >> follow_primary_command has not completed.
> >> >>
> >> >> 2021-05-04 12:12:14: sr_check_worker pid 31261: LOG:  verify_backend_node_status: primary 1 does not connect to standby 2
> >> >> 2021-05-04 12:12:14: sr_check_worker pid 31261: LOG:  verify_backend_node_status: primary 1 owns only 0 standbys out of 1
> >> >> 2021-05-04 12:12:14: sr_check_worker pid 31261: LOG:  pgpool_worker_child: invalid node found 1
> >> >>
> >> >> And detach_false_primary sent a failover request for node 1.
> >> >>
> >> >> 2021-05-04 12:12:14: sr_check_worker pid 31261: LOG:  received degenerate backend request for node_id: 1 from pid [31261]
> >> >>
> >> >> Moreover, every second detach_false_primary tries to detach node 1.
> >> >>
> >> >> 2021-05-04 12:12:15: sr_check_worker pid 31261: LOG:  verify_backend_node_status: primary 1 does not connect to standby 2
> >> >> 2021-05-04 12:12:15: sr_check_worker pid 31261: LOG:  verify_backend_node_status: primary 1 owns only 0 standbys out of 1
> >> >> 2021-05-04 12:12:15: sr_check_worker pid 31261: LOG:  pgpool_worker_child: invalid node found 1
> >> >> 2021-05-04 12:12:15: sr_check_worker pid 31261: LOG:  received degenerate backend request for node_id: 1 from pid [31261]
> >> >>
> >> >> This confuses the whole follow_primary_command and in the end we have:
> >> >>
> >> >> psql -p 11000 -c "show pool_nodes" test
> >> >>  node_id | hostname | port  | status | pg_status | lb_weight |  role   | pg_role | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
> >> >> ---------+----------+-------+--------+-----------+-----------+---------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
> >> >>  0       | /tmp     | 11002 | down   | down      | 0.333333  | standby | unknown | 0          | false             | 0                 |                   |                        | 2021-05-04 12:12:16
> >> >>  1       | /tmp     | 11003 | up     | up        | 0.333333  | standby | standby | 0          | false             | 0                 |                   |                        | 2021-05-04 12:22:28
> >> >>  2       | /tmp     | 11004 | up     | up        | 0.333333  | standby | standby | 0          | true              | 0                 |                   |                        | 2021-05-04 12:22:28
> >> >> (3 rows)
> >> >>
> >> >> Of course, this is a totally unwanted result.
> >> >>
> >> >> I think the root cause of the problem is that detach_false_primary and
> >> >> follow_primary_command are allowed to run concurrently. To solve the
> >> >> problem we need a lock, so that if detach_false_primary is already
> >> >> running, follow_primary_command waits for its completion, or vice
> >> >> versa.
> >> >>
> >> >> For this purpose I propose the attached patch,
> >> >> detach_false_primary_v2.diff. The patch introduces two new functions,
> >> >> pool_acquire_follow_primary_lock(bool block) and
> >> >> pool_release_follow_primary_lock(void), which are
> >> >> responsible for acquiring and releasing the lock. There are 3 places
> >> >> where those functions are used:
> >> >>
> >> >> 1) find_primary_node
> >> >>
> >> >> This function is called upon startup and failover in the main pgpool
> >> >> process to find the new primary node.
> >> >>
> >> >> 2) failover
> >> >>
> >> >> This function is called in the follow_primary_command subprocess
> >> >> forked off by the pgpool main process to execute the
> >> >> follow_primary_command script. The lock should be held until all
> >> >> follow_primary_command executions are completed.
> >> >>
> >> >> 3) streaming replication check
> >> >>
> >> >> Before starting verify_backend_node_status, which is the workhorse of
> >> >> detach_false_primary, the lock must be acquired. If that fails, just
> >> >> skip the streaming replication check cycle.
> >> >>
> >> >>
> >> >> The user who made the initial report and I confirmed that the patch
> >> >> works well.
> >> >>
> >> >> Unfortunately, this is not the whole story. However, this mail is
> >> >> already too long. I will continue in the next mail.
> >> >>
> >> >> Best regards,
> >> >> --
> >> >> Tatsuo Ishii
> >> >> SRA OSS, Inc. Japan
> >> >> English: http://www.sraoss.co.jp/index_en.php
> >> >> Japanese:http://www.sraoss.co.jp
> >> _______________________________________________
> >> pgpool-hackers mailing list
> >> pgpool-hackers at pgpool.net
> >> http://www.pgpool.net/mailman/listinfo/pgpool-hackers
> >>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: wd_coordinating_follow_and_detach__primary_rebased.patch
Type: application/octet-stream
Size: 18047 bytes
Desc: not available
URL: <http://www.pgpool.net/pipermail/pgpool-hackers/attachments/20210617/89303ad1/attachment-0001.obj>