[pgpool-general: 6508] Re: New pgpool-II-95 4.0.4 install in front of a 3-node repmgr95 postgresql-9.5 cluster - not finding all the nodes

Rob Reinhardt rreinhardt at eitccorp.com
Sat Apr 13 04:10:13 JST 2019


Thank you Pierre.  That fixed that problem.
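
For the archive, here is a rough Python sketch of a check that would have
caught the mistake Pierre pointed out (the same backend_hostname0 key
assigned three times). The function name duplicate_keys and the sample text
are made up for illustration. Only '192.x.y.a' comes from my sanitized
config; the .b and .c addresses are placeholders. The parsing is
deliberately naive: it only looks at "key = value" lines and drops anything
after a #.

```python
import re
from collections import Counter

def duplicate_keys(conf_text):
    """Return the sorted list of keys assigned more than once."""
    keys = []
    for line in conf_text.splitlines():
        # Naive comment stripping; good enough for spotting repeated keys.
        line = line.split("#", 1)[0].strip()
        match = re.match(r"([A-Za-z_][A-Za-z0-9_]*)\s*=", line)
        if match:
            keys.append(match.group(1))
    return sorted(k for k, n in Counter(keys).items() if n > 1)

# Hypothetical excerpt reproducing the mistake: suffix 0 used for every node.
broken = """
backend_hostname0 = '192.x.y.a'
backend_port0 = 5432
backend_hostname0 = '192.x.y.b'
backend_port0 = 5432
backend_hostname0 = '192.x.y.c'
backend_port0 = 5432
"""

# Same excerpt with one numeric suffix per backend node.
fixed = """
backend_hostname0 = '192.x.y.a'
backend_port0 = 5432
backend_hostname1 = '192.x.y.b'
backend_port1 = 5432
backend_hostname2 = '192.x.y.c'
backend_port2 = 5432
"""

print(duplicate_keys(broken))
print(duplicate_keys(fixed))
```

With the repeated suffix, only one backend definition can survive, which
would be consistent with "show pool_nodes" listing a single node.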

I have another one, though.  It will not stay running.  Whether it is just
sitting there doing nothing or just monitoring the node status every 10
seconds, it claims to have received a stop request and shuts itself down.

I'm on Redhat 7.5 if that helps.

----
(I deleted the log lines between each start and the automatic(?) stop.)

Apr 12 17:37:31  pgpool[6291]: [6-1] 2019-04-12 17:37:31: pid 6291: LOG:
pgpool-II successfully started. version 4.0.4 (torokiboshi)
Apr 12 18:03:10  pgpool[9426]: [1-1] 2019-04-12 18:03:10: pid 9426: LOG:
stop request sent to pgpool. waiting for termination...
~26 minutes it stopped itself?

Apr 12 18:10:42  pgpool[10288]: [6-1] 2019-04-12 18:10:42: pid 10288: LOG:
pgpool-II successfully started. version 4.0.4 (torokiboshi)
Apr 12 18:33:10  pgpool[13333]: [1-1] 2019-04-12 18:33:10: pid 13333: LOG:
stop request sent to pgpool. waiting for termination...
~23 minutes it stopped itself?

Trying this time without the 10-second watch monitor to see if that changes
anything...
Apr 12 18:42:34  pgpool[14349]: [6-1] 2019-04-12 18:42:34: pid 14349: LOG:
pgpool-II successfully started. version 4.0.4 (torokiboshi)
Apr 12 19:03:10 pgpool[16505]: [1-1] 2019-04-12 19:03:10: pid 16505: LOG:
stop request sent to pgpool. waiting for termination...
~21 mins?
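
Looking at these timestamps again, here is a small sketch (plain Python,
nothing pgpool-specific) that compares the uptimes with the spacing of the
stop requests. The timestamps are copied from the log lines above; the
variable names are only for illustration.

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"

# "successfully started" timestamps from the log excerpts above
starts = ["2019-04-12 17:37:31", "2019-04-12 18:10:42", "2019-04-12 18:42:34"]
# "stop request sent to pgpool" timestamps from the same excerpts
stops = ["2019-04-12 18:03:10", "2019-04-12 18:33:10", "2019-04-12 19:03:10"]

def parse(ts):
    return datetime.strptime(ts, FMT)

# How long each run lasted before the stop request arrived.
uptimes = [parse(stop) - parse(start) for start, stop in zip(starts, stops)]

# Time between consecutive stop requests, independent of the start times.
stop_gaps = [parse(b) - parse(a) for a, b in zip(stops, stops[1:])]

for up in uptimes:
    print("uptime:", up)
for gap in stop_gaps:
    print("gap between stops:", gap)
```

The uptimes come out as 0:25:39, 0:22:28 and 0:20:36, but the stop requests
themselves are exactly 30 minutes apart (:03:10 and :33:10), and each one is
logged under a new pid rather than the pid that logged the start. That would
point at something outside pgpool issuing the stop on a schedule (a cron job
or a config-management run, for instance) rather than pgpool counting down
on its own, though that is a guess and not something the log proves.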

I feel like I'm in the movie Independence Day and there is a countdown to
something even less nice.


On Fri, Apr 12, 2019 at 1:09 PM Pierre Timmermans <ptim007 at yahoo.com> wrote:

> Hello,
>
> In your config you have
>
> backend_hostname0 = '192.x.y.a'
>
> three times. It should be backend_hostname0, backend_hostname1 and
> backend_hostname2, once each (the same for backend_port0, etc.)
>
> Pierre
>
>
> On Friday, April 12, 2019, 5:05:05 PM GMT+2, Rob Reinhardt <
> rreinhardt at eitccorp.com> wrote:
>
>
> I "feel like" it should be working since so much of it is working, except
> the main function of the s/w seems to be failing me.
>
> my repmgr95 says this:
>
>  ID | Name    | Role    | Status    | Upstream | Location | Connection string
> ----+---------+---------+-----------+----------+----------+----------------------------------------------------------
>  1  | r01sv05 | standby |   running | r01sv04  | default  | host=r01sv05 user=repmgr dbname=repmgr connect_timeout=2
>  2  | r01sv04 | primary | * running |          | default  | host=r01sv04 user=repmgr dbname=repmgr connect_timeout=2
>  3  | r01sv03 | standby |   running | r01sv04  | default  | host=r01sv03 user=repmgr dbname=repmgr connect_timeout=2
>
> (Actually, r01sv05 is now the primary; that output is an old capture.)
>
> r01sv02 is the pgpool server btw, and they are all on the same subnet.
>
> my pgpool says this:
>
> -bash-4.2$ psql -U pgpool --dbname=pgpool --host r01sv02 -c "show pool_nodes"
>  node_id | hostname | port | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay | last_status_change
> ---------+----------+------+--------+-----------+---------+------------+-------------------+-------------------+---------------------
>  0       | r01sv03  | 5432 | up     | 1.000000  | standby | 0          | true              | 0                 | 2019-04-11 19:48:43
> (1 row)
>
> pgpool keeps logging this:
>
> Apr 12 14:03:03 r01sv02.change.me pgpool[14630]: [259-1] 2019-04-12
> 14:03:03: pid 14630: LOG:  find_primary_node: standby node is 0
> Apr 12 14:03:03 r01sv02.change.me pgpool[14630]: [259-2] 2019-04-12
> 14:03:03: pid 14630: LOCATION:  pgpool_main.c:3438
> Apr 12 14:03:04 r01sv02.change.me pgpool[14630]: [260-1] 2019-04-12
> 14:03:04: pid 14630: LOG:  find_primary_node: standby node is 0
> Apr 12 14:03:04 r01sv02.change.me pgpool[14630]: [260-2] 2019-04-12
> 14:03:04: pid 14630: LOCATION:  pgpool_main.c:3438
> Apr 12 14:03:05 r01sv02.change.me pgpool[14630]: [261-1] 2019-04-12
> 14:03:05: pid 14630: LOG:  find_primary_node: standby node is 0
>
> and occasionally the find_primary_node_repeatedly line.
>
> Quick summary of my setup:
> 3 postgresql-9.5 db nodes, one is primary, the other two are standby, in a
> streaming replication cluster built and managed with repmgr95.  This is
> working fine.
>
> 1 pgpool 4.0.4 server that has the same version of postgresql-9.5 and
> postgres user setup as the other 3.
> - pgpool is running as postgres
>
> what does work:
> - the postgres user has ssh access to/from any of the four servers. I can
> remotely run repmgr from the pgpool server as the postgres user with no
> problem
> - psql can reach all the databases (say, with a simple \list or \dt) on
> port 5432 from any of the four nodes to any of the four nodes, even from
> the pgpool server
> - I can use the postgres user or the pgpool user with psql
> - DNS is working too. I changed from hostnames to IPs in the config file
> in case it made a difference, but it did not.
>
> I've even run these commands by hand and they get the right answers:
>
> -bash-4.2$ psql -U pgpool --dbname=pgpool --host r01sv02 -c "SELECT pg_is_in_recovery();"
>  pg_is_in_recovery
> -------------------
>  t
> (1 row)
>
> -bash-4.2$ psql -U pgpool --dbname=pgpool --host r01sv03 -c "SELECT pg_is_in_recovery();"
>  pg_is_in_recovery
> -------------------
>  t
> (1 row)
>
> -bash-4.2$ psql -U pgpool --dbname=pgpool --host r01sv04 -c "SELECT pg_is_in_recovery();"
>  pg_is_in_recovery
> -------------------
>  t
> (1 row)
>
> -bash-4.2$ psql -U pgpool --dbname=pgpool --host r01sv05 -c "SELECT pg_is_in_recovery();"
>  pg_is_in_recovery
> -------------------
>  f
> (1 row)
>
> pgpool for some reason finds only one of the three nodes, a standby node,
> and it has the right role for it.
>
> I created the pgpool database on my primary.  I had thought that when
> pgpool started up it might put some stuff in that database, but I haven't
> seen anything, in case that is the problem.  I found notes on creating
> said database and user, but have seen nothing on actually putting anything
> in it by hand.  Anyway, I was just looking at that in case it matters.
>
> Main question -- where are the other two nodes?
>
> Also, I've noted that each time I start pgpool, it logs those messages
> (above) until the step count reaches 300, then it finally says
> "successfully started".  Only at that point do the pcp_* commands work;
> before then it has not yet created the pcp socket.  I don't know if that
> is normal/expected or not.  It seemed odd to me for basic commands to take
> 5 minutes to even be available.
>
> The other thing is that while it will come up for a while, pgpool seems to
> be stopping itself after about 10 minutes or so.  The log just says that
> pgpool was told to stop (but I didn't do it).
>
> I've attached a sanitized version of my pgpool.conf file
>
> In case it helps, here also is the sanitized contents of the .pgpass and
> .pcppass files in the postgres home dir of all four of my servers and the
> pool_passwd, in case you see a problem with these (they are 600 owned by
> postgres).
>
> -bash-4.2$ cat .pgpass
> r01sv02:5432:*:pgpool:sanitized
> r01sv05:5432:*:postgres:pgpool:sanitized
> r01sv04:5432:*:postgres:pgpool:sanitized
> r01sv03:5432:*:postgres:pgpool:sanitized
> r01sv05:5432:replication:repmgr:pgpool:sanitized
> r01sv04:5432:replication:repmgr:pgpool:sanitized
> r01sv03:5432:replication:repmgr:pgpool:sanitized
>
> -bash-4.2$ cat .pcppass
> *:*:pgpool:pgpool:sanitized
> *:*:postgres:pgpool:sanitized
>
> pcp.conf
> pgpool:sanitized
> nrpe:sanitized
> postgres:sanitized
>
> pool_passwd
> pgpool:sanitized
> nrpe:sanitized
> postgres:sanitized
>
>
> -bash-4.2$ cat pool_hba.conf
> # pgpool Client Authentication Configuration File
>
> # "local" is for Unix domain socket connections only
> local   all         all                               trust
> # IPv4 local connections:
> host    all         all         127.0.0.1/32          trust
> host    all         all         ::1/128               trust
> host    all         all         192.x.y.0/24             md5
>
> Thanks,
> Rob


-- 
Rob Reinhardt
DevOps Engineer
Enlighten IT Consulting (EITC), a MacAulay-Brown, Inc. company