[pgpool-general: 6082] Re: primary suddenly is down only in pgpool

Mariel Cherkassky mariel.cherkassky at gmail.com
Sun May 13 17:49:11 JST 2018


Hi Tatsuo.
It happened again suddenly over the weekend. This time I got errors in my
log:
2018-05-11 18:43:33 - [No Connection] [20902]LOG:  trying connecting to PostgreSQL
server on "ptkpl-psgsqldb2:5432" by INET socket
[No Connection] - 2018-05-11 18:43:33 - [No Connection]
[20902]DETAIL:  timed out. retrying...
2018-05-11 18:44:03 - [No Connection] [18906]LOG:  failed to connect to PostgreSQL
server on "ptkpl-psgsqldb2:5432", getsockopt() detected error "No route to
host"
[No Connection] - 2018-05-11 18:44:03 - [No Connection]
[18906]LOG:  received degenerate backend request for node_id: 1 from pid
[18906]

and the pool kept looking for the primary ("find_primary_node: checking
backend no 0/1/2") for 6 minutes. During all this time the primary was up
and working fine. What do you recommend? Everything worked again only after
I re-attached the primary. Why didn't the pool recognize the primary? I'm
checking with my networking team whether there was a network problem, but I
don't think it is related.
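
One way to tell a network problem apart from a pgpool problem would be to
run an independent TCP probe from the pgpool host against the backend. The
following is only a minimal sketch, not anything pgpool itself provides: the
function name probe_backend and the 5-second default timeout are
illustrative assumptions. It attempts a single connection and reports the OS
error name, so a "No route to host" condition like the one in the log above
would show up outside of pgpool as well.

```python
import errno
import socket


def probe_backend(host, port, timeout=5.0):
    """Attempt one TCP connection and return (ok, detail).

    Hypothetical helper for illustration; it only checks TCP reachability
    and does not speak the PostgreSQL protocol.
    """
    try:
        # create_connection resolves the name and connects with a timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True, "connected"
    except socket.timeout:
        # The connect attempt did not complete within `timeout` seconds.
        return False, "timed out"
    except OSError as exc:
        # Map the errno to its symbolic name (for example EHOSTUNREACH)
        # when one is known; name-resolution errors fall back to "unknown".
        name = errno.errorcode.get(exc.errno, "unknown")
        return False, f"{name}: {exc.strerror}"
```

Calling probe_backend("ptkpl-psgsqldb2", 5432) periodically from each pgpool
host and logging the result with a timestamp would give something to compare
against the pgpool log the next time a node is degenerated.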


Thanks, MARIEL.

2018-05-06 17:22 GMT+03:00 Tatsuo Ishii <ishii at sraoss.co.jp>:

> Both "show pool_nodes" and pcp_node_info ultimately check the status
> in the shared memory area. However, the implementations are completely
> different: "show pool_nodes" is simpler and is just a wrapper for
> showing the status as SQL. pcp_node_info is a client/server
> program. The status is retrieved by the pcp server and then sent to the
> pcp client (pcp_node_info) via the pcp protocol.
>
> Also, next time you'd better check the status file to verify whether
> pcp_node_info tells the truth.
>
> Best regards,
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> English: http://www.sraoss.co.jp/index_en.php
> Japanese:http://www.sraoss.co.jp
>
> > No, I didn't check the status via "show pool_nodes". To be honest, it
> > isn't the first time this has happened. Is there a difference between
> > "show pool_nodes" and pcp_node_info at a deeper level? I mean, I know
> > that "show pool_nodes" queries a view or a table; what about
> > pcp_node_info? I don't think that it is related to repmgr.
> >
> > 2018-05-06 16:49 GMT+03:00 Tatsuo Ishii <ishii at sraoss.co.jp>:
> >
> >> > Hi,
> >> > I have 3 postgres servers (one primary + 2 standbys) that have
> >> > replication configured with repmgr:
> >> > pg1 - standby
> >> > pg2 - primary
> >> > pg3 - standby
> >> >
> >> > I also have 2 pgpool servers (v3.7.2), and on each one there is one
> >> > pool instance. There isn't any watchdog; instead I have a VIP address
> >> > that directs the requests to the available pgpool instance. I
> >> > configured my own metrics that check the status of the database nodes
> >> > via the pcp interface.
> >> >
> >> > Today at 11:25 I suddenly got an alert that both my pgpools saw the
> >> > primary node as down (via pcp). I connected and checked, and indeed
> >> > the primary was down:
> >> > [postgres@pool2 log]$ pcp_node_info -h localhost -U postgres -p 9898 1 -w
> >> > pg2 5432 2 0.333333 down standby
> >> >
> >> > I checked both pools and got the same result. I immediately
> >> > re-attached the node on both and it worked. I want to understand why
> >> > it happened, but I don't see any error in the logs. I attach the logs
> >> > of both my pools. Can you help me identify the problem?
> >>
> >> No idea. I have never seen PostgreSQL detached without any trace in
> >> the pgpool log. Have you checked the node status using "show
> >> pool_nodes"? If not, I suspect there's a bug in pcp_node_info. If you
> >> tried "show pool_nodes" and saw the same status as pcp_node_info, then
> >> I am completely out of ideas.
> >>
> >> There may be an interaction with repmgr, but I am not familiar with
> >> repmgr and this is just a wild guess.
> >>
> >> Best regards,
> >> --
> >> Tatsuo Ishii
> >> SRA OSS, Inc. Japan
> >> English: http://www.sraoss.co.jp/index_en.php
> >> Japanese:http://www.sraoss.co.jp
> >>
>
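
For a metric check built on the pcp interface, the pcp_node_info line quoted
above ("pg2 5432 2 0.333333 down standby") could be split into named fields
roughly as follows. This is a minimal sketch: the field names (hostname,
port, status_code, weight, status_name, role) are inferred from that one
sample line and are an assumption, as are the NodeInfo and parse_node_info
names. Other pgpool versions may print a different set of columns.

```python
from typing import NamedTuple


class NodeInfo(NamedTuple):
    # Field names are assumed from the sample line in this thread.
    hostname: str
    port: int
    status_code: int
    weight: float
    status_name: str
    role: str


def parse_node_info(line: str) -> NodeInfo:
    """Parse one whitespace-separated pcp_node_info output line."""
    fields = line.split()
    if len(fields) < 6:
        raise ValueError(
            f"expected at least 6 fields, got {len(fields)}: {line!r}"
        )
    # Any extra trailing columns are ignored in this sketch.
    host, port, code, weight, status_name, role = fields[:6]
    return NodeInfo(host, int(port), int(code), float(weight), status_name, role)
```

Recording both status_name and role from each pool, next to the result of
"show pool_nodes" and the contents of the status file, would make it easier
to see whether the sources ever disagree.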
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pgpool.log.Friday
Type: application/octet-stream
Size: 1616738 bytes
Desc: not available
URL: <http://www.sraoss.jp/pipermail/pgpool-general/attachments/20180513/6eb6e9bf/attachment-0001.obj>