[pgpool-general: 6346] Re: Pgpool is not accepting any new connections

Pierre Timmermans ptim007 at yahoo.com
Fri Dec 14 05:25:16 JST 2018


Everything looks OK from a pgpool point of view.
It looks like the postgres failover is also executed, which is good, but the logs show some errors, and afterwards there is a second failover operation, this time on the new primary, pgnode02. That is clearly not normal, and as a result both databases are marked down.
Dec 13 13:26:56 emkioblph04 pgpool: 2018-12-13 13:26:56: pid 48073: LOG:  starting degeneration. shutdown host pgnode02.mydomain.com(5433)
Dec 13 13:26:56 emkioblph04 pgpool: 2018-12-13 13:26:56: pid 48073: WARNING:  All the DB nodes are in down status and skip writing status file.
Dec 13 13:26:56 emkioblph04 pgpool: 2018-12-13 13:26:56: pid 48073: LOG:  failover: no valid backend node found
Dec 13 13:26:56 emkioblph04 pgpool: 2018-12-13 13:26:56: pid 48073: LOG:  Restart all children
Dec 13 13:26:56 emkioblph04 pgpool: 2018-12-13 13:26:56: pid 48073: LOG:  execute command: /etc/pgpool-II/failover.sh 0 1 "" ""
Dec 13 13:26:56 emkioblph04 pgpool: 2018-12-13 13:26:56: pid 48073: LOG:  execute command: /etc/pgpool-II/failover.sh 1 1 "" ""
Dec 13 13:26:56 emkioblph04 systemd-logind: New session 3992 of user postgres.
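To make an excerpt like this easier to read, here is a minimal Python sketch that pulls out the degenerated hosts, the failover script calls and the "no valid backend" condition. The function name `summarize` and the regular expressions are my own; they only assume the syslog layout visible in the lines above. What each failover.sh argument means depends on the failover_command setting in pgpool.conf, which is not shown in this thread, so the sketch only reports the arguments as they were logged.

```python
import re
import shlex

# Matches the pgpool part of a syslog line, as laid out in the excerpt above.
LINE_RE = re.compile(
    r"pgpool: (?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}): pid (?P<pid>\d+): "
    r"(?P<level>[A-Z]+):\s+(?P<msg>.*)"
)
DEGENERATE_RE = re.compile(r"starting degeneration\. shutdown host (\S+)\((\d+)\)")
FAILOVER_CMD_RE = re.compile(r"execute command: (\S+)\s*(.*)")


def summarize(lines):
    """Collect degenerated hosts, failover script calls and the all-down flag."""
    summary = {"degenerated": [], "failover_calls": [], "all_down": False}
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue  # not a pgpool line (for example systemd-logind)
        msg = m.group("msg")
        d = DEGENERATE_RE.search(msg)
        if d:
            summary["degenerated"].append((d.group(1), int(d.group(2))))
        c = FAILOVER_CMD_RE.search(msg)
        if c:
            # shlex keeps empty quoted arguments ("") as empty strings
            summary["failover_calls"].append((c.group(1), shlex.split(c.group(2))))
        if "no valid backend node found" in msg:
            summary["all_down"] = True
    return summary


sample = [
    'Dec 13 13:26:56 emkioblph04 pgpool: 2018-12-13 13:26:56: pid 48073: LOG:  starting degeneration. shutdown host pgnode02.mydomain.com(5433)',
    'Dec 13 13:26:56 emkioblph04 pgpool: 2018-12-13 13:26:56: pid 48073: LOG:  failover: no valid backend node found',
    'Dec 13 13:26:56 emkioblph04 pgpool: 2018-12-13 13:26:56: pid 48073: LOG:  execute command: /etc/pgpool-II/failover.sh 0 1 "" ""',
    'Dec 13 13:26:56 emkioblph04 pgpool: 2018-12-13 13:26:56: pid 48073: LOG:  execute command: /etc/pgpool-II/failover.sh 1 1 "" ""',
    'Dec 13 13:26:56 emkioblph04 systemd-logind: New session 3992 of user postgres.',
]
print(summarize(sample))
```

In this excerpt the script is called twice and the last two arguments are empty strings both times, which is worth checking against how failover.sh uses them.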
Did you try a failover of postgres only, without failing over pgpool at the same time? That is, stop postgres on server A but keep pgpool running. This is a simpler case, so you can make sure that your failover script is correct and also make sense of the various errors reported in your log. Normally a failover is a simple operation in postgres; there is no need to stop the surviving database to do it. I use the scripts from repmgr, but I believe that with the trigger file it is also easy and should not take long.

Pierre 

    On Thursday, December 13, 2018, 9:01:55 PM GMT+1, Juan Carlos Michaca <jcmichaca at vinkos.com> wrote:  
 
Pgpool on server B detects that the primary pgpool is lost, so Pgpool B assumes the master role, as you can see in the following log entry:

    Dec 13 13:26:36 emkioblph04 pgpool: 2018-12-13 13:26:36: pid 48074: LOG:  I am announcing my self as master/coordinator watchdog node


Pgpool B then successfully acquires the virtual IP, as shown in the following log entry:
Dec 13 13:26:45 emkioblph04 pgpool: 2018-12-13 13:26:45: pid 48380: LOG:  successfully acquired the delegate IP:"10.101.21.99"
If I run the ifconfig command, I can confirm that server B has the virtual IP.

After the failover script runs, PG Backend B is able to accept read and write queries. I tested this by connecting directly to PG Backend B and running a CREATE DATABASE query.
In the pgpool log I can see that pgpool B keeps looking for a backend node; I have a lot of entries like:
Dec 13 13:26:57 emkioblph04 pgpool: 2018-12-13 13:26:57: pid 48073: LOG:  find_primary_node_repeatedly: waiting for finding a primary node
Dec 13 13:26:57 emkioblph04 pgpool: 2018-12-13 13:26:57: pid 48073: LOG:  find_primary_node: checking backend no 0
Dec 13 13:26:57 emkioblph04 pgpool: 2018-12-13 13:26:57: pid 48073: LOG:  find_primary_node: checking backend no 1.....
Dec 13 13:26:57 emkioblph04 pgpool: 2018-12-13 13:26:57: pid 48073: LOG:  find_primary_node: checking backend no 0
Dec 13 13:26:57 emkioblph04 pgpool: 2018-12-13 13:26:57: pid 48073: LOG:  find_primary_node: checking backend no 1
I'm unable to get watchdog info by running pcp_watchdog_info -h pgnode02.mydomain.com -p 19000 -U postgres -d. I get the error:
ERROR: unable to read data from socket.

I am sharing the detailed log file from server B, taken after I shut down Pgpool A.

Thank you Pierre

On Thu, Dec 13, 2018 at 1:24 PM Pierre Timmermans <ptim007 at yahoo.com> wrote:

What is in the log of the pgpool on server B? You should see a lot of information in it: normally the watchdog detects that the primary pgpool is lost, then it logs the new consensus and the election of a new leader. If node B takes over the primary role, you will see that the VIP is acquired by node B. Also, because the primary postgres was lost as well, the log on node B should show that the standby database (on node B) was promoted to primary.
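As a rough aid for that check, the sketch below scans a pgpool log for the takeover milestones described above. The message fragments are copied from the log lines quoted elsewhere in this thread; the names `MILESTONES`, `WARNING_SIGNS` and `check_takeover` are hypothetical, and other pgpool versions may word these messages differently. The thread does not show a log line for the standby promotion itself, so the sketch can only confirm that the failover command was executed.

```python
# Message fragments taken from the log lines quoted in this thread.
MILESTONES = [
    ("watchdog leader announced",
     "I am announcing my self as master/coordinator watchdog node"),
    ("VIP acquired", "successfully acquired the delegate IP"),
    ("failover command executed", "execute command:"),
]
WARNING_SIGNS = [
    ("all backends down", "failover: no valid backend node found"),
    ("still searching for primary", "find_primary_node_repeatedly"),
]


def check_takeover(log_text):
    """Report which expected milestones and which warning signs appear."""
    reached = {name: needle in log_text for name, needle in MILESTONES}
    warnings = [name for name, needle in WARNING_SIGNS if needle in log_text]
    return reached, warnings


log_text = "\n".join([
    "Dec 13 13:26:36 emkioblph04 pgpool: 2018-12-13 13:26:36: pid 48074: LOG:  I am announcing my self as master/coordinator watchdog node",
    'Dec 13 13:26:45 emkioblph04 pgpool: 2018-12-13 13:26:45: pid 48380: LOG:  successfully acquired the delegate IP:"10.101.21.99"',
    "Dec 13 13:26:56 emkioblph04 pgpool: 2018-12-13 13:26:56: pid 48073: LOG:  failover: no valid backend node found",
])
reached, warnings = check_takeover(log_text)
print(reached)
print(warnings)
```

A takeover that reaches the watchdog and VIP milestones but also raises a warning sign points at the backend failover rather than at the watchdog.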

Maybe you can try the pcp_watchdog_info utility on the pgpool of server B; if it does not respond, then pgpool B might indeed be stuck.
Maybe you can post the log of pgpool B from the time you shut down server A?

Pierre 

    On Thursday, December 13, 2018, 8:08:01 PM GMT+1, Juan Carlos Michaca <jcmichaca at vinkos.com> wrote:  
 
 Hi Pierre;
Thanks for your reply.
Currently I have just 2 connections in pg_stat_activity, and num_init_children is set to 32 in pgpool.conf.
I set num_init_children to 50 and the problem persists; I have to restart pgpool to force it to accept connections.
I am sharing my pgpool.conf.

Regards.



On Thu, Dec 13, 2018 at 8:33 AM Pierre Timmermans <ptim007 at yahoo.com> wrote:

Hi 

The most common reason for that is a num_init_children value that is too low: when there are more connections than num_init_children, pgpool does not report an error; it silently refuses the extra connections (or maybe queues them).
So count the number of connections (from pg_stat_activity), and if it is equal to num_init_children then you have the answer...
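A minimal sketch of that comparison, assuming you have the pgpool.conf text and a connection count obtained separately (for example from `SELECT count(*) FROM pg_stat_activity;`). The helper names are hypothetical, and keeping the last assignment when the setting appears more than once is an assumption of this sketch.

```python
import re


def read_num_init_children(conf_text):
    """Return num_init_children from pgpool.conf text, or None if it is not set."""
    value = None
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        m = re.match(r"num_init_children\s*=\s*'?(\d+)'?$", line)
        if m:
            value = int(m.group(1))  # assumption: the last assignment wins
    return value


def pool_is_saturated(active_connections, num_init_children):
    """True when the connection count has reached the configured limit."""
    return active_connections >= num_init_children


conf_text = """
# - Pool size -
num_init_children = 32   # example value
max_pool = 4
"""
limit = read_num_init_children(conf_text)
print(limit, pool_is_saturated(2, limit))
```

If the count is well below the limit, the cause is probably elsewhere and the pgpool log is the next place to look.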

Pierre 

    On Thursday, December 13, 2018, 2:13:52 AM GMT+1, Juan Carlos Michaca <jcmichaca at vinkos.com> wrote:  
 
 
Hi all,


I am testing pgpool in this scenario


Pgpool A(Master) + PG Backend A (primary)

Pgpool B(Standby) + PG Backend B (replica with streaming replication)


At this point I'm able to run queries to pgpool from any address in my network.





If I shut down Pgpool A(Master) and its PG Backend (primary), Pgpool B(now master) is not accepting any connections until I restart it. 




After restarting Pgpool B(now master) I can connect to pgpool from any host and run queries.


Could you please help me figure out what could be wrong?


Regards
_______________________________________________
pgpool-general mailing list
pgpool-general at pgpool.net
http://www.pgpool.net/mailman/listinfo/pgpool-general
  
  
  