View Issue Details

ID: 0000636
Project: Pgpool-II
Category: Bug
View Status: public
Last Update: 2020-08-24 00:10
Reporter: vandy
Assigned To: t-ishii
Priority: high
Severity: major
Reproducibility: always
Status: assigned
Resolution: open
Product Version: 4.0.2
Summary: 0000636: pgpool failover not working
Description:

Hi Team,

We have one DB node, which pgpool is connected to.

We wrote a small failover script that reconnects pgpool to the DB whenever the DB node is restarted.

cat failover.sh
#! /bin/bash
while :
do
err=`pcp_attach_node -U <usename> -n 0 -w`
succ="Command Successful"
if [[ $err =~ $succ ]];
then
        exit 0;
else
        sleep 5
fi
done



pgpool.conf
   failover_command = '/tmp/failover.sh &'


But pgpool is still not able to connect to the DB after the DB node restarts.


LOGS
Pgpool-II parent process has received failover request
failover: no valid backend node found
execute command: /tmp/failover.sh &
failover: set new primary node: 0
failover done. shutdown host <DB_NODE_PRIVAT_INFO> 2020-08-17 09:50:47: pid 1019: LOG: worker process received restart request
failover done. shutdown host <DB_NODE_PRIVAT_INFO>
failover: no backends are degenerated
.
.
.
.
PCP child 1018 exits with status 0 in failover()
fork a new PCP child pid 2028 in failover()
.


And we still get the same error until we restart pgpool:

2020-08-19 06:25:57: pid 1607: DETAIL: all backend nodes are down, pgpool requires at least one valid node
2020-08-19 06:25:57: pid 1607: HINT: repair the backend nodes and restart pgpool
2020-08-19 06:25:57: pid 17: LOG: fork a new child process with pid: 2496
2020-08-19 06:25:57: pid 17: LOG: child process with pid: 942 exits with status 256
2020-08-19 06:25:57: pid 17: LOG: fork a new child process with pid: 2497
2020-08-19 06:25:57: pid 17: LOG: child process with pid: 944 exits with status 256
2020-08-19 06:25:57: pid 17: LOG: fork a new child process with pid: 2498
2020-08-19 06:25:57: pid 17: LOG: child process with pid: 972 exits with status 256
2020-08-19 06:25:57: pid 17: LOG: fork a new child process with pid: 2499
2020-08-19 06:25:57: pid 17: LOG: child process with pid: 1004 exits with status 256
2020-08-19 06:25:57: pid 17: LOG: fork a new child process with pid: 2500
2020-08-19 06:25:57: pid 17: LOG: child process with pid: 1077 exits with status 256
2020-08-19 06:25:57: pid 17: LOG: fork a new child process with pid: 2501
2020-08-19 06:25:57: pid 17: LOG: child process with pid: 1142 exits with status 256
2020-08-19 06:25:57: pid 2297: FATAL: pgpool is not accepting any new connections
2020-08-19 06:25:57: pid 2297: DETAIL: all backend nodes are down, pgpool requires at least one valid node
2020-08-19 06:25:57: pid 2297: HINT: repair the backend nodes and restart pgpool
2020-08-19 06:25:57: pid 2063: FATAL: pgpool is not accepting any new connections
2020-08-19 06:25:57: pid 2063: DETAIL: all backend nodes are down, pgpool requires at least one valid node
2020-08-19 06:25:57: pid 2063: HINT: repair the backend nodes and restart pgpool


Please help
Tags: No tags attached.

Activities

vandy

2020-08-19 16:43

reporter   ~0003484

master_slave_mode = on
master_slave_sub_mode = 'stream'

pgdude

2020-08-19 23:51

reporter   ~0003487

I couldn't figure out how to close this issue, so would you please close it? Thanks in advance.

t-ishii

2020-08-21 19:42

developer   ~0003495

You cannot execute pcp_attach_node inside a failover script to fail back the node that is going down. You can, however, execute pcp_attach_node after the failover has finished and the DB has come back.
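A minimal sketch of the second approach, assuming a standalone script that is started outside failover_command (for example from cron, a service manager, or by hand). The host, port, PCP user name, retry limits, and the function name reattach_when_ready are placeholder assumptions, not values from this report. It tests exit statuses instead of parsing the command output, so it does not depend on the exact message text.

```shell
#!/bin/bash
# Sketch: re-attach a backend once failover is over and PostgreSQL is back.
# All defaults below are hypothetical placeholders; adjust to your setup.

DB_HOST="${DB_HOST:-localhost}"   # backend PostgreSQL host
DB_PORT="${DB_PORT:-5432}"        # backend PostgreSQL port
PCP_USER="${PCP_USER:-pgpool}"    # PCP user name
NODE_ID="${NODE_ID:-0}"           # backend node id to re-attach
MAX_TRIES="${MAX_TRIES:-60}"      # give up after this many attempts
INTERVAL="${INTERVAL:-5}"         # seconds to wait between attempts

reattach_when_ready() {
    local try
    for ((try = 1; try <= MAX_TRIES; try++)); do
        # pg_isready exits 0 only when the server accepts connections.
        if pg_isready -q -h "$DB_HOST" -p "$DB_PORT"; then
            # -w: never prompt for a password (a .pcppass file is assumed).
            if pcp_attach_node -U "$PCP_USER" -n "$NODE_ID" -w; then
                return 0
            fi
        fi
        sleep "$INTERVAL"
    done
    return 1   # the node never became attachable within MAX_TRIES attempts
}
```

Calling reattach_when_ready at the end of the script starts the loop. Unlike the endless loop in the report, it stops after MAX_TRIES attempts and returns a non-zero status, which makes a stuck re-attach visible to whatever launched it.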

pgdude:
You are not the reporter of this issue. I would like to hear from vandy, the original reporter, whether he/she really wants to close the issue.

t-ishii

2020-08-21 22:01

developer   ~0003496

Another way to solve the problem is to use the "DISALLOW_TO_FAILOVER" flag. Add this:

backend_flag0 = 'DISALLOW_TO_FAILOVER'

to pgpool.conf. The backend will then never fail over, so you can restart PostgreSQL at any time. After PostgreSQL returns to a normal state, clients can connect to pgpool again. Note that while PostgreSQL is restarting, clients will get the following message:

psql: error: could not connect to server: FATAL: failed to create a backend connection
DETAIL: executing failover on backend
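Clients that cannot tolerate that message can retry until the backend is reachable again. The following is only a sketch of a generic retry helper; the function name retry and its argument order are hypothetical and not part of Pgpool-II or psql.

```shell
# retry MAX DELAY CMD...: run CMD until it succeeds or MAX attempts are used.
# Hypothetical helper for illustration only.
retry() {
    local max="$1" delay="$2" n
    shift 2
    for ((n = 1; n <= max; n++)); do
        # Stop as soon as the command exits with status 0.
        "$@" && return 0
        sleep "$delay"
    done
    return 1   # every attempt failed
}
```

For example, retry 12 5 psql -h <pgpool_host> -p 9999 -c 'SELECT 1' <dbname> would try for about one minute, assuming pgpool listens on port 9999; the host and database name are placeholders.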

vandy

2020-08-22 04:25

reporter   ~0003498

Hi t-ishii,

Thanks for replying.

Regarding your solution:

When the DB is restarted, pgpool will not automatically connect to the DB, right?

vandy

2020-08-22 05:15

reporter   ~0003499

Also, we run the pcp_attach_node command in an endless while loop until the DB node comes up and pgpool is able to connect to it:


while :
do
err=`pcp_attach_node -U <usename> -n 0 -w`
succ="Command Successful"
if [[ $err =~ $succ ]];
then
        exit 0;
else
        sleep 5
fi
done

t-ishii

2020-08-22 06:44

developer   ~0003500

> When the DB is restarted, pgpool will not automatically connect to the DB, right?
If DISALLOW_TO_FAILOVER is used, pgpool will automatically connect to the DB. More precisely, with DISALLOW_TO_FAILOVER pgpool never disconnects the DB, so it does not actually need to connect to the DB again after the DB restarts.

vandy

2020-08-22 23:53

reporter   ~0003501

Hi t-ishii,

I will try this.

But one thing is still not clear to me:
when failover happens, I keep running the pcp command until it succeeds, but it is not working.

while :
do
err=`pcp_attach_node -U <usename> -n 0 -w`
succ="Command Successful"
if [[ $err =~ $succ ]];
then
        exit 0;
else
        sleep 5
fi
done

Is there a problem with the failover command, or with my understanding of how failover works in pgpool?

pgdude

2020-08-22 23:56

reporter   ~0003502

I am so sorry for responding to this thread thinking it was mine when I said:

"I couldn't figure out how to close this issue, so would you please close it? Thanks in advance."

t-ishii

2020-08-23 08:15

developer   ~0003503

> Is there a problem with the failover command
I have not tried it myself, but your script seems to have a problem. The output of pcp_attach_node is not "Command Successful" but "pcp_attach_node -- Command Successful", so the script will never end.

vandy

2020-08-24 00:10

reporter   ~0003504

Hi t-ishii,
The script searches for the string "Command Successful" anywhere in the output; it is not an exact string match.
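A small check of that matching behavior, using the output format quoted earlier in this thread as sample input. The variable names result and anchored are introduced here only for illustration.

```shell
# [[ string =~ regex ]] succeeds when the regex matches anywhere in the string.
succ="Command Successful"
err="pcp_attach_node -- Command Successful"   # sample output quoted in this thread

if [[ $err =~ $succ ]]; then
    result="match"
else
    result="no match"
fi

# For contrast: anchoring the pattern with ^ and $ demands an exact match.
if [[ $err =~ ^${succ}$ ]]; then
    anchored="match"
else
    anchored="no match"
fi

echo "unanchored: $result, anchored: $anchored"
# prints "unanchored: match, anchored: no match"
```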

I tested the script.
The problem is that sometimes it works and pgpool is able to attach the DB node, but sometimes it is not able to, with the following error:


Pgpool-II parent process has received failover request
failover: no valid backend node found
execute command: /tmp/failover.sh &
failover: set new primary node: 0
failover done. shutdown host <DB_NODE_PRIVAT_INFO> 2020-08-17 09:50:47: pid 1019: LOG: worker process received restart request
failover done. shutdown host <DB_NODE_PRIVAT_INFO>
failover: no backends are degenerated
.
.
.
.
PCP child 1018 exits with status 0 in failover()
fork a new PCP child pid 2028 in failover()
.


And we still get the same error until we restart pgpool:

2020-08-19 06:25:57: pid 1607: DETAIL: all backend nodes are down, pgpool requires at least one valid node
2020-08-19 06:25:57: pid 1607: HINT: repair the backend nodes and restart pgpool
2020-08-19 06:25:57: pid 17: LOG: fork a new child process with pid: 2496
2020-08-19 06:25:57: pid 17: LOG: child process with pid: 942 exits with status 256
2020-08-19 06:25:57: pid 17: LOG: fork a new child process with pid: 2497

Issue History

Date Modified     Username  Field                Change
2020-08-19 16:37  vandy     New Issue
2020-08-19 16:43  vandy     Note Added: 0003484
2020-08-19 23:51  pgdude    Note Added: 0003487
2020-08-21 19:42  t-ishii   Note Added: 0003495
2020-08-21 19:42  t-ishii   Assigned To          => t-ishii
2020-08-21 19:42  t-ishii   Status               new => assigned
2020-08-21 19:42  t-ishii   Status               assigned => feedback
2020-08-21 22:01  t-ishii   Note Added: 0003496
2020-08-22 04:25  vandy     Note Added: 0003498
2020-08-22 04:25  vandy     Status               feedback => assigned
2020-08-22 05:15  vandy     Note Added: 0003499
2020-08-22 06:44  t-ishii   Note Added: 0003500
2020-08-22 06:44  t-ishii   Status               assigned => feedback
2020-08-22 23:53  vandy     Note Added: 0003501
2020-08-22 23:53  vandy     Status               feedback => assigned
2020-08-22 23:56  pgdude    Note Added: 0003502
2020-08-23 08:15  t-ishii   Note Added: 0003503
2020-08-24 00:10  vandy     Note Added: 0003504