View Issue Details

ID: 0000424
Project: Pgpool-II
Category: Bug
View Status: public
Last Update: 2019-05-21 10:15
Reporter: nagata
Assigned To: t-ishii
Priority: normal
Severity: minor
Reproducibility: have not tried
Status: closed
Resolution: open
Summary: 0000424: pcp_recovery_node in follow_master_command possibly fails with an error
Description: One of our clients is getting this error each time follow_master calls pcp_recovery_node on a downed standby:

ERROR: failed to process PCP request at the moment
DETAIL: failover is in progress

In the current implementation, pcp_recovery_node cannot be run during failover/failback. (This is checked in pcp_process_command().)
So, it seems that pcp_recovery_node is being called before the failover_command completes.

Two ideas for running pcp_recovery_node safely in follow_master are as follows.

1. Insert a sleep before running pcp_recovery_node.
2. Retry pcp_recovery_node if this fails due to the error.

However, can this be handled in Pgpool-II itself? If running pcp_recovery_node in follow_master_command is an expected use case, I think Pgpool-II should provide a safe way to do this. Any ideas on how to resolve this?
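As a rough illustration of idea 2, a follow_master script could wrap the call in a small retry helper. This is only a sketch: `retry_command` is a hypothetical helper, not something Pgpool-II ships, and the attempt count and delay are arbitrary. It retries on any non-zero exit status, not specifically on the "failover is in progress" error.

```shell
# Hypothetical helper (not part of Pgpool-II): run a command until it
# succeeds or the attempt limit is reached.
# Usage: retry_command MAX_ATTEMPTS DELAY_SECONDS command [args...]
retry_command() {
    max_attempts=$1
    delay=$2
    shift 2
    attempt=1
    while :; do
        # Run the command; stop as soon as it succeeds.
        if "$@"; then
            return 0
        fi
        # Give up once the limit is reached.
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}

# Possible use inside a follow_master script. Host, port, user, and
# NODE_ID are placeholders, so the call is left commented out:
# retry_command 5 3 pcp_recovery_node -h localhost -p 9898 -U pgpool -w -n "$NODE_ID"
```

A fixed delay keeps the script simple; whether a few retries are enough depends on how long the failover actually takes in a given cluster.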
Tags: No tags attached.

Activities

t-ishii

2018-08-16 17:03

developer   ~0002161

You are probably misunderstanding how the follow master command is used. The command should be run *after* failover is done. Users can issue SQL to Pgpool-II while follow master commands are running. See the follow_master_command.sh generated by pgpool_setup.

nagata

2018-08-16 17:34

developer   ~0002162

As I understand it, the follow_master_command is triggered in failover(), and this happens before Req_info->switching is cleared (= set to false).

t-ishii

2018-08-16 17:39

developer   ~0002163

Aren't you missing the next line?
if(Req_info->request_queue_tail != Req_info->request_queue_head)

nagata

2018-08-16 17:43

developer   ~0002164

Oops, sorry, I missed this. I'll look into this again.

nagata

2018-08-16 18:20

developer   ~0002165

OK
- Req_info->request_queue_tail is incremented when a failover or failback request is registered by register_node_operation_request().
- Req_info->request_queue_head is incremented at the top of the loop in failover(), that is, at the start point of processing each failback/failover request.

So, (Req_info->request_queue_tail != Req_info->request_queue_head) is true when multiple failover or failback requests were
registered but some of them have not been processed yet. This may happen if failover is requested two or more times in quick succession,
or if the "failback" request is registered by pcp_recovery_node in the first follow_master_command, for example.
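The head/tail behaviour described above can be mimicked with two counters. This is a toy model, not Pgpool-II code: the variable and function names are made up for illustration, and the starting values are an assumption.

```shell
# Toy model of the request queue indexes (names are hypothetical).
queue_head=0
queue_tail=0

# A failover/failback request is registered: the tail advances.
register_request() { queue_tail=$((queue_tail + 1)); }

# failover() starts processing the next request: the head advances.
begin_request() { queue_head=$((queue_head + 1)); }

# True while at least one registered request has not been started.
requests_pending() { [ "$queue_tail" -ne "$queue_head" ]; }

report() {
    if requests_pending; then
        echo "head=$queue_head tail=$queue_tail: another request is still queued"
    else
        echo "head=$queue_head tail=$queue_tail: no other request is queued"
    fi
}

register_request   # first failover request arrives
begin_request      # failover() starts processing it
report             # head=1 tail=1: no other request is queued

register_request   # e.g. a failback registered while the first is running
report             # head=1 tail=2: another request is still queued
```

In this model a single request leaves head equal to tail while it is being processed, and the two differ only once a second request is registered before the first has finished.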

I'll report more when we get log messages from the client or when I can reproduce this on my machine.

t-ishii

2019-02-24 20:21

developer   ~0002402

Can we close this issue?

t-ishii

2019-05-21 10:15

developer   ~0002609

No response from the reporter for over a month. I am going to close this issue.

Issue History

Date Modified Username Field Change
2018-08-16 16:40 nagata New Issue
2018-08-16 17:03 t-ishii Note Added: 0002161
2018-08-16 17:34 nagata Note Added: 0002162
2018-08-16 17:39 t-ishii Note Added: 0002163
2018-08-16 17:43 nagata Note Added: 0002164
2018-08-16 18:20 nagata Note Added: 0002165
2019-01-30 10:08 administrator Assigned To => t-ishii
2019-01-30 10:08 administrator Status new => assigned
2019-02-04 08:59 t-ishii Status assigned => feedback
2019-02-24 20:21 t-ishii Note Added: 0002402
2019-05-21 10:15 t-ishii Note Added: 0002609
2019-05-21 10:15 t-ishii Status feedback => closed