[pgpool-general: 7971] Re: Hang with Skunk because Flush not honoured

Mon Jan 10 21:04:25 JST 2022

> Unfortunately this will not work in some cases. Consider the following
> scenario.
> 
> Parse A received.
> Parse B received.
> Flush received. So start to flush all responses from backend.
> Bind A received. So stop flushing.
> Parse A complete received.
> :
> :

Right, so it would only help in combination with the other mitigations. Ultimately, it was just another hack idea while we await proper flush tracking. Feel free to disregard it.

> After all, we need to find a compromise. To me it seems Skunk is in the
> minority.

Certainly. But no matter how rare or strange a client is, it should never get stuck forever, especially when it works perfectly well when connected directly to a database server without Pgpool-II in the middle.

> BTW, I don't understand why Skunk issues a flush after every message to
> obtain a response. Simple query protocol is already there for that
> purpose.

If I understand correctly, the simple query protocol has no prepared statements/bind.

But yes, I agree Skunk is not doing it efficiently. From what I have seen, this has to do with implementation restrictions within Skunk itself and might be improved in the future. Still, even if it is inefficient, it is not incorrect.

>> It knows better than anyone what the network connection and buffer state is, and it has timeouts to ensure data is never stuck in a buffer indefinitely like is happening in Pgpool-II currently. If you want to hint to the OS that you have more data coming, you can use MSG_MORE/TCP_CORK. (For example, the Linux default with TCP_CORK is to wait at most 200 ms for the application to signal the end of data by unsetting TCP_CORK.)
> 
> "is happening in Pgpool-II currently"
> 
> What do you mean by this? I don't understand how it relates to
> the issue discussed in this thread.

The issue discussed in this thread is data getting stuck in a buffer and never flushed out.

So as a last resort, intermediate buffers like this tend to have a timeout after which the data is sent out even if no explicit flush has been requested: that way, there may be an inefficient delay, but at least the connection can make progress. But, of course, it still is not a desirable outcome.
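
To make the idea concrete, here is a minimal sketch in C of a buffer with such a last-resort deadline. Everything in it is hypothetical: the names (out_buf, out_buf_append, out_buf_should_flush), the buffer size and the 200 ms value are mine for illustration and are not taken from the Pgpool-II source. The caller passes in the current time in milliseconds, so the logic does not depend on any particular clock.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define OUT_BUF_CAP      8192
#define FLUSH_TIMEOUT_MS 200   /* assumed value, for illustration only */

/* Hypothetical downstream buffer; not Pgpool-II's actual structure. */
struct out_buf {
    char    data[OUT_BUF_CAP];
    size_t  len;        /* number of bytes currently pending */
    int64_t oldest_ms;  /* time at which the oldest pending byte was queued */
};

void out_buf_init(struct out_buf *b)
{
    b->len = 0;
    b->oldest_ms = 0;
}

/* Queue n bytes; returns false if they do not fit in the buffer. */
bool out_buf_append(struct out_buf *b, const void *p, size_t n, int64_t now_ms)
{
    if (n > OUT_BUF_CAP - b->len)
        return false;
    if (b->len == 0)
        b->oldest_ms = now_ms;  /* the deadline starts with the first pending byte */
    memcpy(b->data + b->len, p, n);
    b->len += n;
    return true;
}

/*
 * True when pending data should be written out: either a flush was
 * explicitly requested, or the oldest byte has waited past the deadline.
 */
bool out_buf_should_flush(const struct out_buf *b, bool flush_requested,
                          int64_t now_ms)
{
    if (b->len == 0)
        return false;
    return flush_requested || now_ms - b->oldest_ms >= FLUSH_TIMEOUT_MS;
}
```

With something like this, a missed flush request costs at most one timeout period of latency instead of stalling the connection forever.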

The specific example I named was the Linux network stack, which is able to do the same buffering that Pgpool-II is doing on its own and does have a timeout. I do think that ideally you could use that functionality directly, but it may well be that portability concerns prevent this, as I do not know how good other platforms are at this.
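
For reference, this is roughly what using that functionality looks like. It is a Linux-only sketch, not a proposed patch: set_cork and send_corked are hypothetical helper names of mine, and error handling is reduced to the bare minimum (a short write is treated as a failure rather than retried).

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Set (on = 1) or clear (on = 0) TCP_CORK on a TCP socket. Linux-specific. */
int set_cork(int fd, int on)
{
    return setsockopt(fd, IPPROTO_TCP, TCP_CORK, &on, sizeof(on));
}

/*
 * Write two pieces of data while corked, then uncork so the kernel sends
 * whatever is pending. If the uncork were never reached, the kernel would
 * still send the data once its own cork timeout expired.
 * Returns 0 on success, -1 on error or short write.
 */
int send_corked(int fd, const void *a, size_t alen, const void *b, size_t blen)
{
    if (set_cork(fd, 1) != 0)
        return -1;
    if (write(fd, a, alen) != (ssize_t) alen ||
        write(fd, b, blen) != (ssize_t) blen) {
        set_cork(fd, 0);  /* best effort: do not leave the socket corked */
        return -1;
    }
    return set_cork(fd, 0);
}
```

The point is only that the "more data is coming" hint and the safety timeout both live in the kernel here, so the application cannot forget to flush forever.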

> Suppose Pgpool-II sends SET command to all PostgreSQL
> servers. Then Pgpool-II must not flush the downstream buffer until all
> backends reply with Command Complete or Error response.

Sorry if this is a dumb question (I must admit I am new to multiple-server scenarios), but why? If some of the servers have already sent some earlier response data before the SET, what is wrong with flushing it downstream? If a client had connected to the database directly and sent the same SET, it would have already received that response by now.

-- 
Best regards,
Oleg Oshmyan