[pgpool-hackers: 3479] Re: Proposal: health check statistics
Tatsuo Ishii
ishii at sraoss.co.jp
Mon Dec 16 17:13:52 JST 2019
>>> Currently Pgpool-II's health check process logs various information,
>>> including backend connection problems, retries to recover from them, and
>>> so on. This information is very important for users because it reports
>>> health problems of PostgreSQL. For example, observing an
>>> increase in the retry count may suggest that the network connection between
>>> Pgpool-II and PostgreSQL is having trouble, so that users could replace
>>> the switch before an actual failure occurs. The problem is that it is
>>> annoying to look for such information in log files afterward, since it may
>>> already have disappeared or may never have been logged because of other
>>> problems (such as a full disk).
>>>
>>> I would like to propose a new feature:
>>>
>>> - Accumulate health check statistics on shared memory so that later on
>>> users can look into the stats using PCP commands.
>>>
>>> - Such statistics include:
>>> - failure count per backend node
>>> - retry count per backend node
>>> - success count after retries
>>
>> I think we should add statistics about:
>> - success count per backend node
>>
>> If pgpool's statistics have this, we can know the percentage of failures.
>
> That's definitely a good thing for users. Thank you for your suggestion.
So, here is the revised proposal for health check statistics
(all data is per node):
- total count
- total success count
- total failure count
- total retry count
- average retry count
- maximum retry count
- average response time
- maximum response time
- the latest health check timestamp
- the latest retry timestamp
- the latest status change timestamp
- cause of the status change (failover, failback etc.)
- current status (up, down...)
- the last 10 status change timestamps and the status at each of those times ("10" should be configurable)
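
To make the list above concrete, a per-node record in shared memory could look roughly like the sketch below. This is only an illustration under stated assumptions, not existing Pgpool-II code: every type, field, and function name here (HcNodeStats, hc_record_check, and so on) is hypothetical, response times are assumed to be in milliseconds, and the history size is a compile-time constant even though the proposal says it should be configurable. Averages are derived from running totals rather than stored, and the failure percentage mentioned in the quoted suggestion falls out of the success and failure counts.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical default; the proposal says "10" should be configurable. */
#define HC_HISTORY_SIZE 10

/* Hypothetical status and cause codes, for illustration only. */
typedef enum { HC_NODE_UNKNOWN = 0, HC_NODE_UP, HC_NODE_DOWN } HcNodeStatus;
typedef enum { HC_CAUSE_NONE = 0, HC_CAUSE_FAILOVER, HC_CAUSE_FAILBACK } HcCause;

typedef struct {
    time_t       when;     /* time of the status change */
    HcNodeStatus status;   /* status the node changed to */
} HcStatusChange;

/* One record per backend node (assumed to live in shared memory). */
typedef struct {
    unsigned long  total_count;        /* health checks performed */
    unsigned long  success_count;
    unsigned long  fail_count;
    unsigned long  total_retry_count;
    unsigned long  max_retry_count;    /* most retries in a single check */
    double         total_response_ms;  /* running sum; average is derived */
    double         max_response_ms;
    time_t         last_health_check;
    time_t         last_retry;
    time_t         last_status_change;
    HcCause        last_cause;         /* cause of the last status change */
    HcNodeStatus   current_status;
    HcStatusChange history[HC_HISTORY_SIZE]; /* ring buffer of changes */
    size_t         history_len;        /* valid entries in history */
    size_t         history_next;       /* slot the next change goes into */
} HcNodeStats;

/* Record the outcome of one finished health check. */
void hc_record_check(HcNodeStats *s, int success, unsigned long retries,
                     double response_ms, time_t now)
{
    assert(s != NULL);
    s->total_count++;
    if (success)
        s->success_count++;
    else
        s->fail_count++;
    s->total_retry_count += retries;
    if (retries > s->max_retry_count)
        s->max_retry_count = retries;
    if (retries > 0)
        s->last_retry = now;
    s->total_response_ms += response_ms;
    if (response_ms > s->max_response_ms)
        s->max_response_ms = response_ms;
    s->last_health_check = now;
}

/* Record a status change, overwriting the oldest history entry when full. */
void hc_record_status_change(HcNodeStats *s, HcNodeStatus status,
                             HcCause cause, time_t now)
{
    assert(s != NULL);
    s->current_status = status;
    s->last_cause = cause;
    s->last_status_change = now;
    s->history[s->history_next].when = now;
    s->history[s->history_next].status = status;
    s->history_next = (s->history_next + 1) % HC_HISTORY_SIZE;
    if (s->history_len < HC_HISTORY_SIZE)
        s->history_len++;
}

/* Derived values; each returns 0 when no check has been recorded yet. */
double hc_average_retry(const HcNodeStats *s)
{
    return s->total_count ? (double) s->total_retry_count / s->total_count : 0.0;
}

double hc_average_response_ms(const HcNodeStats *s)
{
    return s->total_count ? s->total_response_ms / s->total_count : 0.0;
}

double hc_failure_percentage(const HcNodeStats *s)
{
    return s->total_count ? 100.0 * s->fail_count / s->total_count : 0.0;
}
```

A health check worker would call hc_record_check() once per completed check and hc_record_status_change() on failover or failback. Something like a PCP command could then read the record and report the derived averages. Locking around the shared record is left out of this sketch.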
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp