[pgpool-general: 3591] Re: Out of sync with actual db state (replication / load balancing mode)

Gervais de Montbrun gervais at silverorange.com
Thu Apr 2 03:06:24 JST 2015


Replying to my own message (I apologize). I changed the systemctl stop commands for pgpool and postgresql to use "-m smart stop". This may have sorted out my issues with stopping postgresql.
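
For reference, this is roughly what the stop commands look like now (a sketch; the exact systemd unit wiring is from my own setup, and pg_ctl is how the smart mode gets passed to postgres):

	# pgpool: smart mode waits for clients to disconnect before shutting down
	pgpool -m smart stop
	# postgresql: the equivalent smart-mode stop via pg_ctl
	pg_ctl stop -m smart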

Still curious as to why the nodes are reporting themselves as master/slave.

Cheers,
Gervais

> On Mar 31, 2015, at 5:04 PM, Gervais de Montbrun <gervais at silverorange.com> wrote:
> 
> Hi Yugo,
> 
> I understand your description below of the situation and should have realized the difference between the nodes in show pool_nodes and pcp_watchdog_info. I'm still curious why the backend nodes report themselves as "master" and "slave" when I am running in replication and load balancing mode rather than master/slave mode. Shouldn't this be a master/master situation, as writes are going to both servers simultaneously?
> 
> On another note, I am now testing my setup and have hit a snag. I have a PHP script that writes to pgpool (which then sends the writes to both back-ends) in an endless loop. When I shut down one of the postgres servers, the watchdog notices and sets that node to a "degenerated" state, but my script then quits with an error. I was expecting and hoping that it would continue to run and that I could just recover the downed node later. After some experimenting and searching, I found an old post where Tatsuo Ishii tells a user that there is a -g flag for pcp_detach_node. I then tested a graceful detach, and this does not crash my script. It looks like watchdog does not do a graceful detach, but instead immediately detaches a node when it gets the signal that the node is shutting down. Is there a way to make watchdog default to a graceful detach of a node? That way we could shut down and service a node without breaking our web sites.
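> 
> For reference, the graceful detach that works for me (pcp syntax for the pgpool-II 3.x series; node id 1 is just an example from my setup):
> 
> 	# graceful: wait for clients to disconnect, then detach node 1
> 	pcp_detach_node -g 10 cara0 9898 postgres <password> 1
> 	# non-graceful (what watchdog appears to do): detach immediately
> 	pcp_detach_node 10 cara0 9898 postgres <password> 1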
> 
> This is the mail exchange I am referencing: http://www.sraoss.jp/pipermail/pgpool-general/2014-July/003100.html
> 
> Thank you in advance for your help with this.
> 
> Cheers,
> Gervais
> 
>> On Mar 31, 2015, at 3:20 AM, Yugo Nagata <nagata at sraoss.co.jp> wrote:
>> 
>> Hi,
>> 
>> On Mon, 30 Mar 2015 11:28:15 -0300
>> Gervais de Montbrun <gervais at silverorange.com> wrote:
>> 
>>> Hi,
>>> 
>>> Thank you for your reply.
>>> 
>>> I was concerned about ignoring the pgpool_status file at startup. When pgpool starts, it checks whether the backend nodes are running, but it does not communicate with the other pgpool server to check the state of the backends.
>> 
>> Yes, the pgpool-II instances don't communicate with each other about pgpool_status.
>> I am aware of this problem and plan to enhance watchdog in this regard in the future.
>> 
>>> It is possible for a postgresql server to be running and still be in a detached state. I changed the failover and failback commands to include shutting down the postgresql server on the backend that is being detached or attached. This seems to be working fine now, and I think my setup is properly configured for all edge cases.
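>>> 
>>> Roughly, the failover side now looks like this (a sketch; the script path, ssh user, and service name are from my environment):
>>> 
>>> 	#!/bin/sh
>>> 	# failover.sh, wired into pgpool.conf as:
>>> 	#   failover_command = '/etc/pgpool-II/failover.sh %d %h'
>>> 	# %d = detached node id, %h = detached node hostname
>>> 	detached_host="$2"
>>> 	# stop postgresql on the detached backend so nothing can write
>>> 	# to it while it is out of the pool
>>> 	ssh root@"$detached_host" "systemctl stop postgresql"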
>>>
>>> One thing that I noticed is that the result of the show pool_nodes command seems to be incorrect.
>>> 	psql -Upostgres -hcara0 -c "show pool_nodes"
>>> 
>>> 	 node_id | hostname | port  | status | lb_weight | role
>>> 	---------+----------+-------+--------+-----------+--------
>>> 	 0       | cara0    | 54321 | 2      | 0.500000  | master
>>> 	 1       | cara1    | 54321 | 2      | 0.500000  | slave
>>> 	(2 rows) 
>>> 
>>> 
>>> 	pcp_watchdog_info 10 cara0 9898 postgres <password> 
>>> 	cara0 5432 9000 2
>>> 
>>> 	pcp_watchdog_info 10 cara0 9898 postgres <password> 0
>>> 	cara1 5432 9000 3
>> 
>> No problem. show pool_nodes and pcp_watchdog_info report different information.
>> 
>> "show pool_nodes" reports the status of the backend nodes, i.e., pgpool_status.
>> The status numbers mean the same as in pcp_node_info:
>> 
>>   0 - This state is only used during the initialization. PCP will never display it.
>>   1 - Node is up. No connections yet.
>>   2 - Node is up. Connections are pooled.
>>   3 - Node is down
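>> 
>> For example (pcp syntax for the pgpool-II 3.x series; the hostname and node id come from your earlier output):
>> 
>>   pcp_node_info 10 cara0 9898 postgres <password> 0
>>   # output is: hostname port status weight, e.g.
>>   # cara0 54321 2 0.500000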
>> 
>> pcp_watchdog_info reports the watchdog status of each pgpool-II, that is,
>> which pgpool-II is active (the virtual IP holder).
>> 
>> The status is represented by a digit from 1 to 4:
>> 
>>   1 - watchdog is not started
>>   2 - Standby: not holding the virtual IP
>>   3 - Active: holding the virtual IP
>>   4 - Down
>> 
>> 
>>> 
>>> We can see that the server called cara1 is in fact the active node. I'm not concerned if this is just a reporting issue that is safe to ignore, but I thought I should ask.
>>> 
>>> Cheers,
>>> Gervais
>>> 
>>>> On Mar 27, 2015, at 6:03 AM, Yugo Nagata <nagata at sraoss.co.jp> wrote:
>>>> 
>>>> Hi Gervais,
>>>> 
>>>> On Thu, 26 Mar 2015 11:39:40 -0300
>>>> Gervais de Montbrun <gervais at silverorange.com> wrote:
>>>> 
>>>>> I have two servers running with replication, load balancing, and watchdog (with virtual IP) enabled.
>>>>> 
>>>>> My plan is to remove any single point of failure from my database. The web sites I am running would point to the virtual interface's IP on whichever pgpool is currently active, and there would be two database servers behind the pool. Here is my setup as it stands:
>>>>> 
>>>>> dbserver0:
>>>>> 	pgpool - gets the virtual IP if it starts first
>>>>> 	postgres
>>>>> 
>>>>> dbserver1:
>>>>> 	pgpool
>>>>> 	postgres 
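>>>>> 
>>>>> The relevant pgpool.conf settings on each node look roughly like this (a sketch; the virtual IP and ports are placeholders):
>>>>> 
>>>>> 	replication_mode = on
>>>>> 	load_balance_mode = on
>>>>> 	use_watchdog = on
>>>>> 	delegate_IP = '10.0.0.100'            # placeholder virtual IP
>>>>> 	wd_hostname = 'dbserver0'             # 'dbserver1' on the other node
>>>>> 	other_pgpool_hostname0 = 'dbserver1'  # and vice versa
>>>>> 	other_pgpool_port0 = 9999
>>>>> 	other_wd_port0 = 9000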
>>>>> 
>>>>> I have everything working except that there can be a situation where one of the pgpools starts up with a stale pgpool_status file and thinks a backend node is up when it is down (or down when it is up). I am not sure how to sort this out, as this file seems to be written only when pgpool stops.
>>>> 
>>>> When pgpool starts up with the -D option, the old pgpool_status is discarded
>>>> and pgpool recognizes the current status of the backends.
>>>> If you want to modify backend status while pgpool is running,
>>>> you can use the pcp_attach_node or pcp_detach_node command.
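>>>> 
>>>> For example (pcp syntax for the pgpool-II 3.x series; node id 1 is just an illustration):
>>>> 
>>>>   # discard the old pgpool_status file at startup
>>>>   pgpool -D
>>>>   # re-attach a recovered backend while pgpool is running
>>>>   pcp_attach_node 10 localhost 9898 postgres <password> 1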
>>>> 
>>>>> 
>>>>> Can someone suggest how to sort this out? I know that you will likely need to review a number of config files and/or scripts in my configuration. Can you let me know what you would want to see before I sanitize them all and send them on?
>>>>> 
>>>>> Also, if my plan is completely off, I am open to any alternative suggestion as to how to best achieve my goal as described above.
>>>>> 
>>>>> Cheers,
>>>>> Gervais
>>>>> ------------------------------------
>>>>> Gervais de Montbrun
>>>>> Systems Administrator
>>>>> GTalk: gervais at silverorange.com
>>>>> 
>>>> 
>>>> 
>>>> -- 
>>>> Yugo Nagata <nagata at sraoss.co.jp>
>>> 
>> 
>> 
>> -- 
>> Yugo Nagata <nagata at sraoss.co.jp>
> 
> _______________________________________________
> pgpool-general mailing list
> pgpool-general at pgpool.net
> http://www.pgpool.net/mailman/listinfo/pgpool-general


