[pgpool-general: 2692] Re: Restart Postgresql service

Sergey Arlashin sergeyarl.maillist at gmail.com
Wed Apr 2 15:32:02 JST 2014


Hi!

Thank you for your response!

I removed these lines. 

But still, every time I start writing to the database and then restart the slave, I get the following messages in the log:


Apr  2 06:26:39 lb-node1 pgpool[14680]: connection closed. retry to create new connection pool.
Apr  2 06:26:39 lb-node1 pgpool[14631]: do_command: SFATAL
Apr  2 06:26:39 lb-node1 pgpool[14631]: connect_using_existing_connection: do_command failed. command: SET application_name TO 'psql'
Apr  2 06:26:39 lb-node1 pgpool[14650]: do_command: SFATAL
Apr  2 06:26:39 lb-node1 pgpool[14650]: connect_using_existing_connection: do_command failed. command: SET application_name TO 'psql'
Apr  2 06:26:39 lb-node1 pgpool[14660]: do_command: SFATAL
Apr  2 06:26:39 lb-node1 pgpool[14660]: connect_using_existing_connection: do_command failed. command: SET application_name TO 'psql'
Apr  2 06:26:39 lb-node1 pgpool[14442]: do_command: SFATAL
Apr  2 06:26:39 lb-node1 pgpool[14442]: connect_using_existing_connection: do_command failed. command: SET application_name TO 'psql'
Apr  2 06:26:39 lb-node1 pgpool[14680]: pool_read_kind: kind does not match between master(52) slot[1] (45)
Apr  2 06:26:39 lb-node1 pgpool[14680]: pool_read_kind: error message from 1 th backend:the database system is shutting down
Apr  2 06:26:39 lb-node1 pgpool[14673]: connection closed. retry to create new connection pool.
Apr  2 06:26:39 lb-node1 pgpool[14673]: connect_inet_domain_socket: getsockopt() detected error: Connection refused
Apr  2 06:26:39 lb-node1 pgpool[14673]: connection to db-node2.site(5432) failed
Apr  2 06:26:39 lb-node1 pgpool[14673]: new_connection: create_cp() failed
Apr  2 06:26:39 lb-node1 pgpool[14673]: degenerate_backend_set: 1 fail over request from pid 14673
Apr  2 06:26:39 lb-node1 pgpool[13800]: starting degeneration. shutdown host db-node2.site(5432)
Apr  2 06:26:39 lb-node1 pgpool[13800]: Restart all children
Apr  2 06:26:39 lb-node1 pgpool[13800]: execute command: /etc/pgpool2/scripts/failover.sh 1 0 db-node1.site /var/lib/postgresql/9.3/main/switch_master
Apr  2 06:26:39 lb-node1 pgpool[13800]: failover: set new primary node: 0
Apr  2 06:26:39 lb-node1 pgpool[13800]: failover: set new master node: 0
Apr  2 06:26:39 lb-node1 pgpool[14626]: worker process received restart request
Apr  2 06:26:39 lb-node1 pgpool[13800]: failover done. shutdown host db-node2.site(5432)
Apr  2 06:26:40 lb-node1 pgpool[14623]: pcp child process received restart request
Apr  2 06:26:40 lb-node1 pgpool[13800]: PCP child 14623 exits with status 256 in failover()
Apr  2 06:26:40 lb-node1 pgpool[13800]: fork a new PCP child pid 14815 in failover()
Apr  2 06:26:40 lb-node1 pgpool[13800]: worker child 14626 exits with status 256
Apr  2 06:26:40 lb-node1 pgpool[13800]: fork a new worker child pid 14816
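
One way to see which process asked for the failover is to pull the degenerate_backend_set lines out of the log, together with the last message the same pid logged before the request. The following is only a sketch in Python, not part of pgpool; the function name failover_requests and the regular expressions are made up for illustration and assume the syslog line format shown above.

```python
import re

# A few lines copied from the slave-restart log above.
LOG = """\
Apr  2 06:26:39 lb-node1 pgpool[14673]: connect_inet_domain_socket: getsockopt() detected error: Connection refused
Apr  2 06:26:39 lb-node1 pgpool[14673]: connection to db-node2.site(5432) failed
Apr  2 06:26:39 lb-node1 pgpool[14673]: new_connection: create_cp() failed
Apr  2 06:26:39 lb-node1 pgpool[14673]: degenerate_backend_set: 1 fail over request from pid 14673
Apr  2 06:26:39 lb-node1 pgpool[13800]: starting degeneration. shutdown host db-node2.site(5432)
"""

# Hypothetical patterns, written against the log format in this mail only.
LINE_RE = re.compile(r"pgpool\[(?P<pid>\d+)\]: (?P<msg>.*)$")
REQUEST_RE = re.compile(
    r"degenerate_backend_set: (?P<node>\d+) fail over request from pid (?P<req>\d+)"
)


def failover_requests(text):
    """Return a list of (node_id, requesting_pid, previous_message_of_that_pid)."""
    last_msg = {}  # pid -> most recent non-request message seen so far
    found = []
    for line in text.splitlines():
        m = LINE_RE.search(line)
        if not m:
            continue
        pid, msg = int(m.group("pid")), m.group("msg")
        r = REQUEST_RE.match(msg)
        if r:
            found.append((int(r.group("node")), int(r.group("req")), last_msg.get(pid)))
        else:
            last_msg[pid] = msg
    return found


for node, pid, reason in failover_requests(LOG):
    print(f"node {node}: failover requested by pid {pid} after: {reason}")
```

On this excerpt the sketch reports node 1, pid 14673, preceded by "new_connection: create_cp() failed". That suggests the request came from a child process that could not reconnect to db-node2, while pid 13800 only carried out the failover.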




When I restart the master, I get the following:


Apr  2 06:29:55 lb-node1 pgpool[14902]: connection closed. retry to create new connection pool.
Apr  2 06:29:55 lb-node1 pgpool[14886]: connection closed. retry to create new connection pool.
Apr  2 06:29:55 lb-node1 pgpool[14785]: do_command: SFATAL
Apr  2 06:29:55 lb-node1 pgpool[14785]: connect_using_existing_connection: do_command failed. command: SET application_name TO 'psql'
Apr  2 06:29:55 lb-node1 pgpool[14734]: do_command: SFATAL
Apr  2 06:29:55 lb-node1 pgpool[14734]: connect_using_existing_connection: do_command failed. command: SET application_name TO 'psql'
Apr  2 06:29:55 lb-node1 pgpool[14901]: do_command: SFATAL
Apr  2 06:29:55 lb-node1 pgpool[14901]: connect_using_existing_connection: do_command failed. command: SET application_name TO 'psql'
Apr  2 06:29:55 lb-node1 pgpool[14902]: pool_read_kind: kind does not match between master(45) slot[1] (52)
Apr  2 06:29:55 lb-node1 pgpool[14902]: pool_read_kind: error message from master backend:the database system is shutting down
Apr  2 06:29:55 lb-node1 pgpool[14886]: pool_read_kind: kind does not match between master(45) slot[1] (52)
Apr  2 06:29:55 lb-node1 pgpool[14886]: pool_read_kind: error message from master backend:the database system is shutting down
Apr  2 06:29:55 lb-node1 pgpool[14907]: pool_read_kind: kind does not match between master(45) slot[1] (52)
Apr  2 06:29:55 lb-node1 pgpool[14907]: pool_read_kind: error message from master backend:the database system is shutting down
Apr  2 06:29:55 lb-node1 pgpool[14863]: connection closed. retry to create new connection pool.
Apr  2 06:29:55 lb-node1 pgpool[14863]: pool_read_kind: kind does not match between master(45) slot[1] (52)
Apr  2 06:29:55 lb-node1 pgpool[14863]: pool_read_kind: error message from master backend:the database system is shutting down
Apr  2 06:29:55 lb-node1 pgpool[14915]: connection closed. retry to create new connection pool.
Apr  2 06:29:55 lb-node1 pgpool[14857]: connection closed. retry to create new connection pool.
Apr  2 06:29:55 lb-node1 pgpool[14857]: pool_read_kind: kind does not match between master(45) slot[1] (52)
Apr  2 06:29:55 lb-node1 pgpool[14915]: pool_read_kind: kind does not match between master(45) slot[1] (52)
Apr  2 06:29:55 lb-node1 pgpool[14915]: pool_read_kind: error message from master backend:the database system is shutting down
Apr  2 06:29:55 lb-node1 pgpool[14857]: pool_read_kind: error message from master backend:the database system is shutting down
Apr  2 06:29:55 lb-node1 pgpool[14853]: connection closed. retry to create new connection pool.
Apr  2 06:29:55 lb-node1 pgpool[14853]: pool_read_kind: kind does not match between master(45) slot[1] (52)
Apr  2 06:29:55 lb-node1 pgpool[14853]: pool_read_kind: error message from master backend:the database system is shutting down
Apr  2 06:29:55 lb-node1 pgpool[14749]: connection closed. retry to create new connection pool.
Apr  2 06:29:55 lb-node1 pgpool[14749]: pool_read_kind: kind does not match between master(45) slot[1] (52)
Apr  2 06:29:55 lb-node1 pgpool[14749]: pool_read_kind: error message from master backend:the database system is shutting down
Apr  2 06:29:55 lb-node1 pgpool[14908]: connection closed. retry to create new connection pool.
Apr  2 06:29:55 lb-node1 pgpool[14908]: pool_read_kind: kind does not match between master(45) slot[1] (52)
Apr  2 06:29:55 lb-node1 pgpool[14908]: pool_read_kind: error message from master backend:the database system is shutting down
Apr  2 06:29:55 lb-node1 pgpool[14852]: connection closed. retry to create new connection pool.
Apr  2 06:29:55 lb-node1 pgpool[14852]: pool_read_kind: kind does not match between master(45) slot[1] (52)
Apr  2 06:29:55 lb-node1 pgpool[14852]: pool_read_kind: error message from master backend:the database system is shutting down
Apr  2 06:29:55 lb-node1 pgpool[14841]: connection closed. retry to create new connection pool.
Apr  2 06:29:55 lb-node1 pgpool[14841]: pool_read_kind: kind does not match between master(45) slot[1] (52)
Apr  2 06:29:55 lb-node1 pgpool[14841]: pool_read_kind: error message from master backend:the database system is shutting down
Apr  2 06:29:55 lb-node1 pgpool[14908]: pool_read_kind: kind does not match between master(45) slot[1] (52)
Apr  2 06:29:55 lb-node1 pgpool[14908]: pool_read_kind: error message from master backend:the database system is shutting down
Apr  2 06:29:55 lb-node1 pgpool[14852]: pool_read_kind: kind does not match between master(45) slot[1] (52)
Apr  2 06:29:55 lb-node1 pgpool[14852]: pool_read_kind: error message from master backend:the database system is shutting down
Apr  2 06:29:55 lb-node1 pgpool[14902]: pool_read_kind: kind does not match between master(45) slot[1] (52)
Apr  2 06:29:55 lb-node1 pgpool[14902]: pool_read_kind: error message from master backend:the database system is shutting down
Apr  2 06:29:55 lb-node1 pgpool[14886]: pool_read_kind: kind does not match between master(45) slot[1] (52)
Apr  2 06:29:55 lb-node1 pgpool[14886]: pool_read_kind: error message from master backend:the database system is shutting down
Apr  2 06:29:55 lb-node1 pgpool[14853]: pool_read_kind: kind does not match between master(45) slot[1] (52)
Apr  2 06:29:55 lb-node1 pgpool[14853]: pool_read_kind: error message from master backend:the database system is shutting down
Apr  2 06:29:55 lb-node1 pgpool[14911]: connection closed. retry to create new connection pool.
Apr  2 06:29:55 lb-node1 pgpool[14911]: pool_read_kind: kind does not match between master(45) slot[1] (52)
Apr  2 06:29:55 lb-node1 pgpool[14911]: pool_read_kind: error message from master backend:the database system is shutting down
Apr  2 06:29:55 lb-node1 pgpool[13800]: s_do_auth: expecting R got E
Apr  2 06:29:55 lb-node1 pgpool[13800]: make_persistent_db_connection: s_do_auth failed
Apr  2 06:29:55 lb-node1 pgpool[14915]: pool_read_kind: kind does not match between master(45) slot[1] (52)
Apr  2 06:29:55 lb-node1 pgpool[14915]: pool_read_kind: error message from master backend:the database system is shutting down
Apr  2 06:29:55 lb-node1 pgpool[13800]: s_do_auth: expecting R got E
Apr  2 06:29:55 lb-node1 pgpool[13800]: make_persistent_db_connection: s_do_auth failed
Apr  2 06:29:55 lb-node1 pgpool[13800]: health check failed. 0 th host db-node1.site at port 5432 is down
Apr  2 06:29:55 lb-node1 pgpool[13800]: health check retry sleep time: 2 second(s)
Apr  2 06:29:55 lb-node1 pgpool[14749]: pool_read_kind: kind does not match between master(45) slot[1] (52)
Apr  2 06:29:55 lb-node1 pgpool[14749]: pool_read_kind: error message from master backend:the database system is shutting down
Apr  2 06:29:55 lb-node1 pgpool[14916]: connection closed. retry to create new connection pool.
Apr  2 06:29:55 lb-node1 pgpool[14916]: pool_read_kind: kind does not match between master(45) slot[1] (52)
Apr  2 06:29:55 lb-node1 pgpool[14916]: pool_read_kind: error message from master backend:the database system is shutting down
Apr  2 06:29:55 lb-node1 pgpool[14734]: connection closed. retry to create new connection pool.
Apr  2 06:29:55 lb-node1 pgpool[14734]: pool_read_kind: kind does not match between master(45) slot[1] (52)
Apr  2 06:29:55 lb-node1 pgpool[14734]: pool_read_kind: error message from master backend:the database system is shutting down
Apr  2 06:29:55 lb-node1 pgpool[14785]: connection closed. retry to create new connection pool.
Apr  2 06:29:56 lb-node1 pgpool[14785]: pool_read_kind: kind does not match between master(45) slot[1] (52)
Apr  2 06:29:56 lb-node1 pgpool[14785]: pool_read_kind: error message from master backend:the database system is shutting down
Apr  2 06:29:56 lb-node1 pgpool[14857]: pool_read_kind: kind does not match between master(45) slot[1] (52)
Apr  2 06:29:56 lb-node1 pgpool[14857]: pool_read_kind: error message from master backend:the database system is shutting down
Apr  2 06:29:56 lb-node1 pgpool[14863]: pool_read_kind: kind does not match between master(45) slot[1] (52)
Apr  2 06:29:56 lb-node1 pgpool[14863]: pool_read_kind: error message from master backend:the database system is shutting down
Apr  2 06:29:56 lb-node1 pgpool[14799]: pool_read_kind: kind does not match between master(45) slot[1] (52)
Apr  2 06:29:56 lb-node1 pgpool[14799]: pool_read_kind: error message from master backend:the database system is shutting down
Apr  2 06:29:56 lb-node1 pgpool[14907]: pool_read_kind: kind does not match between master(45) slot[1] (52)
Apr  2 06:29:56 lb-node1 pgpool[14907]: pool_read_kind: error message from master backend:the database system is shutting down
Apr  2 06:29:56 lb-node1 pgpool[14901]: connection closed. retry to create new connection pool.
Apr  2 06:29:56 lb-node1 pgpool[14901]: pool_read_kind: kind does not match between master(45) slot[1] (52)
Apr  2 06:29:56 lb-node1 pgpool[14901]: pool_read_kind: error message from master backend:the database system is shutting down
Apr  2 06:29:56 lb-node1 pgpool[14781]: connection closed. retry to create new connection pool.
Apr  2 06:29:56 lb-node1 pgpool[14851]: connection closed. retry to create new connection pool.
Apr  2 06:29:56 lb-node1 pgpool[14781]: pool_read_kind: kind does not match between master(45) slot[1] (52)
Apr  2 06:29:56 lb-node1 pgpool[14781]: pool_read_kind: error message from master backend:the database system is shutting down
Apr  2 06:29:56 lb-node1 pgpool[14851]: pool_read_kind: kind does not match between master(45) slot[1] (52)
Apr  2 06:29:56 lb-node1 pgpool[14851]: pool_read_kind: error message from master backend:the database system is shutting down
Apr  2 06:29:56 lb-node1 pgpool[14907]: pool_read_kind: kind does not match between master(45) slot[1] (52)
Apr  2 06:29:56 lb-node1 pgpool[14907]: pool_read_kind: error message from master backend:the database system is shutting down
Apr  2 06:29:56 lb-node1 pgpool[14749]: pool_read: read failed (Connection reset by peer)
Apr  2 06:29:56 lb-node1 pgpool[14749]: degenerate_backend_set: 0 fail over request from pid 14749
Apr  2 06:29:56 lb-node1 pgpool[13800]: starting degeneration. shutdown host db-node1.site(5432)
Apr  2 06:29:56 lb-node1 pgpool[13800]: Restart all children
Apr  2 06:29:56 lb-node1 pgpool[13800]: execute command: /etc/pgpool2/scripts/failover.sh 0 0 db-node2.site /var/lib/postgresql/9.3/main/switch_master
Apr  2 06:29:56 lb-node1 pgpool[14749]: pool_flush_it: write failed to backend (0). reason: Broken pipe offset: 0 wlen: 5
Apr  2 06:29:56 lb-node1 pgpool[13800]: find_primary_node_repeatedly: waiting for finding a primary node
Apr  2 06:30:02 lb-node1 pgpool[13800]: find_primary_node: primary node id is 1
Apr  2 06:30:02 lb-node1 pgpool[13800]: failover: set new primary node: 1
Apr  2 06:30:02 lb-node1 pgpool[13800]: failover: set new master node: 1
Apr  2 06:30:02 lb-node1 pgpool[14838]: worker process received restart request
Apr  2 06:30:02 lb-node1 pgpool[13800]: failover done. shutdown host db-node1.site(5432)
Apr  2 06:30:03 lb-node1 pgpool[14837]: pcp child process received restart request
Apr  2 06:30:03 lb-node1 pgpool[13800]: PCP child 14837 exits with status 256 in failover()
Apr  2 06:30:03 lb-node1 pgpool[13800]: fork a new PCP child pid 15332 in failover()
Apr  2 06:30:03 lb-node1 pgpool[13800]: worker child 14838 exits with status 256
Apr  2 06:30:03 lb-node1 pgpool[13800]: fork a new worker child pid 15333
Apr  2 06:30:03 lb-node1 pgpool[13800]: after some retrying backend returned to healthy state
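
The timestamps in this log can be compared with the health check settings quoted further down (health_check_max_retries = 16, health_check_retry_delay = 2). The sketch below is plain Python arithmetic, not pgpool code; retry_window and seconds_between are hypothetical helper names, and treating max_retries * retry_delay as the retry budget is an assumption that ignores the time spent on each connection attempt.

```python
from datetime import datetime

# Settings from the pgpool.conf excerpt quoted later in this thread.
HEALTH_CHECK_MAX_RETRIES = 16
HEALTH_CHECK_RETRY_DELAY = 2  # seconds


def retry_window(max_retries, retry_delay):
    """Approximate time (seconds) spent sleeping between health check retries."""
    return max_retries * retry_delay


def seconds_between(start, end):
    """Difference in seconds between two HH:MM:SS timestamps on the same day."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds())


# Timestamps taken from the master-restart log above.
first_health_check_failure = "06:29:55"  # "health check failed. 0 th host ... is down"
child_failover_request = "06:29:56"      # "degenerate_backend_set: 0 fail over request from pid 14749"
backend_healthy_again = "06:30:03"       # "after some retrying backend returned to healthy state"

window = retry_window(HEALTH_CHECK_MAX_RETRIES, HEALTH_CHECK_RETRY_DELAY)
print("retry budget (s):", window)
print("failover requested after (s):",
      seconds_between(first_health_check_failure, child_failover_request))
print("backend healthy again after (s):",
      seconds_between(first_health_check_failure, backend_healthy_again))
```

Under these assumptions the retry budget is about 32 seconds and the backend was healthy again after 8 seconds, but child 14749 requested the failover only 1 second after the first health check failure. If that reading of the log is right, the failover here was not triggered by the health check running out of retries.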








On Apr 2, 2014, at 2:58 AM, Tatsuo Ishii <ishii at postgresql.org> wrote:

> I think you need to modify the pgpool source to get the desired
> behavior.
> 
> around line 4938 of pool_process_query.c:
> 
> 				/*
> 				 * admin shutdown postmaster or postmaster goes down
> 				 */
> 				r = detect_postmaster_down_error(CONNECTION(backend, i), MAJOR(backend));
> 				if (r == SPECIFIED_ERROR)
> 				{
> 					pool_log("postmaster on DB node %d was shutdown by administrative command", i);
> 					/* detach backend node. */
> 					was_error = 1;
> 					if (!VALID_BACKEND(i))
> 						break;
> 					notice_backend_error(i);
> 					sleep(5);
> 					break;
> 				}
> 				else if (r < 0)
> 				{
> 					/*
> 					 * This could happen after detecting backend errors and before actually
> 					 * detaching the backend. In this case reading from backend socket will
> 					 * return EOF and it's better to close this session. So returns POOL_END.
> 					 */ 
> 					pool_log("detect_postmaster_down_error returns error on backend %d. Going to close this session.", i);
> 					return POOL_END;
> 				}
> 
> By removing those lines, you could get the desired behavior though I
> have not tested it.
> 
> In the long term, we should be able to enable/disable failover when
> PostgreSQL is shut down.
> 
> Best regards,
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> English: http://www.sraoss.co.jp/index_en.php
> Japanese: http://www.sraoss.co.jp
> 
> 
>> Any suggestions? 
>> 
>> I forgot to mention that I'm using streaming replication.
>> 
>> The thing is, some PostgreSQL configuration parameters (like max_connections or shared_buffers) take effect only at server start. Therefore, if I need to change them, I have to restart the PostgreSQL backends.
>> 
>> But if my application is writing to or reading from the database at that moment, the restart almost always makes Pgpool consider the restarted backend faulty.
>> If this happens to the slave node, I can reattach it with pcp_attach_node almost instantly. But if I restart the master, Pgpool promotes the slave to master, and then I need to run pcp_recovery_node, which resyncs the new slave with the new master. This takes some time and is very inconvenient.
>> 
>> I have the following health_check parameters at the moment:
>> 
>> health_check_period           = 12
>> health_check_timeout          = 20
>> health_check_max_retries      = 16
>> health_check_retry_delay      = 2
>> 
>> I also restart the backends with:
>> 
>> sudo -u postgres /usr/lib/postgresql/9.3/bin/pg_ctl -D /var/lib/postgresql/9.3/main/  restart -m fast
>> 
>> But the behaviour remains the same. 
>> 
>> I can stop Pgpool while restarting the PostgreSQL backends and then start it again. In that case everything is OK, but it is very inconvenient as well.
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> On Mar 27, 2014, at 11:21 PM, Sergey Arlashin <sergeyarl.maillist at gmail.com> wrote:
>> 
>>> I tried different health check settings and used 
>>> 
>>> sudo -u postgres /usr/lib/postgresql/9.3/bin/pg_ctl -D /var/lib/postgresql/9.3/main/  restart -m fast
>>> 
>>> to restart PostgreSQL, but each time I restart one of the PostgreSQL backends I still get something like this in pgpool.log:
>>> 
>>> Mar 27 19:13:35 lb-node1 pgpool[23886]: ProcessFrontendResponse: failed to read kind from frontend. frontend abnormally exited
>>> Mar 27 19:13:35 lb-node1 pgpool[13960]: postmaster on DB node 1 was shutdown by administrative command
>>> Mar 27 19:13:35 lb-node1 pgpool[32459]: postmaster on DB node 1 was shutdown by administrative command
>>> Mar 27 19:13:35 lb-node1 pgpool[13960]: degenerate_backend_set: 1 fail over request from pid 13960
>>> Mar 27 19:13:35 lb-node1 pgpool[32146]: postmaster on DB node 1 was shutdown by administrative command
>>> Mar 27 19:13:35 lb-node1 pgpool[32459]: degenerate_backend_set: 1 fail over request from pid 32459
>>> Mar 27 19:13:35 lb-node1 pgpool[32146]: degenerate_backend_set: 1 fail over request from pid 32146
>>> Mar 27 19:13:35 lb-node1 pgpool[32187]: starting degeneration. shutdown host db-node2.site(5432)
>>> Mar 27 19:13:35 lb-node1 pgpool[32187]: Restart all children
>>> Mar 27 19:13:35 lb-node1 pgpool[4684]: postmaster on DB node 1 was shutdown by administrative command
>>> Mar 27 19:13:35 lb-node1 pgpool[32187]: execute command: /etc/pgpool2/scripts/failover.sh 1 1 db-node1.site /var/lib/postgresql/9.3/main/switch_master
>>> Mar 27 19:13:40 lb-node1 pgpool[32187]: find_primary_node_repeatedly: waiting for finding a primary node
>>> Mar 27 19:13:44 lb-node1 pgpool[32187]: find_primary_node: primary node id is 0
>>> Mar 27 19:13:44 lb-node1 pgpool[32187]: failover: set new primary node: 0
>>> Mar 27 19:13:44 lb-node1 pgpool[32187]: failover: set new master node: 0
>>> Mar 27 19:13:44 lb-node1 pgpool[4684]: degenerate_backend_set: 1 fail over request from pid 4684
>>> Mar 27 19:13:44 lb-node1 pgpool[32295]: worker process received restart request
>>> Mar 27 19:13:44 lb-node1 pgpool[32187]: failover done. shutdown host db-node2.site(5432)
>>> Mar 27 19:13:45 lb-node1 pgpool[32289]: pcp child process received restart request
>>> Mar 27 19:13:45 lb-node1 pgpool[32187]: PCP child 32289 exits with status 256 in failover()
>>> Mar 27 19:13:45 lb-node1 pgpool[32187]: fork a new PCP child pid 15547 in failover()
>>> Mar 27 19:13:45 lb-node1 pgpool[32187]: worker child 32295 exits with status 256
>>> Mar 27 19:13:46 lb-node1 pgpool[32187]: fork a new worker child pid 15548
>>> Mar 27 19:13:46 lb-node1 pgpool[32187]: failover: no backends are degenerated
>>> 
>>> 
>>> 
>>> 
>>> 
>>> On Mar 27, 2014, at 11:46 AM, Tatsuo Ishii <ishii at postgresql.org> wrote:
>>> 
>>>>> Hi!
>>>>> Almost every time I restart a backend PostgreSQL service (while pgpool is running), pgpool marks that node as failed.
>>>>> So I'm wondering if there is some 'proper' way to restart PostgreSQL backends (without stopping the pgpool service) that doesn't break the cluster.
>>>> 
>>>> You can tweak health check parameters. For example,
>>>> 
>>>> health_check_max_retries = 10
>>>> health_check_retry_delay = 1
>>>> 
>>>> Also make sure that you restart PostgreSQL with "fast" mode.
>>>> 
>>>> Best regards,
>>>> --
>>>> Tatsuo Ishii
>>>> SRA OSS, Inc. Japan
>>>> English: http://www.sraoss.co.jp/index_en.php
>>>> Japanese: http://www.sraoss.co.jp
>>> 
>> 


