[pgpool-hackers: 375] Re: Exception Manager for pgpool
Muhammad Usama
m.usama at gmail.com
Fri Oct 11 21:53:31 JST 2013
Hi
It seems the repository changed after I had generated the patch.
Please find the updated patch, rebased against the current state of git.
Thanks
Usama
On Fri, Oct 11, 2013 at 9:47 AM, Ahsan Hadi <ahsan.hadi at enterprisedb.com>wrote:
> Usama,
> Please take a look.
>
> Tatsuo,
> I was able to apply this patch cleanly on pgpool master and test it
> before Usama sent the patch. It is likely that we have made some changes
> to the master branch since then.
>
>
> On Fri, Oct 11, 2013 at 4:03 AM, Tatsuo Ishii <ishii at postgresql.org>wrote:
>
>> Usama,
>>
>> I have applied your patch and got the following error:
>>
>> Hunk #12 FAILED at 499.
>> 1 out of 12 hunks FAILED -- saving rejects to file src/main/main.c.rej
>>
>> I also attached main.c.rej.
>> --
>> Tatsuo Ishii
>> SRA OSS, Inc. Japan
>> English: http://www.sraoss.co.jp/index_en.php
>> Japanese: http://www.sraoss.co.jp
>>
>> > Muhammad,
>> >
>> > Thank you for your great work! I'll look into this.
>> > --
>> > Tatsuo Ishii
>> > SRA OSS, Inc. Japan
>> > English: http://www.sraoss.co.jp/index_en.php
>> > Japanese: http://www.sraoss.co.jp
>> >
>> >> Hi
>> >>
>> >> I am working on adding an exception manager to pgpool, and my plan
>> >> is to use the PostgreSQL exception manager API (elog and friends).
>> >> Since the exception manager in PostgreSQL uses a long jump, importing
>> >> this API into pgpool will affect all existing pgpool code flows,
>> >> especially in the error case, so a lot of care will be required for
>> >> this integration. Secondly, the elog API and its friends will touch
>> >> almost all parts of the pgpool source code, which would add up to a
>> >> very large patch.
>> >> So instead of throwing one very large patch at the community, my plan
>> >> is to divide this task into multiple smaller sub-tasks, so that the
>> >> patches are easier to maintain and review.
>> >>
>> >> To cut to the chase, attached is the first of a series of related
>> >> patches to come. This is the first-cut patch implementing the
>> >> exception manager in pgpool. As described above, the exception
>> >> manager and related code are borrowed from the PostgreSQL source
>> >> code. Since the exception manager (elog API) is very closely tied to
>> >> the memory manager in PostgreSQL (palloc API), the patch also borrows
>> >> PostgreSQL's memory manager.
>> >>
>> >> Below is a short description of what is part of this patch.
>> >>
>> >> -- The exception manager API of PostgreSQL is added to pgpool. The
>> >> API consists of the elog.c and elog.h files. Since this API is very
>> >> extensive and is designed for PostgreSQL, I have modified it a little
>> >> to fit it properly into pgpool; most of the modifications are the
>> >> removal of code that is not required for pgpool.
>> >>
>> >> -- Added the on_proc_exit callback mechanism of PostgreSQL, to
>> >> facilitate cleanup at exit time.
>> >>
>> >> -- Added PostgreSQL's memory manager (palloc API). This includes the
>> >> client-side palloc functions placed in the 'src/tools' directory
>> >> (fe_memutils).
>> >>
>> >> -- Removed the existing memory manager, which was very minimalistic
>> >> and was not integrated into all parts of the code.
>> >>
>> >> -- I have also tried to refactor some portions of the code to make
>> >> it more readable at first glance. This includes:
>> >>
>> >> - Dividing the main.c file into two files, main.c and pgpool_main.c.
>> >> Now main.c only contains the code related to the early
>> >> initialisation of pgpool and the parsing of command line options;
>> >> the actual logic of the pgpool main process is moved to the new
>> >> pgpool_main.c file.
>> >> - Breaking up some large functions in child.c into smaller functions.
>> >> - Rewriting pgpool's main loop logic to make the code more readable.
>> >>
>> >>
>> >> Remaining TODOs on this front:
>> >>
>> >> -- The current patch only integrates the memory and exception
>> >> managers in the main process and in the connection-creation segment
>> >> of the pgpool child process. Integration of the newly added APIs
>> >> into the pcp and worker child process code will be done in the next
>> >> patch.
>> >>
>> >> -- Integration of the newly added API into the query processor logic
>> >> in the child process. (This will be the toughest part.)
>> >>
>> >> -- The elog.c and elog.h files need some cleanup (to remove unwanted
>> >> functions and data members of the ErrorData structure), but this
>> >> will be done at the end, once we are completely sure whether each
>> >> item in there is required or not.
>> >>
>> >>
>> >> Thanks
>> >> Muhammad Usama
>> > _______________________________________________
>> > pgpool-hackers mailing list
>> > pgpool-hackers at pgpool.net
>> > http://www.pgpool.net/mailman/listinfo/pgpool-hackers
>>
>> --- src/main/main.c
>> +++ src/main/main.c
>> @@ -499,1915 +-23,60 @@
>> fd = open(pool_config->pid_file_name, O_CREAT|O_WRONLY,
>> S_IRUSR|S_IWUSR);
>> if (fd == -1)
>> {
>> - pool_error("could not open pid file as %s. reason: %s",
>> - pool_config->pid_file_name,
>> strerror(errno));
>> - pool_shmem_exit(1);
>> - exit(1);
>> + ereport(FATAL,
>> + (errmsg("could not open pid file as %s. reason:
>> %s",
>> + pool_config->pid_file_name,
>> strerror(errno))));
>> }
>> snprintf(pidbuf, sizeof(pidbuf), "%d", (int)getpid());
>> if (write(fd, pidbuf, strlen(pidbuf)+1) == -1)
>> {
>> - pool_error("could not write pid file as %s. reason: %s",
>> - pool_config->pid_file_name,
>> strerror(errno));
>> close(fd);
>> - pool_shmem_exit(1);
>> - exit(1);
>> + ereport(FATAL,
>> + (errmsg("could not write pid file as %s. reason:
>> %s",
>> + pool_config->pid_file_name,
>> strerror(errno))));
>> }
>> if (fsync(fd) == -1)
>> {
>> - pool_error("could not fsync pid file as %s. reason: %s",
>> - pool_config->pid_file_name,
>> strerror(errno));
>> close(fd);
>> - pool_shmem_exit(1);
>> - exit(1);
>> + ereport(FATAL,
>> + (errmsg("could not fsync pid file as %s. reason:
>> %s",
>> + pool_config->pid_file_name,
>> strerror(errno))));
>> }
>> if (close(fd) == -1)
>> {
>> - pool_error("could not close pid file as %s. reason: %s",
>> - pool_config->pid_file_name,
>> strerror(errno));
>> - pool_shmem_exit(1);
>> - exit(1);
>> + ereport(FATAL,
>> + (errmsg("could not close pid file as %s. reason:
>> %s",
>> + pool_config->pid_file_name,
>> strerror(errno))));
>> }
>> + /* register the call back to delete the pid file at system exit */
>> + on_proc_exit(FileUnlink, (Datum) pool_config->pid_file_name);
>> }
>>
>> /*
>> -* Read the status file
>> -*/
>> -static int read_status_file(bool discard_status)
>> + * get_config_file_name: return full path of pgpool.conf.
>> + */
>> +char *get_config_file_name(void)
>> {
>> - FILE *fd;
>> - char fnamebuf[POOLMAXPATHLEN];
>> - int i;
>> - bool someone_wakeup = false;
>> -
>> - snprintf(fnamebuf, sizeof(fnamebuf), "%s/%s",
>> pool_config->logdir, STATUS_FILE_NAME);
>> - fd = fopen(fnamebuf, "r");
>> - if (!fd)
>> - {
>> - pool_log("Backend status file %s does not exist",
>> fnamebuf);
>> - return -1;
>> - }
>> -
>> - /*
>> - * If discard_status is true, unlink pgpool_status and
>> - * do not restore previous status.
>> - */
>> - if (discard_status)
>> - {
>> - fclose(fd);
>> - if (unlink(fnamebuf) == 0)
>> - {
>> - pool_log("Backend status file %s discarded",
>> fnamebuf);
>> - }
>> - else
>> - {
>> - pool_error("Failed to discard backend status file
>> %s reason:%s", fnamebuf, strerror(errno));
>> - }
>> - return 0;
>> - }
>> -
>> - if (fread(&backend_rec, 1, sizeof(backend_rec), fd) !=
>> sizeof(backend_rec))
>> - {
>> - pool_error("Could not read backend status file as %s.
>> reason: %s",
>> - fnamebuf, strerror(errno));
>> - fclose(fd);
>> - return -1;
>> - }
>> - fclose(fd);
>> -
>> - for (i=0;i< pool_config->backend_desc->num_backends;i++)
>> - {
>> - if (backend_rec.status[i] == CON_DOWN)
>> - {
>> - BACKEND_INFO(i).backend_status = CON_DOWN;
>> - pool_log("read_status_file: %d th backend is set
>> to down status", i);
>> - }
>> - else
>> - {
>> - BACKEND_INFO(i).backend_status = CON_CONNECT_WAIT;
>> - someone_wakeup = true;
>> - }
>> - }
>> -
>> - /*
>> - * If no one woke up, we regard the status file bogus
>> - */
>> - if (someone_wakeup == false)
>> - {
>> - for (i=0;i< pool_config->backend_desc->num_backends;i++)
>> - {
>> - BACKEND_INFO(i).backend_status = CON_CONNECT_WAIT;
>> - }
>> - }
>> -
>> - return 0;
>> + return conf_file;
>> }
>>
>> /*
>> -* Write the pid file
>> -*/
>> -static int write_status_file(void)
>> -{
>> - FILE *fd;
>> - char fnamebuf[POOLMAXPATHLEN];
>> - int i;
>> -
>> - snprintf(fnamebuf, sizeof(fnamebuf), "%s/%s",
>> pool_config->logdir, STATUS_FILE_NAME);
>> - fd = fopen(fnamebuf, "w");
>> - if (!fd)
>> - {
>> - pool_error("Could not open status file %s", fnamebuf);
>> - return -1;
>> - }
>> -
>> - memset(&backend_rec, 0, sizeof(backend_rec));
>> -
>> - for (i=0;i< pool_config->backend_desc->num_backends;i++)
>> - {
>> - backend_rec.status[i] = BACKEND_INFO(i).backend_status;
>> - }
>> -
>> - if (fwrite(&backend_rec, 1, sizeof(backend_rec), fd) !=
>> sizeof(backend_rec))
>> - {
>> - pool_error("Could not write backend status file as %s.
>> reason: %s",
>> - fnamebuf, strerror(errno));
>> - fclose(fd);
>> - return -1;
>> - }
>> - fclose(fd);
>> - return 0;
>> -}
>> -
>> -/*
>> - * fork a child for PCP
>> - */
>> -pid_t pcp_fork_a_child(int unix_fd, int inet_fd, char *pcp_conf_file)
>> -{
>> - pid_t pid;
>> -
>> - pid = fork();
>> -
>> - if (pid == 0)
>> - {
>> - close(pipe_fds[0]);
>> - close(pipe_fds[1]);
>> -
>> - myargv = save_ps_display_args(myargc, myargv);
>> -
>> - /* call PCP child main */
>> - POOL_SETMASK(&UnBlockSig);
>> - health_check_timer_expired = 0;
>> - reload_config_request = 0;
>> - run_as_pcp_child = true;
>> - pcp_do_child(unix_fd, inet_fd, pcp_conf_file);
>> - }
>> - else if (pid == -1)
>> - {
>> - pool_error("fork() failed. reason: %s", strerror(errno));
>> - myexit(1);
>> - }
>> - return pid;
>> -}
>> -
>> -/*
>> -* fork a child
>> -*/
>> -pid_t fork_a_child(int unix_fd, int inet_fd, int id)
>> -{
>> - pid_t pid;
>> -
>> - pid = fork();
>> -
>> - if (pid == 0)
>> - {
>> - /* Before we unconditionally closed pipe_fds[0] and
>> pipe_fds[1]
>> - * here, which is apparently wrong since in the start up
>> of
>> - * pgpool, pipe(2) is not called yet and it mistakenly
>> closes
>> - * fd 0. Now we check the fd > 0 before close(), expecting
>> - * pipe returns fds greater than 0. Note that we cannot
>> - * unconditionally remove close(2) calls since
>> fork_a_child()
>> - * may be called *after* pgpool starting up.
>> - */
>> - if (pipe_fds[0] > 0)
>> - {
>> - close(pipe_fds[0]);
>> - close(pipe_fds[1]);
>> - }
>> -
>> - myargv = save_ps_display_args(myargc, myargv);
>> -
>> - /* call child main */
>> - POOL_SETMASK(&UnBlockSig);
>> - health_check_timer_expired = 0;
>> - reload_config_request = 0;
>> - my_proc_id = id;
>> - run_as_pcp_child = false;
>> - do_child(unix_fd, inet_fd);
>> - }
>> - else if (pid == -1)
>> - {
>> - pool_error("fork() failed. reason: %s", strerror(errno));
>> - myexit(1);
>> - }
>> - return pid;
>> -}
>> -
>> -/*
>> -* fork worker child process
>> -*/
>> -pid_t worker_fork_a_child()
>> -{
>> - pid_t pid;
>> -
>> - pid = fork();
>> -
>> - if (pid == 0)
>> - {
>> - /* Before we unconditionally closed pipe_fds[0] and
>> pipe_fds[1]
>> - * here, which is apparently wrong since in the start up
>> of
>> - * pgpool, pipe(2) is not called yet and it mistakenly
>> closes
>> - * fd 0. Now we check the fd > 0 before close(), expecting
>> - * pipe returns fds greater than 0. Note that we cannot
>> - * unconditionally remove close(2) calls since
>> fork_a_child()
>> - * may be called *after* pgpool starting up.
>> - */
>> - if (pipe_fds[0] > 0)
>> - {
>> - close(pipe_fds[0]);
>> - close(pipe_fds[1]);
>> - }
>> -
>> - myargv = save_ps_display_args(myargc, myargv);
>> -
>> - /* call child main */
>> - POOL_SETMASK(&UnBlockSig);
>> - health_check_timer_expired = 0;
>> - reload_config_request = 0;
>> - do_worker_child();
>> - }
>> - else if (pid == -1)
>> - {
>> - pool_error("fork() failed. reason: %s", strerror(errno));
>> - myexit(1);
>> - }
>> - return pid;
>> -}
>> -
>> -/*
>> -* create inet domain socket
>> -*/
>> -static int create_inet_domain_socket(const char *hostname, const int
>> port)
>> -{
>> - struct sockaddr_in addr;
>> - int fd;
>> - int status;
>> - int one = 1;
>> - int len;
>> - int backlog;
>> -
>> - fd = socket(AF_INET, SOCK_STREAM, 0);
>> - if (fd == -1)
>> - {
>> - pool_error("Failed to create INET domain socket. reason:
>> %s", strerror(errno));
>> - myexit(1);
>> - }
>> - if ((setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, (char *) &one,
>> - sizeof(one))) == -1)
>> - {
>> - pool_error("setsockopt() failed. reason: %s",
>> strerror(errno));
>> - myexit(1);
>> - }
>> -
>> - memset((char *) &addr, 0, sizeof(addr));
>> - addr.sin_family = AF_INET;
>> -
>> - if (strcmp(hostname, "*")==0)
>> - {
>> - addr.sin_addr.s_addr = htonl(INADDR_ANY);
>> - }
>> - else
>> - {
>> - struct hostent *hostinfo;
>> -
>> - hostinfo = gethostbyname(hostname);
>> - if (!hostinfo)
>> - {
>> - pool_error("could not resolve host name \"%s\":
>> %s", hostname, hstrerror(h_errno));
>> - myexit(1);
>> - }
>> - addr.sin_addr = *(struct in_addr *) hostinfo->h_addr;
>> - }
>> -
>> - addr.sin_port = htons(port);
>> - len = sizeof(struct sockaddr_in);
>> - status = bind(fd, (struct sockaddr *)&addr, len);
>> - if (status == -1)
>> - {
>> - char *host = "", *serv = "";
>> - char hostname[NI_MAXHOST], servname[NI_MAXSERV];
>> - if (getnameinfo((struct sockaddr *) &addr, len, hostname,
>> sizeof(hostname), servname, sizeof(servname), 0) == 0) {
>> - host = hostname;
>> - serv = servname;
>> - }
>> - pool_error("bind(%s:%s) failed. reason: %s", host, serv,
>> strerror(errno));
>> - myexit(1);
>> - }
>> -
>> - backlog = pool_config->num_init_children * 2;
>> - if (backlog > PGPOOLMAXLITSENQUEUELENGTH)
>> - backlog = PGPOOLMAXLITSENQUEUELENGTH;
>> -
>> - status = listen(fd, backlog);
>> - if (status < 0)
>> - {
>> - pool_error("listen() failed. reason: %s",
>> strerror(errno));
>> - myexit(1);
>> - }
>> - return fd;
>> -}
>> -
>> -/*
>> -* create UNIX domain socket
>> -*/
>> -static int create_unix_domain_socket(struct sockaddr_un un_addr_tmp)
>> -{
>> - struct sockaddr_un addr;
>> - int fd;
>> - int status;
>> - int len;
>> -
>> - fd = socket(AF_UNIX, SOCK_STREAM, 0);
>> - if (fd == -1)
>> - {
>> - pool_error("Failed to create UNIX domain socket. reason:
>> %s", strerror(errno));
>> - myexit(1);
>> - }
>> - memset((char *) &addr, 0, sizeof(addr));
>> - addr.sun_family = AF_UNIX;
>> - snprintf(addr.sun_path, sizeof(addr.sun_path), "%s",
>> un_addr_tmp.sun_path);
>> - len = sizeof(struct sockaddr_un);
>> - status = bind(fd, (struct sockaddr *)&addr, len);
>> - if (status == -1)
>> - {
>> - pool_error("bind(%s) failed. reason: %s", addr.sun_path,
>> strerror(errno));
>> - myexit(1);
>> - }
>> -
>> - if (chmod(un_addr_tmp.sun_path, 0777) == -1)
>> - {
>> - pool_error("chmod() failed. reason: %s", strerror(errno));
>> - myexit(1);
>> - }
>> -
>> - status = listen(fd, PGPOOLMAXLITSENQUEUELENGTH);
>> - if (status < 0)
>> - {
>> - pool_error("listen() failed. reason: %s",
>> strerror(errno));
>> - myexit(1);
>> - }
>> - return fd;
>> -}
>> -
>> -static void myunlink(const char* path)
>> -{
>> - if (unlink(path) == 0) return;
>> - pool_error("unlink(%s) failed: %s", path, strerror(errno));
>> -}
>> -
>> -static void myexit(int code)
>> -{
>> - int i;
>> -
>> - if (getpid() != mypid)
>> - return;
>> -
>> - if (process_info != NULL) {
>> - POOL_SETMASK(&AuthBlockSig);
>> - exiting = 1;
>> - for (i = 0; i < pool_config->num_init_children; i++)
>> - {
>> - pid_t pid = process_info[i].pid;
>> - if (pid)
>> - {
>> - kill(pid, SIGTERM);
>> - }
>> - }
>> -
>> - /* wait for all children to exit */
>> - while (wait(NULL) > 0)
>> - ;
>> - if (errno != ECHILD)
>> - pool_error("wait() failed. reason:%s",
>> strerror(errno));
>> - POOL_SETMASK(&UnBlockSig);
>> - }
>> -
>> - myunlink(un_addr.sun_path);
>> - myunlink(pcp_un_addr.sun_path);
>> - myunlink(pool_config->pid_file_name);
>> -
>> - write_status_file();
>> -
>> - pool_shmem_exit(code);
>> - exit(code);
>> -}
>> -
>> -void notice_backend_error(int node_id)
>> -{
>> - int n = node_id;
>> -
>> - if (getpid() == mypid)
>> - {
>> - pool_log("notice_backend_error: called from pgpool main.
>> ignored.");
>> - }
>> - else
>> - {
>> - degenerate_backend_set(&n, 1);
>> - }
>> -}
>> -
>> -/* notice backend connection error using SIGUSR1 */
>> -void degenerate_backend_set(int *node_id_set, int count)
>> -{
>> - pid_t parent = getppid();
>> - int i;
>> - bool need_signal = false;
>> -#ifdef HAVE_SIGPROCMASK
>> - sigset_t oldmask;
>> -#else
>> - int oldmask;
>> -#endif
>> -
>> - if (pool_config->parallel_mode)
>> - {
>> - return;
>> - }
>> -
>> - POOL_SETMASK2(&BlockSig, &oldmask);
>> - pool_semaphore_lock(REQUEST_INFO_SEM);
>> - Req_info->kind = NODE_DOWN_REQUEST;
>> - for (i = 0; i < count; i++)
>> - {
>> - if (node_id_set[i] < 0 || node_id_set[i] >=
>> MAX_NUM_BACKENDS ||
>> - !VALID_BACKEND(node_id_set[i]))
>> - {
>> - pool_log("degenerate_backend_set: node %d is not
>> valid backend.", i);
>> - continue;
>> - }
>> -
>> - if
>> (POOL_DISALLOW_TO_FAILOVER(BACKEND_INFO(node_id_set[i]).flag))
>> - {
>> - pool_log("degenerate_backend_set: %d failover
>> request from pid %d is canceled because failover is disallowed",
>> node_id_set[i], getpid());
>> - continue;
>> - }
>> -
>> - pool_log("degenerate_backend_set: %d fail over request
>> from pid %d", node_id_set[i], getpid());
>> - Req_info->node_id[i] = node_id_set[i];
>> - need_signal = true;
>> - }
>> -
>> - if (need_signal)
>> - {
>> - if (!pool_config->use_watchdog || WD_OK ==
>> wd_degenerate_backend_set(node_id_set, count))
>> - {
>> - kill(parent, SIGUSR1);
>> - }
>> - else
>> - {
>> - pool_log("degenerate_backend_set: failover
>> request from pid %d is canceled by other pgpool", getpid());
>> - memset(Req_info->node_id, -1, sizeof(int) *
>> MAX_NUM_BACKENDS);
>> - }
>> - }
>> -
>> - pool_semaphore_unlock(REQUEST_INFO_SEM);
>> - POOL_SETMASK(&oldmask);
>> -}
>> -
>> -/* send promote node request using SIGUSR1 */
>> -void promote_backend(int node_id)
>> -{
>> - pid_t parent = getppid();
>> -
>> - if (!MASTER_SLAVE || strcmp(pool_config->master_slave_sub_mode,
>> MODE_STREAMREP))
>> - {
>> - return;
>> - }
>> -
>> - if (node_id < 0 || node_id >= MAX_NUM_BACKENDS ||
>> !VALID_BACKEND(node_id))
>> - {
>> - pool_error("promote_backend: node %d is not valid
>> backend.", node_id);
>> - return;
>> - }
>> -
>> - pool_semaphore_lock(REQUEST_INFO_SEM);
>> - Req_info->kind = PROMOTE_NODE_REQUEST;
>> - Req_info->node_id[0] = node_id;
>> - pool_log("promote_backend: %d promote node request from pid %d",
>> node_id, getpid());
>> -
>> - if (!pool_config->use_watchdog || WD_OK ==
>> wd_promote_backend(node_id))
>> - {
>> - kill(parent, SIGUSR1);
>> - }
>> - else
>> - {
>> - pool_log("promote_backend: promote request from pid %d is
>> canceled by other pgpool", getpid());
>> - Req_info->node_id[0] = -1;
>> - }
>> -
>> - pool_semaphore_unlock(REQUEST_INFO_SEM);
>> -}
>> -
>> -/* send failback request using SIGUSR1 */
>> -void send_failback_request(int node_id)
>> -{
>> - pid_t parent = getppid();
>> -
>> - pool_log("send_failback_request: fail back %d th node request
>> from pid %d", node_id, getpid());
>> - Req_info->kind = NODE_UP_REQUEST;
>> - Req_info->node_id[0] = node_id;
>> -
>> - if (node_id < 0 || node_id >= MAX_NUM_BACKENDS ||
>> - (RAW_MODE && BACKEND_INFO(node_id).backend_status !=
>> CON_DOWN && VALID_BACKEND(node_id)))
>> - {
>> - pool_error("send_failback_request: node %d is alive.",
>> node_id);
>> - Req_info->node_id[0] = -1;
>> - return;
>> - }
>> -
>> - if (pool_config->use_watchdog && WD_OK !=
>> wd_send_failback_request(node_id))
>> - {
>> - pool_log("send_failback_request: failback request from
>> pid %d is canceled by other pgpool", getpid());
>> - Req_info->node_id[0] = -1;
>> - return;
>> - }
>> - kill(parent, SIGUSR1);
>> -}
>> -
>> -static RETSIGTYPE exit_handler(int sig)
>> -{
>> - int i;
>> -
>> - POOL_SETMASK(&AuthBlockSig);
>> -
>> - /*
>> - * this could happen in a child process if a signal has been sent
>> - * before resetting signal handler
>> - */
>> - if (getpid() != mypid)
>> - {
>> - pool_debug("exit_handler: I am not parent");
>> - POOL_SETMASK(&UnBlockSig);
>> - pool_shmem_exit(0);
>> - exit(0);
>> - }
>> -
>> - if (sig == SIGTERM)
>> - pool_log("received smart shutdown request");
>> - else if (sig == SIGINT)
>> - pool_log("received fast shutdown request");
>> - else if (sig == SIGQUIT)
>> - pool_log("received immediate shutdown request");
>> - else
>> - {
>> - pool_error("exit_handler: unknown signal received %d",
>> sig);
>> - POOL_SETMASK(&UnBlockSig);
>> - return;
>> - }
>> -
>> - exiting = 1;
>> -
>> - for (i = 0; i < pool_config->num_init_children; i++)
>> - {
>> - pid_t pid = process_info[i].pid;
>> - if (pid)
>> - {
>> - kill(pid, sig);
>> - }
>> - }
>> -
>> - kill(pcp_pid, sig);
>> - kill(worker_pid, sig);
>> -
>> - if (pool_config->use_watchdog)
>> - {
>> - wd_kill_watchdog(sig);
>> - }
>> -
>> - POOL_SETMASK(&UnBlockSig);
>> -
>> - while (wait(NULL) > 0)
>> - ;
>> -
>> - if (errno != ECHILD)
>> - pool_error("wait() failed. reason:%s", strerror(errno));
>> -
>> - process_info = NULL;
>> - myexit(0);
>> -}
>> -
>> -/*
>> - * Calculate next valid master node id.
>> - * If no valid node found, returns -1.
>> - */
>> -static int get_next_master_node(void)
>> -{
>> - int i;
>> -
>> - for (i=0;i<pool_config->backend_desc->num_backends;i++)
>> - {
>> - /*
>> - * Do not use VALID_BACKEND macro in raw mode.
>> - * VALID_BACKEND return true only if the argument is
>> master
>> - * node id. In other words, standby nodes are false. So
>> need
>> - * to check backend status with VALID_BACKEND_RAW.
>> - */
>> - if (RAW_MODE)
>> - {
>> - if (VALID_BACKEND_RAW(i))
>> - break;
>> - }
>> - else
>> - {
>> - if (VALID_BACKEND(i))
>> - break;
>> - }
>> - }
>> -
>> - if (i == pool_config->backend_desc->num_backends)
>> - i = -1;
>> -
>> - return i;
>> -}
>> -
>> -/*
>> - * handle SIGUSR1
>> - *
>> - */
>> -static RETSIGTYPE failover_handler(int sig)
>> -{
>> - POOL_SETMASK(&BlockSig);
>> - failover_request = 1;
>> - write(pipe_fds[1], "\0", 1);
>> - POOL_SETMASK(&UnBlockSig);
>> -}
>> -
>> -/*
>> - * backend connection error, failover/failback request, if possible
>> - * failover() must be called under protecting signals.
>> - */
>> -static void failover(void)
>> -{
>> - int i;
>> - int node_id;
>> - bool by_health_check;
>> - int new_master;
>> - int new_primary;
>> - int nodes[MAX_NUM_BACKENDS];
>> - bool need_to_restart_children;
>> - int status;
>> - int sts;
>> -
>> - pool_debug("failover_handler called");
>> -
>> - memset(nodes, 0, sizeof(int) * MAX_NUM_BACKENDS);
>> -
>> - /*
>> - * this could happen in a child process if a signal has been sent
>> - * before resetting signal handler
>> - */
>> - if (getpid() != mypid)
>> - {
>> - pool_debug("failover_handler: I am not parent");
>> - kill(pcp_pid, SIGUSR2);
>> - return;
>> - }
>> -
>> - /*
>> - * processing SIGTERM, SIGINT or SIGQUIT
>> - */
>> - if (exiting)
>> - {
>> - pool_debug("failover_handler called while exiting");
>> - kill(pcp_pid, SIGUSR2);
>> - return;
>> - }
>> -
>> - /*
>> - * processing fail over or switch over
>> - */
>> - if (switching)
>> - {
>> - pool_debug("failover_handler called while switching");
>> - kill(pcp_pid, SIGUSR2);
>> - return;
>> - }
>> -
>> - pool_semaphore_lock(REQUEST_INFO_SEM);
>> -
>> - if (Req_info->kind == CLOSE_IDLE_REQUEST)
>> - {
>> - pool_semaphore_unlock(REQUEST_INFO_SEM);
>> - kill_all_children(SIGUSR1);
>> - kill(pcp_pid, SIGUSR2);
>> - return;
>> - }
>> -
>> - /*
>> - * if not in replication mode/master slave mode, we treat this a
>> restart request.
>> - * otherwise we need to check if we have already failovered.
>> - */
>> - pool_debug("failover_handler: starting to select new master
>> node");
>> - switching = 1;
>> - Req_info->switching = true;
>> - node_id = Req_info->node_id[0];
>> -
>> - /* start of command inter-lock with watchdog */
>> - if (pool_config->use_watchdog)
>> - {
>> - by_health_check = (!failover_request &&
>> Req_info->kind==NODE_DOWN_REQUEST);
>> - wd_start_interlock(by_health_check);
>> - }
>> -
>> - /* failback request? */
>> - if (Req_info->kind == NODE_UP_REQUEST)
>> - {
>> - if (node_id >= MAX_NUM_BACKENDS ||
>> - (Req_info->kind == NODE_UP_REQUEST && !(RAW_MODE
>> &&
>> - BACKEND_INFO(node_id).backend_status == CON_DOWN) &&
>> VALID_BACKEND(node_id)) ||
>> - (Req_info->kind == NODE_DOWN_REQUEST &&
>> !VALID_BACKEND(node_id)))
>> - {
>> - pool_semaphore_unlock(REQUEST_INFO_SEM);
>> - pool_error("failover_handler: invalid node_id %d
>> status:%d MAX_NUM_BACKENDS: %d", node_id,
>> -
>> BACKEND_INFO(node_id).backend_status, MAX_NUM_BACKENDS);
>> - kill(pcp_pid, SIGUSR2);
>> - switching = 0;
>> - Req_info->switching = false;
>> -
>> - /* end of command inter-lock */
>> - if (pool_config->use_watchdog)
>> - wd_leave_interlock();
>> -
>> - return;
>> - }
>> -
>> - pool_log("starting fail back. reconnect host %s(%d)",
>> - BACKEND_INFO(node_id).backend_hostname,
>> - BACKEND_INFO(node_id).backend_port);
>> - BACKEND_INFO(node_id).backend_status = CON_CONNECT_WAIT;
>> /* unset down status */
>> -
>> - /* wait for failback command lock or to be lock holder */
>> - if (pool_config->use_watchdog && !wd_am_I_lock_holder())
>> - {
>> - wd_wait_for_lock(WD_FAILBACK_COMMAND_LOCK);
>> - }
>> - /* execute failback command if lock holder */
>> - if (!pool_config->use_watchdog || wd_am_I_lock_holder())
>> - {
>> - trigger_failover_command(node_id,
>> pool_config->failback_command,
>> -
>> MASTER_NODE_ID, get_next_master_node(), PRIMARY_NODE_ID);
>> -
>> - /* unlock failback command */
>> - if (pool_config->use_watchdog)
>> - wd_unlock(WD_FAILBACK_COMMAND_LOCK);
>> - }
>> - }
>> - else if (Req_info->kind == PROMOTE_NODE_REQUEST)
>> - {
>> - if (node_id != -1 && VALID_BACKEND(node_id))
>> - {
>> - pool_log("starting promotion. promote host
>> %s(%d)",
>> -
>> BACKEND_INFO(node_id).backend_hostname,
>> -
>> BACKEND_INFO(node_id).backend_port);
>> - }
>> - else
>> - {
>> - pool_log("failover: no backends are promoted");
>> - pool_semaphore_unlock(REQUEST_INFO_SEM);
>> - kill(pcp_pid, SIGUSR2);
>> - switching = 0;
>> - Req_info->switching = false;
>> -
>> - /* end of command inter-lock */
>> - if (pool_config->use_watchdog)
>> - wd_leave_interlock();
>> -
>> - return;
>> - }
>> - }
>> - else
>> - {
>> - int cnt = 0;
>> -
>> - for (i = 0; i < MAX_NUM_BACKENDS; i++)
>> - {
>> - if (Req_info->node_id[i] != -1 &&
>> - ((RAW_MODE &&
>> VALID_BACKEND_RAW(Req_info->node_id[i])) ||
>> - VALID_BACKEND(Req_info->node_id[i])))
>> - {
>> - pool_log("starting degeneration. shutdown
>> host %s(%d)",
>> -
>> BACKEND_INFO(Req_info->node_id[i]).backend_hostname,
>> -
>> BACKEND_INFO(Req_info->node_id[i]).backend_port);
>> -
>> -
>> BACKEND_INFO(Req_info->node_id[i]).backend_status = CON_DOWN; /* set down
>> status */
>> - /* save down node */
>> - nodes[Req_info->node_id[i]] = 1;
>> - cnt++;
>> - }
>> - }
>> -
>> - if (cnt == 0)
>> - {
>> - pool_log("failover: no backends are degenerated");
>> - pool_semaphore_unlock(REQUEST_INFO_SEM);
>> - kill(pcp_pid, SIGUSR2);
>> - switching = 0;
>> - Req_info->switching = false;
>> -
>> - /* end of command inter-lock */
>> - if (pool_config->use_watchdog)
>> - wd_leave_interlock();
>> -
>> - return;
>> - }
>> - }
>> -
>> - new_master = get_next_master_node();
>> -
>> - if (new_master < 0)
>> - {
>> - pool_error("failover_handler: no valid DB node found");
>> - }
>> -
>> -/*
>> - * Before we tried to minimize restarting pgpool to protect existing
>> - * connections from clients to pgpool children. What we did here was,
>> - * if children other than master went down, we did not fail over.
>> - * This is wrong. Think about following scenario. If someone
>> - * accidentally plugs out the network cable, the TCP/IP stack keeps
>> - * retrying for long time (typically 2 hours). The only way to stop
>> - * the retry is restarting the process. Bottom line is, we need to
>> - * restart all children in any case. See pgpool-general list posting
>> - * "TCP connections are *not* closed when a backend timeout" on Jul 13
>> - * 2008 for more details.
>> - */
>> -#ifdef NOT_USED
>> - else
>> - {
>> - if (Req_info->master_node_id == new_master && *InRecovery
>> == RECOVERY_INIT)
>> - {
>> - pool_log("failover_handler: do not restart
>> pgpool. same master node %d was selected", new_master);
>> - if (Req_info->kind == NODE_UP_REQUEST)
>> - {
>> - pool_log("failback done. reconnect host
>> %s(%d)",
>> -
>> BACKEND_INFO(node_id).backend_hostname,
>> -
>> BACKEND_INFO(node_id).backend_port);
>> - }
>> - else
>> - {
>> - pool_log("failover done. shutdown host
>> %s(%d)",
>> -
>> BACKEND_INFO(node_id).backend_hostname,
>> -
>> BACKEND_INFO(node_id).backend_port);
>> - }
>> -
>> - /* exec failover_command */
>> - for (i = 0; i <
>> pool_config->backend_desc->num_backends; i++)
>> - {
>> - if (nodes[i])
>> - trigger_failover_command(i,
>> pool_config->failover_command);
>> - }
>> -
>> - pool_semaphore_unlock(REQUEST_INFO_SEM);
>> - switching = 0;
>> - Req_info->switching = false;
>> - kill(pcp_pid, SIGUSR2);
>> - switching = 0;
>> - Req_info->switching = false;
>> - return;
>> - }
>> - }
>> -#endif
>> -
>> -
>> - /* On 2011/5/2 Tatsuo Ishii says: if mode is streaming replication
>> - * and request is NODE_UP_REQUEST(failback case) we don't need to
>> - * restart all children. Existing session will not use newly
>> - * attached node, but load balanced node is not changed until this
>> - * session ends, so it's harmless anyway.
>> - */
>> - if (MASTER_SLAVE && !strcmp(pool_config->master_slave_sub_mode,
>> MODE_STREAMREP) &&
>> - Req_info->kind == NODE_UP_REQUEST)
>> - {
>> - pool_log("Do not restart children because we are
>> failbacking node id %d host%s port:%d and we are in streaming replication
>> mode", node_id,
>> - BACKEND_INFO(node_id).backend_hostname,
>> - BACKEND_INFO(node_id).backend_port);
>> -
>> - need_to_restart_children = false;
>> - }
>> - else
>> - {
>> - pool_log("Restart all children");
>> -
>> - /* kill all children */
>> - for (i = 0; i < pool_config->num_init_children; i++)
>> - {
>> - pid_t pid = process_info[i].pid;
>> - if (pid)
>> - {
>> - kill(pid, SIGQUIT);
>> - pool_debug("failover_handler: kill %d",
>> pid);
>> - }
>> - }
>> -
>> - need_to_restart_children = true;
>> - }
>> -
>> - /* wait for failover command lock or to be lock holder*/
>> - if (pool_config->use_watchdog && !wd_am_I_lock_holder())
>> - {
>> - wd_wait_for_lock(WD_FAILOVER_COMMAND_LOCK);
>> - }
>> -
>> - /* execute failover command if lock holder */
>> - if (!pool_config->use_watchdog || wd_am_I_lock_holder())
>> - {
>> - /* Exec failover_command if needed */
>> - for (i = 0; i < pool_config->backend_desc->num_backends;
>> i++)
>> - {
>> - if (nodes[i])
>> - trigger_failover_command(i,
>> pool_config->failover_command,
>> -
>> MASTER_NODE_ID, new_master, PRIMARY_NODE_ID);
>> - }
>> -
>> - /* unlock failover command */
>> - if (pool_config->use_watchdog)
>> - wd_unlock(WD_FAILOVER_COMMAND_LOCK);
>> - }
>> -
>> -
>> -/* no need to wait since it will be done in reap_handler */
>> -#ifdef NOT_USED
>> - while (wait(NULL) > 0)
>> - ;
>> -
>> - if (errno != ECHILD)
>> - pool_error("failover_handler: wait() failed. reason:%s",
>> strerror(errno));
>> -#endif
>> -
>> - if (Req_info->kind == PROMOTE_NODE_REQUEST &&
>> VALID_BACKEND(node_id))
>> - new_primary = node_id;
>> - else
>> - new_primary = find_primary_node_repeatedly();
>> -
>> - /*
>> - * If follow_master_command is provided and in master/slave
>> - * streaming replication mode, we start degenerating all backends
>> - * as they are not replicated anymore.
>> - */
>> - int follow_cnt = 0;
>> - if (MASTER_SLAVE && !strcmp(pool_config->master_slave_sub_mode,
>> MODE_STREAMREP))
>> - {
>> - if (*pool_config->follow_master_command != '\0' ||
>> - Req_info->kind == PROMOTE_NODE_REQUEST)
>> - {
>> - /* only if the failover is against the current
>> primary */
>> - if (((Req_info->kind == NODE_DOWN_REQUEST) &&
>> - (nodes[Req_info->primary_node_id])) ||
>> - ((Req_info->kind == PROMOTE_NODE_REQUEST)
>> &&
>> - (VALID_BACKEND(node_id)))) {
>> -
>> - for (i = 0; i <
>> pool_config->backend_desc->num_backends; i++)
>> - {
>> - /* do not degenerate the new
>> primary */
>> - if ((new_primary >= 0) && (i !=
>> new_primary)) {
>> - BackendInfo *bkinfo;
>> - bkinfo =
>> pool_get_node_info(i);
>> - pool_log("starting follow
>> degeneration. shutdown host %s(%d)",
>> -
>> bkinfo->backend_hostname,
>> -
>> bkinfo->backend_port);
>> - bkinfo->backend_status =
>> CON_DOWN; /* set down status */
>> - follow_cnt++;
>> - }
>> - }
>> -
>> - if (follow_cnt == 0)
>> - {
>> - pool_log("failover: no follow
>> backends are degenerated");
>> - }
>> - else
>> - {
>> - /* update new master node */
>> - new_master =
>> get_next_master_node();
>> - pool_log("failover: %d follow
>> backends have been degenerated", follow_cnt);
>> - }
>> - }
>> - }
>> - }
>> -
>> - memset(Req_info->node_id, -1, sizeof(int) * MAX_NUM_BACKENDS);
>> - pool_semaphore_unlock(REQUEST_INFO_SEM);
>> -
>> - /* wait for follow_master_command lock or to be lock holder */
>> - if (pool_config->use_watchdog && !wd_am_I_lock_holder())
>> - {
>> - wd_wait_for_lock(WD_FOLLOW_MASTER_COMMAND_LOCK);
>> - }
>> -
>> - /* execute follow_master_command */
>> - if (!pool_config->use_watchdog || wd_am_I_lock_holder())
>> - {
>> - if ((follow_cnt > 0) &&
>> (*pool_config->follow_master_command != '\0'))
>> - {
>> - follow_pid =
>> fork_follow_child(Req_info->master_node_id, new_primary,
>> -
>> Req_info->primary_node_id);
>> - }
>> -
>> - /* unlock follow_master_command */
>> - if (pool_config->use_watchdog)
>> - wd_unlock(WD_FOLLOW_MASTER_COMMAND_LOCK);
>> - }
>> -
>> - /* end of command inter-lock */
>> - if (pool_config->use_watchdog)
>> - wd_end_interlock();
>> -
>> - /* Save primary node id */
>> - Req_info->primary_node_id = new_primary;
>> - pool_log("failover: set new primary node: %d",
>> Req_info->primary_node_id);
>> -
>> - if (new_master >= 0)
>> - {
>> - Req_info->master_node_id = new_master;
>> - pool_log("failover: set new master node: %d",
>> Req_info->master_node_id);
>> - }
>> -
>> -
>> - /* Fork the children if needed */
>> - if (need_to_restart_children)
>> - {
>> - for (i=0;i<pool_config->num_init_children;i++)
>> - {
>> -
>> - /*
>> - * Try to kill pgpool child because previous kill
>> signal
>> - * may not be received by pgpool child. This
>> could happen
>> - * if multiple PostgreSQL are going down (or even
>> starting
>> - * pgpool, without starting PostgreSQL can
>> trigger this).
>> - * Child calls degenerate_backend() and it tries
>> to aquire
>> - * semaphore to write a failover request. In this
>> case the
>> - * signal mask is set as well, thus signals are
>> never
>> - * received.
>> - */
>> - kill(process_info[i].pid, SIGQUIT);
>> -
>> - process_info[i].pid = fork_a_child(unix_fd,
>> inet_fd, i);
>> - process_info[i].start_time = time(NULL);
>> - }
>> - }
>> - else
>> - {
>> - /* Set restart request to each child. Children will
>> exit(1)
>> - * whenever they are idle to restart.
>> - */
>> - for (i=0;i<pool_config->num_init_children;i++)
>> - {
>> - process_info[i].need_to_restart = 1;
>> - }
>> - }
>> -
>> - /*
>> - * Send restart request to worker child.
>> - */
>> - kill(worker_pid, SIGUSR1);
>> -
>> - if (Req_info->kind == NODE_UP_REQUEST)
>> - {
>> - pool_log("failback done. reconnect host %s(%d)",
>> - BACKEND_INFO(node_id).backend_hostname,
>> - BACKEND_INFO(node_id).backend_port);
>> - }
>> - else if (Req_info->kind == PROMOTE_NODE_REQUEST)
>> - {
>> - pool_log("promotion done. promoted host %s(%d)",
>> - BACKEND_INFO(node_id).backend_hostname,
>> - BACKEND_INFO(node_id).backend_port);
>> - }
>> - else
>> - {
>> - pool_log("failover done. shutdown host %s(%d)",
>> - BACKEND_INFO(node_id).backend_hostname,
>> - BACKEND_INFO(node_id).backend_port);
>> - }
>> -
>> - switching = 0;
>> - Req_info->switching = false;
>> -
>> - /* kick wakeup_handler in pcp_child to notice that
>> - * failover/failback done
>> - */
>> - kill(pcp_pid, SIGUSR2);
>> -
>> - sleep(1);
>> -
>> - /*
>> - * Send restart request to pcp child.
>> - */
>> - kill(pcp_pid, SIGUSR1);
>> - for (;;)
>> - {
>> - sts = waitpid(pcp_pid, &status, 0);
>> - if (sts != -1)
>> - break;
>> - if (sts == -1)
>> - {
>> - if (errno == EINTR)
>> - continue;
>> - else
>> - {
>> - pool_error("failover: waitpid failed.
>> reason: %s", strerror(errno));
>> - return;
>> - }
>> - }
>> - }
>> - if (WIFSIGNALED(status))
>> - pool_log("PCP child %d exits with status %d by signal %d
>> in failover()", pcp_pid, status, WTERMSIG(status));
>> - else
>> - pool_log("PCP child %d exits with status %d in
>> failover()", pcp_pid, status);
>> -
>> - pcp_pid = pcp_fork_a_child(pcp_unix_fd, pcp_inet_fd,
>> pcp_conf_file);
>> - pool_log("fork a new PCP child pid %d in failover()", pcp_pid);
>> -}
>> -
>> -/*
>> - * health check timer handler
>> - */
>> -static RETSIGTYPE health_check_timer_handler(int sig)
>> -{
>> - POOL_SETMASK(&BlockSig);
>> - health_check_timer_expired = 1;
>> - POOL_SETMASK(&UnBlockSig);
>> -}
>> -
>> -
>> -/*
>> - * Check if we can connect to the backend
>> - * returns 0 for OK. otherwise returns backend id + 1
>> - */
>> -static int health_check(void)
>> -{
>> - POOL_CONNECTION_POOL_SLOT *slot;
>> - BackendInfo *bkinfo;
>> - static bool is_first = true;
>> - static char *dbname;
>> - int i;
>> -
>> - /* Do not execute health check during recovery */
>> - if (*InRecovery)
>> - return 0;
>> -
>> - Retry:
>> - /*
>> - * First we try with "postgres" database.
>> - */
>> - if (is_first)
>> - dbname = "postgres";
>> -
>> - for (i=0;i<pool_config->backend_desc->num_backends;i++)
>> - {
>> - /*
>> - * Make sure that health check timer has not been expired.
>> - * Before called health_check(),
>> health_check_timer_expired is
>> - * set to 0. However it is possible that while
>> processing DB
>> - * nodes health check timer expired.
>> - */
>> - if (health_check_timer_expired)
>> - {
>> - pool_log("health_check: health check timer has
>> been already expired before attempting to connect to %d th backend", i);
>> - return i+1;
>> - }
>> -
>> - bkinfo = pool_get_node_info(i);
>> -
>> - pool_debug("health_check: %d th DB node status: %d", i,
>> bkinfo->backend_status);
>> -
>> - if (bkinfo->backend_status == CON_UNUSED ||
>> - bkinfo->backend_status == CON_DOWN)
>> - continue;
>> -
>> - slot =
>> make_persistent_db_connection(bkinfo->backend_hostname,
>> -
>> bkinfo->backend_port,
>> -
>> dbname,
>> -
>> pool_config->health_check_user,
>> -
>> pool_config->health_check_password, false);
>> -
>> - if (is_first)
>> - is_first = false;
>> -
>> - if (!slot)
>> - {
>> - /*
>> - * Retry with template1 unless health check timer
>> is expired.
>> - */
>> - if (!strcmp(dbname, "postgres") &&
>> health_check_timer_expired == 0)
>> - {
>> - dbname = "template1";
>> - goto Retry;
>> - }
>> - else
>> - {
>> - pool_error("health check failed. %d th
>> host %s at port %d is down",
>> - i,
>> -
>> bkinfo->backend_hostname,
>> - bkinfo->backend_port);
>> - return i+1;
>> - }
>> - }
>> - else
>> - {
>> - discard_persistent_db_connection(slot);
>> - }
>> - }
>> -
>> - return 0;
>> -}
>> -
>> -/*
>> - * check if we can connect to the SystemDB
>> - * returns 0 for OK. otherwise returns -1
>> - */
>> -static int
>> -system_db_health_check(void)
>> -{
>> - int fd;
>> -
>> - /* V2 startup packet */
>> - typedef struct {
>> - int len; /* startup packet length */
>> - StartupPacket_v2 sp;
>> - } MySp;
>> - MySp mysp;
>> - char kind;
>> -
>> - memset(&mysp, 0, sizeof(mysp));
>> - mysp.len = htonl(296);
>> - mysp.sp.protoVersion = htonl(PROTO_MAJOR_V2 << 16);
>> - strcpy(mysp.sp.database, "template1");
>> - strncpy(mysp.sp.user, SYSDB_INFO->user, sizeof(mysp.sp.user) - 1);
>> - *mysp.sp.options = '\0';
>> - *mysp.sp.unused = '\0';
>> - *mysp.sp.tty = '\0';
>> -
>> - pool_debug("health_check: SystemDB status: %d", SYSDB_STATUS);
>> -
>> - /* if SystemDB is already down, ignore */
>> - if (SYSDB_STATUS == CON_UNUSED || SYSDB_STATUS == CON_DOWN)
>> - return 0;
>> -
>> - if (*SYSDB_INFO->hostname == '/')
>> - fd = connect_unix_domain_socket_by_port(SYSDB_INFO->port,
>> SYSDB_INFO->hostname, FALSE);
>> - else
>> - fd =
>> connect_inet_domain_socket_by_port(SYSDB_INFO->hostname, SYSDB_INFO->port,
>> FALSE);
>> -
>> - if (fd < 0)
>> - {
>> - pool_error("health check failed. SystemDB host %s at port
>> %d is down",
>> - SYSDB_INFO->hostname,
>> - SYSDB_INFO->port);
>> -
>> - return -1;
>> - }
>> -
>> - if (write(fd, &mysp, sizeof(mysp)) < 0)
>> - {
>> - pool_error("health check failed during write. SystemDB
>> host %s at port %d is down",
>> - SYSDB_INFO->hostname,
>> - SYSDB_INFO->port);
>> - close(fd);
>> - return -1;
>> - }
>> -
>> - read(fd, &kind, 1);
>> -
>> - if (write(fd, "X", 1) < 0)
>> - {
>> - pool_error("health check failed during write. SystemDB
>> host %s at port %d is down",
>> - SYSDB_INFO->hostname,
>> - SYSDB_INFO->port);
>> - close(fd);
>> - return -1;
>> - }
>> -
>> - close(fd);
>> - return 0;
>> -}
>> -
>> -/*
>> - * handle SIGCHLD
>> - */
>> -static RETSIGTYPE reap_handler(int sig)
>> -{
>> - POOL_SETMASK(&BlockSig);
>> - sigchld_request = 1;
>> - write(pipe_fds[1], "\0", 1);
>> - POOL_SETMASK(&UnBlockSig);
>> -}
>> -
>> -/*
>> - * Attach zombie processes and restart child processes.
>> - * reaper() must be called protected from signals.
>> - */
>> -static void reaper(void)
>> -{
>> - pid_t pid;
>> - int status;
>> - int i;
>> -
>> - pool_debug("reap_handler called");
>> -
>> - if (exiting)
>> - {
>> - pool_debug("reap_handler: exited due to exiting");
>> - return;
>> - }
>> -
>> - if (switching)
>> - {
>> - pool_debug("reap_handler: exited due to switching");
>> - return;
>> - }
>> -
>> - /* clear SIGCHLD request */
>> - sigchld_request = 0;
>> -
>> -#ifdef HAVE_WAITPID
>> - pool_debug("reap_handler: call waitpid");
>> - while ((pid = waitpid(-1, &status, WNOHANG)) > 0)
>> -#else
>> - pool_debug("reap_handler: call wait3");
>> - while ((pid = wait3(&status, WNOHANG, NULL)) > 0)
>> -#endif
>> - {
>> - if (WIFSIGNALED(status) && WTERMSIG(status) == SIGSEGV)
>> - {
>> - /* Child terminated by segmentation fault. Report
>> it */
>> - pool_error("Child process %d was terminated by
>> segmentation fault", pid);
>> - }
>> -
>> - /* if exiting child process was PCP handler */
>> - if (pid == pcp_pid)
>> - {
>> - if (WIFSIGNALED(status))
>> - pool_log("PCP child %d exits with status
>> %d by signal %d", pid, status, WTERMSIG(status));
>> - else
>> - pool_log("PCP child %d exits with status
>> %d", pid, status);
>> -
>> - pcp_pid = pcp_fork_a_child(pcp_unix_fd,
>> pcp_inet_fd, pcp_conf_file);
>> - pool_log("fork a new PCP child pid %d", pcp_pid);
>> - }
>> -
>> - /* exiting process was worker process */
>> - else if (pid == worker_pid)
>> - {
>> - if (WIFSIGNALED(status))
>> - pool_log("worker child %d exits with
>> status %d by signal %d", pid, status, WTERMSIG(status));
>> - else
>> - pool_log("worker child %d exits with
>> status %d", pid, status);
>> -
>> - if (status)
>> - worker_pid = worker_fork_a_child();
>> -
>> - pool_log("fork a new worker child pid %d",
>> worker_pid);
>> - }
>> -
>> - /* exiting process was watchdog process */
>> - else if (pool_config->use_watchdog &&
>> wd_is_watchdog_pid(pid))
>> - {
>> - if (!wd_reaper_watchdog(pid, status))
>> - {
>> - pool_error("wd_reaper failed");
>> - myexit(1);
>> - }
>> - }
>> -
>> - else
>> - {
>> - if (WIFSIGNALED(status))
>> - pool_debug("child %d exits with status %d
>> by signal %d", pid, status, WTERMSIG(status));
>> - else
>> - pool_debug("child %d exits with status
>> %d", pid, status);
>> -
>> - /* look for exiting child's pid */
>> - for (i=0;i<pool_config->num_init_children;i++)
>> - {
>> - if (pid == process_info[i].pid)
>> - {
>> - /* if found, fork a new child */
>> - if (!switching && !exiting &&
>> status)
>> - {
>> - process_info[i].pid =
>> fork_a_child(unix_fd, inet_fd, i);
>> -
>> process_info[i].start_time = time(NULL);
>> - pool_debug("fork a new
>> child pid %d", process_info[i].pid);
>> - break;
>> - }
>> - }
>> - }
>> - }
>> - }
>> - pool_debug("reap_handler: normally exited");
>> -}
>> -
>> -/*
>> - * get node information specified by node_number
>> - */
>> -BackendInfo *
>> -pool_get_node_info(int node_number)
>> -{
>> - if (node_number < 0 || node_number >= NUM_BACKENDS)
>> - return NULL;
>> -
>> - return &BACKEND_INFO(node_number);
>> -}
>> -
>> -/*
>> - * get number of nodes
>> - */
>> -int
>> -pool_get_node_count(void)
>> -{
>> - return NUM_BACKENDS;
>> -}
>> -
>> -/*
>> - * get process ids
>> - */
>> -int *
>> -pool_get_process_list(int *array_size)
>> -{
>> - int *array;
>> - int i;
>> -
>> - *array_size = pool_config->num_init_children;
>> - array = calloc(*array_size, sizeof(int));
>> - for (i = 0; i < *array_size; i++)
>> - array[i] = process_info[i].pid;
>> -
>> - return array;
>> -}
>> -
>> -/*
>> - * get process information specified by pid
>> - */
>> -ProcessInfo *
>> -pool_get_process_info(pid_t pid)
>> -{
>> - int i;
>> -
>> - for (i = 0; i < pool_config->num_init_children; i++)
>> - if (process_info[i].pid == pid)
>> - return &process_info[i];
>> -
>> - return NULL;
>> -}
>> -
>> -/*
>> - * get System DB information
>> - */
>> -SystemDBInfo *
>> -pool_get_system_db_info(void)
>> -{
>> - if (system_db_info == NULL)
>> - return NULL;
>> -
>> - return system_db_info->info;
>> -}
>> -
>> -
>> -/*
>> - * handle SIGUSR2
>> - * Wakeup all processes
>> - */
>> -static void wakeup_children(void)
>> -{
>> - kill_all_children(SIGUSR2);
>> -}
>> -
>> -
>> -static RETSIGTYPE wakeup_handler(int sig)
>> -{
>> - POOL_SETMASK(&BlockSig);
>> - wakeup_request = 1;
>> - write(pipe_fds[1], "\0", 1);
>> - POOL_SETMASK(&UnBlockSig);
>> -}
>> -
>> -/*
>> - * handle SIGHUP
>> - *
>> - */
>> -static RETSIGTYPE reload_config_handler(int sig)
>> -{
>> - POOL_SETMASK(&BlockSig);
>> - reload_config_request = 1;
>> - write(pipe_fds[1], "\0", 1);
>> - POOL_SETMASK(&UnBlockSig);
>> -}
>> -
>> -static void reload_config(void)
>> -{
>> - pool_log("reload config files.");
>> - pool_get_config(conf_file, RELOAD_CONFIG);
>> - if (pool_config->enable_pool_hba)
>> - load_hba(hba_file);
>> - if (pool_config->parallel_mode)
>> - pool_memset_system_db_info(system_db_info->info);
>> - kill_all_children(SIGHUP);
>> -
>> - if (worker_pid)
>> - kill(worker_pid, SIGHUP);
>> -}
>> -
>> -static void kill_all_children(int sig)
>> -{
>> - int i;
>> -
>> - /* kill all children */
>> - for (i = 0; i < pool_config->num_init_children; i++)
>> - {
>> - pid_t pid = process_info[i].pid;
>> - if (pid)
>> - {
>> - kill(pid, sig);
>> - }
>> - }
>> -
>> - /* make PCP process reload as well */
>> - if (sig == SIGHUP)
>> - kill(pcp_pid, sig);
>> -}
>> -
>> -/*
>> - * pause in a period specified by timeout. If any data is coming
>> - * through pipe_fds[0], that means one of: failover request(SIGUSR1),
>> - * SIGCHLD received, children wake up request(SIGUSR2 used in on line
>> - * recovery processing) or config file reload request(SIGHUP) has been
>> - * occurred. In this case this function returns 1.
>> - * otherwise 0: (no signal event occurred), -1: (error)
>> - * XXX: is it OK that select(2) error is ignored here?
>> - */
>> -static int pool_pause(struct timeval *timeout)
>> -{
>> - fd_set rfds;
>> - int n;
>> - char dummy;
>> -
>> - FD_ZERO(&rfds);
>> - FD_SET(pipe_fds[0], &rfds);
>> - n = select(pipe_fds[0]+1, &rfds, NULL, NULL, timeout);
>> - if (n == 1)
>> - read(pipe_fds[0], &dummy, 1);
>> - return n;
>> -}
>> -
>> -/*
>> - * sleep for seconds specified by "second". Unlike pool_pause(), this
>> - * function guarantees that it will sleep for specified seconds. This
>> - * function uses pool_pause() internally. If it informs that there is
>> - * a pending signal event, they are processed using CHECK_REQUEST
>> - * macro. Note that most of these processes are done while all signals
>> - * are blocked.
>> - */
>> -void pool_sleep(unsigned int second)
>> -{
>> - struct timeval current_time, sleep_time;
>> -
>> -       gettimeofday(&current_time, NULL);
>> - sleep_time.tv_sec = second + current_time.tv_sec;
>> - sleep_time.tv_usec = current_time.tv_usec;
>> -
>> - POOL_SETMASK(&UnBlockSig);
>> - while (sleep_time.tv_sec > current_time.tv_sec)
>> - {
>> - struct timeval timeout;
>> - int r;
>> -
>> - timeout.tv_sec = sleep_time.tv_sec - current_time.tv_sec;
>> - timeout.tv_usec = sleep_time.tv_usec -
>> current_time.tv_usec;
>> - if (timeout.tv_usec < 0)
>> - {
>> - timeout.tv_sec--;
>> - timeout.tv_usec += 1000000;
>> - }
>> -
>> - r = pool_pause(&timeout);
>> - POOL_SETMASK(&BlockSig);
>> - if (r > 0)
>> - CHECK_REQUEST;
>> - POOL_SETMASK(&UnBlockSig);
>> -               gettimeofday(&current_time, NULL);
>> - }
>> - POOL_SETMASK(&BlockSig);
>> -}
>> -
>> -/*
>> - * get_config_file_name: return full path of pgpool.conf.
>> - */
>> -char *get_config_file_name(void)
>> -{
>> - return conf_file;
>> -}
>> -
>> -/*
>> - * get_config_file_name: return full path of pool_hba.conf.
>> - */
>> -char *get_hba_file_name(void)
>> + * get_hba_file_name: return full path of pool_hba.conf.
>> + */
>> +char *get_hba_file_name(void)
>> {
>> return hba_file;
>> }
>> -
>> -/*
>> - * trigger_failover_command: execute specified command at failover.
>> - * command_line is null-terminated string.
>> - */
>> -static int trigger_failover_command(int node, const char *command_line,
>> -
>> int old_master, int new_master, int old_primary)
>> -{
>> - int r = 0;
>> - String *exec_cmd;
>> - char port_buf[6];
>> - char buf[2];
>> - BackendInfo *info;
>> - BackendInfo *newmaster;
>> -
>> - if (command_line == NULL || (strlen(command_line) == 0))
>> - return 0;
>> -
>> - /* check failed nodeID */
>> - if (node < 0 || node > NUM_BACKENDS)
>> - return -1;
>> -
>> - info = pool_get_node_info(node);
>> - if (!info)
>> - return -1;
>> -
>> - buf[1] = '\0';
>> - pool_memory = pool_memory_create(PREPARE_BLOCK_SIZE);
>> - if (!pool_memory)
>> - {
>> - pool_error("trigger_failover_command:
>> pool_memory_create() failed");
>> - return -1;
>> - }
>> - exec_cmd = init_string("");
>> -
>> - while (*command_line)
>> - {
>> - if (*command_line == '%')
>> - {
>> - if (*(command_line + 1))
>> - {
>> - char val = *(command_line + 1);
>> - switch (val)
>> - {
>> - case 'p': /* failed node port */
>> - snprintf(port_buf,
>> sizeof(port_buf), "%d", info->backend_port);
>> -
>> string_append_char(exec_cmd, port_buf);
>> - break;
>> -
>> - case 'D': /* failed node database
>> directory */
>> -
>> string_append_char(exec_cmd, info->backend_data_directory);
>> - break;
>> -
>> - case 'd': /* failed node id */
>> - snprintf(port_buf,
>> sizeof(port_buf), "%d", node);
>> -
>> string_append_char(exec_cmd, port_buf);
>> - break;
>> -
>> - case 'h': /* failed host name */
>> -
>> string_append_char(exec_cmd, info->backend_hostname);
>> - break;
>> -
>> - case 'H': /* new master host name
>> */
>> - newmaster =
>> pool_get_node_info(new_master);
>> - if (newmaster)
>> -
>> string_append_char(exec_cmd, newmaster->backend_hostname);
>> - else
>> - /* no valid new
>> master */
>> -
>> string_append_char(exec_cmd, "");
>> - break;
>> -
>> - case 'm': /* new master node id */
>> - snprintf(port_buf,
>> sizeof(port_buf), "%d", new_master);
>> -
>> string_append_char(exec_cmd, port_buf);
>> - break;
>> -
>> - case 'r': /* new master port */
>> - newmaster =
>> pool_get_node_info(get_next_master_node());
>> - if (newmaster)
>> - {
>> -
>> snprintf(port_buf, sizeof(port_buf), "%d", newmaster->backend_port);
>> -
>> string_append_char(exec_cmd, port_buf);
>> - }
>> - else
>> - /* no valid new
>> master */
>> -
>> string_append_char(exec_cmd, "");
>> - break;
>> -
>> - case 'R': /* new master database
>> directory */
>> - newmaster =
>> pool_get_node_info(get_next_master_node());
>> - if (newmaster)
>> -
>> string_append_char(exec_cmd, newmaster->backend_data_directory);
>> - else
>> - /* no valid new
>> master */
>> -
>> string_append_char(exec_cmd, "");
>> - break;
>> -
>> - case 'M': /* old master node id */
>> - snprintf(port_buf,
>> sizeof(port_buf), "%d", old_master);
>> -
>> string_append_char(exec_cmd, port_buf);
>> - break;
>> -
>> - case 'P': /* old primary node id
>> */
>> - snprintf(port_buf,
>> sizeof(port_buf), "%d", old_primary);
>> -
>> string_append_char(exec_cmd, port_buf);
>> - break;
>> -
>> - case '%': /* escape */
>> -
>> string_append_char(exec_cmd, "%");
>> - break;
>> -
>> - default: /* ignore */
>> - break;
>> - }
>> - command_line++;
>> - }
>> - } else {
>> - buf[0] = *command_line;
>> - string_append_char(exec_cmd, buf);
>> - }
>> - command_line++;
>> - }
>> -
>> - if (strlen(exec_cmd->data) != 0)
>> - {
>> - pool_log("execute command: %s", exec_cmd->data);
>> - r = system(exec_cmd->data);
>> - }
>> -
>> - pool_memory_delete(pool_memory, 0);
>> - pool_memory = NULL;
>> -
>> - return r;
>> -}
>> -
>> -/*
>> - * Find the primary node (i.e. not standby node) and returns its node
>> - * id. If no primary node is found, returns -1.
>> - */
>> -static int find_primary_node(void)
>> -{
>> - BackendInfo *bkinfo;
>> - POOL_CONNECTION_POOL_SLOT *s;
>> - POOL_CONNECTION *con;
>> - POOL_STATUS status;
>> - POOL_SELECT_RESULT *res;
>> - bool is_standby;
>> - int i;
>> -
>> - /* Streaming replication mode? */
>> - if (pool_config->master_slave_mode == 0 ||
>> - strcmp(pool_config->master_slave_sub_mode,
>> MODE_STREAMREP))
>> - {
>> - /* No point to look for primary node if not in streaming
>> - * replication mode.
>> - */
>> - pool_debug("find_primary_node: not in streaming
>> replication mode");
>> - return -1;
>> - }
>> -
>> - for(i=0;i<NUM_BACKENDS;i++)
>> - {
>> - if (!VALID_BACKEND(i))
>> - continue;
>> -
>> - /*
>> - * Check to see if this is a standby node or not.
>> - */
>> - is_standby = false;
>> -
>> - bkinfo = pool_get_node_info(i);
>> - s =
>> make_persistent_db_connection(bkinfo->backend_hostname,
>> -
>> bkinfo->backend_port,
>> -
>> "postgres",
>> -
>> pool_config->sr_check_user,
>> -
>> pool_config->sr_check_password, true);
>> - if (!s)
>> - {
>> - pool_error("find_primary_node:
>> make_persistent_connection failed");
>> - return -1;
>> - }
>> - con = s->con;
>> - status = do_query(con, "SELECT pg_is_in_recovery()",
>> - &res, PROTO_MAJOR_V3);
>> - if (res->numrows <= 0)
>> - {
>> - pool_log("find_primary_node: do_query returns no
>> rows");
>> - }
>> - if (res->data[0] == NULL)
>> - {
>> - pool_log("find_primary_node: do_query returns no
>> data");
>> - }
>> - if (res->nullflags[0] == -1)
>> - {
>> - pool_log("find_primary_node: do_query returns
>> NULL");
>> - }
>> - if (res->data[0] && !strcmp(res->data[0], "t"))
>> - {
>> - is_standby = true;
>> - }
>> - free_select_result(res);
>> - discard_persistent_db_connection(s);
>> -
>> - /*
>> - * If this is a standby, we continue to look for primary
>> node.
>> - */
>> - if (is_standby)
>> - {
>> - pool_debug("find_primary_node: %d node is
>> standby", i);
>> - }
>> - else
>> - {
>> - break;
>> - }
>> - }
>> -
>> - if (i == NUM_BACKENDS)
>> - {
>> - pool_debug("find_primary_node: no primary node found");
>> - return -1;
>> - }
>> -
>> - pool_log("find_primary_node: primary node id is %d", i);
>> - return i;
>> -}
>> -
>> -static int find_primary_node_repeatedly(void)
>> +/* Callback function to unlink the file */
>> +static void FileUnlink(int code, Datum path)
>> {
>> - int sec;
>> - int node_id = -1;
>> -
>> - /* Streaming replication mode? */
>> - if (pool_config->master_slave_mode == 0 ||
>> - strcmp(pool_config->master_slave_sub_mode,
>> MODE_STREAMREP))
>> - {
>> - /* No point to look for primary node if not in streaming
>> - * replication mode.
>> - */
>> - pool_debug("find_primary_node: not in streaming
>> replication mode");
>> - return -1;
>> - }
>> -
>> - /*
>> - * Try to find the new primary node and keep trying for
>> - * search_primary_node_timeout seconds.
>> - * search_primary_node_timeout = 0 means never timeout and keep
>> searching
>> - * indefinitely
>> + char* filePath = (char*)path;
>> + if (unlink(filePath) == 0) return;
>> +       /*
>> +        * We are already exiting the system; just produce a log entry
>> +        * to report the error.
>> +        */
>> - pool_log("find_primary_node_repeatedly: waiting for finding a
>> primary node");
>> - for (sec = 0; (pool_config->search_primary_node_timeout == 0 ||
>> - sec <
>> pool_config->search_primary_node_timeout); sec++)
>> - {
>> - node_id = find_primary_node();
>> - if (node_id != -1)
>> - break;
>> - pool_sleep(1);
>> - }
>> - return node_id;
>> -}
>> -
>> -/*
>> -* fork a follow child
>> -*/
>> -pid_t fork_follow_child(int old_master, int new_primary, int old_primary)
>> -{
>> - pid_t pid;
>> - int i;
>> -
>> - pid = fork();
>> -
>> - if (pid == 0)
>> - {
>> - pool_log("start triggering follow command.");
>> - for (i = 0; i < pool_config->backend_desc->num_backends;
>> i++)
>> - {
>> - BackendInfo *bkinfo;
>> - bkinfo = pool_get_node_info(i);
>> - if (bkinfo->backend_status == CON_DOWN)
>> - trigger_failover_command(i,
>> pool_config->follow_master_command,
>> -
>> old_master, new_primary, old_primary);
>> - }
>> - exit(0);
>> - }
>> - else if (pid == -1)
>> - {
>> - pool_error("follow fork() failed. reason: %s",
>> strerror(errno));
>> - exit(1);
>> - }
>> - return pid;
>> + ereport(LOG,
>> + (errmsg("unlink failed for file at path \"%s\"",
>> filePath),
>> + errdetail("\"%s\"", strerror(errno))));
>> }
>>
>> ...
>>
>> [Message clipped]
>
>
>
>
> --
> Ahsan Hadi
> Snr Director Product Development
> EnterpriseDB Corporation
> The Enterprise Postgres Company
>
> Phone: +92-51-8358874
> Mobile: +92-333-5162114
>
> Website: www.enterprisedb.com
> EnterpriseDB Blog: http://blogs.enterprisedb.com/
> Follow us on Twitter: http://www.twitter.com/enterprisedb
>
> This e-mail message (and any attachment) is intended for the use of the
> individual or entity to whom it is addressed. This message contains
> information from EnterpriseDB Corporation that may be privileged,
> confidential, or exempt from disclosure under applicable law. If you are
> not the intended recipient or authorized to receive this for the intended
> recipient, any use, dissemination, distribution, retention, archiving, or
> copying of this communication is strictly prohibited. If you have received
> this e-mail in error, please notify the sender immediately by reply e-mail
> and delete this message.
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.sraoss.jp/pipermail/pgpool-hackers/attachments/20131011/9d491955/attachment-0001.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: exmgr_1stCut_rebase.patch
Type: application/octet-stream
Size: 473461 bytes
Desc: not available
URL: <http://www.sraoss.jp/pipermail/pgpool-hackers/attachments/20131011/9d491955/attachment-0001.obj>