[pgpool-general: 2048] Re: Suggestion about two datacenters connected through WAN
Tatsuo Ishii
ishii at postgresql.org
Mon Aug 19 07:27:45 JST 2013
I'm not familiar with pacemaker at all, so this is just a guess. Do
you sync the PostgreSQL database cluster using DRBD? If so, I think you
should not do that. PostgreSQL modifies the database through the
mounted file system, and in the meantime DRBD modifies the same file
system, which would lead to corruption.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp
> Hi Tatsuo.
>
>> What kind of problem do you have with PostgreSQL streaming replication?
>
> I don't know if this is the right forum, but maybe you can help me. The issue does not directly concern pgpool, but I'd like to use pgpool once I solve it. I already wrote to pgsql-general. Nobody has answered yet.
>
> In summary, the main issue arises after I set up streaming replication: I am unable to stop the postgresql service correctly on the master. After issuing /etc/init.d/postgresql-9.2 stop, the postmaster.pid file remains on the filesystem and, moreover, it is corrupted. I am unable to delete it with the rm command.
>
> It looks like this:
> [root@tstcaps01 ~]# ll /var/lib/pgsql/9.2/data/
> ls: cannot access /var/lib/pgsql/9.2/data/postmaster.pid: No such file or directory
> total 56
> drwx------ 7 postgres postgres 62 Jun 26 17:13 base
> drwx------ 2 postgres postgres 4096 Aug 18 00:25 global
> drwx------ 2 postgres postgres 17 Jun 26 09:54 pg_clog
> -rw------- 1 postgres postgres 5127 Aug 17 16:24 pg_hba.conf
> -rw------- 1 postgres postgres 1636 Jun 26 09:54 pg_ident.conf
> drwx------ 2 postgres postgres 4096 Jul 2 00:00 pg_log
> drwx------ 4 postgres postgres 34 Jun 26 09:53 pg_multixact
> drwx------ 2 postgres postgres 17 Aug 18 00:23 pg_notify
> drwx------ 2 postgres postgres 6 Jun 26 09:53 pg_serial
> drwx------ 2 postgres postgres 6 Jun 26 09:53 pg_snapshots
> drwx------ 2 postgres postgres 6 Aug 18 00:25 pg_stat_tmp
> drwx------ 2 postgres postgres 17 Jun 26 09:54 pg_subtrans
> drwx------ 2 postgres postgres 6 Jun 26 09:53 pg_tblspc
> drwx------ 2 postgres postgres 6 Jun 26 09:53 pg_twophase
> -rw------- 1 postgres postgres 4 Jun 26 09:53 PG_VERSION
> drwx------ 3 postgres postgres 4096 Aug 18 00:25 pg_xlog
> -rw------- 1 postgres postgres 19884 Aug 17 22:54 postgresql.conf
> -rw------- 1 postgres postgres 71 Aug 18 00:23 postmaster.opts
> ?????????? ? ? ? ? ? postmaster.pid
> -rw-r--r-- 1 postgres postgres 491 Aug 17 16:33 recovery.done
>
> Have you been in this kind of curious situation before? Did you solve it somehow?
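
The `?????????? ? ? ? ? ?` row in the listing means that `ls` read the name from the directory but could not `stat()` it. In case it helps with diagnosis, here is a minimal Python sketch that reports such entries. The function name `find_unstatable` is made up for illustration; the data directory path is the one from the listing above.

```python
import os

def find_unstatable(directory):
    """Return (name, error) pairs for entries that are listed but cannot be stat'ed."""
    broken = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        try:
            os.lstat(path)  # lstat, so a dangling symlink is not reported as broken
        except OSError as exc:
            broken.append((name, exc.strerror))
    return broken

if __name__ == "__main__":
    data_dir = "/var/lib/pgsql/9.2/data"  # path taken from the listing above
    if os.path.isdir(data_dir):
        for name, reason in find_unstatable(data_dir):
            print(f"{name}: {reason}")
```

An empty result on a healthy directory is expected. An entry that shows up here is known to the directory but not resolvable by the file system, which points at the storage layer rather than at PostgreSQL itself.
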
>
> I will try to explain the whole situation and how I got into it.
>
> The redundant environment is shown in the "graphic" representation below (http://www.asciiflow.com/#4899844131549967831)
>
>       +------------------------------------+
>       |                WAN                 |
> +-----+------+------------+          +-----v------+------------+
> |pgpool      |            |          |pgpool      |            |
> +------------+------------+          +------------+------------+
> |pgsql       |pgsql       |          |pgsql       |pgsql       |
> +------------+------------+          +------------+------------+
> |drbd-pri    |drbd-sec    |          |drbd-pri    |drbd-sec    |
> +------------+------------+          +------------+------------+
> |        pacemaker        |          |        pacemaker        |
> +-------------------------+          +-------------------------+
> |        corosync         |          |        corosync         |
> +------------+------------+          +------------+------------+
> |node1       |node2       |          |node1       |node2       |
> +------------+------------+          +------------+------------+
>             TC1                                   TC2
>
> At any given moment, only one postgresql instance is active in each technical center. Pgpool is not currently managed by pacemaker, because I wanted to test it first. Once it works, I will have pacemaker manage it using the pgpool-ha resource agent.
>
> Before streaming replication was established from TC1 to TC2, the migration of resources managed by pacemaker from node1 to node2 within TC1 was successful.
> After I established streaming replication and tried to move the resources (including pgsql) from node1 to node2, migration of the postgres resource failed, and I ended up with the aforementioned corrupted postmaster.pid file on the filesystem of node1. Pacemaker did actually kill the postgres process, but I think it somehow checks whether postmaster.pid still exists. If pacemaker finds that postmaster.pid is still there, it ends up with FAILED status.
> Now I am stuck with this postmaster.pid file and cannot continue debugging. I cannot start the postgres server, because even if I start it there are two identical postmaster.pid files. These are not clean conditions for testing and investigating.
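
For debugging, it may help to check by hand what a stop check of this kind could be looking at. The sketch below is only an assumption about the general idea, not how the pacemaker pgsql resource agent actually works: it reads the PID from the first line of postmaster.pid and asks the kernel whether that process still exists. The helper name `postmaster_state` and its return values are hypothetical.

```python
import os

def postmaster_state(data_dir):
    """Classify postmaster.pid in data_dir as 'absent', 'unreadable', 'stale' or 'running'."""
    pid_path = os.path.join(data_dir, "postmaster.pid")
    try:
        with open(pid_path) as f:
            first_line = f.readline().strip()  # first line holds the postmaster PID
        pid = int(first_line)
    except FileNotFoundError:
        # Name still listed but cannot be opened: the broken state from the listing.
        return "unreadable" if "postmaster.pid" in os.listdir(data_dir) else "absent"
    except (OSError, ValueError):
        return "unreadable"
    if pid <= 0:
        return "unreadable"  # never pass 0 or a negative value to os.kill
    try:
        os.kill(pid, 0)  # signal 0 only tests for existence, nothing is delivered
    except ProcessLookupError:
        return "stale"  # pid file left behind by a process that is gone
    except PermissionError:
        return "running"  # process exists but belongs to another user
    return "running"
```

A result of 'stale' would mean the server is down and only the file is left over, while 'unreadable' matches the situation where the directory entry exists but the file behind it does not.
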
>
> I would be grateful if I could get to the bottom of this issue. The day would be nicer then :-)
>
> Best regards,
> Michal Mistina
>
>