View Issue Details

ID: 0000662
Project: pgpool-HA
Category: Bug
View Status: public
Last Update: 2020-11-30 12:42
Reporter: jeffty_w
Assigned To:
Priority: high
Severity: major
Reproducibility: always
Status: closed
Resolution: open
Platform: Linux
OS: RHEL
OS Version: 7.7
Summary: 0000662: Pgpool with watchdog cluster doesn't work after rebooting all the machines
Description:
Hi there,

I installed Pgpool-II with watchdog on 3 machines, following the example at https://www.pgpool.net/docs/latest/en/html/example-cluster.html.
Today the machines were powered off by accident. After powering them back on, I found that db2 and db3 are in 'down' status.
Is there a recommended way to start all of the nodes automatically after a reboot?

RPMs:
postgresql: 12.3
pgpool-II-pg12: 4.1.2

Machines: jeffty-db1, jeffty-db2, jeffty-db3
VIP: 192.168.1.208
Services configured during the installation:
The pgpool service is not enabled.
I ran systemctl enable postgresql-12 on every machine.

cat /usr/lib/systemd/system/postgresql-12.service
ExecStartPre=/usr/pgsql-12/bin/postgresql-12-check-db-dir ${PGDATA}
ExecStart=/usr/pgsql-12/bin/postgres -D ${PGDATA}

And here is how I ran online recovery for the standby nodes from db1 several months ago:
pcp_recovery_node -h 192.168.1.208 -p 9898 -U pgpool -n 1
pcp_recovery_node -h 192.168.1.208 -p 9898 -U pgpool -n 2
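
The two commands above differ only in the node id, so they can be wrapped in a small loop. This is a minimal sketch, not a confirmed fix for this report: the function name recover_nodes and the DRY_RUN switch are hypothetical, and only the pcp_recovery_node options already shown above are used. By default it just prints the commands it would run.

```shell
# Values taken from this report.
PCP_HOST=192.168.1.208
PCP_PORT=9898
PCP_USER=pgpool

# Print (DRY_RUN=1, the default) or run pcp_recovery_node for each node id given.
# recover_nodes and DRY_RUN are illustrative names, not part of Pgpool-II.
recover_nodes() {
    for node_id in "$@"; do
        cmd="pcp_recovery_node -h $PCP_HOST -p $PCP_PORT -U $PCP_USER -n $node_id"
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "$cmd"
        else
            $cmd || return 1
        fi
    done
}
```

Calling recover_nodes 1 2 prints the two commands; DRY_RUN=0 recover_nodes 1 2 would actually run them and stop at the first failure.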


Now, after powering on those machines, only db1's postgresql service is active. I started the pgpool service manually on each machine, one by one, and checked the watchdog info on db1:
-bash-4.2$ pcp_watchdog_info -h 192.168.1.208 -p 9898 -U pgpool

3 YES jeffty-db1:9999 Linux jeffty-db1 jeffty-db1

jeffty-db1:9999 Linux jeffty-db1 jeffty-db1 9999 9000 4 MASTER
jeffty-db2:9999 Linux jeffty-db2 jeffty-db2 9999 9000 7 STANDBY
jeffty-db3:9999 Linux jeffty-db3 jeffty-db3 9999 9000 7 STANDBY

That looks good. Then I tried:

-bash-4.2$ psql -h 192.168.1.208 -p 9999 -U pgpool postgres -c "show pool_nodes"

 node_id | hostname   | port | status | lb_weight | role    | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
---------+------------+------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
 0       | jeffty-db1 | 5432 | up     | 0.333333  | primary | 247        | true              | 0                 |                   |                        | 2020-11-28 14:04:24
 1       | jeffty-db2 | 5432 | down   | 0.333333  | standby | 0          | false             | 0                 |                   |                        | 2020-11-28 14:01:26
 2       | jeffty-db3 | 5432 | down   | 0.333333  | standby | 0          | false             | 0                 |                   |                        | 2020-11-28 14:01:28
(3 rows)
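
With more backends, the down nodes can be picked out of this output mechanically. The following is a rough sketch, assuming the column order shown above (node_id first, status fourth); list_down_nodes is a hypothetical helper name, not a Pgpool-II command.

```shell
# Print the node_id of every backend that "show pool_nodes" reports as down.
# Reads the psql table output on stdin; field 1 is node_id, field 4 is status.
list_down_nodes() {
    awk -F'|' '
        NF >= 4 {
            id = $1; status = $4
            gsub(/[ \t]/, "", id)       # strip column padding
            gsub(/[ \t]/, "", status)
            # header, separator and "(3 rows)" lines have no numeric id
            if (id ~ /^[0-9]+$/ && status == "down") print id
        }'
}

# Example with the VIP and user from this report:
# psql -h 192.168.1.208 -p 9999 -U pgpool postgres -c "show pool_nodes" | list_down_nodes
```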

db2 and db3 are in 'down' status. I then logged in to them and tried:
db2#su - postgres
-bash-4.2$ psql -U postgres
psql: error: could not connect to server: could not connect to server: No such file or directory
    Is the server running locally and accepting
    connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"?

So I had to start PostgreSQL manually with:
/usr/pgsql-12/bin/pg_ctl -D /var/lib/pgsql/12/data/ start

Now I can log in to db2 with psql -U postgres, but it does not seem to be replicating: I created a table on db2 and could not find the new table on db1.

To summarize, after rebooting:
1. If I don't manually start the postgresql service on db2 and db3, the cluster is reduced to a single node, with db1 as the only database instance.
2. If I do manually start the postgresql service on db2 and db3, it is effectively still a single-node cluster, since changes on db2 are not synchronized to db1.
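
The role of each node can be checked directly on the PostgreSQL instance: a streaming-replication standby is in recovery and rejects writes, so a node that accepts CREATE TABLE is running as an independent primary. Below is a minimal sketch, assuming the postgres user can connect to port 5432 on each host without a password prompt; node_role and role_from_recovery_flag are hypothetical helper names.

```shell
# Map the result of "select pg_is_in_recovery()" (t or f) to a role name.
role_from_recovery_flag() {
    case "$1" in
        t) echo standby ;;
        f) echo primary ;;
        *) echo unknown ;;
    esac
}

# Ask a PostgreSQL instance directly (port 5432, not the pgpool port 9999).
# -A (unaligned) and -t (tuples only) make psql print just "t" or "f".
node_role() {
    flag=$(psql -h "$1" -p 5432 -U postgres -At \
        -c "select pg_is_in_recovery()" postgres) || flag=
    role_from_recovery_flag "$flag"
}

# Example with the hostnames from this report:
# for h in jeffty-db1 jeffty-db2 jeffty-db3; do echo "$h: $(node_role "$h")"; done
```

If db2 or db3 reports primary here, it is no longer following db1, whatever Pgpool-II shows for it.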

Which steps did I miss in this process? Any suggestions or guidance would be appreciated.

Best Regards,
Jeffty




Steps To Reproduce:
1. Install pgpool 4.1.2 and postgresql 12.3 on 3 machines as described at https://www.pgpool.net/docs/latest/en/html/example-cluster.html.
2. Enable the postgresql-12 services.
3. Don't enable the pgpool services.
4. Reboot all of the machines.
Tags: No tags attached.

Activities

pengbo

2020-11-30 11:47

developer   ~0003615

This issue is not related to "pgpool-HA".
Could you close it and create a new one under the project "Pgpool-II"?

jeffty_w

2020-11-30 12:33

reporter   ~0003616

Sorry for that.
New ticket: https://www.pgpool.net/mantisbt/view.php?id=664

pengbo

2020-11-30 12:42

developer   ~0003618

Close issue

Issue History

Date Modified Username Field Change
2020-11-29 05:52 jeffty_w New Issue
2020-11-30 11:47 pengbo Note Added: 0003615
2020-11-30 12:33 jeffty_w Note Added: 0003616
2020-11-30 12:42 pengbo Status new => closed
2020-11-30 12:42 pengbo Note Added: 0003618