[pgpool-general: 8280] Re: Pgpool reporting inconsistent statuses

Bo Peng pengbo at sraoss.co.jp
Fri Jul 1 13:57:39 JST 2022


Hello,

> Hi Bo,
> 
> Sorry for the radio silence, last week was pretty crazy. I have added 
> that setting to the primary server of my cluster, and the setting seems 
> to keep switching afterwards:
> 
> docker run --rm -e PGPASSWORD=adminpassword -it postgres:14 psql -h 
> 10.201.212.52 -p 5432 -U postgres postgres -c "SHOW synchronous_commit;"
>   synchronous_commit
> --------------------
>   remote_apply
> (1 row)
> 
> docker run --rm -e PGPASSWORD=adminpassword -it postgres:14 psql -h 
> 10.201.212.52 -p 5432 -U postgres postgres -c "SHOW synchronous_commit;"
>   synchronous_commit
> --------------------
>   on
> (1 row)
> 
> docker run --rm -e PGPASSWORD=adminpassword -it postgres:14 psql -h 
> 10.201.212.52 -p 5432 -U postgres postgres -c "SHOW synchronous_commit;"
>   synchronous_commit
> --------------------
>   on
> (1 row)
> 
> docker run --rm -e PGPASSWORD=adminpassword -it postgres:14 psql -h 
> 10.201.212.52 -p 5432 -U postgres postgres -c "SHOW synchronous_commit;"
>   synchronous_commit
> --------------------
>   remote_apply
> (1 row)
> 
> However, if I run the same command against port 5433 (so that I access 
> not pgpool but the primary server directly) I always get remote_apply.
> 
> If I run the same test I used to run (create the database, etc.) it 
> fails still, I guess that because of the parameter switching.
> 
> What could make this setting to not hold its state?

It is the expected behaviour of load balancing.
Pgpool-II routes the "SHOW synchronous_commit" queries to any PostgreSQL server.

If pgpool-II sends "SHOW synchronous_commit" to primary, the value will be "remote_apply".
If pgpool-II sends "SHOW synchronous_commit" to standby, the value will be "on".


Your "show pool_nodes" shows that two standby servers are in "async" state.
I think you need to configure "synchronous_standby_names" parameter in PostgreSQL.

  node_id |  hostname  | port | status | pg_status | lb_weight |  role   | pg_role |  replication_state | replication_sync_state 
---------+------------+------+--------+-----------+-----------+---------+---------+---------------------+------------------------
  0       | pgsql-0000 | 5433 | up     | up        | 0.333333  | primary | primary |                    |                        
  1       | pgsql-0001 | 5433 | up     | up        | 0.333333  | standby | standby |  streaming         | async                  
  2       | pgsql-0002 | 5433 | up     | up        | 0.333333  | standby | standby |  streaming         | async                  


Regards,

> > Thank you for sharing the logs.
> > 
> > I have figured out the cause.
> > In your test you ran the following queries.
> > 
> >   1. create database testdb.
> >   2. connect to database testdb via pgpool
> > 
> > Because of the replication deplay in streaming replication mode,
> > "testdb" was not yet applied in standby servers when pgpool tried to
> > conntect to "testdb".
> > 
> > There are three workarounds:
> > 
> > 1. Use synchronous replication with "synchronous_commit =
> > remote_apply" in PostgreSQL
> > 2. Use Pgpool-II snapshot isolation mode
> > 
> > https://www.pgpool.net/docs/latest/en/html/runtime-config-running-mode.html#GUC-SNAPSHOT-ISOLATION-MODE
> > 3. Add "sleep 1" after creating database testdb
> > 
> > Regards,
> > 
> >> Hello,
> >> 
> >> > Hi Bo,
> >> >
> >> > Thank you for your answer! The python script I am running is,
> >> > summarizing:
> >> >
> >> >              engine_pool =
> >> > create_engine('postgresql://postgres:password@ip:5432/testdb',
> >> > echo=True)
> >> >              if database_exists(engine_pool.url):
> >> >                  LOGGER.info('Dropping the DB')
> >> >                  drop_database(engine_pool.url)
> >> > #                time.sleep(1)
> >> >
> >> >              LOGGER.info('Creating the DB')
> >> >              create_database(engine_pool.url)
> >> > #            time.sleep(1)
> >> >              Session_pool = sessionmaker(bind=engine_pool)
> >> >
> >> >              LOGGER.info('Creating all tables')
> >> >              Base.metadata.create_all(engine_pool)
> >> > #            time.sleep(1)
> >> >
> >> >              try:
> >> >                  with Session_pool.begin() as session:
> >> >                      LOGGER.info('Adding record %d to primary', rand_id)
> >> >                      session.add(Message(message=f'Hello world
> >> > {rand_id}!'))
> >> >              except sqlalchemy.exc.OperationalError:
> >> >                  raise RuntimeError("Objects could not be committed to
> >> > db")
> >> >
> >> >              try:
> >> >                  with Session_pool.begin() as session:
> >> >                      for i in session.query(Message).all():
> >> >                          LOGGER.info('Checking record on primary')
> >> >                          if i.message != f'Hello world {rand_id}!':
> >> >                             raise RuntimeError('Data did not reach the
> >> > pool')
> >> >              except sqlalchemy.exc.OperationalError:
> >> >                  raise RuntimeError("Data is not present at the pool")
> >> >
> >> > Please, note the 3 sleep statements: If those statements are in, so the
> >> > sleeps are executed, everything works:
> >> 
> >> Thank you!
> >> 
> >> Because I would like to check the queries issued to Pgpool-II,
> >> could you turn on the following paramter and run your test again?
> >> 
> >>   log_per_node_statement = on
> >> 
> >> After your test, could you share the following informations?
> >> 
> >> - pgpool.conf
> >> - pgpool.log
> >> - logs of all PostgreSQL servers
> >> 
> >> > 2022-06-16 15:31:31,752 INFO sqlalchemy.engine.Engine select
> >> > pg_catalog.version()
> >> > 2022-06-16 15:31:31,756 INFO sqlalchemy.engine.Engine select
> >> > current_schema()
> >> > 2022-06-16 15:31:31,759 INFO sqlalchemy.engine.Engine show
> >> > standard_conforming_strings
> >> > 2022-06-16 15:31:31,762 INFO sqlalchemy.engine.Engine SELECT 1 FROM
> >> > pg_database WHERE datname='testdb'
> >> > 2022-06-16 15:31:31,828 INFO sqlalchemy.engine.Engine select
> >> > pg_catalog.version()
> >> > 2022-06-16 15:31:31,830 INFO sqlalchemy.engine.Engine select
> >> > current_schema()
> >> > 2022-06-16 15:31:31,832 INFO sqlalchemy.engine.Engine show
> >> > standard_conforming_strings
> >> > 2022-06-16 15:31:31,833 INFO sqlalchemy.engine.Engine CREATE DATABASE
> >> > testdb ENCODING 'utf8' TEMPLATE template1
> >> > 2022-06-16 15:31:32,423 INFO sqlalchemy.engine.Engine COMMIT using DBAPI
> >> > connection.commit(), DBAPI should ignore due to autocommit mode
> >> > 2022-06-16 15:31:33,511 INFO sqlalchemy.engine.Engine select
> >> > pg_catalog.version()
> >> > 2022-06-16 15:31:33,514 INFO sqlalchemy.engine.Engine select
> >> > current_schema()
> >> > 2022-06-16 15:31:33,517 INFO sqlalchemy.engine.Engine show
> >> > standard_conforming_strings
> >> > 2022-06-16 15:31:33,518 INFO sqlalchemy.engine.Engine BEGIN (implicit)
> >> > 2022-06-16 15:31:33,519 INFO sqlalchemy.engine.Engine select relname
> >> > from pg_class c join pg_namespace n on n.oid=c.relnamespace where
> >> > pg_catalog.pg_table_is_visible(c.oid) and relname=%(name)s
> >> > 2022-06-16 15:31:33,519 INFO sqlalchemy.engine.Engine [generated in
> >> > 0.00024s] {'name': 'messages'}
> >> > 2022-06-16 15:31:33,523 INFO sqlalchemy.engine.Engine
> >> > CREATE TABLE messages (
> >> > 	id SERIAL NOT NULL,
> >> > 	message VARCHAR,
> >> > 	PRIMARY KEY (id)
> >> > )
> >> >
> >> >
> >> > [15:31:33]
> >> > {/usr/local/lib64/python3.6/site-packages/sqlalchemy/log.py:118} INFO -
> >> > CREATE TABLE messages (
> >> > 	id SERIAL NOT NULL,
> >> > 	message VARCHAR,
> >> > 	PRIMARY KEY (id)
> >> > )
> >> >
> >> > [...]
> >> >
> >> > However, if instead of that, the sleep statements are out (so no sleeps
> >> > active), the same test fails:
> >> > 2022-06-16 15:43:29,575 INFO sqlalchemy.engine.Engine select
> >> > pg_catalog.version()
> >> > 2022-06-16 15:43:29,579 INFO sqlalchemy.engine.Engine select
> >> > current_schema()
> >> > 2022-06-16 15:43:29,582 INFO sqlalchemy.engine.Engine show
> >> > standard_conforming_strings
> >> > 2022-06-16 15:43:29,584 INFO sqlalchemy.engine.Engine SELECT 1 FROM
> >> > pg_database WHERE datname='testdb'
> >> > 2022-06-16 15:43:29,647 INFO sqlalchemy.engine.Engine select
> >> > pg_catalog.version()
> >> > 2022-06-16 15:43:29,650 INFO sqlalchemy.engine.Engine select
> >> > current_schema()
> >> > 2022-06-16 15:43:29,652 INFO sqlalchemy.engine.Engine show
> >> > standard_conforming_strings
> >> > 2022-06-16 15:43:29,653 INFO sqlalchemy.engine.Engine CREATE DATABASE
> >> > testdb ENCODING 'utf8' TEMPLATE template1
> >> > 2022-06-16 15:43:30,211 INFO sqlalchemy.engine.Engine COMMIT using DBAPI
> >> > connection.commit(), DBAPI should ignore due to autocommit mode
> >> > [15:43:30] {/home/felix.rubio-dalmau-b/test_pgsql.py:82} ERROR - Error:
> >> > (psycopg2.OperationalError) ERROR:  unable to read message kind
> >> > DETAIL:  kind does not match between main(53) slot[1] (45)
> >> >
> >> > Might you know if this a misconfiguration problem?
> >> >
> >> > Regards!
> >> >
> >> >
> >> > ---
> >> > Felix Rubio
> >> > "Don't believe what you're told. Double check."
> >> >
> >> > On 2022-06-16 07:26, Bo Peng wrote:
> >> > > Hello,
> >> > >
> >> > >> Hi everybody,
> >> > >>
> >> > >> I have managed to set up, from scratch, a postgresql 14 + pgpool 4.3.2
> >> > >> cluster, with streaming replication. I have also a simple test that
> >> > >> creates a database, a table on it, and adds some records, to then
> >> > >> check
> >> > >> on all the members of the cluster if replication has been successful.
> >> > >>
> >> > >> If I point my test to the primary node of the streaming replication
> >> > >> cluster, my test succeeds and all is OK.
> >> > >>
> >> > >> When I set up everything through pgpool, all seems to work:
> >> > >>
> >> > >>   node_id |  hostname  | port | status | pg_status | lb_weight |  role
> >> > >> | pg_role | select_cnt | load_balance_node | replication_delay |
> >> > >> replication_state | replication_sync_state | last_status_change
> >> > >> ---------+------------+------+--------+-----------+-----------+---------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
> >> > >>   0       | pgsql-0000 | 5433 | up     | up        | 0.333333  |
> >> > >> primary
> >> > >> | primary | 0          | false             | 0                 |
> >> > >>            |                        | 2022-06-14 11:28:11
> >> > >>   1       | pgsql-0001 | 5433 | up     | up        | 0.333333  |
> >> > >> standby
> >> > >> | standby | 0          | false             | 0                 |
> >> > >> streaming         | async                  | 2022-06-14 11:28:11
> >> > >>   2       | pgsql-0002 | 5433 | up     | up        | 0.333333  |
> >> > >> standby
> >> > >> | standby | 0          | true              | 0                 |
> >> > >> streaming         | async                  | 2022-06-14 11:28:11
> >> > >>
> >> > >> If now run my test, this is what I get back:
> >> > >> "DETAIL:  kind does not match between main(53) slot[1] (45)"
> >> > >
> >> > > During you test, what queries were issued?
> >> > >
> >> > >> I have found an article claiming this is a problem caused by the
> >> > >> number
> >> > >> of connections. To this end, I have max_connections=40 in postgresql,
> >> > >> and num_init_children=7 and max_pool=5 in pgpool.conf. As 5*7 < 40, I
> >> > >> should ok on the side. Does anybody has a clue on what might be going
> >> > >> on, here?
> >> > >>
> >> > >> Regards!
> >> > >>
> >> > >> --
> >> > >> Felix Rubio
> >> > >> "Don't believe what you're told. Double check."
> >> > >> _______________________________________________
> >> > >> pgpool-general mailing list
> >> > >> pgpool-general at pgpool.net
> >> > >> http://www.pgpool.net/mailman/listinfo/pgpool-general
> >> 
> >> 
> >> --
> >> Bo Peng <pengbo at sraoss.co.jp>
> >> SRA OSS, Inc. Japan
> >> https://www.sraoss.co.jp/
> >> _______________________________________________
> >> pgpool-general mailing list
> >> pgpool-general at pgpool.net
> >> http://www.pgpool.net/mailman/listinfo/pgpool-general


-- 
Bo Peng <pengbo at sraoss.co.jp>
SRA OSS, Inc. Japan
https://www.sraoss.co.jp/


More information about the pgpool-general mailing list