[pgpool-general: 8236] Re: Pgpool reporting inconsistent statuses

Bo Peng pengbo at sraoss.co.jp
Wed Jun 22 18:07:29 JST 2022


Hello,

Thank you for sharing the logs.

I have figured out the cause.
In your test you ran the following queries.

  1. create database testdb. 
  2. connect to database testdb via pgpool

Because of the replication deplay in streaming replication mode,
"testdb" was not yet applied in standby servers when pgpool tried to conntect to "testdb".

There are three workarounds:

1. Use synchronous replication with "synchronous_commit = remote_apply" in PostgreSQL
2. Use Pgpool-II snapshot isolation mode
   https://www.pgpool.net/docs/latest/en/html/runtime-config-running-mode.html#GUC-SNAPSHOT-ISOLATION-MODE
3. Add "sleep 1" after creating database testdb

Regards,

> Hello,
> 
> > Hi Bo,
> > 
> > Thank you for your answer! The python script I am running is, 
> > summarizing:
> > 
> >              engine_pool = 
> > create_engine('postgresql://postgres:password@ip:5432/testdb', 
> > echo=True)
> >              if database_exists(engine_pool.url):
> >                  LOGGER.info('Dropping the DB')
> >                  drop_database(engine_pool.url)
> > #                time.sleep(1)
> > 
> >              LOGGER.info('Creating the DB')
> >              create_database(engine_pool.url)
> > #            time.sleep(1)
> >              Session_pool = sessionmaker(bind=engine_pool)
> > 
> >              LOGGER.info('Creating all tables')
> >              Base.metadata.create_all(engine_pool)
> > #            time.sleep(1)
> > 
> >              try:
> >                  with Session_pool.begin() as session:
> >                      LOGGER.info('Adding record %d to primary', rand_id)
> >                      session.add(Message(message=f'Hello world 
> > {rand_id}!'))
> >              except sqlalchemy.exc.OperationalError:
> >                  raise RuntimeError("Objects could not be committed to 
> > db")
> > 
> >              try:
> >                  with Session_pool.begin() as session:
> >                      for i in session.query(Message).all():
> >                          LOGGER.info('Checking record on primary')
> >                          if i.message != f'Hello world {rand_id}!':
> >                             raise RuntimeError('Data did not reach the 
> > pool')
> >              except sqlalchemy.exc.OperationalError:
> >                  raise RuntimeError("Data is not present at the pool")
> > 
> > Please, note the 3 sleep statements: If those statements are in, so the 
> > sleeps are executed, everything works:
> 
> Thank you!
> 
> Because I would like to check the queries issued to Pgpool-II,
> could you turn on the following paramter and run your test again?
> 
>   log_per_node_statement = on
> 
> After your test, could you share the following informations?
> 
> - pgpool.conf
> - pgpool.log
> - logs of all PostgreSQL servers
> 
> > 2022-06-16 15:31:31,752 INFO sqlalchemy.engine.Engine select 
> > pg_catalog.version()
> > 2022-06-16 15:31:31,756 INFO sqlalchemy.engine.Engine select 
> > current_schema()
> > 2022-06-16 15:31:31,759 INFO sqlalchemy.engine.Engine show 
> > standard_conforming_strings
> > 2022-06-16 15:31:31,762 INFO sqlalchemy.engine.Engine SELECT 1 FROM 
> > pg_database WHERE datname='testdb'
> > 2022-06-16 15:31:31,828 INFO sqlalchemy.engine.Engine select 
> > pg_catalog.version()
> > 2022-06-16 15:31:31,830 INFO sqlalchemy.engine.Engine select 
> > current_schema()
> > 2022-06-16 15:31:31,832 INFO sqlalchemy.engine.Engine show 
> > standard_conforming_strings
> > 2022-06-16 15:31:31,833 INFO sqlalchemy.engine.Engine CREATE DATABASE 
> > testdb ENCODING 'utf8' TEMPLATE template1
> > 2022-06-16 15:31:32,423 INFO sqlalchemy.engine.Engine COMMIT using DBAPI 
> > connection.commit(), DBAPI should ignore due to autocommit mode
> > 2022-06-16 15:31:33,511 INFO sqlalchemy.engine.Engine select 
> > pg_catalog.version()
> > 2022-06-16 15:31:33,514 INFO sqlalchemy.engine.Engine select 
> > current_schema()
> > 2022-06-16 15:31:33,517 INFO sqlalchemy.engine.Engine show 
> > standard_conforming_strings
> > 2022-06-16 15:31:33,518 INFO sqlalchemy.engine.Engine BEGIN (implicit)
> > 2022-06-16 15:31:33,519 INFO sqlalchemy.engine.Engine select relname 
> > from pg_class c join pg_namespace n on n.oid=c.relnamespace where 
> > pg_catalog.pg_table_is_visible(c.oid) and relname=%(name)s
> > 2022-06-16 15:31:33,519 INFO sqlalchemy.engine.Engine [generated in 
> > 0.00024s] {'name': 'messages'}
> > 2022-06-16 15:31:33,523 INFO sqlalchemy.engine.Engine
> > CREATE TABLE messages (
> > 	id SERIAL NOT NULL,
> > 	message VARCHAR,
> > 	PRIMARY KEY (id)
> > )
> > 
> > 
> > [15:31:33] 
> > {/usr/local/lib64/python3.6/site-packages/sqlalchemy/log.py:118} INFO -
> > CREATE TABLE messages (
> > 	id SERIAL NOT NULL,
> > 	message VARCHAR,
> > 	PRIMARY KEY (id)
> > )
> > 
> > [...]
> > 
> > However, if instead of that, the sleep statements are out (so no sleeps 
> > active), the same test fails:
> > 2022-06-16 15:43:29,575 INFO sqlalchemy.engine.Engine select 
> > pg_catalog.version()
> > 2022-06-16 15:43:29,579 INFO sqlalchemy.engine.Engine select 
> > current_schema()
> > 2022-06-16 15:43:29,582 INFO sqlalchemy.engine.Engine show 
> > standard_conforming_strings
> > 2022-06-16 15:43:29,584 INFO sqlalchemy.engine.Engine SELECT 1 FROM 
> > pg_database WHERE datname='testdb'
> > 2022-06-16 15:43:29,647 INFO sqlalchemy.engine.Engine select 
> > pg_catalog.version()
> > 2022-06-16 15:43:29,650 INFO sqlalchemy.engine.Engine select 
> > current_schema()
> > 2022-06-16 15:43:29,652 INFO sqlalchemy.engine.Engine show 
> > standard_conforming_strings
> > 2022-06-16 15:43:29,653 INFO sqlalchemy.engine.Engine CREATE DATABASE 
> > testdb ENCODING 'utf8' TEMPLATE template1
> > 2022-06-16 15:43:30,211 INFO sqlalchemy.engine.Engine COMMIT using DBAPI 
> > connection.commit(), DBAPI should ignore due to autocommit mode
> > [15:43:30] {/home/felix.rubio-dalmau-b/test_pgsql.py:82} ERROR - Error: 
> > (psycopg2.OperationalError) ERROR:  unable to read message kind
> > DETAIL:  kind does not match between main(53) slot[1] (45)
> > 
> > Might you know if this a misconfiguration problem?
> > 
> > Regards!
> > 
> > 
> > ---
> > Felix Rubio
> > "Don't believe what you're told. Double check."
> > 
> > On 2022-06-16 07:26, Bo Peng wrote:
> > > Hello,
> > > 
> > >> Hi everybody,
> > >> 
> > >> I have managed to set up, from scratch, a postgresql 14 + pgpool 4.3.2
> > >> cluster, with streaming replication. I have also a simple test that
> > >> creates a database, a table on it, and adds some records, to then 
> > >> check
> > >> on all the members of the cluster if replication has been successful.
> > >> 
> > >> If I point my test to the primary node of the streaming replication
> > >> cluster, my test succeeds and all is OK.
> > >> 
> > >> When I set up everything through pgpool, all seems to work:
> > >> 
> > >>   node_id |  hostname  | port | status | pg_status | lb_weight |  role
> > >> | pg_role | select_cnt | load_balance_node | replication_delay |
> > >> replication_state | replication_sync_state | last_status_change
> > >> ---------+------------+------+--------+-----------+-----------+---------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
> > >>   0       | pgsql-0000 | 5433 | up     | up        | 0.333333  | 
> > >> primary
> > >> | primary | 0          | false             | 0                 |
> > >>            |                        | 2022-06-14 11:28:11
> > >>   1       | pgsql-0001 | 5433 | up     | up        | 0.333333  | 
> > >> standby
> > >> | standby | 0          | false             | 0                 |
> > >> streaming         | async                  | 2022-06-14 11:28:11
> > >>   2       | pgsql-0002 | 5433 | up     | up        | 0.333333  | 
> > >> standby
> > >> | standby | 0          | true              | 0                 |
> > >> streaming         | async                  | 2022-06-14 11:28:11
> > >> 
> > >> If now run my test, this is what I get back:
> > >> "DETAIL:  kind does not match between main(53) slot[1] (45)"
> > > 
> > > During you test, what queries were issued?
> > > 
> > >> I have found an article claiming this is a problem caused by the 
> > >> number
> > >> of connections. To this end, I have max_connections=40 in postgresql,
> > >> and num_init_children=7 and max_pool=5 in pgpool.conf. As 5*7 < 40, I
> > >> should ok on the side. Does anybody has a clue on what might be going
> > >> on, here?
> > >> 
> > >> Regards!
> > >> 
> > >> --
> > >> Felix Rubio
> > >> "Don't believe what you're told. Double check."
> > >> _______________________________________________
> > >> pgpool-general mailing list
> > >> pgpool-general at pgpool.net
> > >> http://www.pgpool.net/mailman/listinfo/pgpool-general
> 
> 
> -- 
> Bo Peng <pengbo at sraoss.co.jp>
> SRA OSS, Inc. Japan
> https://www.sraoss.co.jp/
> _______________________________________________
> pgpool-general mailing list
> pgpool-general at pgpool.net
> http://www.pgpool.net/mailman/listinfo/pgpool-general


-- 
Bo Peng <pengbo at sraoss.co.jp>
SRA OSS, Inc. Japan
https://www.sraoss.co.jp/


More information about the pgpool-general mailing list