[Pgpool-general] General pgpool locking issues

Axel.Weber at rtl.de Axel.Weber at rtl.de
Mon Aug 7 12:55:51 UTC 2006


Thank you for you quick response.
I use this environment:
Gentoo server
[I--] [  ] dev-db/postgresql-8.0.4 (0)
[I--] [  ] dev-db/pgpool-3.0.1 (0) 

Sorry for missing the whole log statement, here it is:
pgpool: 2006-07-11 11:00:57 ERROR: pid 13899: pool_process_query: kind
does not match between backends master(E) secondary (D)


I will issue the select statement to the db itself as a next step.

Axel Weber

> You seem to use totally different envrionment this time. Last 
> time you used at least 7.4 or later version of PostgreSQL. 
> Now you use 7.3 or older version of PostgreSQL. I'm not sure 
> what version of pgpool now you are using.
> 
> Anyway, because your log seems imcomplete in the most important line, 
> 
> > pgpool: 2006-07-11 11:00:57 ERROR: pid 13899: 
> pool_process_query: kind 
> > does not match between backends master(E) secondar
> 
> this is just a guess: it seems your SELECT fails on master 
> (the log says master returns "E", which is an error message 
> packet from PostgreSQL). I don't know what secondary returned 
> since the log ends in a halfway, but it definitely returns 
> other than error packet. You can issue the select statement 
> to each PostgreSQL directly to see what happens.
> SELECT
> maz_media.id,maz_media.kassetten_id,maz_media.beschreibu
> ng,maz_media.title,maz_media.mandanten_id,maz_segment.timeanfa
> ng,maz_med
> ia.media_id,maz_media.entrytime,maz_media.entryoper,maz_media.dupop
> er,maz_segment.timelaenge,maz_segment.timeende from 
> maz_media, maz_segment where maz_media.id LIKE '%7%' and 
> maz_media.mandanten_id='1'
> and
> maz_media.media_id = maz_segment.media_id
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> 
> > Hi,
> > 
> > I got exactly the same behaviour, as I posted several weeks ago.
> > One strange thing about this all is, that the degeneration 
> happens on 
> > a regular schedule every 2 1/2 week.
> > 
> > Here you'll find my log and also the pgpool status result, 
> I hope this 
> > helps to analyse this.
> > I would like to emphasize that this particular select 
> statement lasts 
> > about 30 sec.
> > 
> > If you need further information, please don't hesitate to 
> get in touch 
> > with me.
> > 
> > Log:
> > 
> > pgpool: 2006-07-11 11:00:24 DEBUG: pid 13899: read kind 
> from frontend
> > Q(51)
> > pgpool: 2006-07-11 11:00:24 DEBUG: pid 13899: 
> pool_read_string: read 
> > all from pending data. po:0 len:0
> > pgpool: 2006-07-11 11:00:24 DEBUG: pid 13899: Query: SELECT 
> > maz_media.id,maz_media.kassetten_id,maz_media.beschreibu
> > 
> ng,maz_media.title,maz_media.mandanten_id,maz_segment.timeanfang,maz_m
> > ed 
> ia.media_id,maz_media.entrytime,maz_media.entryoper,maz_media.dupop
> > er,maz_segment.timelaenge,maz_segment.timeende from maz_media, 
> > maz_segment where maz_media.id LIKE '%7%' and 
> maz_media.mandanten_id='1'
> > and
> >  maz_media.media_id = maz_segment.media_id
> > pgpool: 2006-07-11 11:00:24 DEBUG: pid 13899: waiting for master 
> > completing the query
> > pgpool: 2006-07-11 11:00:28 DEBUG: pid 13899: read kind 
> from backend P
> > pgpool: 2006-07-11 11:00:28 DEBUG: pid 13899: pool_process_query:
> > waiting for secondary for data ready
> > pgpool: 2006-07-11 11:00:28 DEBUG: pid 13899: 
> pool_process_query: kind 
> > from backend: P
> > pgpool: 2006-07-11 11:00:28 DEBUG: pid 13899: 
> pool_read_string: read 
> > all from pending data. po:6 len:1017
> > pgpool: 2006-07-11 11:00:28 DEBUG: pid 13899: 
> pool_read_string: read 
> > all from pending data. po:6 len:1017
> > pgpool: 2006-07-11 11:00:28 DEBUG: pid 13899: read kind 
> from backend 
> > pending data T len: 1016 po: 7
> > pgpool: 2006-07-11 11:00:28 DEBUG: pid 13899: 
> pool_process_query: kind 
> > from backend: T
> > pgpool: 2006-07-11 11:00:28 DEBUG: pid 13899: 
> pool_read_string: read 
> > all from pending data. po:12 len:1011
> > pgpool: 2006-07-11 11:00:28 DEBUG: pid 13899: 
> pool_read_string: read 
> > all from pending data. po:12 len:1011
> > pgpool: 2006-07-11 11:00:28 DEBUG: pid 13899: 
> RowDescription: type oid:
> > 1043
> > pgpool: 2006-07-11 11:00:28 DEBUG: pid 13899: RowDescription: field
> > size: 65535
> > pgpool: 2006-07-11 11:00:28 DEBUG: pid 13899: 
> RowDescription: modifier:
> > 0
> > pgpool: 2006-07-11 11:00:28 DEBUG: pid 13899: 
> pool_read_string: read 
> > all from pending data. po:35 len:988
> > pgpool: 2006-07-11 11:00:28 DEBUG: pid 13899: 
> pool_read_string: read 
> > all from pending data. po:35 len:988
> > pgpool: 2006-07-11 11:00:28 DEBUG: pid 13899: 
> RowDescription: type oid:
> > 1043
> > pgpool: 2006-07-11 11:00:28 DEBUG: pid 13899: RowDescription: field
> > size: 65535
> > pgpool: 2006-07-11 11:00:28 DEBUG: pid 13899: 
> RowDescription: modifier:
> > 0
> > pgpool: 2006-07-11 11:00:28 DEBUG: pid 13899: 
> pool_read_string: read 
> > all from pending data. po:58 len:965
> > pgpool: 2006-07-11 11:00:28 DEBUG: pid 13899: 
> pool_read_string: read 
> > all from pending data. po:58 len:965
> > pgpool: 2006-07-11 11:00:28 DEBUG: pid 13899: 
> RowDescription: type oid:
> > 1043
> > pgpool: 2006-07-11 11:00:28 DEBUG: pid 13899: RowDescription: field
> > size: 65535
> > pgpool: 2006-07-11 11:00:28 DEBUG: pid 13899: 
> RowDescription: modifier:
> > 0
> >                         .
> >                         . (~ 30 sec. for select)
> >                         .
> > pgpool: 2006-07-11 11:00:57 DEBUG: pid 13899: read kind 
> from backend 
> > pending data D len: 174 po: 476
> > pgpool: 2006-07-11 11:00:57 DEBUG: pid 13899: 
> pool_process_query: kind 
> > from backend: D
> > pgpool: 2006-07-11 11:00:57 DEBUG: pid 13899: AsciiRow: len: 6 data:
> > 211678
> > pgpool: 2006-07-11 11:00:57 DEBUG: pid 13899: AsciiRow: len: 5 data:
> > S0671
> > pgpool: 2006-07-11 11:00:57 DEBUG: pid 13899: AsciiRow: 
> len: 23 data:
> > T-D1 EINPACKEN MOTIV 67
> > pgpool: 2006-07-11 11:00:57 DEBUG: pid 13899: AsciiRow: len: 4 data:
> > T-D1
> > pgpool: 2006-07-11 11:00:57 DEBUG: pid 13899: AsciiRow: 
> len: 1 data: 1
> > pgpool: 2006-07-11 11:00:57 DEBUG: pid 13899: AsciiRow: len: 5 data:
> > 19955
> > pgpool: 2006-07-11 11:00:57 DEBUG: pid 13899: AsciiRow: len: 5 data:
> > 17136
> > pgpool: 2006-07-11 11:00:57 DEBUG: pid 13899: AsciiRow: 
> len: 22 data:
> > 2000-10-26 13:02:00+00
> > pgpool: 2006-07-11 11:00:57 DEBUG: pid 13899: AsciiRow: 
> len: 2 data: 
> > RC
> > pgpool: 2006-07-11 11:00:57 DEBUG: pid 13899: AsciiRow: 
> len: 3 data: 
> > 504
> > pgpool: 2006-07-11 11:00:57 DEBUG: pid 13899: AsciiRow: len: 5 data:
> > 20459
> > pgpool: 2006-07-11 11:00:57 ERROR: pid 13899: 
> pool_process_query: kind 
> > does not match between backends master(E) secondar
> > pgpool: 2006-07-11 11:00:57 LOG:   pid 13899: notice_backend_error:
> > master: 1 fail over request from pid 13899
> > pgpool: 2006-07-11 11:00:57 DEBUG: pid 13891: 
> failover_handler called
> > pgpool: 2006-07-11 11:00:57 LOG:   pid 13891: starting degeneration.
> > shutdown master host clever(5432)
> > pgpool: 2006-07-11 11:00:57 DEBUG: pid 13891: kill 13896
> > pgpool: 2006-07-11 11:00:57 DEBUG: pid 13891: kill 13897
> > pgpool: 2006-07-11 11:00:57 DEBUG: pid 13891: kill 13898
> > pgpool: 2006-07-11 11:00:57 DEBUG: pid 13891: kill 13899
> > pgpool: 2006-07-11 11:00:57 DEBUG: pid 13891: kill 13900
> > pgpool: 2006-07-11 11:00:57 DEBUG: pid 13891: kill 13901
> > pgpool: 2006-07-11 11:00:57 DEBUG: pid 13891: kill 13902
> > pgpool: 2006-07-11 11:00:57 DEBUG: pid 13891: kill 13903
> > pgpool: 2006-07-11 11:00:57 DEBUG: pid 13896: child 
> receives shutdown 
> > request signal 3
> > pgpool: 2006-07-11 11:00:57 DEBUG: pid 13897: child 
> receives shutdown 
> > request signal 3
> > pgpool: 2006-07-11 11:00:57 DEBUG: pid 13898: child 
> receives shutdown 
> > request signal 3
> > pgpool: 2006-07-11 11:00:57 DEBUG: pid 13900: child 
> receives shutdown 
> > request signal 3
> > pgpool: 2006-07-11 11:00:57 DEBUG: pid 13901: child 
> receives shutdown 
> > request signal 3
> > pgpool: 2006-07-11 11:00:57 DEBUG: pid 13902: child 
> receives shutdown 
> > request signal 3
> > pgpool: 2006-07-11 11:00:57 DEBUG: pid 13903: child 
> receives shutdown 
> > request signal 3
> > pgpool: 2006-07-11 11:00:57 DEBUG: pid 31938: I am 31938
> > pgpool: 2006-07-11 11:00:57 DEBUG: pid 31939: I am 31939
> > pgpool: 2006-07-11 11:00:57 DEBUG: pid 31940: I am 31940
> > pgpool: 2006-07-11 11:00:57 DEBUG: pid 31941: I am 31941
> > pgpool: 2006-07-11 11:00:57 DEBUG: pid 31942: I am 31942
> > pgpool: 2006-07-11 11:00:57 DEBUG: pid 31943: I am 31943
> > pgpool: 2006-07-11 11:00:57 DEBUG: pid 31944: I am 31944
> > pgpool: 2006-07-11 11:00:57 DEBUG: pid 31945: I am 31945
> > pgpool: 2006-07-11 11:00:57 LOG:   pid 13891: degeneration done.
> > shutdown master host clever(5432)
> > pgpool: 2006-07-11 11:00:57 DEBUG: pid 13891: reap_handler called
> > pgpool: 2006-07-11 11:03:53 DEBUG: pid 31945: I am 31945 accept fd 6
> > pgpool: 2006-07-11 11:03:53 DEBUG: pid 31945: Protocol 
> Major: 2 Minor: 
> > 0
> > database: SmartMaz user: abit
> > pgpool: 2006-07-11 11:03:53 DEBUG: pid 31945: trying clear text 
> > password authentication
> > pgpool: 2006-07-11 11:03:53 DEBUG: pid 31945: read kind 
> from backend 
> > pending data Z len: 0 po: 0
> > pgpool: 2006-07-11 11:03:53 DEBUG: pid 31945: 
> pool_process_query: kind 
> > from backend: Z
> > pgpool: 2006-07-11 11:03:53 DEBUG: pid 31945: read kind 
> from frontend
> > Q(51)
> > pgpool: 2006-07-11 11:03:53 DEBUG: pid 31945: 
> pool_read_string: read 
> > all from pending data. po:0 len:0
> > pgpool: 2006-07-11 11:03:53 DEBUG: pid 31945: Query:
> > pgpool: 2006-07-11 11:03:53 DEBUG: pid 31945: read kind 
> from backend I
> > pgpool: 2006-07-11 11:03:53 DEBUG: pid 31945: 
> pool_process_query: kind 
> > from backend: I
> > pgpool: 2006-07-11 11:03:53 DEBUG: pid 31945: read kind 
> from backend 
> > pending data Z len: 0 po: 0
> > pgpool: 2006-07-11 11:03:53 DEBUG: pid 31945: 
> pool_process_query: kind 
> > from backend: Z
> > pgpool: 2006-07-11 11:03:53 DEBUG: pid 31945: read kind 
> from frontend
> > Q(51)
> > pgpool: 2006-07-11 11:03:53 DEBUG: pid 31945: 
> pool_read_string: read 
> > all from pending data. po:0 len:0
> > pgpool: 2006-07-11 11:03:53 DEBUG: pid 31945: Query: select 
> version()
> > 
> > 
> > Pgpool status:
> > 
> > listen_addresses             | *
> > | host name(s) or IP address(es) to listen to
> >  port                         | 9999
> > | pgpool accepting port number
> >  socket_dir                   | /tmp
> > | pgpool socket directory
> >  backend_host_name            | clever
> > | master backend host name
> >  backend_port                 | 5432
> > | master backend port number
> >  secondary_backend_host_name  | smart
> > | secondary backend host name
> >  secondary_backend_port       | 5432
> > | secondary backend port number
> >  num_init_children            | 8
> > | # of children initially pre-forked
> >  child_life_time              | 0
> > | if idle for this seconds, child exits
> >  connection_life_time         | 0
> > | if idle for this seconds, connection closes
> >  child_max_connections        | 0
> > | if max_connections received, chile exits
> >  max_pool                     | 4
> > | max # of connection pool per child
> >  logdir                       | /var/run
> > | logging directory
> >  backend_socket_dir           | /tmp
> > | Unix domain socket directory for the PostgreSQL server
> >  replication_mode             | 1
> > | non 0 if operating in replication mode
> >  replication_strict           | 1
> > | non 0 if operating in strict mode
> >  replication_timeout          | 5000
> > | if secondary does not respond in this milli seconds, abort the 
> > | session
> >  load_balance_mode            | 0
> > | non 0 if operating in load balancing mode
> >  weight_master                | 0.500000
> > | weight of master
> >  weight_secondary             | 0.500000
> > | weight of secondary
> >  replication_stop_on_mismatch | 1
> > | stop replication mode on fatal error
> >  reset_query_list             | ABORT; RESET ALL; SET SESSION
> > AUTHORIZATION DEFAULT;  | queries issued at the end of session
> >  print_timestamp              | 1
> > | if true print time stamp to each log line
> >  master_slave_mode            | 0
> > | if true, operate in master/slave mode
> >  connection_cache             | 1
> > | if true, cache connection pool
> >  health_check_timeout         | 60
> > | health check timeout
> >  health_check_period          | 600
> > | health check period
> >  health_check_user            | postgres
> > | health check user
> >  insert_lock                  | 0
> > | insert lock
> >  ignore_leading_white_space   | 0
> > | ignore leading white spaces
> >  current_backend_host_name    | clever
> > | current master host name
> >  current_backend_port         | 5432
> > | current master port #
> >  replication_enabled          | 1
> > | non 0 if actually operating in replication mode
> >  master_slave_enabled         | 0
> > | non 0 if actually operating in master/slave
> >  num_reset_queries            | 3
> > | number of queries in reset_query_list
> >  server_status                | master(clever on 5432) up
> > secondary(smart on 5432) up | server status
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > > Thank you very much for the reply.  Its helpful.
> > 
> > You are welcome.
> > 
> > > We have now turned on load_balancing and have set the weight to 1 
> > > and 0 (for db0 and db1).  And we still are getting "kind 
> mismatches".
> > > 
> > > Is this weight distribution of 1/0 valid?  Or does 1/0 
> overwrite the
> > load
> > > balancing parameter.    Is this not a recommended setting?
> > 
> > Let me look into this.
> > 
> > > We do direct SELECT connects to db1 via port 5432, thus we wanted 
> > > pgpool to just select from db0 vs do a 50/50 load balancing.
> > > 
> > > If the 1/0 setting is wrong let us know and we will 
> change it so we 
> > > will do something like 9/1 type load balancing.  If the 
> 1/0 weight 
> > > setting is valid why would we still be seeing "kind mismatches" on
> > selects?
> > > 
> > > Thanks in advance for any suggestions/input here.
> > 
> > Can you please provide debugging out put of pgpool by 
> turing on -d and 
> > -n option? We need info for when "kind mismatches" erros occures in 
> > your setting.
> > --
> > Tatsuo Ishii
> > SRA OSS, Inc. Japan
> > 
> > > Thanks,
> > > John
> > > 
> > > 
> > > 
> > > On 7/30/06, Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
> > > >
> > > > > We have been testing with pgpool for a little while here and 
> > > > > have run
> > > > into
> > > > > various locking (or lack there of) issues.
> > > > >
> > > > > Our setup is 2 databases (pg v8.1.4) on seperate 
> machines with 
> > > > > all
> > 
> > > > > db connections connected via pgpool.  We have run into 
> > > > > situations where we
> > > > the
> > > > > 1st or 2nd database was not getting the update or insert fast 
> > > > > enough,
> > > > and a
> > > > > select query hitting pgpool at the same time causing the a
> > "mismatch"
> > > > error
> > > > > causing pgpool to degrade and stop replicating.
> > > > >
> > > > > Although I assume this must be quite a normal issue, I do not 
> > > > > see many people talking about it here.
> > > > >
> > > > > Can someone give me a basic rundown of the "best 
> practices" for 
> > > > > pg
> > > > locking
> > > > > on a pretty database intensive web app?
> > > > >
> > > > > On Inserts the best practice is to lock the entire 
> table (LOCK 
> > > > > TABLE
> > > > ...) ?
> > > > >
> > > > > On Updates the best practice is to do a row lock 
> (SELECT ... FOR
> > > > UPDATE  or
> > > > > FOR SHARE) ?
> > > > > Should you do some sort of lock on SELECTS?
> > > > > Do you recommend everything be wrapped in Transactions?
> > > > >
> > > > > Any advice is appreciated.
> > > > >
> > > > > Thank you.
> > > >
> > > > What you have seen is a table updating timing 
> difference between 
> > > > master and secondary as yo suspected. Although there's 
> no way to 
> > > > avoid the timing difference itself, I think there are 
> some ways to 
> > > > avoid this problem.
> > > >
> > > > 1) use load balancing. Since this will SELECT one of 
> the servers,
> > > >    there will be no chance to see the "kind mismatch error".
> > > >
> > > > 2) use table locking. This will prevent the SELECT from 
> seeing yet
> > > >    updated data in the secondary server.
> > > >
> > > > I don't think SELECT FOR UPDATE will work, since the 
> lock does not 
> > > > block SELECT.
> > > > --
> > > > Tatsuo Ishii
> > > > SRA OSS, Inc. Japan
> > > >
> > 
> > _______________________________________________
> > Pgpool-general mailing list
> > Pgpool-general at pgfoundry.org
> > http://pgfoundry.org/mailman/listinfo/pgpool-general
> > 
> 


More information about the Pgpool-general mailing list