View Issue Details

IDProjectCategoryView StatusLast Update
0000271Pgpool-IIBugpublic2017-03-31 17:06
Reportersupp_k Assigned Tot-ishii  
PriorityurgentSeveritymajorReproducibilityrandom
Status closedResolutionopen 
OSCentOS OS Version6.x  
Product Version3.5.4 
Target Version3.6.3 
Summary0000271: balanced requests after started transaction
DescriptionWe see that in transaction pgpool balances requests to standby read-only backend. But this should happen because slave doesn't contain required data until the transaction committed on slave.


The bug is related to pgpool v3.6.
Additional InformationWe analysed the log records and did the conclusion:

First 5 lines show the transaction was initiated and for some period pgpool routes requests to the backend #0 only which is master. But then we see the request (2016-12-15T12:46:13.229382+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79083-1]) was routed to standby backend 0000001. At this we moment we face data inconsistency in the frontend because for sure the backend 0000001 doesn't contain the data which was created in backend #0 until it is commited. The SELECT for the data is performed within the same transaction in which it was INSERT-ed.





2016-12-15T12:46:12.007225+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle[16742]: [78172-1] LOG: DB node id: 0 backend pid: 17007 statement: B message
2016-12-15T12:46:12.007229+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle[16742]: [78172-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:12.007240+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle[16742]: [78175-1] LOG: DB node id: 1 backend pid: 2946 statement: B message
2016-12-15T12:46:12.007242+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle[16742]: [78175-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:12.007446+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle[16742]: [78189-1] LOG: DB node id: 0 backend pid: 17007 statement: Execute: BEGIN
2016-12-15T12:46:12.007455+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle[16742]: [78189-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:12.007480+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle[16742]: [78194-1] LOG: DB node id: 1 backend pid: 2946 statement: Execute: BEGIN
2016-12-15T12:46:12.007481+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle[16742]: [78194-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:12.008519+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle[16742]: [78260-1] LOG: DB node id: 1 backend pid: 2946 statement: Parse: SELECT 1
2016-12-15T12:46:12.008525+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle[16742]: [78260-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:12.008653+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle[16742]: [78274-1] LOG: DB node id: 1 backend pid: 2946 statement: B message
2016-12-15T12:46:12.008659+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle[16742]: [78274-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:12.008823+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle[16742]: [78287-1] LOG: DB node id: 1 backend pid: 2946 statement: D message
2016-12-15T12:46:12.008825+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle[16742]: [78287-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:12.008906+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle[16742]: [78297-1] LOG: DB node id: 1 backend pid: 2946 statement: Execute: SELECT 1
2016-12-15T12:46:12.008915+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle[16742]: [78297-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.218025+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [78372-1] LOG: DB node id: 1 backend pid: 2946 statement: B message
2016-12-15T12:46:13.218034+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [78372-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.218197+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [78386-1] LOG: DB node id: 1 backend pid: 2946 statement: Execute: SELECT CASE WHEN sr.sub_limit = -1 THEN 1 ELSE 0 END AS is_inf, sr.sub_limit - sr.curr_usage AS res_avail, sr.rt_instance_id AS rt_instance_id FROM resource_classes rc JOIN resource_types rt ON (rc.class_id = rt.class_id) JOIN subs_resources sr ON (sr.rt_id = rt.rt_id) JOIN subscriptions s ON (sr.sub_id = s.sub_id) WHERE rc.name = $1 AND s.owner_id = $2 AND s.is_active = 1 ORDER BY is_inf DESC, res_avail DESC, sr.rt_instance_id
2016-12-15T12:46:13.218203+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [78386-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.221080+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [78446-1] LOG: DB node id: 0 backend pid: 17007 statement: Parse: INSERT INTO accounts_serial DEFAULT VALUES
2016-12-15T12:46:13.221083+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [78446-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.221220+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [78460-1] LOG: DB node id: 0 backend pid: 17007 statement: D message
2016-12-15T12:46:13.221223+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [78460-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.222001+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [78520-1] LOG: DB node id: 0 backend pid: 17007 statement: Parse: INSERT INTO accounts_serial DEFAULT VALUES
2016-12-15T12:46:13.222006+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [78520-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.222146+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [78534-1] LOG: DB node id: 0 backend pid: 17007 statement: B message
2016-12-15T12:46:13.222150+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [78534-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.222272+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [78547-1] LOG: DB node id: 0 backend pid: 17007 statement: D message
2016-12-15T12:46:13.222276+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [78547-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.222351+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [78557-1] LOG: DB node id: 0 backend pid: 17007 statement: Execute: INSERT INTO accounts_serial DEFAULT VALUES
2016-12-15T12:46:13.222355+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [78557-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.223461+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [78629-1] LOG: DB node id: 0 backend pid: 17007 statement: Parse: select currval('accounts_serial_seq')
2016-12-15T12:46:13.223465+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [78629-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.223592+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [78642-1] LOG: DB node id: 0 backend pid: 17007 statement: B message
2016-12-15T12:46:13.223594+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [78642-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.223708+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [78655-1] LOG: DB node id: 0 backend pid: 17007 statement: D message
2016-12-15T12:46:13.223711+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [78655-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.223814+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [78665-1] LOG: DB node id: 0 backend pid: 17007 statement: Execute: select currval('accounts_serial_seq')
2016-12-15T12:46:13.223822+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [78665-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.224918+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [78742-1] LOG: DB node id: 0 backend pid: 17007 statement: Parse: DELETE FROM accounts_serial WHERE account_id = $1
2016-12-15T12:46:13.224926+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [78742-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.225052+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [78755-1] LOG: DB node id: 0 backend pid: 17007 statement: D message
2016-12-15T12:46:13.225056+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [78755-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.225842+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [78815-1] LOG: DB node id: 0 backend pid: 17007 statement: Parse: DELETE FROM accounts_serial WHERE account_id = $1
2016-12-15T12:46:13.225846+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [78815-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.225965+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [78829-1] LOG: DB node id: 0 backend pid: 17007 statement: B message
2016-12-15T12:46:13.225973+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [78829-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.226088+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [78842-1] LOG: DB node id: 0 backend pid: 17007 statement: D message
2016-12-15T12:46:13.226092+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [78842-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.226179+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [78852-1] LOG: DB node id: 0 backend pid: 17007 statement: Execute: DELETE FROM accounts_serial WHERE account_id = $1
2016-12-15T12:46:13.226183+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [78852-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.227219+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [78921-1] LOG: DB node id: 0 backend pid: 17007 statement: B message
2016-12-15T12:46:13.227222+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [78921-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.227380+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [78935-1] LOG: DB node id: 0 backend pid: 17007 statement: Execute: insert into addresses (address, address2, address3, city, country, postal_code, state) values ($1, $2, $3, $4, $5, $6, $7)
2016-12-15T12:46:13.227383+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [78935-2] #011RETURNING *
2016-12-15T12:46:13.227386+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [78935-3] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.228185+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [78995-1] LOG: DB node id: 0 backend pid: 17007 statement: B message
2016-12-15T12:46:13.228188+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [78995-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.228343+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79009-1] LOG: DB node id: 0 backend pid: 17007 statement: Execute: insert into contacts (display_name) values ($1)
2016-12-15T12:46:13.228345+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79009-2] #011RETURNING *
2016-12-15T12:46:13.228348+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79009-3] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.229216+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79069-1] LOG: DB node id: 1 backend pid: 2946 statement: B message
2016-12-15T12:46:13.229220+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79069-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.229382+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79083-1] LOG: DB node id: 1 backend pid: 2946 statement: Execute: select account0_.account_id as account_1_3_0_, account0_.address_id as address21_3_0_, account0_.adm_contact_id as adm_con22_3_0_, account0_.aps_uuid as aps_uuid2_3_0_, account0_.ext_system_id as ext_sys23_3_0_, account0_.bill_contact_id as bill_co24_3_0_, account0_.brand_name as brand_na3_3_0_, account0_.ccp_version as ccp_vers4_3_0_, account0_.company_name as company_5_3_0_, account0_.customers_quantity as customer6_3_0_, account0_.l_country_code_default as l_countr7_3_0_, account0_.l_language_code_default as l_langua8_3_0_, account0_.l_variant_default as l_varian9_3_0_, account0_.ext_account_id as ext_acc10_3_0_, account0_.c_time as c_time11_3_0_, account0_.l_country_code as l_count12_3_0_, account0_.l_language_code as l_langu13_3_0_, account0_.l_variant as l_varia14_3_0_, account0_.is_locked as is_lock15_3_0_, account0_.note
2016-12-15T12:46:13.229388+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79083-2] as note16_3_0_, account0_.owner_id as owner_i25_3_0_, account0_.path as path17_3_0_, account0_.is_personal as is_pers18_3_0_, account0_.rt_instance_id as rt_inst19_3_0_, acc
2016-12-15T12:46:13.229390+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79083-3] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.230519+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79133-1] LOG: DB node id: 0 backend pid: 17007 statement: B message
2016-12-15T12:46:13.230524+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79133-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.230658+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79147-1] LOG: DB node id: 0 backend pid: 17007 statement: Execute: insert into contact_properties (value, contact_id, name) values ($1, $2, $3)
2016-12-15T12:46:13.230665+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79147-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.231363+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79199-1] LOG: DB node id: 0 backend pid: 17007 statement: B message
2016-12-15T12:46:13.231371+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79199-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.231498+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79213-1] LOG: DB node id: 0 backend pid: 17007 statement: Execute: insert into contact_properties (value, contact_id, name) values ($1, $2, $3)
2016-12-15T12:46:13.231502+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79213-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.232170+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79265-1] LOG: DB node id: 0 backend pid: 17007 statement: B message
2016-12-15T12:46:13.232174+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79265-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.232311+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79279-1] LOG: DB node id: 0 backend pid: 17007 statement: Execute: insert into contact_properties (value, contact_id, name) values ($1, $2, $3)
2016-12-15T12:46:13.232314+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79279-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.232971+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79331-1] LOG: DB node id: 0 backend pid: 17007 statement: B message
2016-12-15T12:46:13.232974+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79331-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.233138+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79345-1] LOG: DB node id: 0 backend pid: 17007 statement: Execute: insert into contact_properties (value, contact_id, name) values ($1, $2, $3)
2016-12-15T12:46:13.233146+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79345-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.233854+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79397-1] LOG: DB node id: 0 backend pid: 17007 statement: B message
2016-12-15T12:46:13.233856+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79397-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.233973+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79411-1] LOG: DB node id: 0 backend pid: 17007 statement: Execute: insert into contact_properties (value, contact_id, name) values ($1, $2, $3)
2016-12-15T12:46:13.233978+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79411-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.234643+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79463-1] LOG: DB node id: 0 backend pid: 17007 statement: B message
2016-12-15T12:46:13.234647+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79463-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.234790+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79477-1] LOG: DB node id: 0 backend pid: 17007 statement: Execute: insert into contact_properties (value, contact_id, name) values ($1, $2, $3)
2016-12-15T12:46:13.234799+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79477-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.235426+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79529-1] LOG: DB node id: 0 backend pid: 17007 statement: B message
2016-12-15T12:46:13.235430+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79529-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.235583+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79543-1] LOG: DB node id: 0 backend pid: 17007 statement: Execute: insert into contact_properties (value, contact_id, name) values ($1, $2, $3)
2016-12-15T12:46:13.235587+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79543-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.236258+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79595-1] LOG: DB node id: 0 backend pid: 17007 statement: B message
2016-12-15T12:46:13.236266+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79595-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.236424+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79609-1] LOG: DB node id: 0 backend pid: 17007 statement: Execute: insert into contact_properties (value, contact_id, name) values ($1, $2, $3)
2016-12-15T12:46:13.236429+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79609-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.237049+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79661-1] LOG: DB node id: 0 backend pid: 17007 statement: B message
2016-12-15T12:46:13.237054+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79661-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.237188+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79675-1] LOG: DB node id: 0 backend pid: 17007 statement: Execute: insert into contact_properties (value, contact_id, name) values ($1, $2, $3)
2016-12-15T12:46:13.237191+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79675-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.237771+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79727-1] LOG: DB node id: 0 backend pid: 17007 statement: B message
2016-12-15T12:46:13.237782+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79727-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.237907+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79741-1] LOG: DB node id: 0 backend pid: 17007 statement: Execute: insert into contact_properties (value, contact_id, name) values ($1, $2, $3)
2016-12-15T12:46:13.237915+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79741-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.238572+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79793-1] LOG: DB node id: 0 backend pid: 17007 statement: B message
2016-12-15T12:46:13.238575+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79793-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.238715+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79807-1] LOG: DB node id: 0 backend pid: 17007 statement: Execute: insert into contact_properties (value, contact_id, name) values ($1, $2, $3)
2016-12-15T12:46:13.238719+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79807-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.239370+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79859-1] LOG: DB node id: 0 backend pid: 17007 statement: B message
2016-12-15T12:46:13.239373+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79859-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.239513+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79873-1] LOG: DB node id: 0 backend pid: 17007 statement: Execute: insert into accounts (address_id, adm_contact_id, aps_uuid, ext_system_id, bill_contact_id, brand_name, ccp_version, company_name, customers_quantity, l_country_code_default, l_language_code_default, l_variant_default, ext_account_id, c_time, l_country_code, l_language_code, l_variant, is_locked, note, owner_id, path, is_personal, rt_instance_id, tech_contact_id, account_type, account_id) values ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26)
2016-12-15T12:46:13.239518+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79873-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.240744+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79925-1] LOG: DB node id: 1 backend pid: 2946 statement: B message
2016-12-15T12:46:13.240748+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79925-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.240901+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79939-1] LOG: DB node id: 1 backend pid: 2946 statement: Execute: select user0_.user_id as user_id1_193_0_, identity1_.identity_id as identity1_102_1_, account2_.account_id as account_1_3_2_, user0_.scope_id as scope_i16_193_0_, user0_.aps_uuid as aps_uuid2_193_0_, user0_.contact_id as contact17_193_0_, user0_.date_format as date_for3_193_0_, user0_.deleting as deleting4_193_0_, user0_.dns_path_of_domain_part as dns_path5_193_0_, user0_.domain_id as domain_i6_193_0_, user0_.is_enabled as is_enabl7_193_0_, user0_.auth_identity_id as auth_id18_193_0_, user0_.l_country_code as l_countr8_193_0_, user0_.l_language_code as l_langua9_193_0_, user0_.l_variant as l_varia10_193_0_, user0_.locked_by as locked_11_193_0_, user0_.member_id as member_12_193_0_, user0_.sub_id as sub_id13_193_0_, user0_.timezone as timezon14_193_0_, user0_.type as type15_193_0_, identity1_.system_id as system_i4_102_1_,
2016-12-15T12:46:13.240906+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79939-2] identity1_.login as login2_102_1_, identity1_.login_lowered as login_lo3_102_1_, account2_.address_id as address21_3_2_, account2_.adm_contact_id as adm_con22_3_2_, account2_.aps_u
2016-12-15T12:46:13.240908+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79939-3] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.241819+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79997-1] LOG: DB node id: 0 backend pid: 17007 statement: B message
2016-12-15T12:46:13.241827+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79997-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.241969+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [80011-1] LOG: DB node id: 0 backend pid: 17007 statement: Execute: select localident0_.identity_id as identity1_109_0_, localident0_.pwd_expired as pwd_expi2_109_0_, localident0_.pwd_ctime as pwd_ctim3_109_0_, localident0_.pwd_hash as pwd_hash4_109_0_, localident0_.pwd_type as pwd_type5_109_0_ from local_identities localident0_ where localident0_.identity_id=$1
2016-12-15T12:46:13.241977+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [80011-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.242963+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [80069-1] LOG: DB node id: 0 backend pid: 17007 statement: B message
2016-12-15T12:46:13.242971+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [80069-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.243102+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [80083-1] LOG: DB node id: 0 backend pid: 17007 statement: Execute: select role0_.role_id as role_id1_160_, role0_.account_id as account_6_160_, role0_.description as descript2_160_, role0_.name as name3_160_, role0_.admin as admin4_160_, role0_.role_type as role_typ5_160_ from roles role0_ where role0_.account_id=$1 and role0_.name=$2 and (role0_.role_type=$3 or role0_.role_type=$4)
2016-12-15T12:46:13.243106+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [80083-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.243851+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [80133-1] LOG: DB node id: 0 backend pid: 17007 statement: B message
2016-12-15T12:46:13.243853+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [80133-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.244015+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [80147-1] LOG: DB node id: 0 backend pid: 17007 statement: Execute: insert into roles (account_id, description, name, admin, role_type) values ($1, $2, $3, $4, $5)
2016-12-15T12:46:13.244024+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [80147-2] #011RETURNING *
2016-12-15T12:46:13.244028+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [80147-3] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.245087+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [80207-1] LOG: DB node id: 0 backend pid: 17007 statement: B message
2016-12-15T12:46:13.245091+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [80207-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.245241+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [80221-1] LOG: DB node id: 0 backend pid: 17007 statement: Execute: select privilege0_.privilege_id as privileg1_144_, privilege0_.allow_locked as allow_lo2_144_, privilege0_.area as area3_144_, privilege0_.name as name4_144_, privilege0_1_.group_id as group_id1_143_ from privileges privilege0_ left outer join privilege_groups_content privilege0_1_ on privilege0_.privilege_id=privilege0_1_.privilege_id where privilege0_.name=$1
TagsNo tags attached.

Activities

supp_k

2016-12-16 04:59

reporter   ~0001225

One more strange case:
if we do pcp_detach_node for the SLAVE backend and then pcp_attach_node then the problem of "incorrect balancing" disappeares.

supp_k

2016-12-17 02:33

reporter   ~0001226

Please pay attention to the following information. In the following piece of code the requests are coming with interval less then ONE millisecond. It is possible that the 1st request is not complete yet. Does it make sense? Probably pgpool has not created the "flag" that balancing is prohibited after this moment?


2016-12-15T12:46:13.238715+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79807-1] LOG: DB node id: 0 backend pid: 17007 statement: Execute: insert into contact_properties (value, contact_id, name) values ($1, $2, $3)
2016-12-15T12:46:13.238719+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79807-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.239370+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79859-1] LOG: DB node id: 0 backend pid: 17007 statement: B message
2016-12-15T12:46:13.239373+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79859-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.239513+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79873-1] LOG: DB node id: 0 backend pid: 17007 statement: Execute: insert into accounts (address_id, adm_contact_id, aps_uuid, ext_system_id, bill_contact_id, brand_name, ccp_version, company_name, customers_quantity, l_country_code_default, l_language_code_default, l_variant_default, ext_account_id, c_time, l_country_code, l_language_code, l_variant, is_locked, note, owner_id, path, is_personal, rt_instance_id, tech_contact_id, account_type, account_id) values ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26)
2016-12-15T12:46:13.239518+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79873-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.240744+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79925-1] LOG: DB node id: 1 backend pid: 2946 statement: B message
2016-12-15T12:46:13.240748+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79925-2] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.240901+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79939-1] LOG: DB node id: 1 backend pid: 2946 statement: Execute: select user0_.user_id as user_id1_193_0_, identity1_.identity_id as identity1_102_1_, account2_.account_id as account_1_3_2_, user0_.scope_id as scope_i16_193_0_, user0_.aps_uuid as aps_uuid2_193_0_, user0_.contact_id as contact17_193_0_, user0_.date_format as date_for3_193_0_, user0_.deleting as deleting4_193_0_, user0_.dns_path_of_domain_part as dns_path5_193_0_, user0_.domain_id as domain_i6_193_0_, user0_.is_enabled as is_enabl7_193_0_, user0_.auth_identity_id as auth_id18_193_0_, user0_.l_country_code as l_countr8_193_0_, user0_.l_language_code as l_langua9_193_0_, user0_.l_variant as l_varia10_193_0_, user0_.locked_by as locked_11_193_0_, user0_.member_id as member_12_193_0_, user0_.sub_id as sub_id13_193_0_, user0_.timezone as timezon14_193_0_, user0_.type as type15_193_0_, identity1_.system_id as system_i4_102_1_,
2016-12-15T12:46:13.240906+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79939-2] identity1_.login as login2_102_1_, identity1_.login_lowered as login_lo3_102_1_, account2_.address_id as address21_3_2_, account2_.adm_contact_id as adm_con22_3_2_, account2_.aps_u
2016-12-15T12:46:13.240908+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79939-3] LOCATION: pool_proto_modules.c:3085
2016-12-15T12:46:13.241819+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79997-1] LOG: DB node id: 0 backend pid: 17007 statement: B message
2016-12-15T12:46:13.241827+03:00 srv-60b2f6e99b9b lesk plesk 192.168.102.240(59893) idle in transaction[16742]: [79997-2] LOCATION: pool_proto_modules.c:3085

serk

2016-12-19 06:10

reporter   ~0001231

Hi Pgpool developers!

Please consider the logic that reproduces the error.

In approximately 30% of cases the attached class (TestPgpool.java) reproduces the issue.
The target testing DB has 2 tables used for testing (tables.ddl). We did several modifications in pgpool code that show the cases (as we could detect it) when Pgpool does balancing instead of forwarding the SELECT request to MASTER backend after DDL statement was processed.

In several runnings one will see messages: "USE BACKEND: {1} THOUGH THE WRITE TRANSACTION IS ACTIVE!!!!!!!!" which say that previously the UPDATE statement was executed but the next SELECT method is forwarded to SLAVE – which should not happen.

We enforced balancing ratio in pgpool by proportion MASTER/SLAVE as 0/1000. Thus the majority of requests will be forwarded to SLAVE. The attached configuration assumes the 2 backends are synchronous and backend #0 is master

Notes: The attached DIFF file also contains patch for the issue 0000231.
issues_271_244.zip (4,453 bytes)

t-ishii

2016-12-20 17:47

developer   ~0001242

It seems there's no evidence that the SELECT in question was performed in an explicit transaction in your log.

supp_k

2016-12-20 18:07

reporter   ~0001243

Please have a look at the attached log file.
The issue is reproduced when time lag between requests is very small. This is why we create attached java multithread java test.

Regarding the attached log:
1) at 2016-12-20T12:02:38.200806 there was a write statement
2) Later (2016-12-20T12:02:38.202535) one can see that the new SELECT request was balanced to the node 1 which is read-only slave.
test7.log (139,187 bytes)   
test7.log (139,187 bytes)   

supp_k

2016-12-20 18:08

reporter   ~0001244

Multuthread java test:

import java.sql.*;
import java.util.*;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.locks.*;

public class TestPgpoolLoad {

    private static final int TEST_THREADS_NUM = 40;

    public static void main(String[] args) throws Exception {

        try (Connection conn = DriverManager.getConnection("jdbc:postgresql://10.28.78.103:5432/postgres", "postgres", "password")) {

            Lock lock = new ReentrantLock();

            
            conn.setAutoCommit(false);

            CyclicBarrier cb = new CyclicBarrier(TEST_THREADS_NUM);

            try (PreparedStatement stmtSelect = conn.prepareStatement("select * from read_table where id <= ?");
                PreparedStatement stmtWrite = conn.prepareStatement("update write_table set id = 1 where id = ?")) {

                List<Thread> thrs = new ArrayList<>();
                for (int i = 0; i < TEST_THREADS_NUM; i++) {
                    boolean isRead = i < TEST_THREADS_NUM - 1;
                    boolean isSleep = i < (TEST_THREADS_NUM / 2);
                    thrs.add(new TestReader(isSleep, cb, isRead, isRead ? stmtSelect : stmtWrite, lock));
                }

                for (Thread thr : thrs) {
                    thr.start();
                }

                Thread.sleep(1000);

                for (Thread thr : thrs) {
                    thr.join();
                }

                System.out.println("ok");
            } catch (Exception ex) {
                ex.printStackTrace();

                throw ex;
            }
            conn.commit();
        }
    }

    public static class TestReader extends Thread {

        private final CyclicBarrier cb;
        private final boolean readThread;
        private final PreparedStatement stmt;
        private final Lock lock;
        private final boolean isSleep;
        private final boolean isRead;

        TestReader(boolean isSleep, CyclicBarrier cb, boolean readThread, PreparedStatement readStmt, Lock lock) {
            this.isSleep = isSleep;
            this.cb = cb;
            this.isRead = readThread;
            this.readThread = readThread;
            this.stmt = readStmt;
            this.lock = lock;
        }

        @Override
        public void run() {
            try {
                cb.await();

                if (!isRead) {
                    Thread.sleep(1);
                }

                lock.lock();
                stmt.setInt(1, 1);
                stmt.execute();

            } catch (Exception ex) {
                ex.printStackTrace();
            } finally {
                lock.unlock();
            }

        }

    }

}

supp_k

2016-12-20 18:11

reporter   ~0001245

Also there is the transaction start mentioned in the log:

2016-12-20T12:02:38.032665+03:00 srv-60b2f6e99b9b lesk plesk 10.192.36.35(43669) idle[30813]: [208-1] 2016-12-20 12:02:38: pid 30813:LOG: DB node id: 0 backend pid: 31091 statement: Parse: BEGIN
2016-12-20T12:02:38.032667+03:00 srv-60b2f6e99b9b lesk plesk 10.192.36.35(43669) idle[30813]: [208-2] 2016-12-20 12:02:38: pid 30813:LOCATION: pool_proto_modules.c:3089
2016-12-20T12:02:38.032800+03:00 srv-60b2f6e99b9b lesk plesk 10.192.36.35(43669) idle[30813]: [211-1] 2016-12-20 12:02:38: pid 30813:LOG: DB node id: 1 backend pid: 13293 statement: Parse: BEGIN
2016-12-20T12:02:38.032806+03:00 srv-60b2f6e99b9b lesk plesk 10.192.36.35(43669) idle[30813]: [211-2] 2016-12-20 12:02:38: pid 30813:LOCATION: pool_proto_modules.c:3089
2016-12-20T12:02:38.033030+03:00 srv-60b2f6e99b9b lesk plesk 10.192.36.35(43669) idle[30813]: [224-1] 2016-12-20 12:02:38: pid 30813:LOG: DB node id: 0 backend pid: 31091 statement: B message
2016-12-20T12:02:38.033032+03:00 srv-60b2f6e99b9b lesk plesk 10.192.36.35(43669) idle[30813]: [224-2] 2016-12-20 12:02:38: pid 30813:LOCATION: pool_proto_modules.c:3089
2016-12-20T12:02:38.033178+03:00 srv-60b2f6e99b9b lesk plesk 10.192.36.35(43669) idle[30813]: [227-1] 2016-12-20 12:02:38: pid 30813:LOG: DB node id: 1 backend pid: 13293 statement: B message
2016-12-20T12:02:38.033179+03:00 srv-60b2f6e99b9b lesk plesk 10.192.36.35(43669) idle[30813]: [227-2] 2016-12-20 12:02:38: pid 30813:LOCATION: pool_proto_modules.c:3089
2016-12-20T12:02:38.033435+03:00 srv-60b2f6e99b9b lesk plesk 10.192.36.35(43669) idle[30813]: [239-1] 2016-12-20 12:02:38: pid 30813:LOG: DB node id: 0 backend pid: 31091 statement: Execute: BEGIN
2016-12-20T12:02:38.033437+03:00 srv-60b2f6e99b9b lesk plesk 10.192.36.35(43669) idle[30813]: [239-2] 2016-12-20 12:02:38: pid 30813:LOCATION: pool_proto_modules.c:3089
2016-12-20T12:02:38.033577+03:00 srv-60b2f6e99b9b lesk plesk 10.192.36.35(43669) idle[30813]: [244-1] 2016-12-20 12:02:38: pid 30813:LOG: DB node id: 1 backend pid: 13293 statement: Execute: BEGIN
2016-12-20T12:02:38.033578+03:00 srv-60b2f6e99b9b lesk plesk 10.192.36.35(43669) idle[30813]: [244-2] 2016-12-20 12:02:38: pid 30813:LOCATION: pool_proto_modules.c:3089

t-ishii

2016-12-20 18:15

developer   ~0001247

But the process id is different (30813 vs. 16742). So I can not confirm that the SELECT and DML are in a same transaction.

supp_k

2016-12-20 18:24

reporter   ~0001248

It seems you missed the freshest attached log file. Please have a look at the freshest "test7.log".

Also the pgpool settings are:

backend_hostname0 = 'a.db.node'
backend_port0 = 15432
backend_weight0 = 0
backend_data_directory0 = '/var/lib/pgsql/9.5/data'
backend_flag0 = 'ALLOW_TO_FAILOVER'

backend_hostname1 = 'b.db.node'
backend_port1 = 15432
backend_weight1 = 1000
backend_data_directory1 = '/var/lib/pgsql/9.5/data'
backend_flag1 = 'ALLOW_TO_FAILOVER'


enable_pool_hba = on
pool_passwd = 'pool_passwd'
authentication_timeout = 60
ssl = off




num_init_children = 240
max_pool = 1
child_life_time = 0
child_max_connections = 1
connection_life_time = 0
client_idle_limit = 0




log_destination = 'syslog'
log_line_prefix = '%t: pid %p:'
log_connections = on
log_hostname = on
log_statement = on
log_per_node_statement = on
log_standby_delay = 'none'
syslog_facility = 'LOCAL0'
syslog_ident = 'pgpool'

debug_level = 5
log_error_verbosity = verbose
client_min_messages = warning
log_min_messages = debug1

supp_k

2016-12-21 05:42

reporter   ~0001250

I have updated the test scenario. Please look at the source code below.

There are N-1 threads that read Value which was previously inserted by the Single thread. The N threads start simultaneously in front of the barrier. There is a semaphore (ReentrantLock) that prevents logic from parallel queries but we still have very high concurrency mode and minimal time period between receiving response and sending of the next query. After the INSERT threads completes insert operation it sets valatile flag (writePerformed) and all SELECT threads start trying to fetch the value. If the value was not returned then it definely means the SELECT was balanced to the node which doesn't contain the value inserted so far.

Try to launch the test several times and you will see the exception in stdout. In our environment we reproduce the issue in 50%.

If you wish I can create fully functional test case: debug it in the local QA environment and provide you so that you will be able to use it in your CI solution.



package com;

import java.sql.*;
import java.util.*;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.locks.*;

public class TestPgpool {

    private static final int TEST_THREADS_NUM = 40;
    static volatile boolean writePerformed = false; // variable will be set into TRUE when the write thread did INSERT operation

    public static void main(String[] args) throws Exception {
        
        Random random = new Random(System.currentTimeMillis());

        try (Connection conn = DriverManager.getConnection("jdbc:postgresql://10.28.78.103/plesk", "plesk", "jjHCF7sGHq")) {

            Lock lock = new ReentrantLock();

            conn.setAutoCommit(false); // here we define explicit transaction
            CyclicBarrier cb = new CyclicBarrier(TEST_THREADS_NUM);

            final int valueTest = random.nextInt();
            System.out.println("test value:" + valueTest);

            try (PreparedStatement stmtSelect = conn.prepareStatement("select * from write_table where id = ?");
                    PreparedStatement stmtWrite = conn.prepareStatement("insert into write_table(id) values(?)")) {

                List<Thread> thrs = new ArrayList<>();
                for (int i = 0; i < TEST_THREADS_NUM; i++) {
                    boolean isRead = i > 0;
                    //boolean isSleep = i < (TEST_THREADS_NUM / 2);
                    thrs.add(new TestReader(valueTest, cb, isRead, isRead ? stmtSelect : stmtWrite, lock));
                }

                for (Thread thr : thrs) {
                    thr.start();
                }

                Thread.sleep(1000);

                for (Thread thr : thrs) {
                    thr.join();
                }

                
            } catch (Exception ex) {
                ex.printStackTrace();

                throw ex;
            }
            conn.commit();
        }
    }

    public static class TestReader extends Thread {

        private final CyclicBarrier cb;

        private final PreparedStatement stmt;
        private final Lock lock;

        private final boolean isRead;
        private final int valueTest;

        TestReader(int valueTest, CyclicBarrier cb, boolean readThread, PreparedStatement stmt, Lock lock) {
            this.valueTest = valueTest;
            this.cb = cb;
            this.isRead = readThread;
            this.stmt = stmt;
            this.lock = lock;
        }

        @Override
        public void run() {
            try {
                cb.await();

                lock.lock();
                stmt.setInt(1, valueTest);
                if (isRead) {
                    try (ResultSet rs = stmt.executeQuery()) {
                        if (writePerformed) {
                            if (!rs.next()) {
                                throw new RuntimeException("The request was balanced. The value was inserted but the the following select returned nothing");
                            } else {
                                int val = rs.getInt(1);
                                System.out.println("recevied value:" + val);
                            }
                        } else {
                            // write was not performed so one should not expect the value is in the database
                        }
                    }
                } else {
                    stmt.execute();
                    writePerformed = true; // set mark the write performed. since this moment the READ threads will be trying to read the value.
                }

            } catch (Exception ex) {
                ex.printStackTrace();
                throw new RuntimeException(ex);
            } finally {
                lock.unlock();
            }

        }

    }

}

t-ishii

2016-12-21 13:39

developer   ~0001254

I have took a look at test7.log and confirmed the issue. Looking into the source code now...

t-ishii

2016-12-21 14:24

developer   ~0001255

Is it possible to provide a Java debug trace when the issue occurs?

Pgpool-II decides the transaction is writing when a write query is sent and backend returns the "ready for query" message to Pgpool-II. My guess is, client sends parse message for a select *before* Pgpool-II receives the "ready for query message". If it actually happens, the select can be sent to standby because the writing flag is not set yet.

supp_k

2016-12-21 15:18

reporter   ~0001256

Yes, please have a look at the attached test10.zip file. It contains pgpool log and the corresponding java/jdbc log.
test10.zip (15,231 bytes)

supp_k

2016-12-21 18:11

reporter   ~0001257

'Probably it makes sense to set the flag on the receiving of a WRITE query. In this case the pgpool's state machine is free of the problem and even the possibility (in this particular case only).

t-ishii

2016-12-21 21:35

developer   ~0001258

It's not that simple. By looking at the jdbc log, I found that:

1) "prepare select" is was sent to node 1
2) write query was sent to node 0
3) bind/execute request for 1) was coming. They were sent to node 1 (oops!)

If we simply redirect 3) to node 0, it will fail because there's no prepared statement for the select on node 0. So the only solution is sending prepare select to node 0 before 3). Actually the code for the logic was in 3.4. I seem to forgot to port it to 3.5 or later:-)

supp_k

2016-12-21 21:54

reporter   ~0001259

Still I don't understand the real cause of the problem because I'm not an expert in Pgpool's state machine and Postgres protocol.

But from my point of view it seems the Parse message should be sent to all backends if load balancing is activated. This is True becase if for instance there are no DML requests Pgpool still has this right to balance requests.

Right now from the logs I see the Parse is forwarded to the node 1 (from the test10.log) and later after the INSERT is complete Pgpool still forwards BIND/EXECUTE for the Selects to the node 1..... this is what I see.

But I see you already have the solution which was forgotten in the past ))) It seems the lost code should be restored )))

t-ishii

2016-12-21 23:56

developer   ~0001260

> But from my point of view it seems the Parse message should be sent to all backends if load balancing is activated.

Actually we did this long time ago and people blamed us because that way was ineffective and slow.

supp_k

2016-12-22 00:04

reporter   ~0001261

Ok, probably there are some reasons exist. But we still need to fix the problem.

supp_k

2016-12-22 18:24

reporter   ~0001262

Would it be possible to restore the lost code? We still facing problems because of the issue (((

elepuser

2016-12-23 20:57

reporter   ~0001263

Hi Guys,

recently under not very high load we faced exactly the same issue. It resulted in data loses and integrity problems.
The program logic was inserting data but than occasionally it failed to fetch the same data within the same transaction. First moment we decided that postgres violates the ACID concept, but later revealed problem in pgpool's behaviour (((

I see there is the fix for the problem exists. Can you please apply it please ASAP? We look forward to obtain the fixed build.

t-ishii

2016-12-26 15:06

developer   ~0001264

> Would it be possible to restore the lost code?

Of course. But it would not be straight forward because of the message handling logic difference between Pgpool-II 3.5 and pre 3.5. Also I need a tool to send arbitrary message stream for testing/debugging. I'm working on developing it...

supp_k

2016-12-28 20:01

reporter   ~0001271

Do you have any estimations when you will be able to provide the patch for testing? We look forward to obtain it.

t-ishii

2017-01-04 18:06

developer   ~0001277

From December 29 to January 3rd my office was closed because of national holidays. I would like to finish the work by the end of this month.

supp_k

2017-01-04 19:17

reporter   ~0001278

Happy new year and Merry Christmas Mr. T-Ishii!
We look forward to obtain the fixed version. Lots of our activities are delayed because of this problem. If possible we'd like to participate in testing as soon as you are ready to release the fix. Probably if we participate you will be able to accomplish with the fix faster than the end of the month with a better quality.

elepuser

2017-01-10 05:15

reporter   ~0001286

Yes, is it possible to provide the fix earlier than the end of the month?
Recently we faced several more issues of this type. Every time it takes much efforts to rollback the business logic and restore data ((( We are also ready to participate in the testing.

PLEASE if possible provide the fix faster!

supp_k

2017-01-25 19:21

reporter   ~0001295

Dear Tatsui Ishii,

do you have any information about estimated timelines when you will be able to provide the fix? We are experiencing a lot of troubles because of the issue (((

Please is possible provide the fix or the information wehn it will be available.
Thank you!

supp_k

2017-01-25 19:22

reporter   ~0001296

...I'm sorry for the mistake in your name!!

elepuser

2017-01-29 07:35

reporter   ~0001310

Any updates on the issue?

From the message thread I see there is a fundamental problem in pgpool(((
The attached by supp_k test shows that the problem easily reproduced when the time lag between response from pgpool and the next request is minimal. Also the PARSE message is being sent to one node only. Just wonder how do you guys are hoing to fix the trouble if you dont want to send the parse request to all backends?

t-ishii

2017-01-30 18:05

developer   ~0001312

We don't need to send PARSE message to all backends.

If we know that a write query was sent in the transaction, then succeeding SELECT is sent to the primary node. If we already send a parse message for the SELECT to other than the primary, then we will send the parse message to the primary.

I'm working on the implementation in this direction now.

supp_k

2017-02-06 16:40

reporter   ~0001329

Hi T-Ishii,
are there any updates about this problem?
We are ready to start testing it.

elepuser

2017-02-07 17:48

reporter   ~0001336

Guys,

any updates on the issue? Can you please provide the fix?

t-ishii

2017-02-08 10:27

developer   ~0001337

Still working on it. However I'm pretty busy now and I'm not sure when the work will finish.

t-ishii

2017-02-14 16:12

developer   ~0001341

I'm getting a slight progress. With over 800 lines diff, now in certain case the issue seems to be fixed. Here is the case described by pgproto (https://github.com/tatsuo-ishii/pgproto).

However I am experiencing a regression with our regression test suit, and definitely I need to work on it.

# Test data for bug271.
# In an explicit transaction, SELECT is parsed, DML issued, same SELECT/bind issued.
# In this case Pgpool-II should parse resend to primary node.
#

# Create test table
'Q' "DROP TABLE IF EXISTS pgproto_test1"
'Y'
'Q' "CREATE TABLE pgproto_test1(i INT)"
'Y'

# Start a transaction
'P' "S1" "BEGIN" 0
'B' "" "S1" 0 0 0
'E' "" 0
'C' 'S' "S1"

# Issue SELECT
'P' "S2" "SELECT 1" 0
#'B' "" "S1" 0 0 0
#'E' "" 0
#'C' 'S' "S1"

# Issue INSERT
'P' "S1" "INSERT INTO pgproto_test1 VALUES(1)" 0
'B' "" "S1" 0 0 0
'E' "" 0
'C' 'S' "S1"
#'S'
#'Y'

# Issue SELECT. This should be sent to primary node.
#'P' "S1" "SELECT 1" 0
'B' "" "S2" 0 0 0
'E' "" 0
'C' 'S' "S2"

# Issue COMMIT
'P' "S1" "COMMIT" 0
'B' "" "S1" 0 0 0
'E' "" 0
'C' 'S' "S1"
'S'
'Y'
'X'

t-ishii

2017-02-23 15:20

developer   ~0001353

I finally managed to create the initial cut of the patch I was talking about. The patch is relatively large (1652 lines of diff) and I would appreciate if you guys test it out. The patch is attached as "bug271.diff". It was originally created against master branch, also can be applied to 3.6 and 3.5 stable trees as well.

t-ishii

2017-02-23 15:20

developer  

bug271.diff (54,508 bytes)   
bug271.diff (54,508 bytes)   

t-ishii

2017-02-27 18:10

developer   ~0001366

The discussion has been continued at pgpool-hackers mailing list (pgpool-hackers: 2043].

enruquekl

2017-03-17 07:37

reporter   ~0001378

The issue was reproduced by us in Pgpool 3.6.1 several times in production environment. Seems the bug is very problem since it is not closed for several months. This fact is very painful (((

Are there any estimations for the fix?

t-ishii

2017-03-17 14:22

developer   ~0001379

See the mailing archive discussions. I am working with Sergey, and he kindly have been testing my patches. I cannot say the estimated date for the fix but I believe we are getting closer to the end.

t-ishii

2017-03-31 17:05

developer   ~0001402

Now that fixes are incorporated into the stable branch and discussions/activities have been moved to the mailing list, I'm going to close this item.
(I asked Sergey and he agreed on it).

Issue History

Date Modified Username Field Change
2016-12-16 04:46 supp_k New Issue
2016-12-16 04:59 supp_k Note Added: 0001225
2016-12-17 02:33 supp_k Note Added: 0001226
2016-12-19 06:10 serk File Added: issues_271_244.zip
2016-12-19 06:10 serk Note Added: 0001231
2016-12-20 09:35 t-ishii Assigned To => t-ishii
2016-12-20 09:35 t-ishii Status new => assigned
2016-12-20 17:47 t-ishii Note Added: 0001242
2016-12-20 17:47 t-ishii Status assigned => feedback
2016-12-20 18:07 supp_k File Added: test7.log
2016-12-20 18:07 supp_k Note Added: 0001243
2016-12-20 18:07 supp_k Status feedback => assigned
2016-12-20 18:08 supp_k Note Added: 0001244
2016-12-20 18:11 supp_k Note Added: 0001245
2016-12-20 18:15 t-ishii Note Added: 0001247
2016-12-20 18:24 supp_k Note Added: 0001248
2016-12-21 05:42 supp_k Note Added: 0001250
2016-12-21 13:39 t-ishii Note Added: 0001254
2016-12-21 14:24 t-ishii Note Added: 0001255
2016-12-21 14:25 t-ishii Status assigned => feedback
2016-12-21 15:18 supp_k File Added: test10.zip
2016-12-21 15:18 supp_k Note Added: 0001256
2016-12-21 15:18 supp_k Status feedback => assigned
2016-12-21 18:11 supp_k Note Added: 0001257
2016-12-21 21:35 t-ishii Note Added: 0001258
2016-12-21 21:54 supp_k Note Added: 0001259
2016-12-21 23:56 t-ishii Note Added: 0001260
2016-12-22 00:04 supp_k Note Added: 0001261
2016-12-22 18:24 supp_k Note Added: 0001262
2016-12-23 20:57 elepuser Note Added: 0001263
2016-12-26 15:06 t-ishii Note Added: 0001264
2016-12-28 20:01 supp_k Note Added: 0001271
2017-01-04 18:06 t-ishii Note Added: 0001277
2017-01-04 19:17 supp_k Note Added: 0001278
2017-01-10 05:15 elepuser Note Added: 0001286
2017-01-25 19:21 supp_k Note Added: 0001295
2017-01-25 19:22 supp_k Note Added: 0001296
2017-01-29 07:35 elepuser Note Added: 0001310
2017-01-30 18:05 t-ishii Note Added: 0001312
2017-02-06 16:40 supp_k Note Added: 0001329
2017-02-07 17:48 elepuser Note Added: 0001336
2017-02-08 10:27 t-ishii Note Added: 0001337
2017-02-14 16:12 t-ishii Note Added: 0001341
2017-02-23 15:20 t-ishii Note Added: 0001353
2017-02-23 15:20 t-ishii File Added: bug271.diff
2017-02-23 15:21 t-ishii Status assigned => feedback
2017-02-27 18:10 t-ishii Note Added: 0001366
2017-03-17 07:37 enruquekl Note Added: 0001378
2017-03-17 14:22 t-ishii Note Added: 0001379
2017-03-31 17:05 t-ishii Note Added: 0001402
2017-03-31 17:06 t-ishii Status feedback => closed
2017-03-31 17:06 t-ishii Target Version => 3.6.3