[pgpool-general: 1425] Re: SELECT count(*) FROM pg_catalog.pg_class AS c WHERE c.relname = 'import_jobs' AND c.relpersistence = 'u' sort of hangs.

web at mr-paradox.net web at mr-paradox.net
Tue Feb 26 06:25:29 JST 2013


On Mon, Feb 18, 2013 at 08:24:16PM -0800, David Kerr wrote:
- 
- On Feb 16, 2013, at 2:09 AM, Tatsuo Ishii <ishii at postgresql.org> wrote:
- 
- >> Howdy!
- >> 
- >> I posted this to the PG general list and it was suggested that I ask it here instead.
- >> 
- >> I'll add the additional detail that this only seems to happen with jruby (jdbc) and not with ruby. Any ideas?
- >> 
- >> This query is coming from PgPool I believe.
- >> 
- >> SELECT count(*) FROM pg_catalog.pg_class AS c WHERE c.relname = 'import_jobs' AND c.relpersistence = 'u'
- >> 
- >> This is a very small database, like 10/15 tables, it's basically empty.
- >> If i run this query manually, it comes back immediatly.
- >> However according to this it's been running for about an hour.
- >> 
- >> v3=# select now();
- >>              now
- >> -------------------------------
- >> 2013-02-15 21:49:34.470466+00
- >> (1 row)
- >> 
- >> -[ RECORD 5
- >> +]----+---------------------------------------------------------------------------------------------------------
- >> datid            | 16389
- >> datname          | v3
- >> pid              | 13905
- >> usesysid         | 20415
- >> usename          | workling
- >> application_name |
- >> client_addr      | 10.0.0.132
- >> client_hostname  |
- >> client_port      | 58009
- >> backend_start    | 2013-02-15 20:49:09.098768+00
- >> xact_start       | 2013-02-15 20:49:09.189625+00
- >> query_start      | 2013-02-15 20:49:09.198742+00
- >> state_change     | 2013-02-15 20:49:09.198743+00
- >> waiting          | f
- >> state            | active
- >> query            | SELECT count(*) FROM pg_catalog.pg_class AS c WHERE c.relname = 'import_jobs' AND
- >> +c.relpersistence = 'u'
- >> 
- >> (I actually restarted the application and now there are 4 of these)
- >> 
- >> Any idea why the query would hang like this?
- >> 
- >> I don't see it waiting on locks or anything like that.
- >> 
- >> any ideas?
- > 
- > Yes, the query is generated by pgpool-II.
- > What pgpool and PostgreSQL version?
- > 
- > What does ps command show?
- > 
- > ps aux|grep pgpool
- > --
- > Tatsuo Ishii
- > SRA OSS, Inc. Japan
- > English: http://www.sraoss.co.jp/index_en.php
- > Japanese: http://www.sraoss.co.jp
- > 
- 
- 
- Still running into this.
- 
- I did a gdb trace on the processes, not sure if that's helpful.
- 
- Program received signal SIGINT, Interrupt.
- 0x00007f83d834fe82 in __libc_recv (fd=<value optimized out>, buf=0xb42c40, n=8192, flags=0) at ../sysdeps/unix/sysv/linux/x86_64/recv.c:30
- 30          return INLINE_SYSCALL (recvfrom, 6, fd, buf, n, flags, NULL, NULL);
- #0  0x00007f83d834fe82 in __libc_recv (fd=<value optimized out>, buf=0xb42c40, n=8192, flags=0) at ../sysdeps/unix/sysv/linux/x86_64/recv.c:30
- #1  0x00000000005a92f6 in recv (port=0x234c820, ptr=0xb42c40, len=8192) at /usr/include/bits/socket2.h:45
- #2  secure_read (port=0x234c820, ptr=0xb42c40, len=8192) at be-secure.c:304
- #3  0x00000000005b3fab in pq_recvbuf () at pqcomm.c:824
- #4  0x00000000005b43ab in pq_getbyte () at pqcomm.c:865
- #5  0x0000000000658b01 in SocketBackend (argc=<value optimized out>, argv=<value optimized out>, username=<value optimized out>) at postgres.c:342
- #6  ReadCommand (argc=<value optimized out>, argv=<value optimized out>, username=<value optimized out>) at postgres.c:490
- #7  PostgresMain (argc=<value optimized out>, argv=<value optimized out>, username=<value optimized out>) at postgres.c:3922
- #8  0x0000000000614db1 in BackendRun () at postmaster.c:3619
- #9  BackendStartup () at postmaster.c:3304
- #10 ServerLoop () at postmaster.c:1367
- #11 0x0000000000617a1c in PostmasterMain (argc=<value optimized out>, argv=<value optimized out>) at postmaster.c:1127
- #12 0x00000000005b64b0 in main (argc=1, argv=0x2309d50) at main.c:199
- Detaching from program: /usr/pgsql-9.2/bin/postgres, process 850
- 
- 
- I've heard from some folks at Amazon, that there was a problem that they saw where the Postgres client socket was continuing 
- to send data even after a FIN message for that TCP session.
- 
- Which caused problems with the logic used by the security groups. I'm not sure if that's what's going on here, but
- the server definitely seems like it's looking for a connection.


More info on this:

Here's an strace of it happening on the Postgres side.

sendto(10, "1\0\0\0\0042\0\0\0\0043\0\0\0\4T\0\0\0\36\0\1count\0\0\0\0\0"..., 77, 0, NULL, 0) = 77
recvfrom(10, "P\0\0\0\210pgpool3725\0SELECT count(*) "..., 8192, 0, NULL, NULL) = 248
gettimeofday({1361824432, 162813}, NULL) = 0
gettimeofday({1361824432, 162884}, NULL) = 0
gettimeofday({1361824432, 163033}, NULL) = 0
gettimeofday({1361824432, 163105}, NULL) = 0
gettimeofday({1361824432, 163175}, NULL) = 0
lseek(13, 0, SEEK_END)                  = 163840
lseek(14, 0, SEEK_END)                  = 40960
lseek(15, 0, SEEK_END)                  = 90112
gettimeofday({1361824432, 163579}, NULL) = 0
gettimeofday({1361824432, 163657}, NULL) = 0
gettimeofday({1361824432, 163731}, NULL) = 0
gettimeofday({1361824432, 163801}, NULL) = 0
gettimeofday({1361824432, 163898}, NULL) = 0
sendto(10, "1\0\0\0\0042\0\0\0\0043\0\0\0\4T\0\0\0\36\0\1count\0\0\0\0\0"..., 77, 0, NULL, 0) = 77
recvfrom(10,

It just sits at "recvfrom(10," until the process ends.

I'm having trouble catching it on the pgpool side, I'll keep trying.

Does this give you any insight onto where i could focus testing?

Thanks


More information about the pgpool-general mailing list