Hi Yakuta, Yutaka Sato wrote: > Hi, > > In message <_A4372@delegate-en.ML_> on 02/20/09(23:17:57) > you Eddie Stassen <pcqiqbdyi-qghxypmesja6.ml@delegate.org> wrote: > |We are running delegate as a caching proxy NNTP server (on Solaris 10) > |using NFS mounted drives as a cache spool. Now and then we find 'stuck' > |delegated processes which take up a lot of CPU cycles. Running truss, > |it appears that the process is getting an ESTALE error when reading from > |the spool (not sure why this is, possibly a file deleted while it was > |being read), however it seems delegated is ignoring the error and > |sitting in a tight loop continuously retrying to read. I have had a > |look through the source code to try and figure out where this is > |happening, but without much success so far. > | > |Could you perhaps have a look and see if a fix could be made, or perhaps > |point me to the correct place in the code so I can make a temporary hack? > > The first step should be locating where it occurs. The output of > truss might imply where it is. Or you can inspect the process > with mdb for example as "mdb -p ProcessId" and dump the calling > stack with the "$c" command. > > I have managed to get the following information which might help to pinpoint the problem. Frist of all, truss shows this output: $ truss -p 26448 read(23, 0x0043A124, 8192) Err#151 ESTALE read(23, 0x0043A124, 8192) Err#151 ESTALE read(23, 0x0043A124, 8192) Err#151 ESTALE (continuous loop) Filehandle 23 points to a file on an NFS mounted volume as shown in the lsof output a the end. pstack shows this: $ pstack 26448 26448: /opt/local/news/delegate-test/sbin/delegated +=/opt/local/news/delegat ff04a748 _read (41daa8, 3a31f0, a9, 0, 43c124, ffbb7fb8) + c 0024be60 _Z6fgetsZPKciS0_iS0_iP6__FILE (32b730, 193, ffbb7bb8, 400, ffbb7bb8, ffbb7fb6) + 21c 0028f304 _Z8fgetsTeePKciS0_iS0_iP6__FILES2_S2_i (32b730, 402, ffbb7bb8, 400, ffbb7bb8, 400) + 134 0029080c _Z8readPartP6__FILES0_S0_iPPKcS2_iS2_iS2_PiS4_ (ffba7640, 3a31f0, ffff, 0, ffbb7fb7, 1) + e0 00290d6c _Z14decodeBODYpartP6__FILES0_S0_PKcPS2_iS2_S2_iS2_iS2_iS2_ (41daa8, 32b400, 0, 0, ffbd8168, 7) + 144 00291c48 _Z14decodeMIMEpartPPKcP6__FILES3_S3_ii (fffffffe, 41daa8, 3a31f0, 0, 7, 0) + 6a4 00293178 _Z10decodeMIMEP6__FILES0_S0_iii (41daa8, 3a31f0, 0, 7, 1, 0) + 30 00294cbc _Z13PGPdecodeMIMEP6__FILES0_S0_iii (41daa8, 3a31f0, 0, 7, 1, 0) + 170 0019af1c _Z18NNTPrelay_responseP7_ClientPKcS2_S2_iP10RespStatusiiP7_ServerP6__FILES8_ (ffffffff, ffbdb378, ffbde198, 3a31f0, 0, 3) + 4f4 0019b710 _Z19NNTPrelay_responseXP7_ClientPKcS2_S2_iP10RespStatusiiP7_ServerP6__FILES8_ (ffbdeae0, ffbdb378, ffbdd998, ffbde198, 0, 348604) + 5c 0019c448 _Z17NNTPrelay_requestP5DGCtxP7SessionP7_ClientiP7_Server (0, 411c00, ffbdeae0, 1, dc, 100) + b78 001a1848 _Z13service_nntpXP5DGCtxP7Session (ffbf27b0, ffbe1cc0, 2, 8400, ffbe1cc0, ffbe1cc0) + 2b8 001a1cfc _Z12service_nntpP5DGCtx (ffbf27b0, 0, 0, 2e, 2e, ffbf3358) + 84 000c0e1c _Z14execSpecialistP5DGCtxiP6__FILEi (ffbf27b0, 2e, 3a31d0, 1f20, ffffffff, 2ea800) + 408 000c1474 _Z16execGeneralist1AP5DGCtxiii (ffbf27b0, 2e, 2e, ffffffff, 0, 0) + c0 000c2da0 _Z15execGeneralist1P5DGCtxiii (ffbf27b0, 2e, 2e, ffffffff, 0, 0) + 120 002abd64 _Z11alloca_callP9AllocaArg (ffbea9f0, ffbeaa04, ffbeaaac, 177, ff250000, 345800) + b8 00285ab0 _Z14randstack_calliPFiPKvzEz (2, c2c80, ffbf27b0, 2e, 2e, 2) + dc 0007ba50 _Z14execGeneralistP5DGCtxiii (ffffffff, 2e, 2e, ffffffff, ffbeb518, ff250000) + c0 000503bc _Z14ExecGeneralistP5DGCtxii (ffbf27b0, 2e, 0, 2da140, 0, ff250000) + 5c 000508d8 _Z12call_client1P5DGCtxP3Efd (ffbf27b0, ffbfbcc0, 345800, 2e, ffbeb518, 1004000) + 494 00051ce0 _Z11EXEC_clientP5DGCtxPKcS2_P3Efd (ffbf27b0, ffbfb770, 0, ffbfbcc0, 776, 36d800) + 194 0005b884 _Z17forkOnetimeServerP5DGCtxiP3Efdi (ffbf27b0, 8, ffbfbcc0, 0, 0, ffff9c00) + 260 00061a30 _Z5_mainiPPKc (0, 0, 345800, 2da000, 36c400, 36c800) + 31b4 000646e4 _Z6_mainxiPPKc (2, 3a3330, 36d0c8, 0, ffbf1d78, 646dc) + 8 0006530c _Z13delegate_mainiPPKc (2, 3a3330, 60, 0, ffbf9de8, 2d9c00) + b48 002abd64 _Z11alloca_callP9AllocaArg (ffbffb28, 0, ffbffbe4, f6, 36d0c8, 345800) + b8 00285ab0 _Z14randstack_calliPFiPKvzEz (1, 647c4, 2, 3a3330, 60, 1) + dc 00065770 _Z5mainXiPPKc (2de000, 3a3330, 0, 0, 0, 0) + 3d8 00049514 main (2, ffbffccc, ffbffcd8, 3a3238, ff230140, 0) + 8 00048b4c _start (0, 0, 0, 0, 0, 0) + 5c lsof output at this time: $ lsof -p 26448 COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME delegated 26448 news cwd VDIR 85,3 512 11658 /opt/local/news/delegate-test/work/119 delegated 26448 news txt VREG 85,3 3814852 4068 /opt/local/news/delegate-test/sbin/delegated delegated 26448 news txt VREG 85,0 60904 2381 /usr/lib/libz.so.1 delegated 26448 news txt VREG 85,0 1331216 6768 /usr/sfw/lib/libssl.so.0.9.7 delegated 26448 news txt VREG 85,0 41088 10965 /lib/libgen.so.1 delegated 26448 news txt VREG 85,0 51176 42966 /lib/libuutil.so.1 delegated 26448 news txt VREG 85,0 20572 10962 /lib/libdoor.so.1 delegated 26448 news txt VREG 85,0 1452900 2762 /usr/sfw/lib/libcrypto.so.0.9.7 delegated 26448 news txt VREG 85,0 143684 44750 /lib/libscf.so.1 delegated 26448 news txt VREG 85,0 792916 2404 /lib/libm.so.2 delegated 26448 news txt VREG 85,0 100972 1193 /lib/libmd.so.1 delegated 26448 news txt VREG 85,0 1629616 10827 /lib/libc.so.1 delegated 26448 news txt VREG 85,0 25596 10977 /lib/libmp.so.2 delegated 26448 news txt VREG 85,0 1536500 45114 /usr/sfw/lib/libstdc++.so.6.0.3 delegated 26448 news txt VREG 85,0 46692 45110 /usr/sfw/lib/libgcc_s.so.1 delegated 26448 news txt VREG 85,0 27100 10949 /lib/libcmd.so.1 delegated 26448 news txt VREG 85,5 1356 347 /var (/dev/md/dsk/d5) delegated 26448 news txt VREG 85,0 52920 778 /lib/libpam.so.1 delegated 26448 news txt VREG 85,0 868608 5962 /lib/libnsl.so.1 delegated 26448 news txt VREG 85,0 6624 6545 /platform/sun4u-us3/lib/libc_psr/libc_psr_hwcap1.so.1 delegated 26448 news txt VREG 85,0 10456 44677 /lib/libdl.so.1 delegated 26448 news txt VREG 85,0 93928 10484 /lib/libsocket.so.1 delegated 26448 news txt VREG 85,0 21472 10982 /lib/libpthread.so.1 delegated 26448 news txt VREG 85,0 280964 2179 /lib/ld.so.1 delegated 26448 news txt VREG 85,5 164 1832 /var/ld/ld.config delegated 26448 news 0r VCHR 13,2 0 /devices (/devices) (like character special /devices/pseudo/mm@0:null) delegated 26448 news 1w VREG 85,3 4797448 4102 /opt/local/news/delegate-test/log/stdout.log delegated 26448 news 2w VREG 85,3 4797448 4102 /opt/local/news/delegate-test/log/stdout.log delegated 26448 news 3r DOOR 0t0 53 /var/run/name_service_door (door to nscd[880]) (FA:->0x6002c965340) delegated 26448 news 4u VREG 348,2 0 113324681 /tmp (swap) delegated 26448 news 5u VREG 348,2 0 95974300 /tmp (swap) delegated 26448 news 6w VREG 85,0 0000000X 4146 /opt/local/news/delegate-test/log/119.18 delegated 26448 news 7u VREG 85,3 1152 4072 /opt/local/news/delegate-test/adm/initlog/_119_ delegated 26448 news 8u IPv4 TCP no TCP/UDP/IP information available delegated 26448 news 9u VREG 85,3 1860 4128 /opt/local/news/delegate-test/adm/svstats9/_119_ delegated 26448 news 10w VREG 85,3 94995 4108 /opt/local/news/delegate-test/log/errors.log.2 delegated 26448 news 11w VREG 85,3 165887 4098 /opt/local/news/delegate-test/log/abort/119 delegated 26448 news 12w VREG 85,0 0000000X 4156 /opt/local/news/delegate-test/log/119.nntp.18 delegated 26448 news 13u IPv4 0x60038f54000 0t0 UDP *:65535 delegated 26448 news 14u IPv4 0x60032a0cfc0 0t0 UDP *:65535 delegated 26448 news 15w VREG 85,3 4797448 4102 /opt/local/news/delegate-test/log/stdout.log delegated 26448 news 16u unix 105,420 0t0 55050244 /devices/pseudo/tl@0:ticots->(socketpair: 0x1a4) (0x6002cc3d028) delegated 26448 news 17u VREG 85,3 3627510 18364 /opt/local/news/spool/delegate/cache/nntp/europe.isp.giganews.com/LIST/active delegated 26448 news 18u unix 105,423 0t0 55050244 /devices/pseudo/tl@0:ticots->(socketpair: 0x1a7) (0x6002cbe3568) delegated 26448 news 19u unix 105,424 0t0 55050244 /devices/pseudo/tl@0:ticots->(socketpair: 0x1a8) (0x600341e8ca0) delegated 26448 news 20u VREG 85,5 1356 347 /var (/dev/md/dsk/d5) delegated 26448 news 21u unix 105,425 0t0 55050244 /devices/pseudo/tl@0:ticots->(socketpair: 0x1a9) (0x6002cc3cad0) delegated 26448 news 22u unix 105,426 0t0 55050244 /devices/pseudo/tl@0:ticots->(socketpair: 0x1aa) (0x600341e8e68) delegated 26448 news 23u VREG 358,30 398959 449 /opt/local/news/spool/delegate/cache/nntp/all/b (filer:/vol12) delegated 26448 news 24u VREG 85,3 11 23381 /opt/local/news/delegate/act/locks/PORT/119.0 delegated 26448 news 25r VCHR 13,2 0 /devices (/devices) (like character special /devices/pseudo/mm@0:null) delegated 26448 news 26u VREG 348,2 0 115782853 /tmp (swap) delegated 26448 news 27u VREG 348,2 229370 00000000X /tmp (swap) delegated 26448 news 46u IPv4 TCP no TCP/UDP/IP information available Hope this helps. Eddie > Cheers, > Yutaka > -- > 9 9 Yutaka Sato <y.sato@delegate.org> http://delegate.org/y.sato/ > ( ~ ) National Institute of Advanced Industrial Science and Technology > _< >_ 1-1-4 Umezono, Tsukuba, Ibaraki, 305-8568 Japan > Do the more with the less -- B. Fuller >