CPU at 100% (normally 2%), syslogd seemed to lock up…



  • First off, I want to say how impressed I am with this firewall. It's so easy to configure and use, I love it…

    I'm running pfSense on an 800 MHz "thin client" with 256 MB of RAM and 256 MB of ATA flash, so it's the embedded version. I haven't had any issues over the last few weeks. I use a cable modem with DHCP for WAN.

    Version  1.2.2
    built on Thu Jan 8 23:09:11 EST 2009
    Platform embedded

    My issue: I had about 4 hours tonight where the CPU was maxed out. I ran top from the console and found syslogd at 100%.

    pfsense:~#  top
    last pid: 18276;  load averages:  2.14,  2.25,  2.16    up 2+06:46:36  23:24:16
    36 processes:  2 running, 32 sleeping, 2 zombie
    CPU states: 15.5% user,  2.4% nice, 79.8% system,  2.4% interrupt,  0.0% idle
    Mem: 25M Active, 8864K Inact, 33M Wired, 24K Cache, 27M Buf, 160M Free
    Swap:

      PID USERNAME  THR PRI NICE   SIZE    RES STATE    TIME    WCPU COMMAND
      237 root        1 118    0  3236K  1120K RUN     43:15 100.00% syslogd
     1264 root        2  44   20  3340K  1040K select   2:57   0.00% slbd
      355 _dhcp       1  44    0  3132K  1284K select   1:30   0.00% dhclient
      610 root        1   4    0  6124K  4084K kqread   0:59   0.00% lighttpd
      422 root        1  -8    0  3132K   784K piperd   0:55   0.00% logger

    I searched the forums and found a few older posts saying there was a bug, but that it was fixed in version 1.2.1. I'm running the latest version, though. I did a reboot and things have settled down. Any suggestions on what causes this? Is there a better way to "unlock" it than a reboot?

    Any suggestions are appreciated.

    Thanks,
    Tim



  • I've been running into the same problem for a few days now. I tried turning off logging for the firewall and a couple of other things, but it still has problems…

    The only package I have installed is squid. I also haven't added any crazy rules to the firewall.



  • Similar cases haven't given us any way to reproduce why syslogd sometimes goes to 100% CPU usage. After a reboot you might never see it again.
    Please upgrade to 1.2.3: http://blog.pfsense.org/?p=377
    http://snapshots.pfsense.org/FreeBSD7/RELENG_1_2/



  • Thanks perry, I'll give it a try and report back if any issues come up.



  • OK, so exactly 4 hours after I upgraded/rebooted to 1.2.3-prerelease, I got the same problem, though in an ssh session top only shows syslogd using 10-13%. There don't seem to be any performance issues; everything is still running smoothly. Is this something wrong on the PHP side?

    Attached is the screenshot




  • My issue: I had about 4 hours tonight where the CPU was maxed out. I ran top from the console and found syslogd at 100%.

    Me too.

    I'm running in a VirtualBox VM on a 2.4GHz P4.

    When I first boot up, I see about 35% CPU load, almost all of which is in interrupts. (Perhaps that's what I get for using $7 gbit ethernet cards.)

    Under pfSense 1.2.2, after a while (a few hours?), I would see CPU usage shoot to 100% with either inetd or syslogd taking up most of the time.

    I updated to the 2/11 build of 1.2.3, and the behavior is similar, except that when the CPU usage shoots up, the user percentage goes way up but no individual task gets a high percentage. Oddly, both inetd and syslogd keep incrementing in total CPU time, however. Throughput is fine: I can move 5+ Mbps through my 6 Mbps link, although Starcraft is giving me 2 bars of latency and is laggy. The web interface is quite sluggish.

    We have another VM with a web site, and port 80 at the WAN address gets redirected to the server VM. I use the external address to allow my wife to work on the site with her laptop from home or on the road. Coincidentally, this redirection works from the LAN for a while, then stops working after a few hours.

    When I tracked down just what inetd is doing, it became clear that it was handling port redirection and I got suspicious that this might be due to the NAT rules for references to our external address.

    What a coincidence! Surely that's the problem, right?!

    Well, half-right.

    I stuck the LAN address in her hosts file and disabled the NAT redirection for internal access to the external address, figuring that if we stopped doing the redirection it would stop stalling inetd and syslogd.

    It made the inetd time go to near-zero, but syslogd is still racking up the time.

    Any ideas? Perhaps this afflicts people named "Tim" in particular?

    If nothing else, I thought you would like to hear that 1.2.3 doesn't necessarily eradicate the problem.

    - Tim.



  • Is there a better way to "unlock" it than a reboot?

    I logged into the box with ssh, went to the command prompt, found the syslogd process with ps -A | grep syslogd, then did kill -6 to try to dump core on syslogd.

    I didn't get my core, but it did kill syslogd, and my CPU usage went back to normal.

    Other instances of syslogd popped up and disappeared at regular intervals thereafter, so it seems that syslogd is supposed to appear, handle a log entry, and vanish.

    Anyhow, it seems you can recover from the syslogd lock by finding and killing the process.

    It shouldn't be too hard to hack around the issue by adding a check for syslogd tasks with more than a minute's processor time and killing them.

    That still doesn't really address my inetd problem when I enable NAT reflection, though…

    I'm new to hacking around in BSD / Linux, but it seems like I should be able to attach to the stalled process with gdb and do a back-trace on the stack to find where it's stuck. I'll let you know if I have any luck.

    -Tim.
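
The workaround suggested in the post above (look for syslogd processes with more than a minute of processor time and kill them) might be sketched roughly like this. It is only an illustration under assumptions, not something anyone in this thread has tested on pfSense: the function name `stuck_syslogd_pids`, the 60-second limit and the `--list` / `--kill` arguments are made up for the example, and it assumes `ps -axo pid,time,comm` prints the PID, the accumulated CPU time and the command name in that order.

```shell
#!/bin/sh
# Sketch only: find syslogd processes that have used too much CPU time.
# Names and the 60-second default are illustrative assumptions.

# Reads "PID TIME COMMAND" lines on stdin and prints the PID of every
# syslogd whose accumulated CPU time is above the limit (in seconds).
# Accepts TIME as M:SS.cc, H:MM:SS or D-HH:MM:SS.
stuck_syslogd_pids() {
    limit="${1:-60}"
    awk -v limit="$limit" '
        $3 == "syslogd" {
            t = $2
            days = 0
            # Optional "days-" prefix used by some ps implementations.
            if (index(t, "-") > 0) { split(t, d, "-"); days = d[1]; t = d[2] }
            # Fold colon-separated fields into seconds, left to right.
            n = split(t, f, ":")
            secs = 0
            for (i = 1; i <= n; i++) secs = secs * 60 + f[i]
            secs += days * 86400
            if (secs > limit) print $1
        }'
}

# Nothing happens unless an argument is given.
case "${1:-}" in
    --list)
        # Dry run: only show which PIDs would be targeted.
        ps -axo pid,time,comm | stuck_syslogd_pids 60
        ;;
    --kill)
        for pid in $(ps -axo pid,time,comm | stuck_syslogd_pids 60); do
            echo "killing syslogd pid $pid"
            # Same signal as the manual kill -6 described above.
            kill -6 "$pid"
        done
        ;;
esac
```

Running it with `--list` first shows what would be killed. Whether logging recovers cleanly afterwards is not something this sketch checks.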



  • The issue persists in 1.2.3 RC 1.

    About 5 minutes after reboot, top shows syslogd elapsed time rising by 1 second each second, but the usage still shows a very low percentage.

    Any ideas for how to resolve this besides killing syslogd from ssh?

    - Tim.



  • When I first boot up, I see about 35% CPU load, almost all of which is in interrupts. (Perhaps that's what I get for using $7 gbit ethernet cards.)…

    Running:
    1.2.2
    built on Thu Jan 8 22:39:31 EST 2009
    FreeBSD 7.0-RELEASE-p8 i386

    I have been having the same problem since yesterday. I just added an 8 € Gbit card :P with an RTL chipset, this one: http://www.tp-link.com/products/product_des.asp?id=2
    It uses the "re" driver, as all Realtek NICs do.
    I also updated the Dashboard yesterday; I don't know whether that could be the issue.
    My "top" looks like this:

    PID USERNAME  THR PRI NICE  SIZE    RES STATE    TIME  WCPU COMMAND
    205 root            1 114    0  3236K  1252K RUN    20.9H  81.05%  syslogd

    Maybe there is a problem with the driver for these cards.

    Is there any way to completely disable syslog to temporarily solve the problem?



  • I just did a reboot about 4 hours ago and the problem was solved without any apparent reason.
    I did not change anything; I just rebooted.


