CPU at 100% (normally 2%), syslogd seemed to lock up…

tkadams

First off, I want to say how impressed I am with this firewall. It's so easy to configure and use, I love it…

I'm running pfSense on an 800 MHz "thin client" with 256 MB of RAM and a 256 MB ATA flash disk, so it's the embedded version. I haven't had any issues over the last few weeks. I use a cable modem with DHCP for WAN.

Version 1.2.2
built on Thu Jan 8 23:09:11 EST 2009
Platform embedded

My issue: I had about 4 hours tonight where the CPU was maxed out. I ran top from the console and found syslogd at 100%.

pfsense:~# top
last pid: 18276; load averages: 2.14, 2.25, 2.16 up 2+06:46:36 23:24:16
36 processes: 2 running, 32 sleeping, 2 zombie
CPU states: 15.5% user, 2.4% nice, 79.8% system, 2.4% interrupt, 0.0% idle
Mem: 25M Active, 8864K Inact, 33M Wired, 24K Cache, 27M Buf, 160M Free
Swap:

  PID USERNAME THR PRI NICE   SIZE    RES STATE    TIME    WCPU COMMAND
  237 root       1 118    0  3236K  1120K RUN     43:15 100.00% syslogd
 1264 root       2  44   20  3340K  1040K select   2:57   0.00% slbd
  355 _dhcp      1  44    0  3132K  1284K select   1:30   0.00% dhclient
  610 root       1   4    0  6124K  4084K kqread   0:59   0.00% lighttpd
  422 root       1  -8    0  3132K   784K piperd   0:55   0.00% logger

I searched the forums and found a few older posts saying there was a bug that was fixed in version 1.2.1, but I'm running the latest version. I did a reboot and things have settled down. Any suggestions on what causes this? Is there a better way to "unlock" it than a reboot?

Any suggestions are appreciated.

Thanks,
Tim

brenix

I've been running into the same problem for a few days now. I tried turning off logging for the firewall and a couple of other things, but it still has problems…

The only package I have installed is squid. I haven't added any crazy rules to the firewall either.

Perry

Similar cases haven't given us any way to reproduce why syslogd sometimes goes to 100% CPU usage. After a reboot you might never see it again.
Please upgrade to 1.2.3: http://blog.pfsense.org/?p=377
http://snapshots.pfsense.org/FreeBSD7/RELENG_1_2/

brenix

Thanks Perry, I'll give it a try and report back if any issues come up.

brenix

OK, so exactly 4 hours after I upgraded and rebooted to 1.2.3-prerelease, I got the same problem, though in an SSH session top only shows syslogd using 10-13%. There don't seem to be any performance issues; everything is still running smoothly. Is something wrong on the PHP side?

Attached is the screenshot

[Attached image: Picture0001.jpg]

FishOuttaWater

tkadams wrote: My issue; I had about 4 hours tonight where the cpu was maxed out. I ran top from console and found syslogd to be at 100%.

Me too.

I'm running in a VirtualBox VM on a 2.4GHz P4.

When I first boot up, I see about 35% CPU load, almost all of which is in interrupts. (Perhaps that's what I get for using $7 gigabit Ethernet cards.)

Under pfSense 1.2.2, after a while (a few hours?), I would see CPU usage shoot to 100% with either inetd or syslogd taking up most of the time.

I updated to the 2/11 build of 1.2.3, and the behavior is similar: when the CPU usage shoots up, the user percentage goes way up, but no individual task gets a high percentage. Oddly, both inetd and syslogd keep incrementing in total CPU time, however. Throughput is fine. I can move 5+ Mbps through my 6 Mbps link, although Starcraft is giving me 2 bars of latency and is laggy. The web interface is quite sluggish.

We have another VM with a web site, and port 80 at the WAN address gets redirected to the server VM. I use the external address to allow my wife to work on the site with her laptop from home or on the road. Coincidentally, this redirection works from the LAN for a while, then stops working after a few hours.

When I tracked down just what inetd is doing, it became clear that it was handling port redirection and I got suspicious that this might be due to the NAT rules for references to our external address.

What a coincidence! Surely that's the problem, right?!

Well, half-right.

I stuck the LAN address in her hosts file and disabled the NAT redirection for internal access to the external address, figuring that if we stopped doing the redirection, it would stop stalling inetd and syslogd.

It made the inetd time go to near-zero, but syslogd is still racking up the time.

Any ideas? Perhaps this afflicts people named "Tim" in particular?

If nothing else, I thought you would like to hear that 1.2.3 doesn't necessarily eradicate the problem.

- Tim.

FishOuttaWater

tkadams wrote: Is there a better way to "unlock" it than a reboot?

I logged into the box with ssh, went to the command prompt, found the syslogd process with ps -A | grep syslogd, then did kill -6 to try to dump core on syslogd.

I didn't get my core, but it did kill syslogd, and my CPU usage went back to normal.

Other instances of syslogd popped up and disappeared at regular intervals thereafter, so it seems that syslogd is supposed to appear, handle a log entry, and vanish.

Anyhow, it seems you can recover from the syslogd lock by finding and killing the process.

It shouldn't be too hard to hack around the issue by adding a check for syslogd tasks with more than a minute's processor time and killing them.
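A minimal sketch of such a check, assuming a ps that accepts -o pid= -o time= -o comm= and reports TIME as [[dd-]hh:]mm:ss[.cc]. The 60-second limit, the function names, and the "run" argument are illustrative assumptions, not anything pfSense provides. Keep in mind that the limit is on cumulative CPU time, so a healthy long-lived syslogd would eventually cross it too, and that logging stops until syslogd is started again.

```shell
#!/bin/sh
# Sketch only: LIMIT_SECONDS, the function names, and the "run" argument are
# assumptions for illustration, not pfSense features.

LIMIT_SECONDS=60

# Convert a ps TIME field ([[dd-]hh:]mm:ss[.cc]) to whole seconds.
cpu_seconds() {
    echo "$1" | awk '{
        days = 0; t = $1
        if (index(t, "-") > 0) { split(t, d, "-"); days = d[1]; t = d[2] }
        n = split(t, p, ":")
        secs = 0
        for (i = 1; i <= n; i++) secs = secs * 60 + p[i]
        printf "%d\n", secs + days * 86400
    }'
}

# Read "PID TIME COMMAND" lines on stdin and print the PIDs of syslogd
# processes whose CPU time is above LIMIT_SECONDS.
stuck_syslogd_pids() {
    while read -r pid cputime comm; do
        [ "$comm" = "syslogd" ] || continue
        if [ "$(cpu_seconds "$cputime")" -gt "$LIMIT_SECONDS" ]; then
            echo "$pid"
        fi
    done
}

# Only touch real processes when invoked with "run", so the functions can be
# loaded and tried out without killing anything.
if [ "${1:-}" = "run" ]; then
    ps -ax -o pid= -o time= -o comm= | stuck_syslogd_pids | while read -r pid; do
        kill "$pid"    # default SIGTERM; the post above used kill -6 instead
    done
fi
```

Run by hand or from cron as "sh watchdog.sh run" (the file name is hypothetical); without the argument it only defines the functions.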

That still doesn't really address my inetd problem when I enable NAT reflection, though…

I'm new to hacking around in BSD / Linux, but it seems like I should be able to attach to the stalled process with gdb and do a back-trace on the stack to find where it's stuck. I'll let you know if I have any luck.

-Tim.

FishOuttaWater

The issue persists in 1.2.3 RC 1.

About 5 minutes after reboot, top shows syslogd elapsed time rising by 1 second each second, but the usage still shows a very low percentage.
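One way to cross-check this (TIME climbing about a second per second while the percentage column stays low) is to sample the cumulative TIME field twice and divide the growth by the interval. A rough sketch, assuming a ps that supports -o time= and -p; the function names and the 10-second default are hypothetical.

```shell
#!/bin/sh
# Sketch only: function names and the default interval are assumptions.

# Convert a ps TIME field ([[dd-]hh:]mm:ss[.cc]) to whole seconds.
to_seconds() {
    echo "$1" | awk '{
        days = 0; t = $1
        if (index(t, "-") > 0) { split(t, d, "-"); days = d[1]; t = d[2] }
        n = split(t, p, ":")
        secs = 0
        for (i = 1; i <= n; i++) secs = secs * 60 + p[i]
        printf "%d\n", secs + days * 86400
    }'
}

# Share of one CPU (in percent) implied by two TIME samples taken
# $3 seconds apart.
cpu_share() {
    start=$(to_seconds "$1")
    end=$(to_seconds "$2")
    echo $(( (end - start) * 100 / $3 ))
}

# With a PID as the first argument, sample that process twice and report.
if [ -n "${1:-}" ]; then
    pid=$1
    interval=${2:-10}
    first=$(ps -o time= -p "$pid")
    sleep "$interval"
    second=$(ps -o time= -p "$pid")
    echo "PID $pid used about $(cpu_share "$first" "$second" "$interval")% of one CPU"
fi
```

If the result is near 100 while top reports a low WCPU for the same process, the discrepancy is in how the percentage is computed or displayed, not in the process being idle.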

Any ideas for how to resolve this besides killing syslogd from ssh?

- Tim.

rt_rex

FishOuttaWater wrote: … When I first boot up, I see about 35% CPU load, almost all of which is in interrupts. (Perhaps that's what I get for using $7 gbit ethernet cards.) …

Running :
1.2.2
built on Thu Jan 8 22:39:31 EST 2009
FreeBSD 7.0-RELEASE-p8 i386

I have been having the same problem since yesterday. I just added an 8 € gigabit card :P with an RTL chipset, this one: http://www.tp-link.com/products/product_des.asp?id=2
It uses the "re" driver, as Realtek gigabit NICs do.
I also updated the Dashboard yesterday; I don't know if that could be causing an issue.
My "top" looks like this:

  PID USERNAME THR PRI NICE   SIZE    RES STATE    TIME    WCPU COMMAND
  205 root       1 114    0  3236K  1252K RUN     20.9H  81.05% syslogd

Maybe there is a problem with the driver for these cards.

Is there any way to completely disable syslog to temporarily work around the problem?

rt_rex

I just did a reboot about 4 hours ago and the problem was solved without any apparent reason.
I did not change anything, I just rebooted.

[Attached image: hi-cpu.JPG]