Possible Problem with Squidguard Schedules causing Kernel Panic.

duanes

Are there any known problems with the Schedule feature of Squidguard (running the current AMD64 2.02 release)?

I am getting frequent crashes (3-5 per day) when the schedule is enabled. Developers may be able to find the crash uploads (host name is vfw-ka-nat1). I also found a few reports that the built-in NIC (em0 from an older nVidia chipset) caused crashes for some until the NIC was disabled in the BIOS, even though the NIC is not being used. I have not disabled it yet because my crashes stopped when I turned off the schedule that was on one of my Squidguard group ACLs.

Tell me what further info you need and I'll gather the data.

THANKS !!

duanes

I disabled schedule and the system ran for four days without a crash.

I re-enabled the Squidguard schedule yesterday morning and the system ran all day without a problem, but then crashed around 5pm. During the day, the schedule changes the filters for lunch, a 15-minute morning break, a 15-minute evening break, and again at the end of business at 4pm.

From the data below, it looks like a missing file or something, but the problem does not occur when schedules are not being used.

Here is the trap:
Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address = 0x18
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff806eecc8
stack pointer = 0x28:0xffffff800002f980
frame pointer = 0x28:0xffffff800002f9c0
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = resume, IOPL = 0
current process = 11 (idle: cpu1)

0xffffff0002661000: tag ufs, type VREG
usecount 1, writecount 1, refcount 3 mountedhere 0
flags ()
v_object 0xffffff0012297948 ref 0 pages 1
lock type ufs: EXCL by thread 0xffffff0077712000 (pid 20236)
ino 18939893, on dev ad4s1a

0xffffff0005d88000: tag ufs, type VREG
usecount 1, writecount 1, refcount 3 mountedhere 0
flags ()
v_object 0xffffff0011f9a510 ref 0 pages 1
lock type ufs: EXCL by thread 0xffffff0071d07ba0 (pid 20207)
ino 18940030, on dev ad4s1a
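
For comparing one crash report with the next, the "key = value" lines of a trap like the one above can be pulled out with a small script. This is only a rough sketch: `parse_trap` and `looks_like_null_deref` are names made up for this example, and the 0x1000 cutoff (one 4 KB page) is an assumption, not something FreeBSD defines. A "page not present" fault at an address as low as 0x18 usually suggests a NULL pointer plus a small offset being dereferenced in the kernel, rather than a missing file.

```python
# Excerpt of the trap text posted above (first few fields only).
TRAP = """\
Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address = 0x18
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff806eecc8
current process = 11 (idle: cpu1)
"""


def parse_trap(text):
    """Collect 'key = value' lines from a trap report into a dict.

    Lines without '=' or with an empty key are skipped. Only the first
    '=' on a line splits it, so 'cpuid = 1; apic id = 01' is stored
    under the key 'cpuid'.
    """
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        key = key.strip()
        if sep and key:
            # Keep the first occurrence if a key repeats.
            fields.setdefault(key, value.strip())
    return fields


def looks_like_null_deref(fields, limit=0x1000):
    """Return True if the fault address lies below `limit`.

    The limit is an assumed heuristic: addresses inside the first page
    are typically NULL plus a structure-member offset.
    """
    addr = fields.get("fault virtual address")
    return addr is not None and int(addr, 16) < limit


if __name__ == "__main__":
    fields = parse_trap(TRAP)
    print(fields["fault virtual address"], looks_like_null_deref(fields))
```

Saving the parsed fields from each panic makes it easier to see whether the instruction pointer and fault address repeat (pointing at one code path) or move around (more typical of flaky hardware).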

jimp

Is that on a full install or NanoBSD?

The panic seems to be in the filesystem code so my first guess would be the hard drive or install media (CF, SSD, etc) is failing or the partition is corrupted.

Schedules would cause more disk writes at the times it switches, so there is more going on to trigger a failure. It would probably happen with any periodic filesystem write if the media is beginning to die.

duanes

This is a full install. It is looking less like a schedule problem.

I am now leaning more toward the built-in NIC driver (enc0). I disabled the built-in NIC on this board, but the schedule is ENABLED. The system has been running for 5 days without a kernel panic now. I was getting several panics per day.

All other settings are the same, although, thinking back, I unchecked the "Use PowerD" option as well.

I saw earlier that it appeared to be a filesystem issue, so I ran a disk burn-in for a while and found nothing. I swapped the HD with another, but that didn't seem to make a difference.

I will re-enable the PowerD option now and see if the problem returns.

jimp

If the HDD is OK it could be the disk controller, CPU, memory, overheating, etc.; basically any other hardware. But if the crash is always in the filesystem code, the HDD is the most likely suspect, then the cables, then the controller/motherboard.

As for enc0, that's IPsec, and you can't disable that. Perhaps you're thinking of a different driver?

duanes

Sorry, my bad. The driver doesn't show up any longer; it was for an older nVidia chipset on a board running an Athlon 4400 CPU with a built-in 10/100 NIC. Maybe it was nv0 that was listed.

When I get a chance, I'll search more for the info. I was trying too many things at once in a desperate effort to stem the kernel panics. Now, I'll unwind some of the changes and see if I can find the cause.

BTW: THANKS for your help! It is appreciated.

D.

duanes

I got the crash again this morning.

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
Fatal double fault
rip = 0xffffffff80734ad2
rsp = 0xffffff802e7a9fb0
rbp = 0xffffff802e7aa080
cpuid = 0; apic id = 00
panic: double fault
cpuid = 0
KDB: enter: panic

The only change was to enable powerd. The crash occurred at the opening of business as traffic began increasing. Apparently, something on the system does not play well with powerd.

jimp

Possibly. If it's that easy to reproduce only with powerd enabled, it may be worth checking for a BIOS update to see if the manufacturer fixed any power management issues.

If the fix is as easy as disabling powerd, at least you can stabilize it.