Solved: Bandwidthd core dump



  • Installed Bandwidthd on two identical pfSense systems (Dell 2650s). When accessing the bandwidthd screens all appears OK, and I like what it can do for me, but after each graphing run (every 2.5-minute or 10-minute interval) a syslog entry is made with the following:

    kernel: pid 12345 (bandwidthd), uid 0: exited on signal 8 (core dumped)

    A search of the forum shows one mention of it with no resolution. I hope I can provide enough info here to allow someone to assist me in fixing this annoying problem. I'm running pfSense 1.2.3-RC3. Below is my bandwidthd.conf file. Shelling out and running top shows 4 bandwidthd processes, and the webGUI Status/Services page shows bandwidthd running. Just how much should I care about fixing these bandwidthd core dumps?

    # This file was automatically generated by the pfSense
    # package management system.  Changing this file
    # will lead to it being overwritten again when
    # the package manager resyncs.
    ####################################################
    # Bandwidthd.conf
    #
    # Commented out options are here to provide
    # documentation and represent defaults

    # Subnets to collect statistics on.  Traffic that
    # matches none of these subnets will be ignored.
    # Syntax is either IP Subnet Mask or CIDR
    subnet 192.168.80.0/23

    # Device to listen on.
    # Bandwidthd listens on the first device it detects
    # by default.  Run "bandwidthd -l" for a list of
    # devices.
    dev "fxp0"

    ###################################################
    # Options that don't usually get changed

    # An interval is 2.5 minutes, this is how many
    # intervals to skip before doing a graphing run

    # Graph cutoff is how many k must be transfered by an
    # ip before we bother to graph it
    graph_cutoff 1

    #Put interface in promiscuous mode to see traffic
    #that may not be routing through the host machine.
    promiscuous true

    #Log data to cdf file htdocs/log.cdf

    #Read back the cdf file on startup

    #Libpcap format filter string used to control what bandwidthd sees
    #Please always include "ip" in the string to avoid strange problems

    #Draw Graphs - This defaults to true to graph the traffic bandwidthd is recording
    #Usually set this to false if you only want cdf output or
    #you are using the database output option.  Bandwidthd will use very little
    #ram and cpu if this is set to false.

    #Set META REFRESH seconds (default 150, use 0 to disable).
    ???
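    For reference, signal 8 on FreeBSD (as on most Unix systems) is SIGFPE, an arithmetic fault such as an integer divide-by-zero, not a memory or permissions problem. You can confirm the number-to-name mapping from any shell:

    ```shell
    # Map signal number 8 to its name; prints FPE (SIGFPE),
    # i.e. an arithmetic exception such as integer divide-by-zero.
    kill -l 8
    ```

    Since SIGFPE usually means a division by zero somewhere, a crash in the graphing arithmetic is at least a plausible suspect.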


  • Rebel Alliance Developer Netgate

    @makesense:

    I'm running pfsense 1.2.3-RC3

    Upgrade to -RELEASE and try again. Few people are interested in chasing down a bug on an RC version since it may have been fixed in -RELEASE.



  • Will do and report back.



  • Upgraded to 1.2.3-RELEASE.
    I actually have two identical pfSense boxes. Both are still core dumping on bandwidthd.
    Typical/sample syslog messages are as follows:
    Jun 7 09:40:03 kernel: pid 55027 (bandwidthd), uid 0: exited on signal 8 (core dumped)
    Jun 7 09:30:02 kernel: pid 51791 (bandwidthd), uid 0: exited on signal 8 (core dumped)
    Jun 7 09:27:37 kernel: pid 50897 (bandwidthd), uid 0: exited on signal 8 (core dumped)
    Jun 7 09:20:01 kernel: pid 48348 (bandwidthd), uid 0: exited on signal 8 (core dumped)
    Jun 7 09:10:00 kernel: pid 44929 (bandwidthd), uid 0: exited on signal 8 (core dumped)
    Jun 7 08:59:59 kernel: pid 41478 (bandwidthd), uid 0: exited on signal 8 (core dumped)
    Jun 7 08:49:58 kernel: pid 38074 (bandwidthd), uid 0: exited on signal 8 (core dumped)
    Jun 7 08:39:57 kernel: pid 34739 (bandwidthd), uid 0: exited on signal 8 (core dumped)

    and after reloading bandwidthd, syslog from the other pfSense box:
    Jun 7 10:59:48 kernel: pid 14984 (bandwidthd), uid 0: exited on signal 8 (core dumped)
    Jun 7 10:56:27 kernel: pid 13857 (bandwidthd), uid 0: exited on signal 8 (core dumped)
    Jun 7 10:53:06 kernel: pid 12670 (bandwidthd), uid 0: exited on signal 8 (core dumped)
    Jun 7 10:49:45 kernel: pid 11553 (bandwidthd), uid 0: exited on signal 8 (core dumped)
    Jun 7 10:46:24 kernel: pid 10435 (bandwidthd), uid 0: exited on signal 8 (core dumped)
    Jun 7 10:29:28 kernel: pid 4646 (bandwidthd), uid 0: exited on signal 8 (core dumped)
    Jun 7 10:19:36 kernel: pid 1339 (bandwidthd), uid 0: exited on signal 8 (core dumped)
    Jun 7 10:19:27 kernel: pid 1235 (bandwidthd), uid 0: exited on signal 8 (core dumped)
    Jun 7 09:39:23 last message repeated 3 times
    Jun 7 09:39:23 bandwidthd: Packet Encoding: Ethernet
    Jun 7 09:39:23 last message repeated 3 times
    Jun 7 09:39:23 kernel: fxp0: promiscuous mode enabled
    Jun 7 09:39:23 bandwidthd: Opening fxp0
    Jun 7 09:39:23 bandwidthd: Monitoring subnet 192.168.80.0 with netmask 192.168.80.0
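    That last line is worth a second look: for subnet 192.168.80.0/23 the netmask should be 255.255.254.0, yet bandwidthd logs the subnet address itself as the mask. Whether that is just a logging quirk or a real parsing problem I can't say, but the expected mask for a /23 is easy to check with plain POSIX-shell arithmetic (nothing bandwidthd-specific here):

    ```shell
    # Compute the dotted-quad netmask for a /23 prefix.
    # Expected result: 255.255.254.0 (not 192.168.80.0 as logged).
    prefix=23
    mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
    printf '%d.%d.%d.%d\n' \
      $(( (mask >> 24) & 255 )) $(( (mask >> 16) & 255 )) \
      $(( (mask >> 8) & 255 ))  $((  mask        & 255 ))
    ```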



  • I have the exact same problem.
    One thing I notice: if I just leave it be, the date I last reinstalled or saved the config never changes.
    What is the command for listing all the files, including the bandwidthd config file, from the shell?
    I want to uninstall it from packages, then from the shell delete any leftover bandwidthd files, reboot, and reinstall.
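    For the shell question above: the package's files normally live under /usr/local/bandwidthd (with the generated config usually at /usr/local/bandwidthd/etc/bandwidthd.conf; verify the exact paths on your own box), so something like this lists any leftovers:

    ```shell
    # List any bandwidthd files left behind under the usual
    # pfSense package prefix (adjust the path for your install).
    find /usr/local -name '*bandwidthd*' 2>/dev/null
    ```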



  • I uninstalled bandwidthd.
    I backed up the configuration with NO package info (it was the only package installed) and checked the XML file to be sure bandwidthd was not listed anywhere. Then I reloaded the config and reinstalled the bandwidthd package. This time my original config settings were all blank, and so far (just over half an hour) I have had no core dumps in the syslog.

    If you have other packages installed, back up the whole config after uninstalling bandwidthd, manually edit the bandwidthd package info out of the XML file, and try that.
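    The config being edited here lives at /cf/conf/config.xml on a stock pfSense install (also reachable as /conf/config.xml). Before reloading, a quick check that no bandwidthd fragments remain:

    ```shell
    # Show any bandwidthd references still present in the pfSense
    # config; prints a note instead of failing if none are found.
    CONF=/cf/conf/config.xml
    grep -n -i 'bandwidthd' "$CONF" 2>/dev/null \
      || echo "no bandwidthd entries found (or $CONF not present)"
    ```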



  • Yup, that lasted for 3 hours. Back to core dumps again.
    What gives with this program working fine on 3 other installs and not on this one?!



  • Not glad, but somewhat pleased, to see that someone besides my 2 pfSense boxes exhibits this problem with bandwidthd.

    I'm running it on a Dell 2650 with PERC (dual CPU, 18 GB RAID, 2 GB RAM).

    What do you see in STATUS/SERVICES?

    Each time I uninstalled/reinstalled bandwidthd, Status/Services showed duplicate entries for bandwidthd. Editing the XML was the only way to keep it to one entry, but it still core dumps after a short period of time (1 to 2 hours). I can stop/start it and it will eventually start core dumping again.

    Why us?



  • I only see bandwidthd and the 3 normal services.
    I have one system on a Dell 280 with no problems; the system with issues is an IBM (model number escapes me at the moment). When it starts core dumping I notice the graph still seems to be live and on the correct timeline, but in the top left corner where the time and date are displayed, the time is locked at the point where the syslog shows the first few core dumps. And it dumps every 2 to 3 minutes.



  • Exact same issue with the time in the top left corner on both of mine too.
    Startup appears normal (from syslog):
    Jun 8 09:39:16 bandwidthd: Packet Encoding: Ethernet
    Jun 8 09:39:16 kernel: fxp1: promiscuous mode enabled
    Jun 8 09:39:16 bandwidthd: Opening fxp1
    Jun 8 09:39:16 bandwidthd: Monitoring subnet 192.168.80.0 with netmask 192.168.80

    From the console, shelling out and running top, how many bandwidthd processes do you have running? I have four running because I have bandwidthd listed twice in Status/Services. I have to do the manual XML edit to fix this but can't until non-peak usage hours late tonight. With yours, do you have two running or just one? I thought my having multiple entries in Status/Services was the problem, but apparently not.
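    A quick way to count them without reading through top (bandwidthd normally runs as several cooperating processes, and four is the commonly reported normal count, so that number by itself isn't a sign of trouble):

    ```shell
    # Count bandwidthd processes by exact command name; grep -c
    # prints 0 rather than failing when none are running.
    ps ax -o comm | grep -c '^bandwidthd' || :
    ```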



  • I have 4 processes of it running too; that's normal, I think.
    Try this: uncheck the Draw Graphs option on the config page. I did that at noon today; it's now 5:30 PM and I haven't seen any core dumps.



  • @Alan87i:

    I have 4 processes of it running too; that's normal, I think.
    Try this: uncheck the Draw Graphs option on the config page. I did that at noon today; it's now 5:30 PM and I haven't seen any core dumps.

    Scratch that; 2 hours later it started again.



  • It may have something to do with the database. I see different dates in the upper left on Daily, Weekly and Monthly. Daily and Monthly keep up with the current date/time, +/- three minutes because of the update interval, whereas Weekly has yesterday's date. But I've also seen the date wrong on Daily too.

    When you try to access it too soon it says (as it should) that it has nothing to graph and to refer to the README. There is no README in /usr/local/bandwidthd.



  • I see you use (or have used) ntop. How does it compare to bandwidthd?



  • Ntop goes well beyond basic; it charts everything. I have only checked it a few times, as I have had little time to study up on it and all its functions. I like bandwidthd for its easy-to-read graphs and counters, and it's got just and only what I want to see at the top of the page.



  • I changed a few settings in the Bandwidthd config. Then from the console I shelled out and did a chmod 666 (was 600) on the bandwidthd.core file in
    /usr/local/bandwidthd, and so far no core dumps. It is not quite past the five-hour max we have seen, but I usually get core dumps within 1 to 2 hours.
    Here are the settings:

    skip interval = 6 (default)
    graph cutoff = 1024 (default is 100)
    promiscuous = checked
    output_cdf = unchecked
    recover_cdf = unchecked
    filter = no entry/blank/empty
    Draw Graphs = checked
    meta refresh = 333 (default is 150)

    :o
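    For what it's worth, FreeBSD names a core dump after the process (the kern.corefile sysctl defaults to %N.core) and drops it in the process's working directory, which is why it shows up as bandwidthd.core under /usr/local/bandwidthd with mode 600. The chmod step above, reproduced on a scratch file:

    ```shell
    # Reproduce the permission change on a stand-in core file;
    # on the firewall the real target would be
    # /usr/local/bandwidthd/bandwidthd.core.
    f=$(mktemp)        # stand-in for bandwidthd.core
    chmod 600 "$f"     # how the kernel typically leaves a core file
    chmod 666 "$f"     # the workaround tried above
    ls -l "$f"
    rm -f "$f"
    ```

    Changing the core file's permissions shouldn't affect whether bandwidthd crashes, so any improvement here may be coincidence, but it does make the core readable for later debugging.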



  • Setting the graph cutoff to 100 stopped the core dumps. I have run this on 4 different boxes now, and on all but this machine I set the cutoff to 5 (make it work + more accurate, I figure). First time I have seen this issue caused by the cutoff setting. It's a 3 GHz CPU with 1 GB of memory, using 14% at the most. I'm going to try 50 this morning and let it run all day.
    Will report back if it starts dumping again.



  • ;D
    Good news. My core dumps are gone too. On the graph cutoff setting… how low can we go?!
    I have dual Xeon 2.4 GHz CPUs and 2 GB RAM, using 4 Intel 82558 Pro/100 NICs.
    RAID:
    Partition   Percent Capacity   Free   Used   Size
    /dev/aacd0s1a 0% 26.52 GB 124.98 MB 28.95 GB
    /dev/md0 1% 3.29 MB 28.00 KB 3.61 MB
    devfs 100% 0.00 KB 1.00 KB 1.00 KB
    devfs 100% 0.00 KB 1.00 KB 1.00 KB
    Totals :   0% 26.52 GB 125.01 MB 28.96 GB



  • I set the problem box to 50 for the cutoff this morning and it has not core dumped all day. Will try 25 tomorrow morning. My Dell 280 is set at 5 and has been for months. I wonder if the built-in NIC on the IBM is at fault. Both have the same dual gigabit LAN card for the LAN and OPT1 interfaces. Should have bought 2 of those Dells!!



  • The lower the better. I'll keep bumping my ol' Dell 2650 down and keep posting. Also saw references to NIC sensitivity in the expired bounty posts. I don't use the onboard NICs on the 2650; I went with two dual Intel Pro/100s. What's the CPU/RAM in the 280?



  • I tried 25 and had an instant core dump, so I tried 40 and it dumped again. Will leave it at 50, I guess.
    The 280 is a 3.2 GHz with 2 GB of memory in it.
    It's also running Lusca cache, and at the moment ntop as well, plus the new Country Block package, reporting 49% memory used.
    Really impressed with that box.
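    The trial-and-error above (on one box 25 and 40 dumped while 50 and 100 held; on another, 20 and 40 were fine) is essentially a manual binary search, with a stable threshold that clearly varies per machine. A sketch of the same search, with a hypothetical probe() standing in for "set graph_cutoff, restart bandwidthd, and watch syslog for signal 8 exits":

    ```shell
    # Hypothetical bisection for the lowest stable graph_cutoff.
    # probe() is a placeholder: in real use it would set the value,
    # restart bandwidthd, and watch syslog for "exited on signal 8".
    # Here it pretends values >= 50 are stable, as on the IBM box.
    lo=40     # highest value known to dump on this box
    hi=100    # lowest value known to be stable
    probe() { [ "$1" -ge 50 ]; }
    while [ $(( hi - lo )) -gt 1 ]; do
      mid=$(( (lo + hi) / 2 ))
      if probe "$mid"; then hi=$mid; else lo=$mid; fi
    done
    echo "lowest stable graph_cutoff: $hi"
    ```

    Each probe needs to run long enough to trust a "stable" verdict, since the dumps here sometimes took 1 to 3 hours to reappear.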



  • I have one at 20 and the other at 40. So far (3 hrs) no dumps… yet.



  • I have them both at 40 (plenty low enough) and no dumps. Hope this helps others.


Locked