Bandwidthd issues?



  • Hi,

    With some recent v2.2 updates, it seems that bandwidthd is broken - and worse yet, when installed it kills the Web Configurator (GUI). Are others seeing this? Has anyone found a fix?

    Thanks!



  • I suspect I have the same problem - I upgraded a spare APU.1C to 2.1.5, played with it, then upgraded to 2.2, and the webGUI won't come up. I upgraded a few hours later to a newer 2.2-BETA build and still no webGUI. Then I stopped on Friday afternoon. This spare system has bandwidthd on it.
    I will have access to it again on Sunday, so will have a look then, remove bandwidthd and see if that fixes it. Then try and see what bit of bandwidthd could be breaking the webGUI???



  • If it helps, see this post, where I added a link to the way to manually remove Bandwidthd (https://forum.pfsense.org/index.php?topic=81696.msg446461#msg446461).

    When I do this, and reinstall … the GUI comes back up. Then if I reinstall Bandwidthd, and hit "Save" within Bandwidthd -> the GUI is killed again.

    Hope this helps!



  • I had some problems with bandwidthd during a recent upgrade (19-Sep), and reported them here: https://forum.pfsense.org/index.php?topic=81908.msg448103#msg448103

    I never lost the webgui though.  I did remove the package, reinstall, and got segfaults, so I removed it again.  After upgrading the next day, I re-installed bandwidthd and it's been running fine ever since.

    So, it's not a universal problem.  I'm currently running:

    2.2-BETA (amd64)
    built on Tue Sep 23 13:29:41 CDT 2014
    FreeBSD 10.1-PRERELEASE
    
    


  • OK, and really odd now - I have manually removed bandwidthd (/cf/conf/config.xml) … so it's not shown in the installed packages, nor in the menu. So the GUI is working fine ... and bandwidthd is running! Not sure I can explain it, but hopefully it helps someone out .. :).



  • 2.2-BETA (amd64)
    built on Sat Sep 27 14:17:44 CDT 2014
    FreeBSD 10.1-PRERELEASE

    I had this in system.log

    Sep 28 09:46:05 apu22 php-fpm[54340]: /rc.newwanip: Creating rrd update script
    Sep 28 09:46:05 apu22 bandwidthd: Opening re0
    Sep 28 09:46:05 apu22 bandwidthd: Opening re0
    Sep 28 09:46:05 apu22 bandwidthd: Packet Encoding: Ethernet
    Sep 28 09:46:05 apu22 bandwidthd: Packet Encoding: Ethernet
    Sep 28 09:46:06 apu22 kernel: done.
    Sep 28 09:46:07 apu22 php-fpm[54340]: /rc.newwanip: pfSense package system has detected an ip change 0.0.0.0 ->  192.168.111.100 ... Restarting packages.
    Sep 28 09:46:07 apu22 check_reload_status: Starting packages
    Sep 28 09:46:07 apu22 check_reload_status: Reloading filter
    Sep 28 09:46:08 apu22 php: rc.bootup: ROUTING: setting default route to 10.49.0.250
    Sep 28 09:46:08 apu22 kernel: done.
    Sep 28 09:46:08 apu22 php-fpm[54340]: /rc.start_packages: Restarting/Starting all packages.
    Sep 28 09:46:08 apu22 kernel: done.
    Sep 28 09:46:08 apu22 php-fpm[54340]: /rc.start_packages: You should specify an interface for bandwidthd to listen on. Exiting.
    Sep 28 09:46:08 apu22 check_reload_status: Updating all dyndns
    Sep 28 09:46:08 apu22 kernel: .done.
    Sep 28 09:46:09 apu22 check_reload_status: Could not connect to /var/run/php-fpm.socket
    Sep 28 09:46:10 apu22 check_reload_status: Could not connect to /var/run/php-fpm.socket
    ... could not connect message then repeats a few times a second...
    
    

    It seems that something in starting (or even in this case not starting) bandwidthd has resulted in php-fpm going AWOL.
    There were no php-fpm processes on the system.
    I found this useful command in /etc/rc and ran it from the console:

    /usr/local/sbin/php-fpm -c /usr/local/lib/php.ini -y /usr/local/lib/php-fpm.conf -RD
    

    Now I got php-fpm processes like this:

    [2.2-BETA][root@apu22.localdomain]/var/run(25): ps aux | grep php
    root   20815   0.0  1.4 216392 28572  -  Ss   10:13AM  0:00.01 php-fpm: master
    root   20854   0.0  2.3 220488 45824  -  S    10:13AM  0:00.44 php-fpm: pool li
    root   38102   0.0  0.1  18816  2300 u0  S+   10:14AM  0:00.00 grep php
    

    and files in /var/run like:

    [2.2-BETA][root@apu22.localdomain]/var/run(26): ls -l php*
    -rw-r--r--  1 root  wheel   5 Sep 28 10:13 php-fpm.pid
    srw-------  1 root  wheel   0 Sep 28 10:13 php-fpm.socket
    -rw-r--r--  1 root  wheel  58 Sep 28 09:45 php_modules_load_errors.txt
    

    The webGUI now works. I removed package bandwidthd and rebooted. It all came up fine.

    At least this might help someone get their webGUI running again so they can use it to easily remove bandwidthd.
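
    A minimal sketch of the recovery above, wrapped in a check so php-fpm is only started when its socket is missing. The socket path and the php-fpm command line are the ones quoted in this post; the function names and the RUN override are made up for illustration, and this has not been tested on a 2.2 snapshot.

```shell
#!/bin/sh
# Socket path as reported by check_reload_status in the log above.
SOCKET="${SOCKET:-/var/run/php-fpm.socket}"

# Succeeds when the php-fpm socket is missing, i.e. the webGUI backend is down.
# (-S tests for a Unix domain socket.)
phpfpm_socket_missing() {
    [ ! -S "$SOCKET" ]
}

# Start php-fpm with the command line copied from /etc/rc, but only when the
# socket is missing. RUN is a hypothetical knob: RUN=echo prints the command
# instead of running it.
restore_phpfpm() {
    if phpfpm_socket_missing; then
        ${RUN:-} /usr/local/sbin/php-fpm -c /usr/local/lib/php.ini \
            -y /usr/local/lib/php-fpm.conf -RD
    else
        echo "php-fpm socket present, nothing to do"
    fi
}
```

    From the console, sourcing this and calling restore_phpfpm would be enough; RUN=echo restore_phpfpm shows what would be run without starting anything.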

    I wonder what in bandwidthd is causing php-fpm to disappear?



  • Being a n00b with new hardware, I wasn't sure what I'd done wrong. I can confirm that this was the issue for me too.

    Once I installed Bandwidthd, the firewall would work, but I'd get a 503 trying to access the WebGUI. I tried and tested everything else. I'll have to be really slow and careful adding packages now (on the 18-10-2014 snapshot).



  • Hmm, I can't reproduce this problem.  I've upgraded several times with the bandwidthd package installed, and also removed and re-installed the package, with no problem.  Currently running:

    bandwidthd 2.0.1_6 pkg v.0.5, and 
    pfsense 2.2-BETA (amd64) built on Thu Oct 16 18:20:50 CDT 2014 FreeBSD 10.1-RC2
    


  • Very odd - definitely having this issue here. I manually removed BandwidthD so I can get back to the GUI. It says it's not installed, but it's running (and storing data to my PostgreSQL database, as configured earlier).

    Anything I can help check?

    Thanks!



  • Does anything show up in /tmp/php_errors.txt?

    This line, from Phil's log below, looks strange.  Can you verify you have configured an interface for bandwidthd?

    Sep 28 09:46:08 apu22 php-fpm[54340]: /rc.start_packages: You should specify an interface for bandwidthd to listen on. Exiting.
    


  • Hi,

    Nothing here at least … but my BandwidthD is running, even though it says it's not installed ... :(.

    Do I need to reinstall? I can, to check this ... but then I lose the GUI, and it seems to be a very manual process to go back?

    Thanks!



  • Hi,

    OK, some very odd behavior with BandwidthD …  :(. A couple items, below,

    1. I updated my install, but pfSense still says BandwidthD is not installed - even though I now seem to have 2 copies of bandwidthd running. Not sure how / why. Is there a way to fully / cleanly uninstall BandwidthD, to try again?
    2. I reinstalled BandwidthD from the GUI -> yep, my GUI is broken again (clearly the BandwidthD install breaks it!). Nothing in /tmp/php_errors.txt though. Any other places to look?
    3. When I try to manually run /usr/pbi/bandwidthd-amd64/local/bandwidthd/bandwidthd now, I get the error message: Shared object "libpq.so.5" not found, required by "bandwidthd"
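
    For item 3, a hedged sketch of how the missing library could be tracked down. ldd lists the shared objects a binary needs and marks unresolved ones with "not found". The generated rc script (quoted later in this thread) sets LD_LIBRARY_PATH to the PBI lib directory before launching bandwidthd, so running the binary by hand without it might explain the error. The helper name missing_libs is made up, and whether that lib directory actually contains libpq.so.5 is an assumption.

```shell
#!/bin/sh
BIN=/usr/pbi/bandwidthd-amd64/local/bandwidthd/bandwidthd
LIBDIR=/usr/pbi/bandwidthd-amd64/local/lib   # path taken from the generated rc script

# Read ldd output on stdin and print the name of every unresolved library.
missing_libs() {
    awk '/not found/ { print $1 }'
}

# Usage on the firewall (not run here):
#   ldd "$BIN" | missing_libs
#   LD_LIBRARY_PATH="$LIBDIR" ldd "$BIN" | missing_libs
```

    If the second command prints nothing while the first prints libpq.so.5, the library is there and only the search path was missing.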

    Suggestions?

    Thanks!



  • And a couple more observations,
    4) The command above works to get the GUI going again (/usr/local/sbin/php-fpm -c /usr/local/lib/php.ini -y /usr/local/lib/php-fpm.conf -RD). Thanks!
    5) I then removed and reinstalled BandwidthD … now it seems to be broken (no GUI to configure it, unable to restart it).

    Thoughts?

    Thanks!



  • Sorry for the barrage of postings, but trying to help debug this. A couple more observations …
    6) I updated pfSense, and then BandwidthD gets reinstalled (as I have it installed now) -> does not work from the GUI, and I have to execute the command above to get the GUI restarted.
    7) I can manually start bandwidthd after reboot, by executing /usr/local/etc/rc.d/bandwidthd.sh start

    Thanks!



  • The sad truth is, this install from the Pfsense library has been borked for TWO years!!!! Does the author not maintain it, or how does that work? Very unfortunate.

    Unfortunately I am a little confused about this "patch". I didn't see where it can be downloaded, or how to install it to make it work. Do I need to manually make the code change at the terminal?



  • The patch mentioned near the start of this thread has been applied already, you don't need to edit anything.  It fixed bandwidthd so that it would start up correctly with a full install version of pfSense.  (Prior to that, it only started on embedded versions).

    Problems mentioned later in the thread, such as bandwidthd causing php-fpm (and thus the webgui) to fail, have not been addressed, tracked down or patched.  Patches welcome.

    Note that the later issues do not apply universally; i.e., I continue to use bandwidthd without issue on 64-bit 2.2-BETA.



  • Hi,

    I admit, I'm still a bit confused by the fact that some installations see this and others don't. Any thoughts on what to check?

    Thanks!



  • I split part of the old thread into this thread, leaving here only the part referencing something that might still be an issue for some.

    @arrmo:

    I admit, I'm still a bit confused by the fact that some installations see this and others don't. Any thoughts on what to check?

    I'm not seeing this either, with bandwidthd running on a handful of different systems on latest 2.2 snapshot.

    If you're still seeing this, make sure you're on the latest 2.2 snapshot and can still replicate, and please provide the following info:

    1. which platform, full or nano? (not sure? If you have a Diagnostics>Nanobsd menu, it's nano. otherwise it's full)
    2. 32 or 64 bit? (not sure? look for i386 or amd64 on dashboard)
    3. what other packages do you have installed?
    4. have you installed anything manually at the command line via 'pkg'?
    5. is there anything about your setup or configuration that's atypical? anything really out of the ordinary you're doing


  • @lshantz:

    The sad truth is, this install from the Pfsense library has been borked for TWO years!!!!

    That's not the truth, it's one of the most widely-used packages. I'm not aware of any issues in versions prior to 2.2, the things discussed in this thread and the one where this was originally posted were specific to 2.2 back in the alpha days. It's working fine in 2.2 now as well.



  • Hi,

    Still trying to get this working. It sort of works … but breaks the GUI (I can work around that), but also creates two copies of bandwidthd on upgrade (that's more painful). Is there a way to fully de-install, so I can try a re-install? It could be my machine, as others don't see this, but not sure what all to remove to clean it up.

    Thanks!



  • Hi,

    FYI - did a complete uninstall of bandwidthd, then reinstalled. All looked good, then I rebooted … :(. Again I had two copies of bandwidthd running, and it broke my GUI.

    So the reboot after seems critical (to demonstrate the problem).

    Thanks.



  • Anyone who can replicate the issue with bandwidthd breaking the web interface, is there anything unusual about your setup? Seems it's easy for some to replicate, but most of us, including myself with at least a handful of 2.2 systems running bandwidthd, have never been able to replicate. Could someone share a config backup from a system that exhibits the issue?



  • Yep, I can get that to you - but is there an easy way to remove any sensitive information?

    Thanks!



  • For what it's worth, installing bandwidthd seemed to bork my webGUI too.  More specifically, after installation I started configuring it (i.e., selecting the LAN interface and identifying the subnet), I hit save and that's when the webGUI went down.

    I restarted the pfsense box and managed to get back in.  bandwidthd isn't running, and I don't plan to enable it for fear of not being able to get back in next time it goes down.

    My setup isn't hugely special.  I'm running 2.2-RC (amd64), Jan 02 build, with packages apinger, darkstat and snort.



  • @reggie14:

    For what it's worth, installing bandwidthd seemed to bork my webGUI too.  More specifically, after installation I started configuring it (i.e., selecting the LAN interface and identifying the subnet), I hit save and that's when the webGUI went down.

    I restarted the pfsense box and managed to get back in.  bandwidthd isn't running, and I don't plan to enable it for fear of not being able to get back in next time it goes down.

    My setup isn't hugely special.  I'm running 2.2-RC (amd64), Jan 02 build, with packages apinger, darkstat and snort.

    Same thing happened to me. But after a reboot it's running without issues.



  • Hi,

    Do you have bandwidthd running on boot (i.e. is it enabled)? That may be your "fix" - below is why I say this …

    I tried a few cases,

    1. Upgrade my release - breaks on reboot, and 2 copies of bandwidthd are running.

    2. Reboot, with bandwidthd enabled. Again, 2 copies are running, and I can see this in the log,
      Jan  5 13:24:37 pfSense bandwidthd: Monitoring subnet 255.255.255.0 with netmask 255.255.255.0
      Jan  5 13:24:37 pfSense bandwidthd: Monitoring subnet 255.255.255.0 with netmask 255.255.255.0
      Jan  5 13:24:37 pfSense bandwidthd: Opening bge0
      Jan  5 13:24:37 pfSense bandwidthd: Packet Encoding: Ethernet
      Jan  5 13:24:37 pfSense bandwidthd: Opening bge0
      Jan  5 13:24:37 pfSense bandwidthd: Packet Encoding: Ethernet

    3. Disable bandwidthd, reboot ... then all is good, GUI doesn't break. So bandwidthd seems to be the culprit. I then manually started bandwidthd (from the GUI ... enable and save). Only a single copy runs (log below), and GUI stays up,
      Jan  6 06:58:51 pfSense bandwidthd: Monitoring subnet 255.255.255.0 with netmask 255.255.255.0
      Jan  6 06:58:51 pfSense bandwidthd: Opening bge0
      Jan  6 06:58:51 pfSense bandwidthd: Packet Encoding: Ethernet

    So it seems that a reboot with bandwidthd enabled is the issue ... and 2 copies of bandwidthd are started for some reason (confirmed by ps aux, and also in the logs). Any idea why this happens (and how to fix it)?

    Thanks!



  • Hi,

    OK, perhaps another interesting finding here (that I admit, I stumbled on to accidentally …  ;)).

    In the case where things break, I see the following in the logs. I did make this happen once with manual (command line) restarts of bandwidthd, but can't seem to make it happen again ... :(.

    Jan 6 09:28:11 lighttpd[27258]: (mod_fastcgi.c.1754) connect failed: No such file or directory on unix:/var/run/php-fpm.socket
    Jan 6 09:28:11 lighttpd[27258]: (mod_fastcgi.c.3021) backend died; we'll disable it for 1 seconds and send the request to another backend instead: reconnects: 0 load: 1
    Jan 6 09:28:14 lighttpd[27258]: (mod_fastcgi.c.2848) fcgi-server re-enabled: unix:/var/run/php-fpm.socket

    Followed by,
    sshlockout[43240]: sshlockout/webConfigurator v3.0 starting up

    Thoughts? I think the sshlockout may be what is causing this, but why is it happening?

    Thanks!



  • Hi,

    OK, a bit more here - hoping to get some thoughts on this … ;).

    It seems that the GUI is broken when I stop the two operating bandwidthd processes after boot (more on this below). When I boot up, there are two bandwidthd processes ... does anyone else see this? I did an uninstall / reinstall, get the same thing. This itself doesn't kill the GUI, it's when I ssh in and stop bandwidthd ... then the GUI breaks.

    If I restart php-fpm, then after that I can stop and start bandwidthd as much as I want - the GUI stays up.

    Thoughts?

    Thanks!



  • And one more thing … stopping bandwidthd also kills ntopng (not just the GUI / php-fpm). Very odd ... :(



  • Hi,

    I have a change I want to try (locally), but it seems that files inside /usr/local/etc/rc.d get recreated on boot - and I admit, I can't find the source file (in text format at least … ;)). If anyone has any pointers I'd appreciate it, just trying to debug.

    Thanks!



  • Hi,

    Hoping someone else is smarter than me here (that wouldn't be difficult … :(). I want to change /usr/local/etc/rc.d/bandwidthd.sh as follows,

    Current:
    rc_start() {
            cd /usr/pbi/bandwidthd-amd64/local/bandwidthd
    LD_LIBRARY_PATH=/usr/pbi/bandwidthd-amd64/local/lib /usr/pbi/bandwidthd-amd64/local/bandwidthd/bandwidthd
    cd -
    }

    Updated:
    rc_start() {
            cd /usr/pbi/bandwidthd-amd64/local/bandwidthd
            if [ -z "$(ps auxw | grep "[/]usr/pbi/bandwidthd-amd64/local/bandwidthd/bandwidthd" | awk '{print $2}')" ]; then
                    LD_LIBRARY_PATH=/usr/pbi/bandwidthd-amd64/local/lib /usr/pbi/bandwidthd-amd64/local/bandwidthd/bandwidthd
            fi
            cd -
    }

    This is just to avoid multiple copies of bandwidthd from being started (as I see all the time). But I can't figure out how /usr/local/etc/rc.d/bandwidthd.sh is getting generated on boot.
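
    The same guard could also be written as a small helper that matches the command path exactly instead of grepping the whole ps line. This is only a sketch: is_running, rc_start_guarded and the PS_CMD override are made-up names, and it assumes the daemon shows up in ps under the full path used in the rc script.

```shell
#!/bin/sh
BIN=/usr/pbi/bandwidthd-amd64/local/bandwidthd/bandwidthd
LIBDIR=/usr/pbi/bandwidthd-amd64/local/lib

# Succeeds if the first word of some process's command is exactly $1.
# PS_CMD is a hypothetical override so the check can be dry-run with canned output.
is_running() {
    ${PS_CMD:-ps axww -o command} |
        awk -v bin="$1" '$1 == bin { found = 1 } END { exit !found }'
}

# Start bandwidthd only when no copy is running yet.
rc_start_guarded() {
    if is_running "$BIN"; then
        echo "bandwidthd already running, not starting another copy"
        return 0
    fi
    cd "${BIN%/*}" || return 1
    LD_LIBRARY_PATH="$LIBDIR" "$BIN"
    cd - >/dev/null
}
```

    Matching on the first field avoids the grep process matching itself, which is what the [/] trick in the version above works around.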

    Help?!?!

    Thanks very much!



  • Take a look at /usr/local/pkg/bandwidthd.inc; it writes both bandwidthd.sh and bandwidthd.conf



  • AFAIK it is normal to have 4 bandwidthd processes:

    ps aux | grep bandwidthd
    root       41074  0.0  2.4 15748  5576  -  S     9:41PM  0:00.02 /var/bandwidthd/bandwidthd
    root       41237  0.0  2.4 15748  5588  -  S     9:41PM  0:00.01 /var/bandwidthd/bandwidthd
    root       41425  0.0  2.4 15748  5576  -  S     9:41PM  0:00.01 /var/bandwidthd/bandwidthd
    root       41449  0.0  2.4 15748  5576  -  S     9:41PM  0:00.01 /var/bandwidthd/bandwidthd
    root       44012  0.0  0.9 10396  1952  1  S+    9:42PM  0:00.01 grep bandwidthd
    
    

    I think they are related to the recording of daily, weekly, monthly and yearly data/graphs. Each updates data/graphs at different intervals.



  • Hmmm … at least when logging to an external database (PostgreSQL) this causes problems -> I have confirmed multiple entries for the same points in time, and the daily totals in PostgreSQL are 2x what they should be (due to 2 processes running).

    Thoughts?

    Thanks!



  • Could it be this is why we're seeing different results (working / not working)? I think you're letting bandwidthd generate "local" info, but I'm logging to an external database? Just thinking out loud, trying to figure it out.

    It is also interesting that the path to your bandwidthd process is different - yours looks to be processing the data (as it's inside /var, right?), but mine is the rc.d service / daemon? I'm just trying to stop more than one of those from existing, as I am getting double entries in the database (not a good thing).

    Thoughts?

    Thanks very much!



  • Hmm, I have 8 running:

    [2.2-RC][root@pfsense.localdomain]/root: ps axfw | grep bandwidthd
    93615  -  S        0:05.36 /usr/pbi/bandwidthd-amd64/local/bandwidthd/bandwidthd
    93674  -  S        0:04.71 /usr/pbi/bandwidthd-amd64/local/bandwidthd/bandwidthd
    93823  -  S        0:04.30 /usr/pbi/bandwidthd-amd64/local/bandwidthd/bandwidthd
    94051  -  S        0:04.26 /usr/pbi/bandwidthd-amd64/local/bandwidthd/bandwidthd
    94636  -  S        0:05.40 /usr/pbi/bandwidthd-amd64/local/bandwidthd/bandwidthd
    94844  -  S        0:04.75 /usr/pbi/bandwidthd-amd64/local/bandwidthd/bandwidthd
    95011  -  S        0:04.31 /usr/pbi/bandwidthd-amd64/local/bandwidthd/bandwidthd
    95137  -  S        0:04.26 /usr/pbi/bandwidthd-amd64/local/bandwidthd/bandwidthd

    Mine is a full 64-bit install, with both "generate CDF" and "recover CDF" enabled, but no PostgreSQL logging enabled.  Bandwidthd has run without issues for me for the past few months, but I haven't tried to reconcile traffic reported by bandwidthd with that reported by the rrd graphs in pfSense.  Bandwidthd would be more helpful for me if it could track IPv6 traffic, so I don't use it much.

    The nano and full pfSense installs are treated differently by bandwidthd: check /usr/local/pkg/bandwidthd.inc starting around line 302.  The nano config for the $rc['start'] stanza runs from lines 315 to 348, while the stanza for full install is lines 351-353.  So, I think your different command lines are expected (maybe not bug-free :) but expected).



  • Thanks for the awesome pointer! That does help. I admit, I'm not sure how these packages and files fit together - stupidity on my part!

    I modified /usr/local/pkg/bandwidthd.inc (locally), and I now avoid the double processes … so one of my problems is resolved. But this also helped me to see the conditions to cause php-fpm to die - more below,

    If I go into /usr/local/etc/rc.d, I can start and stop bandwidthd (and with my change, I only get one copy of the process now). But I also found the conditions that cause php-fpm to die. If I do a Save from the GUI under bandwidthd (which updates /usr/local/etc/rc.d/bandwidthd.sh!) and then run bandwidthd.sh, telling it to stop -> php-fpm dies. If I then restart php-fpm (manually) ... I can start or stop bandwidthd as much as I want, and php-fpm never dies again. Only after another Save (and regeneration of bandwidthd.sh) does stopping bandwidthd (the first time) kill php-fpm. I also tried manually stopping bandwidthd by executing /usr/bin/killall bandwidthd ... and this definitely kills php-fpm too. Again, once I restart php-fpm, I can't kill it with starts or stops of bandwidthd ... until I do a Save from the GUI again.

    Does this make sense? It seems odd to me, but it is very repeatable.

    Thoughts?

    Thanks again for the help!



  • BTW (and here comes my stupid, uneducated question … :() - how can doing a killall to bandwidthd kill other processes (php-fpm, but also ntopng)? It really does seem to kill them - but I can't understand why. Figuring this out really is the root cause.
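
    One way to start narrowing this down, offered as an untested idea rather than a diagnosis: a signal can be delivered to a whole process group (this is what kill does when given pid 0 or a negative pid), so it may be worth checking whether bandwidthd shares a process group with php-fpm or ntopng after a Save. Nothing in this thread confirms that this is the cause. The sketch below only lists process groups that contain bandwidthd together with some other program; the helper name shared_groups is made up.

```shell
#!/bin/sh
# Read "pid pgid command..." lines on stdin (header allowed) and print each
# process group that holds bandwidthd plus at least one other program.
shared_groups() {
    awk '
        $1 !~ /^[0-9]+$/ { next }                       # skip the header line
        { n = split($3, parts, "/"); name = parts[n] }  # basename of the command
        name == "bandwidthd" { has_bw[$2] = 1; next }
        { others[$2] = others[$2] " " name }
        END {
            for (g in has_bw)
                if (g in others) print "pgid " g ":" others[g]
        }
    '
}

# Usage on the firewall (not run here):
#   ps -axo pid,pgid,command | shared_groups
```

    Running it once right after a Save and once after a manual php-fpm restart would show whether the grouping differs between the case that breaks and the case that does not.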

    Thanks!!!



  • My home system had 8 bandwidthd processes, for some unknown reason - I guess this is one of your problems with it starting twice under some conditions.
    I did the killall command by hand from the command line to see if that would also break php-fpm, and put "-v" so it would tell me what it thinks it killed:

    [2.2-RC][root@testoffice-rt-01.xxx]/usr/local/etc/rc.d: ps aux | grep bandwidthd
    root    17460   0.0  2.7 15748  6072  -  S     8:29AM    0:00.45 /var/bandwidthd/bandwidthd
    root    17871   0.0  2.6 15748  6004  -  S     8:29AM    0:00.33 /var/bandwidthd/bandwidthd
    root    18178   0.0  2.5 15748  5804  -  S     8:29AM    0:00.10 /var/bandwidthd/bandwidthd
    root    18334   0.0  2.5 15748  5808  -  S     8:29AM    0:00.05 /var/bandwidthd/bandwidthd
    root    18587   0.0  2.7 15748  6072  -  S     8:29AM    0:00.45 /var/bandwidthd/bandwidthd
    root    18876   0.0  2.6 15748  5988  -  S     8:29AM    0:00.33 /var/bandwidthd/bandwidthd
    root    18962   0.0  2.5 15748  5804  -  S     8:29AM    0:00.10 /var/bandwidthd/bandwidthd
    root    19024   0.0  2.5 15748  5808  -  S     8:29AM    0:00.06 /var/bandwidthd/bandwidthd
    root    77011   0.0  0.9 10396  1960  0  S+    8:48AM    0:00.01 grep bandwidthd
    [2.2-RC][root@testoffice-rt-01.xxx]/usr/local/etc/rc.d: /usr/bin/killall -v bandwidthd
    kill -TERM 19024
    kill -TERM 18962
    kill -TERM 18876
    kill -TERM 18587
    kill -TERM 18334
    kill -TERM 18178
    kill -TERM 17871
    kill -TERM 17460
    
    

    All worked as expected, and my php-fpm and webGUI still works.
    Then I did a few saves on the Bandwidthd webGUI page. No problem there either: 4 old processes go away, 4 new ones start.
    This system is using local bandwidthd data. I will try in a while with "Log data to a PostgreSQL database" option.



  • (which updates /usr/local/etc/rc.d/bandwidthd.sh!)

    Various scripts and conf files in pfSense are generated from the GUI and startup code, like this one. Once you discover exactly what needs to be changed in the script, we can change the PHP code to generate the script correctly.

