RRD Graphs stop working



  • Hi all,

    I see a strange behaviour in the RRD Graphs section. The quality tab stopped recording/drawing data 2 days ago.
    This is quite critical because we rely on it to check ISP services at some locations.
    Any idea how to make it work again?

    thanks



  • Is your drive full?



  • Nope. Only 4% used. It was the first thing I checked.



  • any other ideas?



  • @nikkon:

    Any idea how to make it work again?

    Yep. Simple. Start dealing out a lot of info.
    Right now, I'm having a hard time trying to understand what doesn't work well on your system.

    @nikkon:

    I see a strange behaviour in the RRD Graphs section. The quality tab stopped recording/drawing data 2 days ago.

    Images are being generated from the 'rrd' files.
    Check the timestamps of all the 'png' images; they are here: /tmp
    These png images are created when you view the corresponding Status => RRD Graphs page.
    Check whether the RRD files are updated regularly; you can find them here: /var/db/rrd
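
    A quick way to spot a stale file from the shell is to list every RRD file that has not been modified in the last few minutes. This is only a sketch: the helper name stale_rrd and the 10-minute default are my own choices, not something pfSense ships.

```shell
# List *.rrd files in a directory that were last modified more than N minutes ago.
# stale_rrd is a hypothetical helper name; the 10-minute default is an assumption.
stale_rrd() {
    dir="${1:-/var/db/rrd}"
    minutes="${2:-10}"
    find "$dir" -type f -name '*.rrd' -mmin +"$minutes"
}
```

    Run it as: stale_rrd /var/db/rrd 10
    If every graph is being fed, I would expect it to print nothing; any file it does print has stopped receiving updates.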

    GUI test: The quality rrd chart is fed by a tool called 'apinger'. You can see it under Status => Services, where it's called "apinger Gateway Monitoring Daemon": is it running?
    CLI test: Log in over SSH and run

    ps ax | grep 'apinger'
    
    

    is it running ?

    Check this file : /var/etc/apinger.conf
    Does it look good ?

    Check this file /var/run/apinger.status
    Is it updated regularly ?

    Last, but not least: what is your setup page "System => Gateways" ?

    @nikkon:

    This is quite critical because we rely on it to check ISP services at some locations.

    Doesn't restarting apinger do the job?
    Do the logs say anything about apinger being stopped?

    Understand that this one is running for years on my pfsense box, and I'm pretty sure I'm not the only one.
    Your question boils down to: what is so different with your setup ?



  • Also, when you did your update did you change from 32 bit to 64 bit or the other way around?  Any changes like that?



  • Hummmm, that question would open this door : https://doc.pfsense.org/index.php/Upgrade_Guide#Changing_architecture_.2832-bit_to_64-bit_or_vice_versa.29_during_upgrade
    You think nikkon would hide that kind of information  ???

    I presume: it was working for days, weeks, and with no one touching the box, ping graphing stopped.



  • space on system:
    df -h
    Filesystem                   Size    Used   Avail Capacity  Mounted on
    /dev/ufsid/55104bbfa2fe12f2   29G    1.1G     26G     4%    /
    devfs                        1.0K    1.0K      0B   100%    /dev
    /dev/md0                     3.4M    188K    2.9M     6%    /var/run
    devfs                        1.0K    1.0K      0B   100%    /var/dhcpd/dev
    procfs                       4.0K    4.0K      0B   100%    /proc

    --------------------

    ps ax | grep 'apinger'
    69052  -  Ss      1:27.05 /usr/local/sbin/apinger -c /var/etc/apinger.conf
    3911  0  S+      0:00.00 grep apinger

    ========================

    /var/etc/apinger.conf looks good.

    cat /var/run/apinger.status
    8.8.8.8|5.12.101.58|WAN_PPPOE|33690|33676|1428046888|16.352ms|0.0%|none
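
    To make that line easier to read, the pipe-separated fields can be labelled with awk. This is a rough sketch: the labels are my guess from how the values look (monitored IP, source IP, gateway name, a Unix timestamp, delay, loss, alarm state), and show_status is just a name I made up.

```shell
# Split each apinger.status line into labelled fields.
# show_status is a hypothetical helper; the field labels are assumptions
# inferred from the values, not taken from apinger documentation.
show_status() {
    awk -F'|' '{
        printf "monitor=%s source=%s gateway=%s last=%s delay=%s loss=%s alarm=%s\n",
               $1, $2, $3, $6, $7, $8, $9
    }' "${1:-/var/run/apinger.status}"
}
```

    Running show_status twice a few seconds apart shows whether the timestamp, delay and loss values are still changing.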

    Because my ISP uses PPPoE, I only get gateways like 10.0.0.1, so I have added 8.8.8.8 in the gateway section as the Monitored IP.

    The files under /var/db/rrd seem to have been accessed today:
    ]/var/db/rrd: ls -la
    total 5104
    drwxr-xr-x  2 nobody  wheel    512 Apr  3 00:56 .
    drwxr-xr-x  18 root    wheel    1024 Apr  3 09:56 ..
    -rw-r--r--  1 nobody  wheel  98784 Mar 30 12:13 WAN_PPPOE-quality.rrd
    -rw-r--r--  1 nobody  wheel  393168 Apr  3 10:42 ipsec-packets.rrd
    -rw-r--r--  1 nobody  wheel  393168 Apr  3 10:42 ipsec-traffic.rrd
    -rw-r--r--  1 nobody  wheel  393168 Apr  3 10:42 lan-packets.rrd
    -rw-r--r--  1 nobody  wheel  393168 Apr  3 10:42 lan-traffic.rrd
    -rw-r--r--  1 nobody  wheel  393168 Apr  3 10:42 ovpns1-packets.rrd
    -rw-r--r--  1 nobody  wheel  393168 Apr  3 10:42 ovpns1-traffic.rrd
    -rw-r--r--  1 nobody  wheel  49720 Apr  3 10:42 ovpns1-vpnusers.rrd
    -rw-r--r--  1 nobody  wheel  588592 Apr  3 10:42 system-mbuf.rrd
    -rw-r--r--  1 nobody  wheel  735320 Apr  3 10:42 system-memory.rrd
    -rw-r--r--  1 nobody  wheel  245976 Apr  3 10:42 system-processor.rrd
    -rw-r--r--  1 nobody  wheel  245976 Apr  3 10:42 system-states.rrd
    -rw-r--r--  1 root    wheel    5955 Apr  3 00:56 updaterrd.sh
    -rw-r--r--  1 nobody  wheel  393168 Apr  3 10:42 wan-packets.rrd
    -rw-r--r--  1 nobody  wheel  393168 Apr  3 10:42 wan-traffic.rrd


    Restarting apinger didn't change anything.



  • @Gertjan:

    Hummmm, that question would open this door : https://doc.pfsense.org/index.php/Upgrade_Guide#Changing_architecture_.2832-bit_to_64-bit_or_vice_versa.29_during_upgrade
    You think nikkon would hide that kind of information  ???

    I presume: it was working for days, weeks, and with no one touching the box, ping graphing stopped.

    No updates… all boxes are fresh 2.2.1 installs.
    On all boxes I use Watchdog... so every process will be restarted if it fails.



  • I'd do another one of those fresh installs - Wipe the drive.

    Also be sure that Gateway monitoring is on.



  • OK, it seems to work now after 3 restarts of apinger.
    Strange.
    Thanks for helping.



  • That is strange.  Glad it was nothing major.



  • @nikkon:

    cat /var/run/apinger.status
    8.8.8.8|5.12.101.58|WAN_PPPOE|33690|33676|1428046888|16.352ms|0.0%|none

    Because my ISP uses PPPoE, I only get gateways like 10.0.0.1, so I have added 8.8.8.8 in the gateway section as the Monitored IP.

    Your "8.8.8.8" is considered as the gateway.
    Check Status => Interfaces: what is "Gateway IPv4" ?
    Are you sure that "8.8.8.8" always replies to a ping? (This is a main Google DNS server, google-public-dns-a.google.com.) I wouldn't consider "8.8.8.8" a reliable ping responder. It's a very heavily loaded DNS server, and in case of overload it starts to ditch less important protocols, like ICMP.

    When you say : "10.0.0.1" you are talking about your WAN IP gateway, right ? So you have a router-behind-router setup ?
    If possible, hook up pfSense directly to the net.



  • I use 8.8.8.8 as the monitored IP in order to have apinger working… if I use the ISP gateway allocated via PPPoE, it is always shown as down, even when I have connectivity.
    This can't be changed. It's on the ISP side.
    Yes, 10.0.0.1 is my WAN gateway.



  • I would reduce my pings from every 1 second to every 5 or ten seconds in that case.

    It's just as useful for you, and your packets are less likely to be dropped as a nuisance.

    I did the same and my apinger started behaving better.
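
    To see what interval is currently in effect, you can grep the generated config. A small sketch, assuming the file has an "interval" line in its target block the way stock apinger configs do; show_interval is just a name I picked.

```shell
# Print the probe interval line(s) from an apinger configuration file.
# show_interval is a hypothetical helper; the default path is the one
# mentioned earlier in this thread.
show_interval() {
    grep -E '^[[:space:]]*interval[[:space:]]' "${1:-/var/etc/apinger.conf}"
}
```

    As far as I know that file is regenerated by pfSense, so I would change the value in the gateway's advanced settings in the GUI rather than by editing the file, and use this only to confirm the result.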



  • @kejianshi:

    I would reduce my pings from every 1 second to every 5 or ten seconds in that case.

    It's just as useful for you, and your packets are less likely to be dropped as a nuisance.

    I did the same and my apinger started behaving better.

    Right, thanks.

