Suspect RRD Graph
Two days ago I moved our IPsec tunnels over to our pfSense box for testing. We run an offsite backup job each night, and it has been completing successfully, but I noticed something odd in the RRD traffic and packet graphs from the past two nights. The outbound spike in traffic shows up on both the WAN and Opt WAN interfaces. The IPsec tunnels in question are on the WAN interface, which, in addition to the outbound traffic, also shows the inbound traffic associated with the offsite backup.
At this point I'm unsure whether the issue is with the RRD graphs or whether the traffic is actually being sent out both interfaces. I'm going to try to do a packet capture on both WAN interfaces tonight to find out. Also worth noting is that the Opt WAN interface is the default gateway in our setup.
Attached are some images of the traffic and packet RRD graphs in question.
I wanted to add one more bit of detail. The WAN and Opt WAN interfaces are in a failover gateway group, with the Opt WAN in Tier 1 and the WAN in Tier 2.
OK, I finally had an opportunity to dig into this odd RRD graph behavior today, and there's definitely something going on here.
I did some packet captures today while the offsite backup was running and was not able to see any of the outbound traffic that the RRD graph shows on the Opt WAN interface. I've also started monitoring and collecting trend information from the pfSense box using Zenoss. The SNMP data and resulting graphs also show no such outbound traffic on the Opt WAN port.
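For anyone who wants to run the same cross-check by hand, here is a rough sketch of how an average rate can be worked out from two samples of an interface octet counter. It assumes the counter is the standard IF-MIB ifOutOctets (32-bit) or ifHCOutOctets (64-bit); I can't say which one Zenoss polls. The function name rate_bps and the sample numbers are made up for illustration and are not taken from the graphs in this thread.

```python
def rate_bps(prev_octets, curr_octets, interval_s, counter_bits=32):
    """Average bits per second between two octet-counter samples.

    counter_bits is 32 for ifOutOctets and 64 for ifHCOutOctets.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    modulus = 1 << counter_bits
    # Modular subtraction gives the right delta even if the counter
    # wrapped once between the two samples.
    delta = (curr_octets - prev_octets) % modulus
    return delta * 8 / interval_s


# Hypothetical samples taken 300 seconds apart.
print(rate_bps(1_000_000, 4_750_000, 300))  # 100000.0
```

If the rate computed this way for the Opt WAN port stays flat during the backup window while the built-in graph shows a spike, that points at the graphing side rather than at real traffic.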
Today I upgraded to the newest snapshot, and if the behavior is still there I'll be filing a bug tomorrow.
The problem is still there with the newest build. Here's a new image that shows the problem by comparing the output of the built-in RRD graphs with graphs generated from SNMP counters.
I've opened up a ticket for this issue which can be found at http://redmine.pfsense.org/issues/1395
Looks like the more recent builds of RC2 have fixed this problem. I don't upgrade my pfSense box that often, so it's hard to say when the problem was resolved, but judging by my historical RRD graphs it was in the first half of May.
I've had to reopen this bug, as the behavior has returned in recent builds. Luckily this is a minor issue, and the SNMP counters do return correct data.