How to troubleshoot - lost packets.



  • I have a VOIP phone. When I'm talking on it, I get periods of 1 or 2 seconds where the audio drops. The connection remains, but there is briefly no audio.

    Could be anything, the phone, VOIP server, Internet connection.... whatever.

    My kid has also been complaining about 'server disconnected' error when playing online games. It just glitches for 1 or 2 seconds. Could be anything, remote server, internet connection.... whatever.

    I also have CCTV camera monitoring of a remote site, so it relies on the Internet.

    I was talking on the VOIP phone, watching the CCTV camera. I noticed when the VOIP phone lost audio, the CCTV monitors also froze. Image just pauses for 1 or 2 seconds and resumes.

    Hmmm...

    So I started a constant PING from my Windows desktop to the firewall.

    The ping is always < 1ms

    Then the pieces started falling into place. When the VOIP phone loses audio, or my kid screams about his game I see a ping timeout to the firewall. Usually the next 1 or 2 ping times will be > 500ms.

    ping.gif

    It looks like there is something wrong at the pfSense firewall side of things.

    There is essentially no load on the firewall, load average tends to be something like 0.12, 0.21, 0.20 so certainly not overworked.

    The hardware is PC Engines APU2 running 2.4.5-RELEASE

    Anyone have a suggestion how to debug the pfSense side of things? Any logs I should be watching on the pfSense box?
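
    One way to capture this kind of evidence is to timestamp only the bad replies from a continuous ping. The following is a minimal Python sketch, not something from the original post. It assumes the standard English Windows `ping` output ("Request timed out.", "time=512ms", "time<1ms"); the function names and the 500 ms threshold are illustrative choices based on the symptoms described above.

```python
import re
from datetime import datetime

# Matches "time=512ms" and "time<1ms" in Windows ping replies (assumed format).
REPLY_RE = re.compile(r"time[=<](\d+)ms")


def classify_ping_line(line, slow_ms=500):
    """Return 'timeout', 'slow', 'ok', or None for non-reply lines."""
    if "Request timed out" in line:
        return "timeout"
    match = REPLY_RE.search(line)
    if match is None:
        return None
    return "slow" if int(match.group(1)) > slow_ms else "ok"


def stamp_events(lines, now=datetime.now):
    """Yield (HH:MM:SS, kind, text) for timeouts and slow replies only."""
    for line in lines:
        kind = classify_ping_line(line)
        if kind in ("timeout", "slow"):
            yield now().strftime("%H:%M:%S"), kind, line.strip()
```

    As an assumed usage, the output of `ping -t` aimed at the firewall could be piped into a small loop that calls `stamp_events(sys.stdin)` and prints each tuple, which gives a list of drop times to compare against the clock.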



  • Does it occur often and at the same time interval?
    It is not necessarily pfSense that is causing the problem. The easiest way to find out is to start with a single PC connected directly to your APU2 and see if that eliminates the problem.



  • @bobbenheim Thanks for your reply. I had actually started timing the packet loss shortly after my post.

    Found something interesting for sure.

    It happens every 15 minutes, starting from the top of the hour. So 8:00, 8:15, 8:30, 8:45, 9:00 will all experience the momentary packet loss.

    My guess, something is happening every 15 minutes on my pfSense install.... now just to figure out what it is.
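
    A quick way to confirm that recorded drop times really sit on quarter-hour marks, rather than just looking that way, is to measure how far each one is from the nearest boundary. This is a hedged sketch: the function names and the 30-second tolerance are assumptions for illustration, and it only makes sense for periods that divide an hour evenly.

```python
from datetime import datetime


def seconds_from_boundary(ts, period_min=15):
    """Distance in seconds from ts to the nearest multiple of period_min
    past the hour. Assumes period_min divides 60 evenly."""
    period = period_min * 60
    offset = (ts.minute * 60 + ts.second) % period
    return min(offset, period - offset)


def all_on_boundary(timestamps, period_min=15, tolerance_s=30):
    """True if there is at least one timestamp and every one of them falls
    within tolerance_s seconds of a period boundary."""
    timestamps = list(timestamps)
    return bool(timestamps) and all(
        seconds_from_boundary(t, period_min) <= tolerance_s for t in timestamps
    )
```

    If every drop lands within a few seconds of :00, :15, :30, or :45, a scheduled job on the firewall becomes a much more likely suspect than random line noise.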



  • @geyser said in How to troubleshoot - lost packets.:

    @bobbenheim Thanks for your reply. I had actually started timing the packet loss shortly after my post.

    Found something interesting for sure.

    It happens every 15 minutes, starting from the top of the hour. So 8:00, 8:15, 8:30, 8:45, 9:00 will all experience the momentary packet loss.

    My guess, something is happening every 15 minutes on my pfSense install.... now just to figure out what it is.

    Do you have any additional packages installed?



  • I have more clues!

    I have been watching the CPU usage from the dashboard. It hovers around 8%, then right at the 15 minute mark it spikes up to 97% and the packets drop.

    So, it is for sure something going on inside this box and not something else on the network or the Internet connection.

    I have NO packages installed on this box, 100% pfSense only code. The CPU spikes but I do not see any traffic spike in the traffic graphs.

    Any ideas what tasks pfSense does every 15 minutes?



  • If it is every 15 minutes to the second, that sounds like a cron task of some sort. But off the top of my head I'm not aware of any default one that runs on that frequency.

    Look in your pfSense system log and see what, if anything, is being logged there that may be related. One possibility is your ISP DHCP lease on the WAN renewing, but every 15 minutes would be an extremely abnormal interval for that. Plus your "downtimes" don't seem long enough to cover a DHCP release/renew cycle.

    There are other posts here with similar issues of high intermittent latency and CPU spikes. Many of those folks (but not all) are running virtualized on Hyper-V. But some bare-metal users are reporting the same thing. Go read through some of the posts in the Installs and Upgrades sub-forum here.
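
    To follow up on the cron idea, the system crontab (on FreeBSD-based systems this is normally /etc/crontab, with a user column before the command) can be scanned for entries that fire exactly at :00, :15, :30, and :45. The sketch below is only an illustration: the function names are made up, the minute-field parser handles just the common forms (`*`, `*/n`, `a-b`, comma lists), and it does not reflect what any particular pfSense install actually contains.

```python
QUARTER_HOUR = {0, 15, 30, 45}


def expand_minutes(field):
    """Expand a cron minute field such as '*/15' or '0,15,30,45' into a set."""
    minutes = set()
    for part in field.split(","):
        rng, _, step = part.partition("/")
        step = int(step) if step else 1
        if rng == "*":
            lo, hi = 0, 59
        elif "-" in rng:
            lo, hi = (int(x) for x in rng.split("-"))
        else:
            lo = hi = int(rng)
        minutes.update(range(lo, hi + 1, step))
    return minutes


def quarter_hour_jobs(crontab_text):
    """Return commands from a system crontab (minute hour mday month wday
    who command) that run every hour at exactly :00, :15, :30 and :45."""
    jobs = []
    for line in crontab_text.splitlines():
        fields = line.strip().split(None, 6)
        # Skip blanks, comments, and variable assignments (fewer than 7 fields).
        if len(fields) < 7 or fields[0].startswith("#"):
            continue
        try:
            minutes = expand_minutes(fields[0])
        except ValueError:
            continue  # minute field in a form this sketch does not parse
        if minutes == QUARTER_HOUR and fields[1] == "*":
            jobs.append(fields[6])
    return jobs
```

    Feeding it the text of the crontab file would return only the commands on a 15-minute schedule, which narrows down what to look at when the spike hits.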



  • @bmeeks Found something.

    Under system logs > DNS resolver

    It is running a process every 15 minutes, exactly when the spikes are occurring.

    The process is labeled filterdns and it seems to be doing something with IP addresses I have defined under the IP aliases section of pfSense.

    I'm not aware of anything I have created called filterdns. Is that something built in?



  • @geyser said in How to troubleshoot - lost packets.:

    @bmeeks Found something.

    Under system logs > DNS resolver

    It is running a process every 15 minutes, exactly when the spikes are occurring.

    The process is labeled filterdns and it seems to be doing something with IP addresses I have defined under the IP aliases section of pfSense.

    I'm not aware of anything I have created called filterdns. Is that something built in?

    filterdns is a built-in pfSense service. It runs to resolve the IP address of any FQDN (fully qualified domain name) you might specify in an alias. So what do you have defined for FQDN aliases? Anything that might produce a lot of IP addresses for filterdns to work with?

    Sounds like you are hitting some of the issues some other folks are complaining about in the 2.4.5 release. If you have some custom FQDN aliases defined, try turning them off for a test.



  • @bmeeks I have four different alias groups, with a total of 21 entries between them.

    Of the 21 entries 19 are just IP addresses and two are hostnames that would require resolution.

    Of the two that require resolution, one resolves to a single IP address and the other contains a list of about 30 IP addresses.

    I'll have to mess around with those and see what is triggering it.

    I think you may be correct, I upgraded to this version of pfSense 11 days ago and never noticed this problem before that.



  • Can you try this and see if it makes a difference? It has apparently fixed the problem for others with a similar issue.


  • LAYER 8 Global Moderator

    Pretty sure aliases update every 300 seconds (5 minutes) by default, not 15. So some edit would have had to be made if you're saying it is running every 15 minutes.

    Where are you seeing that filterdns would run every 15 minutes exactly?

    Example - here is mine, shows it would run every 300 seconds (5 minutes)

    [2.4.5-RELEASE][admin@sg4860.local.lan]/: ps x | grep filterdns
    10747  -  Is       0:00.00 /usr/local/sbin/filterdns -p /var/run/filterdns.pid -i 300 -c /var/etc/filterdns.conf -d 1
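
    For anyone comparing several boxes, the interval can be pulled out of a `ps` line like the one above programmatically. This is a small sketch with a made-up function name; it assumes the interval is passed as `-i <seconds>` exactly as in that output and does nothing beyond reading that value.

```python
import shlex


def filterdns_interval(ps_line):
    """Return the value following '-i' on a filterdns command line as an
    int, or None if the line is not a filterdns process or has no '-i'."""
    args = shlex.split(ps_line)
    if not any(a.endswith("filterdns") for a in args):
        return None
    for flag, value in zip(args, args[1:]):
        if flag == "-i" and value.isdigit():
            return int(value)
    return None
```

    A result of 300 matches the 5-minute default described here; a value of 900 would be needed to explain a 15-minute cycle.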
    


  • Another update :-(

    I removed ALL aliases and all rules that used them.

    Top of the hour came around, same issue. CPU spike and packet loss.

    However, there is no longer any entry under the system logs > DNS resolver

    Which I guess is to be expected, because I removed all aliases.

    Apparently my issue was not related to the aliases.


  • LAYER 8 Global Moderator

    @geyser said in How to troubleshoot - lost packets.:

    there is no longer any entry under the system logs > DNS resolver

    What was this entry exactly?



  • Well, @johnpoz is correct. The filterdns cron task should normally run every 5 minutes and not every 15 minutes.

    Also, the amount of alias "work" you mentioned should be trivial. The other folks having issues had very large alias lists.



  • @johnpoz The system log was showing stuff like this:

    Apr 8 22:30:54 filterdns Adding Action: pf table: sample_network host: 123.145.151.118
    Apr 8 22:30:54 filterdns Adding Action: pf table: sample_network host: 123.145.151.117
    Apr 8 22:30:54 filterdns Adding Action: pf table: sample_network host: 123.145.151.116
    Apr 8 22:30:54 filterdns Adding Action: pf table: sample_network host: 123.145.151.115
    Apr 8 22:30:54 filterdns Adding Action: pf table: sample_network host: 123.145.151.114
    Apr 8 22:30:54 filterdns Adding Action: pf table: sample_network host: dyn.example.com
    Apr 8 22:30:54 filterdns merge_config: configuration reload

    I was attracted to it, because it runs every 15 minutes. After removing the aliases these entries are gone but the problem persists.
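
    The "every 15 minutes" observation can be checked against the log itself by measuring the gaps between the `configuration reload` entries. The following is a rough sketch under stated assumptions: the function names are illustrative, the lines are assumed to start with a "Mon D HH:MM:SS" timestamp as in the excerpt above, and because those lines carry no year, one is supplied as a parameter (2020 here is a guess).

```python
from datetime import datetime


def reload_times(log_lines, year=2020):
    """Collect timestamps of filterdns 'configuration reload' entries.
    The year is an assumption because the log lines do not include one."""
    times = []
    for line in log_lines:
        if "filterdns" in line and "configuration reload" in line:
            month, day, clock = line.split()[:3]
            times.append(
                datetime.strptime(f"{year} {month} {day} {clock}", "%Y %b %d %H:%M:%S")
            )
    return times


def intervals_minutes(times):
    """Gaps between consecutive timestamps, rounded to whole minutes."""
    return [round((b - a).total_seconds() / 60) for a, b in zip(times, times[1:])]
```

    A steady run of 15s from `intervals_minutes` would back up the timing; a run of 5s would suggest the reloads happen more often and only some of them coincide with the drops.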


  • LAYER 8 Global Moderator

    There is nothing in your system log?

    So you're sure it's pfSense, and that it could not be the infrastructure leading to pfSense? Can you plug something directly into a pfSense port and get a running ping going?



  • @johnpoz Nope, nothing with any frequency.

    Under general there are normal looking entries, like dyndns triggering etc. Nothing with any 15 minute frequency.

    Under gateways, no entry for days.

    Routing, nothing at all.

    None of the system log tabs contain anything that occurs every 15 minutes, as far as I can find.



  • When in doubt... roll back.

    I could not easily find a previous version of pfSense, seems they don't like to allow those.

    I got out my old APU box (vs the APU2) which is running build 2.3.5-RELEASE-p2 and it works perfectly. No packet-loss or issues at all.



  • I don't know if it helps, but we have the same symptoms here. They showed up recently, and I updated to 2.4.5 not too long ago (and updated packages more recently). The system setup is surely very different, but for us it seems to happen only when pfblockerNG-devel is enabled; I can in fact reproduce it at will by doing a Status > Filter Reload. With pfblockerNG-devel disabled, the manual filter reload does not cause the issue, and the every-15-minute issue is gone as well.

    Perhaps it's reproducible for someone else who can track the issue down as well.



  • I am having exactly the same symptoms. I was not sure what was going on, whether it was my machine, my network, or the firewall. I used PingPlotter to determine that it was the pfSense firewall, including its LAN interface, which was dropping packets for a few seconds at regular(ish) intervals. It has been causing havoc while I work from home: VoIP calls dropping, VPN disconnecting.

    I removed all unneeded packages from my install, which runs version 2.4.5, leaving only snort security 3.2.9.11 and nmap security 1.4.4_1. Still happening. I disabled snort on my internal interface; it still happens. I checked all of the system logs, but nothing correlates with the drops. I replaced the cabling, although I could not see any CRC or interface errors on my switch ports. Nothing has fixed it yet.

    Would LOVE for this to be rectified, as I am considering moving to another device if I can't get this fixed :-(



  • I'm wondering if this is a driver issue. I'm very new to Linux, I'm afraid. Is there any way I can tell which drivers I have installed?



  • @1-21Giggawatyts

    PfSense runs on FreeBSD, not Linux.



  • I had the exact same issue, with packet loss every 15 minutes. I solved it by no longer using a firewall schedule.

    What gave it away was this in the config history
    (system): Removed 15 minute filter reload for Time Based Rules

    So I removed the Firewall schedule from the firewall rule and everything is back to normal again.
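
    To check a config backup for rules that still reference a schedule, something like the following could be run against the exported XML. Treat it as a sketch built on assumptions: it assumes rules live under `<filter><rule>` and point at a schedule through a `<sched>` element with a `<descr>` description, which should be verified against an actual backup file, and the function name is made up.

```python
import xml.etree.ElementTree as ET


def rules_with_schedule(config_xml):
    """Return (description, schedule name) pairs for firewall rules that
    reference a schedule. Element names are assumptions about the backup
    layout, not a documented schema."""
    root = ET.fromstring(config_xml)
    found = []
    for rule in root.findall("./filter/rule"):
        sched = (rule.findtext("sched") or "").strip()
        if sched:
            descr = (rule.findtext("descr") or "").strip() or "(no description)"
            found.append((descr, sched))
    return found
```

    An empty result would suggest that time-based rules are not what is triggering the 15-minute reload on that particular box.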



  • @JKnott I had a look, but I have no firewall schedules defined. I'm at risk of hijacking this thread, so rather than take over from the OP's questions and answers, I will start another thread. I just wanted to say that I am also having very similar issues.



  • Thought I would post an update. It has been months since I started this thread, with our friend COVID in town and everyone online, bringing the Internet down makes me very unpopular.

    Back on April 10, I rolled back to my old hardware. It was running v2.3.5-RELEASE-p2. From April 10 until today (June 28) it has worked perfectly. Never a glitch.

    Today the family went out, so I took the opportunity to try switching hardware again.

    I made a fresh backup of my v2.3.5-RELEASE-p2 that was running.

    I see there has been a new release of pfSense, so I downloaded v2.4.5-RELEASE-p1 and installed it on my new box.

    Restored the backup and everything is running perfectly. No packet loss every 15 minutes.

    I looked at the release notes for 2.4.5-p1 and don't see anything that jumps out at me.

    Guess I will never know what the true cause of the problem was, but I'm glad to have my new hardware back in place without any packet loss, as my old hardware was limiting my connection speeds.

