Intermittent loss of connectivity



  • I have one of the pfSense-certified Atom-based appliances with 6 gigabit ports from Netgate, currently running v2.1.2. It is a very nice setup and has worked well, but over the last few months I have been experiencing intermittent problems, and I wondered if anyone had advice for troubleshooting them further.

    The problem is that 10+ times a day, at irregular intervals, I lose access to the WAN for around 30-45 seconds. When I check things during one of these outages, I can successfully ping the firewall's main IP address, but I can't ping either the local or the remote side of the WAN connection, or anything beyond it. The WAN port goes directly to a gigabit port on my internet supplier's fiber interface (ONT) box. We have 75 megabit service over this fiber.

    I have worked with the ISP to see if they can detect any problems from their side, but they see no signs of these dropouts.  I have replaced the switch that is just downstream from the firewall, and the cable from the WAN port with no change in results.

    So at this point, I think the problem is either with the fiber link itself or with the firewall.

    I have checked the logs on pfSense and there is nothing unusual during these outages. The interface does not appear to go down; I just can't send traffic. The WAN link was previously forced to 100BASE-TX full duplex. I worked with the ISP to switch things over to autoselect (which selects gigabit full duplex), but that made no difference.

    Does anyone have any ideas for how I can troubleshoot this further, to prove whether the problem is with the ISP or with pfSense?

    Thanks
    Jeff



  • By pinging the "main IP", I presume you mean the WAN-side IP? If so, that would indicate everything inside your network is fine, and that the firewall is routing traffic fine, at least at a basic level.

    If you can get a packet capture on WAN during an outage, the resulting pcap would be telling. Set count to 0, leave everything else at defaults, and let it run for as long as you don't have connectivity. That might be a little difficult to catch: you don't want it running for long periods while things are working, or you'll gather a bunch of useless data and end up with a huge pcap file that's difficult to work with. Maybe be ready to click "Start" whenever you notice an outage, then click "Stop" once things start working again.

    From the troubleshooting you describe, that sounds very much like a general WAN-side network issue or an ISP issue, though the packet capture will be more telling. Are you sending traffic out that isn't getting a reply? Is it destined to the appropriate upstream MAC? Things along those lines.

    If you have commercial support on that system, open up a ticket with that capture for the most prompt assistance. Otherwise you can attach or link to it here, though be careful what pcaps you put out to the public. You can PM it to me or email it to cmb at pfsense dot org with a link to this thread if you aren't sure. You shouldn't be sending anything over the Internet in the clear that's sensitive in any way, especially if it's only catching traffic while your Internet isn't functioning properly, but "shouldn't" and "definitely aren't" aren't the same thing.
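
    For a first look at the capture before sharing it, the two questions above (is outbound traffic getting replies, and is it addressed to the expected upstream MAC) can be checked roughly with a short script. This is only a sketch under some assumptions: the downloaded file is in the classic libpcap format (not pcapng) with Ethernet framing, and `local_mac` is the MAC address of the firewall's WAN interface, which you have to supply yourself. The function name `summarize_pcap` is made up for this example.

```python
import struct
from collections import Counter


def _mac(raw: bytes) -> str:
    """Format 6 raw bytes as a lowercase colon-separated MAC address."""
    return ":".join(f"{b:02x}" for b in raw)


def summarize_pcap(data: bytes, local_mac: str):
    """Count frames sent by local_mac per destination MAC, and frames
    addressed to local_mac per source MAC.

    Assumes classic libpcap format with Ethernet link type (1).
    """
    magic = data[:4]
    if magic in (b"\xd4\xc3\xb2\xa1", b"\x4d\x3c\xb2\xa1"):
        endian = "<"   # little-endian file (microsecond or nanosecond variant)
    elif magic in (b"\xa1\xb2\xc3\xd4", b"\xa1\xb2\x3c\x4d"):
        endian = ">"   # big-endian file
    else:
        raise ValueError("not a classic pcap file (pcapng is not handled)")

    # The link type is the last 4 bytes of the 24-byte global header.
    (linktype,) = struct.unpack_from(endian + "I", data, 20)
    if linktype != 1:
        raise ValueError(f"expected Ethernet link type 1, got {linktype}")

    local_mac = local_mac.lower()
    sent_to = Counter()        # destination MAC -> frames we sent
    received_from = Counter()  # source MAC -> frames addressed to us

    offset = 24
    while offset + 16 <= len(data):
        # Per-record header: ts_sec, ts_frac, captured length, original length.
        _, _, incl_len, _ = struct.unpack_from(endian + "IIII", data, offset)
        offset += 16
        frame = data[offset:offset + incl_len]
        offset += incl_len
        if len(frame) < 14:
            continue  # too short to hold an Ethernet header
        dst, src = _mac(frame[0:6]), _mac(frame[6:12])
        if src == local_mac:
            sent_to[dst] += 1
        elif dst == local_mac:
            received_from[src] += 1
    return sent_to, received_from


# Example use (file name and MAC are placeholders):
# with open("packetcapture.cap", "rb") as fh:
#     sent, received = summarize_pcap(fh.read(), "00:11:22:33:44:55")
# print("sent to:", dict(sent))
# print("received from:", dict(received))
```

    If the capture taken during an outage shows plenty of frames sent to the upstream MAC and few or none received from it, that would point toward the WAN side rather than the firewall. Frames going to an unexpected MAC would be worth a closer look in Wireshark.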



  • That's great advice. I'll see if I can get a capture during an outage.

    Jeff
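
    Since the outages only last 30-45 seconds, one way to know when to start and stop the capture, and to get timestamps to compare with the ISP, is to ping several hops in a loop from a machine on the LAN and log every state change. The sketch below assumes a Unix-like `ping` that accepts `-c 1`; the target addresses in the usage comment are placeholders, and the function names are made up for this example.

```python
import subprocess
import time
from datetime import datetime


def ping_once(host: str, timeout_s: float = 2.0) -> bool:
    """Return True if a single ICMP echo to host is answered.

    Assumes a Unix-like ping that accepts "-c 1"; the overall wait is
    bounded by the subprocess timeout rather than a ping-specific flag.
    """
    try:
        result = subprocess.run(
            ["ping", "-c", "1", host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0


def outage_windows(samples):
    """Collapse (timestamp, ok) samples into (first_fail, last_fail) pairs,
    one pair per run of consecutive failures."""
    windows = []
    start = last_fail = None
    for ts, ok in samples:
        if not ok:
            if start is None:
                start = ts
            last_fail = ts
        elif start is not None:
            windows.append((start, last_fail))
            start = None
    if start is not None:
        windows.append((start, last_fail))
    return windows


def watch(targets, interval_s=1.0):
    """Loop forever, printing a line whenever a target changes state."""
    state = {}
    while True:
        for name, host in targets.items():
            ok = ping_once(host)
            if state.get(name) != ok:
                stamp = datetime.now().isoformat(timespec="seconds")
                status = "up" if ok else "DOWN"
                print(f"{stamp} {name} ({host}) {status}")
                state[name] = ok
        time.sleep(interval_s)


# Example use (all addresses are placeholders; substitute your own):
# watch({
#     "firewall LAN": "192.168.1.1",
#     "WAN gateway": "203.0.113.1",
#     "beyond ISP": "198.51.100.10",
# })
```

    Watching the LAN address, the ISP-side gateway, and something beyond it at the same time shows which hop stops answering first, and the "DOWN" lines give the moment to click "Start" on the capture.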