Intermittent packet loss on WAN - can’t tell if it’s ISP or pfSense



  • I recently took over managing our computer network at work. I’m not a network engineer or anything, just volunteering because I’m the closest thing they’ve got to an IT person. We have a static IP address through Charter Business and use their bridged modem. We have a 100/5 Mbps connection and we barely average half of that bandwidth.

    http://imgur.com/v5wlosu

    About a month ago our modem seemed to be failing, it needed to be power cycled constantly. We had an aging RADIUS server and ASUS routers running the network, so before I had Charter come out to replace the modem, I wanted to have a newer setup. A new Netgate SG-2440 will do the trick – it can handle my subnets, captive portal and freeRADIUS all-in-one.

    The network consists of one static WAN interface to Charter, a LAN interface (192.168.1.0/24) for PCs and printers in the office, and a WIFI interface (192.168.10.0/24) for a network of wireless access points around the building. Employees are given usernames and passwords that they use to connect to the network via freeRADIUS/Captive Portal.

    After setting everything up with the new modem (Hitron CGNM-2250), it worked great, until we started getting really bad intermittent packet loss and high latency on both subnets. Everything seems to be fine inside the network, but the internet connection becomes very spotty. I’ve run ping tests between the subnets with no issues, and the gateway seems to ping just fine as well. When I run a traceroute, the timeouts and ping spikes show up after our gateway IP hop. I switched my gateway monitor IP to 8.8.8.8 and that’s when I really got a better picture of our packet loss.
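    The comparison described above (clean pings to the gateway, loss further out) can be sketched in a few lines of Python. This is only an illustration: the function names, the 1% tolerance, and the sample summary lines are assumptions, not anything taken from pfSense or from the actual tests.

```python
import re

# Matches the summary line printed by common ping implementations
# (Linux iputils and FreeBSD both include "<n>% packet loss").
_LOSS_RE = re.compile(r"(\d+(?:\.\d+)?)% packet loss")


def parse_loss_percent(ping_output: str) -> float:
    """Return the packet-loss percentage found in a ping summary."""
    match = _LOSS_RE.search(ping_output)
    if match is None:
        raise ValueError("no packet-loss summary found in ping output")
    return float(match.group(1))


def locate_loss(gateway_loss: float, remote_loss: float, tolerance: float = 1.0) -> str:
    """Rough guess at where loss starts, from loss to the gateway and to a remote host.

    The tolerance (in percent) is an arbitrary assumption for this sketch.
    """
    if gateway_loss > tolerance:
        return "loss already present at the first hop (modem, cabling, or ISP gateway)"
    if remote_loss > tolerance:
        return "loss appears beyond the gateway (upstream of the first hop)"
    return "no significant loss on either target"


if __name__ == "__main__":
    # Hypothetical sample summaries; real ones would come from something
    # like `ping -c 100 <target>` run against the gateway and against 8.8.8.8.
    gateway_sample = "100 packets transmitted, 100 packets received, 0.0% packet loss"
    remote_sample = "100 packets transmitted, 82 packets received, 18.0% packet loss"
    print(locate_loss(parse_loss_percent(gateway_sample),
                      parse_loss_percent(remote_sample)))
```

    A clean first hop with loss to a remote target only narrows things down; it does not prove where upstream the fault is.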

    Just to be sure, I tried a couple of other routers. I had the new router hand out DHCP to the office LAN and then hooked up pfSense behind said router via a static IP to run the WIFI, let it run for a while, and the same packet loss was occurring again on both the office LAN and pfSense interfaces. I figured this had to be a faulty modem, so I had Charter come replace the modem AGAIN with an SMC Networks D3G1604W, and I’m back to square one: intermittent packet loss.

    Last 8 hours:
    http://imgur.com/g22BT7D

    Last two days
    http://imgur.com/8lWSlE2

    I’ve been on and off the phone with Charter and they seem to think everything looks fine when they run their tests, but nothing seems to be working. Tomorrow I am having them out to test the lines into the building. In the meantime, it would be awesome if any of you could take a look at my setup and let me know if you see any mistake I may have made that could be causing this headache. Thanks in advance!

    Modem-->pfSense-->LAN Switch
                   -->WIFI Switch

    LAN switch connects office PCs and printers
    WIFI switch goes out to a few other POE switches around the building that connect APs

    pfSense 2.3.3-RELEASE-p1  SETTINGS

    Admin Access
    Default settings except using HTTPS

    Firewall & NAT



    Firewall Optimization Options [Normal]


    Firewall Adaptive Timeouts [Default Settings]
    Firewall Maximum States [405000]
    Firewall Maximum Table Entries [200000]
    Firewall Maximum Fragment Entries [default]
    [ ]Static route filtering
    [ ]Disable Auto-added VPN rules
    [ ]Disable reply-to
    [ ]Disable Negate rules
    Aliases Hostnames Resolve Interval [300]
    [ ]Check certificate of aliases URLs
    Update Bogon Networks Frequency [Monthly]
    NAT Reflection mode for port forwards [disabled]
    Reflection Timeout [ ]
    [ ]Enable NAT Reflection for 1:1 NAT
    [ ]Enable automatic outbound NAT for Reflection

    Networking
    [x]Allow IPv6
    [ ]IPv6 over IPv4
    [ ]Prefer IPv4 over IPv6
    [ ]Device polling
    [ ]Hardware Checksum Offloading
    [x]Hardware TCP Segmentation Offloading
    [x]Hardware Large Receive Offloading
    [ ]ARP Handling

    Miscellaneous
    Nothing changed here, all default

    General Setup
    DNS Servers (Provided by Charter)
    86.75.309.1 [none]
    86.75.309.2 [none]


    [ ]Disable DNS Forwarder

    Routing->Gateways->CharterGateway


    Interface [WAN]
    Address Family [IPv4]
    Name [CharterGateway]
    Gateway [12.345.678.998]
    [x]Default Gateway
    [ ]Disable Gateway Monitoring
    [x]Disable Gateway Monitoring Action
    Monitor IP [8.8.8.8]
    [ ]Mark Gateway as Down

    Weight [1]
    Data Payload [0]
    Latency Thresholds [200]-[300]
    Packet Loss thresholds [20]-[50]
    Probe Interval [2000]
    Loss Interval [8000]
    Time Period [60000]
    Alert Interval [8000]
    [ ]Use non-local gateway through interface specific route.
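    To put the monitoring numbers above in perspective, here is a rough back-of-the-envelope sketch. It assumes loss is averaged over roughly Time Period / Probe Interval probes and that the lower threshold means "warning" and the upper one "down"; the exact window dpinger uses and whether the comparison is inclusive are not confirmed here, and the function names are made up for illustration.

```python
# Values copied from the gateway settings listed above.
PROBE_INTERVAL_MS = 2000
TIME_PERIOD_MS = 60000
LATENCY_THRESHOLDS_MS = (200, 300)   # (warning, down)
LOSS_THRESHOLDS_PCT = (20, 50)       # (warning, down)


def probes_per_window(time_period_ms: int, probe_interval_ms: int) -> int:
    """Approximate number of probes that contribute to one averaged result."""
    return time_period_ms // probe_interval_ms


def lost_probes_to_reach(loss_pct: int, probes: int) -> int:
    """Smallest number of lost probes giving at least loss_pct percent loss."""
    # Integer ceiling division avoids floating-point rounding surprises.
    return (loss_pct * probes + 99) // 100


def classify(latency_ms: float, loss_pct: float) -> str:
    """Classify a sample against the thresholds (inclusive comparison assumed)."""
    if latency_ms >= LATENCY_THRESHOLDS_MS[1] or loss_pct >= LOSS_THRESHOLDS_PCT[1]:
        return "down"
    if latency_ms >= LATENCY_THRESHOLDS_MS[0] or loss_pct >= LOSS_THRESHOLDS_PCT[0]:
        return "warning"
    return "ok"


if __name__ == "__main__":
    window = probes_per_window(TIME_PERIOD_MS, PROBE_INTERVAL_MS)
    print("probes per window:", window)
    for pct in LOSS_THRESHOLDS_PCT:
        print(f"{pct}% loss needs about {lost_probes_to_reach(pct, window)} lost probes")
```

    With a 2000 ms probe interval each lost probe is worth several percent of the window, so the graph is fairly coarse. Since "Disable Gateway Monitoring Action" is checked, crossing a threshold here would only be reported, not acted on.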

    Interfaces
    WAN
    [x]Enable
    IPv4 Configuration Type [Static IPv4]
    IPv6 Configuration Type [None]
    MAC Address [blank]
    MTU [blank]
    MSS [blank]
    Speed and Duplex [default]
    IPv4 Address [12.345.678.999] [/30]
    IPv4 Upstream gateway [CharterGateway-12.345.678.998]
    [x]Block private networks and loopback addresses
    [x]Block bogon networks

    LAN
    [x]Enable
    IPv4 Configuration Type [Static IPv4]
    IPv6 Configuration Type [None]
    MAC Address [blank]
    MTU [blank]
    MSS [blank]
    Speed and Duplex [default]
    IPv4 Address [192.168.1.1] [/24]
    IPv4 Upstream gateway [None]
    [ ]Block private networks and loopback addresses
    [ ]Block bogon networks

    WIFI
    [x]Enable
    IPv4 Configuration Type [Static IPv4]
    IPv6 Configuration Type [None]
    MAC Address [blank]
    MTU [blank]
    MSS [blank]
    Speed and Duplex [default]
    IPv4 Address [192.168.10.1] [/24]
    IPv4 Upstream gateway [None]
    [ ]Block private networks and loopback addresses
    [ ]Block bogon networks

    Firewall Rules
    WAN
    BLOCK RFC 1918 networks
    BLOCK Reserved Not assigned by IANA
    Pass IPv4 TCP/UDP
    Pass UDP 1194 (OpenVPN)

    LAN
    Pass 443/80/22 (Anti-Lockout Rule)
    Pass IPv4

    WIFI
    Pass IPv4

    Captive Portal Configuration
    [x]Enable Captive Portal
    Interfaces [WIFI]
    Maximum concurrent connections [blank]
    Idle timeout (Minutes) [480]
    Hard timeout (Minutes) [481]
    Pass-through credits per MAC address. [blank]
    Waiting period to restore pass-through credits. (Hours) [blank]
    [ ]Reset waiting period
    [ ]Logout popup window
    Pre-authentication redirect URL [blank]
    After authentication Redirection URL [blank]
    Blocked MAC address redirect URL[blank]
    [ ]Disable Concurrent user logins
    [ ]Disable MAC filtering
    [x]Enable Pass-through MAC automatic additions
    [ ]Enable Pass-through MAC automatic addition with username
    [x]Enable per-user bandwidth restriction
    Default download (Kbit/s) [10967]
    Default upload (Kbit/s) [2500]
    Authentication Method: [x]RADIUS Authentication
    RADIUS protocol: [x]PAP
    Primary RADIUS server [192.168.10.1]
    Shared Secret [Password1]

    FreeRADIUS
    NAS/Clients
    Client IP Address [192.168.10.1]
    Client IP Version [IPv4]
    Client Shortname [WIFIRadius]
    Client Shared Secret [Password1]
    Client Protocol [UDP]
    Client Type [other]
    Require Message Authenticator [No]
    Max Connections [250]
    NAS Login [blank]
    NAS Password [blank]

    Interfaces
    Interface IP Address [192.168.10.1]
    Port [1812]
    Interface Type [Authentication]
    IP Version [IPv4]

    DHCP (default settings)
    LAN
    192.168.1.21-254
    WIFI
    192.168.10.21-254
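
    As a quick sanity check on the addressing listed above, the following sketch uses Python's standard ipaddress module to confirm that the LAN and WIFI subnets do not overlap and that each DHCP pool sits inside its subnet without covering the interface address. The dictionary layout and function name are assumptions for illustration; the WAN side is left out because its addresses are redacted in the post.

```python
import ipaddress

# name -> (interface address with prefix, DHCP pool start, DHCP pool end),
# copied from the Interfaces and DHCP sections above.
INTERFACES = {
    "LAN": ("192.168.1.1/24", "192.168.1.21", "192.168.1.254"),
    "WIFI": ("192.168.10.1/24", "192.168.10.21", "192.168.10.254"),
}


def check_interfaces(interfaces):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    parsed = {name: ipaddress.ip_interface(cfg[0]) for name, cfg in interfaces.items()}

    # Every pair of internal subnets must be disjoint.
    names = sorted(parsed)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if parsed[first].network.overlaps(parsed[second].network):
                problems.append(f"{first} and {second} subnets overlap")

    # Each DHCP pool must lie inside its subnet and must not include
    # the firewall's own address on that interface.
    for name, (_, start, end) in interfaces.items():
        network = parsed[name].network
        low, high = ipaddress.ip_address(start), ipaddress.ip_address(end)
        if low not in network or high not in network:
            problems.append(f"{name} DHCP pool falls outside {network}")
        elif low <= parsed[name].ip <= high:
            problems.append(f"{name} DHCP pool contains the interface address")
    return problems


if __name__ == "__main__":
    print(check_interfaces(INTERFACES) or "no addressing problems found")
```

    This only checks the internal addressing; it says nothing about the WAN link, which is where the loss shows up.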



  • Can you verify this problem without using the pfSense graphs? Maybe dslreports.com? If so, then run the test and see the fault. Then remove pfSense and connect your laptop directly to the ISP modem, set to the static IP from your WAN details (yes, firewall on please). Test it again and compare the reports.

    The WAN monitoring tools (apinger/dpinger) are sometimes not the best things to rely on, usually because the modem or upstream router will de-prioritize your polling requests, making things look bad when they're not.



  • @moikerz:

    Absolutely. I forgot to mention that last week I created a firewall rule to allow ICMP request/reply on the WAN and ran dslreports SmokePing for 24 hours. The results were a match: packet loss and ping spikes were logged at the exact same times in both the pfSense and SmokePing graphs.
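
    For anyone wanting to do the same cross-check with exported data rather than by eyeballing two graphs, a minimal sketch follows. The timestamps are made-up placeholders, the 5-minute tolerance is an arbitrary assumption, and neither pfSense nor SmokePing is claimed to export data in this form.

```python
from datetime import datetime, timedelta


def matched_events(events_a, events_b, tolerance=timedelta(minutes=5)):
    """Return the events in events_a that have an event in events_b within tolerance."""
    return [a for a in events_a if any(abs(a - b) <= tolerance for b in events_b)]


def match_ratio(events_a, events_b, tolerance=timedelta(minutes=5)):
    """Fraction of events_a that line up with some event in events_b."""
    if not events_a:
        return 0.0
    return len(matched_events(events_a, events_b, tolerance)) / len(events_a)


if __name__ == "__main__":
    # Hypothetical loss-event times from the firewall's monitor and an outside monitor.
    inside = [datetime(2017, 1, 1, 9, 0), datetime(2017, 1, 1, 13, 30)]
    outside = [datetime(2017, 1, 1, 9, 2), datetime(2017, 1, 1, 13, 33)]
    print(f"{match_ratio(inside, outside):.0%} of inside events match outside events")
```

    A high match ratio between an inside and an outside vantage point suggests the loss is on the link itself rather than an artifact of one monitoring tool.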

    I have an old unused laptop that I intend to hook directly up to the modem to run more tests, but what if the results are perfectly fine on the laptop and crap on the pfSense box? Are there any clues as to what may be causing this if it's pfSense?



  • Also, I understand that apinger/dpinger have been unreliable in the past, but we are noticing the packet loss in real time just as dpinger is reporting it. Trust me, when the internet goes down in the break room, I am the first to get hollered at. I log in to pfSense and sure enough, the gateway is reporting packet loss  >:(


  • LAYER 8 Netgate

    If dpinger is reporting packet loss, it is your ISP or further.

    dpinger != apinger



  • UPDATE: Charter finally identified the problem as a faulty electrical insulator from a nearby power plant. Apparently, this was injecting a lot of noise into Charter’s network infrastructure, which in turn wreaked havoc on our internet connection. For now, it seems like it’s fixed.


  • LAYER 8 Netgate

    Congratulations on getting them to solve it.

