Intermittent packet loss on WAN - can’t tell if it’s ISP or pfSense
-
I recently took over managing our computer network at work. I’m not a network engineer or anything, just volunteering because I’m the closest thing they’ve got to an IT person. We have a static IP address through Charter Business and use their bridged modem. We have a 100/5 Mbps connection and we barely average half of that bandwidth.
About a month ago our modem seemed to be failing; it needed to be power cycled constantly. We had an aging RADIUS server and ASUS routers running the network, so before I had Charter come out to replace the modem, I wanted to have a newer setup. A new Netgate SG-2440 would do the trick: it can handle my subnets, captive portal and FreeRADIUS all in one.
The network consists of one static WAN interface to Charter, a LAN interface (192.168.1.0/24) for PCs and printers in the office, and a WIFI interface (192.168.10.0/24) for a network of wireless access points around the building. Employees are given usernames and passwords that they use to connect to the network via FreeRADIUS/Captive Portal.
After setting everything up with the new modem (Hitron CGNM-2250), it worked great until we started getting really bad intermittent packet loss and high latency on both subnets. Everything seems to be fine inside the network, but the internet connection becomes very spotty. I’ve run ping tests between the subnets with no issues, and the gateway seems to ping just fine as well. When I run a traceroute, the timeouts and ping spikes show up after our gateway IP hop. I switched my gateway monitor IP to 8.8.8.8 and that’s when I really got a better picture of our packet loss.
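For anyone reading traceroute or mtr output the same way: a common rule of thumb is that loss at one intermediate hop which does not carry through to the final hop is often just that router rate-limiting its ICMP replies, while loss that starts at a hop and persists to the destination points at a real problem at or after that hop. A minimal Python sketch of that check follows. The function name, the 1% threshold and the per-hop numbers are all made up for illustration; they are not taken from the traceroutes described here.

```python
def first_persistent_loss_hop(loss_by_hop, threshold_pct=1.0):
    """Return the 1-based hop where loss starts and continues through the
    final hop, or None if the final hop shows no loss.

    loss_by_hop: loss percentage per hop, in path order (hypothetical input).
    """
    first = None
    for hop, loss in enumerate(loss_by_hop, start=1):
        if loss >= threshold_pct:
            if first is None:
                first = hop  # candidate start of a persistent run
        else:
            # Loss did not carry through to this hop, so the earlier
            # hop was probably only rate-limiting its ICMP replies.
            first = None
    return first


# Hypothetical per-hop loss percentages: hop 1 is the firewall,
# hop 2 the ISP gateway, the rest are upstream routers.
sample = [0.0, 0.0, 12.0, 15.0, 14.0]
print(first_persistent_loss_hop(sample))
```

If the returned hop is the first one past the ISP gateway, that is at least consistent with a fault upstream of the local network rather than inside it.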
Just to be sure, I tried a couple of other routers. I had the new router hand out DHCP to the office LAN and then hooked up pfSense behind said router via a static IP to run the WIFI. I let it run for a while, and the same packet loss occurred again on both the office LAN and pfSense interfaces. I figured this had to be a faulty modem, so I had Charter come replace the modem AGAIN, this time with an SMC Networks D3G1604W, and I’m back to square one: intermittent packet loss.
Last 8 hours:
http://imgur.com/g22BT7D
Last two days:
http://imgur.com/8lWSlE2
I’ve been on and off the phone with Charter and they seem to think everything looks fine when they run their tests, but nothing seems to be working. Tomorrow I am having them out to test the lines into the building. In the meantime, it would be awesome if any of you could take a look at my setup and let me know if you see any mistake I may have made that could be causing this headache. Thanks in advance!
Modem-->pfSense-->LAN Switch
               -->WIFI Switch
LAN switch connects office PCs and printers
WIFI switch goes out to a few other PoE switches around the building that connect APs

pfSense 2.3.3-RELEASE-p1 SETTINGS
Admin Access
Default settings except using HTTPS

Firewall & NAT
IP Do-Not-Fragment compatibility
IP Random id generation
Firewall Optimization Options [Normal]
Disable Firewall
Disable Firewall Scrub
Firewall Adaptive Timeouts [Default Settings]
Firewall Maximum States [405000]
Firewall Maximum Table Entries [200000]
Firewall Maximum Fragment Entries [default]
[ ]Static route filtering
[ ]Disable Auto-added VPN rules
[ ]Disable reply-to
[ ]Disable Negate rules
Aliases Hostnames Resolve Interval [300]
[ ]Check certificate of aliases URLs
Update Bogon Networks Frequency [Monthly]
NAT Reflection mode for port forwards [disabled]
Reflection Timeout [ ]
[ ]Enable NAT Reflection for 1:1 NAT
[ ]Enable automatic outbound NAT for Reflection

Networking
[x]Allow IPv6
[ ]IPv6 over IPv4
[ ]Prefer IPv4 over IPv6
[ ]Device polling
[ ]Hardware Checksum Offloading
[x]Hardware TCP Segmentation Offloading
[x]Hardware Large Receive Offloading
[ ]ARP Handling

Miscellaneous
Nothing changed here, all default

General Setup
DNS Servers (Provided by Charter)
86.75.309.1 [none]
86.75.309.2 [none]
DNS Server Override
[ ]Disable DNS Forwarder

Routing->Gateways->CharterGateway
Disable this gateway
Interface [WAN]
Address Family [IPv4]
Name [CharterGateway]
Gateway [12.345.678.998]
[x]Default Gateway
[ ]Disable Gateway Monitoring
[x]Disable Gateway Monitoring Action
Monitor IP [8.8.8.8]
[ ]Mark Gateway as Down
Weight [1]
Data Payload [0]
Latency Thresholds [200]-[300]
Packet Loss thresholds [20]-[50]
Probe Interval [2000]
Loss Interval [8000]
Time Period [60000]
Alert Interval [8000]
[ ]Use non-local gateway through interface specific route.

Interfaces
WAN
[x]Enable
IPv4 Configuration Type [Static IPv4]
IPv6 Configuration Type [None]
MAC Address [blank]
MTU [blank]
MSS [blank]
Speed and Duplex [default]
IPv4 Address [12.345.678.999] [/30]
IPv4 Upstream gateway [CharterGateway-12.345.678.998]
[x]Block private networks and loopback addresses
[x]Block bogon networks
LAN
[x]Enable
IPv4 Configuration Type [Static IPv4]
IPv6 Configuration Type [None]
MAC Address [blank]
MTU [blank]
MSS [blank]
Speed and Duplex [default]
IPv4 Address [192.168.1.1] [/24]
IPv4 Upstream gateway [None]
[ ]Block private networks and loopback addresses
[ ]Block bogon networks
WIFI
[x]Enable
IPv4 Configuration Type [Static IPv4]
IPv6 Configuration Type [None]
MAC Address [blank]
MTU [blank]
MSS [blank]
Speed and Duplex [default]
IPv4 Address [192.168.10.1] [/24]
IPv4 Upstream gateway [None]
[ ]Block private networks and loopback addresses
[ ]Block bogon networks

Firewall Rules
WAN
BLOCK RFC 1918 networks
BLOCK Reserved Not assigned by IANA
Pass IPv4 TCP/UDP
Pass UDP 1194 (OpenVPN)
LAN
Pass 443/80/22 (Anti-Lockout Rule)
Pass IPv4
WIFI
Pass IPv4

Captive Portal Configuration
[x]Enable Captive Portal
Interfaces [WIFI]
Maximum concurrent connections [blank]
Idle timeout (Minutes) [480]
Hard timeout (Minutes) [481]
Pass-through credits per MAC address. [blank]
Waiting period to restore pass-through credits. (Hours) [blank]
[ ]Reset waiting period
[ ]Logout popup window
Pre-authentication redirect URL [blank]
After authentication Redirection URL [blank]
Blocked MAC address redirect URL[blank]
[ ]Disable Concurrent user logins
[ ]Disable MAC filtering
[x]Enable Pass-through MAC automatic additions
[ ]Enable Pass-through MAC automatic addition with username
[x]Enable per-user bandwidth restriction
Default download (Kbit/s) [10967]
Default upload (Kbit/s) [2500]
Authentication Method: [x]RADIUS Authentication
RADIUS protocol: [x]PAP
Primary RADIUS server [192.168.10.1]
Shared Secret [Password1]

FreeRADIUS
NAS/Clients
Client IP Address [192.168.10.1]
Client IP Version [IPv4]
Client Shortname [WIFIRadius]
Client Shared Secret [Password1]
Client Protocol [UDP]
Client Type [other]
Require Message Authenticator [No]
Max Connections [250]
NAS Login [blank]
NAS Password [blank]
Interfaces
Interface IP Address [192.168.10.1]
Port [1812]
Interface Type [Authentication]
IP Version [IPv4]

DHCP (default settings)
LAN
192.168.1.21-254
WIFI
192.168.10.21-254
-
Can you verify this problem without using the pfSense graphs? Maybe dslreports.com? If so, then run the test and see the fault. Then remove pfSense, connect your laptop directly to the ISP modem and set it to the static IP from your WAN details (yes, firewall on please). Test it again and compare the reports.
The WAN monitoring (apinger/dpinger) is sometimes not the best thing to rely on, usually because the modem/upstream router will de-prioritize your polling requests, making things look bad when they're not.
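To make the "compare the reports" step concrete, here is a rough Python sketch that boils each test run down to loss percentage and latency figures so the two runs can be put side by side. It assumes you have already collected one round-trip time per probe (with None for a lost probe) from whatever tool you use; the function name and both sample lists are invented for illustration and are not real measurements.

```python
from statistics import mean


def summarize(rtts_ms):
    """Summarize one ping run.

    rtts_ms: one entry per probe, the round-trip time in ms,
    or None when the probe got no reply (hypothetical input format).
    """
    sent = len(rtts_ms)
    replies = [r for r in rtts_ms if r is not None]
    lost = sent - len(replies)
    return {
        "sent": sent,
        "loss_pct": 100.0 * lost / sent if sent else 0.0,
        "avg_ms": mean(replies) if replies else None,
        "max_ms": max(replies) if replies else None,
    }


# Made-up samples for the two setups being compared.
behind_pfsense = [12.1, None, 250.4, 13.0, None, 12.7, 14.2, 300.9, 12.9, 13.3]
direct_to_modem = [12.0, 12.4, None, 13.1, 240.8, None, 12.6, 13.0, 280.2, 12.8]

for label, run in (("pfSense", behind_pfsense), ("laptop", direct_to_modem)):
    print(label, summarize(run))
```

If both setups show similar loss and similar spikes, the firewall is unlikely to be the cause. If only one does, that narrows down where to look.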
-
Absolutely. I forgot to mention that last week I created a firewall rule to allow ICMP request/reply on the WAN and ran the dslreports SmokePing test for 24 hours. The results were a match: packet loss and ping spikes were logged at the exact same times in both the pfSense and SmokePing graphs.
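For what it's worth, "logged at the same times" can be checked with numbers instead of by eyeballing two graphs. A small Python sketch, assuming the loss events from each monitor have been written down as timestamps in seconds; the function name, the 60 second tolerance and the timestamps below are all hypothetical and are not exported from either tool.

```python
def matched_fraction(events_a, events_b, tolerance_s=60):
    """Fraction of loss events in A that have an event in B
    within tolerance_s seconds."""
    if not events_a:
        return 0.0
    matched = sum(
        1
        for t in events_a
        if any(abs(t - u) <= tolerance_s for u in events_b)
    )
    return matched / len(events_a)


# Hypothetical loss timestamps, in seconds since the start of the test.
pfsense_loss = [600, 4200, 9000, 15000]
smokeping_loss = [610, 4180, 9050, 20000]

print(matched_fraction(pfsense_loss, smokeping_loss))
```

A fraction close to 1.0 in both directions means an outside monitor and the firewall agree on when the loss happens, which suggests the loss is real and not just a quirk of the gateway monitor.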
I have an old unused laptop that I intend to hook directly up to the modem to run more tests, but what if the results are perfectly fine on the laptop and crap on the pfSense box? Are there any clues as to what may be causing this if it's pfSense?
-
Also, I understand that apinger/dpinger has been unreliable in the past, but we are noticing the packet loss in real time just as dpinger is reporting it. Trust me, when the internet goes down in the break room, I am the first to get hollered at. I log in to pfSense and sure enough, the gateway is reporting packet loss >:(
-
If dpinger is reporting packet loss, it is your ISP or further.
dpinger != apinger
-
UPDATE: Charter finally identified the problem as a faulty electrical insulator from a nearby power plant. Apparently, this was injecting a lot of noise into Charter’s network infrastructure, which in turn wreaked havoc on our internet connection. For now, it seems like it’s fixed.
-
Congratulations on getting them to solve it.