Intermittent Network Drops pfSense
-
Hi,
I have a Dual WAN setup with Primary Bell Canada (Fiber) and Secondary Rogers (5G Home Internet).
It has been three days since my primary Bell connection went down, so I’m using Rogers 5G as a backup.
However, I keep encountering a strange issue:
Every hour or so, all networks drop out: LAN, IoT, and Guest. When this happens, devices are unable to obtain IP addresses from pfSense.
At first, I suspected the Rogers device might be the cause. But when pfSense stops working, if I connect the Rogers modem directly to my laptop via Ethernet, the internet works fine. So, it's not an issue with the Rogers device.
From pfSense, I can still ping external sites like google.ca, but I can’t reach any IP address within my subnets.
Rebooting pfSense temporarily resolves the issue, and all networks come back online—but the problem recurs about an hour later.
Looking at the logs (dmesg), I see the following error repeatedly:
arprequest_internal: cannot find matching address
-
@manjotsc said in Intermittent Network Drops pfSense:
arprequest_internal: cannot find matching address
That sounds like some device is trying to reach an IP that is no longer available? Could there be some "states" that are lurking or something thinking the main gateway is still up even if it isn't? Only, I guess since you said you did reboot, the state table should be cleared...
Perhaps there is something in the gateway monitoring that is not set up correctly, or there is an IP conflict?
-
@Gblenn I recently updated to version 24.11, and now my pfSense setup keeps freezing and becomes completely unusable. I have to perform a hard reboot to get it working again.
After rebooting, it runs fine for a few hours, but the issue reoccurs.
I ran a memtest, and it passed without any errors.
I suspect the problem might be related to the motherboard or CPU, but I'm not entirely sure.
Could it also be a network card issue? I have two network cards in the system:
- One is used for LAN & WAN.
  vendor = 'Intel Corporation'
  device = 'Ethernet Controller 10-Gigabit X540-AT2'
- The other is for Proxmox and the 5G_WAN.
  vendor = 'Intel Corporation'
  device = '82576 Gigabit Network Connection'
Are there any specific logs I should check to help diagnose the problem?
-
@manjotsc So things have been working fine until the upgrade to 24.11? Or was it only related to the failover situation?
I'm still thinking there is an IP conflict somewhere... What does your setup look like? On the primary WAN you have a public IP from Bell, I suppose? And from Rogers, do you have their 5G router providing a private IP, or have you set it up in bridge mode?
-
@Gblenn The problem existed even before updating to version 24.11, but it has gotten significantly worse after the update.
My setup includes:
Bell (WAN): A PPPoE connection with a public IP.
5G_WAN: Direct public IP with IP Passthrough.
Under Gateways (WAN_DHCP), it shows a private IP address (10.11.26.177).
Under Interfaces, it displays the correct public IP address.
-
@manjotsc Ok, that I don't understand... what is WAN_DHCP? Have you added that yourself? And why are you on 5G when the fiber appears to be up?
-
- WAN_DHCP: Bell Fiber (Primary Internet)
- StaticGateway: WireGuard VPN with a Static IP
- 5G_WAN_DHCP: Rogers 5G (Secondary Internet)
- Home_India: WireGuard VPN connection to my home in India
- EXPO67: WireGuard VPN for my Guest Network
(WAN_DHCP) There is ongoing work in my area for Bell, which is causing intermittent outages.
-
@manjotsc What I meant was that you have your StaticGateway which I assume is Bell fiber. But then you have another gateway called WAN_DHCP... what is that?
And when StaticGateway is showing Online, you still have 5G_WAN_DHCP as your default gateway...
Also, that IP 100.80.NN.NN falls in 100.64.0.0/10, the shared address space reserved for CGNAT, meaning it's not a public IP but rather CGNAT with Bell...?
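For anyone wanting to double-check which range an address falls in, here is a minimal Python sketch using only the standard `ipaddress` module. The `classify` helper is just an illustrative name, and `100.80.0.1` is a made-up stand-in for the masked 100.80.NN.NN address, so treat it as an assumption rather than the real value.

```python
import ipaddress

# RFC 6598 shared address space, used by ISPs for carrier-grade NAT.
CGNAT = ipaddress.ip_network("100.64.0.0/10")

# RFC 1918 private ranges.
RFC1918 = [
    ipaddress.ip_network(net)
    for net in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def classify(addr: str) -> str:
    """Return 'cgnat', 'rfc1918', or 'other' for an IPv4 address string."""
    ip = ipaddress.ip_address(addr)
    if ip in CGNAT:
        return "cgnat"
    if any(ip in net for net in RFC1918):
        return "rfc1918"
    return "other"


# 100.80.0.1 is a hypothetical stand-in for the masked 100.80.NN.NN address.
print(classify("100.80.0.1"))
# The gateway address reported earlier in the thread.
print(classify("10.11.26.177"))
```

Neither kind of address is publicly routable, so a gateway showing one of them suggests there is a NAT layer (or a modem-side address) between pfSense and the internet.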
-
@Gblenn StaticGateway is a WireGuard VPN configured with a static IP address. This IP is a client-side address provided by my VPN provider as per their configuration.
I have a failover setup between WAN_DHCP and 5G_WAN_DHCP.
I have changed the naming scheme here; it should be clear now.
-
@manjotsc Aha, that makes more sense...
Another thing, which DHCP server are you using, KEA? I just realized I had something similar happening, only once though, and switching back to ISC from KEA resolved the issue... Not sure whether those quirks are fixed in 24.11, though...
-
@Gblenn I never switched to KEA
-
Did you set your WAN MTU?
-
@JonathanLee MTU on WAN_DHCP is set to 1492, and on 5G_WAN_DHCP it is not set to anything.
-
@manjotsc Did you ping test this to see when it fragments to get the right value, or just use the default?
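To illustrate what a ping test for MTU involves: you send pings with the don't-fragment bit set (the exact flag varies by platform), find the largest payload that still gets through, and add 28 bytes (20 for the IPv4 header, 8 for the ICMP header). Below is a rough Python sketch of that search. `find_path_mtu` and `fake_probe` are hypothetical names, and the probe here only simulates a link with a 1492-byte MTU rather than sending real packets.

```python
from typing import Callable

IP_HEADER = 20    # IPv4 header without options
ICMP_HEADER = 8   # ICMP echo header


def find_path_mtu(probe: Callable[[int], bool],
                  low: int = 1200, high: int = 1472) -> int:
    """Binary-search the largest ICMP payload for which probe() succeeds.

    probe(size) should return True when a don't-fragment ping carrying
    `size` payload bytes gets a reply. Returns the implied MTU.
    """
    if not probe(low):
        raise ValueError("even the smallest payload did not get through")
    while low < high:
        mid = (low + high + 1) // 2   # round up so the loop always progresses
        if probe(mid):
            low = mid                 # mid fits, try larger
        else:
            high = mid - 1            # mid is too big, try smaller
    return low + IP_HEADER + ICMP_HEADER


def fake_probe(payload: int) -> bool:
    """Simulated link with a 1492-byte MTU (typical for PPPoE)."""
    return payload + IP_HEADER + ICMP_HEADER <= 1492


print(find_path_mtu(fake_probe))  # prints 1492
```

In real use, `probe` would wrap an actual ping command against a reliable external host. A 5G link may well come out lower than 1500, which is why leaving it at the default is worth verifying.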
-
@JonathanLee Just default