WAN flapping on 2.4.5-p1
-
Since installing this version on Protectli hardware, my WAN seems to go up and down about every 30 minutes. I've verified this is not an ISP issue by swapping in my other firewall (different hardware) running a previous version (2.4.4-p3). So it's either a compatibility issue with the hardware or a problem with this version of pfSense.
-
@larold42 Which Protectli model is it?
Also, if your setup isn't too complicated, you can swap the WAN interface with another, spare interface on the box to see if the problem follows the swapped port. If it does, then you for sure have a hardware problem.
-
4-port, Celeron J3160. They just shipped me a new one, and it's still having the same issue. I also downgraded from -p1 to base 2.4.5 and I'm still noticing it.
-
If you switched out the router but the problem remains, it sounds like a configuration/software issue or something between the router and the modem/ISP equipment. I would start with the basics. When the WAN 'goes down', what is happening? I assume there is no internet, but does the WAN port still have an IP? From a device on the LAN side, can you ping the WAN IP? Is it flapping (up, down, up, down...) or is it just going down and staying down after about 30 minutes?
Since the hardware is different from the old hardware that doesn't do this, a few things may need to be set differently. Under SYSTEM/ADVANCED/NETWORKING, in the NETWORK INTERFACES section, make sure hardware checksum offload, segmentation offload and large receive offload are OFF (that is, the 'Disable...' boxes are checked). Is the service PPPoE? If so, a simple thing: verify the correct account information is set. Double check under INTERFACES/WAN that speed and duplex is set to AUTO, but frankly, if I set this on my router, it will flap up and down and just not work. I have to set it to 1000baseT full-duplex for it to remain connected (probably due to the NIC I am using, an old Intel 'server pull'), so YMMV.
I just changed ISPs, and found that I had to turn off Gateway Monitoring of the IPv6 WAN gateway or it would start dropping packets with high latency, drop out completely, then come back up OK, then latency, dropped packets, link down, over and over. I could see it on the Dashboard Gateway Monitor.
It just sounds strange that it takes about 1/2 hour for a problem to occur in your case though...
-
OK, so to reply to your comment: I returned the Protectli hardware and got a new piece of equipment from eBay, actually more powerful and cheaper after I built it. Right after I installed 2.4.5 base, it started again... I am very close to ditching pfSense altogether now. The WAN gets a latency warning, then logs "arpresolve: can't allocate llinfo for", followed by the WAN dropping. So far it's every 10 minutes, and it lasts only about 15 seconds max. There is no IP when the WAN drops; it's as if I unplugged the cable. I checked all the settings you suggested, and the only change I had to make was to uncheck the hardware checksum option. The rest was good to go.
My new hardware is a Jetway HBJC385F551-63U-B, which is probably way overkill for right now, but fiber will be available in my area within the next couple of years.
-
@larold42
What is your internet service? Are you behind a cable box, or is this something like AT&T U-verse (DSL) or something else (fiber)? What are the hardware and the service?
Did you try setting the WAN speed/duplex instead of leaving it on autosense (INTERFACES/WAN)? As I mentioned above, I can't set mine to auto; I had to set it to '1000baseT full-duplex' or it would flap, and I also had to set the IPv6 gateway as unmonitored to keep it stable with Spectrum cable internet service (SYSTEM/ROUTING/GATEWAYS/WAN DHCP6/DISABLE GATEWAY MONITORING). Can't hurt to give it a test.
-
@Tzvia My internet is Spectrum cable, and I'm behind a modem. I haven't tried setting my speeds yet, but I really don't think this will fix anything. I also turned IPv6 off on the WAN. I have my old hardware back in place right now so I can work without a crappy connection, so I can only test for a couple of hours a day and then have to remove the new hardware.
-
@larold42 OK, interesting. I'm in So. Cal and think my area was TWC before it was Spectrum; I had U-verse till last month. I am using an ITX AMD AM1 setup and had an old Intel 4-port NIC in there. I had to set speed and duplex, but needed another NIC for something else anyway, so I bought a 4-port Intel I350-T4. Turns out I still have to set the speed, and now also have to have a small dumb switch between modem and router or it still flaps. The modem is a Hitron EN2251. This weekend I'll be troubleshooting this, but I need internet now for work, so I'm glad the switch hack works. It is worth trying the duplex setting.
BTW, IPv6 works great in my area: DHCP on WAN and SLAAC on LAN.
-
@Tzvia I'll give the dumb switch idea a shot. Didn't even think of that. The old hardware has an Intel 82583V.
The new hardware has 1x Intel I219-LM, 1x Intel I211-AT and 4x Intel I350-AM4. I may try to figure out which port is which model and then move the WAN to a different port to troubleshoot.
-
@Tzvia Soooooo I put a dumb switch in front of my pfSense... and I haven't had an issue. How is this a solution? It isn't even an additional hop. I'm thinking the root cause is the NIC driver not being compatible with FreeBSD 11.2/11.3. Someone had a similar-ish experience: https://redmine.pfsense.org/issues/9414 and https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=235147. That's a different Intel Ethernet controller than the one I'm running, but I may do some troubleshooting and see if I can turn this into a FreeBSD bug report.
-
@larold42 Yeah, I know it's nuts. Hope to get time this weekend to test out a few ideas. I had played a bit with the hardware offload settings, which didn't seem to make a difference. Setting it to gig did on the ancient HP server-pull NIC I had. But when I put in the I350-T4, nothing I did would stop the flapping until I put in the dumb switch. I had read about someone else doing that earlier and figured it couldn't hurt to try. I'm guessing it's something between that Hitron modem's NIC and my NIC, with autosense not being stable on the Hitron with the new NIC in the router. pfSense sees pings failing and restarts that interface, and we see it as flapping. I bet if I could get into the Hitron settings and set it to gig full duplex, that would fix it. But it's a Spectrum 'free' modem and I don't have a way into the settings...
-
@Tzvia Just posting NIC hardware and driver information so we can compare. I'm half tempted to file a bug report against FreeBSD 11.3.
igb0@pci0:1:0:0: class=0x020000 card=0x0000ffff chip=0x15218086 rev=0x01 hdr=0x00
vendor = 'Intel Corporation'
device = 'I350 Gigabit Network Connection'
class = network
subclass = ethernet
bar [10] = type Memory, range 32, base 0xdf160000, size 131072, enabled
bar [18] = type I/O Port, range 32, base 0xe060, size 32, enabled
bar [1c] = type Memory, range 32, base 0xdf18c000, size 16384, enabled
cap 01[40] = powerspec 3 supports D0 D3 current D0
cap 05[50] = MSI supports 1 message, 64 bit, vector masks
cap 11[70] = MSI-X supports 10 messages, enabled
Table in map 0x1c[0x0], PBA in map 0x1c[0x2000]
cap 10[a0] = PCI-Express 2 endpoint max data 256(512) FLR RO NS
link x4(x4) speed 5.0(5.0) ASPM disabled(L0s/L1)
ecap 0001[100] = AER 2 0 fatal 0 non-fatal 1 corrected
ecap 0003[140] = Serial 1 003018ffff0f0d21
ecap 000e[150] = ARI 1
ecap 0010[160] = SR-IOV 1 IOV disabled, Memory Space disabled, ARI disabled
0 VFs configured out of 8 supported
First VF RID Offset 0x0180, VF RID Stride 0x0004
VF Device ID 0x1520
Page Sizes: 4096 (enabled), 8192, 65536, 262144, 1048576, 4194304
ecap 0017[1a0] = TPH Requester 1
ecap 0018[1c0] = LTR 1
ecap 000d[1d0] = ACS 1
Driver info:
dev.igb.0.%desc: Intel(R) PRO/1000 Network Connection, Version - 2.5.3-k
dev.igb.%parent:
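For anyone comparing the pciconf listings in this thread: the chip= value on the header line packs the PCI device ID into the upper 16 bits and the vendor ID into the lower 16 bits. A minimal Python sketch of pulling them apart, in case it helps confirm that two cards really use the same controller (the function name decode_pciconf_chip is made up for illustration and is not part of any pfSense or FreeBSD tool):

```python
from typing import Tuple


def decode_pciconf_chip(chip: str) -> Tuple[int, int]:
    """Split a pciconf 'chip=' hex value into (vendor_id, device_id).

    The upper 16 bits hold the device ID, the lower 16 bits the vendor ID.
    """
    value = int(chip, 16)               # accepts the leading "0x"
    device_id = (value >> 16) & 0xFFFF  # upper half
    vendor_id = value & 0xFFFF          # lower half
    return vendor_id, device_id


# Value taken from the listing above.
vendor_id, device_id = decode_pciconf_chip("0x15218086")
print(f"vendor=0x{vendor_id:04x} device=0x{device_id:04x}")
# prints: vendor=0x8086 device=0x1521
```

Here 0x8086 is Intel's PCI vendor ID and 0x1521 is the device ID that pciconf reports as 'I350 Gigabit Network Connection'.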
-
@larold42
Can't figure the upload here, but here is my NIC setup:
dev.igb.3.%desc: Intel(R) PRO/1000 Network Connection, Version - 2.5.3-k
dev.igb.2.%desc: Intel(R) PRO/1000 Network Connection, Version - 2.5.3-k
dev.igb.1.%desc: Intel(R) PRO/1000 Network Connection, Version - 2.5.3-k
dev.igb.0.%desc: Intel(R) PRO/1000 Network Connection, Version - 2.5.3-k
And the NICs are:
igb3@pci0:1:0:3: class=0x020000 card=0x03091dcf chip=0x15218086 rev=0x01 hdr=0x00
vendor = 'Intel Corporation'
device = 'I350 Gigabit Network Connection'
class = network
subclass = ethernet
bar [10] = type Memory, range 32, base 0xfe580000, size 524288, enabled
bar [18] = type I/O Port, range 32, base 0xe000, size 32, enabled
bar [1c] = type Memory, range 32, base 0xfe900000, size 16384, enabled
cap 01[40] = powerspec 3 supports D0 D3 current D0
cap 05[50] = MSI supports 1 message, 64 bit, vector masks
cap 11[70] = MSI-X supports 10 messages, enabled
Table in map 0x1c[0x0], PBA in map 0x1c[0x2000]
cap 10[a0] = PCI-Express 2 endpoint max data 512(512) FLR NS
link x4(x4) speed 5.0(5.0) ASPM disabled(L0s/L1)
ecap 0001[100] = AER 2 0 fatal 0 non-fatal 1 corrected
ecap 0003[140] = Serial 1 80615fffff08059c
ecap 000e[150] = ARI 1
ecap 0010[160] = SR-IOV 1 IOV disabled, Memory Space disabled, ARI disabled
0 VFs configured out of 8 supported
First VF RID Offset 0x0180, VF RID Stride 0x0004
VF Device ID 0x1520
Page Sizes: 4096 (enabled), 8192, 65536, 262144, 1048576, 4194304
ecap 0017[1a0] = TPH Requester 1
ecap 000d[1d0] = ACS 1
Working from home means I have to keep the network UP all week, so troubleshooting is limited till the weekend, but I did try setting the WAN to auto again now that it's connected to a dumb switch, and the flapping began again. The Hitron modem remained connected to the switch. Without the switch, I could see the link light going out every 8~10 seconds on the modem as well as on the pfSense WAN link. I unplugged the Hitron modem, leaving the switch in place, and the pfSense WAN link kept flapping, so I no longer think the modem is part of the problem in my setup. It's strictly pfSense. I also tried setting the MTU manually along with MSS clamping to 1460, and that made no change.
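For reference, 1460 is the usual TCP MSS on a 1500-byte IPv4 MTU: the MTU minus 20 bytes of IPv4 header and 20 bytes of TCP header, assuming no IP or TCP options. A small Python sketch of that arithmetic (the helper name tcp_mss_for_mtu is hypothetical, and the 1500-byte MTU is an assumption, since the post does not say which MTU was set):

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options
TCP_HEADER_BYTES = 20   # TCP header without options


def tcp_mss_for_mtu(mtu: int, ip_header: int = IPV4_HEADER_BYTES) -> int:
    """Largest TCP payload that fits in one packet of the given MTU."""
    mss = mtu - ip_header - TCP_HEADER_BYTES
    if mss <= 0:
        raise ValueError(f"MTU {mtu} is too small to carry TCP")
    return mss


print(tcp_mss_for_mtu(1500))  # 1460
print(tcp_mss_for_mtu(576))   # 536
```

If the interface ends up with a much smaller MTU than expected, the MSS that fits shrinks with it, so it is worth checking both values together.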
-
@Tzvia Yeah, so the short answer is we need to patch the driver: https://downloadcenter.intel.com/download/15815/Intel-Network-Adapter-Driver-for-82575-6-and-82580-Based-Gigabit-Network-Connections-under-FreeBSD-?product=46827. Or upgrade to 2.5, which I believe has support for these newer drivers because it's based on FreeBSD 12. I still have to see if these drivers are in there.
-
@larold42 said in WAN flapping on 2.4.5-p1:
@Tzvia Yeah, so the short answer is we need to patch the driver: https://downloadcenter.intel.com/download/15815/Intel-Network-Adapter-Driver-for-82575-6-and-82580-Based-Gigabit-Network-Connections-under-FreeBSD-?product=46827. Or upgrade to 2.5, which I believe has support for these newer drivers because it's based on FreeBSD 12. I still have to see if these drivers are in there.
The drivers you download from Intel will usually be for FreeBSD-11 and earlier. As I mentioned in another thread similar to this one, FreeBSD-12 and newer use the iflib API wrapper for NIC drivers. That is a completely different type of driver software, and in the case of Intel it even has a different version numbering scheme. So Intel and others now have to maintain two different driver families: one for FreeBSD-11 and another for iflib on FreeBSD-12 and newer. That sets up a situation where things may get fixed in one family but not necessarily get backported to the other (or at least not at the same time).
-
Please check the MTU on the flapping interface. This sounds familiar:
https://forum.netgate.com/topic/136089/solved-and-revised-2-4-4-release-arpresolve-can-t-allocate-llinfo-for-gateway-on-interface0-dhcp-mtu-576?_=1604207774054
Cheers,
Bennett
-
@bfeitell Interesting read. I had played a bit with it this weekend. I had played with traffic shaping about a year ago, thought I had removed all the bits I had configured, but found a WAN interface limit that was well below the current speed tier I have. Removed that. I then found that I had set the LAN interface speed/duplex to 1000f at some point in the distant past, so set that to 'Default, no preference' along with the WAN. Connected to the switch it comes up 1000f stable. If I connect direct to the modem, it no longer flaps but it comes up 100f with the modem at 1000f so it still doesn't work. But it doesn't flap... So I set "supersede interface-mtu 0" per that link you found, and am just letting that alone for a bit to verify it hasn't introduced anything negative. In a while I will try connecting it to the switch...
-
Just had to try a few more things because I am stubborn. Character flaw, but I think at this point I am now done. There is no fix for this NIC/Hitron modem combo. I grabbed another SSD I had around, installed the 2.5 devel, and did only a basic setup on it: WAN set to DHCP, LAN set to static IPv4. Nothing else. I connected the WAN to the Hitron modem and it flapped maybe twice, then no lights on the modem or the pfSense NIC. They just don't talk to each other. Connect the WAN to that little Netgear dumb switch and it's fine. So it stays like that till I can think of something else. BTW, I also booted that SSD on an old test computer I have that has an old 2-port Intel PRO/1000 NIC that also uses the igb driver, and it flapped and then went lights out as well.