WAN flapping since upgrading to 2.4.5
-
@bmeeks The only thing is, the protectli box has the same NIC as my old hardware intel 82583v and that had the same problem.
-
@larold42 said in WAN flapping since upgrading to 2.4.5:
@bmeeks The only thing is, the protectli box has the same NIC as my old hardware intel 82583v and that had the same problem.
You might be seeing an impact of this bug which was reportedly fixed: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=235147. Do you perhaps have the older suggested workaround for this bug still in your configuration? If so, try removing it. Here is the pfSense Redmine bug report associated with the FreeBSD bug report I linked earlier: https://redmine.pfsense.org/issues/9414.
-
@bmeeks Huh, I didn't even know about this bug. On the Jetway I don't have that Ethernet controller, and it is recognized and loads. But I'm wondering if I should put in a bug now. Problem is, I really don't have the time to troubleshoot this anymore. This is the only part that sucks about not having support.
-
@bmeeks i'm almost tempted to try the 2.5 dev version, but i feel like that will only dilute the problem with likely more issues.
-
@larold42 said in WAN flapping since upgrading to 2.4.5:
@bmeeks Huh, I didn't even know about this bug. On the Jetway I don't have that Ethernet controller, and it is recognized and loads. But I'm wondering if I should put in a bug now. Problem is, I really don't have the time to troubleshoot this anymore. This is the only part that sucks about not having support.
If you submit a bug report, it most likely should go to the FreeBSD bunch and not pfSense. The pfSense team does not do anything with regard to drivers. Those are all taken "as-is" from upstream FreeBSD.
What is different with pfSense-2.4.5 is the newer version of FreeBSD. Things like drivers get various fixes and adjustments with new OS versions. Some of those are good and fix things, but others can "break" things through unintentional regressions of one kind or another.
-
@larold42 said in WAN flapping since upgrading to 2.4.5:
@bmeeks i'm almost tempted to try the 2.5 dev version, but i feel like that will only dilute the problem with likely more issues.
2.5 is based on FreeBSD-12.2/STABLE, so it is newer still. But it does have all of the latest NIC driver "fixes". The really big change in terms of NIC drivers in FreeBSD-12 is the move to the iflib wrapper API. That is a big change to the way NIC manufacturers write their hardware drivers.
-
This one - maybe not related - is also an issue that has to be checked :
@larold42 said in WAN flapping since upgrading to 2.4.5:
Oct 26 23:28:36 php-fpm 98247 /rc.newwanip: The command '/usr/local/sbin/unbound -c /var/unbound/unbound.conf' returned exit code '1', the output was '[1603754916] unbound[19136:0] error: bind: address already in use [1603754916] unbound[19136:0] fatal error: could not open ports'
"could not open ports" == another (probably) unbound instance was already running - or is still running (?). A new one can't be launched, as used ports like '53' are still occupied.
If I recall correctly (2.4.4-p3 is rather old already), there was a timing issue with unbound, as the "stop" takes some time - and the restart came in too fast. Check the unbound logs to see why it failed ?!
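To illustrate what "address already in use" means here, a minimal Python sketch that tries to bind a port and reports whether something else already holds it. The function name `port_in_use` and the choice of 127.0.0.1 are assumptions made for illustration; this is not how pfSense or unbound checks the port.

```python
import errno
import socket


def port_in_use(port, host="127.0.0.1", kind=socket.SOCK_DGRAM):
    """Return True if binding host:port fails with EADDRINUSE.

    Hypothetical helper: it only mimics the bind() step that unbound
    failed on in the log above. Any other error is re-raised.
    """
    sock = socket.socket(socket.AF_INET, kind)
    try:
        sock.bind((host, port))
    except OSError as exc:
        if exc.errno == errno.EADDRINUSE:
            return True
        raise
    finally:
        sock.close()
    return False
```

Calling `port_in_use(53)` on the firewall itself needs root, since ports below 1024 are privileged; without root the bind fails with a permission error instead, which this sketch re-raises rather than hides.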
Also : take note that these settings :
will take the interface (WAN) down and up (== 'flapping' ?!) if the monitoring loses contact with the automatic or gateway IP (my 87.98.136.xx).
Practical joke : many use 8.8.8.8 here - and 8.8.8.8 is not being paid to serve (reply to) ICMP packets, its job is serving DNS requests. So when 8.8.8.8 stops replying to (useless) ICMP, many WAN interfaces will fall.
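The dependency on ping replies can be sketched roughly as follows. This is only an illustration of the idea, not the actual gateway monitoring code: the function name `gateway_is_down` and the 20% loss threshold are assumed values chosen for the example.

```python
def gateway_is_down(replies, loss_threshold_pct=20.0):
    """Decide whether a monitored gateway counts as down.

    replies: sequence of booleans, True if the monitor IP answered a probe.
    loss_threshold_pct: hypothetical threshold, not taken from any real config.
    """
    if not replies:
        return False  # no probes sent yet, so no decision
    lost = sum(1 for answered in replies if not answered)
    loss_pct = 100.0 * lost / len(replies)
    return loss_pct >= loss_threshold_pct


# The monitor IP silently drops 3 of 10 pings although the WAN link is fine.
samples = [True] * 7 + [False] * 3
wan_marked_down = gateway_is_down(samples)
```

The point of the sketch: the decision only sees whether the monitor IP answered, so a remote host that stops answering ICMP looks exactly like a dead WAN.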
In other words, your WAN connection will only be as good as the ability of the "Monitor IP" to reply to pings. Temporary solution : disable the Gateway action to exclude this as a possible cause.
@larold42 said in WAN flapping since upgrading to 2.4.5:
and immediately noticed my WAN going up and down every 15 mins.
So it's down all the time (15 minutes) - then it goes UP :
Your log starts with :
Oct 26 23:28:32 kernel igb0: link state changed to UP
to go down 5 seconds later
Oct 26 23:28:37 kernel igb0: link state changed to DOWN
at the end of the log lines you showed - and it stays down for another 15 minutes ?
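To check the timing of the link transitions over a longer log, a small Python sketch like this could pull out the "link state changed" lines and compute the seconds between them. The regular expression and the helper names are assumptions written for the two log lines quoted above; the year 2020 is taken from the epoch timestamp in the unbound error, since the syslog lines themselves carry no year.

```python
import re
from datetime import datetime

# Matches lines such as:
#   Oct 26 23:28:32 kernel igb0: link state changed to UP
LINK_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+kernel\s+"
    r"(?P<ifname>\w+): link state changed to (?P<state>UP|DOWN)"
)


def link_events(lines, year=2020):
    """Return (timestamp, interface, state) for every link state change."""
    events = []
    for line in lines:
        m = LINK_RE.match(line.strip())
        if m:
            ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
            events.append((ts, m["ifname"], m["state"]))
    return events


def seconds_between(events):
    """Seconds elapsed between each pair of consecutive events."""
    return [(b[0] - a[0]).total_seconds() for a, b in zip(events, events[1:])]


log = [
    "Oct 26 23:28:32 kernel igb0: link state changed to UP",
    "Oct 26 23:28:37 kernel igb0: link state changed to DOWN",
]
events = link_events(log)
gaps = seconds_between(events)  # [5.0] for the two lines above
```

Feeding the full system log through `link_events` would show whether the pattern really is 5 seconds up followed by roughly 15 minutes down.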
-
-
@bmeeks so i check the interface that was doing this
igb0@pci0:1:0:0: class=0x020000 card=0x0000ffff chip=0x15218086 rev=0x01 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'I350 Gigabit Network Connection'
    class      = network
    subclass   = ethernet
    bar   [10] = type Memory, range 32, base 0xdf160000, size 131072, enabled
    bar   [18] = type I/O Port, range 32, base 0xe060, size 32, enabled
    bar   [1c] = type Memory, range 32, base 0xdf18c000, size 16384, enabled
    cap 01[40] = powerspec 3 supports D0 D3 current D0
    cap 05[50] = MSI supports 1 message, 64 bit, vector masks
    cap 11[70] = MSI-X supports 10 messages, enabled
                 Table in map 0x1c[0x0], PBA in map 0x1c[0x2000]
    cap 10[a0] = PCI-Express 2 endpoint max data 256(512) FLR RO NS
                 link x4(x4) speed 5.0(5.0) ASPM disabled(L0s/L1)
    ecap 0001[100] = AER 2 0 fatal 0 non-fatal 1 corrected
    ecap 0003[140] = Serial 1 003018ffff0f0d21
    ecap 000e[150] = ARI 1
    ecap 0010[160] = SR-IOV 1 IOV disabled, Memory Space disabled, ARI disabled
                     0 VFs configured out of 8 supported
                     First VF RID Offset 0x0180, VF RID Stride 0x0004
                     VF Device ID 0x1520
                     Page Sizes: 4096 (enabled), 8192, 65536, 262144, 1048576, 4194304
    ecap 0017[1a0] = TPH Requester 1
    ecap 0018[1c0] = LTR 1
    ecap 000d[1d0] = ACS 1
so... i'm wondering how many other folks are having issues running i350's, this has to be a driver issue.
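For anyone comparing hardware: the `chip=` value in that output packs the PCI device ID in the upper 16 bits and the vendor ID in the lower 16 bits. A minimal Python sketch to split it (the function name `split_chip_id` is made up for this example):

```python
def split_chip_id(chip):
    """Split a pciconf 'chip=' value into (vendor_id, device_id)."""
    value = int(chip, 16)
    vendor_id = value & 0xFFFF          # lower 16 bits
    device_id = (value >> 16) & 0xFFFF  # upper 16 bits
    return vendor_id, device_id


vendor_id, device_id = split_chip_id("0x15218086")
print(f"vendor=0x{vendor_id:04x} device=0x{device_id:04x}")
# prints: vendor=0x8086 device=0x1521
```

0x8086 is Intel, and 0x1521 matches the 'I350 Gigabit Network Connection' device string shown in the output.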
EDIT:
Here is the driver info as well
dev.igb.0.%desc: Intel(R) PRO/1000 Network Connection, Version - 2.5.3-k
dev.igb.%parent:
-
Here is a link to the source code for the latest version of the Intel driver for what appears to be your card: https://downloadcenter.intel.com/download/15815/Intel-Network-Adapter-Driver-for-82575-6-and-82580-Based-Gigabit-Network-Connections-under-FreeBSD-?product=46827. This is only the C source code. To use this driver, you would need to create your own separate FreeBSD-11 virtual machine with the proper developer tools installed (compiler and linker) and then compile the source code into the binary driver module. Then copy that module over to your pfSense box and load it. That may be more effort than you wish to expend, though.
The one thing I've noticed over the years with FreeBSD is that the support of newer hardware seems to lag behind Linux. The drivers within FreeBSD-11 and earlier are maintained by a team of Intel folks who then submit the updates to FreeBSD. For FreeBSD-12 and later, as I mentioned in a previous post, FreeBSD has moved to a new wrapper API called iflib. That move has muddied the waters a bit in terms of NIC driver development and support as now the FreeBSD team has the iflib API part while hardware manufacturers write the pieces that need to directly manipulate widgets on their particular NIC.
It might be worth trying pfSense-2.5 DEVEL since it is based on FreeBSD-12.2/STABLE and will contain newer NIC driver versions.