arpresolve error and WAN NIC down
-
I don't know if this is 2.1 specific, but every now and again, and under circumstances I can't determine:
The WAN interface goes offline and the syslog (dmesg) is filled up with
arpresolve: can't allocate llinfo for [WAN IP ADDRESS]
Reassigning the address via WebGUI, rc.initial, or CLI "fixes" the issue until it occurs again.
Any ideas? It's probably something stupidly simple I'm missing.
-
That log indicates it's trying to ARP something that isn't on a locally configured subnet. What's the output of "ifconfig" look like while that's happening?
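To make the point above concrete: the kernel can only ARP for an address that falls inside a subnet configured on one of its interfaces. Below is a minimal sketch of that membership test using Python's standard `ipaddress` module. The function name and the sample addresses are made up for illustration and are not taken from this system.

```python
import ipaddress


def on_local_subnet(target, local_networks):
    """Return True if target falls inside any locally configured subnet.

    local_networks is a list of "address/prefix" strings, one per
    configured interface address (hypothetical input format).
    """
    addr = ipaddress.ip_address(target)
    return any(
        addr in ipaddress.ip_network(net, strict=False)
        for net in local_networks
    )


# Hypothetical addresses, for illustration only.
configured = ["192.168.1.1/24"]
print(on_local_subnet("192.168.1.50", configured))  # True
print(on_local_subnet("203.0.113.7", configured))   # False
# With no address configured at all, nothing counts as local:
print(on_local_subnet("203.0.113.7", []))           # False
```

The last case is the one worth keeping in mind: an interface that has lost its address leaves no subnet for the target to match.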
-
I'll try and grab an ifconfig the next time it happens. I didn't save one the last time, but I do seem to recall that the interface was down.
I'm sorry I wasn't clear: I understand what the error message means, but I can't think of anything that could be misconfigured to generate an ARP on a non-local subnet. This would obviously be a lot easier to troubleshoot if I knew the circumstances that trigger the event and/or could reproduce it at will, but so far no luck in figuring that out.
-
As requested here's the ifconfig after the port is down:
iwi0: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 2290
        ether 00:13:ce:db:83:6a
        media: IEEE 802.11 Wireless Ethernet autoselect (autoselect)
        status: no carrier
fwe0: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8<VLAN_MTU>
        ether 02:00:39:7c:13:e8
        ch 1 dma -1
fwip0: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
        lladdr 0.0.39.0.0.7c.13.e8.a.2.ff.fe.0.0.0.0
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=3<RXCSUM,TXCSUM>
        inet 127.0.0.1 netmask 0xff000000
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x9
        nd6 options=3<PERFORMNUD,ACCEPT_RTADV>
pflog0: flags=100<PROMISC> metric 0 mtu 33200
enc0: flags=0<> metric 0 mtu 1536
pfsync0: flags=0<> metric 0 mtu 1460
        syncpeer: 224.0.0.240 maxupd: 128 syncok: 1
ue0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=80008<VLAN_MTU,LINKSTATE>
        ether 00:1c:10:49:c6:2e
        inet6 fe80::21c:10ff:fe49:c62e%ue0 prefixlen 64 scopeid 0xd
        nd6 options=1<PERFORMNUD>
        media: Ethernet autoselect (100baseTX <full-duplex>)
        status: active
ue1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=80008<VLAN_MTU,LINKSTATE>
        ether 00:0e:c6:87:a7:a4
        inet 192.168.123.1 netmask 0xffffff00 broadcast 192.168.123.255
        inet6 fe80::20e:c6ff:fe87:a7a4%ue1 prefixlen 64 scopeid 0xe
        nd6 options=1<PERFORMNUD>
        media: Ethernet autoselect (100baseTX <full-duplex>)
        status: active
ovpns1: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> metric 0 mtu 1500
        options=80000<LINKSTATE>
        inet6 fe80::213:ceff:fedb:836a%ovpns1 prefixlen 64 scopeid 0xf
        inet 192.168.231.1 --> 192.168.231.2 netmask 0xffffffff
        nd6 options=3<PERFORMNUD,ACCEPT_RTADV>
        Opened by PID 15623
WAN is ue0, LAN is ue1
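In the output above, ue0 (WAN) is flagged up and running but carries no `inet` line, whereas ue1 (LAN) still has its IPv4 address. A rough sketch of how that condition could be spotted automatically is shown below. It assumes clean FreeBSD-style `ifconfig` text as input; the function names are invented for illustration, and the sample is abbreviated from the output in this post.

```python
import re

# Matches an interface header such as "ue0: flags=8843<UP,...> metric 0 mtu 1500".
IFACE_RE = re.compile(r"^(\S+): flags=\w+\s*<([^>]*)>")


def parse_ifconfig(text):
    """Map interface name -> {"flags": set of flag names, "inet": [IPv4 addresses]}."""
    interfaces = {}
    current = None
    for line in text.splitlines():
        header = IFACE_RE.match(line)
        if header:
            flags = {f.upper() for f in header.group(2).split(",") if f}
            current = {"flags": flags, "inet": []}
            interfaces[header.group(1)] = current
        elif current is not None:
            fields = line.split()
            if len(fields) >= 2 and fields[0] == "inet":
                current["inet"].append(fields[1])
    return interfaces


def up_without_ipv4(interfaces):
    """Names of interfaces that are flagged UP but have no IPv4 address."""
    return sorted(
        name for name, info in interfaces.items()
        if "UP" in info["flags"] and not info["inet"]
    )


SAMPLE = """\
ue0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        ether 00:1c:10:49:c6:2e
        inet6 fe80::21c:10ff:fe49:c62e%ue0 prefixlen 64 scopeid 0xd
        status: active
ue1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        inet 192.168.123.1 netmask 0xffffff00 broadcast 192.168.123.255
        status: active
"""

print(up_without_ipv4(parse_ifconfig(SAMPLE)))  # ['ue0']
```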
I think I'll have to write some sort of simple monitoring script to grab the ifconfig output the first time an arpresolve error shows up.
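One possible shape for such a monitoring script is sketched below. It is untested on pfSense and rests on some assumptions: a Python 3.7+ interpreter is available on the box (not a given on a stock install), `dmesg` and `ifconfig` are on the PATH, and the output path and polling interval are arbitrary choices. It also matches any arpresolve line still in the message buffer, so messages left over from an earlier incident would trigger it immediately.

```python
import subprocess
import time

# Text of the kernel message quoted at the top of this thread.
PATTERN = "arpresolve: can't allocate llinfo"


def first_match(lines, pattern=PATTERN):
    """Return the first line containing pattern, or None if there is none."""
    for line in lines:
        if pattern in line:
            return line.rstrip("\n")
    return None


def run(cmd):
    """Run a command and return its standard output as text."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return result.stdout


def watch(interval=5, out_path="/tmp/arpresolve-ifconfig.txt"):
    """Poll dmesg until the error appears, then save an ifconfig snapshot.

    out_path and interval are hypothetical defaults; adjust to taste.
    """
    while True:
        hit = first_match(run(["dmesg"]).splitlines())
        if hit:
            with open(out_path, "w") as fh:
                fh.write(hit + "\n\n" + run(["ifconfig"]))
            return hit
        time.sleep(interval)


# On the firewall itself, start the watcher with:
#     watch()
```

Polling is crude but keeps the script short; the snapshot is written once and the loop exits, so the file reflects the interface state at the first occurrence.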
-
I thought I might have fallen foul of the spoof/dhclient issue, but I removed the spoofing and I'm still getting the problem.
Any suggestions for diagnosing the DHCP client? I'm no stranger to the command line or Unix generally, but I am a pfSense/FreeBSD newbie.
-
OK, I've "fixed" this, but I don't understand why:
I swapped in a GbE ExpressCard (RTL 8111 chipset). I had intended to use it to connect to the GbE LAN, but instead I'm having to use it as the 100Mb WAN connection to the cable modem. The problem manifests itself with EITHER of my two USB NICs (totally different brands but the same ASIX AX88178 chipset) as the WAN NIC.
I also noticed that with the USB adapters on the WAN, the DHCP-assigned DNS servers showed up in the ARP table; with the ExpressCard they do not. I made no DHCP or DNS changes in pfSense, just the WAN/LAN swap away from using USB for the WAN. There was no performance issue with the USB NICs: both were quite capable from a throughput and latency (ping) perspective, and the ExpressCard as WAN has not changed those numbers either.
I am supremely puzzled. ??? :-[
On a related note, I've discovered that my ISP's gateway doesn't appreciate ICMP status monitoring packets. It starts to ignore them after a while, which triggers WAN resets. Some kind of IDS maybe?
-
Some sort of driver conflict, where a USB port could be either a serial NIC or a bus interface to which a NIC is attached? Maybe the OS gets confused about which is the case.
I have a bunch of USB NICs with ASIX chips in them sitting around for cases where I need more ethernet ports than are available, so this could come to bite me at some point in the future, too. :(
-
That's the direction I was wondering, but why is it only pathological when on the WAN port? If it was hardware or IRQ related I'd expect it to show up there too. ???
-
but why is it only pathological when on the WAN port?
Not an answer, just more speculation: since pfSense 2.x it doesn't need a LAN port, so the system can have N additional NICs, but it MUST have a WAN port. So assume something like an occasional USB bus reset happens. If a LAN port goes away for a short moment, the system may handle that like a hot-plug event for an optional interface. But if the MUST HAVE WAN port disappears for ever so short a moment, it may cause the system to throw a fit.
Again, that's just speculation, but given that the WAN port has a special standing, it could relate to that.
-
You may be on to something here. I just checked and there's a handful of ue0 DOWN then UP transitions in the dmesg output.
This is an elderly laptop with only two USB 2.0 ports, and a few 1.0 ports that I think are only exposed on a dock that I don't have. Here's the USB (+ serial) and axe0/ue0 related parts of the dmesg -a output:
uhci0: <Intel 82801FB/FR/FW/FRW (ICH6) USB controller USB-A> port 0xbfe0-0xbfff irq 23 at device 29.0 on pci0
uhci0: [ITHREAD]
uhci0: LegSup = 0x2f00
usbus0: <Intel 82801FB/FR/FW/FRW (ICH6) USB controller USB-A> on uhci0
uhci1: <Intel 82801FB/FR/FW/FRW (ICH6) USB controller USB-B> port 0xbf80-0xbf9f irq 19 at device 29.1 on pci0
uhci1: [ITHREAD]
uhci1: LegSup = 0x2f00
usbus1: <Intel 82801FB/FR/FW/FRW (ICH6) USB controller USB-B> on uhci1
uhci2: <Intel 82801FB/FR/FW/FRW (ICH6) USB controller USB-C> port 0xbf60-0xbf7f irq 18 at device 29.2 on pci0
uhci2: [ITHREAD]
uhci2: LegSup = 0x2f00
usbus2: <Intel 82801FB/FR/FW/FRW (ICH6) USB controller USB-C> on uhci2
uhci3: <Intel 82801FB/FR/FW/FRW (ICH6) USB controller USB-D> port 0xbf40-0xbf5f irq 16 at device 29.3 on pci0
uhci3: [ITHREAD]
uhci3: LegSup = 0x2f00
usbus3: <Intel 82801FB/FR/FW/FRW (ICH6) USB controller USB-D> on uhci3
ehci0: <Intel 82801FB (ICH6) USB 2.0 controller> mem 0xcddffc00-0xcddfffff irq 23 at device 29.7 on pci0
ehci0: [ITHREAD]
usbus4: EHCI version 1.0
usbus4: <Intel 82801FB (ICH6) USB 2.0 controller> on ehci0
pcib4: <ACPI PCI-PCI bridge> at device 30.0 on pci0
pci5: <ACPI PCI bus> on pcib4
iwi0: <Intel(R) PRO/Wireless 2200BG> mem 0xcdcff000-0xcdcfffff irq 22 at device 5.0 on pci5
...
uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
uart0: [FILTER]
...
usbus0: 12Mbps Full Speed USB v1.0
usbus1: 12Mbps Full Speed USB v1.0
usbus2: 12Mbps Full Speed USB v1.0
usbus3: 12Mbps Full Speed USB v1.0
usbus4: 480Mbps High Speed USB v2.0
...
ugen0.1: <Intel> at usbus0
uhub0: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus0
ugen1.1: <Intel> at usbus1
uhub1: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus1
ugen2.1: <Intel> at usbus2
uhub2: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus2
ugen3.1: <Intel> at usbus3
uhub3: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus3
ugen4.1: <Intel> at usbus4
uhub4: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus4
uhub0: 2 ports with 2 removable, self powered
uhub1: 2 ports with 2 removable, self powered
uhub2: 2 ports with 2 removable, self powered
uhub3: 2 ports with 2 removable, self powered
...
uhub4: 8 ports with 8 removable, self powered
Root mount waiting for: usbus4
ugen4.2: <vendor 0x05ac> at usbus4
axe0: <vendor 0x05ac product 0x1402, rev 2.00/0.01, addr 2> on usbus4
...
miibus1: <MII bus> on axe0
ukphy0: <Generic IEEE 802.3u media interface> PHY 16 on miibus1
ukphy0: none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
ue0: <USB Ethernet> on axe0
...
ue0: link state changed to DOWN
ue0: link state changed to UP
The "Root mount waiting for: usbus4" is interesting. I never noticed that before. I don't understand that given that root is on Pri/IDE.
Are you thinking I could put some kind of "hint.[driver].[number].irq=[number]" in loader.conf or something?
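To put a number on the "handful" of ue0 link flaps mentioned above, the dmesg text could be tallied with something like the sketch below. The function name is made up, and the sample is just the two link-state lines quoted in this post, so treat it as illustrative only.

```python
import re
from collections import Counter

# Matches kernel messages such as "ue0: link state changed to DOWN".
LINK_RE = re.compile(r"(\w+): link state changed to (UP|DOWN)")


def link_flaps(dmesg_text):
    """Count link-state messages per (interface, state) pair."""
    counts = Counter()
    for iface, state in LINK_RE.findall(dmesg_text):
        counts[(iface, state)] += 1
    return counts


sample = (
    "ue0: link state changed to DOWN\n"
    "ue0: link state changed to UP\n"
)
flaps = link_flaps(sample)
print(flaps[("ue0", "DOWN")], flaps[("ue0", "UP")])  # 1 1
```

Comparing the counts for ue0 and ue1 over the same period would show whether the flaps are really confined to the WAN adapter.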