pfSense fails to reply to ARP requests
-
Hi,
Very rarely, our pfSense router stops replying to ARP requests, which makes the internet unreachable from the LAN.
A reboot of the pfSense router fixes the issue. The router itself stays reachable remotely over the internet during such an outage. I ran the packet capture tool on pfSense during the outage; you can find the capture here: https://www.cloudshark.org/captures/553e8d631a65
What could cause such behaviour?
-
So that is a sniff on the pfSense LAN?
What is the mask on your interface? I see one packet from 192.168.30.11 to 8.8.8.8. What MAC did it send that to? Why would you be seeing traffic from 192.168.30/? when your pfSense LAN IP is 192.168.1.254/?
-
Yep, the sniff is on the pfSense LAN. We have a couple of guest (wifi) networks on a different VLAN. Not sure why that traffic shows up on the LAN capture…
-
It shouldn't, so you should figure out why that is for sure. It looks to be tagged with VLAN ID 56 and sent to 00:dd:2a:e8:31:02.
What is that? I don't find any vendor for the 00:dd:2a prefix.
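For anyone who wants to check a MAC by hand: the first three octets are the vendor prefix (OUI), and two bits in the first octet say whether the address is locally administered or multicast. A minimal sketch in Python; the helper name `mac_info` is made up for illustration, and it does not look anything up in a vendor database:

```python
def mac_info(mac: str) -> dict:
    """Split a MAC address into its OUI and the two flag bits of the first octet."""
    octets = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(octets) != 6:
        raise ValueError("a MAC address has exactly 6 octets")
    first = octets[0]
    return {
        # Vendor prefix: the first three octets.
        "oui": ":".join(f"{b:02x}" for b in octets[:3]),
        # Bit 0x02 of the first octet set = locally administered (not vendor-assigned).
        "locally_administered": bool(first & 0x02),
        # Bit 0x01 of the first octet set = group (multicast/broadcast) address.
        "multicast": bool(first & 0x01),
    }


info = mac_info("00:dd:2a:e8:31:02")
print(info)
```

For the address in the capture, both flag bits are clear, so it is formatted as an ordinary vendor-assigned unicast address with the prefix 00:dd:2a.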
You're saying that when this happens you reboot pfSense and everything is fine? Did you try just taking the interface down and up, or doing the same to the switch port it is connected to? Is that MAC your pfSense interface?
So the pfSense LAN interface is a physical interface, and the IP you're ARPing for is on that interface and is not a VIP? And pfSense is not a VM or anything, is it? What hardware is it running on?
-
00:dd:2a:e8:31:02 is the MAC of the LAN interface on the pfSense box. The (tagged) VLANs are also created on this interface. VLAN ID 56 is indeed one of our guest networks.
I didn't try just disabling and re-enabling the interface. I will do that next time it happens. The pfSense LAN is a physical interface, connected to a non managed switch. The IP the clients are ARPing for is not a VIP. pfSense is running on a router motherboard with 6 Intel network interfaces, 2GB RAM, a 32GB SSD and a Celeron 1037U CPU.
Thanks for your effort so far!
-
"The PFsense lan is a physical interface, connected to a non managed switch."
How are you doing VLAN tagging if you're connected to a dumb switch??
-
The switch doesn't strip the VLAN tags, so that works perfectly. In fact, most unmanaged switches let VLAN tags through.
Also, this issue occurred when we had a different (managed) switch in place as well.
-
While the switch might not strip the tags, it doesn't actually isolate the traffic, so you have no actual barrier between your VLANs. You might as well just be running multiple layer 3 networks on the same layer 2.
So if I am on your guest VLAN, I could access stuff on your other VLANs if I just ARP for the IP, etc.
It's a borked configuration for sure.
-
I set up the firewall to not allow VLANs to access each other's subnets. This works fine.
As I said, the issue happened with a managed switch as well.
-
Maybe an improperly-configured managed switch.
-
^ Possible. Whatever problems you're seeing are now suspect because you're trying to run multiple layer 3 networks over the same layer 2, whether you're tagging the traffic or not. Trying to use VLANs on a switch that does not support VLANs throws in all kinds of new variables that could be causing problems.
I also have a question about the hardware you're running on and that MAC. I cannot find the maker of that MAC anywhere.
That you think it's OK to run tagged traffic over a dumb switch just because it doesn't strip the tags just makes no sense to me.
"I set up the firewall to not allow VLANs to access each other's subnets. This works fine."
No, it doesn't. You're on one big broadcast domain. Anyone can talk to anyone else if they know the IP address, or just freaking ARP for it… So while your setup might keep grandma Jane from talking to stuff on the other VLANs, anyone with clue one, or who knows how to google for information, can talk to anything else they want, no matter what firewall rules you put in place on pfSense. Whether you tag the traffic or not, you're connected to the same layer 2. Your switch is not isolating the traffic, it's dumb!!
-
"That you think it's OK to run tagged traffic over a dumb switch just because it doesn't strip the tags just makes no sense to me."
Actually, it's entirely accurate. A VLAN frame is still just an Ethernet frame, with the VLAN header inserted. A non-managed switch should pass VLAN frames, but it can't create or terminate them.
-
But you are mixing your broadcast domains. It's messy and pretty much an invalid configuration just about guaranteed to fail in unpredictable ways such as what you are seeing.
"Actually, it's entirely accurate. A VLAN frame is still just an Ethernet frame, with the VLAN header inserted. A non-managed switch should pass VLAN frames, but it can't create or terminate them."
It also forwards all frames to all ports instead of just the ports on that VLAN, which is broken behavior.
If you think "pfSense is failing to respond to ARP request" and you have a misconfigured network, don't be surprised if people tell you to fix your network first.
At least back up your claim with some packet captures.
-
"It also forwards all frames to all ports instead of just the ports on that VLAN, which is broken behavior."
You may want to read up on Ethernet frames. An Ethernet frame contains destination and source MACs, payload and frame check sequence. As far as switch forwarding is concerned, everything between the MAC addresses and the FCS is payload, including the VLAN header(s). Since the MAC addresses are in the same location as always, switch forwarding works as always. Passing VLAN traffic through an un-managed switch is no different than passing it through a trunk port in a managed switch, in that all traffic, VLAN or not, is passing through it. Please compare an un-managed switch with a managed switch where all ports are trunk ports. As for broadcast domains, the same amount of traffic is present, but devices see it according to the VLAN, if any, they're listening to.
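To illustrate the point about MAC positions, here is a minimal Python sketch that builds the same frame with and without an 802.1Q tag and parses both. The function names, the source MAC and the dummy payload are assumptions made up for the example; the FCS is left out for simplicity.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q tag


def build_frame(dst, src, ethertype, payload, vlan_id=None):
    """Build an Ethernet II frame (without FCS), optionally with one 802.1Q tag."""
    header = dst + src
    if vlan_id is not None:
        # Tag = TPID (2 bytes) + TCI (2 bytes); priority and DEI bits are left at 0.
        header += struct.pack("!HH", TPID_8021Q, vlan_id & 0x0FFF)
    header += struct.pack("!H", ethertype)
    return header + payload


def parse_frame(frame):
    """Return the MACs, VLAN ID (or None), EtherType and payload of a frame."""
    dst, src = frame[0:6], frame[6:12]
    (ethertype,) = struct.unpack("!H", frame[12:14])
    vlan_id, offset = None, 14
    if ethertype == TPID_8021Q:
        tci, ethertype = struct.unpack("!HH", frame[14:18])
        vlan_id = tci & 0x0FFF  # low 12 bits of the TCI are the VLAN ID
        offset = 18
    return {"dst": dst, "src": src, "vlan": vlan_id,
            "ethertype": ethertype, "payload": frame[offset:]}


DST = bytes.fromhex("00dd2ae83102")   # LAN MAC mentioned in this thread
SRC = bytes.fromhex("020000000001")   # hypothetical client MAC
ARP = 0x0806
untagged = build_frame(DST, SRC, ARP, b"\x00" * 28)
tagged = build_frame(DST, SRC, ARP, b"\x00" * 28, vlan_id=56)

# The first 12 bytes (both MACs) are identical; the tag only adds 4 bytes after them.
print(tagged[:12] == untagged[:12], len(tagged) - len(untagged))
print(parse_frame(tagged)["vlan"], parse_frame(untagged)["vlan"])
```

A device that only looks at the first 12 bytes to make its forwarding decision sees no difference between the two frames.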
https://en.wikipedia.org/wiki/Ethernet_frame
https://en.wikipedia.org/wiki/Virtual_LAN
-
And you may want to read up on broadcast domains. Your configuration is nothing that someone who wanted to actually do work on his network would do.
And, again, provide the pcap of pfSense not responding to ARP.
"Not sure why that traffic shows up on the LAN capture…"
Because you are mixing all your broadcast domains up. They are to be separated using VLANs for a reason.
-
I am quite familiar with what broadcast domains are. I am well aware that VLANs are used to isolate broadcast domains. However, again, passing VLANs through an un-managed switch is no different than passing them through a managed switch with all ports configured as trunks. In neither case is the VLAN traffic isolated from non-VLAN traffic. It is only at the access ports on a managed switch, configured for a specific VLAN or the native LAN, that there's any difference. There you only see the VLAN or native traffic, as configured.
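To make that comparison concrete, here is a toy Python model of how a broadcast frame tagged with a VLAN ID gets flooded. It is only a sketch: the `Frame` class, the port layouts and the function names are made up for illustration, it handles tagged frames only (no native VLAN), and it ignores MAC learning entirely.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Frame:
    vlan: Optional[int]  # 802.1Q VLAN ID, or None for an untagged frame


def flood_unmanaged(ports, ingress, frame):
    """Unmanaged switch: a broadcast leaves every port except the ingress, tag untouched."""
    return {p: frame for p in ports if p != ingress}


def flood_managed(port_cfg, ingress, frame):
    """Toy VLAN-aware switch.

    port_cfg maps port -> ("trunk", set_of_vlan_ids) or ("access", vlan_id).
    """
    out = {}
    for port, (mode, vlans) in port_cfg.items():
        if port == ingress:
            continue
        if mode == "trunk" and frame.vlan in vlans:
            out[port] = frame                # trunk port: tag is kept
        elif mode == "access" and frame.vlan == vlans:
            out[port] = Frame(vlan=None)     # access port: tag is stripped
    return out


guest = Frame(vlan=56)  # broadcast on the guest VLAN, entering on port 1
all_trunks = {p: ("trunk", {1, 56}) for p in (1, 2, 3, 4)}
mixed = {1: ("trunk", {1, 56}), 2: ("access", 56), 3: ("access", 1), 4: ("trunk", {1, 56})}

print(flood_unmanaged([1, 2, 3, 4], 1, guest) == flood_managed(all_trunks, 1, guest))
print(sorted(flood_managed(mixed, 1, guest)))
```

In this model, a managed switch with every port a trunk carrying every VLAN floods exactly like the unmanaged one. Only in the mixed setup do the access ports filter by VLAN and strip the tag.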
BTW, I am a Cisco CCNA and have been working with LANs since 1978 (yep, before there was such a thing as Ethernet). I have also worked with the original 10base5 "ThickNet" Ethernet and Token Ring. I have worked with VLANs and had them pass through an un-managed switch without issue. I have fired up Wireshark to see the mix of LAN & VLAN traffic on a network. You can connect a computer to a trunk port, configure the NIC for whatever combination of VLANs you wish (very easy to do in Linux) and not even need an access port configured for a VLAN. Again, I have done that. I have even worked on systems with double VLAN headers, so yes, I have lots of experience in this area.
-
I'm a CCNA too and have tapped my share of thicknet and there is no way I would ever design a network passing dot1q through an unmanaged switch.
Difference is I am not posting on some forum asking why ARP is showing up on the wrong layer 3 interface.
-
"…there is no way I would ever design a network passing dot1q through an unmanaged switch."
I guess it's time for a "black box" test. You have a switch, you don't know whether it's a managed switch with all trunk ports or an un-managed switch, and you have no way to find out. Please explain any difference when passing mixed VLAN and non-VLAN traffic through it that might help you decide, and why. If you can't determine a difference, then there will be no difference in practice.
-
If you know you are dealing with all trunk ports, you know the switch can handle the increased frame size and doesn't choke on passing the traffic, and you know that all ports should receive all VLANs, then you could use one. I would still not do it in any network that mattered.
Obviously OP is missing something in this regard.
-
Why do you think a switch can't handle the increased frame size? It's only when the Ethertype/Length field is used as a length that it's an issue. With Ethernet II (DIX), there is no length field and the frame ends only when the data stops. This is what allows jumbo frames. However, jumbo frames are so much larger that hardware has to be built to handle them. Standard Ethernet II frames can handle 1536 bytes, with IP MTU limited to 1500, so there's plenty of space for the VLAN header, or even 2, at 4 bytes per header. If a switch can properly handle Ethernet II, it can handle VLAN & IP, and IP is normally carried on Ethernet II.
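The Ethertype/Length distinction can be sketched in a few lines of Python. This assumes the usual convention that values up to 1500 are read as a length and values from 1536 (0x0600) upward as an EtherType; the function name is made up for the example.

```python
def classify_type_length(value: int) -> str:
    """Interpret the 16-bit field that follows the source MAC (or the VLAN tag)."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("the field is 16 bits wide")
    if value <= 1500:
        return "length"      # 802.3-style: number of payload bytes
    if value >= 1536:
        return "ethertype"   # Ethernet II (DIX): protocol identifier
    return "undefined"       # 1501..1535 is not assigned either meaning


for v in (46, 1500, 1536, 0x0800, 0x0806, 0x8100):
    print(hex(v), classify_type_length(v))
```

IPv4 (0x0800), ARP (0x0806) and the 802.1Q tag marker (0x8100) all fall in the EtherType range, so none of them is read as a length.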
Incidentally, over the years, I've often challenged "common knowledge" and found that it's not entirely accurate. This is one example, where people make assumptions based on this common knowledge. While they're generally true, they're not absolutely true. I try to verify this through experiment, if possible. You can do the same with an un-managed switch, Wireshark and a couple of computers running Linux. Give it a try and see what turns up. Try again with a managed switch and trunk ports. You can also learn a lot by getting into the details of the protocols to see where limits may or may not exist. One example of this is the length field in 802.3 vs type field in Ethernet II. It is that length field that's the origin of the 1500 MTU limit, even though IP doesn't normally use 802.3. In comparison, other network types, such as token ring or WiFi have a much larger MTU. In fact the maximum IP MTU is 65K and even that's exceeded in some circumstances with "Jumbograms".
https://en.wikipedia.org/wiki/Jumbogram
Bottom line, don't take "common knowledge" as absolute. It isn't.