2.4.2 in HA mode: NBNS storm kills WAN
-
That looks fine.
You'll need to either stop accepting that traffic into the LAN ports or trace it and figure out what is reflecting it. Before accepting a workaround I would try to isolate the actual cause if for no other reason than understanding what the issue actually is.
One thing to note is that with the pass, source any rule on NAT_LAN, the router has no idea that 172.222.22.255 is a broadcast address because it has no netmask to reference. .255 is a valid last octet for a host on, say, a /22. If the "broadcast" arrives addressed to the NAT_LAN interface, it should be forwarded.
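To illustrate the netmask point, here is a minimal sketch using Python's standard `ipaddress` module. The helper name `is_broadcast` is only for illustration. It shows that whether 172.222.22.255 counts as a broadcast address depends entirely on the prefix length you assume.

```python
import ipaddress


def is_broadcast(addr: str, prefix_len: int) -> bool:
    """Return True if addr is the broadcast address of its subnet
    for the given prefix length (illustrative helper)."""
    iface = ipaddress.ip_interface(f"{addr}/{prefix_len}")
    return iface.ip == iface.network.broadcast_address


for prefix in (24, 23, 22):
    print(f"/{prefix}: {is_broadcast('172.222.22.255', prefix)}")
# /24: True
# /23: False
# /22: False
```

With a /23 or /22, the broadcast address for that range is 172.222.23.255 and 172.222.22.255 is just another host address.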
Are the NAT_LAN interfaces in promiscuous mode or something for some reason?
Have you done any pcaps?
I'd pcap on LAN - you only need enough to see what is looping/reflecting. I'd set the packet counter to 1000 or something.
Then I'd pcap on WAN, same thing.
You'll probably need to look at the MAC addresses, etc.
It would be best to capture the same test on both interfaces at the same time but without managed switches and mirror ports you'll have to start at least one of the captures manually in the shell.
You can start one and then run ps axww | grep tcpdump to see the commands that are run for each interface.
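Once you have the captures, something like the following can help with the MAC address comparison. It is a rough sketch, not a replacement for Wireshark: it assumes the file is a classic pcap (not pcapng) with an Ethernet link type, and the function name `source_macs` is made up for this example. It simply counts frames per source MAC.

```python
import struct
from collections import Counter

# Classic pcap magic numbers as they appear on disk
# (microsecond and nanosecond timestamp variants).
_LITTLE = (b"\xd4\xc3\xb2\xa1", b"\x4d\x3c\xb2\xa1")
_BIG = (b"\xa1\xb2\xc3\xd4", b"\xa1\xb2\x3c\x4d")


def source_macs(pcap_bytes: bytes) -> Counter:
    """Count Ethernet source MACs in a classic pcap capture."""
    if len(pcap_bytes) < 24:
        raise ValueError("too short to be a pcap file")
    magic = pcap_bytes[:4]
    if magic in _LITTLE:
        endian = "<"
    elif magic in _BIG:
        endian = ">"
    else:
        raise ValueError("not a classic pcap file")

    # Link type is the last field of the 24-byte global header; 1 = Ethernet.
    (linktype,) = struct.unpack(endian + "I", pcap_bytes[20:24])
    if linktype != 1:
        raise ValueError("only Ethernet captures are handled in this sketch")

    counts = Counter()
    offset = 24
    while offset + 16 <= len(pcap_bytes):
        # Per-packet header: ts_sec, ts_frac, captured length, original length.
        _sec, _frac, incl_len, _orig_len = struct.unpack(
            endian + "IIII", pcap_bytes[offset:offset + 16]
        )
        offset += 16
        frame = pcap_bytes[offset:offset + incl_len]
        offset += incl_len
        if len(frame) >= 14:
            # Ethernet header: 6 bytes destination, 6 bytes source, 2 bytes type.
            src = ":".join(f"{b:02x}" for b in frame[6:12])
            counts[src] += 1
    return counts
```

To use it, read a capture file in binary mode, pass the bytes to `source_macs`, and print `most_common()` on the result. Run it on the LAN capture and the WAN capture and compare which source MACs show up in both.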
-
So here is what I did.
Tested this and captured on LAN interfaces, then on WAN.
Those are attached; just remove .txt if you want to view them.
Turned off the policy route, but that made no difference for this problem.
Then I changed the LAN rule to only accept traffic from our 172.22.22.0/23 network.
That seems to have fixed it.
I still have the policy routes disabled, but I'm not sure I really need them.
I thought I got them from the HA setup but can't recall where and why.
So I think this is a temporary fix, but I'm not sure of the consequences of disabling the rule yet.
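As a quick sanity check on why restricting the LAN rule source to 172.22.22.0/23 stops the mistyped host, here is a small sketch with Python's `ipaddress` module. It only models the source-address match, not how pf actually evaluates rules, and `source_allowed` is a name invented for this example.

```python
import ipaddress

# The LAN subnet named in the rule.
LAN_NET = ipaddress.ip_network("172.22.22.0/23")


def source_allowed(src: str) -> bool:
    """True if src falls inside the LAN subnet used as the rule's source."""
    return ipaddress.ip_address(src) in LAN_NET


print(source_allowed("172.22.22.92"))   # True: correctly addressed host
print(source_allowed("172.222.22.92"))  # False: the mistyped address
```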
[capture_ fw1_WAN_failure.cap.txt](/public/imported_attachments/1/capture_ fw1_WAN_failure.cap.txt)
capture_fw2_WAN_failure.cap.txt
capture_fw1_novpn_no_pr_failure.cap.txt
capture_fw2_novpn_nopr_failure.cap.txt
-
These MAC addresses are reflecting the bad Src: 172.222.22.92, Dst: 172.222.23.255 traffic that gets irresponsibly put out on WAN back and forth. Fix that and you fix your problem.
00:a0:d1:ea:eb:f4
00:26:6c:f1:ff:d0
Where is this? OVH or something?
-
That's the VM that we use for this testing.
It's a Windows 2012 server whose IP we changed to create this situation.
It is sitting on the Hyper-V cluster.
The problem is that if we accidentally type in 172.222.x.x instead of 172.22.x.x, it creates the storm.
We discovered that when one of the guys made exactly that mistake.
It does not happen if we turn off the backup firewall or disable the LAN side of the backup firewall.
The way we make the storm stop is to correct the IP and disable the WAN NIC on the primary firewall.
I changed the rules for the LAN sections to only allow 172.22.22.x/23, which is the subnet we are using.
I also removed the policy bypass rule that you see in the image attached before. Not sure why it's there or whether it's needed.
-
Why is that MAC address in a pcap on WAN then? Seems you have some sorting out to do there. Inside MAC addresses should never be on the WAN layer 2.
-
Why is that MAC address in a pcap on WAN then? Seems you have some sorting out to do there. Inside MAC addresses should never be on the WAN layer 2.
That's what all of this has been about.
Trying to sort that out.
-
Hint: It's not pfSense.
-
Hint: It's not pfSense.
Please explain how it can't be pfSense.
Look at my network drawing.
2 dumb switches, not connected to each other.
2 pfSense boxes.
1 dumb hub.
The pfSense boxes are connected to the switches via 2 WAN interfaces.
The pfSense boxes are connected to the dumb hub with 2 LAN interfaces.
The only other device involved is a laptop connected to the hub.
Since the switches and hub have NO connection between them, how can it not be pfSense?
The only thing that connects all the devices together is the pfSense boxes.
primary pfSense is connected to switch 1 via WAN
primary pfSense is connected to switch 2 via WAN
primary pfSense is connected to hub via LAN
backup pfSense is connected to switch 1 via WAN
backup pfSense is connected to switch 2 via WAN
backup pfSense is connected to hub via LAN
-
pfSense will not put an inside MAC address on the outside without a bridge interface. period. Check your layer 2.
-
There is no physical connection between any of the 3 switches.
2 of them are dumb Netgear switches I purchased for this testing.
The hub is something very old and retired.
The only physical connection between them is via pfSense, so I am at a loss when you say it's not pfSense.
Thank you for all your help. I really appreciate your input even if I am a bit confused.
-
All I can say is check again. It is pretty much impossible to have an inside MAC address on a WAN pcap without some sort of layer 2 connectivity between inside and outside.