Multiple (separate) firewalls on same network, weird "drops"
Admin: Sorry if this is in the wrong forum section. Since I am guessing this mostly has to do with CARP, I am trying this forum.
I searched and could not find any similar posts.
Here we go:
Problem: slow connections to/from servers, timeouts and possibly drops.
External IP-network X.Y.Z.0/24
Firewall #1 WAN: X.Y.Z.2
This firewall has some but not all hosts on the X.Y.Z.0/24 network. Has clients and servers with external IPs on the X.Y.Z.0/24-network and uses 192.168.0.0/24 addresses internally.
Behind Firewall #1 we have Server #1 using the external IP address: X.Y.Z.100
Firewalls #2 and #3 are a CARP failover cluster serving some, but not all, hosts on the X.Y.Z.0/24 network. The cluster also uses NAT and also uses the 192.168.0.0/24 subnet internally, as well as other internal networks.
WAN-CARP is: X.Y.Z.3
and Firewall #2 WAN is: X.Y.Z.4, Firewall #3 WAN-addr is: X.Y.Z.5.
Behind the firewall cluster we have Server #2 using external IP address: X.Y.Z.200
I also have some other simple/"small" broadband routers just for testing purposes. All Firewalls/routers use the same gateway (X.Y.Z.1).
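To rule out two devices claiming the same external address, the layout above can be written down and checked mechanically. This is only a minimal sketch: 203.0.113.0/24 is a documentation range standing in for X.Y.Z.0/24, and the device names and the `find_conflicts` helper are made up for illustration. To be useful, the lists would need to contain every WAN address, virtual IP and 1:1 NAT external address actually configured on each box.

```python
import ipaddress
from collections import defaultdict

# Assumption: 203.0.113.0/24 stands in for the real X.Y.Z.0/24 network.
WAN_NET = ipaddress.ip_network("203.0.113.0/24")
GATEWAY = ipaddress.ip_address("203.0.113.1")

# External addresses each device answers for, taken from the post.
CLAIMS = {
    "firewall1": ["203.0.113.2", "203.0.113.100"],
    "cluster": ["203.0.113.3", "203.0.113.4", "203.0.113.5", "203.0.113.200"],
}


def find_conflicts(claims, net=WAN_NET, gateway=GATEWAY):
    """Return a list of human-readable problems found in the address plan."""
    owners = defaultdict(list)
    problems = []
    for device, addrs in claims.items():
        for text in addrs:
            ip = ipaddress.ip_address(text)
            if ip not in net:
                problems.append(f"{device}: {ip} is outside {net}")
            elif ip == gateway:
                problems.append(f"{device}: {ip} is the gateway address")
            owners[ip].append(device)
    # Any address listed under more than one device is a conflict.
    for ip, devices in sorted(owners.items()):
        if len(devices) > 1:
            problems.append(f"{ip} is claimed by {', '.join(devices)}")
    return problems


if __name__ == "__main__":
    for problem in find_conflicts(CLAIMS) or ["no overlapping claims found"]:
        print(problem)
```

An empty result only says the plan as written down is consistent. It does not prove that the running configuration matches it.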
What happens is the following:
Traffic to/from Server #1, out "through" Firewall #1, works like a charm. NAT (1:1) works perfectly. No drops, fast connections etc.
External traffic from other external networks works fine, and so does traffic via my broadband router. Internal traffic from/to clients, or between internal clients and the internal server, works perfectly. All is fine.
Traffic to/from Firewalls #2 and #3 themselves works well; SSH or HTTPS to the WAN address works well etc.
BUT, and this is what I cannot understand, traffic to/from hosts behind the firewall cluster does not work, even though the config is exactly the same as on Firewall #1, i.e. there is no reason to suspect typos or similar. I have cross-checked a million times.
The hardware looks good. Software looks good (all systems run 2.0.2). The only difference between the systems is that Firewall #1 and Firewall #3 run nanobsd (4G) on amd64, whereas Firewall #2 runs i386, "normal install" (i.e. not nanobsd).
Could this be causing the problems?
OR is there something else causing, let's say, Firewall #1 to "intercept" traffic that "should" go to Firewall #2 for some reason? If so, do we need to set up separate routes, or even an extra gateway "in front of" the firewalls that routes a subnet to the firewall cluster, so that traffic is not "intercepted" by "the wrong firewall"?
Traffic to/from hosts behind Firewall #1 works great; problems only occur to/from hosts behind Firewall #2. So in this scenario #1 would intercept some traffic aimed at #2, but not the other way around. I find it hard to believe that this is the case, but I am running out of ideas here…
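One way to test the "interception" theory is to look at which MAC address answers ARP for the external IP of Server #2, as seen from another device on the WAN segment. The sketch below parses `arp -an` output in the FreeBSD-style format and flags CARP virtual MACs, which have the form 00:00:5e:00:01:VHID. The sample lines, the MAC addresses and the 203.0.113.x addresses (standing in for X.Y.Z.x) are invented for illustration. Whether X.Y.Z.200 should resolve to a CARP MAC at all depends on how that virtual IP is configured on the cluster, which is an assumption here.

```python
import re

# Matches the "(ip) at mac" part of a FreeBSD-style `arp -an` line.
ARP_LINE = re.compile(r"\((?P<ip>[0-9.]+)\) at (?P<mac>[0-9a-fA-F:]{17})")
# CARP virtual MACs are 00:00:5e:00:01:<VHID in hex>.
CARP_PREFIX = "00:00:5e:00:01:"

# Made-up sample; real output would come from a host on the WAN switch.
SAMPLE = """\
? (203.0.113.100) at 00:1b:21:aa:bb:01 on em0 expires in 1180 seconds [ethernet]
? (203.0.113.200) at 00:00:5e:00:01:03 on em0 expires in 1175 seconds [ethernet]
"""


def parse_arp(text):
    """Map each IP in the ARP listing to its (lower-cased) MAC address."""
    table = {}
    for line in text.splitlines():
        match = ARP_LINE.search(line)
        if match:  # incomplete entries have no MAC and are skipped
            table[match.group("ip")] = match.group("mac").lower()
    return table


def carp_vhid(mac):
    """Return the VHID if mac is a CARP virtual MAC, else None."""
    mac = mac.lower()
    if mac.startswith(CARP_PREFIX):
        return int(mac[len(CARP_PREFIX):], 16)
    return None


if __name__ == "__main__":
    for ip, mac in parse_arp(SAMPLE).items():
        vhid = carp_vhid(mac)
        kind = f"CARP VHID {vhid}" if vhid is not None else "physical NIC"
        print(f"{ip} -> {mac} ({kind})")
```

If the address of Server #2 showed up with a MAC belonging to Firewall #1, or flipped between two MACs over time, that would support the idea that the wrong box is answering for it.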
All WANs are connected to the same gigabit switch, with no extra "hops" or similar, and the same length and type of cables etc.
Behind the firewalls are different gigabit switches (yes, we have tried replacing switches; errors still occur on Firewalls #2 and #3).
Logs tell us nothing of interest.
Firewalls #2 and #3 have been "factory reset" and completely reconfigured, still same problem.
All firewalls use the exact same hardware, 2x em NICs and 4x Intel NICs (igb).
Carp config looks & works fine.
What we do see, for example, is that the logs on Firewall #2 show traffic aimed at Server #1 being dropped. I assume this is OK: there is no configuration on Firewall #2 for that IP address, and we see on Server #2 that the connections come in there, so Firewall #1 does its job (Firewall #2 silently blocks/drops while Firewall #1 replies to the gateway, connections are established etc).
Has anybody else encountered similar problems? Is it a known limitation/bug, or am I doing something wrong?