WAN + two OptWAN, load balancing fails with weird MAC address behaviour
-
Hello,
I am probably missing something simple here, but this problem is killing me.
pfSense-1.2.3
LAN=192.168.0.1/24
WAN=192.168.3.2/24 gateway 192.168.3.1
WAN1(OPT1 em1)=10.0.0.2/24 gateway 10.0.0.1
WAN2(OPT2 em0)=10.0.1.2/24 gateway 10.0.1.1
All interfaces are on separate NICs, everything is statically addressed, and there is no external switching between the interfaces.
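As a quick sanity check of the addressing listed above, here is a minimal Python sketch using the standard `ipaddress` module. The `INTERFACES` table and the `check_addressing` helper are made up for illustration and are not part of pfSense. It only verifies that each gateway sits inside its interface's subnet and that no two subnets overlap.

```python
import ipaddress

# Addresses copied from the configuration above; the dict layout is an assumption.
INTERFACES = {
    "LAN": ("192.168.0.1/24", None),
    "WAN": ("192.168.3.2/24", "192.168.3.1"),
    "WAN1": ("10.0.0.2/24", "10.0.0.1"),
    "WAN2": ("10.0.1.2/24", "10.0.1.1"),
}


def check_addressing(interfaces):
    """Return a list of human-readable addressing problems (empty if none)."""
    problems = []
    networks = {}
    for name, (cidr, gateway) in interfaces.items():
        iface = ipaddress.ip_interface(cidr)
        networks[name] = iface.network
        # A gateway must be reachable on the interface's own subnet.
        if gateway is not None and ipaddress.ip_address(gateway) not in iface.network:
            problems.append(f"{name}: gateway {gateway} is outside {iface.network}")
    names = list(networks)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            # Overlapping subnets on different NICs would confuse routing.
            if networks[first].overlaps(networks[second]):
                problems.append(f"{first} and {second}: subnets overlap")
    return problems


print(check_addressing(INTERFACES))  # prints []
```

An empty list means the static addressing itself is consistent, so the fault has to be somewhere else.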
I have a load-balancing pool made up of WAN1 and WAN2; the monitor IP for each is its respective gateway.
On LAN I have a rule routing all traffic through this load-balancing pool.
Test scenario: from a PC connected to LAN I periodically go to the same web site, 97.107.134.79, waiting for all states to expire between attempts.
Now the dumps. This is my first attempt, which goes through WAN2:
# tcpdump -ni em0 -e host 97.107.134.79
20:39:48.143093 00:1b:21:7c:a1:6c > 00:26:f2:56:49:bc, ethertype IPv4 (0x0800), length 66: 10.0.1.2.2263 > 97.107.134.79.80: S 3695044687:3695044687(0) win 8192 <mss 1460,nop,wscale 8,nop,nop,sackOK>
20:39:48.144004 00:1b:21:7c:a1:6c > 00:26:f2:56:49:bc, ethertype IPv4 (0x0800), length 66: 10.0.1.2.30502 > 97.107.134.79.80: S 4101759538:4101759538(0) win 8192 <mss 1460,nop,wscale 8,nop,nop,sackOK>
20:39:48.237129 00:26:f2:56:49:bc > 00:1b:21:7c:a1:6c, ethertype IPv4 (0x0800), length 66: 97.107.134.79.80 > 10.0.1.2.2263: S 2213084595:2213084595(0) ack 3695044688 win 5840 <mss 1420,nop,nop,sackOK,nop,wscale 6>
20:39:48.237379 00:26:f2:56:49:bc > 00:1b:21:7c:a1:6c, ethertype IPv4 (0x0800), length 66: 97.107.134.79.80 > 10.0.1.2.30502: S 2203729880:2203729880(0) ack 4101759539 win 5840
After some time the same connection request goes out through WAN1. Please pay attention to the destination MAC addresses:
# tcpdump -ni em1 -e host 97.107.134.79
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on em1, link-type EN10MB (Ethernet), capture size 96 bytes
20:41:44.681552 00:1b:21:7c:a4:57 > 00:26:f2:56:49:bc, ethertype IPv4 (0x0800), length 66: 10.0.0.2.37567 > 97.107.134.79.80: S 2385678252:2385678252(0) win 8192 <mss 1460,nop,wscale 8,nop,nop,sackOK>
20:41:44.686356 00:1b:21:7c:a4:57 > 00:26:f2:56:49:bc, ethertype IPv4 (0x0800), length 66: 10.0.0.2.28692 > 97.107.134.79.80: S 4256596844:4256596844(0) win 8192 <mss 1460,nop,wscale 8,nop,nop,sackOK>
20:41:44.934133 00:1b:21:7c:a4:57 > 00:26:f2:56:49:bc, ethertype IPv4 (0x0800), length 66: 10.0.0.2.56267 > 97.107.134.79.80: S 1503223532:1503223532(0) win 8192 <mss 1460,nop,wscale 8,nop,nop,sackOK>
20:41:47.680118 00:1b:21:7c:a4:57 > 00:26:f2:56:49:bc, ethertype IPv4 (0x0800), length 66: 10.0.0.2.37567 > 97.107.134.79.80: S 2385678252:2385678252(0) win 8192
So, in the second case the traffic leaves through a different interface (em1) but is still addressed to the MAC of the WAN2 gateway (00:26:f2:56:49:bc), and no replies come back!
# arp -an
? (10.0.0.1) at 00:26:f2:56:4b:86 on em1 [ethernet]
? (10.0.1.1) at 00:26:f2:56:49:bc on em0 [ethernet]
...
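The comparison I am doing by eye between the tcpdump output and the ARP table can be sketched in Python. This is only an illustration under my own assumptions: the regular expressions and the `parse_arp` and `check_frame` helpers are hypothetical, and the sample frame is a shortened copy of the first line of the em1 dump above.

```python
import re

# Gateway entries copied from the `arp -an` output above.
ARP_OUTPUT = """\
? (10.0.0.1) at 00:26:f2:56:4b:86 on em1 [ethernet]
? (10.0.1.1) at 00:26:f2:56:49:bc on em0 [ethernet]
"""

# First frame of the em1 capture above, truncated after the TCP flag.
EM1_FRAME = (
    "20:41:44.681552 00:1b:21:7c:a4:57 > 00:26:f2:56:49:bc, "
    "ethertype IPv4 (0x0800), length 66: "
    "10.0.0.2.37567 > 97.107.134.79.80: S"
)

ARP_RE = re.compile(r"\((?P<ip>[\d.]+)\) at (?P<mac>[0-9a-f:]{17}) on (?P<ifc>\w+)")
FRAME_RE = re.compile(r"^\S+ (?P<src>[0-9a-f:]{17}) > (?P<dst>[0-9a-f:]{17}),")


def parse_arp(text):
    """Build {interface: {mac: ip}} from `arp -an` style output."""
    table = {}
    for match in ARP_RE.finditer(text):
        table.setdefault(match["ifc"], {})[match["mac"]] = match["ip"]
    return table


def check_frame(line, ifc, arp):
    """Report whether the frame's destination MAC is a known neighbour on ifc."""
    match = FRAME_RE.match(line)
    if match is None:
        return "unparsed"
    dst = match["dst"]
    if dst in arp.get(ifc, {}):
        return "ok"
    # The MAC is not known on this interface; see whether another one owns it.
    for other, macs in arp.items():
        if dst in macs:
            return f"mismatch: {dst} belongs to {macs[dst]} on {other}"
    return f"mismatch: {dst} not in ARP table"


arp = parse_arp(ARP_OUTPUT)
print(check_frame(EM1_FRAME, "em1", arp))
# prints: mismatch: 00:26:f2:56:49:bc belongs to 10.0.1.1 on em0
```

Feeding every captured line through `check_frame` with the interface the capture ran on would flag each frame that is sent to a gateway MAC belonging to another interface.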
The problem is intermittent: approximately 90% of connections over the new interface fail. But if an attempt is successful, then the previous interface starts using the 'wrong' destination MAC.
More weird stuff: everything works as expected if I forward traffic through the load balancers WAN-WAN1 or WAN-WAN2.
Could anybody please give me a hint on what is going on here -(((
Thanks!
-
Ok, let me rephrase the question.
Is anybody using outbound load balancing across two or more interfaces when the pool does not include the main WAN interface?
Thanks.
-
yes
-
Unbelievable, but it was a hardware issue.