Carp + Multiwan + load balancer
-
This time, it seems that everything works great! In the tests I have done with the latest build (14 June), this is the first time I have been able to set up two pfSense boxes in a multi-WAN + load balancer + CARP config. This is a really nice improvement!
Thanks, pfSense team!
-
I'm glad you are getting good results; I should be able to test this week. Are you using outbound NAT on your WAN links? I.e., are there NATting routers between your pfSense and the internet, or does your pfSense have "external" IP addresses and do the NAT itself (themselves!)?
Cheers
Jon
-
My pfSense has external IPs on both WAN connections. I use manual outbound NAT rule generation in pfSense's NAT config.
-
I've just updated to: "1.2.3-RC2 built on Tue Jun 16 04:58:42 EDT 2009" on my two pfSense boxes and the problem remains.
Summary:
3 x ADSL lines with Draytek 2800s connected.
5 NICs: LAN, WAN, WAN2, WAN3, CARP
The Drayteks do the NAT; the pfSense boxes have AON (Advanced Outbound NAT) switched on with no rules defined, i.e. plain routing.
I can keep LAN and CARP plugged into both systems and the three WAN links in only one at a time. As soon as both sets are connected the boxes freeze up, but they resume working as soon as one set of WAN connections is removed.
-
Are you sure you have not misconfigured anything?
-
For my tests I didn't upgrade but reinstalled both systems with the latest snapshot. You should try starting with a fresh install; maybe the problem is there. On my setup I have 2 WANs, 1 LAN and 1 CARP interface, and it's running quite well for the moment. Load-balancing failover works fine, and CARP failover too. You have probably misconfigured something somewhere. How did you set up your VIPs? It seems like there is a conflict in your WAN IPs.
-
Thanks for the responses.
I have double checked the setup and it looks correct to me - I have checked using the config.xml to verify the passwords on the CARP addresses and ensured that there is no conflict.
One box on its own works fine: inbound port forwarding, outbound load balancing, failover, etc. Connecting the WAN interfaces on the second one causes a packet storm of some sort.
The only major difference between my system and yours is that my pfSense boxes don't NAT. Am I correct in assuming that to disable NAT I set AON to on but don't create any rules?
-
I fixed the storm in the kernel and it does not relate to NAT.
So either post your configuration here so it can be double-checked, or verify that you are not causing the storm yourself!
-
With AON and multiple WANs, you have to set up an AON rule for each WAN on your pfSense! Create a new rule based on the one created by default and just change the interface the rule applies to. Make sure you also synchronise NAT in the CARP settings.
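As a rough illustration of "one rule per WAN", the sketch below prints one pf-style `nat on` line per WAN interface. The interface names and the LAN subnet are placeholders, the helper name is made up, and the rules pfSense actually generates from the GUI may well look different.

```python
# Hypothetical helper: builds one pf-style outbound NAT line per WAN interface.
# The interface names and LAN subnet below are placeholders, not a real config.
def nat_rules(lan_net, wan_ifs):
    """Return one 'nat on <if>' line for every WAN interface given."""
    return [f"nat on {ifname} from {lan_net} to any -> ({ifname})" for ifname in wan_ifs]


rules = nat_rules("192.168.100.0/24", ["fxp0", "fxp1", "fxp2"])
for rule in rules:
    print(rule)
```

The only thing that changes from rule to rule is the interface, which matches the advice to copy the default rule and change just the interface it applies to.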
-
What are these rules for if I don't actually use NAT?
@ermal:
I fixed the storm in the kernel and it does not relate to NAT.
So either post your configuration here so it can be double-checked, or verify that you are not causing the storm yourself!
Which sections should I post, or would the whole lot be better?
Thanks for your patience
-
Anyone care to comment?
-
@ermal:
I fixed the storm in the kernel and it does not relate to NAT.
So either post your configuration here so it can be double-checked, or verify that you are not causing the storm yourself!
Just tested a fresh install of http://snapshots.pfsense.org/FreeBSD_RELENG_7_2/pfSense_RELENG_1_2/livecd_installer/pfSense-1.2.3-20090624-1038.iso.gz
The problem remains. Multicast/broadcast packets go from LAN to WAN when you use the load balancer with no NAT configured.
-
Can you get tcpdumps of the traffic on both interfaces?
-
@ermal:
Can you get tcpdumps of the traffic on both interfaces?
Here's a quick sample. I have to say it looks normal to me …
At the moment I have the second pf box without its WAN{,2,3} connections plugged in.
If it helps, I think I can get away with plugging in just one of the external links without both boxes going completely comatose. I'll try this once I've warned the office, and try to get a dump showing the storm as well.
*** Welcome to pfSense 1.2.3-RC2-pfSense on absinthe1 ***
WAN* -> fxp0 -> 10.100.1.10
OPT1(WAN2)* -> fxp1 -> 10.100.2.10
OPT2(WAN3)* -> fxp2 -> 10.100.3.10
OPT3(CARP)* -> fxp3 -> 10.100.250.1
LAN* -> rl0 -> 192.168.100.2
############## LAN:
tcpdump -n -i rl0 ether multicast or ip broadcast
10:17:13.950372 arp who-has 192.168.100.200 tell 192.168.100.254
10:17:14.036876 IP 192.168.100.2 > 224.0.0.18: VRRPv2, Advertisement, vrid 10, prio 0, authtype none, intvl 1s, length 36
10:17:14.283542 IP 192.168.100.27.631 > 192.168.100.255.631: UDP, length 201
10:17:14.940426 IP 192.168.100.254.137 > 192.168.100.255.137: NBT UDP PACKET(137): QUERY; REQUEST; BROADCAST
10:17:14.940563 IP 192.168.100.254.137 > 192.168.100.255.137: NBT UDP PACKET(137): QUERY; REQUEST; BROADCAST
10:17:14.940582 IP 192.168.100.254.137 > 192.168.100.255.137: NBT UDP PACKET(137): QUERY; REQUEST; BROADCAST
10:17:14.943779 IP6 fe80::f98d:b07d:d4d4:6b81.546 > ff02::1:2.547: dhcp6 solicit
10:17:14.950340 arp who-has 192.168.100.200 tell 192.168.100.254
10:17:15.037888 IP 192.168.100.2 > 224.0.0.18: VRRPv2, Advertisement, vrid 10, prio 0, authtype none, intvl 1s, length 36
10:17:15.283514 IP 192.168.100.27.631 > 192.168.100.255.631: UDP, length 191
10:17:15.950372 arp who-has 192.168.100.200 tell 192.168.100.254
10:17:16.038877 IP 192.168.100.2 > 224.0.0.18: VRRPv2, Advertisement, vrid 10, prio 0, authtype none, intvl 1s, length 36
10:17:16.283488 IP 192.168.100.27.631 > 192.168.100.255.631: UDP, length 210
10:17:16.960480 arp who-has 192.168.100.200 tell 192.168.100.254
10:17:17.039896 IP 192.168.100.2 > 224.0.0.18: VRRPv2, Advertisement, vrid 10, prio 0, authtype none, intvl 1s, length 36
10:17:17.960442 arp who-has 192.168.100.200 tell 192.168.100.254
10:17:18.040883 IP 192.168.100.2 > 224.0.0.18: VRRPv2, Advertisement, vrid 10, prio 0, authtype none, intvl 1s, length 36
10:17:18.960418 arp who-has 192.168.100.200 tell 192.168.100.254
10:17:19.041885 IP 192.168.100.2 > 224.0.0.18: VRRPv2, Advertisement, vrid 10, prio 0, authtype none, intvl 1s, length 36
10:17:19.193913 arp who-has 192.168.100.49 tell 192.168.100.13
10:17:19.453274 arp who-has 192.168.100.57 tell 192.168.100.17
10:17:20.042932 IP 192.168.100.2 > 224.0.0.18: VRRPv2, Advertisement, vrid 10, prio 0, authtype none, intvl 1s, length 36
10:17:20.453366 arp who-has 192.168.100.57 tell 192.168.100.17
10:17:21.043893 IP 192.168.100.2 > 224.0.0.18: VRRPv2, Advertisement, vrid 10,
10:17:31.054006 IP 192.168.100.2 > 224.0.0.18: VRRPv2, Advertisement, vrid 10, prio 0, authtype none, intvl 1s, length 36
10:17:31.518347 arp who-has 192.168.100.57 tell 192.168.100.17
10:17:31.628973 IP 192.168.100.152.1171 > 255.255.255.255.1211: UDP, length 75
10:17:31.942878 arp who-has 192.168.100.31 tell 192.168.100.20
################ WAN link:
tcpdump -n -i fxp0 ether multicast or ip broadcast
10:25:53.614954 arp who-has 10.100.1.3 tell 10.100.1.1
10:25:53.679919 IP 10.100.1.10 > 224.0.0.18: VRRPv2, Advertisement, vrid 12, prio 0, authtype none, intvl 1s, length 36
10:25:53.679941 IP 10.100.1.10 > 224.0.0.18: VRRPv2, Advertisement, vrid 11, prio 0, authtype none, intvl 1s, length 36
10:25:54.372699 IP 10.100.1.52.137 > 10.100.1.255.137: NBT UDP PACKET(137): QUERY; REQUEST; BROADCAST
10:25:54.680921 IP 10.100.1.10 > 224.0.0.18: VRRPv2, Advertisement, vrid 12, prio 0, authtype none, intvl 1s, length 36
10:25:54.680943 IP 10.100.1.10 > 224.0.0.18: VRRPv2, Advertisement, vrid 11, prio 0, authtype none, intvl 1s, length 36
10:25:55.681927 IP 10.100.1.10 > 224.0.0.18: VRRPv2, Advertisement, vrid 12, prio 0, authtype none, intvl 1s, length 36
10:25:55.681948 IP 10.100.1.10 > 224.0.0.18: VRRPv2, Advertisement, vrid 11, prio 0, authtype none, intvl 1s, length 36
10:25:56.682931 IP 10.100.1.10 > 224.0.0.18: VRRPv2, Advertisement, vrid 12, prio 0, authtype none, intvl 1s, length 36
10:25:56.682954 IP 10.100.1.10 > 224.0.0.18: VRRPv2, Advertisement, vrid 11, prio 0, authtype none, intvl 1s, length 36
10:25:57.683932 IP 10.100.1.10 > 224.0.0.18: VRRPv2, Advertisement, vrid 12, prio 0, authtype none, intvl 1s, length 36
10:25:57.683954 IP 10.100.1.10 > 224.0.0.18: VRRPv2, Advertisement, vrid 11, prio 0, authtype none, intvl 1s, length 36
10:25:58.684936 IP 10.100.1.10 > 224.0.0.18: VRRPv2, Advertisement, vrid 12, prio 0, authtype none, intvl 1s, length 36
10:25:58.684959 IP 10.100.1.10 > 224.0.0.18: VRRPv2, Advertisement, vrid 11, prio 0, authtype none, intvl 1s, length 36
10:25:59.685958 IP 10.100.1.10 > 224.0.0.18: VRRPv2, Advertisement, vrid 12, prio 0, authtype none, intvl 1s, length 36
10:25:59.685982 IP 10.100.1.10 > 224.0.0.18: VRRPv2, Advertisement, vrid 11, prio 0, authtype none, intvl 1s, length 36
10:25:59.710273 IP 10.100.1.52.137 > 10.100.1.255.137: NBT UDP PACKET(137): QUERY; REQUEST; BROADCAST
10:26:00.464228 IP 192.168.100.156.64348 > 239.255.255.253.427: UDP, length 85
10:26:00.472231 IP 10.100.1.52.137 > 10.100.1.255.137: NBT UDP PACKET(137): QUERY; REQUEST; BROADCAST
10:26:00.687002 IP 10.100.1.10 > 224.0.0.18: VRRPv2, Advertisement, vrid 12, prio 0, authtype none, intvl 1s, length 36
10:26:00.687029 IP 10.100.1.10 > 224.0.0.18: VRRPv2, Advertisement, vrid 11, prio 0, authtype none, intvl 1s, length 36
10:26:01.237280 IP 10.100.1.52.137 > 10.100.1.255.137: NBT UDP PACKET(137): QUERY; REQUEST; BROADCAST
10:26:01.687947 IP 10.100.1.10 > 224.0.0.18: VRRPv2, Advertisement, vrid 12, prio 0, authtype none, intvl 1s, length 36
10:26:01.687970 IP 10.100.1.10 > 224.0.0.18: VRRPv2, Advertisement, vrid 11, prio 0, authtype none, intvl 1s, length 36
10:26:02.688969 IP 10.100.1.10 > 224.0.0.18: VRRPv2, Advertisement, vrid 12, prio 0, authtype none, intvl 1s, length 36
10:26:02.688999 IP 10.100.1.10 > 224.0.0.18: VRRPv2, Advertisement, vrid 11, prio 0, authtype none, intvl 1s, length 36
10:26:03.689958 IP 10.100.1.10 > 224.0.0.18: VRRPv2, Advertisement, vrid 12, prio 0, authtype none, intvl 1s, length 36
10:26:03.689980 IP 10.100.1.10 > 224.0.0.18: VRRPv2, Advertisement, vrid 11, prio 0, authtype none, intvl 1s, length 36
10:26:04.690966 IP 10.100.1.10 > 224.0.0.18: VRRPv2, Advertisement, vrid 12, prio 0, authtype none, intvl 1s, length 36
10:26:04.690989 IP 10.100.1.10 > 224.0.0.18: VRRPv2, Advertisement, vrid 11, prio 0, authtype none, intvl 1s, length 36
10:26:05.033667 IP 192.168.1.70.138 > 255.255.255.255.138: NBT UDP PACKET(138)
0:17:32.010647 arp who-has 192.168.100.200 tell 192.168.100.254
-
@ermal:
Can you get tcpdumps of the traffic on both interfaces?
Here is a demo of the problem. I connected my laptop to our second Draytek (WAN2) and ran tcpdump on it and also on our first pfSense box. I briefly plugged in the WAN2 interface on our second pfSense box and then pulled it out as both tcpdumps went mad. Below I have pasted the tail end of each dump. You can see a broadcast that came in from a VPN connected to the Draytek router and then got repeated between the two pfSense boxes. Note the timestamps and the packet count summaries! The point where the VRRPv2 packets reappear is when I unplugged the link.
It was only plugged in for about 5 seconds and I got nearly 400,000 packets, which is pretty impressive. As you can imagine, things go a bit weird when all the links are connected.
Now that I look at it, it seems that it is not stuff from the LAN being forwarded to the WAN, but broadcasts being retransmitted on the interface that they came in on.
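One way to put a number on this kind of repetition is to count identical packet summaries per second in a saved text dump. The following is only a rough sketch: it assumes `tcpdump -n` text lines of the form shown in this thread, and the function names and the threshold are made up for illustration.

```python
import re
from collections import Counter

# Start of a `tcpdump -n` text line: "HH:MM:SS.ffffff <summary>"
LINE_RE = re.compile(r"^(\d{2}:\d{2}:\d{2})\.\d+\s+(.*)$")


def repeats_per_second(lines):
    """Count how often each identical packet summary appears within one second."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


def repeated(counts, threshold=3):
    """Return (second, summary, count) for summaries seen at least `threshold` times."""
    return [(sec, summ, n) for (sec, summ), n in counts.items() if n >= threshold]


# A few lines in the same format as the laptop dump in this post.
sample = [
    "10:54:30.380386 IP 192.168.100.157.64849 > 239.255.255.253.427: UDP, length 78",
    "10:54:30.380435 IP 192.168.100.157.64846 > 239.255.255.253.427: UDP, length 78",
    "10:54:30.380438 IP 192.168.100.157.64847 > 239.255.255.253.427: UDP, length 78",
    "10:54:30.380464 IP 192.168.100.157.64847 > 239.255.255.253.427: UDP, length 78",
    "10:54:30.380498 IP 192.168.100.157.64848 > 239.255.255.253.427: UDP, length 78",
    "10:54:30.380501 IP 192.168.100.157.64846 > 239.255.255.253.427: UDP, length 78",
    "10:54:30.380504 IP 192.168.100.157.64849 > 239.255.255.253.427: UDP, length 78",
    "10:54:30.380545 IP 192.168.100.157.64847 > 239.255.255.253.427: UDP, length 78",
    "10:54:30.380548 IP 192.168.100.157.64848 > 239.255.255.253.427: UDP, length 78",
    "10:54:30.558559 IP 192.168.100.157.64850 > 239.255.255.253.427: UDP, length 78",
    "10:54:30.859228 IP 10.100.2.10 > 224.0.0.18: VRRPv2, Advertisement, vrid 13, prio 0, authtype none, intvl 1s, length 36",
]

hits = repeated(repeats_per_second(sample))
for second, summary, count in hits:
    print(second, count, summary)
```

On a full capture the threshold would need to be much higher, and because the kernel dropped most packets during the storm, any count taken from the dump is a lower bound.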
Dump on laptop:
10:54:30.380386 IP 192.168.100.157.64849 > 239.255.255.253.427: UDP, length 78
10:54:30.380435 IP 192.168.100.157.64846 > 239.255.255.253.427: UDP, length 78
10:54:30.380438 IP 192.168.100.157.64847 > 239.255.255.253.427: UDP, length 78
10:54:30.380464 IP 192.168.100.157.64847 > 239.255.255.253.427: UDP, length 78
10:54:30.380498 IP 192.168.100.157.64848 > 239.255.255.253.427: UDP, length 78
10:54:30.380501 IP 192.168.100.157.64846 > 239.255.255.253.427: UDP, length 78
10:54:30.380504 IP 192.168.100.157.64849 > 239.255.255.253.427: UDP, length 78
10:54:30.380545 IP 192.168.100.157.64847 > 239.255.255.253.427: UDP, length 78
10:54:30.380548 IP 192.168.100.157.64848 > 239.255.255.253.427: UDP, length 78
10:54:30.558559 IP 192.168.100.157.64850 > 239.255.255.253.427: UDP, length 78
10:54:30.859228 IP 10.100.2.10 > 224.0.0.18: VRRPv2, Advertisement, vrid 13, prio 0, authtype none, intvl 1s, length 36
10:54:30.859239 IP 10.100.2.10 > 224.0.0.18: VRRPv2, Advertisement, vrid 14, prio 0, authtype none, intvl 1s, length 36
10:54:31.860246 IP 10.100.2.10 > 224.0.0.18: VRRPv2, Advertisement, vrid 13, prio 0, authtype none, intvl 1s, length 36
10:54:31.860256 IP 10.100.2.10 > 224.0.0.18: VRRPv2, Advertisement, vrid 14, prio 0, authtype none, intvl 1s, length 36
10:54:31.993321 IP 192.168.254.33.138 > 255.255.255.255.138: NBT UDP PACKET(138)
67311 packets captured
466056 packets received by filter
398745 packets dropped by kernel
############## Dump on pfsense box:
10:54:30.421045 IP 192.168.100.157.64848 > 239.255.255.253.427: UDP, length 78
10:54:30.421067 IP 192.168.100.157.64846 > 239.255.255.253.427: UDP, length 78
10:54:30.421077 IP 192.168.100.157.64846 > 239.255.255.253.427: UDP, length 78
10:54:30.421096 IP 192.168.100.157.64849 > 239.255.255.253.427: UDP, length 78
10:54:30.421109 IP 192.168.100.157.64849 > 239.255.255.253.427: UDP, length 78
10:54:30.421158 IP 192.168.100.157.64847 > 239.255.255.253.427: UDP, length 78
10:54:30.421167 IP 192.168.100.157.64847 > 239.255.255.253.427: UDP, length 78
10:54:30.421185 IP 192.168.100.157.64846 > 239.255.255.253.427: UDP, length 78
10:54:30.421194 IP 192.168.100.157.64846 > 239.255.255.253.427: UDP, length 78
10:54:30.421224 IP 192.168.100.157.64847 > 239.255.255.253.427: UDP, length 78
10:54:30.421233 IP 192.168.100.157.64847 > 239.255.255.253.427: UDP, length 78
10:54:30.421292 IP 192.168.100.157.64848 > 239.255.255.253.427: UDP, length 78
10:54:30.421302 IP 192.168.100.157.64848 > 239.255.255.253.427: UDP, length 78
10:54:30.421315 IP 192.168.100.157.64849 > 239.255.255.253.427: UDP, length 78
10:54:30.421324 IP 192.168.100.157.64849 > 239.255.255.253.427: UDP, length 78
10:54:30.421329 IP 192.168.100.157.64848 > 239.255.255.253.427: UDP, length 78
10:54:30.421337 IP 192.168.100.157.64848 > 239.255.255.253.427: UDP, length 78
10:54:30.421367 IP 192.168.100.157.64846 > 239.255.255.253.427: UDP, length 78
10:54:30.421376 IP 192.168.100.157.64846 > 239.255.255.253.427: UDP, length 78
10:54:30.421394 IP 192.168.100.157.64849 > 239.255.255.253.427: UDP, length 78
10:54:30.421403 IP 192.168.100.157.64849 > 239.255.255.253.427: UDP, length 78
10:54:30.421429 IP 192.168.100.157.64847 > 239.255.255.253.427: UDP, length 78
10:54:30.421438 IP 192.168.100.157.64847 > 239.255.255.253.427: UDP, length 78
10:54:30.421462 IP 192.168.100.157.64846 > 239.255.255.253.427: UDP, length 78
10:54:30.421470 IP 192.168.100.157.64846 > 239.255.255.253.427: UDP, length 78
10:54:30.584514 IP 192.168.100.157.64850 > 239.255.255.253.427: UDP, length 78
10:54:30.885182 IP 10.100.2.10 > 224.0.0.18: VRRPv2, Advertisement, vrid 13, prio 0, authtype none, intvl 1s, length 36
10:54:30.885205 IP 10.100.2.10 > 224.0.0.18: VRRPv2, Advertisement, vrid 14, prio 0, authtype none, intvl 1s, length 36
10:54:31.886173 IP 10.100.2.10 > 224.0.0.18: VRRPv2, Advertisement, vrid 13, prio 0, authtype none, intvl 1s, length 36
10:54:31.886195 IP 10.100.2.10 > 224.0.0.18: VRRPv2, Advertisement, vrid 14, prio 0, authtype none, intvl 1s, length 36
10:54:32.019317 IP 192.168.254.33.138 > 255.255.255.255.138: NBT UDP PACKET(138)
10:54:32.887181 IP 10.100.2.10 > 224.0.0.18: VRRPv2, Advertisement, vrid 13, prio 0, authtype none, intvl 1s, length 36
10:54:32.887202 IP 10.100.2.10 > 224.0.0.18: VRRPv2, Advertisement, vrid 14, prio 0, authtype none, intvl 1s, length 36
10:54:33.888195 IP 10.100.2.10 > 224.0.0.18: VRRPv2, Advertisement, vrid 13, prio 0, authtype none, intvl 1s, length 36
10:54:33.888216 IP 10.100.2.10 > 224.0.0.18: VRRPv2, Advertisement, vrid 14, prio 0, authtype none, intvl 1s, length 36
10:54:34.889199 IP 10.100.2.10 > 224.0.0.18: VRRPv2, Advertisement, vrid 13, prio 0, authtype none, intvl 1s, length 36
10:54:34.889221 IP 10.100.2.10 > 224.0.0.18: VRRPv2, Advertisement, vrid 14, prio 0, authtype none, intvl 1s, length 36
^C
75394 packets captured
476849 packets received by filter
389590 packets dropped by kernel
-
The first problem here is that broadcast/multicast packets go from the LAN(s) to the WAN. Then the broadcast storm happens.
To reproduce the problem:
- Fresh install of pfSense-1.2.3-RC2 from the 24 June 2009 snapshot
- LAN remains at 192.168.1.1/24; WAN = 2.2.2.1/24, gateway 2.2.2.254
- Enable AON
- Create a failover load balancer on WAN with ICMP to 2.2.2.254 (doesn't actually matter)
- Modify the default LAN rule to 'allow all from any to any with Loadbalancer as gateway'
Then connect to the LAN anything that broadcasts and has an IP not belonging to 192.168.1.0/24.
In the pictures below I connected an MS Windows machine with 192.168.2.2.
Left: LAN Right: WAN
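To check captured packets against this scenario, a small sketch like the one below can flag sources outside the LAN subnet and destinations that are multicast or broadcast. It assumes the 192.168.1.0/24 LAN from the steps above; the function name and the second test host (192.168.1.50) are made up, and directed broadcasts of other subnets are not detected because their netmask is unknown.

```python
import ipaddress

# LAN subnet from the reproduction steps; everything else here is illustrative.
LAN_NET = ipaddress.ip_network("192.168.1.0/24")
LIMITED_BROADCAST = ipaddress.ip_address("255.255.255.255")


def describe(src, dst, lan=LAN_NET):
    """Classify one packet by its source and destination IPv4 addresses."""
    s = ipaddress.ip_address(src)
    d = ipaddress.ip_address(dst)
    return {
        "src_on_lan": s in lan,
        "dst_multicast": d.is_multicast,
        # Only the limited broadcast and the LAN's own directed broadcast are recognised.
        "dst_broadcast": d == LIMITED_BROADCAST or d == lan.broadcast_address,
    }


foreign = describe("192.168.2.2", "255.255.255.255")   # the off-subnet Windows machine
local = describe("192.168.1.50", "239.255.255.253")    # hypothetical on-subnet host
print(foreign)
print(local)
```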
-
I am not seeing quite the same as you.
You have a LAN broadcast appearing on your WAN.
I have another "LAN"'s broadcast getting repeated on the same interface that it arrived on.
So, I conjecture that "internal" addresses (RFC whatever) get repeated inappropriately.
My skills only go as far as iptables on Linux, I'm afraid, but I think I need to learn pf pretty damn quick! To get around my snag I am virtualising my pfSense system and using 802.1Q + trunking. Besides, I can't wedge enough NICs into the box. I have five ADSL lines now! (Our office is pretty rural and even ADSL Max is a bit slow.)
Looking at our results, to confirm or reject my conjecture we need you to generate a broadcast on your WAN from an internal address, and I need to generate a LAN-based broadcast and see if it crosses over to a WAN (or three). Once I've moved my production pfSense router over to the virtual side, I'll have its old standby to play with.
Cheers
Jon
-
You are seeing consequences, I am seeing a cause; this is the only difference. ;-)
If a multicast/broadcast packet goes from LAN to WAN and there is CARP, then you get a storm.
What is important (and interesting) is that when I connect my laptop, broadcasting with the same IP, to the WAN network segment, there is no storm. The same packet (at layer 3) coming from the LAN (remember, no NAT) does cause a storm.