CARP Problems
-
Sorry, at the moment I copied the output the interface was in BACKUP state, but I am definitely testing with it UP and MASTER.
Sullrich, yes, there's the default rule permit * * * * * …, and for testing purposes I added another: permit icmp * * * * *
The strange thing is that tcpdump doesn't show the ping request packets.
The ifconfig and arp commands don't show a virtual MAC address for the interface, although the MAC 00-00-5e-00-01-01 is learnt from the virtual interface. Could a hidden rule be blocking requests to the virtual MAC (a hidden layer 2 pf rule)?
-
It could be the block private IP option on WAN.
-
Well… my problem is solved. Now the carp interfaces behave how they should.
Thank you all
-
Hi y'all,
by now everything is almost perfect.
Pinging the VIP now works (for some reason the NIC wasn't replying when the VIP should have), but there is a problem with preemption.
The two systems see each other and elect a master and a backup, but after some time they swap: the master becomes the standby and vice versa. When I previously posted a possible correction for the advskew, it looked like the problem was solved.
I don't know what to do...
Thanks for your attention.
-
We need to see the ifconfig output.
Please show us so we know what we are talking about, otherwise we are pissing in the dark and nobody wants to piss on themselves.
-
Well… I have just upgraded to RELENG_1_SNAPSHOT_04-03-2006.
I am again having the same problems with CARP, with the backup becoming master. I've checked the file /etc/inc/interfaces.inc, and on lines 409 and 410 the advskew is hard-coded to 200. This is a bug that was fixed in past versions.
-
It stays at 200 for 60-90 seconds on bootup then switches back.
-
It stays with advskew 200 all the time.
Hi y'all.
I've spent a couple of hours wondering why CARP isn't working on two boxes.
After creating the VIPs, their advskew values are different, the interfaces don't come up, etc. Then I checked /etc/inc/interfaces.inc (using 1.0BETA2, and I think other versions are affected too).
I think this is the reason (lines 408 and 409):
fwrite($fd, "/sbin/ifconfig carp" . $carp_instances_counter . " " . $vip['subnet'] . "/" . $vip['subnet_bits'] . " broadcast " . $broadcast_address . " vhid " . $vip['vhid'] . "{$carpdev} advskew 200 " . $password . "\n");
mwexec("/sbin/ifconfig carp" . $carp_instances_counter . " " . $vip['subnet'] . "/" . $vip['subnet_bits'] . " broadcast " . $broadcast_address . " vhid " . $vip['vhid'] . "{$carpdev} advskew 200 " . $password);
The advskew is hard-coded to 200, no matter what is in the configuration (/conf/config.xml).
So you can edit /etc/inc/interfaces.inc, go to lines 408 and 409, and change them to this:
fwrite($fd, "/sbin/ifconfig carp" . $carp_instances_counter . " " . $vip['subnet'] . "/" . $vip['subnet_bits'] . " broadcast " . $broadcast_address . " vhid " . $vip['vhid'] . "{$carpdev} advskew " . $vip['advskew'] . " " . $password . "\n");
mwexec("/sbin/ifconfig carp" . $carp_instances_counter . " " . $vip['subnet'] . "/" . $vip['subnet_bits'] . " broadcast " . $broadcast_address . " vhid " . $vip['vhid'] . "{$carpdev} advskew " . $vip['advskew'] . " " . $password);
In the previous version I had this changed. I thought that it was already in CVS, but only the sleep issue was changed.
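For readers who want to see the idea behind that change outside of PHP, here is a minimal Python sketch. It is not pfSense code: the function name build_carp_command and the dictionary layout are made up for illustration, and the carpdev and password arguments are treated as opaque, already formatted fragments. The only point is that the advskew comes from the per-VIP configuration instead of a constant, with a range check based on the 0 to 254 range documented for ifconfig.

```python
def build_carp_command(index, vip, broadcast, carpdev="", password=""):
    """Assemble an ifconfig command line for one CARP VIP (illustrative only).

    vip is a hypothetical dict with the keys subnet, subnet_bits, vhid and
    advskew, loosely mirroring the $vip array used in the PHP above.
    """
    advskew = int(vip["advskew"])
    # ifconfig documents 0..254 as the acceptable advskew range.
    if not 0 <= advskew <= 254:
        raise ValueError(f"advskew out of range: {advskew}")
    command = (
        f"/sbin/ifconfig carp{index} {vip['subnet']}/{vip['subnet_bits']}"
        f" broadcast {broadcast} vhid {vip['vhid']}{carpdev}"
        f" advskew {advskew} {password}"
    )
    # Drop the trailing space left behind when no password fragment is given.
    return command.rstrip()


if __name__ == "__main__":
    primary_vip = {"subnet": "10.1.1.10", "subnet_bits": "16",
                   "vhid": "1", "advskew": "0"}
    print(build_carp_command(0, primary_vip, "10.1.255.255"))
```

With the value taken from the configuration, the primary would come up with advskew 0 and the secondary with whatever its own configuration holds.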
-
Then you have a configuration issue. Check these points:
- Make sure you have a static address on each of the pfsync interfaces, in the same subnet
- Try pinging the other end of pfsync to ensure connectivity (if this doesn't work, stop here and double-check everything)
- Make sure each CARP IP uses the same VHID on every member of the cluster
- Make sure each CARP pair has the same password
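The first item of that checklist can be verified on paper before touching the boxes. The following is only a sketch using Python's standard ipaddress module; the function name same_subnet and the addresses in the example are illustrative and not taken from any pfSense tool.

```python
import ipaddress


def same_subnet(addr_a, addr_b, netmask):
    """Return True when both addresses fall into the same network.

    netmask may be a prefix length ("24") or a dotted mask ("255.255.255.0").
    """
    net_a = ipaddress.ip_interface(f"{addr_a}/{netmask}").network
    net_b = ipaddress.ip_interface(f"{addr_b}/{netmask}").network
    return net_a == net_b


if __name__ == "__main__":
    # Example pfsync addresses for two cluster members (illustrative values).
    print(same_subnet("192.168.254.1", "192.168.254.2", "255.255.255.0"))
```

If this check returns False for your two pfsync addresses, the ping test in the second item cannot succeed either.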
-
I have checked all of the above, but…
Master
  sis1 ---- WAN
  sis2 ---- DMZ
  sis0 ---- LAN
    |___ VLAN0 - pfsync
    |___ VLAN1 - WLAN
Backup
  sis1 ---- WAN
  sis2 ---- DMZ
  sis0 ---- LAN
    |___ VLAN0 - pfsync
    |___ VLAN1 - WLAN
I configured CARP-VIPs for the DMZ, LAN and WLAN-vlan.
Now I have the same phenomenon as described before:
The boxes keep changing Master/Slave on DMZ and LAN, with the backup box being Master most of the time. On the VLAN, however, both insist on being master. tcpdump on LAN shows the same strangeness, with the advskew changing.
I have * * * * * rules for all non-WAN interfaces.
Edit: Here's the ifconfig output of the Master
ifconfig
sis0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> mtu 1500
        options=8<VLAN_MTU>
        inet6 fe80::20d:b9ff:fe02:7a8c%sis0 prefixlen 64 scopeid 0x1
        inet 10.1.1.1 netmask 0xffff0000 broadcast 10.1.255.255
        ether 00:0d:b9:02:7a:8c
        media: Ethernet autoselect (100baseTX <full-duplex>)
        status: active
sis1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        options=8<VLAN_MTU>
        inet6 fe80::20d:b9ff:fe02:7a8d%sis1 prefixlen 64 scopeid 0x2
        ether 00:0d:b9:02:7a:8d
        media: Ethernet autoselect (100baseTX <full-duplex>)
        status: active
sis2: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> mtu 1500
        options=8<VLAN_MTU>
        inet 10.5.1.1 netmask 0xffff0000 broadcast 10.5.255.255
        inet6 fe80::20d:b9ff:fe02:7a8e%sis2 prefixlen 64 scopeid 0x3
        ether 00:0d:b9:02:7a:8e
        media: Ethernet autoselect (100baseTX <full-duplex>)
        status: active
pfsync0: flags=41<UP,RUNNING> mtu 1348
        pfsync: syncdev: vlan0 maxupd: 128
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 16384
        inet 127.0.0.1 netmask 0xff000000
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x5
pflog0: flags=100<PROMISC> mtu 33208
vlan0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        inet 192.168.254.1 netmask 0xffffff00 broadcast 192.168.254.255
        inet6 fe80::20d:b9ff:fe02:7a8c%vlan0 prefixlen 64 scopeid 0x7
        ether 00:0d:b9:02:7a:8c
        media: Ethernet autoselect (100baseTX <full-duplex>)
        status: active
        vlan: 30 parent interface: sis0
vlan1: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> mtu 1500
        inet 10.4.1.1 netmask 0xffff0000 broadcast 10.4.255.255
        inet6 fe80::20d:b9ff:fe02:7a8c%vlan1 prefixlen 64 scopeid 0x8
        ether 00:0d:b9:02:7a:8c
        media: Ethernet autoselect (100baseTX <full-duplex>)
        status: active
        vlan: 4 parent interface: sis0
ng0: flags=88d1<UP,POINTOPOINT,RUNNING,NOARP,SIMPLEX,MULTICAST> mtu 1492
        inet6 fe80::20d:b9ff:fe02:7a8c%ng0 prefixlen 64 scopeid 0x9
        inet 80.136.201.83 --> 217.0.116.148 netmask 0xffffffff
carp0: flags=49<UP,LOOPBACK,RUNNING> mtu 1500
        inet 10.1.1.10 netmask 0xffff0000
        carp: BACKUP vhid 1 advbase 1 advskew 200
carp1: flags=49<UP,LOOPBACK,RUNNING> mtu 1500
        inet 10.4.1.10 netmask 0xffff0000
        carp: BACKUP vhid 4 advbase 1 advskew 200
carp2: flags=49<UP,LOOPBACK,RUNNING> mtu 1500
        inet 10.5.1.10 netmask 0xffff0000
        carp: MASTER vhid 5 advbase 1 advskew 200
ifconfig on Backup
sis0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> mtu 1500
        options=8<VLAN_MTU>
        inet6 fe80::20d:b9ff:fe02:8094%sis0 prefixlen 64 scopeid 0x1
        inet 10.1.1.5 netmask 0xffff0000 broadcast 10.1.255.255
        ether 00:0d:b9:02:80:94
        media: Ethernet autoselect (100baseTX <full-duplex>)
        status: active
sis1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        options=8<VLAN_MTU>
        inet6 fe80::20d:b9ff:fe02:8095%sis1 prefixlen 64 scopeid 0x2
        ether 00:0d:b9:02:80:95
        media: Ethernet autoselect (100baseTX <full-duplex>)
        status: active
sis2: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> mtu 1500
        options=8<VLAN_MTU>
        inet 10.5.1.5 netmask 0xffff0000 broadcast 10.5.255.255
        inet6 fe80::20d:b9ff:fe02:8096%sis2 prefixlen 64 scopeid 0x3
        ether 00:0d:b9:02:80:96
        media: Ethernet autoselect (100baseTX <full-duplex>)
        status: active
pfsync0: flags=41<UP,RUNNING> mtu 1348
        pfsync: syncdev: vlan0 maxupd: 128
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 16384
        inet 127.0.0.1 netmask 0xff000000
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x5
pflog0: flags=100<PROMISC> mtu 33208
vlan0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        inet 192.168.254.2 netmask 0xffffff00 broadcast 192.168.254.255
        inet6 fe80::20d:b9ff:fe02:8094%vlan0 prefixlen 64 scopeid 0x7
        ether 00:0d:b9:02:80:94
        media: Ethernet autoselect (100baseTX <full-duplex>)
        status: active
        vlan: 30 parent interface: sis0
vlan1: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> mtu 1500
        inet 10.4.1.5 netmask 0xffff0000 broadcast 10.4.255.255
        inet6 fe80::20d:b9ff:fe02:8094%vlan1 prefixlen 64 scopeid 0x8
        ether 00:0d:b9:02:80:94
        media: Ethernet autoselect (100baseTX <full-duplex>)
        status: active
        vlan: 4 parent interface: sis0
ng0: flags=8890<POINTOPOINT,NOARP,SIMPLEX,MULTICAST> mtu 1500
carp0: flags=49<UP,LOOPBACK,RUNNING> mtu 1500
        inet 10.1.1.10 netmask 0xffff0000
        carp: MASTER vhid 1 advbase 1 advskew 200
carp1: flags=49<UP,LOOPBACK,RUNNING> mtu 1500
        inet 10.4.1.10 netmask 0xffff0000
        carp: MASTER vhid 4 advbase 1 advskew 200
carp2: flags=49<UP,LOOPBACK,RUNNING> mtu 1500
        inet 10.5.1.10 netmask 0xffff0000
        carp: MASTER vhid 5 advbase 1 advskew 200
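Comparing two ifconfig dumps by eye is error-prone, so here is a small Python sketch that pulls the CARP state lines out of each dump and lists the VHIDs on which both nodes claim MASTER. The function names are invented for this post, and the regular expression only assumes the "carp: STATE vhid N advbase N advskew N" line format seen in the output above. The sample strings are shortened copies of the carp lines from the two dumps.

```python
import re

# Matches one carpN interface block and the "carp:" status line inside it.
CARP_RE = re.compile(
    r"(carp\d+): flags=.*?carp: (MASTER|BACKUP|INIT) "
    r"vhid (\d+) advbase (\d+) advskew (\d+)",
    re.DOTALL,
)


def parse_carp_states(ifconfig_text):
    """Return {vhid: {"iface", "state", "advbase", "advskew"}} for one node."""
    states = {}
    for iface, state, vhid, advbase, advskew in CARP_RE.findall(ifconfig_text):
        states[int(vhid)] = {
            "iface": iface,
            "state": state,
            "advbase": int(advbase),
            "advskew": int(advskew),
        }
    return states


def dual_masters(node_a, node_b):
    """Return the sorted VHIDs for which both nodes report MASTER."""
    shared = node_a.keys() & node_b.keys()
    return sorted(
        vhid for vhid in shared
        if node_a[vhid]["state"] == node_b[vhid]["state"] == "MASTER"
    )


MASTER_SAMPLE = (
    "carp0: flags=49<UP,LOOPBACK,RUNNING> mtu 1500 inet 10.1.1.10 "
    "carp: BACKUP vhid 1 advbase 1 advskew 200 "
    "carp1: flags=49<UP,LOOPBACK,RUNNING> mtu 1500 inet 10.4.1.10 "
    "carp: BACKUP vhid 4 advbase 1 advskew 200 "
    "carp2: flags=49<UP,LOOPBACK,RUNNING> mtu 1500 inet 10.5.1.10 "
    "carp: MASTER vhid 5 advbase 1 advskew 200"
)
BACKUP_SAMPLE = MASTER_SAMPLE.replace("BACKUP", "MASTER")

if __name__ == "__main__":
    a = parse_carp_states(MASTER_SAMPLE)
    b = parse_carp_states(BACKUP_SAMPLE)
    print(dual_masters(a, b))
```

On the dumps posted here this reports VHID 5 as master on both boxes, and it also makes it easy to see that every interface on both nodes carries advskew 200.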
I can ping on the DMZ interface from Master to Backup, but not vice versa.
tcpdump on LAN:
23:32:04.572009 IP Backup > vrrp.mcast.net: VRRPv2, Advertisement, vrid 1, prio 20, authtype none, intvl 1s, length 36
23:32:05.698596 IP Backup > vrrp.mcast.net: VRRPv2, Advertisement, vrid 1, prio 20, authtype none, intvl 1s, length 36
23:32:06.824884 IP Backup > vrrp.mcast.net: VRRPv2, Advertisement, vrid 1, prio 20, authtype none, intvl 1s, length 36
23:32:10.613710 IP master > vrrp.mcast.net: VRRPv2, Advertisement, vrid 1, prio 240, authtype none, intvl 1s, length 36
23:32:12.354547 IP master > vrrp.mcast.net: VRRPv2, Advertisement, vrid 1, prio 240, authtype none, intvl 1s, length 36
23:32:14.300326 IP master > vrrp.mcast.net: VRRPv2, Advertisement, vrid 1, prio 240, authtype none, intvl 1s, length 36
….
...
23:35:17.600611 IP master > vrrp.mcast.net: VRRPv2, Advertisement, vrid 1, prio 240, authtype none, intvl 1s, length 36
23:35:19.546316 IP master > vrrp.mcast.net: VRRPv2, Advertisement, vrid 1, prio 240, authtype none, intvl 1s, length 36
23:35:21.492071 IP master > vrrp.mcast.net: VRRPv2, Advertisement, vrid 1, prio 240, authtype none, intvl 1s, length 36
23:35:21.492303 IP Backup > vrrp.mcast.net: VRRPv2, Advertisement, vrid 1, prio 200, authtype none, intvl 1s, length 36
23:35:23.335285 IP Backup > vrrp.mcast.net: VRRPv2, Advertisement, vrid 1, prio 200, authtype none, intvl 1s, length 36
23:35:25.076075 IP Backup > vrrp.mcast.net: VRRPv2, Advertisement, vrid 1, prio 200, authtype none, intvl 1s, length 36
Setup is currently BETA4
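To make the changes in that capture easier to spot, the following Python sketch condenses tcpdump lines of this shape into the sequence of distinct prio values each sender advertised. As far as I know, tcpdump decodes CARP advertisements with its VRRP printer, so the vrid field corresponds to the VHID and the prio field to the advertised advskew; treat that mapping as an assumption of this sketch. The function name prio_history is made up.

```python
import re

# One advertisement line as printed in the capture above.
AD_RE = re.compile(
    r"^(\S+) IP (\S+) > \S+ VRRPv2, Advertisement, vrid (\d+), prio (\d+)"
)


def prio_history(lines):
    """Map (sender, vrid) to the list of prio values, without consecutive repeats."""
    history = {}
    for line in lines:
        match = AD_RE.match(line.strip())
        if not match:
            continue  # skip elided or unrelated lines
        _timestamp, sender, vrid, prio = match.groups()
        seen = history.setdefault((sender, int(vrid)), [])
        if not seen or seen[-1] != int(prio):
            seen.append(int(prio))
    return history


if __name__ == "__main__":
    capture = [
        "23:32:04.572009 IP Backup > vrrp.mcast.net: VRRPv2, Advertisement, vrid 1, prio 20, authtype none, intvl 1s, length 36",
        "23:32:10.613710 IP master > vrrp.mcast.net: VRRPv2, Advertisement, vrid 1, prio 240, authtype none, intvl 1s, length 36",
        "23:35:21.492303 IP Backup > vrrp.mcast.net: VRRPv2, Advertisement, vrid 1, prio 200, authtype none, intvl 1s, length 36",
    ]
    for key, values in prio_history(capture).items():
        print(key, values)
```

Fed with the full capture, this shows the Backup box going from prio 20 to prio 200 while the master advertises prio 240 throughout.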
-
I have the same problem (brand new install of beta4).
After configuring CARP on each firewall, some of the interfaces on the master are in backup mode and others are in master mode, and the same happens on the slave firewall. If I run ifconfig on carp0, carp1, etc., I can see that the advskew is set to 200 on all CARP interfaces on both firewalls, even though I have set 0 on the master. Backing up the configuration and reading the XML file shows the right values (0 for the master VIPs and 200 for the slave).
If I then modify /tmp/carp.sh on the master to set the advskew to 0, destroy all CARP interfaces and execute carp.sh, all is fine, because the master is master!
If I modify the code where the advskew is hard-coded on the master firewall, all is fine too.
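Why identical advskew values matter can be shown with a little arithmetic. According to the CARP documentation, advskew is measured in 1/256 of a second and is added to advbase, and the host that advertises more often is the one expected to become master. The Python sketch below only illustrates that rule; the function names are made up and it does not model the actual election code.

```python
def advertisement_interval(advbase, advskew):
    """Seconds between advertisements: advbase plus advskew/256."""
    return advbase + advskew / 256.0


def preferred_master(nodes):
    """Return the node with the shortest interval, or None when there is a tie.

    nodes maps a node name to an (advbase, advskew) tuple.
    """
    intervals = {name: advertisement_interval(*cfg) for name, cfg in nodes.items()}
    shortest = min(intervals.values())
    winners = [name for name, value in intervals.items() if value == shortest]
    return winners[0] if len(winners) == 1 else None


if __name__ == "__main__":
    print(advertisement_interval(1, 200))
    print(preferred_master({"master": (1, 0), "slave": (1, 200)}))
    print(preferred_master({"master": (1, 200), "slave": (1, 200)}))
```

With 0 on the master and 200 on the slave the master is clearly preferred. With 200 on both boxes the configuration expresses no preference at all, which fits the mixed master and backup states described above.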
-
It will have an advertising skew until the final CARP bring-up process (about 2 minutes after the firewall has completely booted). You can view the progress on the console.
As for interfaces ending up in the wrong master or backup state, this means that CARP is not communicating on those interfaces. It needs to be able to broadcast and talk to the other firewall on the interface in question.
-
How could I test that? I'm facing a similar problem: one of my four CARP interfaces is "master-master" no matter what I do. A simple ping works fine in both directions. Nothing seems to be blocked in the logs. I have already swapped NICs and switches without success.
-
If you have not seen the CARP tutorial on our site, then you need to follow it. It will guide you in setting up the primary box, which syncs the configuration to the secondaries. This is important because it ensures that the advskew and the VHID are correct across all cluster members. It also ensures that the passwords match per VHID. Place a crossover cable between the two WAN interfaces. Does the problem persist? If so, you have a mismatched configuration somewhere.
-
If you have not seen the CARP tutorial on our site then you need to follow it.
I did exactly that.
It will guide you in setting up the primary box which syncs the configuration to the secondaries. The reason this is important is because it ensures that the advskew and also the vhid are correct across all cluster members. It also ensures that the passwords match per vhid. Place a crossover cable between the two wan interfaces.
I have already tried this. Not only the WAN but all the interface pairs, one by one. I will make some more crossover cables tomorrow and try connecting all interface pairs (WAN, WAN2, DMZ and LAN) with crossover cables (the CARP synchronization interface is of course permanently connected by crossover cable).
Does the problem persist?
Yes :(
If so you have a mismatched configuration somewhere.
Yes, probably, but I have tried to build it up from scratch several times, with only what I guess is the minimal necessary configuration. So now I have no idea what the problem could be.
Anyhow, it seems to work well on both WAN interfaces, from either LAN or DMZ, but I am afraid there is a hidden problem that could cause a collapse at the worst moment.
-
Post screenshots of each machine's virtual IP configuration so we can inspect it.
-
I attached them as you asked. I reduced the sizes as much as possible, hoping that they are still readable.
Thank you for your help
Imre
-
Each of the same IPs needs to share the same VHID group… They are unique in your setup, which also tells me that you didn't follow the tutorial, as it would have synced the configuration to the backup node, ensuring this is all the way it should be. >:(
-
Sorry, then I am probably misunderstanding something :(
xxx.xxx.xxx.165's VHID=1
xxx.xxx.xxx.116's VHID=2
10.0.254.4's VHID=3
192.168.0.10's VHID=4
The same kind of interfaces have the same VHID group number.
I'm confused. Should all four have the same one?
-
Each unique IP needs to have its own VHID. The VHID needs to match on each machine.
If you are using the Sync option as the tutorial shows, this is all automatic.
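If you are not using the sync option and want to double-check by hand, the rule above can be written down as a small Python sketch. The function name check_vhids and the IP-to-VHID dictionaries are made up for illustration; you would fill them in from each machine's virtual IP page.

```python
def check_vhids(primary, secondary):
    """Return a list of problems; an empty list means the rule is satisfied.

    primary and secondary map each virtual IP to its VHID on that machine.
    """
    problems = []
    # Rule 1: on each machine, every virtual IP has its own VHID.
    for node_name, vips in (("primary", primary), ("secondary", secondary)):
        seen = {}
        for ip, vhid in vips.items():
            if vhid in seen:
                problems.append(
                    f"{node_name}: VHID {vhid} reused by {seen[vhid]} and {ip}"
                )
            else:
                seen[vhid] = ip
    # Rule 2: the VHID for a given IP matches on both machines.
    for ip in sorted(primary.keys() | secondary.keys()):
        vhid_a, vhid_b = primary.get(ip), secondary.get(ip)
        if vhid_a != vhid_b:
            problems.append(f"{ip}: VHID {vhid_a} on primary, {vhid_b} on secondary")
    return problems


if __name__ == "__main__":
    primary = {"10.0.254.4": 3, "192.168.0.10": 4}
    secondary = {"10.0.254.4": 3, "192.168.0.10": 5}
    for problem in check_vhids(primary, secondary):
        print(problem)
```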