CARP on Switch ports without portfast leading to double master-master problems ?
Any ideas how this issue (which has come up several times on the OpenBSD mailing lists) can be handled under FreeBSD/pfSense?
http://marc.info/?l=openbsd-misc&m=137414729621596&w=2
List: openbsd-misc
Subject: CARP on Switch ports without port fast leading to double master-master problems
From: Andy <andy () brandwatch ! com>
Date: 2013-07-18 11:34:11
Message-ID: 51E7D2B3.8050006 () brandwatch ! com

Others have discussed our problem but I cannot see that this has been
implemented (I cannot find a man page referring to this).
http://openbsd.7691.n7.nabble.com/carp-init-delay-td226187.html

I.e. when a firewall boots up, the connected switch port starts STP and
is initially blocked, causing the newly booting firewall to think it is
master. The port then starts forwarding and I have double master.

This causes issues with other daemons too which monitor the CARP state,
like sasyncd, bgpd etc.

I have enabled port fast where I can. However I cannot guarantee this,
and the WAN connections to our data centre network do not want to enable
port fast. This means I have to set a high advbase, but this ruins the
response time.

I could add "!sleep 5" to the top of carp interfaces as suggested in the
link above, but this really belongs in the kernel, as this only helps with
the firewall reboot condition and not all the other possible network
state changes like the removal of a NIC and reconnection (which
restarts STP etc).

Has this been done? :)
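A minimal sketch of the "!sleep" workaround mentioned above, assuming an OpenBSD hostname.carp0 file. The addresses, vhid, password and carpdev are placeholders, not values from the post, and the right delay depends on how long the switch port takes to start forwarding:

```
# /etc/hostname.carp0 (illustrative values only)
# Lines starting with "!" are run as shell commands when the
# interface is configured, so the sleep delays CARP start-up.
!sleep 5
inet 192.0.2.10 255.255.255.0 192.0.2.255 vhid 1 carpdev em0 pass examplepass advbase 1 advskew 0
```

As the post says, this only covers the boot case; a link that drops and returns later restarts STP without re-running this file.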
http://marc.info/?l=openbsd-misc&m=137459269317077&w=2
List: openbsd-misc
Subject: Re: CARP on Switch ports without port fast leading to double master-master problems
From: Andy <andy () brandwatch ! com>
Date: 2013-07-23 15:17:33
Message-ID: 51EE9E8D.7050608 () brandwatch ! com

Fantastic,
Thanks Stuart, that was really helpful!
Without even knowing it, your thoughts (suggesting manipulating
carpdemote) have also just helped me to resolve /another/ CARP issue I
have been battling with when using a direct crossover cable between the
firewalls.

Same issue as:
http://old.nabble.com/Unexpected-carp-failovers-when-using-crossover-cable-as-pfsync-syncdev-in-5.1-p33921868.html

When the backup is rebooted, the pfsync interface goes down, which causes
carpdemote to increment on the primary:
stfw1 kernel: carp: pfsync0 demoted group carp by 1 to 1 (pfsync link state down)
stfw1 kernel: carp: pfsync0 demoted group pfsync by 1 to 1 (pfsync link state down)

While the backup is rebooting, the pfsync interface goes up and down a few
times during POST, NIC BIOS etc, before OpenBSD starts to load.
This seems to cause the primary to start attempting a
bulk update ('carp interlock') before the backup is ready.

When the backup finally comes up, it requests a bulk update (even though,
I think, the primary is still attempting a bulk update in the opposite
direction, with the CARP interlock in place), and this fails. The backup
then goes master, as the primary has carpdemote=1 while the backup has
carpdemote=0, thus multiple masters.

On the primary we saw:
carp: pfsync0 demoted group carp by 1 to 1 (pfsync link state down)
carp: pfsync0 demoted group pfsync by 1 to 1 (pfsync link state down)
carp0: state transition: MASTER -> BACKUP <- Due to multi-master!
carp1: state transition: MASTER -> BACKUP <- Due to multi-master!
carp: pfsync0 demoted group carp by -1 to 0 (pfsync link state up)
carp: pfsync0 demoted group pfsync by -1 to 0 (pfsync link state up)
carp0: state transition: BACKUP -> MASTER <- Later corrects itself
carp1: state transition: BACKUP -> MASTER <- Later corrects itself

We can see the primary firewall had to quickly drop to 'backup', as the
secondary firewall made itself master.

On the secondary we saw:
carp: carp1 demoted group carp by 1 to 149 (carpdev)
carp: pfsync0 demoted group carp by 32 to 181 (pfsync init)
carp: pfsync0 demoted group pfsync by 32 to 32 (pfsync init)
carp: pfsync0 demoted group carp by 1 to 182 (pfsync bulk start)
carp: pfsync0 demoted group pfsync by 1 to 33 (pfsync bulk start)
carp: carp1 demoted group carp by -1 to 181 (carpdev)
carp: pfsync0 demoted group carp by -1 to 180 (pfsync bulk done)
carp: pfsync0 demoted group pfsync by -1 to 32 (pfsync bulk done)
carp: pfsync0 demoted group carp by -32 to 148 (pfsync init)
carp: pfsync0 demoted group pfsync by -32 to 0 (pfsync init)
carp0: state transition: BACKUP -> MASTER
carp1: state transition: BACKUP -> MASTER
carp0: state transition: MASTER -> BACKUP
carp1: state transition: MASTER -> BACKUP

This was fixed by adding:
!ifconfig -g carp carpdemote 1
!ifconfig -g pfsync carpdemote 1

to each physical interface's 'hostname.if', and then adding:
sleep 120
ifconfig -g carp -carpdemote 3
ifconfig -g pfsync -carpdemote 3

NB: there are 3 physical interfaces (INT, EXT, and pfsync's physical
interface), so each counter is raised by 3 in total at boot, hence the 3.

This completely stabilises a flapping pfsync interface during reboots :)
Cheers, Andy.
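The delayed un-demote step above could be sketched as a small shell helper. The post does not say where the "sleep 120" lines live; this assumes /etc/rc.local, as Stuart suggested. The function name `carp_undemote` and the `IFCONFIG` override are hypothetical, added only so the logic can be dry-run without touching real interfaces:

```sh
#!/bin/sh
# Sketch only: delayed removal of the boot-time CARP demotion.
# IFCONFIG can be overridden (e.g. IFCONFIG=echo) for a dry run.
IFCONFIG="${IFCONFIG:-ifconfig}"

# carp_undemote DELAY COUNT [GROUP...]
# Waits DELAY seconds, then lowers the demotion counter of each
# interface group by COUNT. COUNT should equal the total added at
# boot (3 in the post: one "carpdemote 1" per physical interface).
# With no GROUP arguments, the carp and pfsync groups are used.
carp_undemote() {
    delay="$1"; count="$2"; shift 2
    [ "$#" -gt 0 ] || set -- carp pfsync
    sleep "$delay"
    for group in "$@"; do
        "$IFCONFIG" -g "$group" -carpdemote "$count"
    done
}
```

From rc.local this might be called as `carp_undemote 120 3 &`, so the rest of the boot is not held up for two minutes. Like the original, it does nothing for a link that is lost and regained after boot.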
On 22/07/13 22:26, Stuart Henderson wrote:
> On 2013-07-22, Andy <andy@brandwatch.com> wrote:
>> For example we are connected to various providers in various
>> locations (we have many OpenBSD firewalls and this is only a problem in
>> some locations) where they won't enable port fast/configure as static
>> access ports.
>
> I would think this is the minority, and that most places are either on switches
> not smart enough for STP, or where the admins can configure them appropriately
> for the connected devices. In either case the extra delay would be unwanted
> (and how long would you delay for anyway? It depends on switch configuration).
>
> BTW an alternative to "sleep" in the network scripts would be to use
> "!ifconfig -g carp carpdemote" in a hostname.if file, then in rc.local
> maybe a sleep and then "ifconfig -g carp -carpdemote". However neither of
> these account for the situation where you lose and re-gain link after boot.
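On the FreeBSD side, a rough analogue of the carpdemote trick might look like the sketch below. It assumes a FreeBSD release whose carp(4) documents `net.inet.carp.demotion` as writable, with the signed value written being added to the current demotion factor; older releases (and pfSense builds based on them) may not behave this way. The function name `carp_demote_window`, the `SYSCTL` override and the example amounts are hypothetical, and nothing here comes from the quoted thread:

```sh
#!/bin/sh
# Sketch only: hold this host demoted for a fixed window after boot.
# SYSCTL can be overridden (e.g. SYSCTL=echo) for a dry run.
SYSCTL="${SYSCTL:-sysctl}"

# carp_demote_window AMOUNT DELAY
# Adds AMOUNT to the CARP demotion factor, waits DELAY seconds,
# then subtracts the same AMOUNT again.
carp_demote_window() {
    amount="$1"; delay="$2"
    "$SYSCTL" net.inet.carp.demotion="$amount"
    sleep "$delay"
    "$SYSCTL" net.inet.carp.demotion="-$amount"
}
```

Something like `carp_demote_window 240 120 &` early in boot would keep the host demoted while the switch port is still blocked. As with the OpenBSD variants, it does not cover a link that is lost and regained after boot.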