Getting pfsense to failover with a bridge using the CD-ROM platform
-
This is how I set up a failover bridge.
I've read you can set up spanning tree to do this too, but I wanted something I know would always work, since I have had issues with STP in the past.
I hope this helps someone else who really wants a failover bridge.
For the record, I am using the 1.2BETA1 live CD; 1.0.1 did not seem to work correctly with the necessary cron entries.
I configured a standard bridge.
Then I configured a third interface (OPT1) to manage the firewall.
I set up CARP with the following settings:
Synchronize Enabled
sync interface = management interface
Synchronize rules
Synchronize Firewall Schedules
Synchronize aliases
Synchronize Virtual IPs
And obviously set up the Sync to IP and password.
Next I created a virtual IP address on the management interface so CARP has something to work with.
This IP is not used for anything except to create a carp0 interface.
Then I created a scripts folder on the floppy disk.
In the scripts folder I created a file named brstat.sh.
brstat.sh:
#!/bin/sh
# Take the bridge down on the CARP backup; bring it up otherwise.
if /sbin/ifconfig carp0 | grep BACKUP >/dev/null 2>&1; then
    /sbin/ifconfig bridge0 down
else
    /sbin/ifconfig bridge0 up
fi
Then I manually edited the config.xml file.
In <system> I added:
<shellcmd>cp -R /tmp/mnt/cf/scripts /tmp;chmod +x /tmp/scripts/*;/sbin/ifconfig bridge0 down</shellcmd>
This copies the script(s) from the floppy to the /tmp disk and makes them executable.
It also shuts down the bridge on bootup.
In the <cron> section I added:
<minute>*/1</minute>
<hour>*</hour>
<mday>*</mday>
<month>*</month>
<wday>*</wday>
<who>root</who>
<command>/tmp/scripts/brstat.sh</command>
How it works:
The server that is the MASTER according to CARP will have its bridge0 interface brought up via the brstat.sh script run from cron.
The server(s) that are not the MASTER will have the bridge interface taken down via the brstat.sh script.
Failover usually takes 30-120 seconds.
It seems to work really well for me using managed and unmanaged switches on both sides of the bridge.
Enjoy!
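For anyone who wants to try the logic without touching a live bridge, the decision brstat.sh makes can be split from the command it runs. This is only a rough sketch, not the script from the post: the function names bridge_action and apply_bridge_action are made up for illustration, and it assumes the carp0 status text contains the word BACKUP on the backup node, as the original grep does.

```shell
#!/bin/sh
# Sketch only: bridge_action and apply_bridge_action are hypothetical names,
# not part of pfSense.

# Print "down" if the given ifconfig text reports BACKUP, otherwise "up".
# This mirrors brstat.sh: anything that is not BACKUP enables the bridge.
bridge_action() {
    case "$1" in
        *BACKUP*) echo down ;;
        *)        echo up ;;
    esac
}

# Read the live CARP status and apply the result to bridge0.
apply_bridge_action() {
    status=$(/sbin/ifconfig carp0 2>/dev/null)
    /sbin/ifconfig bridge0 "$(bridge_action "$status")"
}
```

Calling apply_bridge_action at the end of the file would make it behave like the cron script. Like the original, it treats every state other than BACKUP (including a missing carp0) as a reason to bring the bridge up.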
-
Very interesting!
I'd be very leery of running this though if you don't have STP on your switches (if they're unmanaged, or it's disabled). This will likely create a temporary L2 loop upon failover, unless the entire system or one of the bridged interfaces fails. It also has the potential, if for some reason the bridges wouldn't be brought up/down properly, of creating a permanent L2 loop. For those that have never experienced an L2 loop, it means your network is going to completely stop working.
STP (on switches) would be much better than something of this nature, assuming the switch you're using isn't buggy. But it can be a real pain to properly configure STP if you're not very familiar with its intricacies and best practices, which can vary from one switch vendor to another. If not done right, it can cause all kinds of problems.
-
I agree that STP would be best if you have a managed switch.
Configured properly, having both bridges active at the same time should not be an issue.
However, I have also seen STP not work so well on some switches, so I feel more comfortable having this solution to ensure that only one bridge is active at a time.
It should be noted that this solution is not perfect.
If something goes wrong with CARP and both systems think they are the master, you will end up with a loop and the bridge will quit working.
-
But don't you have this problem too when you use normal failover on pfSense non-bridged firewalls?
It should be just as buggy in that case.
Maybe we can work on a second check or something like it?
With a disk installation, is it just a matter of changing that shellcmd line?
Another question: why do you use the management interface as the sync interface and not a separate one? Did you just run out of NICs?
Can you please describe what you have done on which system? OK, I think the CARP settings are known to everyone who has read the docs, but please say more about the scripts and the changes to the files.
For the rest, the solution seems very nice, thanks!
-
Does really no one use this solution?
-
Guys, sorry to kick this one again, but is anyone able to explain this a little bit more?
- Is "BACKUP" in the script the name of the second pfSense machine?
- How and where do I make that scripts folder on an HD installation?
- More info is also welcome.
I think a lot of people would like to use this too, so please respond with detailed info.
-
I had another question on this - if one firewall fails (ie the link isn't dropped, the firewall just locks up or stops working properly) will this fail over correctly? We just had our pfsense box fail after nearly 1 year of uptime today, which made me re-approach the idea of having a failover solution. We're running a filtered bridge and we run public webservers behind the filtered bridge, so switching to a routed setup isn't the correct solution.
-
I've been away a long time and was just revisiting the forums here...
"BACKUP" is the status of the CARP interface; the script is trying to determine whether it's running on the BACKUP server or the MASTER (primary) server.
It does this by looking at the output of this command: ifconfig carp0
If the machine is the BACKUP then it disables the bridge.
If the machine is not BACKUP then it enables the bridge.
I've never used a hard drive install, but I would assume you could place the script anywhere.
Just make sure to chmod it so it's executable (chmod +x).
Make sure to change the path to the script in the cron job too:
<minute>*/1</minute>
<hour>*</hour>
<mday>*</mday>
<month>*</month>
<wday>*</wday>
<who>root</who>
<command>/tmp/scripts/brstat.sh</command>
I believe I used vi to create the file; it was some time ago, so I do not remember exactly.
You could create it on another machine and copy it into your installation using a floppy too.
With STP switches both bridges can be active at the same time; as long as one of the bridges is up everything will work.
So with STP you do not need to do any of this.
STP works great and you can still use CARP to sync rules so you only need to maintain one set of rules.
If you have switches that do not support STP this is the only method I know of that will allow you to have a failover bridge.
This might be handy for other uses besides bridges.
Maybe you need to perform some action when the status of CARP changes on a pfSense firewall.
Simply modify the script to perform the necessary tasks.
mbreitba also asked if this works if one of the bridges fails.
Yes, that's exactly what this is trying to solve. As long as CARP can detect the failure, the bridge will become active on the other node.
This solution has worked well for me after nearly a year in production.
Hopefully someone else will find it useful too.
-
Hi,
This approach works quite OK actually, at least in theory.
The problem I'm thinking about is that the master needs to be totally dead before the slave will take over.
Ideally you would have a script that pings the WAN and LAN interfaces, and when it can't ping one of the two, the slave takes over.
I think this is the only problem so far...
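A rough sketch of what such a ping check might look like, in case it helps the discussion. Nothing here comes from pfSense itself: all_reachable and PING_CMD are names invented for this example, and the hosts to probe (one on each side of the bridge) would have to be chosen per site.

```shell
#!/bin/sh
# Sketch only: all_reachable and PING_CMD are hypothetical names. PING_CMD can
# be overridden so the logic can be tried without a network.
PING_CMD=${PING_CMD:-"ping -c 1"}

# Return 0 only if every host given as an argument answers one probe.
all_reachable() {
    for host in "$@"; do
        $PING_CMD "$host" >/dev/null 2>&1 || return 1
    done
    return 0
}
```

A cron job on the master could call all_reachable with one WAN-side and one LAN-side address and, when it fails, take its own bridge down. How best to make the master give up its CARP role at that point is not covered in this thread, so that step is left out.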
-
A slightly better fix for this is possible now with 1.2.3-RC and 2.0, but it's still not ideal.
The CARP interface will now report a transition to MASTER as a "link up" event, and a transition to BACKUP as a "link down" event to the system. These can be caught with devd and used to call scripts on these events – no more need to rely on cron or a delay. This will happen instantaneously once the CARP interface on the backup takes over.
This is more meant for a full install, but I suppose it could be altered to work as the initial solution was for a livecd/embedded platform.
If you are running a recent (as of the date on this post) snapshot of 1.2.3, or 2.0, you can try this.
Edit /etc/devd.conf, and add the following:
notify 100 {
    match "system" "IFNET";
    match "type" "LINK_UP";
    match "subsystem" "carp";
    action "/usr/local/bin/carpup $subsystem";
};
notify 100 {
    match "system" "IFNET";
    match "type" "LINK_DOWN";
    match "subsystem" "carp";
    action "/usr/local/bin/carpdown $subsystem";
};
In this instance, you don't really need the $subsystem variable, but it may be useful if you want to perform other actions. It contains the name of the actual carp interface that transitioned. If you want to lock this down to just one carp interface, you could change the subsystem match to "carp0" or "carp1", whichever you like.
Restart devd (or reboot):
killall -9 devd && /sbin/devd
You can then create the scripts mentioned on the "action" line above. For this case, it would be two different scripts:
/usr/local/bin/carpup:
#!/bin/sh
/sbin/ifconfig bridge0 up
/usr/local/bin/carpdown:
#!/bin/sh
/sbin/ifconfig bridge0 down
Finally, make sure those are executable:
chmod a+x /usr/local/bin/carpup
chmod a+x /usr/local/bin/carpdown
You could add anything else that you want to these scripts. Calling some sort of notification program would be useful, or whatever else is desired.
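As one possible illustration of adding a notification, both scripts could share a small helper that writes a syslog entry before changing the bridge. This is a sketch under assumptions, not part of the setup above: the function names and the "carpbridge" log tag are invented here, and it relies only on the standard logger utility.

```shell
#!/bin/sh
# Sketch only: carp_event_message, handle_carp_event and the "carpbridge" tag
# are hypothetical names chosen for this example.

# Build the log line for an event ("up" or "down") on a CARP interface.
carp_event_message() {
    echo "CARP $2 went $1, setting bridge0 $1"
}

# Log the transition to syslog, then change the bridge state.
# $1 is "up" or "down"; $2 is the interface name devd passes as $subsystem.
handle_carp_event() {
    logger -t carpbridge "$(carp_event_message "$1" "$2")"
    /sbin/ifconfig bridge0 "$1"
}
```

With this in place, carpup would end with handle_carp_event up "$1" and carpdown with handle_carp_event down "$1".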
I'm trying to come up with some sort of generic detection code that would take the carp interface, and attempt to see if its parent interface is a bridge member, and if so, bring down that bridge member. A little more complex, but it is a more generic solution that should work in more, similar, scenarios.
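One piece of that detection, checking whether a given interface is a member of a bridge, might be sketched as below. It assumes ifconfig lists bridge members on lines of the form "member: em0 flags=...", and that the parent interface name is already known; working out the parent of a CARP interface is not attempted. The function names are made up for illustration.

```shell
#!/bin/sh
# Sketch only: bridge_members and is_bridge_member are hypothetical names.
# Assumes ifconfig prints bridge members as lines like "member: em0 flags=...".

# Print member interface names found in ifconfig output read from stdin.
bridge_members() {
    awk '$1 == "member:" { print $2 }'
}

# Return 0 if interface $1 appears as a member in the ifconfig text $2.
is_bridge_member() {
    echo "$2" | bridge_members | grep -Fqx "$1"
}
```

For example, is_bridge_member em0 "$(/sbin/ifconfig bridge0)" would succeed only when em0 is listed as a member of bridge0.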