STP hack-fix for bridged firewall:



  • Hello everybody! I am running PfSense 2.2.6-RELEASE (amd64).

    My scenario:

    In my company, we use PfSense as a bridged firewall. There is a switch on each side of the bridge:

    [switch inside] –lan [PfSense firewall in bridge-mode] wan– [switch outside]

    We want to establish redundancy, meaning the setup will look like this:

    [switch inside 1] –lan [PfSense firewall 1] wan– [switch outside 1]
        |                                                    |
        |                                                    |
    [switch inside 2] –lan [PfSense firewall 2] wan– [switch outside 2]

    To prevent bridge loops, we use RSTP to block redundant connections. We configure RSTP with the following priorities, so that the link between outside 1 and outside 2 is the one that gets blocked:

    [    0    ] –lan [ 4096 ] wan– [ 32768]
        |                                            X
        |                                            X
    [  8192] –lan [12288] wan– [ 32768]

    Since the two outside switches are not directly manageable by us, they have default priorities. In my test setup I have tried adjusting these priorities too, but to no avail - the problem persists.
    As far as I can see, these priorities should settle the RSTP topology with a blocking state between outside 1 and outside 2.
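    For reference, a minimal sketch of how such a priority could be set by hand with FreeBSD's ifconfig(8), which PfSense runs on. The script only prints the commands instead of running them. The names bridge0, em0 and em1 are assumptions for illustration, not the names on our boxes:

```shell
#!/bin/sh
# Sketch: print the ifconfig(8) commands that would enable RSTP on a bridge
# and set its bridge priority. Nothing is executed here; pipe the output
# to sh on the firewall to apply it.
# Assumed names: bridge0 (bridge), em0/em1 (member interfaces).

rstp_commands() {
    bridge=$1
    priority=$2
    shift 2
    echo "ifconfig $bridge proto rstp"
    echo "ifconfig $bridge priority $priority"
    for member in "$@"; do
        # "stp <interface>" enables spanning tree on that member port
        echo "ifconfig $bridge stp $member"
    done
}

# PfSense firewall 1 in the diagrams above uses priority 4096
rstp_commands bridge0 4096 em0 em1
```

    This is only meant to show which knobs the priorities in the diagram map to. As described further down, setting them by hand did not fix the boot-up problem for me.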

    Now for the problem:

    When I open the configuration interface for our bridge, go to the settings and press Enter to submit them, RSTP correctly blocks the connection between outside 1 and outside 2.

    [    0    ] –lan [ 4096 ] wan– [ 32768] outside 1
        |                                            |
        |                                            X
    [  8192] –lan [12288] wan– [ 32768] outside 2

    But when I reboot PfSense 1, for example, STP no longer converges correctly. Instead, the wan interface of PfSense 1 ends up blocked:

    [    0    ] –lan [ 4096 ] XX –- [ 32768] outside 1
        |                                            |
        |                                            |
    [  8192] –lan [12288] wan– [ 32768] outside 2

    When I examine outside switch 1, it reports that its path to the root bridge goes through the interface connected to outside 2 - this shouldn't be the case!

    From outside 1's perspective, two paths to our root bridge (inside 1, priority 0) are available: outside 2 --> PfSense 2 --> inside 2 --> root bridge, and PfSense 1 --> root bridge. Every bridge on the path through the primary equipment line (PfSense 1 and inside 1) has a lower priority number (effectively a higher priority). Not only does the primary route offer the highest-priority path to our root bridge overall, but PfSense 1 is also the more desirable immediate neighbour:

    [PfSense 1 - 4096 ]–--[outside 1 - 32768]
                                                  |
                                                  |
                                      [outside 2 - 32768]
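    A side note on the reasoning above: as far as I understand 802.1D/RSTP, once the root bridge is agreed on, a switch picks its root port by the lowest accumulated root path cost first, and only falls back to the neighbour's bridge ID (priority plus MAC) as a tie-breaker. A rough sketch of that comparison for outside 1, assuming every link has the same cost (the value 20000 is an assumption, not something I measured):

```shell
#!/bin/sh
# Sketch: compare the two paths from outside 1 to the root bridge (inside 1)
# by accumulated root path cost, which spanning tree compares before it
# looks at the neighbour's bridge priority.
# LINK_COST is an assumption (same cost on every link), not a measured value.
LINK_COST=${LINK_COST:-20000}

path_cost() {
    # $1 = number of links between outside 1 and the root bridge
    echo $(( $1 * LINK_COST ))
}

# outside 1 -> PfSense 1 -> inside 1
primary=$(path_cost 2)
# outside 1 -> outside 2 -> PfSense 2 -> inside 2 -> inside 1
secondary=$(path_cost 4)

if [ "$primary" -lt "$secondary" ]; then
    echo "root port should face PfSense 1 (cost $primary < $secondary)"
else
    echo "root port should face outside 2 (cost $secondary <= $primary)"
fi
```

    With equal link costs this prints "root port should face PfSense 1 (cost 40000 < 80000)", so under that assumption path cost and priorities both point to the primary route.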

    This has led me to think that something is wrong with the way PfSense configures STP on boot-up. I have even tried firing off ifconfig commands to flip the interfaces on our PfSense boxes, but that does not help. STP only comes up correctly when it is the PfSense software itself that applies the bridge configuration (I think PfSense has some bytecode implemented to control interfaces?).

    My solution (hack-fix):

    I tried installing PfSense 2.3 after an answer to my other question on r/PfSense (https://www.reddit.com/r/PFSENSE/comments/438lim/pfsense_stp_not_starting_on_bootup/), but to no avail - that introduced some other bugs instead.

    Instead, I observed that it works whenever I enter and submit the configuration through the web interface. My course of action is therefore to run the same function on boot-up. Reading through the code, I found that the function involved is interface_bridge_configure(&$bridge,$checkmember) in /etc/inc/interfaces.inc.

    This function can be called from the official PfSense Shell, but for that to work I need to know the input parameters. To find these out, I inject code into the interface_bridge_configure function:

    In /etc/inc/interfaces.inc, inside function interface_bridge_configure(&$bridge,$checkmember):
            file_put_contents("/root/custom/bridge.json",json_encode($bridge)); // CUSTOM
            file_put_contents("/root/custom/checkmember.txt",$checkmember); // CUSTOM
            This saves the input parameters as a JSON array and a plain-text value

    Then I create a shell-script in /root/custom/bootup-stp.sh:
          /usr/local/sbin/pfSsh.php < /root/custom/stp-commands.php
          This calls the PfSense Shell with our input commands
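    A slightly more defensive variant of that script could look like the sketch below. It only calls the PfSense Shell if the saved parameters and the command file are actually there. The function name run_stp_fix and the two overridable variables are my own additions for illustration; the default paths are the ones used in this post:

```shell
#!/bin/sh
# Sketch of a more defensive bootup-stp.sh: only call the pfSense shell if
# the saved parameters and the command file are present.
# CUSTOM_DIR and PFSSH can be overridden from the environment; the defaults
# are the paths used in this post.
CUSTOM_DIR=${CUSTOM_DIR:-/root/custom}
PFSSH=${PFSSH:-/usr/local/sbin/pfSsh.php}

run_stp_fix() {
    for f in "$CUSTOM_DIR/bridge.json" "$CUSTOM_DIR/checkmember.txt" \
             "$CUSTOM_DIR/stp-commands.php"; do
        # -f: the path exists and is a regular file (it may be empty)
        if [ ! -f "$f" ]; then
            echo "bootup-stp: missing $f, skipping" >&2
            return 1
        fi
    done
    # Feed the command file to the pfSense shell on stdin
    "$PFSSH" < "$CUSTOM_DIR/stp-commands.php"
}
```

    The last line of bootup-stp.sh would then simply be run_stp_fix. The point of the check is that on a fresh box, where the GUI has not yet written bridge.json, the script fails with a message instead of feeding empty parameters to the bridge function.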

    My input commands are as follows in /root/custom/stp-commands.php:
          $bridge = json_decode(file_get_contents('/root/custom/bridge.json'),true);
          $checkmember = file_get_contents('/root/custom/checkmember.txt');
          interface_bridge_configure($bridge,$checkmember);
          exec
          exit
          With this I load our saved input variables, pass them to the PfSense bridge-configuration function and then exit

    Then in our configuration-file in /cf/conf/config.xml:
          …
          <shellcmd>/bin/sh /root/custom/bootup-stp.sh</shellcmd>

    <interfaces>...
          I tell PfSense to run my startup script bootup-stp.sh at the end of boot-time, as recommended by PfSense

    In short, this means that every time I change the configuration from the GUI, the input parameters used for the bridge function are saved in our custom folder /root/custom, and the same parameters are replayed on every boot.
    It is largely dynamic, but vulnerable in case PfSense updates /etc/inc/interfaces.inc.
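    To at least notice when an update has overwritten the injected lines, a small check like the sketch below could be run after upgrading. It relies on the "// CUSTOM" marker from the two file_put_contents lines above; the function name and the INC_FILE variable are my own, hypothetical additions:

```shell
#!/bin/sh
# Sketch: warn if the "// CUSTOM" lines injected into interfaces.inc have
# disappeared, e.g. after a pfSense update replaced the file.
# The expected count (2) matches the two file_put_contents lines.
INC_FILE=${INC_FILE:-/etc/inc/interfaces.inc}

custom_hook_present() {
    # grep -c prints the number of matching lines; if the file is missing
    # it prints nothing, which is treated as 0 below
    count=$(grep -c '// CUSTOM' "$INC_FILE" 2>/dev/null)
    [ "${count:-0}" -ge 2 ]
}

if ! custom_hook_present; then
    echo "warning: STP hook missing from $INC_FILE" >&2
fi
```

    It does not repair anything; it only tells me that the parameters in /root/custom will no longer be refreshed when the bridge is changed from the GUI.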

    What I would like to see in the future:

    I get that it may not be a high priority, as a redundant, bridged firewall configured via STP/RSTP is unconventional. But still, it would be lovely to see a fix for this :)
    As mentioned earlier, I tried 2.3 when it was in beta, but that added too many quirks other than this, so it wasn't viable for our situation. It may be or may have been solved with an even newer release, but I haven't tested that yet.

    Thanks for reading - and please reply correcting me if I have wrong assumptions about STP, a stupid setup or something else! :)</interfaces>