Netgate Discussion Forum

    Working on getting OpenVPN server bridging to fly.

    • N
      Numbski

      /etc/rc.bootup, line 181.

      See any harm in moving that down two commands so it comes after openvpn_resync_all();?  Theoretically, that would mean OpenVPN is up and tap0 is created before the bridges are brought online, right?

      The only thing that comes to mind that runs all the time is /usr/local/sbin/check_reload_status, which is a binary daemon, not PHP and not a cron job.  It appears to just keep checking /tmp/check_reload_status, which usually says "sleeping" unless something more interesting is going on.  I don't know what it does if there's something more interesting going on, though.

      • N
        Numbski

        Promise, last post for the night.

        The shellcmd tags DO work, but it takes not one but two reboots for them to take effect.  I haven't the slightest idea why that is, but after the first reboot, nothing happens.  Reboot again, and it works.  ???

        Really.  Going to go rest now.

        Really.

        • S
          sullrich

          It's handy to remove /tmp/config.cache before signaling a reload.

          Ie: rm /tmp/config.cache from the command prompt after making config.xml changes.
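A minimal sketch of that step as a shell helper, assuming the cache lives at /tmp/config.cache as stated above. The function name clear_config_cache and its optional path argument are made up for illustration and are not part of pfSense.

```sh
#!/bin/sh
# Hypothetical helper (not shipped with pfSense): remove the cached copy
# of config.xml so that the next reload re-reads the file from disk.
# The default path is the one named in the post above.
clear_config_cache() {
    cache="${1:-/tmp/config.cache}"
    # -f: do not complain if the cache file is already gone
    rm -f "$cache"
}
```

Calling clear_config_cache with no argument after editing config.xml, and before signaling the reload, is equivalent to the rm command quoted above.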

          • N
            Numbski

            Had another lockup today.  Lasted for about 18 hours, and then same behavior.  I'm seriously going to have to recompile the kernel with sw_watchdog until I figure this out.  It's completely maddening, as the second firewall never picks up because "technically" the first box is still up, but not really. :\

            • N
              Numbski

              Okay.  I just finally got the kernel with sw_watchdog enabled built.  I'll share it here just in case someone else finds it useful.

              http://www.numbski.net.nyud.net:8080/downloads/pfSense/kernel-with-sw_watchdog.tar.gz

              • N
                Numbski

                Since I have now given the sw_watchdog a proper workout, now I need to figure wtf is causing this.  grrr….

                Please note I'm now officially grasping for straws.  This post is from 2004:

                http://lists.osdl.org/pipermail/bridge/2004-January/000146.html

                I've added a crontab to remove stp from both bridge interfaces.  We'll see how it goes.

                Also, if anyone has a good idea of what I can do to get a proper dump of the kernel when the watchdog fires, please let me know.  Nothing useful is getting logged.

                • S
                  sullrich

                  Did you ever send your configuration to Andrew Thompson?

                  • N
                    Numbski

                    No. :(

                    Part of the problem is that the main config I'm doing this on is kinda confidential.  I have another one I can send him, but I've been too tied up to get it over to him.

                    I'll make a concerted effort to get that over to him "soon".

                    • S
                      sullrich

                      No offense, but we cannot help you until you send the configuration.

                      Andrew IS the maintainer of the if_bridge subsystem and he expressed his willingness to help but you continue to post messages at an alarming rate, not sending him the information he needs.

                      It will never get fixed at this rate.  Please send him the information he needs or just accept the fact that this will not work.

                      • N
                        Numbski

                        I just e-mailed him asking if a sanitized version of the config.xml would suffice.  I would really prefer not to go giving out password hashes and IP addresses. :(

                        • S
                          sullrich

                          Sanitize the passwords, but your fear about IP addresses is kinda silly.

                          If you trust the code that we put into this product then I don't see why you cannot trust someone knowing your ip address.

                          • N
                            Numbski

                            Sent.

                            • N
                              Numbski

                              Just made an observation.

                              These hangups seem to occur consistently when I'm sending a whole lot of traffic through the firewalls, such as a cvsup.  Doesn't have to be traffic across the vpn, just traffic in general.

                              • N
                                Numbski

                                Heh, sullrich.  You're not going to believe this.

                                I fully understand what you told me in irc about you guys not doing anything to or with tun/tap interfaces, and that everything is done via openvpn.

                                That said, after setting sysctl net.link.tap.user_open to 1, I've had the most uptime since I've started this whole debugging fiasco.  Totally odd.  Just thought I'd point it out in case someone might have an explanation for it.

                                To bring people who might be reading this up to speed, net.link.tap.user_open is set to 0 by default.  What that means is that only root (or a similarly privileged user) has permission to make changes to, or significantly impact, a tap interface.  When set to 1, non-privileged users can do the same.  This might be construed as a security concern, but for testing purposes there's no harm.  If indeed this "fixes" my problem, it raises more questions than it answers, as OpenVPN runs as root right now, meaning that either something else is touching the tap interface, OR OpenVPN is somehow dropping privs at some point.
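For anyone who wants to try the same setting, here is a rough sketch of how it could be made persistent. The sysctl name comes from the post above and /etc/sysctl.conf is the standard FreeBSD location for boot-time sysctls; the helper name persist_sysctl and its arguments are hypothetical, and this is not necessarily how pfSense itself applies the value.

```sh
#!/bin/sh
# On the firewall (FreeBSD, as root) the running value can be inspected and
# changed directly:
#   sysctl net.link.tap.user_open      # show the current value
#   sysctl net.link.tap.user_open=1    # change it until the next reboot
#
# Hypothetical helper: record "name=value" in a sysctl.conf-style file,
# replacing any earlier assignment of the same name so that repeated
# calls do not pile up duplicate lines.
persist_sysctl() {
    name="$1"
    value="$2"
    conf="${3:-/etc/sysctl.conf}"
    tmp="${conf}.tmp.$$"
    touch "$conf" || return 1
    # Keep every line whose key (text before the first "=") differs from name
    awk -F= -v n="$name" '$1 != n' "$conf" > "$tmp" || return 1
    echo "${name}=${value}" >> "$tmp"
    mv "$tmp" "$conf"
}
```

With that in place, persist_sysctl net.link.tap.user_open 1 would write the setting to /etc/sysctl.conf, while passing a third argument points it at a scratch file for testing.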

                                • N
                                  Numbski

                                  With the sysctl set, uptime is now over a day on the box that was kicking the bucket about once every three hours before.  (Crosses fingers and prays….)  Putting a pretty solid load on it too.

                                  • S
                                    sullrich

                                    I'll commit a change to force this sysctl for OpenVPN.

                                    Update: committed to /etc/sysctl.conf

                                    • N
                                      Numbski

                                      Thanks!

                                      • N
                                        Numbski

                                        Just updating the status on this.

                                        The watchdog daemon still has to kill the machine about once every 24 hours if it has any active OpenVPN sessions.  If no one connects, it stays up indefinitely.

                                        There is definitely a difference between a pfSense box that is bridged to a CARP-enabled interface and one that is not.  I have one with an uptime of over a month with the exact same config that has traffic flowing on it pretty consistently.  The difference is that neither WAN nor LAN is running CARP, whereas on the configs where the hangups occur, both WAN and the bridged interface are part of a CARP cluster.  The fact that I'm not all that familiar with how CARP really functions underneath doesn't help matters much.  All I know is that it broadcasts (and pfSense passes all bridge traffic by default, so that means CARP broadcasts are getting onto the OpenVPN tap interface), but I don't see how that would cause harm.

                                        • N
                                          Numbski

                                          is a glutton for punishment, I kid you not. :P

                                          Doing some research on CARP and OpenVPN, I came across this document:

                                          http://openvpn.net/archive/openvpn-devel/2005-10/msg00017.html

                                          The thought occurs to me: we synchronize states across firewalls in a CARP cluster.  Just speculating on how this happens, but it is possible that OpenVPN on system A tries to synchronize to system B and fails somehow.

                                          (This is mostly a note to myself to look into after I get back into the country, feel free to ignore me!)

                                          • N
                                            Numbski

                                            Since this thread is turning more into a blog and less into a support thread, I figured I should update it. :)

                                            I've posted a doc topic on how to get things running as I have them currently here:

                                            http://doc.pfsense.org/index.php/Setting_up_OpenVPN_with_pfSense#OpenVPN_Client_Bridging

                                            Now, what has changed for me since the last time I posted?  Well, up until sullrich beat it into me that I should not have tap0 assigned as an opt interface (heh), I had tried to bridge from the UI.  I have since scrapped that, and the bridge is brought up at boot time using shellcmd/earlyshellcmd.  Also, my uptime is at a new record since doing this….1 1/2 days. :)
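For readers wondering what a boot-time bridge looks like, here is a rough sketch, not the exact commands from this setup (those are in the doc page linked above). It assumes a FreeBSD if_bridge interface named bridge0, the OpenVPN device tap0, and a LAN NIC called em1; all three names are placeholders, and the helper bridge_cmds is made up for illustration. It only prints the ifconfig commands, so nothing is changed until they are run by hand or pasted into shellcmd entries.

```sh
#!/bin/sh
# Hypothetical dry-run helper: print the ifconfig commands that would
# bridge a tap device to a physical interface on FreeBSD.
# Arguments (all assumed placeholder names): bridge, tap device, LAN NIC.
bridge_cmds() {
    bridge="${1:-bridge0}"
    tap="${2:-tap0}"
    lan="${3:-em1}"
    echo "ifconfig $bridge create"                    # create the bridge
    echo "ifconfig $bridge addm $lan addm $tap up"    # add both members
    echo "ifconfig $tap up"                           # bring the tap device up
}
```

Running bridge_cmds bridge0 tap0 em1 prints three commands. The tap device has to exist before the addm step, which is why the ordering against OpenVPN startup discussed earlier in the thread matters.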

                                            We may finally have hit stability on this.  Crossing my fingers.  I'll update if my good luck continues, and if so, I'd like someone to volunteer to do a similar config.  If we have this licked, I'll start petitioning to have the config merged into the OpenVPN webui pages.

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.