No WAN connectivity after 2.3 upgrade



  • Dear All,

    Upgrading a 2.2.6 system to 2.3 with dual WAN, link aggregation, CARP, Quagga and NAT reflection led to no WAN connectivity. The system is based on Supermicro C2758 hardware. Please advise me on how to fix that.

    To gain confidence, I first upgraded a single Alix board system with no packages. That worked without issues. Then I upgraded the secondary server of a two-server CARP setup as described above. My aim was to test that first and then upgrade the primary system, which I have not done yet because the secondary system remains unusable.

    Accessing the system through the webConfigurator from the LAN is possible (although since the upgrade I need to restart the webConfigurator via the console once after booting the system). Pinging an obvious address like 8.8.8.8 via the WAN does not work, neither from the real WAN IPs nor from the WAN CARP VIPs. The only two exceptions over the WAN are (a) the IPs used by dpinger to monitor the gateway and (b) the devices immediately in front of the WAN interfaces. Pinging the sync interface of the other server in the CARP stack and pinging around in the LAN also works. Furthermore, the system determines its update status in the dashboard and resolves DNS through its unbound resolver. The CARP interfaces are in backup mode, which is normal for the secondary system.

    I normally direct LAN traffic to a round-robin load-balanced group of two WAN interfaces. Changing the rules to forward traffic to a single interface only does not make a difference.

    HAProxy is not working. When trying to open Services -> HAProxy, the result is:

    Warning: require_once(haproxy_utils.inc): failed to open stream: No such file or directory in /usr/local/pkg/haproxy.inc on line 36
    Call Stack:
    0.0001 226976 1. {main}() /usr/local/www/haproxy_listeners.php:0
    0.0628 2285528 2. require_once('/usr/local/pkg/haproxy.inc') /usr/local/www/haproxy_listeners.php:34
    Fatal error: require_once(): Failed opening required 'haproxy_utils.inc' (include_path='.:/etc/inc:/usr/local/www:/usr/local/captiveportal:/usr/local/pkg:/usr/local/www/classes:/usr/local/www/classes/Form') in /usr/local/pkg/haproxy.inc on line 36
    Call Stack:
    0.0001 226976 1. {main}() /usr/local/www/haproxy_listeners.php:0
    0.0628 2285528 2. require_once('/usr/local/pkg/haproxy.inc') /usr/local/www/haproxy_listeners.php:34

    Removing HAProxy (which is not a useful step for me) does not solve the issue.

    Regards,

    Michael Schefczyk

    P.S. Further findings:

    The state table seems to be empty. The primary system can ping the sync interface of the secondary system. The routing table is filled and probably plausible at first glance. The ARP table does also contain useful content, including both devices just outside the WAN interfaces.

    The gateway status was unknown at first. This could be changed by starting dpinger through services or the watchdog.

    I did remove Snort as the log did contain: "FATAL ERROR: pf.conf => Table snort2c does not exist in packet filter: No error: 0".

    Removing quagga does not solve the issue.

    I did press "remove" on traffic shaping for all interfaces and I think that no traffic shaping has been active beforehand and afterwards.
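
    The empty state table mentioned above can also be checked from a shell. What follows is only a minimal sketch under a few assumptions: the function names `count_pf` and `pf_summary` and the `PFCTL` override are made up for illustration and are not part of pfSense, while `pfctl -s states` and `pfctl -s rules` are the standard pf commands for listing states and loaded rules. Zero rules together with zero states would suggest that pf never received a ruleset.

    ```sh
    #!/bin/sh
    # PFCTL is a hypothetical override for testing; it defaults to the real pfctl.
    PFCTL="${PFCTL:-pfctl}"

    # Count the lines printed by "pfctl -s <what>" (for example states or rules).
    count_pf() {
        "$PFCTL" -s "$1" 2>/dev/null | wc -l | tr -d ' '
    }

    # Print a one-line summary of loaded rules and active states.
    pf_summary() {
        rules=$(count_pf rules)
        states=$(count_pf states)
        echo "rules=$rules states=$states"
        if [ "$rules" -eq 0 ]; then
            echo "no rules loaded: pf probably never received a ruleset"
        fi
    }
    ```

    Running `pf_summary` as root on the affected system and on a healthy one should make the difference visible at a glance.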



  • After removing/reinstalling the haproxy package, do you still see errors related to haproxy? In the 2.3 package most haproxy files moved to /haproxy subfolders of the original www and pkg locations.

    An empty state table indicates pf didn't load any rules. Can you try this?

    ```
    pfctl -f /tmp/rules.debug
    ```



  • Dear PiBa,

    Thank you very much for your reply!

    The 2.3 system causing the problem is a secondary CARP member receiving HA sync from a primary 2.2.6 system. I thought that the state table might be empty because the state table sync (primary 2.2.6 -> secondary 2.3) does not work in that scenario. If I reinstall HAProxy, I even see the frontends and backends, but HAProxy will not start, which is normal for a secondary system, of course. Still, a secondary system should have WAN connectivity.

    If I enter pfctl -f /tmp/rules.debug on my three running 2.2.6 systems (two running as primary CARP members and one as a secondary), I get no output (silent). The shell output of the problematic 2.3 system is:

    pfctl: /tmp/rules.debug: No such file or directory
    pfctl: cannot open the main config file!: No such file or directory
    pfctl: Syntax error in config file: pf rules not loaded

    What does this imply, please?

    Regards,

    Michael



  • Remove the old haproxy.inc that's still there.

    rm /usr/local/pkg/haproxy.inc
    

    and see where that leaves you. I don't think that's enough to cause the extent of problems you're seeing though.
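
    A cautious way to confirm the leftover file before deleting it, as a minimal sketch: `find_stale_haproxy_inc` is a hypothetical helper name, and it assumes that the presence of a `haproxy/` subfolder in the pkg directory means the 2.3 package layout is installed, so a top-level `haproxy.inc` beside it is the old copy.

    ```sh
    #!/bin/sh
    # Print the path of a legacy haproxy.inc that sits next to the new haproxy/ subfolder.
    # $1: package include directory (defaults to /usr/local/pkg, as in the error message).
    find_stale_haproxy_inc() {
        pkgdir="${1:-/usr/local/pkg}"
        if [ -f "$pkgdir/haproxy.inc" ] && [ -d "$pkgdir/haproxy" ]; then
            echo "$pkgdir/haproxy.inc"
            return 0
        fi
        # Nothing that looks like a leftover was found.
        return 1
    }
    ```

    For example, `stale=$(find_stale_haproxy_inc) && echo "leftover: $stale"` only prints a path when both the old file and the new subfolder exist; the actual `rm` stays a manual step.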

    If you can get me into that system, PM me here or /msg cmb on Freenode if you're on IRC.



  • Dear Chris,

    Thank you very much! I did apply the rm /usr/local/pkg/haproxy.inc before without success. I suspect that this will boil down to something with CARP. Thus, I will PM.

    Regards,

    Michael



  • We got Michael's system going. Odd one, the filter reload process ended up stuck (running /etc/rc.filter_configure_sync via SSH just sat there and never returned).

    It's because of something that was in /usr/local/pkg/, but even after removing the files there one by one it never started working again. Then we rebooted it in case something had gotten stuck (we had also restarted php-fpm and killed any processes that should have been related), and everything was perfectly fine post-reboot.

    Have some ideas on the source of the issue, looking at how to replicate.
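
    For anyone seeing the same symptoms, the ruleset check from earlier in this thread can be made a little more telling. This is a sketch, not pfSense tooling: `check_ruleset` and the `PFCTL` override are hypothetical names. It relies on pfSense writing its generated ruleset to /tmp/rules.debug during a filter reload, so a missing file points at the reload never finishing rather than at a syntax error. `pfctl -n -f` only parses the file and does not load it.

    ```sh
    #!/bin/sh
    # PFCTL is a hypothetical override for testing; it defaults to the real pfctl.
    PFCTL="${PFCTL:-pfctl}"

    # Return 2 if the generated ruleset is missing, 1 if it fails to parse, 0 if it parses.
    check_ruleset() {
        rules="${1:-/tmp/rules.debug}"
        if [ ! -f "$rules" ]; then
            echo "missing: $rules (the filter reload has probably not written a ruleset)"
            return 2
        fi
        # -n parses the file without loading it into pf.
        if "$PFCTL" -n -f "$rules"; then
            echo "ok: $rules parses cleanly"
            return 0
        fi
        echo "error: $rules has syntax errors"
        return 1
    }
    ```

    On Michael's system this would have reported the missing file, which matches the filter reload hanging before it produced any output.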



  • Just thought I'd add that I had the same problem this morning, with a similar configuration (HAProxy, CARP). Eventually I gave up and reverted to factory defaults, then imported a previous config file and that seemed to get it going again. HAProxy is still broken, but at least WAN routing works again.

    I'll keep these remarks in mind for the next round of upgrades. Thanks Chris (and Michael).



  • Dear all,

    Removing the freeradius, haproxy, NRPE, NUT and Quagga packages before the upgrade solved the issues on my three further machines. Reinstalling freeradius and haproxy worked after those upgrades.
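
    As a rough pre-upgrade checklist, the package list above can be turned into a small filter. This is only a sketch: `flag_packages` is a hypothetical helper, the list is simply the set of packages removed in this thread and not an official compatibility list, and the package names fed in on stdin are whatever your own system reports.

    ```sh
    #!/bin/sh
    # Packages removed before upgrading in this thread (not an official list).
    REMOVE_FIRST="freeradius haproxy nrpe nut quagga"

    # Read package names on stdin (one per line) and print those whose name
    # starts with one of the entries above, compared case-insensitively.
    flag_packages() {
        while IFS= read -r name; do
            lower=$(printf '%s' "$name" | tr '[:upper:]' '[:lower:]')
            for p in $REMOVE_FIRST; do
                case "$lower" in
                    "$p"*) printf '%s\n' "$name"; break ;;
                esac
            done
        done
    }
    ```

    Piping a list of installed package names into `flag_packages` prints only the ones worth removing first; the prefix match is deliberate so that variants such as a versioned freeradius package are still caught.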

    Looking at the packages, I very much welcome freeradius2-2.2.9, as that can do WLAN EAP-TLS with all recent Windows and Android devices. I had side-stepped to a dockerized version outside pfSense for five months, which is now no longer needed. I severely miss NUT and I do somewhat miss NRPE.

    The only package that continued to give me trouble was Quagga. I have an OpenVPN Site-to-Site with Dual-WAN like in chapter 20.13.4 of the pfSense book, but with CARP added. That required start / stop commands in /etc/rc.carpmaster and /etc/rc.carpbackup, which do not seem to work anymore. As I had noticed before that the setup works without Quagga/OSPF, I am running it without Quagga for the time being.

    Huge thanks to CMB!!

    Regards,

    Michael

