Private WLAN

Marthin

Hi guys,

The call for testing of the 2.8.1 beta stated I must report problems here, but the issue is wider than 2.8.1 so feel free, moderators, to move this post somewhere else.

I've run into some tricky territory while installing into a QEMU VM under Proxmox. I followed the instructions carefully, but some of the installation attempts would either hang or report a failed installation afterward. On the odd occasion that the install succeeded, I really battled to get access to the GUI or a remote shell.

For clarity, I'm not a network engineer but a systems architect forced through my own choices in life to sort out my own networking, so take my feedback in that context.

The issue that, as I figured out in the end, caused all my problems, and that in my mind could also be why the installer was struggling, is that these two installations were done behind my normal pfSense firewall, which resulted in private (IANA) IP addresses on the WAN interface. As you know, pfSense installs with a setting to implicitly block private networks on WAN interfaces. I'm not disputing that it's the correct default, but I do want to suggest that the installer be adapted to take into account that in some scenarios that default option can cause a lot of difficulty.

My suggestion, should you wish to give it some thought, is that:

a) If the WAN interface is assigned a static IP in one of the IANA private address ranges, the user should be asked to confirm awareness of that fact, and that although the address is used on the firewall's WAN side, it is in fact in a private network rather than facing the public. If the installer has confirmation that the user accepts the consequences, it should then install pfSense with the block on IANA private addresses on WAN interfaces turned off.
If the WAN address is provisioned via DHCP, things might get a little trickier, because there might no longer be a UI involved when the IP is obtained and proves to be in an IANA private range. Ideally the same reasoning as suggested above should apply, resulting in the implicit blocking default being removed.
Perhaps the better approach would be to make the IANA private range detection part of the IP address settings dialog in the install phase, on the console and in the GUI, basically confirming with the user that the configured address (static, or obtained from DHCP or PPPoE) results in an address on a WAN interface that isn't legal for direct connections to the internet.
If all of the above is too much to address at once in the beta testing phase, perhaps a temporary workaround might be to publish something about deploying with private WAN addresses and offer a console shell command that would turn off that flag in the Reserved Networks section of the WAN interface, and/or offer a more permanent way to turn the firewall off during initial setup than pfctl -d, which has to be redone after each configuration step.
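
To make the detection step suggested above concrete, here is a minimal sketch in Python. It is not how the installer works; the function name wan_is_rfc1918 is made up for illustration, and it only assumes the WAN address is available as a string such as "192.168.10.2/24".

```python
import ipaddress

# The three private IPv4 ranges defined in RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def wan_is_rfc1918(address: str) -> bool:
    """Return True if the given WAN address falls in an RFC 1918 range.

    Hypothetical helper: accepts '192.168.10.2' or '192.168.10.2/24'.
    """
    ip = ipaddress.ip_interface(address).ip
    if ip.version != 4:
        return False  # RFC 1918 only covers IPv4
    return any(ip in net for net in RFC1918_NETWORKS)


if __name__ == "__main__":
    for candidate in ("192.168.10.2/24", "172.31.255.1", "8.8.8.8"):
        if wan_is_rfc1918(candidate):
            print(f"{candidate}: private WAN address, ask the user about the block option")
        else:
            print(f"{candidate}: not RFC 1918, keep the default")
```

The same check could run whether the address was typed in statically or handed out by DHCP or PPPoE; only the point at which the user gets asked would differ.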

Kind regards
Marthin

johnpoz

@Marthin not sure what issue you're trying to overcome - you want to access the WAN interface from rfc1918 for the web GUI to complete a setup?

Turning off the block rfc1918 on the wan during setup is not going to allow access to the gui - because there are no other rules to allow access on the wan no matter what your source IP is.

If you are unable to set up pfSense where you would be able to access the LAN IP, then only set up 1 interface. This would be the WAN, and you would be able to access it, because when pfSense only has 1 interface it puts the anti-lockout rule on that interface, which allows access to the GUI.

You can then set up the rules you want on the WAN to allow access before you enable a LAN interface.

Marthin

@johnpoz OK, good to know about the 1 interface setup. I didn't expect the block rfc1918 removal would allow access to the webConfigurator all by itself. I said that the block being applied meant that none of the rules I added from the shell command line had any effect.

Look, I've run the same 2.8.0 setup off the Netgate installer ISO many times now, with the result being a genuine case of "mileage may vary". The latest two times, once on each VM, on which I faithfully reproduced exactly the VM setup specified in the Netgate documentation about installing on a Proxmox VM, both ran the installation to completion, but when the installed version booted it never completed. It just sat there, as it turns out, waiting for input. I noticed, just before it scrolled off the screen with startup messages, that the question usually asked when you assign interfaces from the console menu (should VLANs be set up at this stage, or something like that) was being asked, so I entered no and it asked me to identify which NICs the WAN and LAN were on. Those are questions the installer asked and used, so why would they be asked again during bootup? To make matters worse, the answers I provided to the installer were completely ignored and defaulted to the WAN getting an IP from DHCP and the LAN on a static IP of 192.168.1.1/24 with DHCP enabled. I reassigned the IPs once the console menu was up and enabled sshd, but could not get SSH access from the LAN port until I went in through the console and ran pfctl -d; then it connected without a hitch.

Another variance I noticed is that whenever I selected during setup to use a local resolver, it could never connect to the Netgate servers. I recently had trouble going from 2.7.2 to 2.8.0 on my internet-facing pfSense, which rendered my email servers broken. Eventually I got tipped off that the new default setting, state policy bound to interface, was causing the problem. So my primary pfSense has that option back on the floating option. (It was an impossible mission to set the option in the advanced section of each firewall rule that might be impacted, since setting that option does not cause the gear icon to appear indicating that advanced options are in effect. But I painstakingly went through every rule twice and ensured the setting is on for all the rules.) Still, it required the default to be changed back to floating before it would allow the mail servers to run their own recursive resolvers, as they insist on doing.

In both the VMs I installed since, I had tremendous difficulty getting the rules I add to work. The only way past it was to keep running pfctl -d while making changes to anything, or else the configurator would just time out and even the pings I had running on another screen would stop. These instances being behind the real firewall, I was happy to disable the firewall temporarily to get by, but I didn't want to disable it permanently in the advanced options, so I kept trying to set up rules where things kept working when I ran pfctl -e. I was disappointed every time, until I went to the global state policy setting and changed that to the floating option; then everything started working as expected. What it basically means, from what I can see, is that the installer cannot even set up 2.8.0 as a recursive resolver, and that cursed state policy setting it introduced breaks a lot more things than just multi-WAN setups. So much so that either not even the installer knows what it needs to do, or 2.8.0 is just broken, period.

pfSense had been a fantastic product for a long time, especially for people who don't identify as network engineers but need to get a job done. By comparison to the MikroTik community, which is outright toxic even towards fellow network engineers, the pfSense community and documentation were extremely accommodating to underinformed people such as myself. It is an absolute shame to see that getting diluted by buggy features, undocumented behaviours and a switch to make the Netgate installer the only means to get the community edition when the installer does such a messed-up job of it in the first place.

Part of what I'm saying is that the anti-lockout rule on the LAN interface wasn't having any effect whatsoever until I changed the state policy to floating, so even with the one-interface install option the result would have been exactly the same regardless of the block rfc1918 setting. I'll try that next time and give you more feedback if you're going to be able to get something done about it. I don't know what your role and capability entail, perhaps you're just another user like me, but I can tell you this: you gave me feedback from a position of authority, saying things that in practice do not match what I've experienced and am still experiencing. It's like there's no regression testing for new releases or the installer to catch stuff that does not work as assumed in the manual, in the release notes or by the developers themselves. It genuinely pains me to see pfSense in such a steep decline quality-wise, and I feel sorry for those who pay through their noses for the Plus version, which is sure to have the same issues, made worse by the paid support people not knowing about the root causes either. I used to be absolutely convinced that once I have the means I'd upgrade my entire operation to high-throughput Netgate devices running pfSense+, but as it stands it will be better for my business to hire a network engineer and buy the Cisco or MikroTik devices they swear by. I'm sure you can see what a bad marketing outcome that is. Poor quality will kill Netgate's feeder market and drive people away to OPNsense and, when they need it, to Cisco, Juniper or MikroTik, which come with qualified engineers to get the results they used to get for themselves on pfSense.

stephenw10

Hmm, you shouldn't need any rules there once the NICs have been assigned. They do need to be assigned though.

If you're accessing it from the WAN side, one option is to assign only one NIC initially. In that setup traffic is allowed on that one interface, so you would be able to access the GUI and complete the setup.
If you do that you have to remember to add a pass rule on WAN before you enable a LAN interface to continue accessing from the WAN side.

Marthin

@stephenw10 The symptoms included that I could not access the web GUI (or the SSH port after enabling it on the console) via the LAN side, where the anti-lockout rules were in effect. I had to turn the firewall off with pfctl -d repeatedly, for any change through the GUI caused it to reset to enabled. Ultimately, the only way I could reliably run with the firewall enabled (for NAT and policy-based routing to remain an option) but allowing all traffic anywhere was to define a floating Allow All rule for both in and out.

Look, my objective was to use the pair of pfSense nodes as a stable platform that needs minimal downtime and restarts, and fails over when they do happen, using pfSense not as a firewall but more like a router. But it's been such a hassle that I've mothballed those two VMs and pivoted to Ubuntu server nodes with haproxy and keepalived installed. It's going to require more manual intervention to keep fully operational (not saying it needs a restart), but my trust in pfSense as a rock-solid stable platform has been broken and it's not going to recover by itself.

Marthin

@stephenw10 To be honest, when you tell me what should happen and it didn't and doesn't happen that way, the damage is done. Like when Hagrid says "I probably shouldn't have said that" in the Harry Potter movies. The pfSense software itself, once you can get to it, is predictable, and most of the settings and flags do what they proclaim to do, except for this new state policy setting, which has dire consequences that nobody at Netgate seems to have the foggiest idea about. That one has all the markings of getting shoved in there at the last minute, with the job of properly incorporating the whole concept into every aspect touched by it left less than half done. The stated rule is that if you've set any advanced options on a firewall rule, the rule displays a gear icon in the list. That setting doesn't, which is a dead giveaway that the job is not done. You can argue it's largely cosmetic or at best a visual aid, but it goes much deeper than that, because even when I took the pains to turn the floating states option on for every rule I had, I still could not get DNS traffic through my main firewall without changing the global setting back to the old default from before 2.8.0.

You can argue could've, should've and would've all day long, but not when I lived through direct evidence that it somehow didn't work like you say it should have.

stephenw10

I have numerous 2.8/2.8.1 VMs installed in Proxmox here and they all behave exactly as expected. So something must be different.

Since you were connecting them behind an existing pfSense install did you set a different subnet? A subnet conflict between WAN and LAN in the VMs could present like that.

Marthin

Thank you @stephenw10, and yes, obviously something must be different. That much I said from the get-go, because I recognise that the scenario (networking-wise, not because it's on a VM) is atypical. The issue is that between the installer, which is being enforced as the only way to install anything newer than 2.7.2 as a power move by Netgate, the software it installs, and its evolving and, dare I be honest, immature features with unknown side-effects, the net result is failure presenting in a confounding array of different ways.

I tried (but apparently failed) to confirm that the network environment behind the primary firewall is suitable by using the Ubuntu server setup and network detection to affirm that on both network interfaces it gets the addresses I've assigned via DHCP and once up and running that it can see the rest of the networks as intended.

That might not have been a good enough test, I'll be the first to admit, but in whatever way there were or remain subnet conflicts, whatever that might mean, it would have to be something that pfSense and its cursed installer a) was crippled by, b) detected but never reported, or c) was led by into a failure from which it was trying to recover after installation or during reboot, without stopping to say "oh, the installation failed to activate the specified interface assignments, which now need to be redone, or if you believe this to be a mistake you can restart the installation" before prompting the user in the midst of a flood of boot messages scrolling the question off the screen. Say what you will, that wasn't a designed behaviour but an ill-conceived attempt to remedy an unforeseen condition, also known as either a bug or a design flaw.

Please understand that I'm describing not one but a whole range of different results I encountered while trying this. It's encouraging when you say that

@stephenw10 said in Private WLAN:

A subnet conflict between WAN and LAN in the VMs could present like that

like "that" means each of the various ways it failed to install without locking me out on both interfaces. I suspect, though, that you've taken only one of the scenarios I've described and focussed on that. If that's the case, you should also understand that the only way I got to the state where pfSense ran at all was once I found a combination of interface assignment options that at least allowed the installation to get to the reboot stage, and intervened during the bootup so I could see that the reason it appeared to be hanging was a user prompt that got scrolled off the screen by other bootup messages.

Nevertheless, please tell me more about these subnet conflicts you speak of and what "present like that" means in your book. Even the title of the post confirms that I know I'm installing into an environment with a private WAN address, and I explained that it's behind another firewall. I (hope I) also mentioned that the intention for the LAN interface was to sit on my /23 LAN as a DHCP client and as such not provide DHCP on that interface at all. Both network interfaces are VLANs through a managed switch through separate interfaces on the host. One (the LAN) is tagged by the switch port, and the WAN interface requires a VLAN tag that is injected in the VM's network interface device on Proxmox. Once again, the network configuration had been a worry, but I ironed out all the kinks using the Ubuntu verification installation.

I'm making no secret of it, I think the netgate installer is utter junk getting forced into the process for lame reasons. That's not the only problem, but instead of making the installation process more robust, the logic in the installer knows even less about the side-effects of the software it's installing under boundary or unusual conditions.

I'm sure none of this had been your decision and might well have been forced on you against your own recommendations, and for that I truly empathise with your situation. The blame game is stupid because all that matters is the way forward. You can keep dismissing me, or most of what I'm reporting, as my own fault or the result of trying to use pfSense for something it's not geared to do, or you can choose to make an effort to learn a great deal from the various situations I've encountered. I'm more than happy to help you understand and recreate the conditions that lead to all these things going wrong, but not in the current pattern of dismissal and confrontation. If it's not going to lead to the installer and core software getting better and more robust, it's not worth either of our time to dissect what went wrong.

stephenw10

Yes sorry I meant specifically the fact you had to repeatedly disable pf from the CLI even though you were accessing it from the LAN side which should be allowed. That potentially could be a subnet conflict. Though usually it's because the client is actually on the WAN side.
If your LAN interface is set as DHCP it will pull a gateway and pfSense then sees that as WAN and may use it as the default route. That's not unique to the installer though.
Is your LAN using a /23 that's different to whatever private subnet the WAN is connected to?

Obviously we would like to make the install process as painless as possible. The next version of the installer includes the ability to pass the configured interfaces to the resulting install. That would have avoided one of the issues you saw.

Marthin

@stephenw10

@stephenw10 said in Private WLAN:

If your LAN interface is set as DHCP it will pull a gateway and pfSense then sees that as WAN and may use it as the default route. That's not unique to the installer though.

Both the interfaces I specified have DHCP on them, so yes, the LAN-side interface would have resolved with a gateway specified. While the conclusion that what I specified specifically as the LAN interface has a gateway defined is consistent with how assigning interfaces on the console works (in reverse though, asking you not to specify a gateway if it's a LAN), the decision to treat the interface specified as LAN like a WAN because of DHCP might be one that wasn't thought through properly. The installer and pfSense (and most recently me) are all aware of single-interface mode (single-arm, I think it's called), but a setup resulting in two WAN interfaces should be detected and dealt with the same way, i.e. by setting up anti-lockout rules on both because there isn't a LAN interface to set it on.

@stephenw10 said in Private WLAN:

The next version of the installer includes the ability to pass the configured interfaces to the resulting install

One could argue that it should have been the first and most fundamental feature of the installer, not one left for later implementation. What good is an installer that asks you for the pertinent information and then lacks the ability to pass that on to the thing it is installing? It would have been better, then, if the installer didn't ask for the interface assignments but explicitly asked for the details it will use purely to check for a license and to do the network install of the indicated version, leaving the WAN and LAN assignments to be done through the console after booting into what amounts to a still completely unconfigured mode. It would still not prevent people tripping over poorly implemented features such as the state policy mess, but at least it would get the installation done reliably and focus people on getting the interfaces correctly set up from the console. I keep having to circle back to the installer, especially as the sole means to install, being a disaster that didn't wait to happen but already did. It's a marketing initiative, that much is understandable, and marketing execs are notorious for the disasters that pushing their agendas causes. But the technical personnel really should either have pushed back harder, saying that they don't have a robust enough installer to support that move yet, or made absolutely sure that the installer they do put out into the wild can get the job done reliably if it is doable, or detect conditions that will result in a dysfunctional installation and refuse to start the installation with an explanation of what needs to be fixed first.
In this case, it should have seen that both WAN and LAN interfaces are set to DHCP, checked the registration details, noticed that the LAN interface has a gateway on DHCP, and dealt with that either by saying no, that would give the firewall two WAN interfaces with no access to it from anywhere, or by forcing the LAN interface to ignore the gateway setting, which would force pfSense's own access to the internet through its WAN interface regardless of what else is happening on the LAN interface.
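
As a rough illustration of the kind of fail-early check described above, here is a sketch in Python. Everything in it (the InterfacePlan class, the preflight_warnings function, the field names) is hypothetical and invented for this post; it does not reflect how the Netgate installer is actually implemented.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class InterfacePlan:
    """Hypothetical record of what the user asked the installer to set up."""
    role: str                # "WAN" or "LAN", as chosen by the user
    mode: str                # "dhcp" or "static"
    gateway: Optional[str]   # gateway from the DHCP lease or static config, if any


def preflight_warnings(plans: List[InterfacePlan]) -> List[str]:
    """Return problems that should be raised before installing, not at first boot."""
    warnings = []
    for plan in plans:
        if plan.role == "LAN" and plan.gateway is not None:
            warnings.append(
                f"LAN ({plan.mode}) has gateway {plan.gateway}: it would be "
                "treated like a WAN; ignore the gateway or confirm with the user"
            )
    with_gateway = [p for p in plans if p.gateway is not None]
    if len(with_gateway) > 1:
        warnings.append(
            "more than one interface has a gateway: no interface would get "
            "the LAN anti-lockout rule"
        )
    return warnings


if __name__ == "__main__":
    my_case = [
        InterfacePlan("WAN", "dhcp", "10.20.0.1"),
        InterfacePlan("LAN", "dhcp", "10.30.0.1"),
    ]
    for warning in preflight_warnings(my_case):
        print("WARNING:", warning)
```

The addresses in the example are placeholders, not my real subnets. The point is only that both conditions are knowable before the first reboot.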

The other question the installer explicitly asks (when not getting an address from DHCP) is whether the LAN interface should serve DHCP requests. Why ask for something you're just going to throw away? Yet even worse than that is letting the initial bootup of the firewall ask for user input to try to recover from missing or invalid inputs, and that with all the boot messages carrying on scrolling the question off the screen. That's beyond disastrous. When will people learn to fail early instead of trying to recover from bad user input?

@stephenw10 said in Private WLAN:

Obviously we would like to make the install process as painless as possible

That reads like a typical mission statement, born during the annual company strategy workshop and having no effect other than keeping the sun off a few patches of wall behind the giant posters throughout the office. In my business consultant days we taught companies that the true mission statement of any company needs no poster: you can read a company's effective mission statement straight off the behaviour of its people and the impact they're having. The installer's mission is not to make the install process as painless as possible, but to ensure that every CE install is turned into a pfSense+ and/or Netgate device sales opportunity. Of course Netgate can do as they please, but pfSense remains open source, with all the benefits of having huge numbers of people testing it from every conceivable angle. Interfering with the open source process with a (poorly conceived and executed) proprietary installer is one of the worst strategies for trying to force a return on investing in open source projects I've come across, and I've seen quite a few pathetic attempts.

@stephenw10 said in Private WLAN:

Yes sorry I meant specifically the fact you had to repeatedly disable pf from the CLI even though you were accessing it from the LAN side which should be allowed

Now to that point. If we ignore all I had to do outside what can rightfully be expected from anyone trying to install the software to even get to that point, allow me to point out that I had to "correct" a number of things before I could allow the firewall back on.

First of all, I was most definitely accessing the newly installed software through the assigned LAN interface, which by then was static with no gateway defined. The anti-lockout rules had no effect. No rules had any effect. Even though it wasn't meant to be the case, the blocking of rfc1918 addresses on the WAN interface somehow impacted the LAN side as well, blocking everything, perhaps because of the bad assumptions made by the installer or startup seeing the DHCP LAN as a WAN, adding the rfc1918 blocking but not removing it when the interface was given a default static (192.168.1.1/24) address, which I later replaced with another static address without a gateway and no DHCP.

With the firewall temporarily disabled (pfctl -d) I tried configuring rules that were the equivalent of the firewall being disabled but still leaving rule processing on, for NAT and rule-based routing to remain options. In the end, the only way I could leave the firewall enabled was by adding a floating bi-directional pass rule for any interface, any protocol, any source and any destination on any port. That's extreme. Sure, maybe none of this would have been in play if the firewall had a public IP on the WAN side, as would normally be the expectation, and the firewall undoing the pfctl -d command so religiously might have been a necessary precaution in case the firewall was indeed all that stood between my network and the toxic environment of the Internet. But it did happen, and though my use case wasn't the stock standard, it is not unusual by any stretch of the imagination, because chaining multiple firewalls for any one of a variety of reasons is common and would always involve private addresses as WAN addresses from the second tier onwards, simply because routable IPv4 addresses are a scarce and limited resource. If the installer and/or the options on the console menu are able to adjust their behaviour based on whether or not a gateway is specified on an interface, they surely have the opportunity to sense that an rfc1918 address has been specified or obtained by DHCP, and respond to that by asking the user about setting up the rfc1918 blocking option. Having a rule that blocks traffic from rfc1918 addresses on the (specified, not detected) WAN side is good, but if the interface itself has such an address, the blocking rule as implemented breaks the firewall. A better solution might be to keep the rfc1918 block in place if that's the safer option, but then precede it with a higher-order rule to pass traffic to the firewall itself.

@stephenw10 said in Private WLAN:

That potentially could be a subnet conflict

So I'll ask once more, with tears in my navy blue eyes: what is this thing you call a subnet conflict, what tests would detect it and point it out to me, why was Ubuntu not affected by it, or if it was, what would the symptoms that I missed have been? Subsequent tests of Ubuntu servers configured the same way as the two pfSense VMs confirm that the primary firewall was still blocking access to the http/s ports on the private WAN, so that rules out the possibility that my attempts to get to the web UI without repeated pfctl -d's inadvertently reached the box via the WAN interface and got blocked there, while with the firewall disabled they got in just fine. To confirm, I believe I've seen and presented evidence that the client was not actually on the WAN side, which by your reasoning so far means it must have been a subnet conflict. That's why I need to augment my understanding of what that entails. I prefer using pfSense over the likes of Cisco, Juniper and MikroTik not only because the CE version affords me the time to set up and grow my business until I need more than what CE offers, but also because the software and community around it are 10 times more tolerant of people without network engineering credentials. The communities around those other products are nasty and toxic even towards network engineers they believe aren't professional enough. My world is software, not networking. Though I've learned some about it over the years, I'm no expert, and I'd much rather be mistaken for a complete noob than have information actively withheld from me by network engineering gate-keepers. So in that light, please, what on this green earth is a subnet conflict in your context?

stephenw10

OK so it looks like you had two issues:

The installer didn't work as you expected it to but you were able to get 2.8 installed and booted.

The resulting install didn't behave as you expected. That's independent of the installer and 2.7.2 would have behaved identically in that situation.

So after install you assigned two interfaces, pfSense names them WAN and LAN but any interface can be anything. And you configured them both to be DHCP since both subnets already have a DHCP server?

The typical subnet conflict that users hit when installing behind another firewall is that pfSense uses 192.168.1.1/24 as the default LAN address and that same subnet is also in use on the WAN side, behind the upstream firewall.
I assume you didn't hit that since both subnets already existed in your network so must be using different subnets? What are they?
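
For illustration only, that kind of overlap can be checked with a few lines of Python. This is a sketch, not something pfSense runs; the function name subnets_conflict and the example addresses are made up here.

```python
import ipaddress


def subnets_conflict(wan: str, lan: str) -> bool:
    """Return True if the WAN and LAN interface addresses sit in overlapping subnets.

    Each argument is an address with prefix length, e.g. '192.168.1.50/24'.
    """
    wan_net = ipaddress.ip_interface(wan).network
    lan_net = ipaddress.ip_interface(lan).network
    return wan_net.overlaps(lan_net)


if __name__ == "__main__":
    # WAN lease from an upstream firewall that also uses 192.168.1.0/24,
    # against the default pfSense LAN address: a conflict.
    print(subnets_conflict("192.168.1.50/24", "192.168.1.1/24"))
    # A /23 WAN that happens to contain the default LAN /24: also a conflict.
    print(subnets_conflict("192.168.0.10/23", "192.168.1.1/24"))
    # Clearly separate subnets: no conflict.
    print(subnets_conflict("10.0.0.2/24", "192.168.1.1/24"))
```

When the two networks overlap, the firewall has two interfaces claiming the same addresses, so replies to a client can leave via the wrong interface, which can look like rules having no effect.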

However you then say you set the LAN back to a static address? Presumably in the same subnet?

By default pfSense creates firewall rules on the LAN interface to allow access to the webgui there. That applies whether the LAN is static or DHCP.

How exactly were you trying to connect? From where?