Disable start up interface reassignment
-
@steveits
I do not think the primary fault is in the WAN NIC on the pfSense computer. The more likely cause is that the ISP cable modem is flapping and temporarily takes its NIC offline while the cable modem restarts. When this happens while pfSense is booting, pfSense sees no connection on the WAN and forces an interface reassignment. Doing so takes out every other interface.
Looks like this is a fundamental limitation of pfSense.
-
@patch Hmm. Can’t say I’ve tried. The timing would have to be right if that’s the case. I’d have expected pfSense to maintain the assignment.
A $20 switch would fix that, though it would introduce another failure point.
-
@patch:
I'm not sure that what you want is possible due to how the underlying OS responds to missing interfaces.
pfSense works with the physical interface name supplied by FreeBSD. So if your firewall has 3 Intel e1000 interfaces, the FreeBSD OS will number them em0, em1, and em2. These are the interface names passed to pfSense and are what it eventually stores in config.xml and would assign to the friendly names WAN, LAN and DMZ.
But if for some reason one of those Intel NICs disappears - for example assume it was em1 - then the FreeBSD OS will, on boot, likely renumber the interfaces to be em0 and em1 since it sees only two now with the failure of the former em1.
But in reality, in this example because em1 died, what FreeBSD is now calling em1 is really em2 when all the NICs are present. So, now any firewall rules based on the old em1 interface would actually be applied to the em2 interface. That's likely not what is desired. Exactly how the renumbering pans out will be governed by the precise NIC failure. But if the NIC totally disappears to the system, then the sequential numbering FreeBSD does would be a potential issue. That's why pfSense stops when the number of NICs returned by FreeBSD at boot-up does not match what is configured in config.xml, or if any of the physical names have changed (say moved from emX to igcX, for example).
Here is a write-up from Netgate describing how the interface numbering and auto-detection works: https://docs.netgate.com/pfsense/en/latest/install/assign-interfaces.html.
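The renumbering scenario described above can be sketched as a toy model in Python. This is only an illustration of the reasoning, not how FreeBSD or pfSense is implemented: the slot names (slot1..slot3) and the helper functions name_nics and resolve are made up for the example, and probe order is simply assumed to be sorted slot order.

```python
def name_nics(driver, present_slots):
    """Name NICs driver0, driver1, ... in probe order (assumed: sorted slot order)."""
    return {f"{driver}{i}": slot for i, slot in enumerate(sorted(present_slots))}


def resolve(assignments, names_to_slots):
    """Map each friendly name to the physical slot its stored device name now points at."""
    return {label: names_to_slots.get(dev) for label, dev in assignments.items()}


# Friendly name -> device name, as it would be stored in config.xml
assignments = {"WAN": "em0", "LAN": "em1", "DMZ": "em2"}

# Hypothetical physical slots; slot2 holds the NIC that later dies
all_present = name_nics("em", ["slot1", "slot2", "slot3"])
one_missing = name_nics("em", ["slot1", "slot3"])

print(resolve(assignments, all_present))
print(resolve(assignments, one_missing))
```

In the second case the LAN assignment (em1) resolves to the port that used to be the DMZ, and em2 no longer exists at all, which is the kind of mismatch that makes pfSense stop and ask.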
-
@patch said in Disable start up interface reassignment:
there is a fault on one of the interfaces
Does that happen often?
Like everybody else, I do remember having a Realtek interface die on me.
It has happened once in my entire life.
Do you have interfaces failing all over?
For me, when you add or remove hardware on the system, you should also make the system aware of what you want with the 'new' configuration. The first comes with the second.
So, when you change hardware, you confirm that in the software (setup).
While doing hardware things, the console access isn't far away anyway.
This will also show you that newly added hardware is recognized at boot, etc.
See it like this:
We, as humans, give interfaces labels or names, like WAN, LAN, DMZ, etc.
Internally, driver names are used: em0, em1, igc0, ix1, etc.
At a lower level, numbers are used.
When you add or remove an interface, or actually any device, these can get renumbered, and more than one can exist for a type of device (interface).
Drivers are loaded at unpredictable times, as soon as the hardware detection method finds the devices.
So, internal numbering can change. And here comes the issue:
What happens when WAN is now 1 and LAN is now 0?
They get swapped, along with the firewall rules and everything.
Now you have a huge security issue (because you didn't reassign).
If you really want to stop the reassignment: stop it.
Start reading here: /etc/rc.bootup, you'll find what you need.
Just be warned.
-
@gertjan said in Disable start up interface reassignment:
That happens often ?
I had changed no hardware.
The fault was with my external internet provider / cable connection, resulting in the ISP software restarting the ISP modem on my premises a lot.
To investigate whether the loss of internet was an issue with my equipment or with the ISP & their wholesaler (NBN), I looked on the ISP phone app to see if they could see their modem on my premises (they could not). After I used their phone app to reset the port they used to access their modem on my premises, their connection to the modem improved but my internet access was still down.
When I restarted pfSense at that time (no hardware changed, and the external fault actually corrected unbeknown to me), pfSense could not restart with the known good configuration. As a result, rebooting pfSense resulted in loss of all LANs as well as the prior WAN fault. An inconvenient situation during fault finding.
@gertjan said in Disable start up interface reassignment:
Internally, driver names are used em0, em1, igc0, ix1 etc.
At a lower level, numbers are used.
Which is the crux of the problem during fault finding with a pfSense router.
The numbering of the internal names is somewhat random but at least constant if the system starts up in a consistent fashion. Which is fine until it is not (a fault somewhere or a USB drive with a start-up race condition).
@gertjan said in Disable start up interface reassignment:
At a lower level, numbers are used
For pfSense resilience in fault conditions, a very useful enhancement would be for the configuration to also record the lower level address information.
That way functioning interfaces could be assigned to the appropriate rules / configuration within pfSense. Doing so would stop a single transient fault on one interface from taking down all other interfaces, resulting in a dramatic improvement in fault tolerance and a subsequent reduction in debug time.
Btw
For now, my internet is restored.
- A switch was added between the ISP modem & pfSense, as pfSense locks up if the line fluctuates at the "wrong" time.
- My pfSense runs on a Proxmox single board computer. All NICs pfSense uses are passed through to pfSense. Restarting Proxmox resulted in a very slow pfSense start-up, but it now works again.
So the current fault is resolved; however, the fragility of pfSense under fault conditions and the resulting debug time still concern me.
-
@patch:
What model NIC is in your firewall on the WAN interface?
I have an SG-5100 with 6 ports, but I only use three of them currently. The other 3 ports have nothing plugged in and thus no link. On boot they all still show up as available ports, just with no active link.
You seem to say that your WAN NIC disappeared from the pfSense system when there was no active link. That would be unusual if I am correctly understanding your description of your problem.
If you have a total and complete hardware failure such that the NIC does not even show up during the POST hardware scan, then upsetting the interface numbering scheme is expected. But if the NIC shows up in the hardware scan, it should not get "lost". It should only show with no link if the other side of the connection is down (such as your ISP modem).
-
Like @bmeeks, I can start or reboot my pfSense with the WAN cable disconnected - or connected to an ISP router that is kept offline or not switched on.
pfSense will still boot, find all its network interfaces, and work just fine.
There will be two issues:
- All LANs have no Internet connectivity.
- When you visit https://192.168.1.1 to check the pfSense dashboard, you'll see that the WAN is down = no link. The WAN interface is still there, of course.
You will notice that access to the main dashboard is slow.
This is because some of the info shown on the dashboard comes from the Internet, like the list of packages and their 'upgradable' state.
Just be patient, the dashboard will show up.
If the WAN interface isn't there any more ... then you have a (hardware) issue with the pfSense device itself.
In this case there won't be a working GUI dashboard (web interface), as all network interfaces have to be reassigned first.
-
The decision to drop to the interfaces assignment prompt is deliberate and for security reasons.
As others have said if interfaces are added or removed resulting in an assigned interface not being present at boot the order of the other interfaces cannot be guaranteed. In that circumstance it's preferable to fail to boot rather than incorrectly connect network segments that should not be.
But, also yes, simply disconnecting a WAN cable should not trigger that. It's only triggered by an assigned interface not being present in the firewall.
Steve
-
This happens to me a lot, and it comes from using SR-IOV assigned interfaces. Basically I can't reboot, as the interfaces' MAC addresses are assigned with the interface from a pool of VFs. One MAC address changes and I have to remap all of the interfaces! Also, the setup script includes tunnel, tap, bridge and ovpns interfaces, some of which weren't mapped to a physical address.
Is there a way to find the old interface configuration? Sometimes I'm lucky and remember the interface name, and pfSense then loads the correct configuration and rule set. If not, I'm starting from scratch again, which is becoming a pain.
-
@Justaguy-0 You can find them in a saved config file, in <interfaces>.
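For anyone who wants to pull the old mappings out of a saved config file without reading the XML by eye, here is a minimal Python sketch. It assumes the <interfaces> section holds one child element per assignment (wan, lan, opt1, ...), each with an <if> element for the device name and a <descr> element for the friendly name. The SAMPLE text and the function name list_assignments are made up for illustration, so check them against your own backup.

```python
import xml.etree.ElementTree as ET

# Made-up sample in the assumed layout of a saved config file
SAMPLE = """<pfsense>
  <interfaces>
    <wan><if>em0</if><descr>WAN</descr></wan>
    <lan><if>em1</if><descr>LAN</descr></lan>
    <opt1><if>em2</if><descr>DMZ</descr></opt1>
  </interfaces>
</pfsense>"""


def list_assignments(xml_text):
    """Return {assignment tag: (device name, description)} from the <interfaces> section."""
    root = ET.fromstring(xml_text)
    section = root.find("interfaces")
    if section is None:
        return {}
    return {
        entry.tag: (entry.findtext("if", default=""), entry.findtext("descr", default=""))
        for entry in section
    }


for tag, (device, descr) in list_assignments(SAMPLE).items():
    print(f"{tag}: {device} ({descr})")
```

To use it on a real backup, read the file contents into a string and pass that to list_assignments instead of SAMPLE.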
-
Thank you,
That should be enough to recreate them. Is there any way to see past mappings of optX to MAC address?
-
Unlikely since they would all have been replaced by the current values.
I'm unclear exactly what you are seeing here though. You have to re-assign the interfaces at each boot? I'd still expect the same number of interfaces with the same names even if all the MACs change?
-
Same number of interfaces. It's not every reboot, but every now and again the assignment script starts and I have to reassign the interfaces. Some of my interfaces are bridges, so the MAC won't change. Some are pooled VFs, which are dynamically assigned. Why my hypervisor doesn't keep assigning the same VFs to pfSense I have yet to figure out.
-
The assignment script only starts if the interface check fails. That means that at least one interface that is assigned in the config file doesn't exist on the system.
So it shouldn't matter how the hypervisor presents them or what MAC it uses as long as the correct number of NICs of any type are present.
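The check described above can be modelled in a few lines of Python. This is a simplified sketch of the stated rule (an assigned interface name that is not present triggers reassignment), not pfSense's actual code. The function name missing_interfaces and the sample data are hypothetical.

```python
def missing_interfaces(assigned, present):
    """Return the assigned device names that the system does not currently expose."""
    return sorted(set(assigned.values()) - set(present))


# Assignment tag -> device name, as stored in the config file
assigned = {"wan": "em0", "lan": "em1", "opt1": "em2"}

print(missing_interfaces(assigned, ["em0", "em1", "em2"]))         # nothing missing
print(missing_interfaces(assigned, ["em0", "em1"]))                # em2 is gone
print(missing_interfaces(assigned, ["em0", "em1", "em2", "em3"]))  # extra NIC, nothing missing
```

Under this model a non-empty result is what would send the boot to the assignment prompt. MAC addresses play no part in it, and neither does an extra unassigned NIC.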
-
Thanks, good to know. I'm going to keep an eye on the number of interfaces between reboots.
-
@Justaguy-0
Btw, for me the solution was:
- When it is working as desired, document the interface configuration externally, including: physical computer box LAN label, function, Proxmox PCI device port address, Proxmox VBR, MAC, pfSense NIC label, pfSense IP & VLAN.
- Reboot the Proxmox hypervisor. In hindsight, one of the passed-through NICs got into an unusable state at the hypervisor level.
However, I still believe pfSense interface reassignment could be improved by making better use of information from the last working configuration. Perhaps the simplest improvement would be to display all data from the last assignment when reassignment is required. Better still, allow the user to reassign only the interfaces which have changed.
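As a rough illustration of that suggestion (purely hypothetical, not something pfSense is known to do), the idea could look like the Python sketch below. It assumes a record of friendly name to MAC address was saved from the last working boot, which is an assumption of this example. The function name plan_reassignment and all MAC addresses are made up.

```python
def plan_reassignment(last_known, detected):
    """Split assignments into those that can be kept and those needing attention.

    last_known: friendly name -> MAC saved from the last working boot (hypothetical record)
    detected:   device name -> MAC reported on the current boot
    """
    by_mac = {mac.lower(): dev for dev, mac in detected.items()}
    kept, needs_attention = {}, []
    for name, mac in last_known.items():
        dev = by_mac.get(mac.lower())
        if dev is None:
            needs_attention.append(name)  # this NIC is gone or its MAC changed
        else:
            kept[name] = dev              # same hardware, possibly under a new name
    return kept, needs_attention


last_known = {
    "WAN": "00:11:22:33:44:01",
    "LAN": "00:11:22:33:44:02",
    "DMZ": "00:11:22:33:44:03",
}
# The LAN NIC has disappeared, so the old em2 now shows up as em1
detected = {"em0": "00:11:22:33:44:01", "em1": "00:11:22:33:44:03"}

kept, needs_attention = plan_reassignment(last_known, detected)
print(kept)
print(needs_attention)
```

Here only LAN would need the operator's attention, while WAN and DMZ follow their hardware. Matching on MAC would not help where MACs themselves change between boots, as in the SR-IOV case mentioned earlier in the thread.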
-
I agree the pfSense interface reassignment could use some improvement. As stephenw10 suggests, it is the number of NICs that is causing the assignment script to run, but why should it reassign unchanged NICs? And why does it list non-physical interfaces such as bridges or VPN taps? What is going to happen when I free up some time and build some scripts to create interfaces with rule/routing sets to further my SD-LAN aspirations?
-
@Patch I think the concern is, a NIC is added, detected first, or in the middle, and all others shift up one…0>1, 1>2, etc. Assuming 0 is still LAN or 1 is router management could be dangerous depending on firewall rules.
-
Yes, the current behaviour is required because the NIC order is determined only by the order they are parsed in the PCIe device tree. Thus if an expansion card is removed or fails, the remaining NICs using the same driver may not be assigned as the same interfaces. In that situation it is safer to drop to the re-assign prompt than to continue to boot and end up with the wrong rules on an interface.
To do anything else requires non-trivial work.
Steve
-
I can see that handing the problem completely back to the console is safer for pfSense (if it does nothing then it can't make a mistake).
However, it could be more helpful for the user.
-
pfSense assigns interfaces based on their order; however, more data about past interface assignments is, or could be, recorded / displayed / matched.
-
With the current design approach, loss of one NIC disables all NICs on reboot, making rebooting a more expensive (in operator time) debugging technique.
-