Handling of hardware changes should be MUCH better!!
-
Yesterday I changed one of the NICs in my router. What should have been a simple action was in fact ... a drama! I was quite lucky to fix it, after a couple of hours and all kinds of tricks. Let me describe the root cause and the problems I met.
The root cause is the fact that pfSense cannot handle a situation where something in the hardware has changed. A problem which occurs:
- if one of the NICs fails (I had that in the past ... drama)
- when replacing a NIC
- not to mention when you move to other hardware.
In yesterday's situation I was replacing one of the five NICs in my system. In fact a less important NIC with only one VLAN assigned. So I removed that NIC and booted the system. A situation identical to a NIC failure like I had in the past.
The one IMHO correct behavior would be for the system to start normally, leaving the defective or removed NIC in status "disabled". However, that is far from what happens!!
What happens is that the system:
- cannot match the interface assignments any longer, not even for the unchanged hardware!!!
- the firewall does not start at all
- and since you need the pfsense vlans to access pfsense itself, you can not manage pfsense any longer
- the only option left is hacking the local console
Very very bad and unnecessary IMHO !!
So a bit more about my repair experiences
What I did not do before removing the interface was remove all definitions related to it. That would have made things easier, however:
- you cannot do that in case of a failure, and
- my intention was to assign the settings to the new NIC
So after booting nothing worked, and placing the old NIC back did not solve that problem. So the first thing I did was create a USB stick with a config directory containing the last config as "config.xml", and try to boot with that USB stick attached.
- Nope
From the console I tried to boot from an old boot snapshot. Before I describe that, I should state that I do not have a pfSense Plus image, since that is, as far as I know, not publicly available.
So I tried to restore an old boot snapshot and old settings from the console. To do that I had to go to the system with screen, mouse, keyboard etc., very uncomfortable (I did not use KVM since the pfSense case was still open due to the NIC change process).
From the console you can load old configs to a limited extent, however:
- you can not see them since they scroll outside the console window
- and that does not help since the boot environment was damaged
So I tried to change the boot environment ...... That worked .... a bit
The system started in the community version, complaining that the config was written with a newer version .... however the good news is that I could access the firewall from my PC via the GUI
Via the GUI I then removed all entries related to the card to be replaced, stopped pfSense, removed the NIC and rebooted ... now a system in community condition, still without the new NIC.
The next step was to add the new card to the config. Since the new card was to replace a lagg running on another card, I used a trick: I added the ports of the new card to the existing lagg and rebooted. Yep, OK, the lagg has four ports now, two old and two new. I moved the cables to the new ports and booted. Not working. But after adding and removing the old ports one by one it worked. I saved the config and placed that config on the USB stick.
Now I had a running system, but still running the community edition. So I attached the config USB stick to the computer, changed the boot environment to an older one and booted. Let me note here that the boot environment settings are in the system menu and not in the boot menu, something I was not aware of at first.
So booted ... next problem ... the system is running the Plus edition now ... however ... an older Plus edition ... which did not see the update server ... so it was not possible to update to the current version.
All's well that ends well: after some changes in the system boot selection menu I managed to get to the current Plus edition and reach the update server.
I hope my experience helps others, but it also makes clear that the way pfSense treats hardware failures or changes is absolutely not OK!
-
@louis2 said in Handling of hardware changes should be MUCH better!!:
The root cause is the fact that ...
That you probably missed an important detail
And this one is important, as pfSense isn't making coffee, it's a router / firewall.
What happens if it finds an 'unknown' (not defined, declared or initialized by the admin) interface upon boot?
Should it block all traffic? Pass all traffic? Flip a coin and decide?
Nope.
It will wait for your - the admin's - instructions. On the one and only interface that always works - so not the Ethernet interface, but the console. As only the admin, with physical access, can reach the console.
He has to assign the newly found interfaces, and the show can go on from there.
When the OS (FreeBSD) boots, it will enumerate all devices found, and among them are the NICs.
The order in which it finds them can change - example: you have several - let's say 3 - "igc" type Intel NICs. You add a new one.
The new one, will it be igc3? Or igc0, and the interface that was using "0" before will become "1", the "1" before become "2" and the "2" before become "3"?
Firewall rules present on the interface that was "igc0" before will be assigned to the new (other) NIC that uses "igc0" - which is another interface! You get the issue?
If you are using .... what are you using btw? 2.7.2? (edit: ok, yes, 2.7.2) Then lucky you, as "interfaces" also determine your registration license if you use pfSense Plus.
A bit like the other OS out there, "Microsoft Windows": change the graphics card and a network interface and: license invalidated ..., so you need to contact TAC, and they can 'fix' it for you.
So, yeah, the hardware detection system isn't perfect. It might even be somewhat confusing.
Millions (probably more) have tested the current procedure during the last decade or more, and so far the ones who understood why things are done as they are done did not find a better way of doing things ^^
I'm pretty sure (not 100%) that other big players like Cisco etc. have the same procedure.
These devices don't have VGA or HDMI access, no GUI, just, initially, console access.
Btw: a router / firewall: when you get the box, you'll think about it for a while and decide that you need 5 interfaces (already a huge setup) - so you get a device with 3 NICs as spares.
NICs are pretty inexpensive these days.
Routers are not devices that you change 'every day'. Once set up, they are good for years.
-
I do understand your concern, but I do not agree.
The easy answer concerns your question of what to do with a new NIC. The answer is of course quite simple: block.
I did not know that FreeBSD is so easily changing NIC names (ex0 to ex1 etc.), but I think that for that behavior there are probably workarounds, like:
- interface ports have a mac
- cards have a serial number
- probably more options
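A minimal sketch of the MAC idea, in plain Python rather than anything pfSense actually does. The role names, the MAC addresses and the `resolve_by_mac` helper are all made up for illustration; the only assumption is that the config would store one MAC per assigned interface:

```python
from typing import Dict, Optional


def resolve_by_mac(saved: Dict[str, str],
                   detected: Dict[str, str]) -> Dict[str, Optional[str]]:
    """Map each role (e.g. 'LAN') to the device name that now carries its MAC.

    saved:    role -> MAC address recorded when the config was written
    detected: device name as enumerated at this boot -> MAC address
    A role whose MAC is no longer present maps to None (treat as disabled).
    """
    name_by_mac = {mac.lower(): name for name, mac in detected.items()}
    return {role: name_by_mac.get(mac.lower()) for role, mac in saved.items()}


# Hypothetical situation: a new card was enumerated first and took the name
# igc0, and the card that carried OPT1 has been removed.
saved = {
    "WAN": "00:11:22:33:44:01",
    "LAN": "00:11:22:33:44:02",
    "OPT1": "00:11:22:33:44:03",
}
detected = {
    "igc0": "00:11:22:33:44:99",  # the new, still unassigned card
    "igc1": "00:11:22:33:44:01",
    "igc2": "00:11:22:33:44:02",
}

resolved = resolve_by_mac(saved, detected)
print(resolved)
```

In this made-up case WAN and LAN follow their ports to the new names, OPT1 comes back as `None` and could simply be left disabled, and the new card stays unassigned (so: blocked) until the admin decides.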
So, I stick to my idea that the actual situation is very bad!
-
@louis2 said in Handling of haredware changes should be MUCH better!!:
I did not know that FreeBSD is so easily ....
It's worse.
Some decisions were made way back in the past; read something about it in Tanenbaum's bestseller "Modern Operating Systems".
So it was even before Unix ... Minix probably?
Back in the day, console input was channel 0, output was 1, and then they made the choice to use 2 for error output. That kind of choice.
NICs and other input/output devices were numbered back then. Later they got 'names'. pfSense uses labels for them; most of us use LAN, WAN etc. These are just pretty names.
Have a look at what the firewall actually uses as 'rules'.
Not the one that you see in the GUI.
Take the lowest level possible :
Look at this file: /tmp/rules.debug
You'll see that all kinds of rules use NIC driver names like "igc0".
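To get a feeling for how much depends on those names, here is a small Python sketch that counts how often each driver name shows up in a block of rules text. The sample lines are invented pf-style lines, not copied from a real /tmp/rules.debug, and `count_interface_names` is a hypothetical helper:

```python
import re
from typing import Dict, Iterable


def count_interface_names(rules_text: str, names: Iterable[str]) -> Dict[str, int]:
    """Count occurrences of each device name in the rules text.

    "igc1" also counts "igc1.20" (a VLAN on that NIC) but not "igc10",
    because a following digit is rejected by the negative lookahead.
    """
    counts = {}
    for name in names:
        pattern = re.compile(r"\b" + re.escape(name) + r"(?!\d)")
        counts[name] = len(pattern.findall(rules_text))
    return counts


# Invented sample, only meant to look like pf rules.
sample_rules = """\
WAN = "{ igc0 }"
LAN = "{ igc1 }"
pass in quick on igc1 inet proto tcp from any to any port 443
block in log on igc0 all
pass in on igc1.20 inet from any to any
pass out on igc10 all
"""

counts = count_interface_names(sample_rules, ["igc0", "igc1"])
print(counts)
```

On a real box the text would come from reading the file itself; every hit is a place that silently points at a different NIC if the names shift.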
@louis2 said in Handling of hardware changes should be MUCH better!!:
but I do not agree.
I get that.
I've been using pfSense since around late 2006 - I was using m0n0wall before that - and the "assign interfaces" step has never really changed since.
I've done this "interface assign thing" a couple of times, I guess.
I'm not setting up routers for a living btw.
If things are as they are, then there is a valid reason for it.
-
It is even worse!!! Assuming that FreeBSD can assign new names to an interface, then:
- pfSense is wrong in principle!! by using the interface name as the key to the rule sets!!!
- and the current way of solving the situation via the administrator is a security leak as well. The reason: if the operator has to reassign the VLANs to an interface, 90+ percent of the operators will assign the VLAN to the interface having the old name. I estimate that the number of administrators who have a list of interface MACs at hand and verify the MAC before pushing the button to assign the VLAN to that interface is very, very low.
So the current method and key may have / have a security risk at least as high!!!!
Not to mention the effort and the network downtime for the network users !!!
-
@louis2 said in Handling of hardware changes should be MUCH better!!:
and the current way of solving the situation via the administrator is a security leak as well.
The reason it does that is for security. It's better to drop to the interface assign screen and not pass any traffic than to try and guess what NIC the admin intended to use and put traffic on it.
-
Stephan, I do agree with you and GertJan on the fact that the firewall must be secure.
Where we really disagree is that I think the current method of reaching that security is really not contributing to security. It seems secure but it is not!!
Hardly any operator will check whether ix1 is still the same ix1 as before. The admin will simply assume that and say: assign vlan-x to the current ix1!!
That does not contribute to security at all!!!
What really would contribute to security is having the MAC and/or NIC serial number in the config and checking that (and not the name of the interface!)
So the current procedure is just complex, suggesting security which is not added!! In reality it just adds inconvenience for the administrator and an outage for the users!!
-
@louis2 agree.
The pfSense interface assignment method, while functional, is not robust.
Multiple characteristics of each interface should be maintained. The redundancy could then be used to assist assignment after a change. Ideally with a promiscuous assignment setting.
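Roughly what I have in mind, as a Python sketch and not as a claim about how pfSense works or could easily work. The `Nic` fields (in particular `pci_slot`), the weights and the `propose` helper are all assumptions made up for the example:

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Nic:
    name: str      # device name at enumeration time, e.g. "igc0"
    mac: str
    driver: str
    pci_slot: str  # hypothetical slot identifier


FULL_MATCH = 8  # MAC (4) + slot (2) + driver (1) + name (1)
MIN_SCORE = 2   # below this, no suggestion is made


def score(saved: Nic, found: Nic) -> int:
    """Add up arbitrary weights for every characteristic that still matches."""
    total = 0
    total += 4 if saved.mac.lower() == found.mac.lower() else 0
    total += 2 if saved.pci_slot == found.pci_slot else 0
    total += 1 if saved.driver == found.driver else 0
    total += 1 if saved.name == found.name else 0
    return total


def propose(saved: Dict[str, Nic],
            found: List[Nic]) -> Tuple[List[str], Dict[str, Optional[str]]]:
    """Return (unchanged roles, suggestions for the remaining roles)."""
    unchanged: List[str] = []
    claimed = set()
    # First pass: roles whose NIC matches on every characteristic keep it.
    for role, nic in saved.items():
        for cand in found:
            if cand.name not in claimed and score(nic, cand) == FULL_MATCH:
                unchanged.append(role)
                claimed.add(cand.name)
                break
    # Second pass: suggest the best remaining candidate, for the admin to confirm.
    suggestions: Dict[str, Optional[str]] = {}
    for role, nic in saved.items():
        if role in unchanged:
            continue
        free = [c for c in found if c.name not in claimed]
        best = max(free, key=lambda c: score(nic, c), default=None)
        if best is not None and score(nic, best) >= MIN_SCORE:
            suggestions[role] = best.name
            claimed.add(best.name)
        else:
            suggestions[role] = None
    return unchanged, suggestions


# Hypothetical case: the OPT1 card was swapped for a new one in the same slot.
saved = {
    "WAN": Nic("igc0", "00:11:22:33:44:01", "igc", "1:0:0"),
    "LAN": Nic("igc1", "00:11:22:33:44:02", "igc", "2:0:0"),
    "OPT1": Nic("ix0", "00:11:22:33:44:03", "ix", "3:0:0"),
}
found = [
    Nic("igc0", "00:11:22:33:44:01", "igc", "1:0:0"),
    Nic("igc1", "00:11:22:33:44:02", "igc", "2:0:0"),
    Nic("ix0", "00:11:22:33:44:99", "ix", "3:0:0"),
]

unchanged, suggestions = propose(saved, found)
print(unchanged, suggestions)
```

Here WAN and LAN are left alone and only OPT1 is put in front of the admin, with ix0 as a suggestion. Whether a suggestion may ever be applied without confirmation is exactly the security question discussed above.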
PS
Even just a promiscuous interface assignment option in the console would be useful
PPS
or a console option to assign changed / non-matching interfaces only
-
I mean it's definitely more secure to stop passing traffic until the admin corrects a config mismatch than to just keep passing traffic on what might be correct. For example this is the same situation you might hit after importing a config into entirely new hardware where no NIC/Interface can be guaranteed.
I agree though it would be better to tie the interfaces to a specific MAC to prevent re-ordering. But doing so is non-trivial.
There have been a few feature requests open for this over the years. Like: https://redmine.pfsense.org/issues/410