Complete Fail replacing NIC
-
I have pfSense running on Proxmox as a VM. I recently upgraded the network to 10G, and while I have a 4-port 10G NIC being used via Proxmox bridges for the firewall, the speed, especially via haproxy, is extremely slow. I have been trying to pass the 4-port NIC through entirely to the VM; however, when I do that, pfSense forces me to completely re-enter all of the interface information. Even after that is done nothing works - OpenVPN links, WireGuard VPN, VLANs, sync to the backup pfSense - almost nothing. Even going through with screenshots of every single screen and ensuring each is identical to the working configuration does not work. Additionally, this box is upgraded to pfSense Plus 23.11 and I get multiple errors, from pfSense.conf not being found (broken symbolic link) to the repos being broken, and my pfSense Plus license is not registered. Attempting to register it does nothing except hang the GUI. Running
pfSense-upgrade -d -c
gets multiple errors. It takes an hour to boot and when it does the GUI is completely unresponsive. I've created a completely new VM and installed CE 2.7.2; however, when I load the config with the interfaces changed the GUI hangs forever and I cannot do anything. I do not have copies of the full errors as I had to revert to a working configuration as soon as possible.
I've spent all night for a week, and at this point I'm pretty sure I'm going to switch the entire enterprise away from pfSense if I cannot get this resolved. If it is a licensing issue then I definitely will. I am not going to recreate the entire configuration from scratch on the same platform and risk having this happen again. If this is a known issue I would appreciate any input anyone can provide.
-
pfSense Plus uses a license which is (afaik) 'hardware' bound.
I can't tell how the validity of a license is determined, but I'm pretty sure the presence of the NICs and their unique MAC addresses are used.
So spinning up 'another' VM or changing 'something' about the type or number of NICs will fail the license test.
Btw : no valid license : no access to the pfSense servers for packages or updates/upgrades.
@rpm5099 said in Complete Fail replacing NIC:
pfsense plus 23.11
Upgraded? That's an old version.
@rpm5099 said in Complete Fail replacing NIC:
I've created a completely new VM and installed CE 2.7.2
That should work out of the box. Even on a more complex scenario like a VM.
Still, double check 10 Gb support, especially when used in a VM environment.
Keep in mind that a backed up pfSense config is meant to be restored on the same machine. It could be restored into another device, but be ready to deal with the Interface changes.
If the versions are different, like importing a 2.7.2 config into 24.03, be ready to handle even more differences.
@rpm5099 said in Complete Fail replacing NIC:
Running pfSense-upgrade -d -c gets multiple errors
I agree, strange messages. They probably mean : license first.
@rpm5099 said in Complete Fail replacing NIC:
however when I load the config with the interfaces changed the GUI hangs forever and I cannot do anything
That's why the console exists. When you encounter problems like 'no more working NIC', don't be surprised that the GUI is out of order. Go right away to the place with the answers : /var/log/ and have a look at the most recently changed files and what they tell you.
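To make the "most recently changed files" step concrete, here is a minimal sketch in Python. It assumes Python 3 is available wherever the logs are (on the firewall itself, or on another machine you have copied /var/log to); the function name `recent_files` is just an illustrative helper, not part of pfSense.

```python
import os


def recent_files(directory, limit=10):
    """Return (path, mtime) pairs for regular files directly inside
    `directory`, newest first, limited to `limit` entries."""
    entries = []
    with os.scandir(directory) as it:
        for entry in it:
            # Skip subdirectories and symlinks; only plain log files matter here.
            if entry.is_file(follow_symlinks=False):
                mtime = entry.stat(follow_symlinks=False).st_mtime
                entries.append((entry.path, mtime))
    # Sort by modification time, most recent first.
    entries.sort(key=lambda item: item[1], reverse=True)
    return entries[:limit]
```

Calling `recent_files("/var/log")` gives the files to read first. From the console shell, `ls -lt /var/log | head` gets you the same ordering without any scripting.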
Btw : Impressive : 10 Gb NICs on a VM : that's some bleeding edge configuration you have there.
-
So it was working as expected but slow before you tried to move to passing the NICs through to pfSense in Proxmox.
What NIC type was the VM seeing from Proxmox when using bridges?
What NIC is the new 4port 10G device? How does pfSense see it?
If nothing is working the first thing I would check is that the new NICs are assigned in the correct order. They may not be parsed in the same order as the virtual NICs.
-
stephenw10 moved this topic from Problems Installing or Upgrading pfSense Software
-
Thank you all for your responses, here is some more information - I was a bit tired last night at 3am. BLUF - the nic passthrough works great and is much faster than going through the Proxmox bridges once you get the physical ports mapped over and manually re-assign all the VLAN's and interfaces, etc. However, it inexplicably causes all kinds of other bizarre issues with pfSense seemingly unrelated to interfaces at all. I am all but certain if I rolled a new box with all the hardware setup exactly this way and manually reconfigured every setting via the GUI that it would work perfectly.
The version is pfSense Plus 24.11, fully updated (23.11 was a typo).
(restoring onto 2.7.2) Keep in mind that a backed up pfSense config is meant to be restored on the same machine. It could be restored into another device, but be ready to deal with the Interface changes.
Yes, I went through and made sure that the interfaces are in the exact same order and have the exact same name as on the old system using screenshots. I also have made sure that all of the VLAN's are changed to the correct parent interface (LAN) which changed. I've done this 6 or 7 times, so I know it is being done correctly.
pfSense-upgrade -d -c gets multiple errors
I will restore the broken VM and post the full details of all errors and commands hopefully tonight.
So it was working as expected but slow before you tried to move to passing the NICs through to pfSense in Proxmox.
Let me explain better - I recently replaced a 4 port 2.5G nic with a 4 port 10G nic as part of an upgrade to the rest of the LAN to 10G. It currently works fine with the 4 port 10G nic when used via Proxmox bridge interfaces, but the bridges simply are not able to get more than ~2G through them.
The 4 port nic currently has 3 of the ports mapped to bridges, and those bridges are assigned to the pfSense VM. All of the other VM's and the Proxmox host itself are on two separate physical NIC cards. I didn't grab the output of
pciconf
while booted with the NIC passed through, but I can do that tonight. It is a 10Gtek card and the part number is XL710-10G-4S(4xSFP+) - it's a cheap card, but I've tried four other brands that were no good and I'm using these in 3 other servers. The firmware is fully upgraded. I added Nic3 to move the VMLan off of that physical NIC so I can pass the entire NIC through to pfSense. In the broken configuration with Nic2 passed through to the VM I was able to see at least 5G without making any additional tweaks to the OS or NIC such as MTU, offload settings, etc.
Upon starting up pfSense with the 10G NIC passed through, it of course asks me to assign interfaces. LAN is now ixl1 and WAN is now ixl3. At this point I have internet and can connect to the pfSense GUI via the LAN - so the passthrough is working and the Ethernet ports are mapped correctly. However, booting takes forever and the GUI is painfully slow. There seem to be lots of errors related to the pfSense Plus registration at boot; I will capture dmesg and add it. Also, in the GUI, if I go to
System -> register
rather than saying "Your device does not require registration" it appears ready to accept an activation token - however, even if I put in the token the GUI just freezes for 20 minutes and nothing changes.
Could this be a license issue for pfSense Plus? If so, how am I supposed to swap hardware in my pfSense devices in the middle of the night - do I have to do a bunch of license coordination stuff with Netgate the day before to swap out a NIC? This can't possibly be the case...can it?
Server Nics
x used; - not used
Nic1 2.5G | x       | -> Proxmox host
Nic2 10G  | - x x x | -> Pfsense VM
Nic3 10G  | x       | -> VMLan (All other VM's)
Label is a literal label sticker on the server case for the 4 port Nic
#  Label   Name      BEFORE  AFTER  MAC
0  -----   -----     ----    ixl0   98:b7:85:XX:XX:XX CONFIRMED
1  pflan   LAN       vtnet0  ixl1   98:b7:85:XX:XX:XX CONFIRMED
2  pfsync  pfsync    vtnet2  ixl2   98:b7:85:XX:XX:XX CONFIRMED
3  WAN     WAN10500  vtnet1  ixl3   98:b7:85:XX:XX:XX CONFIRMED
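The BEFORE/AFTER mapping in the table above could also be applied to a copy of config.xml before restoring it, instead of reassigning everything by hand. The following is only a rough sketch in Python under stated assumptions: that physical interface names appear in `<if>` elements (interface assignments and VLAN parents) and in `<vlanif>` elements (names like `vtnet0.20`), and that nothing else in the config needs changing. The helper names `remap_ifname` and `remap_config` are hypothetical, and packages may store interface names elsewhere, so diff the result against the original before trusting it.

```python
import xml.etree.ElementTree as ET

# BEFORE -> AFTER names, taken from the table above.
IF_MAP = {"vtnet0": "ixl1", "vtnet1": "ixl3", "vtnet2": "ixl2"}


def remap_ifname(name, if_map):
    """Map 'vtnet0' -> 'ixl1' and 'vtnet0.20' -> 'ixl1.20'.
    Names whose base is not in the map are returned unchanged."""
    base, dot, suffix = name.partition(".")
    if base in if_map:
        return if_map[base] + dot + suffix
    return name


def remap_config(xml_text, if_map=IF_MAP, tags=("if", "vlanif")):
    """Rewrite interface names inside the given tags.
    Returns (new_xml_text, number_of_elements_changed).
    Assumption: only <if> and <vlanif> carry physical interface names."""
    root = ET.fromstring(xml_text)
    changed = 0
    for elem in root.iter():
        if elem.tag in tags and elem.text:
            old = elem.text.strip()
            new = remap_ifname(old, if_map)
            if new != old:
                elem.text = new
                changed += 1
    return ET.tostring(root, encoding="unicode"), changed
```

Note that ElementTree writes CDATA sections back as ordinary escaped text, so the output will not be byte-identical to the input even where nothing was remapped. Work on a copy and keep the original backup untouched.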
Using Proxmox bridges:
$ pciconf -lv
...
virtio_pci4@pci0:6:21:0: class=0x020000 rev=0x00 hdr=0x00 vendor=0x1af4 device=0x1000 subvendor=0x1af4 subdevice=0x0001
    vendor     = 'Red Hat, Inc.'
    device     = 'Virtio network device'
    class      = network
    subclass   = ethernet
Proxmox host before passthrough:
# lshw -C network
...
  *-network:3
       description: Ethernet interface
       product: Ethernet Controller X710 for 10GbE SFP+
       vendor: Intel Corporation
       physical id: 0.3
       bus info: pci@0000:01:00.3
       logical name: enp1s0f3np3
       version: 01
       serial: 98:b7:85:XX:XX:XX
       size: 10Gbit/s
       capacity: 10Gbit/s
       width: 64 bits
       clock: 33MHz
       capabilities: pm msi msix pciexpress bus_master cap_list rom ethernet physical fibre 10000bt-fd
       configuration: autonegotiation=off broadcast=yes driver=i40e driverversion=6.8.12-10-pve duplex=full firmware=9.54 0x8000fb7a 1.2527.0 latency=0 link=yes multicast=yes speed=10Gbit/s
       resources: iomemory:600-5ff iomemory:600-5ff irq:16 memory:60e0000000-60e07fffff memory:60e2800000-60e2807fff memory:80a00000-80a7ffff memory:60e2000000-60e21fffff memory:60e2820000-60e289ffff
...
VLAN's - there are 10 but only 2 of them are in use and are not critical. The pfsense LAN port obviously needs to be VLAN aware, and on Debian you would configure these settings for the Nic's in
/etc/network/interfaces
i.e.:
auto enp1s0f1np1
iface enp1s0f1np1 inet manual
    post-up /sbin/ethtool -K enp1s0f1np1 rxvlan off
    post-up /sbin/ethtool -K enp1s0f1np1 rx-vlan-offload off

auto pflan
iface pflan inet manual
    bridge-ports enp1s0f1np1
    bridge-stp off
    bridge-fd 0
    bridge-vlan-aware yes
    bridge-vids 20 30 40 50 60 70 80 90 55
-
So what will definitely happen here is that the NDI, which is calculated from the hardware at boot, will change because pfSense now sees different NICs in the system. That means it will be unable to connect to the pkg system to check for updates etc., which may cause some slowness on the dashboard, since it runs an update check by default. But that shouldn't cause any dramatic slowdown.
If you restore a config in that situation it will be unable to reload packages, which could cause problems.
I can usually migrate the registered NDI to the new value one time once you have the final hardware config in place.
@rpm5099 said in Complete Fail replacing NIC:
However, it inexplicably causes all kinds of other bizarre issues with pfSense seemingly unrelated to interfaces at all.
Like what? Anything other than slowness in the GUI?
-
@stephenw10 I contacted support yesterday and was told basically what you just said - I can change the NDI "one time" and then I'll have to pay again. What if I change the id and find out licensing wasn't the [only] problem and need to revert it? I cannot be in that situation - I'm not going to keep repeatedly paying hundreds of dollars.
If I had known this when I was considering whether to upgrade I would not have done it. For right now I am focused on downgrading to CE so I can remove licensing issues from the picture entirely. Last night I built a new VM using a 2.7.2 ISO I already had - not the new Netgate installer ISO - and loaded the config.xml with the interfaces updated for the passthrough. The GUI hung for quite a while, but I suppose this could be normal while things are getting set up. I'm not seeing anything alarming in dmesg except possibly a message saying that SR-IOV failed to load on the NIC, but since I'm passing through the entire adapter and not doing any virtualization within the pfSense VM I don't think that should matter. top doesn't show any significant CPU usage - how do you check what is making the GUI hang? What is a 50x error in the GUI? Are there any go-to commands you run to troubleshoot issues when restoring a config on a new box to see what is going on?
I'm going to have to migrate to another platform - I will have no one's boot to my throat while attempting to troubleshoot or upgrade hardware.
Thanks again for your help.
-
After switching back to CE and re-enabling some of the nic offloads I'm now getting >6G through the haproxy ssl connection up from about 1.5G, which validates the reason I was attempting to remove the Proxmox virtual bridges. It's unfortunate that Netgate has a licensing policy that is completely unacceptable.
@stephenw10 - I've read a lot of your posts and they have been very helpful to me over the years. Keep up the great work. I am very interested in the Proxmox Nic passthrough guide that was mentioned in one thread but I have not been able to find it.
Thanks again,
RPM
-
If you have a legitimate reason to need to migrate the NDI then we can accommodate that. If you had a hardware failure for example. Or, here, if you upgraded and found your new hardware is incompatible. We're not completely inflexible.