pfSense-2.7.2 as ESXi 8 VM: crash when adding a vNIC
-
Yesterday I was asked to quickly add another vNIC to a pfSense-2.7.2, to prepare for another WAN-interface (PPPoE ... fiber).
I wasn't too happy to touch that without a bit of preparation, but I did.
The media converter of that fiber access was plugged into a port, I figured out which physical NIC that was, and set up a vSwitch etc. Same setup as with the existing three interfaces in that pfSense, same models of physical NICs etc.
I am not sure if it is well supported to add a vNIC while the VM is running? Seems not: as soon as I tried that, the whole site went offline because that specific pfSense is the main router.
That VM didn't even boot correctly after a reset. With that vNIC attached, even when no cable was plugged into the pNIC, the VM failed to boot normally.
I have a crash report, that's four files: info.0 and info.1, very short. And 2 tarballs(?).
May I attach them here, does that make sense?
I browsed the report in the GUI and spotted this:
pci7: <ACPI PCI bus> on pcib4
vmx3: <VMware VMXNET3 Ethernet Adapter> at device 0.0 on pci7
vmx3: Using 512 TX descriptors and 512 RX descriptors
vmx3: Using 2 RX queues 2 TX queues
vmx3: Using MSI-X interrupts with 3 vectors
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0x2b8
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80af4a16
stack pointer = 0x28:0xfffffe00085e3910
frame pointer = 0x28:0xfffffe00085e3930
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 0 (pci_hp taskq)
rdi: fffff8008137c800 rsi: ffffffff82d8db88 rdx: 000000000000001e
rcx: fffff800058df000 r8: 0000000000000000 r9: 0000000000000010
rax: 0000000000000000 rbx: fffff8011cc34800 rbp: fffffe00085e3930
r10: 0000000000000000 r11: fffff800054e8800 r12: fffffe00085e39e0
r13: fffff80012f51010 r14: fffffe00092a4000 r15: fffff8008137c800
trap number = 12
panic: page fault
cpuid = 0
time = 1776856634
KDB: enter: panic
Aside from that crash (thinking of a cold boot): Is it possible that the 4th vNIC messes up the interface assignments?
My idea right now is to instruct the customer to add the 4th vNIC himself while the VM is properly shut down. Ah, the NICs are of the type vmx0 .. vmx1 .. etc
Thanks for any suggestions.
-
Seems I have to proceed on my own.
Planning to test that one pNIC in a separate VM tomorrow.
Hopefully that doesn't crash everything.
-
@sgw It's generally not a good idea to hotplug a NIC like that even if there is support for it at the OS level. The crash shows it barfing right after pci_hp which is the PCI hotplug process.
Current pfSense CE is 2.8.1 -- you're 2.5 years behind.
Add your NICs with the firewall VM powered off. Upgrade pfSense to current.
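To make the reading of the crash report above a bit more concrete, here is a minimal Python sketch that pulls the key fields out of a FreeBSD panic message and flags whether the faulting thread was the PCI hotplug task queue. The function name `summarize_panic` and the shortened `PANIC_TEXT` excerpt are made up for illustration; this is not a pfSense or FreeBSD tool, just a way to scan such a dump quickly.

```python
import re

# Shortened excerpt of the panic text quoted earlier in this thread.
PANIC_TEXT = """\
pci7: <ACPI PCI bus> on pcib4
vmx3: <VMware VMXNET3 Ethernet Adapter> at device 0.0 on pci7
vmx3: Using MSI-X interrupts with 3 vectors
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0x2b8
current process = 0 (pci_hp taskq)
"""


def summarize_panic(text):
    """Extract a few key fields from a FreeBSD panic message (hypothetical helper)."""
    summary = {}

    trap = re.search(r"Fatal trap (\d+): (.+?) while in kernel mode", text)
    if trap:
        summary["trap"] = int(trap.group(1))
        summary["reason"] = trap.group(2)

    addr = re.search(r"fault virtual address\s*=\s*(0x[0-9a-fA-F]+)", text)
    if addr:
        summary["fault_address"] = addr.group(1)

    proc = re.search(r"current process\s*=\s*\d+ \(([^)]*)\)", text)
    if proc:
        summary["process"] = proc.group(1)

    # The last device announced before the trap is a hint at what was attaching.
    head = text[:trap.start()] if trap else text
    devices = re.findall(r"(\w+): <[^>]+>", head)
    if devices:
        summary["last_device"] = devices[-1]

    # "pci_hp" in the process name points at the PCI hotplug task queue.
    summary["hotplug"] = "pci_hp" in summary.get("process", "")
    return summary


summary = summarize_panic(PANIC_TEXT)
for key, value in summary.items():
    print(f"{key}: {value}")
```

The regular expressions only look at field labels, so the sketch works whether the dump is pasted line by line or flattened onto a single line.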
-
@KOM We will try tomorrow. I have to walk the customer through the process, he will have to edit the VM while being offline then. Sure, the missing upgrade is on the list as well. Thanks!
-
@sgw define “didn't even boot correctly”? pfSense will stop during boot to assign interfaces if it detects a change.
-
It's been a few weeks: as far as I remember the VM crashed with that vNIC hotplugged. OK.
Then the customer turned off the VM (via ESXi GUI) and tried to boot it again, without changing the virtual hardware. I obviously couldn't watch the console because the site was offline then ... but we waited for a few minutes and the connectivity didn't come up, so we concluded the VM was hanging somehow. It's very likely that it waited for us to assign the interface, as you say!
I have a telco in 20 minutes where we will try to add that vNIC in the shut down pfSense-VM.
-
We didn't succeed in adding the 4th vNIC. The system came up after adding it in powered-off state, but the VM wasn't reachable via its WAN IP (on an untouched config). Very very strange. After removing that 4th vNIC, everything was fine again.
I solved it by re-connecting an already existing vNIC (which was disabled as Interface in pfSense) to the needed port group on the ESXi. This vNIC and Interface worked immediately with the fiber hardware (via PPPoE).
Oh my.
We won't invest much more time in this VM and will migrate to dedicated hardware ...
-
@sgw It's fairly common for added NICs to disrupt the existing order. You need to have console access so you can see what's going on with the interface assignments.
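One way to check for such a reordering from the console is to save the `ifconfig` output before and after the change and compare which MAC address sits behind which interface name. The Python sketch below does that comparison. The helper names and the sample MAC addresses are hypothetical, and the "after" sample only illustrates a possible shift; it is not a record of what happened on this VM.

```python
import re


def parse_ifconfig(text):
    """Map interface name -> MAC address from FreeBSD-style ifconfig output."""
    macs = {}
    current = None
    for line in text.splitlines():
        header = re.match(r"^(\w+): flags=", line)
        if header:
            current = header.group(1)
            continue
        ether = re.match(r"^\s+ether ([0-9a-f:]{17})", line)
        if ether and current:
            macs[current] = ether.group(1)
    return macs


def moved_interfaces(before, after):
    """Return {mac: (old_name, new_name)} for NICs whose name changed."""
    name_by_mac = {mac: name for name, mac in after.items()}
    moved = {}
    for name, mac in before.items():
        new_name = name_by_mac.get(mac)
        if new_name is not None and new_name != name:
            moved[mac] = (name, new_name)
    return moved


# Hypothetical sample data: three NICs before, four after.
BEFORE = """\
vmx0: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
    ether 00:50:56:00:00:01
vmx1: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
    ether 00:50:56:00:00:02
vmx2: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
    ether 00:50:56:00:00:03
"""

AFTER = """\
vmx0: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
    ether 00:50:56:00:00:01
vmx1: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
    ether 00:50:56:00:00:04
vmx2: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
    ether 00:50:56:00:00:02
vmx3: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
    ether 00:50:56:00:00:03
"""

moved = moved_interfaces(parse_ifconfig(BEFORE), parse_ifconfig(AFTER))
for mac, (old, new) in moved.items():
    print(f"{mac}: {old} -> {new}")
```

If anything shows up as moved, the existing assignments (WAN, LAN, ...) would point at the wrong vNICs until they are reassigned, which would match a firewall that boots fine but is not reachable on its usual addresses.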
-
@KOM Very likely, yes. Unfortunately that's not so easy ... it would be possible over some tethering setup probably.
But the priorities changed: that VM will not live very long anymore. And right now all WANs are available and usable (researching slow speedtests, but that's another issue).