Unexplained halt / and of course lan outage.
-
2100 - running 23.05.1 - generally very stable for months until 3:25 am this morning. (the only time the system generally gets restarted is during updates)
The last log entry just prior to the halt/outage was ACME reporting that the certificate did not need update. (Normal for this time of the day)
There doesn't seem to be any reason logged, no crash report etc.
This morning when it was discovered there was no internet from anything internal, I tried connecting to the NetGate, no connection.
Could not ping, could not SSH. No response on the console (of course it was only hooked up afterwards), but I couldn't get a response.
The blue light on the front was still blinking "normally".
Devices going through the NG switch could not talk to each other,
that is ports LAN 1 <-> LAN 2 <-> LAN 3 could not talk (everything is on the same subnet, so there are no rules internally between the ports on the NG).
Devices connected on the switch connected to LAN1 could talk to each other, similarly devices connected to LAN2 or LAN3 could talk on their own switch, but through the NG was a no go.
It is almost like the system received a "halt now" command. But there is nothing logged regarding this, and "normally there would be".
After several attempts to connect, I finally decided there was no other choice and pulled the power. (normally it is on UPS)
It rebooted and has been fine since.
Every monitoring graph shows everything just stopped.
There was no WAN drop logged
There is zero traffic logged WAN or LAN during the outage until the reboot, those graphs are flat as well.
Plenty'O-Disk
Packages installed are minimal, acme, apcupsd, pfBlockerNG, System_Patches (all are current)
No VPN, etc
Is there anything else I can look at, before just saying oh well, stuff happens and move on ?
-
No crash report? Nothing logged at all?
If so there's probably nothing you can see at this point.
Steve
-
None that I could see. Every other log file stops and then restarts after I eventually pulled the plug.
the only thing I see different is at the start of the dmesg file: the previous one contains 4 extra lines at the start
---<<BOOT>>---
GDB: debug ports: uart
GDB: current port: uart
KDB: debugger backends: ddb gdb
KDB: current backend: ddb
Copyright (c) 1992-2022 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
normally just looks like
Copyright (c) 1992-2023 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
other than that - they look the same.
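For anyone wanting to compare two boot logs without eyeballing them, here is a rough sketch. It assumes both logs are saved as plain text files. The function name compare_dmesg and the paths in the usage comment are made up for illustration and are not part of pfSense. Hex values are masked because addresses can differ from one boot to the next without meaning anything.

```shell
#!/bin/sh
# compare_dmesg OLD NEW  (hypothetical helper, not a pfSense tool)
# Prints a diff of two boot logs with hex addresses masked out.
# Returns 0 when the logs match, 1 when they differ.
compare_dmesg() {
    old=$1
    new=$2
    tmp_old=$(mktemp) || return 2
    tmp_new=$(mktemp) || { rm -f "$tmp_old"; return 2; }
    # Replace every 0x... value with a fixed token so that addresses
    # that change from boot to boot do not show up as differences.
    sed 's/0x[0-9a-fA-F][0-9a-fA-F]*/0xADDR/g' "$old" > "$tmp_old"
    sed 's/0x[0-9a-fA-F][0-9a-fA-F]*/0xADDR/g' "$new" > "$tmp_new"
    diff "$tmp_old" "$tmp_new"
    rc=$?
    rm -f "$tmp_old" "$tmp_new"
    return $rc
}

# Example (both paths are assumptions, adjust to where the copies live):
#   compare_dmesg /tmp/dmesg.previous /var/log/dmesg.boot
```

Any line that only exists in one of the two files shows up in the diff output, which makes extra lines at the top of a log easy to spot.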
I'm ok if there is nothing more to see here, because it has only happened this one time. Generally very stable and if it works don't fix it. Now if it started doing this more frequently I'd be worried.
Couple of follow up questions if I may - again for reference, this is a 2100
(i've never had an issue with all the memory being consumed, not even close)
a) from day one out of the box running ZFS and there is NO Swap file. If I'm reading correctly to get a crash report on the GUI, there needs to be a swap file?
a1) can a swap file be created without having to do a complete reinstall?
b) again from day one, in the dmesg file there is always a sequence of lines
ipw_bss: You need to read the LICENSE file in /usr/share/doc/legal/intel_ipw.LICENSE.
ipw_bss: If you agree with the license, set legal.intel_ipw.license_ack=1 in /boot/loader.conf.
module_register_init: MOD_LOAD (ipw_bss_fw, 0xffff000000252de4, 0) error 1
(repeats 6 times, the hex address is different in each sequence)
Since the licence file itself doesn't exist on the system, I've never looked at setting the item in the boot/loader.conf file. (just assumed on the 2100 it doesn't apply)
Thanks for your feedback.
JR
-
@jrey said in Unexplained halt / and of course lan outage.:
b) again from day one, in the dmesg file there is always a sequence of lines
ipw_bss: You need to read the LICENSE file in /usr/share/doc/legal/intel_ipw.LICENSE.
ipw_bss: If you agree with the license, set legal.intel_ipw.license_ack=1 in /boot/loader.conf.
module_register_init: MOD_LOAD (ipw_bss_fw, 0xffff000000252de4, 0) error 1
(repeats 6 times, the hex address is different in each sequence)
Since the licence file itself doesn't exist on the system, I've never looked at setting the item in the boot/loader.conf file. (just assumed on the 2100 it doesn't apply)
Do as it says:
If you agree with the license, set legal.intel_ipw.license_ack=1 in /boot/loader.conf
However, you shouldn't edit /boot/loader.conf, as it can be overwritten by pfSense.
The solution is: create a text file /boot/loader.conf.local and add:
legal.intel_ipw.license_ack=1
and there will be no more "intel_ipw" licence messages afterwards.
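As a minimal sketch of that step, the helper below appends the setting to a file only if the key is not already present, so running it twice does not create duplicate lines. The function name ensure_loader_setting is hypothetical and not something pfSense ships; it would need to be run as root on the firewall.

```shell
#!/bin/sh
# ensure_loader_setting FILE KEY VALUE  (hypothetical helper)
# Appends KEY=VALUE to FILE unless a line starting with KEY= exists.
ensure_loader_setting() {
    file=$1
    key=$2
    value=$3
    # Create the file if it does not exist yet.
    [ -e "$file" ] || : > "$file"
    # Dots in KEY act as regex wildcards here, which is loose but
    # good enough to detect an existing entry for the same key.
    if grep -q "^${key}=" "$file"; then
        return 0
    fi
    printf '%s=%s\n' "$key" "$value" >> "$file"
}

# Example for the setting discussed above:
#   ensure_loader_setting /boot/loader.conf.local legal.intel_ipw.license_ack 1
```

The change only takes effect at the next boot, since the file is read during startup.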
During boot, the loader reads
/boot/loader.conf
and after that, it also reads
/boot/loader.conf.local
-
Those lines before the copyright notice are expected. Nothing unusual there.
-
@Gertjan said in Unexplained halt / and of course lan outage.:
The solution is : create a /boot/loader.conf.local text and add :
Thanks for the tip about using the .local to set it.
is it even a problem?
Is the only point of setting it now to stop the messages from logging?
The license file itself has never existed (from day one, even through 3 OS updates).
There is nothing to read and/or agree to.
It seems that, other than the messages in the dmesg file, setting or not setting it would have little effect on the operation.
-
@jrey said in Unexplained halt / and of course lan outage.:
is it even a problem?
None.
It's just a question that pops up every week or month.
@jrey said in Unexplained halt / and of course lan outage.:
The license file itself has never existed (from day one, even through 3 OS updates).
There is nothing to read and/or agree to.
It's just a phrase, thrown out in the system logs.
pfSense doesn't come packed with man and other documentation files.
If you want to / have to / need to: go to the FreeBSD repository and locate the driver in the source tree. There you will find the file.
It's just the usual disclaimer.
-
@stephenw10 said in Unexplained halt / and of course lan outage.:
Those lines before the copyright notice are expected.
Thanks, "expected" but not consistent then, as they are not there every time. Good to know.
Then there is nothing different with the file after the unexpected halt except those lines, and normally I don't see them.
Nothing to see here then, it appears the system just halted / unknown cause.
-
@jrey said in Unexplained halt / and of course lan outage.:
a) from day one out of the box running ZFS and there is NO Swap file. If I'm reading correctly to get a crash report on the GUI, there needs to be a swap file?
a1) can a swap file be created without having to do a complete reinstall?
For the sole purpose of being able to possibly get a crash dump should this happen again, would the instructions in this old post still apply?
https://forum.netgate.com/post/784925
The swap file created in the above link is 1GB; on the 2100 with 4GB RAM does the swap need to be/should it be larger?
Thanks
-
@jrey said in Unexplained halt / and of course lan outage.:
can a swap file be created without having to do a complete reinstall?
A swap file isn't just a file, it's a file using its own partition.
There are tools that allow you to shrink the current pfSense partition to make room for a new one, add an entry for that new partition in the MBR, etc.
When my 4100 was delivered (using 22.05 ?): no swap file. There was a partition, but it wasn't made available to the kernel, so I had to modify /etc/fstab (if I recall that right).
By default - and not a bad choice : you earned the right to install from scratch. That's always a useful experience, as you need to be able to do so also in case of emergency.
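To see whether an fstab already lists a swap device before deciding anything, a small sketch like the one below could help. The function name list_swap_entries is hypothetical. It only inspects the file it is given, so it says nothing about whether swap is actually active on the running system (on FreeBSD, swapinfo reports that).

```shell
#!/bin/sh
# list_swap_entries FSTAB  (hypothetical helper)
# Prints the device of every non-comment fstab line whose type field
# (third column) is "swap". Returns 0 if at least one is found, else 1.
list_swap_entries() {
    awk '$1 !~ /^#/ && $3 == "swap" { print $1; found = 1 }
         END { exit !found }' "$1"
}

# Example on the firewall:
#   list_swap_entries /etc/fstab || echo "no swap entry in fstab"
```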
-
@Gertjan said in Unexplained halt / and of course lan outage.:
By default - and not a bad choice : you earned the right to install from scratch
Thanks, but for the sole purpose of possibly grabbing a crash report for something that has only happened once in the life of the system, it seems like a future project only if the halts become a bigger issue.
I'm familiar with the install from scratch process and have done it a few times on a virtual machine, for testing before even purchasing the real gear.
The only reason for putting the swap on a different partition is likely this warning on the FreeBSD pages
"Swap files on ZFS file systems are strongly discouraged, as swapping can lead to system hangs."
https://docs.freebsd.org/en/books/handbook/config/#adding-swap-space
No pressing need to do a reinstall for me, just for this, so a future me might consider it, if I get bored or it becomes an issue someday.
Thanks for the feedback.
-
Yeah I would have to agree, if you want to add SWAP reinstalling is the way to go.
-
@jrey said in Unexplained halt / and of course lan outage.:
https://docs.freebsd.org/en/books/handbook/config/#adding-swap-space
True, mounting another ZFS partition eats up big quantities of system resources.
Happily enough, the swap partition is of type 'swap', whatever that might be. But it's probably a file system that is way easier for the system to handle than ZFS, ext3 or ntfs. Probably close to fat32 ;)
The small swap partition just exists so it can receive a kernel oops plus other crash details in a file. I would trim down the system load if the swap partition was used for actual swapping.