pfSense became unresponsive, then no DNS resolution after reboot
- 
 @stephenw10 I can't see anything outstanding in the logs, but I might be missing something. Maybe I should dig into the logs via the command line and not just look at the GUI. However, I found something that might be the cause. I am using HAProxy and I recently enabled the stats logging. Using ps aux I could see that HAProxy was consuming a lot of memory. After some time I checked again and its memory consumption had increased. So at this point my theory is that it caused an out-of-memory error. I disabled stats and since then the memory usage has stayed at the same level. Fingers crossed. BTW, all (known) devices on the network have static IPs. pfSense hands out pihole's IP to clients and I can confirm they are using pihole directly. Now that I think about it, the DNS resolution issue is even more mysterious, as pfSense should not be involved in DNS lookups. 
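 The "check ps aux twice and compare" step can be sketched as a small script. This is only a rough illustration: the sample rows, the PID, the sizes and the paths are made up, and the column layout (RSS as the sixth column, command as the eleventh) is assumed to match the usual `ps aux` output.

```python
# Hypothetical `ps aux` capture; every value below is illustrative, not from the real box.
SAMPLE = """USER   PID %CPU %MEM    VSZ    RSS TT  STAT STARTED    TIME COMMAND
root 12345  0.3  2.1 250000 180000  -  Ss   10:15   0:42.10 /usr/local/sbin/haproxy -f /var/etc/haproxy/haproxy.cfg
root   400  0.0  0.1  13000   2500  -  Is   10:00   0:00.05 /usr/sbin/syslogd -s
"""


def haproxy_rss_kib(ps_output: str, name: str = "haproxy") -> int:
    """Sum the RSS column (KiB) of every process whose command mentions `name`."""
    total = 0
    for line in ps_output.splitlines()[1:]:  # skip the header row
        # Split into at most 11 fields so the command and its arguments stay together.
        fields = line.split(None, 10)
        if len(fields) < 11:
            continue  # not a complete process row
        if name in fields[10]:
            total += int(fields[5])  # RSS is the sixth column (assumed layout)
    return total


if __name__ == "__main__":
    print("haproxy RSS (KiB):", haproxy_rss_kib(SAMPLE))
```

 Running it against two captures taken some time apart and comparing the totals gives the same "is it still growing?" answer as eyeballing the output.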
- 
 Check the graphs in Status > Monitoring. If there was memory exhaustion, it should have been recorded there. 
- 
 @Sherwatt said in pfSense became unresponsive, then no DNS resolution after reboot: "was the first time we experienced a power outage"
 Next time you boot up pfSense, do so while watching the boot process from the console (the serial access, with the small wire). You'll know right away if there is a problem.
 Also: when pfSense doesn't seem to react, connect to the serial (console) interface first.
 Resetting or ripping out the power is like a Russian roulette "head shot".
 SSH access is the next best, but it needs 'interfaces' to work. Not being able to SSH in is already a 'bad' sign by itself. See it like this: nearly every device on the planet depends on SSH, and it's pretty rock solid. SSH not working is a big red flag. It could be as simple as the "Login protection" having locked you out after several login (password) errors, but you'd better be sure right away: try logging in from another device.
 pfSense normally handles DHCP, so a working lease is also a good sign that some parts are still running. If all your devices use static IP settings, you 'miss' this check. Otherwise, run ipconfig /all on your PC, or check whether your device re-obtained a DHCP lease after removing the connection for a short time.
 When you install pfSense packages like HAProxy, it becomes important to check the system resources regularly. After all, when RAM fills up, pfSense can start swapping, and that's something you really do not want to happen, as the system might pick a process (typically the one using the most RAM) and kill it. This will most probably have an impact, as every process is essential. This "killing" will be signaled in the system log. And yeah, a UPS can pay for itself without you knowing about it ;) 
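 To look for such a kill after the fact, the system log can be scanned for the kernel's "was killed" message. A minimal sketch, assuming the FreeBSD kernel wording "pid N (name), ... was killed: reason"; the exact text may differ between versions, so treat the pattern as an assumption and adjust it to what your log actually shows.

```python
import re

# Assumed shape of the kernel message: "pid 123 (name), ..., was killed: <reason>"
KILL_RE = re.compile(r"pid (\d+) \(([^)]+)\).*was killed: (.+)$")


def find_oom_kills(log_text: str) -> list:
    """Return (pid, process name, reason) for every 'was killed' line in the log."""
    kills = []
    for line in log_text.splitlines():
        match = KILL_RE.search(line)
        if match:
            kills.append((int(match.group(1)), match.group(2), match.group(3).strip()))
    return kills


if __name__ == "__main__":
    # The path is the system log location mentioned later in this thread.
    with open("/var/log/system.log", errors="replace") as handle:
        for pid, name, reason in find_oom_kills(handle.read()):
            print(f"{name} (pid {pid}) was killed: {reason}")
```

 An empty result only means no such line was logged; it does not rule out memory pressure.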
- 
 I'm just checking the Monitoring graph and memory consumption seems normal, nothing outstanding there. 
 However, the States started increasing 10 days ago. I am not even sure I understand what States are, but I guess I need to see what I changed 10 days ago and whether it is related.
 EDIT: that big spike is NOT when the issue happened; the spike is actually 24 hours before that. So maybe it is not even States that caused it.
 Thanks for the tips @Gertjan, I will try to remember to use the serial access first. But it also depends on how quickly the household needs internet, as two people are working from home here. 
 And yes, after this outage I definitely want to buy a UPS, I just need to do some research, because I have never used one.
- 
 Yeah, I doubt it's a states problem. 4000 states really isn't that much. Odd that it spiked like that though. Do you have any sort of content sharing applications running? BitTorrent creates a lot of states, for example. 
- 
 @stephenw10 Yes, I am running qBittorrent in a container. Can I run some kind of error checking and fixing command on pfSense to look for potentially corrupted files on the disk? Maybe the outage caused corruption somewhere on the filesystem that is rarely accessed, but when it fails, the whole system crashes. 
- 
 Is it UFS or ZFS? 
- 
 @stephenw10 I asked ChatGPT the same question as in my previous post and after a short chat it turns out it is ZFS:
   $ zpool status -v
     pool: pfSense
    state: ONLINE
   config:
           NAME        STATE     READ WRITE CKSUM
           pfSense     ONLINE       0     0     0
             mmcsd0p4  ONLINE       0     0     0
   errors: No known data errors
 Is there anything else I could use to retroactively diagnose the problem? I already fed the boot log to ChatGPT to look for errors, but it didn't find anything scary. Should I share it with you and if yes, is pasting it in a post acceptable? 
- 
 Then you can run a ZFS pool scrub: zpool scrub pfSense
 https://docs.netgate.com/pfsense/en/latest/troubleshooting/filesystem-check.html
 You can upload the logs here and I can look at them: 
 https://nc.netgate.com/nextcloud/s/zgpTGfKio3Fa5eb
- 
 @stephenw10 Thank you. I uploaded boot.txt.
   [2.7.2-RELEASE][admin@pfSense.lan.mydomain.com]/root: zpool scrub pfSense
   [2.7.2-RELEASE][admin@pfSense.lan.mydomain.com]/root: zpool status
     pool: pfSense
    state: ONLINE
     scan: scrub repaired 0B in 00:00:10 with 0 errors on Wed Mar 19 15:11:17 2025
   config:
           NAME        STATE     READ WRITE CKSUM
           pfSense     ONLINE       0     0     0
             mmcsd0p4  ONLINE       0     0     0
   errors: No known data errors
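 If you want to check the scrub result from a script instead of reading it by eye, the "scan:" line can be parsed. A small sketch, assuming the wording matches the output pasted above; a scrub that is still running, or a different OpenZFS version, may print something else, in which case the function simply returns None.

```python
import re

# Pattern assumed from the "scan:" line shown in this thread.
SCAN_RE = re.compile(r"scan: scrub repaired (\S+) in (\S+) with (\d+) errors")

# The scan line as pasted in the post above.
STATUS = "scan: scrub repaired 0B in 00:00:10 with 0 errors on Wed Mar 19 15:11:17 2025"


def scrub_summary(status_text: str):
    """Extract repaired bytes, duration and error count from `zpool status` text."""
    match = SCAN_RE.search(status_text)
    if match is None:
        return None  # no finished scrub reported in this text
    return {
        "repaired": match.group(1),
        "duration": match.group(2),
        "errors": int(match.group(3)),
    }


if __name__ == "__main__":
    print(scrub_summary(STATUS))
```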
- 
 That's just the boot log from after the outage happened. We need to see the system log covering the event, so from at least some hours before until and including the reboot. You should disable the on-board audio device though. It just uses resources and does nothing in pfSense.
   hdacc0: <Intel Jasper Lake HDA CODEC> at cad 2 on hdac0
   hdaa0: <Intel Jasper Lake Audio Function Group> at nid 1 on hdacc0
- 
 @stephenw10 Thank you for looking into my issue. I uploaded system.log twice, because I messed up the first one. I guess this is what I should be looking at, right? (from /var/log). 
 I think the issue happened around 17:45 (March 18). I left my computer around 17:40 and when I came back pfSense was dead.
 I should disable the audio device in UEFI, right? 
- 
 @Sherwatt said in pfSense became unresponsive, then no DNS resolution after reboot: "I should disable the audio device in UEFI, right?"
 Yup, somewhere in the EFI/BIOS setup you should be able to disable it completely. 
- 
 Mmm, nothing really shown in the logs at all:
   Mar 18 17:17:00 pfSense sshguard[62427]: Now monitoring attacks.
   Mar 18 17:26:00 pfSense sshguard[62427]: Exiting on signal.
   Mar 18 17:26:00 pfSense sshguard[44994]: Now monitoring attacks.
   Mar 18 17:35:00 pfSense sshguard[44994]: Exiting on signal.
   Mar 18 17:35:00 pfSense sshguard[31294]: Now monitoring attacks.
   Mar 18 17:44:00 pfSense sshguard[31294]: Exiting on signal.
   Mar 18 17:44:00 pfSense sshguard[11995]: Now monitoring attacks.
   Mar 18 17:47:09 pfSense syslogd: exiting on signal 15
   Mar 18 17:48:38 pfSense syslogd: kernel boot file is /boot/kernel/kernel
   Mar 18 17:48:38 pfSense kernel: ---<<BOOT>>---
   Mar 18 17:48:38 pfSense kernel: Copyright (c) 1992-2023 The FreeBSD Project.
   Mar 18 17:48:38 pfSense kernel: Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
   Mar 18 17:48:38 pfSense kernel: The Regents of the University of California. All rights reserved.
   Mar 18 17:48:38 pfSense kernel: FreeBSD is a registered trademark of The FreeBSD Foundation.
   Mar 18 17:48:38 pfSense kernel: FreeBSD 14.0-CURRENT amd64 1400094 #1 RELENG_2_7_2-n255948-8d2b56da39c: Wed Dec 6 20:45:47 UTC 2023
   Mar 18 17:48:38 pfSense kernel: root@freebsd:/var/jenkins/workspace/pfSense-CE-snapshots-2_7_2-main/obj/amd64/StdASW5b/var/jenkins/workspace/pfSense-CE-snapshots-2_7_2-main/sources/FreeBSD-src-RELENG_2_7_2/amd64.amd64/sys/pfSense amd64
   Mar 18 17:48:38 pfSense kernel: FreeBSD clang version 16.0.6 (https://github.com/llvm/llvm-project.git llvmorg-16.0.6-0-g7cbf1a259152)
 If nothing is logged at reboot like that, it can be a hardware issue. I assume you didn't see a crash report after rebooting? It doesn't look like you have SWAP configured, so you wouldn't see one if it panicked. 
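 For a longer system.log, finding the last thing logged before each boot can be scripted. A rough sketch, using four lines from the excerpt above as input. It assumes the syslog timestamp occupies the first 15 characters of each line and that the year is 2025 (syslog lines omit it); the walk-back rule is just a heuristic, not an official method.

```python
from datetime import datetime

BOOT_MARKER = "---<<BOOT>>---"

# Four lines taken from the log excerpt in this post.
LINES = [
    "Mar 18 17:44:00 pfSense sshguard[11995]: Now monitoring attacks.",
    "Mar 18 17:47:09 pfSense syslogd: exiting on signal 15",
    "Mar 18 17:48:38 pfSense syslogd: kernel boot file is /boot/kernel/kernel",
    "Mar 18 17:48:38 pfSense kernel: ---<<BOOT>>---",
]


def parse_stamp(line: str, year: int = 2025) -> datetime:
    """Parse the 'Mar 18 17:47:09' prefix; the year is an assumption passed in."""
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")


def last_entry_before_boot(lines, year: int = 2025):
    """Return (last line stamped earlier than the boot marker, gap in seconds)."""
    for index, line in enumerate(lines):
        if BOOT_MARKER not in line:
            continue
        boot_time = parse_stamp(line, year)
        # Walk back past lines written during the same second as the boot marker.
        for previous in reversed(lines[:index]):
            stamp = parse_stamp(previous, year)
            if stamp < boot_time:
                return previous, (boot_time - stamp).total_seconds()
        return None
    return None


if __name__ == "__main__":
    print(last_entry_before_boot(LINES))
```

 On the excerpt above this points at the "syslogd: exiting on signal 15" line, 89 seconds before the boot marker.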
- 
 @stephenw10 Thank you for your time looking into the logs. I did not see any crash reports. Do you think I should configure swap in pfSense in case this happens again? 
- 
 You would need to re-install to do so. But that would then give you a crash report if it was the result of a kernel panic. 
- 
 @stephenw10 Then I'm just going to stick with my current setup and see if there is anything on the console the next time this happens, if it happens. 
 Thank you for your help, much appreciated!
- 

