PFsense hanging since version 2.4.4
-
@chrismacmahon said in PFsense hanging since version 2.4.4:
It's tricky to time it right; if you have an external logging server, this is where it gets helpful.
Basically you want to get a copy of the system log right after a reboot. Sometimes there is info logged in /var/crash that can be helpful.
When you say system log, is that the system.log in /var/log? In that case it does not contain anything from during the crash or just before it... /var/crash is empty, so I guess it points to a hardware issue..
If there is nothing logged, chances are high it's hardware related, and it's starting to fail out (RAM, HDD, etc).
I have taken the mini PC apart. No dust in it since it's a fanless system. I removed the M.2 drive and RAM and put them back again just in case.
I also removed the WiFi chip since I do not use it anymore, so that's one less possible cause to worry about :)
-
Our book has all the information needed on logs: https://www.netgate.com/docs/pfsense/monitoring/system-logs.html
If you want to view from the CLI: https://www.netgate.com/docs/pfsense/monitoring/working-with-binary-circular-logs-clog.html
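A rough sketch of what checking the logs from the shell could look like right after a reboot. It assumes pfSense 2.4.x, where the logs are binary circular files read with clog (as the link above describes). The function names show_log and crash_status are made up for illustration, and skipping a file named minfree in the crash directory is an assumption about what a normal, crash-free directory may contain.

```shell
#!/bin/sh
# Print the last N lines of a log file. Uses clog when it is installed
# (binary circular logs on pfSense 2.4.x), otherwise treats the file as text.
show_log() {
    log_file="$1"
    lines="${2:-50}"
    if command -v clog >/dev/null 2>&1; then
        clog "$log_file" | tail -n "$lines"
    else
        tail -n "$lines" "$log_file"
    fi
}

# Report whether a crash directory holds anything besides "minfree"
# (assumption: a lone minfree file is not crash data).
crash_status() {
    dir="${1:-/var/crash}"
    found=$(ls -A "$dir" 2>/dev/null | grep -v '^minfree$' || true)
    if [ -n "$found" ]; then
        echo "crash data present"
    else
        echo "no crash data"
    fi
}
```

On the firewall itself this would be called as `show_log /var/log/system.log 100` followed by `crash_status`, ideally as soon as the box is back up.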
-
@marvosa said in PFsense hanging since version 2.4.4:
No software is perfect, but my money is on something in your hardware. I would do a deep dive into your hardware (e.g. blow out dust, re-seat connections, re-seat the RAM, try new RAM, etc, etc). After that, post the crash log here.
Yeah.. I had a nightmare with another system here a few months ago.. a Windows system. It bluescreened now and then.. After about 1-2 months of searching for the fault (checking drivers, scanning for viruses, replacing the HD, reinstalling Windows on another HD, replacing the RAM sticks, changing the GFX card) I finally replaced the PSU and that solved the problem! .. In my 25-30 year PC career I have never had a faulty PSU that caused blue screens... I even measured the old PSU with a Fluke meter and the voltages seemed fine, with no fluctuation that I could find... But the bluescreens happened sporadically .. 1-3 days apart...
Anyway... that's off topic.. haha..
I opened the mini PC.. no dust (fanless), removed the RAM and disk and put them back again... Removed the WiFi mini chip. I will let it run for a few days now and see... If it crashes again I will move the machine to a stable 12V power supply to rule out the PSU (that's the easiest thing for me to test right now) .. After that I have to decide if I should buy new RAM or a new HDD first... What do you think? :)
-
How old is the system?
-
@chrismacmahon said in PFsense hanging since version 2.4.4:
How old is the system?
It's from 1st of April 2017 so not very old..
It is ordered from Aliexpress.com though...........
-
The computer hung again.. after 1 day and 8 hours. I will update to 2.4.4 p2, and if it hangs again I will connect it to another PSU, to test one thing at a time.. Will let you know in here, maybe it helps others :)
-
It sounds like it's hardware.
I'm not in your shoes, but chasing hardware faults is difficult at best sometimes.
When I had this happen 10 years ago, in my home, my wife insisted I fix it.... I ended up buying new hardware as that was the fastest path to resolution. Good luck!
-
It hung again. So now I'm running on another PSU. Also running a memory test now with Memtest86+... Will keep you updated... :) When it hung, the console screen was just frozen, with no special messages.
If it freezes again, the next step is to run it from a USB stick I guess...
If it freezes then, I will test older releases of pfSense.
-
This is a hardware fault.
If you have another device to swap out I would do so, or if you can run a VM to test I would.
-
@chrismacmahon said in PFsense hanging since version 2.4.4:
This is a hardware fault.
If you have another device to swap out I would do so, or if you can run a VM to test I would.
A VM to test what exactly? A VM of my current installation?
The router hung again an hour ago. I'm now making a USB stick to try to run my config from it if possible... Live.
-
If you have the hardware to spin up a virtual machine, you can import your working config into the VM and run off of that.
-
@chrismacmahon said in PFsense hanging since version 2.4.4:
If you have the hardware to spin up a virtual machine, you can import your working config into the VM and run off of that.
OK.. I don't have an ESXi machine at home.. only at work.. I will try to reinstall 2.4.4 on the computer first.. or run 2.3 if that fails.
-
I tried to create a USB stick with 2.4.4 p1, but after installing I run into that crappy serial console bug.. I pressed ESC to type "set kern.vty=sc", but then when booting into multi-user mode I seem to get a crash.. the text scrolls by too fast for me to read.. so after ESC I want to boot into single-user mode to be able to see the boot errors.. but how do I boot from the CLI (ESC) into single-user mode? I have tried to google it but cannot find anything... I tried things like "boot single" without success.. please help!
I really wanted to try 2.4.4 before I go back and try 2.3 or something that might work better...
-
Not sure, I would re-burn the image and try again... if it happens again, the hardware is the problem.
We are fans of Etcher.io
-
Found it... "boot -s" it is :) .. Will try it and review the logs.
-
Had to run fsck -y a few times because the file system was unclean: I had powered the machine off because of the display issue in 2.4.4.. I did not realize the system would start up anyway..
So now the system is up .. 2.4.4 p1 with basic settings.. Nothing changed except the password.. I will let it run like that to see what happens.. if it hangs again I will try to go back to an old version where I know it was working... to rule out software too...
Does anyone have an older image available? 2.4 or 2.4.1 or 2.4.2? That I could test with?
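For the repeated fsck -y runs mentioned above, a small loop can save some typing in single-user mode. This is only a sketch: the name fsck_until_clean and the FSCK_CMD variable are invented here, and it assumes that a non-zero exit status from fsck means another pass is worth trying, which may not hold for every failure.

```shell
#!/bin/sh
# Command to run for each pass. Overridable so the loop can be tried
# without touching a real disk.
FSCK_CMD="${FSCK_CMD:-fsck -y}"

# Re-run the check until it exits with status 0 or the pass limit is reached.
# Assumption: non-zero exit status means "not clean yet, run again".
fsck_until_clean() {
    max="${1:-5}"
    i=1
    while [ "$i" -le "$max" ]; do
        if $FSCK_CMD; then
            echo "clean after $i pass(es)"
            return 0
        fi
        i=$((i + 1))
    done
    echo "still dirty after $max passes"
    return 1
}
```

Calling `fsck_until_clean 5` would stop after the first clean pass. If it still reports dirty after several passes, that points back towards the disk itself.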
-
Another thing... i found this...
Intel Atom systems containing HD Graphics chipsets similar to the Z3700 may experience console problems after the update. Affected systems will boot successfully, but fail to display console output after the boot menu. To fix the problem, add the following lines to /boot/loader.conf.local:
i915kms_load="YES"
drm.i915.enable_unsupported=1
Systems with similar console problems not containing a graphics chip supported by the i915 driver may need to reinstall 2.4.4 to use a UEFI console. Alternately, try using the syscons console instead of VT in /boot/loader.conf.local:
kern.vty=sc
I have been using kern.vty=sc.. but I see I can enter the other lines instead... Could this be the source of the problems I have had? I'm running an Atom with a Z36xx or Z37xx GFX chip..
Here is pciconf -lv output of my system:
[2.4.4-RELEASE][root@pfSense.localdomain]/root: pciconf -lv
hostb0@pci0:0:0:0: class=0x060000 card=0x0f318086 chip=0x0f008086 rev=0x0e hdr=0x00
    vendor = 'Intel Corporation'
    device = 'Atom Processor Z36xxx/Z37xxx Series SoC Transaction Register'
    class = bridge
    subclass = HOST-PCI
vgapci0@pci0:0:2:0: class=0x030000 card=0x0f318086 chip=0x0f318086 rev=0x0e hdr=0x00
    vendor = 'Intel Corporation'
    device = 'Atom Processor Z36xxx/Z37xxx Series Graphics & Display'
    class = display
    subclass = VGA
ahci0@pci0:0:19:0: class=0x010601 card=0x0f238086 chip=0x0f238086 rev=0x0e hdr=0x00
    vendor = 'Intel Corporation'
    device = 'Atom Processor E3800 Series SATA AHCI Controller'
    class = mass storage
    subclass = SATA
xhci0@pci0:0:20:0: class=0x0c0330 card=0x0f358086 chip=0x0f358086 rev=0x0e hdr=0x00
    vendor = 'Intel Corporation'
    device = 'Atom Processor Z36xxx/Z37xxx, Celeron N2000 Series USB xHCI'
    class = serial bus
    subclass = USB
none0@pci0:0:26:0: class=0x108000 card=0x0f188086 chip=0x0f188086 rev=0x0e hdr=0x00
    vendor = 'Intel Corporation'
    device = 'Atom Processor Z36xxx/Z37xxx Series Trusted Execution Engine'
    class = encrypt/decrypt
hdac0@pci0:0:27:0: class=0x040300 card=0x0f048086 chip=0x0f048086 rev=0x0e hdr=0x00
    vendor = 'Intel Corporation'
    device = 'Atom Processor Z36xxx/Z37xxx Series High Definition Audio Controller'
    class = multimedia
    subclass = HDA
pcib1@pci0:0:28:0: class=0x060400 card=0x0f488086 chip=0x0f488086 rev=0x0e hdr=0x01
    vendor = 'Intel Corporation'
    device = 'Atom Processor E3800 Series PCI Express Root Port 1'
    class = bridge
    subclass = PCI-PCI
pcib2@pci0:0:28:1: class=0x060400 card=0x0f4a8086 chip=0x0f4a8086 rev=0x0e hdr=0x01
    vendor = 'Intel Corporation'
    device = 'Atom Processor E3800 Series PCI Express Root Port 2'
    class = bridge
    subclass = PCI-PCI
pcib3@pci0:0:28:2: class=0x060400 card=0x0f4c8086 chip=0x0f4c8086 rev=0x0e hdr=0x01
    vendor = 'Intel Corporation'
    device = 'Atom Processor E3800 Series PCI Express Root Port 3'
    class = bridge
    subclass = PCI-PCI
pcib4@pci0:0:28:3: class=0x060400 card=0x0f4e8086 chip=0x0f4e8086 rev=0x0e hdr=0x01
    vendor = 'Intel Corporation'
    device = 'Atom Processor E3800 Series PCI Express Root Port 4'
    class = bridge
    subclass = PCI-PCI
isab0@pci0:0:31:0: class=0x060100 card=0x0f1c8086 chip=0x0f1c8086 rev=0x0e hdr=0x00
    vendor = 'Intel Corporation'
    device = 'Atom Processor Z36xxx/Z37xxx Series Power Control Unit'
    class = bridge
    subclass = PCI-ISA
none1@pci0:0:31:3: class=0x0c0500 card=0x0f128086 chip=0x0f128086 rev=0x0e hdr=0x00
    vendor = 'Intel Corporation'
    device = 'Atom Processor E3800 Series SMBus Controller'
    class = serial bus
    subclass = SMBus
igb0@pci0:1:0:0: class=0x020000 card=0x00008086 chip=0x15398086 rev=0x03 hdr=0x00
    vendor = 'Intel Corporation'
    device = 'I211 Gigabit Network Connection'
    class = network
    subclass = ethernet
igb1@pci0:2:0:0: class=0x020000 card=0x00008086 chip=0x15398086 rev=0x03 hdr=0x00
    vendor = 'Intel Corporation'
    device = 'I211 Gigabit Network Connection'
    class = network
    subclass = ethernet
igb2@pci0:3:0:0: class=0x020000 card=0x00008086 chip=0x15398086 rev=0x03 hdr=0x00
    vendor = 'Intel Corporation'
    device = 'I211 Gigabit Network Connection'
    class = network
    subclass = ethernet
igb3@pci0:4:0:0: class=0x020000 card=0x00008086 chip=0x15398086 rev=0x03 hdr=0x00
    vendor = 'Intel Corporation'
    device = 'I211 Gigabit Network Connection'
    class = network
    subclass = ethernet
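To check quickly whether a box has one of the affected graphics chips, the device name of the vgapci entry can be pulled out of the pciconf -lv output. A minimal sketch, assuming the usual multi-line pciconf -lv layout (one header line per device, followed by indented "name = value" lines). The function name vga_devices is made up for this example.

```shell
#!/bin/sh
# Read `pciconf -lv` output on stdin and print the device description
# of every vgapci entry, without the surrounding single quotes.
vga_devices() {
    awk -v q="'" '
        # A header line such as "vgapci0@pci0:0:2:0: ..." starts a new device.
        /^[a-z0-9]+@pci/ { in_vga = ($0 ~ /^vgapci/) }
        # Inside a vgapci block, keep only the "device = ..." line.
        in_vga && /device[ \t]*=/ {
            sub(/^[^=]*=[ \t]*/, "")
            gsub(q, "")
            print
        }
    '
}
```

On the firewall this would be run as `pciconf -lv | vga_devices`. If the result names a Z36xxx/Z37xxx graphics device, the release note quoted earlier may apply.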
-
Ok.. I entered:
i915kms_load="YES"
drm.i915.enable_unsupported=1
into /boot/loader.conf.local
I now have much smaller text on the monitor connected to the system (VGA) when I boot than I had before.. Maybe this was the reason for the system freezes? .. I will run with the default config for 2 days... if there is no freeze I will load my config into the system and see if everything works as it should....
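If the two loader tunables are added from the shell rather than with an editor, it helps to avoid appending them twice. A small sketch under that assumption; add_tunable is a made-up helper name, and the file path is simply the one named in the release note.

```shell
#!/bin/sh
# Append a line to a loader config file only if that exact line
# is not already present (so re-running is harmless).
add_tunable() {
    file="$1"
    line="$2"
    touch "$file"
    if ! grep -qxF -- "$line" "$file"; then
        printf '%s\n' "$line" >> "$file"
    fi
}
```

Usage would look like `add_tunable /boot/loader.conf.local 'i915kms_load="YES"'` and `add_tunable /boot/loader.conf.local 'drm.i915.enable_unsupported=1'`, followed by a reboot so the loader picks the values up.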
-
This looks promising.. I really hope this was the issue.. seems like it though... Uptime of 2 days and 6 hours now.. I will let it run until it reaches 3 days, then I will apply my own config..
-
While you're applying, switch to p3.