Pfsense stops. Please help.



  • Hello,

    I am using pfSense 2.4.4. Until today it was working fine. Today the whole network stopped, so I went to the server room and saw a red light flashing on the front of the server (HP ML150 G6), although the server was still on. On the monitor, pfSense was up but frozen, and the keyboard did not respond. I powered the server off and on again, and it began to work. But after a few hours the problem occurred again. I looked at the logs but could not make much of them. I am attaching the photo and logs here for your review. Please help me solve this issue.

    Logs

    0_1540833256801_logs.txt

    0_1540833248597_SAVE_20181029_211115.jpeg



  • Check the log in HP's iLO for errors.



  • You think it is a server problem, right? Did you have a look at the logs?


  • Netgate Administrator

    Mmm, a red light on that little ECG logo sure looks like some hardware failure...

    Steve



  • Thanks. Did you have a look at the logs? Did you see anything strange there? I don't know much about logs.

    For example:

    Oct 29 19:38:05 php-cgi rc.bootup: The command '/usr/sbin/arp -s '192.168.4.245' '00:50:56:8a:94:a4'' returned exit code '1', the output was 'arp: writing to routing socket: Cannot allocate memory'
    Oct 29 19:38:04 php-cgi rc.bootup: The command '/usr/sbin/arp -s '192.168.4.244' 'f0:76:1c:37:62:35'' returned exit code '1', the output was 'arp: writing to routing socket: Cannot allocate memory'


    Oct 29 19:38:04 kernel ppc0: cannot reserve I/O port range
    Oct 29 19:38:04 kernel atkbd0: [GIANT-LOCKED]


    Oct 29 19:38:05 sshd 93222 Server listening on 0.0.0.0 port 33426.
    Oct 29 19:38:05 sshd 93222 Server listening on :: port 33426.


    Oct 29 19:38:04 kernel pci0: <base peripheral, interrupt controller> at device 20.2 (no driver attached)
    Oct 29 19:38:04 kernel pci0: <base peripheral, interrupt controller> at device 20.1 (no driver attached)
    Oct 29 19:38:04 kernel pci0: <base peripheral, interrupt controller> at device 20.0 (no driver attached)


    Oct 29 19:38:04 kernel uhub3: 2 ports with 2 removable, self powered
    Oct 29 19:38:04 kernel uhub1: 2 ports with 2 removable, self powered


  • Netgate Administrator

    Only that first log is a real error. It could be a number of things but potentially it could be bad RAM.

    You need to check the server logs if you can reach them.

    Steve



  • The flashing LED is the system health indicator.
    Blinking red means: Critical system failure detected (processor, memory, regulator, thermal event, fan, NMI)

    In iLO you may find detailed information about the failure.



  • @stephenw10 Can this error be caused by bad RAM?

    Oct 29 19:38:05 php-cgi rc.bootup: The command '/usr/sbin/arp -s '192.168.4.245' '00:50:56:8a:94:a4'' returned exit code '1', the output was 'arp: writing to routing socket: Cannot allocate memory'

    This server is old and I don't know how to set up its management port and view the server logs.



  • You will also find system failure events in the BIOS.



  • I will have a look at those events at work tomorrow. Could you please explain what the log entries I posted above mean?


  • Netgate Administrator

    pfSense tried to create an ARP entry for that IP/MAC and failed because it couldn't write to the routing socket due to a memory allocation failure. Hard to say more than that. I imagine those are fixed DHCP leases you have set static ARP on.

    Really, I wouldn't even look at that until your hardware issue is addressed, which may well resolve it anyway.

    Steve
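For anyone who wants to see which static ARP entries failed without reading the raw log, here is a minimal Python sketch. It assumes the log lines have exactly the shape quoted in this thread; the names `FAILED_ARP` and `failed_static_arp` are made up for illustration and are not part of pfSense.

```python
import re

# Pattern for the failed `arp -s` lines quoted in this thread.
# Assumption: other pfSense versions may word the message differently.
FAILED_ARP = re.compile(
    r"arp -s '(?P<ip>[0-9.]+)' '(?P<mac>[0-9A-Fa-f:]+)'' "
    r"returned exit code '(?P<code>\d+)', the output was '(?P<output>.*)'"
)


def failed_static_arp(lines):
    """Return (ip, mac, error) for every log line reporting a failed `arp -s`."""
    results = []
    for line in lines:
        match = FAILED_ARP.search(line)
        if match and match.group("code") != "0":
            # Keep only the final part of the message, e.g. "Cannot allocate memory".
            error = match.group("output").split(": ")[-1]
            results.append((match.group("ip"), match.group("mac").lower(), error))
    return results


# Three lines copied from the logs posted above; the last one is unrelated.
SAMPLE = [
    "Oct 29 19:38:05 php-cgi rc.bootup: The command '/usr/sbin/arp -s '192.168.4.245' '00:50:56:8a:94:a4'' returned exit code '1', the output was 'arp: writing to routing socket: Cannot allocate memory'",
    "Oct 29 19:38:04 php-cgi rc.bootup: The command '/usr/sbin/arp -s '192.168.4.244' 'f0:76:1c:37:62:35'' returned exit code '1', the output was 'arp: writing to routing socket: Cannot allocate memory'",
    "Oct 29 19:38:04 kernel ppc0: cannot reserve I/O port range",
]

for ip, mac, error in failed_static_arp(SAMPLE):
    print(ip, mac, error)
```

Each reported pair should correspond to a DHCP static mapping with static ARP enabled, which makes it easy to check whether every such mapping is affected or only some of them.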



  • This memory allocation failure is due to the hardware problem, right? After fixing that, will these errors stop occurring? I had never seen these errors before today.


  • Netgate Administrator

    Then they are probably related.

    Steve



  • I replaced all the RAM with new modules, but I still get these errors in the system logs.

    Oct 31 09:06:05 php-fpm 336 /rc.linkup: The command '/usr/sbin/arp -s '192.168.2.240' '00:1a:81:00:1a:f4'' returned exit code '1', the output was 'arp: writing to routing socket: Cannot allocate memory'
    Oct 31 09:06:05 php-fpm 336 /rc.linkup: The command '/usr/sbin/arp -s '192.168.2.235' '00:0c:29:a8:72:2b'' returned exit code '1', the output was 'arp: writing to routing socket: Cannot allocate memory'
    Oct 31 09:06:05 php-fpm 336 /rc.linkup: The command '/usr/sbin/arp -s '192.168.2.234' '00:0c:29:23:82:78'' returned exit code '1', the output was 'arp: writing to routing socket: Cannot allocate memory'
    Oct 31 09:06:05 php-fpm 336 /rc.linkup: The command '/usr/sbin/arp -s '192.168.2.230' '00:0c:29:d0:17:c5'' returned exit code '1', the output was 'arp: writing to routing socket: Cannot allocate memory'


  • Netgate Administrator

    Is that a static DHCP lease defined on the firewall? If not, what is that device and where is it defined?

    Did you see those errors logged prior to the hardware event?

    Steve



  • Clients get their IP addresses via DHCP with static mappings. I began to see these errors after restarting pfSense. I have checked pfSense after restarts in the past and did not see these errors in the logs.

    0_1541007532153_Screenshot from 2018-10-31 21-29-22.png

    0_1541007540727_Screenshot from 2018-10-31 21-32-45.png

    0_1541007555419_Screenshot from 2018-10-31 21-34-23.png

    0_1541007564007_Screenshot from 2018-10-31 21-37-45.png



  • @emammadov said in Pfsense stops. Please help.:

    Cannot allocate memory'

    Time for a trip to the console. Here are several useful commands.



  • pfSense has now been running for 2 days without stopping. But I want to understand why I see these errors in the system logs.


  • Netgate Administrator

    So you're seeing that for all the static ARP entries then?

    Do you actually see them in the ARP table?

    Steve



  • Yes, I see all the static ARP entries, and they are also present in the ARP table. I replaced all the RAM with new modules. The network cards are new.

    I am attaching logs in .txt file.
    0_1541015861024_logs.txt
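To check the ARP table against the static mappings without comparing by eye, something like the following Python sketch could be used on saved `arp -an` output. It assumes the usual FreeBSD line layout, where static entries are flagged `permanent`. The sample table, the interface name `em0`, and the helper names are hypothetical and only illustrate the idea.

```python
import re

# One line of `arp -an`-style output, e.g.
# ? (192.168.2.240) at 00:1a:81:00:1a:f4 on em0 permanent [ethernet]
ARP_LINE = re.compile(
    r"\((?P<ip>[0-9.]+)\) at (?P<mac>[0-9a-f:]+) on (?P<iface>\S+) (?P<rest>.*)"
)


def permanent_entries(arp_output):
    """Map IP -> MAC for every entry flagged 'permanent' in the given text."""
    table = {}
    for line in arp_output.splitlines():
        match = ARP_LINE.search(line)
        if match and "permanent" in match.group("rest"):
            table[match.group("ip")] = match.group("mac")
    return table


def missing_static(expected, arp_output):
    """Return the expected (ip, mac) pairs that are not permanent in the table."""
    table = permanent_entries(arp_output)
    return [(ip, mac) for ip, mac in expected if table.get(ip) != mac.lower()]


# Hypothetical table: the first entry is static, the second is only dynamic.
SAMPLE_ARP = """\
? (192.168.2.240) at 00:1a:81:00:1a:f4 on em0 permanent [ethernet]
? (192.168.2.235) at 00:0c:29:a8:72:2b on em0 expires in 1187 seconds [ethernet]
"""

# Two of the IP/MAC pairs from the rc.linkup errors posted earlier.
EXPECTED = [
    ("192.168.2.240", "00:1a:81:00:1a:f4"),
    ("192.168.2.235", "00:0c:29:a8:72:2b"),
]

print(missing_static(EXPECTED, SAMPLE_ARP))
```

An empty result would mean every mapping made it into the table as a permanent entry despite the logged errors, which is what is being described here.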


  • Netgate Administrator

    Hmm, odd. Do you need those to be static ARP entries?

    Did you find any logging in the BIOS or iLO indicating what the hardware issue was?

    Steve