    100% CPU problem with pfSense 2.3

      epionier

      Hello,

      I am running pfSense 2.3 in an ESXi VM configured with a single core and plenty of RAM (4 GB). Everything works fine, but from time to time the CPU suddenly goes to 100% and stays there. In the ESXi manager I can see the pfSense VM sitting at 100% CPU for hours (at night, for example), and it never comes back down.

      While the CPU is at 100% I can no longer access the webConfigurator, and when I SSH to the machine it only prints the welcome line and nothing more (no menu). The same happens on the console. IPsec and Squid, for example, keep working; I am just unable to access pfSense for configuration and so on. The only way out is rebooting the machine, which takes a while. Afterwards the CPU sometimes hits 100% again immediately and sometimes it does not.

      Sometimes, when I was quick enough after the reboot, I restarted the webConfigurator and PHP from the console/SSH, and then it seemed to stay stable (for a while).

      The system logs in the webConfigurator show no errors or anything like that, and neither does the Dashboard.

      How do I find out what is causing this? When the CPU is at 100% I cannot SSH into the machine to run "top -aSH" and find the faulty process.

      Last night I assume the problem occurred after the PPPoE renewal, because afterwards the log did not show any Snort entries anymore.

      Here is the log from last night (newest on top, some information XXXed out):

      May 15 12:15:34 shutdown reboot by root:
      May 15 12:09:02 sshd 65760 Accepted keyboard-interactive/pam for admin from 10.21.32.1 port 53297 ssh2
      May 15 12:07:58 php-fpm 17736 /index.php: Successful login for user 'admin' from: 10.21.32.1
      May 15 03:31:35 kernel pppoe0: promiscuous mode enabled
      May 15 03:30:49 xinetd 11808 Reconfigured: new=0 old=1 dropped=0 (services)
      May 15 03:30:49 xinetd 11808 readjusting service 6969-udp
      May 15 03:30:49 xinetd 11808 Swapping defaults
      May 15 03:30:49 xinetd 11808 Starting reconfiguration
      May 15 03:30:48 SnortStartup 34071 Snort START for WAN(61399_pppoe0)…
      May 15 03:30:48 php-fpm 65608 /rc.start_packages: [lightsquid] Updating cronjobs…
      May 15 03:30:48 check_reload_status Syncing firewall
      May 15 03:30:47 check_reload_status Syncing firewall
      May 15 03:30:47 php-fpm 65608 /rc.start_packages: [lightsquid] Removing old cronjobs…
      May 15 03:30:47 php-fpm 65608 /rc.start_packages: [lightsquid] Successfully created '/usr/local/etc/lightsquid/lightsquid.cfg' configuration file.
      May 15 03:30:47 php-fpm 65608 /rc.start_packages: [lightsquid] Loaded default '/usr/local/etc/lightsquid/lightsquid.cfg.sample' configuration file.
      May 15 03:30:47 check_reload_status Reloading filter
      May 15 03:30:46 php-fpm 65608 /rc.start_packages: [squid] Starting a proxy monitor script
      May 15 03:30:46 php-fpm 65608 /rc.start_packages: [squid] Reloading for configuration sync…
      May 15 03:30:45 php-fpm 65608 /rc.start_packages: [squid] Stopping any running proxy monitors
      May 15 03:30:45 php-fpm 65608 /rc.start_packages: [squid] Reloading C-ICAP…
      May 15 03:30:45 php-fpm 65608 /rc.start_packages: [squid] Reloading ClamAV…
      May 15 03:30:45 php-fpm 65608 /rc.start_packages: Checked cron job for /usr/local/bin/freshclam --config-file=/usr/local/etc/freshclam.conf, no change needed
      May 15 03:30:45 php-fpm 65608 /rc.start_packages: [squid] Adding freshclam cronjob.
      May 15 03:30:45 php-fpm 65608 /rc.start_packages: Checked cron job for /usr/local/pkg/swapstate_check.php, no change needed
      May 15 03:30:45 php-fpm 65608 /rc.start_packages: Checked cron job for /usr/local/sbin/squid -k rotate -f /usr/local/etc/squid/squid.conf, no change needed
      May 15 03:30:45 php-fpm 65608 /rc.start_packages: [squid] Adding cronjobs …
      May 15 03:30:33 xinetd 11808 Reconfigured: new=0 old=1 dropped=0 (services)
      May 15 03:30:33 xinetd 11808 readjusting service 6969-udp
      May 15 03:30:33 xinetd 11808 Swapping defaults
      May 15 03:30:33 xinetd 11808 Starting reconfiguration
      May 15 03:30:32 php-fpm 65608 /rc.start_packages: [squid] - squid_resync function call pr:1 bp: rpc:no
      May 15 03:30:32 check_reload_status Reloading filter
      May 15 03:30:31 php-fpm 65608 /rc.start_packages: [squid] Starting a proxy monitor script
      May 15 03:30:31 php-fpm 65608 /rc.start_packages: [squid] Reloading for configuration sync…
      May 15 03:30:29 php-fpm 65608 /rc.start_packages: [squid] Stopping any running proxy monitors
      May 15 03:30:28 php-fpm 65608 /rc.start_packages: [squid] Reloading C-ICAP…
      May 15 03:30:28 php-fpm 65608 /rc.start_packages: [squid] Reloading ClamAV…
      May 15 03:30:28 php-fpm 65608 /rc.start_packages: Checked cron job for /usr/local/bin/freshclam --config-file=/usr/local/etc/freshclam.conf, no change needed
      May 15 03:30:28 php-fpm 65608 /rc.start_packages: [squid] Adding freshclam cronjob.
      May 15 03:30:28 php-fpm 65608 /rc.start_packages: Checked cron job for /usr/local/pkg/swapstate_check.php, no change needed
      May 15 03:30:28 php-fpm 65608 /rc.start_packages: Checked cron job for /usr/local/sbin/squid -k rotate -f /usr/local/etc/squid/squid.conf, no change needed
      May 15 03:30:28 php-fpm 65608 /rc.start_packages: [squid] Adding cronjobs …
      May 15 03:30:16 php-fpm 65608 /rc.start_packages: [squid] - squid_resync function call pr:1 bp: rpc:no
      May 15 03:30:16 php-fpm 97269 /rc.start_packages: Skipping STARTing packages process because previous/another instance is already running
      May 15 03:30:16 php-fpm 65608 /rc.start_packages: Restarting/Starting all packages.
      May 15 03:30:15 check_reload_status Starting packages
      May 15 03:30:15 php-fpm 60012 /rc.newwanip: pfSense package system has detected an IP change or dynamic WAN reconnection - 217.248.XXX.XXX -> 93.208.XXX.XXX - Restarting packages.
      May 15 03:30:15 xinetd 11808 Reconfigured: new=0 old=1 dropped=0 (services)
      May 15 03:30:15 xinetd 11808 readjusting service 6969-udp
      May 15 03:30:15 xinetd 11808 Swapping defaults
      May 15 03:30:15 xinetd 11808 Starting reconfiguration
      May 15 03:30:15 check_reload_status Starting packages
      May 15 03:30:15 php-fpm 65608 /rc.newwanip: pfSense package system has detected an IP change or dynamic WAN reconnection - -> 10.21.31.1 - Restarting packages.
      May 15 03:30:15 check_reload_status Reloading filter
      May 15 03:30:15 php-fpm 65608 /rc.newwanip: rc.newwanip: on (IP address: 10.21.31.1) (interface: []) (real interface: ovpns1).
      May 15 03:30:15 php-fpm 65608 /rc.newwanip: rc.newwanip: Info: starting on ovpns1.
      May 15 03:30:14 xinetd 11808 Reconfigured: new=0 old=1 dropped=0 (services)
      May 15 03:30:14 xinetd 11808 readjusting service 6969-udp
      May 15 03:30:14 xinetd 11808 Swapping defaults
      May 15 03:30:14 xinetd 11808 Starting reconfiguration
      May 15 03:30:13 check_reload_status rc.newwanip starting ovpns1
      May 15 03:30:12 kernel ovpns1: link state changed to UP
      May 15 03:30:12 php-fpm 60012 /rc.newwanip: Creating rrd update script
      May 15 03:30:12 check_reload_status Reloading filter
      May 15 03:30:12 kernel ovpns1: link state changed to DOWN
      May 15 03:30:12 php-fpm 60012 /rc.newwanip: Resyncing OpenVPN instances for interface WAN.
      May 15 03:30:12 check_reload_status Reloading filter
      May 15 03:30:12 php-fpm 60012 /rc.newwanip: Forcefully reloading IPsec
      May 15 03:30:11 php-fpm 60012 /rc.newwanip: phpDynDNS (): (Success) IP Address Updated Successfully!
      May 15 03:30:11 php-fpm 60012 /rc.newwanip: phpDynDNS: updating cache file /conf/dyndns_wancustom''0.cache: 93.208.XXX.XXX
      May 15 03:30:11 php-fpm 60012 /rc.newwanip: Message sent to XXX@XXX.de OK
      May 15 03:30:07 php-fpm 60012 /rc.newwanip: Removing static route for monitor 217.237.XXX.XXX and adding a new route through 87.186.XXX.XXX
      May 15 03:30:07 php-fpm 60012 /rc.newwanip: ROUTING: setting default route to 87.186.XXX.XXX
      May 15 03:30:07 xinetd 11808 Reconfigured: new=0 old=1 dropped=0 (services)
      May 15 03:30:07 xinetd 11808 readjusting service 6969-udp
      May 15 03:30:07 xinetd 11808 Swapping defaults
      May 15 03:30:07 xinetd 11808 Starting reconfiguration
      May 15 03:30:06 php-fpm 60012 /rc.newwanip: IP has changed, killing states on former IP 217.248.XXX.XXX.
      May 15 03:30:06 php-fpm 60012 /rc.newwanip: rc.newwanip: on (IP address: 93.208.XXX.XXX) (interface: WAN[wan]) (real interface: pppoe0).
      May 15 03:30:06 php-fpm 60012 /rc.newwanip: rc.newwanip: Info: starting on pppoe0.
      May 15 03:30:05 ppp [wan] IFACE: Rename interface ng0 to pppoe0
      May 15 03:30:05 ppp [wan] IFACE: Up event
      May 15 03:30:05 check_reload_status rc.newwanip starting pppoe0
      May 15 03:30:04 check_reload_status Rewriting resolv.conf
      May 15 03:30:04 ppp [wan] 93.208.XXX.XXX -> 87.186.XXX.XXX
      May 15 03:30:04 ppp [wan] IPCP: LayerUp
      May 15 03:30:04 ppp [wan] IPCP: state change Ack-Sent –> Opened
      May 15 03:30:04 ppp [wan] SECDNS 217.237.151.142
      May 15 03:30:04 ppp [wan] PRIDNS 217.237.150.188
      May 15 03:30:04 ppp [wan] IPADDR 93.208.XXX.XXX
      May 15 03:30:04 ppp [wan] IPCP: rec'd Configure Ack #3 (Ack-Sent)
      May 15 03:30:04 ppp [wan] SECDNS 217.237.151.142
      May 15 03:30:04 ppp [wan] PRIDNS 217.237.150.188
      May 15 03:30:04 ppp [wan] IPADDR 93.208.XXX.XXX
      May 15 03:30:04 ppp [wan] IPCP: SendConfigReq #3
      May 15 03:30:04 ppp [wan] SECDNS 217.237.151.142
      May 15 03:30:04 ppp [wan] PRIDNS 217.237.150.188
      May 15 03:30:04 ppp [wan] 93.208.XXX.XXX is OK
      May 15 03:30:04 ppp [wan] IPADDR 93.208.XXX.XXX
      May 15 03:30:04 ppp [wan] IPCP: rec'd Configure Nak #2 (Ack-Sent)
      May 15 03:30:04 ppp [wan] IPV6CP: LayerFinish
      May 15 03:30:04 ppp [wan] IPV6CP: state change Req-Sent –> Stopped
      May 15 03:30:04 ppp [wan] IPV6CP: protocol was rejected by peer
      May 15 03:30:04 ppp [wan_link0] LCP: protocol IPV6CP was rejected
      May 15 03:30:04 ppp [wan_link0] LCP: rec'd Protocol Reject #74 (Opened)
      May 15 03:30:04 ppp [wan] SECDNS 0.0.0.0
      May 15 03:30:04 ppp [wan] PRIDNS 0.0.0.0
      May 15 03:30:04 ppp [wan] IPADDR 0.0.0.0
      May 15 03:30:04 ppp [wan] IPCP: SendConfigReq #2
      May 15 03:30:04 ppp [wan] COMPPROTO VJCOMP, 16 comp. channels, no comp-cid
      May 15 03:30:04 ppp [wan] IPCP: rec'd Configure Reject #1 (Ack-Sent)
      May 15 03:30:04 ppp [wan] IPCP: state change Req-Sent –> Ack-Sent
      May 15 03:30:04 ppp [wan] IPADDR 87.186.XXX.XXX
      May 15 03:30:04 ppp [wan] IPCP: SendConfigAck #28
      May 15 03:30:04 ppp [wan] 87.186.XXX.XXX is OK
      May 15 03:30:04 ppp [wan] IPADDR 87.186.XXX.XXX
      May 15 03:30:04 ppp [wan] IPCP: rec'd Configure Request #28 (Req-Sent)
      May 15 03:30:04 ppp [wan] IPV6CP: SendConfigReq #1
      May 15 03:30:04 ppp [wan] IPV6CP: state change Starting –> Req-Sent
      May 15 03:30:04 ppp [wan] IPV6CP: Up event
      May 15 03:30:04 ppp [wan] SECDNS 0.0.0.0
      May 15 03:30:04 ppp [wan] PRIDNS 0.0.0.0
      May 15 03:30:04 ppp [wan] COMPPROTO VJCOMP, 16 comp. channels, no comp-cid
      May 15 03:30:04 ppp [wan] IPADDR 0.0.0.0
      May 15 03:30:04 ppp [wan] IPCP: SendConfigReq #1
      May 15 03:30:04 ppp [wan] IPCP: state change Starting –> Req-Sent
      May 15 03:30:04 ppp [wan] IPCP: Up event
      May 15 03:30:04 ppp [wan] IPV6CP: LayerStart
      May 15 03:30:04 ppp [wan] IPV6CP: state change Initial –> Starting
      May 15 03:30:04 ppp [wan] IPV6CP: Open event
      May 15 03:30:04 ppp [wan] IPCP: LayerStart
      May 15 03:30:04 ppp [wan] IPCP: state change Initial –> Starting
      May 15 03:30:04 ppp [wan] IPCP: Open event
      May 15 03:30:04 ppp [wan] Bundle: Status update: up 1 link, total bandwidth 64000 bps
      May 15 03:30:04 ppp [wan_link0] Link: Join bundle "wan"
      May 15 03:30:04 ppp [wan_link0] Link: Matched action 'bundle "wan" ""'
      May 15 03:30:04 ppp [wan_link0] LCP: authorization successful
      May 15 03:30:04 ppp [wan_link0] PAP: rec'd ACK #1 len: 5
      May 15 03:30:04 ppp [wan_link0] LCP: LayerUp
      May 15 03:30:04 ppp [wan_link0] PAP: sending REQUEST #1 len: 60
      May 15 03:30:04 ppp [wan_link0] PAP: using authname "XXXXXXXXX@t-online-com.de"
      May 15 03:30:04 ppp [wan_link0] LCP: auth: peer wants PAP, I want nothing
      May 15 03:30:04 ppp [wan_link0] LCP: state change Ack-Sent –> Opened
      May 15 03:30:04 ppp [wan_link0] MAGICNUM 0xeb21b71a
      May 15 03:30:04 ppp [wan_link0] MRU 1492
      May 15 03:30:04 ppp [wan_link0] LCP: rec'd Configure Ack #2 (Ack-Sent)
      May 15 03:30:04 ppp [wan_link0] MAGICNUM 0xeb21b71a
      May 15 03:30:04 ppp [wan_link0] MRU 1492
      May 15 03:30:04 ppp [wan_link0] LCP: SendConfigReq #2
      May 15 03:30:04 ppp [wan_link0] PROTOCOMP
      May 15 03:30:04 ppp [wan_link0] LCP: rec'd Configure Reject #1 (Ack-Sent)
      May 15 03:30:04 ppp [wan_link0] LCP: state change Req-Sent –> Ack-Sent
      May 15 03:30:04 ppp [wan_link0] MAGICNUM 0x4b0a8b6e
      May 15 03:30:04 ppp [wan_link0] AUTHPROTO PAP
      May 15 03:30:04 ppp [wan_link0] MRU 1492
      May 15 03:30:04 ppp [wan_link0] LCP: SendConfigAck #73
      May 15 03:30:04 ppp [wan_link0] MAGICNUM 0x4b0a8b6e
      May 15 03:30:04 ppp [wan_link0] AUTHPROTO PAP
      May 15 03:30:04 ppp [wan_link0] MRU 1492
      May 15 03:30:04 ppp [wan_link0] LCP: rec'd Configure Request #73 (Req-Sent)
      May 15 03:30:04 ppp [wan_link0] MAGICNUM 0xeb21b71a
      May 15 03:30:04 ppp [wan_link0] MRU 1492
      May 15 03:30:04 ppp [wan_link0] PROTOCOMP
      May 15 03:30:04 ppp [wan_link0] LCP: SendConfigReq #1
      May 15 03:30:04 ppp [wan_link0] LCP: state change Starting –> Req-Sent
      May 15 03:30:04 ppp [wan_link0] LCP: Up event
      May 15 03:30:04 ppp [wan_link0] Link: UP event
      May 15 03:30:04 ppp [wan_link0] PPPoE: connection successful
      May 15 03:30:04 ppp PPPoE: rec'd ACNAME "XXXR73-se800-B222E170705301"
      May 15 03:30:04 ppp [wan_link0] PPPoE: Connecting to ''
      May 15 03:30:04 ppp [wan_link0] LCP: LayerStart
      May 15 03:30:04 ppp [wan_link0] LCP: state change Initial –> Starting
      May 15 03:30:04 ppp [wan_link0] LCP: Open event
      May 15 03:30:04 kernel ng0: changing name to 'pppoe0'
      May 15 03:30:04 ppp [wan_link0] Link: OPEN event
      May 15 03:30:04 ppp [wan] Bundle: Interface ng0 created
      May 15 03:30:04 ppp web: web is not running
      May 15 03:30:03 ppp process 19511 terminated
      May 15 03:30:03 ppp [wan_link0] Link: Shutdown
      May 15 03:30:03 kernel pppoe0: promiscuous mode disabled
      May 15 03:30:03 ppp [wan] Bundle: Shutdown
      May 15 03:30:03 ppp [wan_link0] LCP: LayerFinish
      May 15 03:30:03 ppp [wan_link0] LCP: state change Stopping –> Stopped
      May 15 03:30:03 ppp waiting for process 19511 to die...
      May 15 03:30:02 ppp waiting for process 19511 to die...
      May 15 03:30:01 ppp [wan_link0] LCP: LayerDown
      May 15 03:30:01 ppp [wan_link0] LCP: SendTerminateAck #3
      May 15 03:30:01 ppp [wan] IPV6CP: state change Closed –> Initial
      May 15 03:30:01 ppp [wan] IPV6CP: Down event
      May 15 03:30:01 ppp [wan] IPCP: state change Closing –> Initial
      May 15 03:30:01 ppp [wan] Bundle: No NCPs left. Closing links…
      May 15 03:30:01 ppp [wan] IPCP: LayerFinish
      May 15 03:30:01 ppp [wan] IPCP: Down event
      May 15 03:30:01 ppp [wan] IPV6CP: Close event
      May 15 03:30:01 ppp [wan] IPCP: Close event
      May 15 03:30:01 ppp [wan] Bundle: Status update: up 0 links, total bandwidth 9600 bps
      May 15 03:30:01 ppp [wan_link0] Link: Leave bundle "wan"
      May 15 03:30:01 ppp [wan_link0] LCP: state change Opened –> Stopping
      May 15 03:30:01 ppp [wan_link0] LCP: rec'd Terminate Request #180 (Opened)
      May 15 03:30:01 ppp [wan] IPV6CP: state change Stopped –> Closed
      May 15 03:30:01 ppp [wan] IPV6CP: Close event
      May 15 03:30:01 ppp [wan] IFACE: Rename interface pppoe0 to pppoe0
      May 15 03:30:01 ppp [wan] IFACE: Down event
      May 15 03:30:01 ppp [wan] IFACE: Removing IPv4 address from pppoe0 failed(IGNORING for now. This should be only for PPPoE friendly!): Can't assign requested address
      May 15 03:30:01 check_reload_status Rewriting resolv.conf
      May 15 03:30:01 ppp [wan] IPCP: LayerDown
      May 15 03:30:01 ppp [wan] IPCP: SendTerminateReq #4
      May 15 03:30:01 ppp [wan] IPCP: state change Opened –> Closing
      May 15 03:30:01 ppp [wan] IPCP: Close event
      May 15 03:30:01 ppp [wan] IFACE: Close event
      May 15 03:30:01 ppp caught fatal signal TERM
      May 15 03:30:01 ppp waiting for process 19511 to die…
      May 15 03:30:01 ppp process 57918 started, version 5.8 (root@pfSense_v2_3_0_amd64-pfSense_v2_3_0-job-14 22:52 6-Apr-2016)
      May 15 03:30:01 ppp Multi-link PPP daemon for FreeBSD
      May 15 03:30:00 check_reload_status Configuring interface wan
      May 15 03:30:00 pppoe0 PPPoE periodic reset executed on wan
      May 15 03:24:32 snort 16007 [1:2008578:6] ET SCAN Sipvicious Scan [Classification: Attempted Information Leak] [Priority: 2] {UDP} 62.138.0.118:5277 -> 217.248.XXX.XXX:5060
      May 15 03:24:32 snort 16007 [1:2011716:4] ET SCAN Sipvicious User-Agent Detected (friendly-scanner) [Classification: Attempted Information Leak] [Priority: 2] {UDP} 62.138.0.118:5277 -> 217.248.XXX.XXX:5060
      May 15 03:19:21 snort 16007 [1:2008578:6] ET SCAN Sipvicious Scan [Classification: Attempted Information Leak] [Priority: 2] {UDP} 85.114.135.163:5084 -> 217.248.XXX.XXX:5060
      May 15 03:19:21 snort 16007 [1:2011716:4] ET SCAN Sipvicious User-Agent Detected (friendly-scanner) [Classification: Attempted Information Leak] [Priority: 2] {UDP} 85.114.135.163:5084 -> 217.248.XXX.XXX:5060

        killmasta93

            What packages are you running? I see you're running Squid; remember, for the local cache, don't assign more RAM than you actually have.

        Tutorials:

        https://www.mediafire.com/folder/v329emaz1e9ih/Tutorials

          epionier

            Thank you for your reply. I am essentially running Squid, Lightsquid and Snort.

            My local hard disk cache is 6 GB, the memory cache ("files stored in RAM") is 256 MB, and system memory is 4 GB, of which usually only ~30-40% is in use. So this should not be the problem, and I am aware that the RAM Squid uses for cached objects is not the only RAM Squid uses.
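
            For reference, those GUI values should roughly correspond to directives like the following in the squid.conf the package generates; the cache path and the L1/L2 directory counts below are assumptions for illustration, not copied from my box:

                # memory cache ("files stored in RAM" in the GUI)
                cache_mem 256 MB
                # on-disk cache: storage type, path, size in MB, L1/L2 directory counts
                cache_dir ufs /var/squid/cache 6144 16 256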

          Are there more detailed system logs that I can access via SSH?

            killmasta93

              Over SSH, can you run:

            top -aSH
            


              cmb

                Which process is responsible for the CPU usage? You can run top at the VM's console if SSH isn't working at the time. We need to know that before we can suggest anything further.

                epionier

                  The problem is that neither the console nor SSH works properly in this case, so I am unable to run "top -aSH". The console still displays new messages such as a login (although the SSH login stops at the welcome line), but it does not accept any menu selection (e.g. shell access).
                  That is exactly the point: I don't know how to solve the problem ;)

                  cmb

                    Maybe add a second core to your VM so it won't be completely stuck. That might not help, but it's worth a shot at least.

                  Or just leave 'top -SH' running in the VM console, then it'll be there to review when it dies. Even if it's so hung up that it can't update top, you'll at least have the last update it got which will likely show the culprit.
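
                    If the console ends up too wedged even for that, another option is to append periodic top snapshots to a file so the last one survives the hang. A rough sketch, assuming FreeBSD's top and a writable /var/log (the log file name is made up):

                        while true; do
                            date >> /var/log/topwatch.log
                            # -b = batch (non-interactive) output, -d 1 = print a single display
                            top -aSHb -d 1 >> /var/log/topwatch.log
                            sleep 30
                        done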

                    epionier

                    @cmb

                      That is the conclusion I came to as well, thanks. I will try that and post updates when the CPU problem occurs again.

                      One more question, by the way. I also disabled the scheduled PPPoE reconnection for testing purposes, because I believe restarting all the services is what triggers the CPU problem. I had a reconnect (forced by the provider) about an hour ago, at 13:30.

                      I noticed in the system logs that Snort did not reload automatically for the "new" PPPoE connection.

                      I checked the (automatically created) cron entries related to Snort:

                    */5 * * * * root /usr/bin/nice -n20 /usr/local/bin/php -f /usr/local/pkg/snort/snort_check_cron_misc.inc
                    0 6,18 * * * root /usr/bin/nice -n20 /usr/local/bin/php -f /usr/local/pkg/snort/snort_check_for_rule_updates.php
                    2 */2 * * * root /usr/bin/nice -n20 /sbin/pfctl -q -t snort2c -T expire 86400

                      In the first system log I posted above, Snort was restarted for the new PPPoE connection at the end of the process (although it didn't seem to work, because there were no new Snort entries in the system log afterwards):

                    May 15 03:30:48  SnortStartup  34071  Snort START for WAN(61399_pppoe0)…

                      Is this a bug (in rc.newwanip or rc.start_packages), or a misconfiguration on my side? I cannot find an appropriate setting under the Snort service settings.

                      The Snort service is still shown as running under Status->Services and under Services->Snort, but I think only for the "old" PPPoE connection, which is no longer valid.

                      epionier

                        I think I have narrowed the problem down to Snort. As stated above, Snort was not working after the PPPoE refresh, although the service appeared to be running. I waited a couple of hours, but Snort did not reactivate for the "new" PPPoE connection, so I restarted the service manually. As soon as I did, the CPU went to 100% and the problem with the unresponsive SSH/console was back. Unfortunately I had not started "top -aSH" beforehand to verify this, which I should have done.

                        bmeeks

                          @epionier:
                          How large is the system disk partition where /tmp and /var/log are located? Is one or more of these on a RAM disk? If so, Snort will give you fits, because a RAM disk can almost never be large enough, and putting Snort's "disk space" there also limits the working memory Snort has for loading rules. Snort needs a lot of free disk space in /tmp and /var/log. Lots, as in at least 256 MB free in /tmp and 1 GB or more free in /var/log (preferably a lot more, to hold the Snort log files).
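
                          A quick way to check this from a shell, using standard FreeBSD tools (the Snort log path below is the one the package normally uses; adjust it if yours differs):

                              df -h / /tmp /var/log     # free space on the filesystems Snort writes to
                              du -sh /var/log/snort     # total size of the Snort log directory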

                        You gave some information in one of your earlier posts, but I'm not clear if that is for all packages or just Squid.

                        Bill

                          epionier

                          @bmeeks

                            The SSD disk partition is 15 GB with 18% in use. RAM disks are not used for either /tmp or /var.
                            Here are the current stats of my pfSense while it is having a good life ;) :

                          MBUF Usage  4%  1016/26584
                          Load average  0.12, 0.07, 0.07
                          CPU usage  6%
                          Memory usage  23% of 3037 MiB
                          SWAP usage  0% of 4095 MiB
                          Disk usage ( / )  18% of 15GiB - ufs
                          Disk usage ( /var/run )  4% of 3.4MiB - ufs in RAM

                            I reduced the RAM to 3 GB to test something, but as I found out, RAM is not the issue. The issue mostly shows up when pfSense reboots, when the WAN IP changes, or when I reactivate the Snort service (because it is not "really" active).

                            A backup of the pfSense VM is made every workday, like last night. The VM is powered off beforehand (Open-VM-Tools is installed) and restarted after the backup. When I checked last night after the backup, I noticed that the CPU had hit 100% again. So I rebooted via the VMware ESXi manager (connected via IPsec), quickly started "top -aSH" in the shell, and the CPU went straight to 100% again.

                            I noticed that there were two identical Snort processes running, which is the first thing I cannot explain. The main CPU usage also alternated between the two Snort processes and the processes "currentipsecpinghosts" and "netstat". I took a picture of that: http://fs5.directupload.net/images/160517/zyfl9k8q.jpg

                            As I said, sometimes the two Snort processes used most of the CPU, then it shifted to the other processes mentioned and back again, but the CPU was always at 100%.

                            bmeeks

                              Snort can sometimes get double-started by the firewall system processes during boot-up or when the WAN IP address changes. I put some code into the /usr/local/etc/rc.d/snort.sh shell script to try to prevent that, but it still sometimes happens. On an IP change (which can occur during boot, as the interface comes up and then gets, say, a DHCP or PPPoE address), the system issues a few "restart all packages" commands in quick succession. You can see this if you examine the system log. It takes Snort quite a while to start up, and those multiple "restart all packages" commands sometimes result in duplicate Snort processes getting started.
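
                              For illustration, the guard amounts to a check along these lines before launching another daemon. This is only a sketch of the pattern, not the actual snort.sh code; the PID file path and the snort arguments are made up:

                                  #!/bin/sh
                                  # Illustrative duplicate-start guard (NOT the real snort.sh)
                                  PIDFILE=/var/run/snort_pppoe0.pid   # hypothetical per-interface PID file

                                  if [ -f "$PIDFILE" ] && kill -0 "$(cat "$PIDFILE")" 2>/dev/null; then
                                      # a Snort instance for this interface is already alive, so do nothing
                                      exit 0
                                  fi

                                  /usr/local/bin/snort -D -i pppoe0 -c /usr/local/etc/snort/snort.conf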

                            This does not always happen, and for some users it almost never happens, but others see it frequently.  On my personal firewall, it happens very rarely if my cable connection bounces several times in quick succession and my WAN IP gets toggled between the modem's private address space and the actual public IP assigned by my cable provider.

                            Bill

                              epionier

                              @bmeeks

                                Thank you for the explanation. A few days have passed since then, and these are my observations:

                                1. I generally run into the 100% CPU problem when pfSense is shut down and restarted a couple of minutes later. I changed my backup method (pfSense is a VM) so that it backs up from a snapshot instead of powering the VM off beforehand and restarting it after the backup. But that is just a workaround, not a permanent solution in my opinion. I strongly believe the CPU problem is mainly caused by Snort (see 3.).

                                2. The second problem remains: Snort is not properly reactivated when the WAN IP changes. For example, I just looked at the system logs; the WAN IP changed 3 hours ago because of the provider's 24-hour disconnection. Since then there has been no Snort alert in the system logs (they usually appear roughly every 5 minutes). The script /rc.newwanip did not restart Snort at all, there is no Snort entry in the system logs, but the ("old") Snort process is still running according to top.

                                3. After Snort did not reactivate, I manually restarted it via Services->Snort->Reload, and the CPU went to 100% again. I noticed that a second Snort process started; one of them used almost 100% CPU while the other was at 0%. After a minute this changed: two "sh" processes took almost 80% of the CPU, with the remaining 20% going to Snort. A few minutes later one "sh" process was at 0%, but a "cat /tmp/tmpHOSTS" process was using 70% (20% Snort, 10% the remaining sh process), and a couple of minutes after that the second sh process was at 0% too and a "sleep 55" process was using 70% CPU (20% Snort, 10% cat). The CPU usage then kept rotating between cat, sleep and Snort, and so on. Sometimes a couple of "/usr/local/bin/php -f /usr/local/pkg/snort/snort_check_cron_misc.inc" processes were active too (>10% in total). The CPU never drops, but when I kill the Snort process by its PID (kill -9 PID) the CPU goes back down to 0% (a quick way to spot and kill the duplicate is sketched below).

                                Maybe this information helps improve the scripts.
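
                                In case it is useful, this is roughly how I spot and remove the duplicate from the shell, with standard tools only (substitute the actual PID):

                                    ps auxww | grep '[s]nort'   # list running Snort processes with their PIDs
                                    kill <PID>                  # ask the duplicate to exit cleanly first
                                    kill -9 <PID>               # only if it ignores the normal TERM signal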

                                Perforado

                                  I suspect the cron jobs aren't checking whether they're already running, maybe?

                                  I had a similar problem with the pfBlockerNG cron jobs.
                                  They spawned every hour for updates, and with 1M hosts selected in the "Filter with Alexa" sub-config, that just wasn't something an Atom D525 could handle in 60 minutes or less.
                                  So an hour later another one spawned, and so on …

                                  My first firewall ever with 2.5 GB swapped :o
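
                                  One way to prevent that on FreeBSD is to wrap the cron command in lockf(1), so a new run simply skips if the previous one is still holding the lock. The lock file name and the <update command> placeholder below are only illustrative:

                                      # -s = silent, -t 0 = give up immediately instead of waiting for the lock
                                      0 * * * *  root  /usr/bin/lockf -s -t 0 /tmp/pfb_update.lock <update command>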

                                  epionier

                                    I don't believe so, because the process tied to the WAN IP reconnect is a script that runs when the IP changes (and also when the VM starts). I believe the problem lies in that script, or in the (startup/reload) script it invokes (maybe in conjunction with the Snort process).

                                    I listed my cron jobs that are directly related to Snort and snort2c here: https://forum.pfsense.org/index.php?topic=111883.msg623353#msg623353. Perhaps you could compare them with yours to see whether they are identical, since they were set up automatically.

                                    phil123456

                                      Hello,

                                      I just installed Snort today on pfSense 2.3.

                                      The CPU is up at 100%.

                                      I cannot get a shell in the terminal window.

                                      Is there any fix yet?

                                      phil123456

                                        OK, I added a core and gave it 2 GB of RAM instead of 512 MB, and now it seems to work fine.

                                        Jeez, Snort is such a resource hog.

                                        bmeeks

                                          @phil123456:

                                        Yes, all IDS/IPS systems are resource hogs because of what they have to do.  If you start to run a full Snort or Suricata rule set, you may find even 2 GB of RAM can get a bit tight.  4 GB is a good RAM number for either Snort or Suricata in my view.  I suggest at least 2 cores for CPU, and 4 is even better.

                                        Bill
