Netgate Discussion Forum
    • Categories
    • Recent
    • Tags
    • Popular
    • Users
    • Search
    • Register
    • Login

    Suricata 2.0.3 Package Preview

    Scheduled Pinned Locked Moved pfSense Packages
    121 Posts 17 Posters 35.8k Views
    Loading More Posts
    • Oldest to Newest
    • Newest to Oldest
    • Most Votes
    Reply
    • Reply as topic
    Log in to reply
    This topic has been deleted. Only users with topic management privileges can see it.
    • B
      binaryjay
      last edited by

      @bmeeks:

      @binaryjay:

      Another thing I'm wondering about, sometimes I find my Suricata dead in the morning and this is in the logs:

      
      1/10/2014 -- 02:17:07 - <info>-- cleaning up signature grouping structure... complete
      1/10/2014 -- 02:17:08 - <notice>-- Stats for 'bce1':  pkts: 0, drop: 0 (nan%), invalid chksum: 0
      1/10/2014 -- 02:17:08 - <error>-- [ERRCODE: SC_ERR_DAEMON(87)] - Child died unexpectedly</error></notice></info> 
      

      As part of testing for an experimental feature in Snort, I've found what I think is the root cause of these random errors where the process dies.  Snort has the same issue.  Unfortunately the problem is not really within the two IDS packages themselves.  It is caused by the way pfSense handles certain events that involve installed packages.  It's a bit technical and complicated to explain, but there a number of processes within pfSense that can kick of a procedure to "start/restart all packages".  While debugging/testing my theory with Snort recently I saw the Snort package get sent three separate "start" commands within a 7-second window.  I had a special shell script that was logging the exact time it was called and with what argument.

      It takes Snort much longer than 7 seconds to startup.  So when you get multiple start commands, the shell script can wind up launching multiple processes.  Two of those "start" commands happened only 2 seconds apart!  I have made several attempts to workaround this in both the Snort and Suricata packages.  The developers who worked on Snort before me apparently also tried some tricks because their code is still in there.  Nothing works 100% when you have your startup script called repeatedly before you can even get started from the first call.

      This is how you sometimes wind up with duplicate Suricata or Snort processes.

      Bill

      I'm actually a professional developer, granted I am of the type that spends my days doing corporate stuff for (gasp) Windows so I am still a novice when it comes to BSD in particular but can one not export a global variable right at the start of the script that will be set to indicate suricata is starting which subsequent calls to the script can do a sanity check effectively putting in place a lock on the startup?  I will not be shocked if this kind of scope is disallowed, but of course there are other options like creating a temporary file to use as a lock immediately when the script is invoked.

      I assume that the issue is that you can't even get a process id early enough to lock based on finding a pid?

      Anyhow I have not thought this through and it is so obvious that it must have already been attempted lol.  If I remember right there is another pfSense package that will continually check if services are started or not and relaunch if they die so that I guess is a good enough workaround.

      1 Reply Last reply Reply Quote 0
      • bmeeksB
        bmeeks
        last edited by

        The problem is the amount of time between when Suricata or Snort get the initial "start" signal and when the PID file is created.  So you have a window of time where the process is kicking off but not fully running.  I have tried various lock file tricks, and they work in some cases but not all.

        Both binaries offer a type of "warm restart" where you can send a running process a signal.  The current Snort and Suricata packages attempt to use this so that if a PID file exists, and another "start" command is received, the PID file is used to signal the running process with the "warm restart" instead of "start".  I've discovered that if a running process receives a "warm restart" command too early in its startup phase, it will sometimes crash.  I think that is what is happening to folks posting here with randomly crashing processes.  Compounding this problem is the fact that package startup times for Suricata and Snort are different on different systems due to complexity and number of enabled rules and the CPU horsepower.  So what is 15 seconds on one box is 90 seconds on some others.

        In defense of pfSense, it is trying to make sure running packages are immediately aware of big configuration changes like new WAN IP addresses, newly started VPNs, etc.  The way used to tell packages about these changes today is to simply restart them.  During a reboot is when most of these rapid-fire "start" commands are issued as the various init scripts run in succession.

        I am certainly open to ideas.

        Bill

        1 Reply Last reply Reply Quote 0
        • ?
          A Former User
          last edited by

          How about fundamentally changing the way services are started? Instead of using a lock file to prevent a service restarting, use a lock file to indicate that the service is running as expected. The pfsense startup scripts could check for the existence of this file before attempting to start up snort/suricata. Just an idea.

          1 Reply Last reply Reply Quote 0
          • B
            binaryjay
            last edited by

            @jflsakfja:

            How about fundamentally changing the way services are started? Instead of using a lock file to prevent a service restarting, use a lock file to indicate that the service is running as expected. The pfsense startup scripts could check for the existence of this file before attempting to start up snort/suricata. Just an idea.

            I think you just paraphrased the same concept.

            1 Reply Last reply Reply Quote 0
            • ?
              A Former User
              last edited by

              Lock file: usually created when the process is starting up. Usually removed when the process ends.

              Special file: created after the process is up and running. Removed only after a successful manual shutdown command.

              Yes, as you said slightly different variations of the same concept, it's just the file's creation that matters. As far as actually working, that's not for me to decide, since I've always hated programming for a simple reason: Computers do what I tell them, not what I meant.

              1 Reply Last reply Reply Quote 0
              • bmeeksB
                bmeeks
                last edited by

                Without getting down into all the technical details, I have tried a few variations of "lock files" and "special files".

                Here is a brief history of how I got back into this problem.  Due to popular demand, I was altering the Snort package so that each configured interface would show up in pfSense as a distinct service.  So if you ran Snort (and Barnyard2) on both the WAN and LAN, then in the SERVICES applet you would see four services for Snort listed:  one each for the WAN and LAN labeled as "snort_wan" and "snort_lan"; plus matching "barnyard2_wan" and "barnyard2_lan" entries.  Since these would be showing up as individual services, they could be monitored by something like the Service Watchdog package.

                I added the new code and everything was working great on my testing VMs.  I recruited some of my private testers and let them try.  Low and behold they started getting lots of duplicate processes.  After trying a bunch of different approaches, nothing seemed to really work reliably.  Since 50% of my testers had the problem of duplicate processes, I don't think this idea is ready for prime time.  Investigating with my testers we found that whether or not you got the duplicate processes on reboot was controlled to some degree by the order in which other packages were installed.  If you installed Snort first and then after it the Service Watchdog package, things usually worked OK.  However, there was an annoyance from the Service Watchdog package because it runs every 60 seconds looking at the services.  It immediately tries to restart any service that is down.  So during the nightly rules update, it can attempt to restart Snort during the rules update as Snort is restarting itself.  There is no facility within the Service Watchdog package to tell it to "stand down" for a while.  At least not a way a package like Snort can use.  So you get e-mail and log spam from Service Watchdog while Snort is restarting.  If you have a minimal Snort setup with few active rules and an i7 3.3 GHz CPU, then you may get nothing from Service Watchdog because Snort can stop and restart quickly.  However, if you have thousands of active rules and/or a weaker CPU like an Atom, then Snort takes a while to restart and Service Watchdog will spew restart commands every 60 seconds.

                Service Watchdog is not the only issue.  The init scripts of pfSense also issue multiple package start/restart commands during a reboot.  You get the expected "package start" command as part of the initial boot up script, but then you get additional "package restart" commands from the WAN IP and NEW WAN IP scripts as the interfaces come up.  So if you have IPv6 and IPv4 configured, for example, that's three package "start" commands within 7 seconds.  One from the initial boot script, then one from NEW IPv4 WAN IP, and then another from NEW IPv6 WAN IP.  These can be followed by more "package restart" commands for things like VPNs, etc.  In my humble opinion, the boot up process should only issue a single package start command as like the very last thing it does.  That means after all the WAN IPs have been assigned and any VPNs have been started, etc.

                I have not given up, but for the moment I'm scratching my head looking for a viable solution that works in all cases.

                Bill

                1 Reply Last reply Reply Quote 0
                • bmeeksB
                  bmeeks
                  last edited by

                  @jflsakfja:

                  Hey Bill,
                  As far as I can tell, the bug with the logs not getting properly removed happens with the packet captures as well. Dunno if it happens with other logs, those are the first that went over the limits.

                  Clarification and Update on this issue:

                  There are configuration parameters on the INTERFACE SETTINGS tab where packet capture files are enabled that allow you specify the size limit in megabytes for each pcap file and the total number of pcap files to retain in the log directory.  The defaults are 32 MB for file size and 1000 pcap files maintained.  Once the number of pcap files to retain value is exceeded, the oldest file overwritten.

                  With that said, there is also a type of "fail safe" feature associated with both Snort and Suricata that lets you set an overall limit in MB on the amount of disk space in /var/log that will be consumed by ALL of the Suricata (or Snort, in that package) log files.  It's this latter fail-safe code that was neglecting to remove the pcap files when the log directory size limit was exceeded.  That will be fixed in the next update of Suricata.

                  Bill

                  1 Reply Last reply Reply Quote 0
                  • ?
                    A Former User
                    last edited by

                    @bmeeks:

                    Clarification and Update on this issue:

                    There are configuration parameters on the INTERFACE SETTINGS tab where packet capture files are enabled that allow you specify the size limit in megabytes for each pcap file and the total number of pcap files to retain in the log directory.  The defaults are 32 MB for file size and 1000 pcap files maintained.  Once the number of pcap files to retain value is exceeded, the oldest file overwritten.

                    With that said, there is also a type of "fail safe" feature associated with both Snort and Suricata that lets you set an overall limit in MB on the amount of disk space in /var/log that will be consumed by ALL of the Suricata (or Snort, in that package) log files.  It's this latter fail-safe code that was neglecting to remove the pcap files when the log directory size limit was exceeded.  That will be fixed in the next update of Suricata.

                    Bill

                    No no, I meant that the log rotation ignores the limits set for the packet logs. If setting a limit of 100MB, and a log number of 100, logs get cut off at 100MB as expected, and a new one is created until the size is reached, but the number of logs increases to 160, for example.

                    I identified the bug when the packet logs started filling up all the space that was assigned to suricata's log files. For some strange reason the blocked tab would cut off at a random number of hosts, and suricata would constantly complain that the log directory is full. Did an ls on the directory, which showed that packet logs weren't properly getting rotated, which led to the directory becoming full, and suricata freaking out and cutting off the alerts/blocks/other logs, but keeping the packet logs intact.

                    1 Reply Last reply Reply Quote 0
                    • bmeeksB
                      bmeeks
                      last edited by

                      @jflsakfja:

                      No no, I meant that the log rotation ignores the limits set for the packet logs. If setting a limit of 100MB, and a log number of 100, logs get cut off at 100MB as expected, and a new one is created until the size is reached, but the number of logs increases to 160, for example.

                      I identified the bug when the packet logs started filling up all the space that was assigned to suricata's log files. For some strange reason the blocked tab would cut off at a random number of hosts, and suricata would constantly complain that the log directory is full. Did an ls on the directory, which showed that packet logs weren't properly getting rotated, which led to the directory becoming full, and suricata freaking out and cutting off the alerts/blocks/other logs, but keeping the packet logs intact.

                      Oh…that sounds like perhaps a bug in the Suricata binary.  The limits set on the INTERFACE SETTINGS tab (where the pcap feature is turned on or off) are all handled by built-in parts of the binary.  Only settings on the LOG MGMT tab are controlled by PHP code I wrote.  I will test the binary rotation in a VM by setting the number very low (like 1 file) to see what happens.

                      Bill

                      1 Reply Last reply Reply Quote 0
                      • bmeeksB
                        bmeeks
                        last edited by

                        @jflsakfja:

                        No no, I meant that the log rotation ignores the limits set for the packet logs. If setting a limit of 100MB, and a log number of 100, logs get cut off at 100MB as expected, and a new one is created until the size is reached, but the number of logs increases to 160, for example.

                        I identified the bug when the packet logs started filling up all the space that was assigned to suricata's log files. For some strange reason the blocked tab would cut off at a random number of hosts, and suricata would constantly complain that the log directory is full. Did an ls on the directory, which showed that packet logs weren't properly getting rotated, which led to the directory becoming full, and suricata freaking out and cutting off the alerts/blocks/other logs, but keeping the packet logs intact.

                        Looked into the binary source code and this is not really a bug.  It's more a case of a bit misleading feature.  Suricata keeps track of the number of pcap files it creates during a given session and correctly rotates them.  However, the key here is "a given session".  It does not maintain state through startup/shutdown cycles.  That means as far as Suricata is concerned, when it starts up, it has opened "zero files".  So any that existed prior to the last Suricata shutdown are ignored.  So the misleading part is that the "max-files" parameter really only works so long as Suricata is never restarted…  :-\

                        So to address this I will need to put something in the GUI code's cron job.  I will do that in the update I am currently working on.

                        Bill

                        1 Reply Last reply Reply Quote 0
                        • B
                          binaryjay
                          last edited by

                          @bmeeks:

                          @jflsakfja:

                          No no, I meant that the log rotation ignores the limits set for the packet logs. If setting a limit of 100MB, and a log number of 100, logs get cut off at 100MB as expected, and a new one is created until the size is reached, but the number of logs increases to 160, for example.

                          I identified the bug when the packet logs started filling up all the space that was assigned to suricata's log files. For some strange reason the blocked tab would cut off at a random number of hosts, and suricata would constantly complain that the log directory is full. Did an ls on the directory, which showed that packet logs weren't properly getting rotated, which led to the directory becoming full, and suricata freaking out and cutting off the alerts/blocks/other logs, but keeping the packet logs intact.

                          Looked into the binary source code and this is not really a bug.  It's more a case of a bit misleading feature.  Suricata keeps track of the number of pcap files it creates during a given session and correctly rotates them.  However, the key here is "a given session".  It does not maintain state through startup/shutdown cycles.  That means as far as Suricata is concerned, when it starts up, it has opened "zero files".  So any that existed prior to the last Suricata shutdown are ignored.  So the misleading part is that the "max-files" parameter really only works so long as Suricata is never restarted…  :-\

                          So to address this I will need to put something in the GUI code's cron job.  I will do that in the update I am currently working on.

                          Bill

                          You've got it worse than I have it in my day job.  Thumbs up to you, sir.

                          1 Reply Last reply Reply Quote 0
                          • B
                            bonkas
                            last edited by

                            Has this issue been resolved?

                            I am also running a PPPoE Connection and trying to get Suricata configured but my logs are flooded with "SC_ERR_DATALINK_UNIMPLEMENTED" errors.

                            Suricata will not start automatically after a pfsense restart, is this by design or caused by these errors?

                            Regards,

                            1 Reply Last reply Reply Quote 0
                            • ?
                              A Former User
                              last edited by

                              Not sure (since I've never needed pppoe), and mind is a bit fuzzy right now (been working straight for the past 50 hours, with a pass-out-on-the-office-chair in between) but have you tried making suricata listen on the underlying interface, instead of the pppoe one? Oh well, someone more knowledgeable on this will come along. Please correct me if I'm wrong in any way.

                              1 Reply Last reply Reply Quote 0
                              • B
                                bonkas
                                last edited by

                                I have not tried that yet as it was noted earlier in the thread it was a bad idea to do that as Suricata will receive header information from the PPPoE interface which it wont know how to handle.

                                I will have to research how to do this as at the moment suricata is not generating any alerts at all.

                                1 Reply Last reply Reply Quote 0
                                • ?
                                  A Former User
                                  last edited by

                                  No alerts usually means that it's not recognizing the subnets (technically home_net). Are they declared correctly?

                                  1 Reply Last reply Reply Quote 0
                                  • bmeeksB
                                    bmeeks
                                    last edited by

                                    @bonkas:

                                    Has this issue been resolved?

                                    I am also running a PPPoE Connection and trying to get Suricata configured but my logs are flooded with "SC_ERR_DATALINK_UNIMPLEMENTED" errors.

                                    Suricata will not start automatically after a pfsense restart, is this by design or caused by these errors?

                                    Regards,

                                    No, not resolved and can't be really resolved until there is an upstream patch to Suricata so it recognizes the DLT_NULL data link type that FreeBSD assigns to PPPoE interfaces.  This is a problem in the Suricata binary that comes from upstream.

                                    You can make it partially work by hand-editing the suricata.yaml file and telling it the interface is an Ethernet physical link (by using the real physical NIC driver name), but you will still get occasional errors due to the PPPoE frame headers since Suricata does not expect to see them on a physical Ethernet interface.  Also, I can't guarantee you that all the detection signatures will work properly.  In short, this kind of hack is unsupported and you would be on your own.

                                    If you have a PPPoE interface and want IDS/IPS on it, then you should use Snort instead of Suricata.  The Snort binary fully supports the DLT_NULL data link type that FreeBSD uses.

                                    Bill

                                    1 Reply Last reply Reply Quote 0
                                    • First post
                                      Last post
                                    Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.