problems with suricata 6.0.0
-
I will fire up my pfSense-2.5 snapshot virtual machine with Suricata 6.0.0 on it and let that run for a while to see what happens.
-
ok, again, clean install, stopping and restarting the interface lead to a duplicate instance running for some time, it needs 1 minute (+/-) to stop the first instance
now it's running, let's see how long it lasts -
It is very apparent that Suricata 6.0.0 and netmap have problems. Likely the issues are in FreeBSD-12.2/STABLE as I don't immediately see any changes in the C source code for the netmap portion of the Suricata binary.
When I booted up my virtual machine that has Suricata-6.0.0 installed on it, the interface running Inline IPS Mode with the em NIC driver and an e1000 virtual NIC was throwing continuous "netmap buffer full" errors and was totally inaccessible.
I'm applying the latest pfSense-2.5 snapshot to the VM and will test with all the interfaces using Legacy Mode blocking.
When I initially tested the 6.0.0 update, I was using what was a current pfSense-2.5 snapshot at the time. Was a couple of weeks ago I think.
-
@kiokoman said in problems with suricata 6.0.0:
ok, again, clean install, stopping and restarting the interface lead to a duplicate instance running for some time, it needs 1 minute (+/-) to stop the first instance
now it's running, let's see how long it lastsI was able to reproduce the duplicate instances issue. I'm pretty positive that one is related to GUI code changes. Have not investigated exactly what it is yet. I believe the GUI is updating the icon to "stopped" before the actual process has fully shutdown, so if you see the icon change to "stopped" and then immediately click "start" you can wind up with duplicate instances. I tested the "restart" icon and did not notice a duplicate process. But I will do further testing to be sure.
-
you are right, for the other interfaces, it eventually stop. but not for the pppoe interface
[2.5.0-DEVELOPMENT][root@pfSense.kiokoman.home]/root: ps aux | grep suricata root 95762 15.9 2.0 214220 167412 - Ss 21:04 30:47.18 /usr/local/bin/suricata -i pppoe0 -D -c /usr/local/etc/suricata/suricata_43940_pppoe0/suricata.yaml --pidfile /var/run/suricata_pppoe043940.pid root 90673 0.0 0.0 11264 2508 0 S+ 23:33 0:00.00 grep suricata
I waited 5 minutes to see if the process was killed
-
This looks more like an issue with PPPoE and Suricata. My test VM still appears to be working. But it's not PPPoE. It is a DHCP WAN and LAN.
I will let the VM run for quite a while and see if it still logs alerts.
-
Okay, I can confirm the loss of alerting capability and the long shutdown times. Something is wrong within the Suricata binary itself I think.
To be sure, I'm going to revert my VM to the suricata-5.0.3 binary and see if everything behaves better then. I have the ability to do that in my test environment.
The long shutdown time is very strange. Suricata almost immediately deletes the PID file in
/var/run
, so that's why the GUI icon changes so fast. The GUI detects the PID file to know if the process is running or stopped. However, even though the PID file is quickly removed, the actual process hangs around for a lot longer before dying.Edit: one more data point. I have two interfaces configured with Suricata in that VM. One has zero rules configured besides the default built-in rules. The other interface has a number of rules categories enabled. The interface with just the built-in rules shuts down almost immediately as it should. The interface with more rules enabled is the one taking a long time to shutdown.
-
i have the problem also on lan interface (vmx0) no alert after some minutes
pppoe is (igb0 with vlan 835) passthrough, esxi 7.0u1truss show when it start alot of
mmap(0x0,4096,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34624663552 (0x80fca7000) mmap(0x0,4096,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34624667648 (0x80fca8000) mmap(0x0,4096,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34624671744 (0x80fca9000) mmap(0x0,4096,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34624675840 (0x80fcaa000) mmap(0x0,4096,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34624679936 (0x80fcab000) mmap(0x0,4096,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34624684032 (0x80fcac000) mmap(0x0,4096,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34624688128 (0x80fcad000) mmap(0x0,4096,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34624692224 (0x80fcae000) mmap(0x0,4096,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34624696320 (0x80fcaf000) mmap(0x0,4096,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34624700416 (0x80fcb0000) mmap(0x0,4096,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34624704512 (0x80fcb1000) munmap(0x819adb000,8192) = 0 (0x0) munmap(0x81a544000,8192) = 0 (0x0) munmap(0x819ab2000,8192) = 0 (0x0) munmap(0x819a81000,8192) = 0 (0x0) munmap(0x819a68000,8192) = 0 (0x0) munmap(0x819a77000,8192) = 0 (0x0) munmap(0x819d19000,4096) = 0 (0x0) munmap(0x819a57000,8192) = 0 (0x0) madvise(0x81b3c4000,12288,MADV_FREE) = 0 (0x0) madvise(0x816811000,12288,MADV_FREE) = 0 (0x0) madvise(0x819bed000,12288,MADV_FREE) = 0 (0x0) madvise(0x81b799000,12197888,MADV_FREE) = 0 (0x0) mmap(0x7fffdfbfc000,2101248,PROT_READ|PROT_WRITE,MAP_STACK,-1,0x0) = 140736947273728 (0x7fffdfbfc000) mprotect(0x7fffdfbfc000,4096,PROT_NONE) = 0 (0x0) thr_new(0x7fffffffdd40,0x68) = 0 (0x0) <new thread 100862> mmap(0x0,2097152,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34840248320 (0x81ca40000) nanosleep({ 0.000100000 }) = 0 (0x0) munmap(0x81ca40000,2097152) = 0 (0x0) nanosleep({ 0.000100000 }) = 0 (0x0) mmap(0x0,4190208,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34842599424 (0x81cc7e000) nanosleep({ 0.000100000 }) = 0 (0x0) munmap(0x81cc7e000,1581056) = 0 (0x0) nanosleep({ 0.000100000 }) = 0 (0x0) munmap(0x81d000000,512000) = 0 (0x0) nanosleep({ 0.000100000 }) = 0 (0x0) write(1,"\^[[32m6/11/2020 -- 00:31:24\^[["...,148) = 148 (0x94) nanosleep({ 0.000100000 }) = 0 (0x0) nanosleep({ 0.000100000 }) = 0 (0x0) nanosleep({ 0.000100000 }) = 0 (0x0) nanosleep({ 0.000100000 }) = 0 (0x0) nanosleep({ 0.000100000 }) = 0 (0x0) nanosleep({ 0.000100000 }) = 0 (0x0) write(3,"6/11/2020 -- 00:31:24 - <Notice>"...,119) = 119 (0x77) nanosleep({ 0.000100000 }) = 0 (0x0) nanosleep({ 0.000100000 }) = 0 (0x0) thr_self(0x7fffffffccc0) = 0 (0x0) nanosleep({ 0.000100000 }) = 0 (0x0) nanosleep({ 0.000100000 }) = 0 (0x0) nanosleep({ 0.000100000 }) = 0 (0x0) nanosleep({ 0.000100000 }) = 0 (0x0) nanosleep({ 0.000100000 }) = 0 (0x0) nanosleep({ 0.000100000 }) = 0 (0x0) nanosleep({ 0.000100000 }) = 0 (0x0) __sysctl("kern.hostname",2,0x7fffffffcb80,0x7fffffffbe28,0x0,0) = 0 (0x0) nanosleep({ 0.000100000 }) = 0 (0x0) nanosleep({ 0.000100000 }) = 0 (0x0) nanosleep({ 0.000100000 }) = 0 (0x0) nanosleep({ 0.000100000 }) = 0 (0x0) getpid() = 55820 (0xda0c) nanosleep({ 0.000100000 }) = 0 (0x0) nanosleep({ 0.000100000 }) = 0 (0x0) nanosleep({ 0.000100000 }) = 0 (0x0) nanosleep({ 0.000100000 }) = 0 (0x0) nanosleep({ 0.000100000 }) = 0 (0x0) nanosleep({ 0.000100000 }) = 0 (0x0) sendto(4,"<141>1 2020-11-06T00:31:24.88355"...,184,0,NULL,0) = 184 (0xb8) nanosleep({ 0.000100000 }) = 0 (0x0) nanosleep({ 0.000100000 }) = 0 (0x0) nanosleep({ 0.000100000 }) = 0 (0x0) sigprocmask(SIG_SETMASK,{ SIGHUP|SIGINT|SIGQUIT|SIGILL|SIGTRAP|SIGABRT|SIGEMT|SIGFPE|SIGKILL|SIGBUS|SIGSEGV|SIGSYS|SIGPIPE|SIGALRM|SIGTERM|SIGURG|SIGSTOP|SIGTSTP|SIGCONT|SIGCHLD|SIGTTIN|SIGTTOU|SIGIO|SIGXCPU|SIGXFSZ|SIGVTALRM|SIGPROF|SI> nanosleep({ 0.000100000 }) = 0 (0x0) sigaction(SIGUSR2,{ 0x800a57140 SA_SIGINFO ss_t },{ SIG_IGN 0x0 ss_t }) = 0 (0x0) sigprocmask(SIG_SETMASK,{ SIGUSR2 },0x0) = 0 (0x0) sigprocmask(SIG_UNBLOCK,{ SIGUSR2 },0x0) = 0 (0x0) nanosleep({ 0.000100000 }) = 0 (0x0) nanosleep({ 0.000100000 }) = 0 (0x0) nanosleep({ 0.000100000 }) = 0 (0x0) nanosleep({ 0.000100000 }) = 0 (0x0) nanosleep({ 0.000100000 }) = 0 (0x0) nanosleep({ 0.000100000 }) = 0 (0x0) nanosleep({ 0.000100000 }) = 0 (0x0) nanosleep({ 0.000100000 }) = 0 (0x0) nanosleep({ 0.000100000 }) = 0 (0x0)
after this you only have nanosleep and no more alert coming
-
I've just compiled and loaded the 5.0.3 binary on my VM. Will test with that. I'm thinking Suricata 6.0.0 and possibly even 5.0.4 are broken within the binary. The first time I stopped Suricata on an interface it stopped after a long time. The last time, in preparation for installing the 5.0.3 binary, it hung up in a kernel spin lock and I had to reboot the firewall to get rid of it!
-
The 5.0.3 binary is installed and running, but with the 6.0.0 package GUI code. So the few small GUI bug fixes are still in place, but the suricata binary is reverted to the older version. Will let this combo run for a while and make sure alerting still works.
Tested stopping an interface with rules defined and it worked properly. Suricata immediately shutdown and the process disappeared as it should. No duplicate process created when restarting.
Update: tested alerts after 30 minutes of runtime and still working fine.
-
with 5.0.3 or with 6.0.0 ?
i'm still doing some test with 6 and i have a long run of alert if i disable all Logging Settings
nope still not working, false alarm
i tried out of curiosity inline mode against igb0 and i had the same problem of the other thread, wan goes offline as soon as truss show "nanosleep"
inline stop workin, nothing is logged anymore and kill the wan
legacy mode stop working and nothing is logged anymore but wan still working -
@kiokoman said in problems with suricata 6.0.0:
with 5.0.3 or with 6.0.0 ?
i'm still doing some test with 6 and i have a long run of alert if i disable all Logging Settings
nope still not working, false alarm
i tried out of curiosity inline mode against igb0 and i had the same problem of the other thread, wan goes offline as soon as truss show "nanosleep"I installed 5.0.3 on my test VM. I have my own private pfSense packages repository server, so I can build any variation of binary and GUI package versions I need to. So I built the 5.0.3 binary and paired it with the 6.0.0 PHP GUI code. I wanted to prove to myself the issues were within the 6.0.0 binary, and I'm now pretty sure they are.
The GUI serves no function with regards to blocking and alerting. All of that happens in the binary piece. The GUI simply generates the
suricata.yaml
conf file for the interface and copies the rules file to a specified directory. It has nothing to do with detection and actual alerting. -
yeah i'm with you, the problem is the binary
-
The upstream Suricata team has made many changes to the binary after deciding to require Rust and rewrite major parts of Suricata to use Rust. That started with the 5.x branch and has greatly intensified in 6.0.0. There were also a bunch of other changes in the source code for 6.0.0.
Tomorrow I will build the 5.0.4 binary and see how it performs. Mainly to see if it contains the same "no alerts after a time" bug.
-
@kiokoman said in problems with suricata 6.0.0:
with 5.0.3 or with 6.0.0 ?
i'm still doing some test with 6 and i have a long run of alert if i disable all Logging Settings
nope still not working, false alarm
i tried out of curiosity inline mode against igb0 and i had the same problem of the other thread, wan goes offline as soon as truss show "nanosleep"
inline stop workin, nothing is logged anymore and kill the wan
legacy mode stop working and nothing is logged anymore but wan still workingOne more test you can try, if you have not already, is to disable all blocking and run in just plain IDS mode. See if alerts still cease being logged. I suspect they will, but would be nice to be sure.
-
IDS mode 5-minute running and I'm still receiving alerts
idk how long it will last but it's a new record
i'll tell you in a few hours, 2:30am here, it's time to hit the sack ..same behavior 5 minutes only
-
My 5.0.3 binary VM with the 6.0.0 GUI is still working fine after running all night. So the problems are definitely within the new 6.0.0 binary.
I will test the newer 5.0.4 binary today to see if it has the same issues. Depending on what I find there, I will work with the pfSense team to pull down the 6.0.0 binary and replace it with either 5.0.4 (if it works okay) or 5.0.3.
I have already closed the pull request that was sitting in the pfSense-2.4.5_p1 RELEASE branch, so we will not be deploying the 6.0.0 binary there. The RELEASE branch is still on the older 5.0.3 package (both binary and GUI).
-
nice ! .. my grafana maps is empty without suricata
-
So far my testing with the 5.0.4 binary has been good. Still getting alerts after an hour. Will let it go a bit longer in Legacy Mode blocking, then I will test Inline IPS Mode.
Have traded emails with a member of the pfSense team. The 6.0.0 binary will be reverted shortly to 5.0.3 for now. Depending on how my 5.0.4 testing goes, we will either stay at 5.0.3 or update to 5.0.4. But we are not going to push out 6.0.0 for now. Will wait at least until 6.0.1 is released upstream.
-
fair enough
if you need me to do some testing when the time comes, i will be happy to help.