Snort won't start after upgrade to 21.02 on SG-3100
-
This sounds like it fixes the PHP issue, but what about the Snort issue? These are two separate problems on the SG-3100. PHP is crashing, but also, on my SG-3100, Snort starts and then Snort core dumps when it starts looking at traffic.
I applied the diff above and restarted PHP and the GUI. I no longer see the PHP errors (in the few hours since I applied it), but I still see Snort crashing:
"Jun 12 10:21:42 sg3100 kernel: pid 92832 (snort), jid 0, uid 0: exited on signal 10"
My SG-2220 (Intel-based) of course is working fine with Snort.
-
@rsm4 said in Snort won't start after upgrade to 21.02 on SG-3100:
I still see Snort crashing:
"Jun 12 10:21:42 sg3100 kernel: pid 92832 (snort), jid 0, uid 0: exited on signal 10"
Snort on the SG-3100 is likely never going to be 100% fixed. The Signal 10 error (SIGBUS, a bus error, on FreeBSD) is due to choices made by the compiler used for 32-bit ARM processors. The compiler chooses a pair of memory load/store opcodes that do not perform automatic fixup of non-aligned memory access. This happens in the Snort binary because of the way certain data structures are packed in memory and then later accessed via pointers with casting. Intel and AMD processors (along with the 64-bit ARM models) always perform the auto-fixup, so the memory alignment hardware bus error (the Signal 10 error) is avoided on that hardware.
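To make the packing point concrete, here is a minimal sketch in Python (not taken from the Snort source). It uses the standard `struct` module to compare a packed layout with a naturally aligned one for a hypothetical two-field header (`tag`, `value`); the header and the helper names are assumptions made up for illustration. In the packed layout the 4-byte field lands at offset 1, which is the sort of address a cast pointer dereference can trip over on the SG-3100.

```python
import struct

# Hypothetical two-field header: a 1-byte tag followed by a 4-byte value.
# "=" means standard sizes with no padding (like a packed C struct);
# "@" means native sizes with native alignment (like an ordinary C struct).
WORD = struct.calcsize("=I")  # size of the 4-byte field


def value_offset(prefix: str) -> int:
    """Byte offset of the 4-byte field: total size minus the field's own size."""
    return struct.calcsize(prefix + "BI") - WORD


def is_word_aligned(offset: int, word: int = WORD) -> bool:
    """True if the offset is a multiple of the word size."""
    return offset % word == 0


packed_off = value_offset("=")   # 1: the value sits right after the tag
aligned_off = value_offset("@")  # 4 where unsigned int is 4 bytes, 4-byte aligned

# Reading the packed field by copying bytes works on any CPU, because no
# misaligned word load is issued.
buf = struct.pack("=BI", 7, 0xDEADBEEF)
tag, value = struct.unpack_from("=BI", buf)

print(packed_off, is_word_aligned(packed_off))
print(aligned_off, is_word_aligned(aligned_off))
```

`struct.unpack_from` copies the bytes out rather than dereferencing a cast pointer, which is roughly what a `memcpy`-based fix would do in C. Applying that consistently across a large code base is the hard part.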
That means the flaky Snort binary code works fine on all Intel and AMD hardware, and lots of other 64-bit stuff as well. There is thus zero incentive for the Snort team to spend the time and energy required to refactor the Snort binary code so that it works in the edge case with 32-bit ARM hardware. Ergo, I don't see the Signal 10 errors on SG-3100 boxes ever getting fixed.
If Snort won't run for you on that hardware, either change hardware, or abandon Snort. Those are really the only two options.
-
@bmeeks said in Snort won't start after upgrade to 21.02 on SG-3100:
or abandon Snort
You’ve probably answered/posted this, but does that apply to Suricata also? That’s probably the easiest pivot and I don’t think we’ve had problems with it on 3100s.
-
@steveits said in Snort won't start after upgrade to 21.02 on SG-3100:
You’ve probably answered/posted this, but does that apply to Suricata also? That’s probably the easiest pivot and I don’t think we’ve had problems with it on 3100s.
If Suricata starts throwing Signal 10 errors, then yes, those would likely go unfixed too. The most recent PHP patch fixed (or rather worked around) another ARM 32-bit problem with the regular expression engine in PHP. That problem caused Signal 11 (segmentation fault) errors in PHP, which then crashed the Suricata PHP code. Signal 10 errors are memory bus hardware errors caused by non-aligned memory access (that is, attempting to access memory locations at addresses that are not word-aligned).
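For anyone sorting through their own logs, a small Python sketch like the one below can tell the two failure types apart. It assumes the kernel message format shown earlier in this thread and the FreeBSD signal numbering (10 is SIGBUS, 11 is SIGSEGV; the numbering is OS-specific). The function name `classify` and the lookup table are hypothetical, written only for illustration.

```python
import re

# FreeBSD signal numbers relevant to this thread. Other operating systems
# may assign these numbers differently.
FREEBSD_SIGNALS = {
    10: "SIGBUS",   # bus error, e.g. non-aligned memory access
    11: "SIGSEGV",  # segmentation fault
}

# Matches kernel lines such as:
#   pid 92832 (snort), jid 0, uid 0: exited on signal 10
KERNEL_EXIT = re.compile(
    r"pid (?P<pid>\d+) \((?P<name>[^)]+)\), jid \d+, uid \d+: "
    r"exited on signal (?P<sig>\d+)"
)


def classify(line):
    """Return (process name, pid, signal name), or None if the line does not match."""
    match = KERNEL_EXIT.search(line)
    if match is None:
        return None
    sig = int(match.group("sig"))
    sig_name = FREEBSD_SIGNALS.get(sig, f"signal {sig}")
    return match.group("name"), int(match.group("pid")), sig_name


line = "Jun 12 10:21:42 sg3100 kernel: pid 92832 (snort), jid 0, uid 0: exited on signal 10"
print(classify(line))
```

A SIGBUS result for snort points at the alignment problem described here, while a SIGSEGV from a PHP process would point at the separate PCRE issue.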
This is just me speaking and my personal opinion, but the 32-bit ARM platform should never have happened. The compiler tools for it (in FreeBSD, at least) seem ill-suited to compiling some legacy C-code, because they choose to use CPU opcodes that do not perform auto-fixup on non-aligned memory access. There are many legacy C-code programs that do just that (attempt non-aligned memory access). True, these are technically "errors" on the part of the programmer, but as I've stated before, all Intel and AMD hardware automatically fixes these non-aligned access attempts and the binary executable just rolls along (albeit, with a very slightly reduced efficiency due to the auto-fixup). So there is really no incentive for the code to be fixed as it works just fine on all other hardware.
The 32-bit ARM hardware, though, will generate a hardware fault when non-aligned access is attempted with two of its opcodes. The thing that chaps me is every opcode executed by the ARMv7 chip in the SG-3100 will perform auto-fixup of non-aligned access save just two of them, and those two are the ones the compiler chooses to use for some memory operations unless you disable all optimizations. This is what causes the Signal 10 errors in Snort (and possibly Suricata as well). The errors in the C code are not easy to fix, because they are caused by packed data structures, and changing the "packing" very likely will have unintended consequences all over the C code. In short, it ain't an easy fix at all!
Lastly, there is another large looming problem with Suricata on ARM 32-bit appliances. The version is currently 4.1.9 and it is not going to be updated. That's because Suricata versions 5 and higher require Rust, and Rust does not exist for the 32-bit ARM platform.
The SG-3100 is a fine firewall appliance when used as just a firewall. It has problems, though, with certain packages including Snort, Suricata and possibly others. Part of the reason is the more limited RAM, but another is the quirkiness of the 32-bit ARM compiler.
-
Thanks for the info. If I understand correctly, this should be avoided with the patch at:
https://github.com/pfsense/FreeBSD-ports/blob/devel/security/snort/files/patch-pfSense-ARM317.diff
However, I'm not entirely sure how to verify if that patch is being applied correctly or not.
@rsm4
Are you still getting the signal 10 after rebooting the firewall?
-
@marcos-ng said in Snort won't start after upgrade to 21.02 on SG-3100:
@rsm4
Are you still getting the signal 10 after rebooting the firewall?
Yes. I just rebooted the firewall and this is how it showed in the logs:
Jun 13 17:03:09 fw snort[96250]: Commencing packet processing (pid=96250)
Jun 13 17:03:36 fw kernel: pid 96250 (snort), jid 0, uid 0: exited on signal 10
It automatically started snort on bootup and then it crashed with the signal 10 almost 30 seconds after it started processing packets.
I'm not seeing any PHP crashes since applying the patch diff, but Snort is still broken.
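For reference, the gap between those two log lines can be measured with a short Python sketch. The year is an assumption (syslog lines carry none), and the helper names `timestamp` and `seconds_until_crash` are hypothetical.

```python
from datetime import datetime

LOG = """\
Jun 13 17:03:09 fw snort[96250]: Commencing packet processing (pid=96250)
Jun 13 17:03:36 fw kernel: pid 96250 (snort), jid 0, uid 0: exited on signal 10
"""

ASSUMED_YEAR = 2021  # assumption: classic syslog timestamps omit the year


def timestamp(line):
    """Parse the leading 'Mon DD HH:MM:SS' (first 15 characters) of a syslog line."""
    return datetime.strptime(f"{ASSUMED_YEAR} {line[:15]}", "%Y %b %d %H:%M:%S")


def seconds_until_crash(log):
    """Seconds between the start of packet processing and the first crash after it."""
    start = None
    for line in log.splitlines():
        if "Commencing packet processing" in line:
            start = timestamp(line)
        elif "exited on signal" in line and start is not None:
            return (timestamp(line) - start).total_seconds()
    return None


elapsed = seconds_until_crash(LOG)
print(elapsed)  # 27.0
```

Running the same check over several restarts would show whether the crash always follows within seconds of the first inspected traffic.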
-
@rsm4 Suricata is also having a lot of issues; I would avoid it.
-
All, I can't pinpoint exactly what's wrong, but the packages I am having trouble with on the SG-3100 on 21.05 with the patch applied are:
- suricata
- pfblockerng-dev
They both seem to stop working without much of an error message other than "stopped". Unfortunately, my time for troubleshooting has come to an end, and I need to return this connection to production without things breaking every 2 or 3 days.
I would really like some insight into what the devs are using to test AND what the recommended HW is for small sites. The 5100 seems like overkill, but we need something that is going to be supported and not dropped after 3 or so years. Juniper/Cisco make great firewall-only routers; I was led to believe that the SG-3100 could do more than that. But it appears that's going away.
So in small sites what is the recommended HW that devs are testing for?
-
@steveits said in Snort won't start after upgrade to 21.02 on SG-3100:
@styxl said in Snort won't start after upgrade to 21.02 on SG-3100:
is there a fix for this in the hopper
I'm not a dev but if the issue is with PHP then it could go all the way back to Zend to find and fix it. And then it would presumably be in a p2 patch. I'm not optimistic it will be "soon."
@rek0n I haven't done this myself but this seems like the right path:
- get a copy of the 2.4.5 installer from Netgate (go.netgate.com)
- install 2.4.5
- set to Previous Stable Version in System/Update (to install 2.4.5 packages not 2.5)
- install desired packages
- restore configuration
- double check Previous Stable Version is still set
We just bought 2 x SG-3100 with 21.02. How do we get the stable version?
Snort and Suricata don't work, and we can't use 2 x SG-3100 without IDS/IPS in our datacenter.
-
@hichem In all honesty, if you just bought them I would return them for the SG-5100. Otherwise you will need to downgrade to 2.4.5p1 by opening a ticket with support. But Netgate doesn't seem to be interested in fixing this issue... Currently we are sitting at around 5 months with one patch.
-
@hichem said in Snort won't start after upgrade to 21.02 on SG-3100:
how we get the stable version
Open a ticket per the message you quoted. There is no support contract needed to get the firmware.
@s0m3f00l said in Snort won't start after upgrade to 21.02 on SG-3100:
around 5 months with one patch
For the PHP issue you mean? They've had 21.02-p1, 21.02.2, and 21.05. I am of the opinion it's lucky Netgate found a way to work around the PHP bug without PHP/Zend fixing it in code.
@s0m3f00l said in Snort won't start after upgrade to 21.02 on SG-3100:
what is the recommended HW for small sites
What bandwidth do you have? If it's low enough, the 2100 should be OK; otherwise there is a jump to the 5100.
Another option would be any x64 hardware with multiple NICs and the pfSense CE version.
-
For the PHP issue you mean? They've had 21.02-p1, 21.02.2, and 21.05. I am of the opinion it's lucky Netgate found a way to work around the PHP bug without PHP/Zend fixing it in code.
The first patch for the SG-3100 was in 21.05. I am subscribed to the bugs on Redmine and I saw it roll out. Honestly, I do understand that the underlying issue may be PHP or some obscure code in the compiler; I just don't see it as my issue. The problem in truth is that I and many others need Snort/Suricata, and the PHP bug breaks that. I appreciate that you seem to have repaired PHP, but Suricata and Snort (after testing) still appear to have been broken for about 4 months now. Even more distressing is that, after testing, I cannot easily roll back, as the old software packages appear to have moved on at this point and the newer ones have issues on the old software.
What bandwidth do you have? If it's low enough the 2100 should be OK otherwise there is a jump to the 5100.
Bursting to 1 Gig overnight as I said in the other thread. But again I think there is miscommunication here @steveits
The real question is what HW is being developed on, so I can avoid any issue like this in the future. Is it ARM64? AMD64? ARM32? The SG-5100? Third-party hardware? If you tell me what is being developed for, I would be more than happy to buy it. The SG-3100 I bought in 2018 is just a firewall now. I could have gone in the USG direction if I had known that we were losing support and that the old packages would be deprecated so they cannot be installed. I appreciate that you are trying to help, but I do not want to have to change platforms this quickly in the future.
-
@s0m3f00l Oh I understand, we have primarily sold the 3100 to clients, and have one in use ourselves. We haven't tried to upgrade any of those to 21.x yet. We've sold a few 2100s since their introduction last fall, and it would be great if there was a 4100 but that doesn't exist.
These issues are all specific to the 3100 which is a 32 bit ARM CPU. I think the discontinued SG-1000 was also 32 bit (?) but the rest are either 64 bit ARM or Intel.
The issue I think (and I'm not a dev, just a customer/partner) is that the 32 bit ARM CPU is not in wide usage, so it is basically an edge case for the people developing FreeBSD and these other projects. As bmeeks pointed out, there are compiler issues, which come from the FreeBSD developers, and if they don't dedicate time to fixing them then it's not really something Netgate can fix. The PHP bug is new and apparently tied to the JIT (just-in-time) compiler for PCRE (the regular expression engine), and since it's coming from Zend/PHP they need to allocate resources to test on it. Likewise, the newer Suricata changed languages to Rust, which no one's ported to 32 bit ARM.
The 64 bit ARM (specific to Netgate hardware) and 64 bit Intel/AMD (Netgate or third party) are all fine. If you want to look at third party hardware, FreeBSD 12.2 will have a compatibility list somewhere, but offhand I'd say Intel NICs are widely recommended here and the rest is probably not of any significant concern.
-
@steveits said in Snort won't start after upgrade to 21.02 on SG-3100:
@hichem said in Snort won't start after upgrade to 21.02 on SG-3100:
how we get the stable version
Open a ticket per the message you quoted. There is no support contract needed to get the firmware.
I can only find this URL: https://netgate.myfreshworks.com/login?redirect_uri=https%3A%2F%2Fnetgate.myfreshworks.com%2F
But I can't create an account to open a ticket.
-
@hichem I see what you mean. Try https://go.netgate.com/support/home?login_error=true.
-
@steveits said in Snort won't start after upgrade to 21.02 on SG-3100:
@hichem I see what you mean. Try https://go.netgate.com/support/home?login_error=true.
Tkx Steveits.
-
I opened a separate bug to cover this as it was getting conflated with the PHP issue, which is a separate (and solvable) problem:
https://redmine.pfsense.org/issues/12157
Steve