Snort won't start after upgrade to 21.02 on SG-3100
-
@teamits said in Snort won't start after upgrade to 21.02 on SG-3100:
Just to be clear this issue affects Suricata also?
Yes, it likely can. No direct reports yet, but I would not be surprised, especially in the case of the PHP code that is triggering the Signal 11 segfault in PHP. That code is identical in the two packages. And it's not really the code itself that is the problem. It's the PHP engine underneath that is crashing. The same GUI code is used in all of the pfSense images (CE and pfSense+, and for all hardware platforms such as aarch64, x86-64 and arm). So if the upper-level GUI code were faulty, you would expect it to fail on all the platforms. That's not the case. It is only failing on the ARM 32-bit platform.
-
Yeah, I saw your posts in other threads, after mine. Sure sounds like PHP. Unfortunately that presumably means it's on someone upstream to fix it which means a 21.02p2. Good sleuthing.
-
Just for some analytics on Snort and PHP crashes, I ran a Splunk query against my syslogs from the router going back 30 days to see what was logged:
2021-02-18T14:02:43.000-0500,Feb 18 14:02:43 kernel: pid 46827 (snort) uid 0: exited on signal 10
2021-02-18T14:07:54.000-0500,Feb 18 14:07:54 kernel: pid 56899 (snort) uid 0: exited on signal 10
2021-02-18T14:10:04.000-0500,Feb 18 14:10:04 kernel: pid 21583 (snort) uid 0: exited on signal 10
2021-02-18T14:13:13.000-0500,Feb 18 14:13:13 kernel: pid 78778 (php-cgi) uid 0: exited on signal 11 (core dumped)
2021-02-18T14:18:38.000-0500,Feb 18 14:18:38 kernel: pid 1154 (snort) uid 0: exited on signal 10
2021-02-18T14:23:24.000-0500,Feb 18 14:23:24 kernel: pid 75058 (snort) uid 0: exited on signal 10
2021-02-18T14:26:01.000-0500,Feb 18 14:26:01 kernel: pid 26020 (snort) uid 0: exited on signal 10
2021-02-18T14:26:30.000-0500,Feb 18 14:26:30 kernel: pid 97052 (snort) uid 0: exited on signal 10
2021-02-18T14:29:12.000-0500,Feb 18 14:29:12 kernel: pid 84487 (snort) uid 0: exited on signal 10
2021-02-18T14:53:13.000-0500,Feb 18 14:53:13 kernel: pid 63165 (snort) uid 0: exited on signal 10
2021-02-18T14:55:54.000-0500,Feb 18 14:55:54 kernel: pid 64348 (snort) uid 0: exited on signal 10
2021-02-18T15:04:04.000-0500,Feb 18 15:04:04 kernel: pid 17533 (php-fpm) uid 0: exited on signal 11 (core dumped)
2021-02-18T15:05:49.000-0500,Feb 18 15:05:49 kernel: pid 9318 (snort) uid 0: exited on signal 10
2021-02-18T15:11:43.000-0500,Feb 18 15:11:43 kernel: pid 65338 (snort) uid 0: exited on signal 10
2021-02-18T15:22:04.000-0500,Feb 18 15:22:04 kernel: pid 24027 (snort) uid 0: exited on signal 10
2021-02-18T19:21:13.000-0500,Feb 18 19:21:13 kernel: pid 5625 (snort) uid 0: exited on signal 10
2021-02-20T10:22:02.000-0500,Feb 20 10:22:02 kernel: pid 42369 (php-cgi) uid 0: exited on signal 11 (core dumped)
2021-02-20T10:24:28.000-0500,Feb 20 10:24:28 kernel: pid 74738 (snort) uid 0: exited on signal 10
2021-02-21T16:06:59.000-0500,Feb 21 16:06:59 kernel: pid 30776 (snort) uid 0: exited on signal 10
2021-02-21T16:38:27.000-0500,Feb 21 16:38:27 kernel: pid 75666 (snort) uid 0: exited on signal 10
2021-02-21T19:28:30.000-0500,Feb 21 19:28:30 kernel: pid 67353 (snort) uid 0: exited on signal 10
2021-02-21T19:43:31.000-0500,Feb 21 19:43:31 kernel: pid 86017 (snort) uid 0: exited on signal 10
2021-02-21T19:48:33.000-0500,Feb 21 19:48:33 kernel: pid 81269 (snort) uid 0: exited on signal 10
2021-02-24T01:36:24.000-0500,Feb 24 01:36:24 kernel: pid 81513 (snort) uid 0: exited on signal 11
2021-02-24T01:36:26.000-0500,Feb 24 01:36:26 kernel: pid 62078 (snort) uid 0: exited on signal 11
2021-02-25T22:28:10.000-0500,Feb 25 22:28:10 kernel: pid 78826 (php-fpm) uid 0: exited on signal 11 (core dumped)
2021-02-25T22:29:59.000-0500,Feb 25 22:29:59 kernel: pid 73568 (php-fpm) uid 0: exited on signal 11 (core dumped)
2021-02-25T22:34:50.000-0500,Feb 25 22:34:50 kernel: pid 77758 (php-fpm) uid 0: exited on signal 11 (core dumped)
2021-02-25T22:35:09.000-0500,Feb 25 22:35:09 kernel: pid 28596 (snort) uid 0: exited on signal 11
2021-02-25T22:35:11.000-0500,Feb 25 22:35:11 kernel: pid 28801 (snort) uid 0: exited on signal 11
2021-02-25T23:15:01.000-0500,Feb 25 23:15:01 kernel: pid 60474 (snort) uid 0: exited on signal 10
2021-02-26T07:15:01.000-0500,Feb 26 07:15:01 kernel: pid 32892 (snort) uid 0: exited on signal 10
2021-02-27T09:24:14.000-0500,Feb 27 09:24:14 kernel: pid 45195 (snort) uid 0: exited on signal 10
2021-02-28T10:15:01.000-0500,Feb 28 10:15:01 kernel: pid 53195 (snort) uid 0: exited on signal 10
2021-02-28T21:12:07.000-0500,Feb 28 21:12:07 kernel: pid 21339 (snort) uid 0: exited on signal 10
Feb 18 was the day I upgraded my SG-3100, so, as expected, no crashes were occurring prior to that. (I can go back further--there were no crashes in the last 180 days)
Also, it's not clear to me that the PHP and Snort crashes are related. I have a few PHP crashes and some seem coincident with Snort, but Snort definitely crashes without PHP crashing.
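For anyone who wants to break a log like this down without Splunk, here is a minimal Python sketch that counts crashes per process and signal. The function name `tally_crashes`, the regular expression, and the sample list are illustrative assumptions, not part of pfSense or Splunk; the signal names are the standard FreeBSD numbering (10 is SIGBUS, 11 is SIGSEGV).

```python
import re
from collections import Counter

# FreeBSD signal numbers that appear in these logs (pfSense is FreeBSD-based).
FREEBSD_SIGNALS = {10: "SIGBUS", 11: "SIGSEGV"}

# Matches kernel lines such as
#   "kernel: pid 46827 (snort) uid 0: exited on signal 10"
# and also the variant that carries ", jid 0," before the uid.
CRASH_RE = re.compile(
    r"kernel: pid (?P<pid>\d+) \((?P<proc>[^)]+)\)[^:]*: "
    r"exited on signal (?P<sig>\d+)"
)


def tally_crashes(lines):
    """Count crash entries per (process name, signal name)."""
    counts = Counter()
    for line in lines:
        match = CRASH_RE.search(line)
        if match is None:
            continue  # not a crash line, skip it
        sig = int(match.group("sig"))
        name = FREEBSD_SIGNALS.get(sig, f"signal {sig}")
        counts[(match.group("proc"), name)] += 1
    return counts


# A few entries copied from the log above, used only as sample input.
SAMPLE_LINES = [
    "2021-02-18T14:02:43.000-0500,Feb 18 14:02:43 kernel: pid 46827 (snort) uid 0: exited on signal 10",
    "2021-02-18T14:13:13.000-0500,Feb 18 14:13:13 kernel: pid 78778 (php-cgi) uid 0: exited on signal 11 (core dumped)",
    "2021-02-24T01:36:24.000-0500,Feb 24 01:36:24 kernel: pid 81513 (snort) uid 0: exited on signal 11",
    "2021-02-28T21:12:07.000-0500,Feb 28 21:12:07 kernel: pid 21339 (snort) uid 0: exited on signal 10",
]

if __name__ == "__main__":
    for (proc, sig), count in sorted(tally_crashes(SAMPLE_LINES).items()):
        print(f"{proc:10s} {sig:8s} {count}")
```

Grouping by both process and signal keeps the Snort bus errors apart from the segfaults, which makes it easier to see whether the two kinds of crash really move together.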
My SG-3100 is removed from service so now I can start Snort and it won't crash until later. Incidentally, the pfBlockerNG update will almost always cause it to crash. This could be because Snort has traffic to scan at this point. (Traffic to the WebConfigurator doesn't count? Perhaps because it is SSL?)
-
Is there a fix for this in the hopper? And when should we expect it? I had to roll back to 2.4.5p1, so I am still waiting for the fix.
-
I'm having the same problem as well, SG-3100 on 21.02-p1. Snort is not visible anymore. I managed to uninstall the package, but when I try to reinstall it, it just hangs on "Please wait while the update system initializes". Is it possible to revert to 2.4.5? And if so, how can I do this? It seems there is no official rollback feature that I can use.
-
@styxl said in Snort won't start after upgrade to 21.02 on SG-3100:
is there a fix for this in the hopper
I'm not a dev but if the issue is with PHP then it could go all the way back to Zend to find and fix it. And then it would presumably be in a p2 patch. I'm not optimistic it will be "soon."
@rek0n I haven't done this myself but this seems like the right path:
- get a copy of the 2.4.5 installer from Netgate (go.netgate.com)
- install 2.4.5
- set to Previous Stable Version in System/Update (to install 2.4.5 packages not 2.5)
- install desired packages
- restore configuration
- double check Previous Stable Version is still set
-
@teamits It seems that since I do not have an active subscription, I am unable to create a ticket. I tried searching FTP repositories and discovered that there are community releases, but these all seem to be amd64. Since the SG-3100 is ARM-based, I'm wondering where I could obtain the 2.4.5 image for it.
-
There has been no news as of yet. To roll back, you may create an account and request the previous stable version. A support subscription is not required.
-
You can follow the status of the bug here: https://redmine.pfsense.org/issues/11466.
As you can see in the notes, I worked on finding the bug as far as I could go. The issue is within PHP itself, and appears limited to the 32-bit ARM processor in the SG-3100. I say this because the identical PHP code runs without issue on 64-bit ARM hardware such as the SG-1100 and also on all Intel hardware. PHP crashing is why Snort won't start.
-
It is sad we are being forced to choose between security/stability and access. The SG-3100 is a good, viable product and a lot of small to medium enterprises use it. I am still surprised that Netgate is yet to patch this, knowing a lot of their SG-3100 installs are using Snort as an IPS/IDS. Personally I chose to roll back and wait until a patch is published, but one of my peers decided to ditch the SG-3100 and buy an SG-5100. Unfortunately I don't have that kind of money.
-
@styxl said in Snort won't start after upgrade to 21.02 on SG-3100:
Netgate is yet to patch this
But that's the thing, if it's a PHP problem Netgate may have to wait for Zend to fix it? Zend tends to update once a month and the March 4 update just came out. Plus if it's a compilation bug and not a code bug I would think that makes it harder. And if a new PHP is included in a 21.02-p2 then Netgate would presumably need to test all of pfSense before release. (I don't have any inside knowledge of this, I'm just connecting dots.)
-
I too have run into this issue. I spent too much time trying to get this to work before coming across this post. Netgate should flag the upgrade and caution Snort users that it will break their setup. Does anyone have any idea when this will be fixed? I don't want to have to pull the SG-3100 out of my environment. I think it's a great product.
Thanks
-
Having the same issue. As I understand from the bug report, it seems to be more of a PHP issue now. Too bad Netgate did not warn us before upgrading.
-
For the PHP crashes, try the patch to disable PHP PCRE JIT on #11466 Note 32.
You can install the System Patches package and then create an entry for the patch URL
https://redmine.pfsense.org/attachments/download/3707/patch-disable-pcrejit-arm.diff
to apply the fix. Then run console menu options 16 and 11 to restart PHP and the GUI, or reboot.
-
@jimp I have applied the patch and will begin testing. I use Suricata, OpenVPN, and pfBlockerNG, so hopefully this fixes things and we can mark it solved.
-
This sounds like it fixes the PHP issue, but what about the Snort issue? These are two separate problems on the SG-3100. PHP is crashing, but also, on my SG-3100, Snort starts and then Snort core dumps when it starts looking at traffic.
I applied the diff above and restarted PHP and the GUI. I no longer see the PHP errors (in the few hours since I have applied it), but I still see Snort crashing:
"Jun 12 10:21:42 sg3100 kernel: pid 92832 (snort), jid 0, uid 0: exited on signal 10"
My SG-2220 (Intel-based) of course is working fine with Snort.
-
@rsm4 said in Snort won't start after upgrade to 21.02 on SG-3100:
This sounds like it fixes the PHP issue, but what about the Snort issue? These are two separate problems on the SG-3100. PHP is crashing, but also, on my SG-3100, Snort starts and then Snort core dumps when it starts looking at traffic.
I applied the diff above and restarted PHP and the GUI. I no longer see the PHP errors (in the few hours since I have applied it), but I still see Snort crashing:
"Jun 12 10:21:42 sg3100 kernel: pid 92832 (snort), jid 0, uid 0: exited on signal 10"
My SG-2220 (Intel-based) of course is working fine with Snort.
Snort on the SG-3100 is likely never going to be 100% fixed. The Signal 10 error is due to choices made by the compiler used for 32-bit ARM processors. The compiler is choosing to use a pair of memory load/store opcodes that do not perform automatic fixup of non-aligned memory access. This happens in the Snort binary due to the way certain data structures are packed in memory and then later accessed via pointers with casting. Intel and AMD processors (along with the 64-bit ARM models) perform the auto-fixup always, and thus the memory alignment hardware bus error (Signal 10 error) is avoided with that hardware.
That means the flaky Snort binary code works fine on all Intel and AMD hardware, and lots of other 64-bit stuff as well. There is thus zero incentive for the Snort team to spend the time and energy required to refactor the Snort binary code so that it works in the edge case with 32-bit ARM hardware. Ergo, I don't see the Signal 10 errors on SG-3100 boxes ever getting fixed.
If Snort won't run for you on that hardware, either change hardware, or abandon Snort. Those are really the only two options.
-
@bmeeks said in Snort won't start after upgrade to 21.02 on SG-3100:
or abandon Snort
You’ve probably answered/posted this, but does that apply to Suricata also? That’s probably the easiest pivot and I don’t think we’ve had problems with it on 3100s.
-
@steveits said in Snort won't start after upgrade to 21.02 on SG-3100:
@bmeeks said in Snort won't start after upgrade to 21.02 on SG-3100:
or abandon Snort
You’ve probably answered/posted this, but does that apply to Suricata also? That’s probably the easiest pivot and I don’t think we’ve had problems with it on 3100s.
If Suricata starts throwing Signal 10 errors, then yes, it would likely remain unfixed. The most recent PHP patch fixed (or rather worked around) another ARM 32-bit problem with the regular expression engine in PHP. That problem caused Signal 11 (segmentation fault) errors in PHP which then crashed the Suricata PHP code. Signal 10 errors are memory bus hardware errors caused by non-aligned memory access (that is, attempting to access memory locations at addresses not word-aligned).
This is just me speaking and my personal opinion, but the 32-bit ARM platform should never have happened. The compiler tools for it (in FreeBSD, at least) seem ill-suited to compiling some legacy C-code, because they choose to use CPU opcodes that do not perform auto-fixup on non-aligned memory access. There are many legacy C-code programs that do just that (attempt non-aligned memory access). True, these are technically "errors" on the part of the programmer, but as I've stated before, all Intel and AMD hardware automatically fixes these non-aligned access attempts and the binary executable just rolls along (albeit, with a very slightly reduced efficiency due to the auto-fixup). So there is really no incentive for the code to be fixed as it works just fine on all other hardware.
The 32-bit ARM hardware, though, will generate a hardware fault when non-aligned access is attempted with two of its opcodes. The thing that chaps me is every opcode executed by the ARMv7 chip in the SG-3100 will perform auto-fixup of non-aligned access save just two of them, and those two are the ones the compiler chooses to use for some memory operations unless you disable all optimizations. This is what causes the Signal 10 errors in Snort (and possibly Suricata as well). The errors in the C code are not easy to fix, because they are caused by packed data structures, and changing the "packing" very likely will have unintended consequences all over the C code. In short, it ain't an easy fix at all!
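To make the packing problem concrete, here is a small Python sketch using the standard `struct` module. The record layout (a one-byte tag followed by a 32-bit counter) is a made-up example and not an actual Snort structure, and the helper names are hypothetical. It shows how packing moves a 32-bit field to an offset that is not a multiple of four, and how copying the bytes out avoids relying on hardware fixup. In C, copying the field into an aligned local with `memcpy` is the usual counterpart to that last step.

```python
import struct

# Hypothetical record: a 1-byte tag followed by a 32-bit unsigned counter.
COUNTER = struct.Struct("<I")   # the 32-bit field on its own
ALIGNED = struct.Struct("@BI")  # native layout: padding is inserted after the tag
PACKED = struct.Struct("<BI")   # packed layout: fields sit back to back


def counter_offset(layout):
    """Byte offset of the counter, which is the last field in the layout."""
    return layout.size - COUNTER.size


def is_word_aligned(offset, word_size=4):
    """True when the offset is a multiple of the word size."""
    return offset % word_size == 0


def read_counter(buf, offset):
    """Read the counter by copying its bytes, so alignment does not matter."""
    return COUNTER.unpack_from(buf, offset)[0]


if __name__ == "__main__":
    for name, layout in (("aligned", ALIGNED), ("packed", PACKED)):
        offset = counter_offset(layout)
        print(name, "size:", layout.size, "counter offset:", offset,
              "word-aligned:", is_word_aligned(offset))

    record = PACKED.pack(0x7F, 123456)
    print("counter read from packed record:",
          read_counter(record, counter_offset(PACKED)))
```

A C pointer cast to the counter inside the packed record would land on that odd offset, which is the kind of access described above. Changing the layout to the aligned form changes the size of the record, which hints at why altering the packing ripples through the rest of the code.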
Lastly, there is another large looming problem with Suricata on ARM 32-bit appliances. The version is currently 4.1.9 and it is not going to be updated. That's because Suricata versions 5 and higher require Rust, and Rust does not exist for the 32-bit ARM platform.
The SG-3100 is a fine firewall appliance when used as just a firewall. It has problems, though, with certain packages including Snort, Suricata and possibly others. Part of the reason is the more limited RAM, but another is the quirkiness of the 32-bit ARM compiler.