SG-3100 Very Unstable
-
@0daymaster said in SG-3100 Very Unstable:
Thanks, but my boss would never even consider running a binary from anywhere besides the Netgate repos. I will have to look into getting a recompiled clamd binary from Netgate.
I've not used the Squid package on pfSense. Is clamd bundled with that, or did you load clamd from an independent repository? I'm guessing based on your quote above that clamd is part of the Squid package. If so, the pfSense Team can modify the compiler config file for clamd so that compiler optimizations are disabled when compiling for armv6 hardware. Try working through their Support Team. I will also drop a note to one of the Netgate team members about this issue.
-
We're aware and working on it. New version of clamav for ARM should be up at some point today for testing on 2.4.5. If it's stable there, we'll copy it back to 2.4.4.
-
@jimp said in SG-3100 Very Unstable:
We're aware and working on it. New version of clamav for ARM should be up at some point today for testing on 2.4.5. If it's stable there, we'll copy it back to 2.4.4.
Thanks Jim! Ignore my email note. I had just hit SEND when I got another email notice of your reply here.
-
You should now see squid pkg version 0.4.44_7 which includes the recompiled clamav package.
-
Thanks jimp! Just updated.
-
Recently I have also been experiencing issues with my SG-3100. I tried a fresh install and then discovered the unbound memory leak, which I patched per Netgate guidance. Despite these efforts, my device continues to restart after a few days of use. Current packages include snort, pfBlockerNG-Devel, Acme, and OpenVPN exporter. None of the restarts has left any useful logs, which I believe may suggest a hardware issue. I checked the physical layer and also changed DNS servers. Here is a snapshot of my system log - can anyone explain why e6000sw0port3 keeps going up/down:
Time Process PID Message
Nov 16 03:09:41 check_reload_status Reloading filter
Nov 15 19:09:40 kernel e6000sw0port3: link state changed to UP
Nov 16 03:09:40 check_reload_status Linkup starting e6000sw0port3
Nov 16 03:09:38 check_reload_status Reloading filter
Nov 15 19:09:37 kernel e6000sw0port3: link state changed to DOWN
Nov 16 03:09:37 check_reload_status Linkup starting e6000sw0port3
Nov 16 03:09:34 check_reload_status Reloading filter
Nov 15 19:09:33 kernel e6000sw0port3: link state changed to UP
Nov 16 03:09:33 check_reload_status Linkup starting e6000sw0port3
Nov 16 03:09:32 check_reload_status Reloading filter
Nov 15 19:09:31 kernel e6000sw0port3: link state changed to DOWN
Nov 16 03:09:31 check_reload_status Linkup starting e6000sw0port3
Nov 16 03:07:41 check_reload_status Reloading filter
Nov 15 19:07:40 kernel e6000sw0port3: link state changed to UP
Nov 16 03:07:40 check_reload_status Linkup starting e6000sw0port3
Nov 16 03:07:38 check_reload_status Reloading filter
Nov 15 19:07:37 kernel e6000sw0port3: link state changed to DOWN
Nov 16 03:07:37 check_reload_status Linkup starting e6000sw0port3
-
@m3nt0r123 said in SG-3100 Very Unstable:
can anyone explain why e6000sw0port3 keeps going up/down:
Do you have a DHCP WAN with advanced options enabled, perhaps?
https://redmine.pfsense.org/issues/8507#note-15
-
No. I have not touched the Advanced Configuration checkbox under WAN.
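A rough way to quantify the flapping in the log excerpt above is to count the kernel link-state transitions per interface and measure how long each outage lasted. The sketch below is only an illustration: the function names are made up, the sample is copied from the kernel lines in the excerpt, and because the log lines carry no year, an arbitrary year is assumed purely so the timestamps can be subtracted.

```python
import re
from collections import Counter
from datetime import datetime

# Matches kernel lines such as:
#   Nov 15 19:09:40 kernel e6000sw0port3: link state changed to UP
LINK_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) kernel "
    r"(?P<ifname>\S+): link state changed to (?P<state>UP|DOWN)"
)


def parse_link_events(lines, year=2000):
    """Return (timestamp, interface, state) tuples in chronological order.

    The year is an assumption: the log format omits it.
    """
    events = []
    for line in lines:
        m = LINK_RE.match(line.strip())
        if m:
            ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
            events.append((ts, m["ifname"], m["state"]))
    return sorted(events)  # the GUI shows newest first, so re-sort


def summarize(events):
    """Count DOWN events per interface and list each DOWN-to-UP duration."""
    down_counts = Counter()
    outages = []      # (interface, seconds the link stayed down)
    last_down = {}    # interface -> time of the most recent DOWN
    for ts, ifname, state in events:
        if state == "DOWN":
            down_counts[ifname] += 1
            last_down[ifname] = ts
        elif ifname in last_down:
            delta = ts - last_down.pop(ifname)
            outages.append((ifname, delta.total_seconds()))
    return down_counts, outages


# Kernel lines copied from the excerpt above; other lines are ignored.
SAMPLE = """\
Nov 16 03:09:41 check_reload_status Reloading filter
Nov 15 19:09:40 kernel e6000sw0port3: link state changed to UP
Nov 15 19:09:37 kernel e6000sw0port3: link state changed to DOWN
Nov 15 19:09:33 kernel e6000sw0port3: link state changed to UP
Nov 15 19:09:31 kernel e6000sw0port3: link state changed to DOWN
Nov 15 19:07:40 kernel e6000sw0port3: link state changed to UP
Nov 15 19:07:37 kernel e6000sw0port3: link state changed to DOWN
"""

if __name__ == "__main__":
    counts, outages = summarize(parse_link_events(SAMPLE.splitlines()))
    for ifname, n in counts.items():
        print(f"{ifname}: {n} link-down events")
    for ifname, seconds in outages:
        print(f"{ifname}: down for {seconds:.0f} s")
```

Short, repeated outages of a few seconds like these tend to point at link negotiation or the upstream device rather than at the firewall rebooting, but that is an interpretation, not something the script can prove.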
-
There are definitely issues with the SG-3100s. I have one in my office, and one at a customer location. Both have problems with rebooting randomly. I have around 25 Netgate firewalls in the field that I manage now, and these are the only units that randomly reboot. Both units are running Suricata (Balanced) and OpenVPN. That's it. There's nothing unusual in the logs. For instance, mine reset last night a little after 1am. There were no log entries for over an hour before that. It just goes down with no warning and comes back up, almost as if the power was cycled, but it wasn't. It hasn't been a huge inconvenience for me because it doesn't do it that often, but I'm really glad I didn't place a lot of these with customers. With Netgate's "We're not cross-shipping you a firewall unless you spend a ridiculous amount of money on a support contract" or "Buy another one" policy, it's not exactly easy to get a replacement either, even if an RMA is approved. So far with these, I've just been dealing with it, but I will definitely not be buying any more. Currently both these units are on 2.4.4, but it doesn't really matter what version they are running; they experience the same behavior. Sometimes it's a few days, sometimes a few weeks, but sooner or later they experience this reset.
-
Now I have been experiencing an issue with Rate chewing up processing power randomly. If I reboot the unit, things run smooth for a couple of days, but invariably the issue reoccurs. I hate to say it but I think pfSense on ARMv7 was a mistake.
-
I would look at heat dissipation for issues both of you have described.
The units have a passive heat sink (the bottom of the unit), and if one sits in a rack with limited airflow, that might cause the random lockups. Try increasing the space between the unit and the device below it. If it's on a rack shelf, try putting something under it to raise it a few inches, or provide airflow over/under the unit.
I would also advocate updating to the latest version; there was an issue with unbound.
-
I have 6 SG-3100 running in different locations without any issues, chilled Server racks of course.
-Rico
-
I have 2 in different locations - both in IDF rooms, which are air-conditioned of course as well. What do the SG-3100s say their temp is - does it slowly rise?
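One way to answer the "does it slowly rise?" question is to note the temperature at intervals and fit a straight line through the readings. This is a minimal sketch, assuming the readings are collected by hand as (hours elapsed, degrees C) pairs from wherever the unit reports its temperature; the function name and the sample values are made up for illustration and are not real measurements from any unit in this thread.

```python
def slope_per_hour(samples):
    """Least-squares slope, in degrees C per hour, of (hours, temp_c) pairs."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_c = sum(c for _, c in samples) / n
    num = sum((t - mean_t) * (c - mean_c) for t, c in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    if den == 0:
        raise ValueError("all samples share the same timestamp")
    return num / den


# Hypothetical readings: (hours since first reading, temperature in C).
READINGS = [(0, 52.0), (1, 52.5), (2, 53.0), (3, 53.5)]

if __name__ == "__main__":
    print(f"trend: {slope_per_hour(READINGS):+.2f} C per hour")
```

A slope near zero over a day or two would suggest the unit settles at a steady temperature; a persistent positive slope would be worth mentioning when opening a ticket.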
-
My office unit is in a ventilated, temperature-controlled room (69F), and not stacked on any other equipment. The running temp sits at around 52C; under load (150Mbps with Suricata, 90% CPU) it can hit around 60C. Temp was my first concern given there are no fans on these units, but compared to other Netgate units I've used, those temps are not out of whack. Also, in my case the reboots don't necessarily happen under high load. If it were a thermal issue, then I should be able to reproduce the problem by throwing a bunch of traffic at it. That's not the case here: it can be doing nothing and cycle, or it can be doing a lot of something and cycle. The other one is in a similar environment.
Maybe a lemon batch? All I know is that the other units I have in production (SG-2440, SG-4860 1U, SG-5100, XG-7100 1U), do not exhibit this behavior. So I'm sticking to my guns on this one, and still say these are flawed. I don't doubt that some of you don't have issues with these, but I do (apparently so do other people) and given the behavior it's most likely hardware related.
-
Please open a ticket at https://go.netgate.com and let us investigate this for you.
-
@sparklan thank you for chiming in; hopefully @chrismacmahon can help find a solution. I am now going on two months with these issues - my SG-3100 is hardly a year old. I wonder if Netgate could just exchange them out?
-
The WAN going up and down could be a modem/ISP related issue. We have seen this from time to time. Is it possible to put a dumb (unmanaged) switch in front of the SG-3100?
Another option is to try changing the media setting from "autoselect" to "default" and see if that helps. Sparklan's issue is separate from yours.
-
@chrismacmahon thanks. I don’t have a dumb switch on hand but will give your other suggestion a try. Thanks.
-
@chrismacmahon perhaps a dumb question, but I see no option to modify the media state?
-
You do it in the WAN interface configuration. (Interfaces > WAN)