SG-1100 Seizing Up
-
Hi. I'm a relatively new pfSense user.
I bought an SG-1100 a few months ago, and was using it between the rest of my home network and my primary work-from-home machine to try it out. So yes, double-NATted and all that. No substantial changes beyond setting up my DHCP address range for the downstream LAN. No problems.
Wanting to move on from my Ubiquiti/UniFi based network, this past weekend I took out my USG as the primary gateway/router/NAT device and replaced it with my SG-1100. Again no big changes to the default settings, just DHCP downstream LAN assignment and told it to pass the IP address of my pihole as the primary DNS server to DHCP clients. Also assigned some static IP address assignments for a few particular hosts, by MAC.
Today I'm getting some random Internet outages, and I'm not sure why. They seem to last a few minutes, then disappear.
At the same time as the outage, I'm having intermittent trouble getting the pfSense + web GUI to respond. In some cases I have the Status/Traffic Graph up and the sliding graph stops moving but the Bandwidth In and Bandwidth Out numbers keep changing. Other times I can't get anything from the GUI, and after a while I get "504 Gateway Time-out/nginx" errors.
Suggestions for better troubleshooting? What to do the next time things seize up (like it did just now while I'm typing this)? Seizing up has happened at least 4-5 times today. Other work-from-home and school-from-home family users are not amused.
Right now the web GUI will only reply with 504 or 502 errors. I am able to successfully ping external addresses (by name) although the ping response is less than perfect (like 20% loss to 8.8.8.8 over 100 pings).
This time it wasn't coming back on its own. I've power cycled the SG-1100, so maybe in a minute or two I can post this plea for help.
System log has stuff in it, like this:
May 4 14:05:46 check_reload_status 401 Could not connect to /var/run/php-fpm.socket
May 4 14:05:46 check_reload_status 401 Could not connect to /var/run/php-fpm.socket
May 4 14:05:46 check_reload_status 401 Could not connect to /var/run/php-fpm.socket
May 4 14:05:46 php-fpm 39077 /rc.linkup: DEVD Ethernet detached event for wan
May 4 14:05:47 check_reload_status 401 Could not connect to /var/run/php-fpm.socket
May 4 14:05:47 check_reload_status 401 Could not connect to /var/run/php-fpm.socket
There are probably hundreds of the "Could not connect" errors.
Also saw this amidst all the "Could not connect" errors:
May 4 14:05:50 check_reload_status 401 Could not connect to /var/run/php-fpm.socket
May 4 14:05:50 rc.gateway_alarm 40704 >>> Gateway alarm: WAN_DHCP (Addr:69.47.200.1 Alarm:1 RTT:14.584ms RTTsd:7.253ms Loss:28%)
May 4 14:05:50 check_reload_status 401 updating dyndns WAN_DHCP
May 4 14:05:50 check_reload_status 401 Restarting ipsec tunnels
May 4 14:05:50 check_reload_status 401 Restarting OpenVPN tunnels/interfaces
May 4 14:05:51 check_reload_status 401 Could not connect to /var/run/php-fpm.socket
-
@rushtone It could be a bug, or some servers could be down or blocked. One of my clients had this problem once. Instead of wasting your time, just do a fresh install of pfSense. Also make sure you plug into the right NIC port on the Ubiquiti/UniFi gear; some of those devices require you to connect pfSense to a specific recommended port.
-
@rushtone said in SG-1100 Seizing Up:
May 4 14:05:46 php-fpm 39077 /rc.linkup: DEVD Ethernet detached event for wa
Both fragments mention an important event.
The first one:
May 4 14:05:46 php-fpm 39077 /rc.linkup: DEVD Ethernet detached event for wan
This means the WAN interface went down. Down as in: someone ripped out the cable (a hardware event).
Or: dpinger, the gateway monitor, saw so many ICMP (ping) losses that it decided your uplink was bad (or saturated) enough to trigger a 'software' down-up event on the WAN interface.
When interfaces are triggered that way, all kinds of processes get restarted, and the 1100 has a lot of work to do.
Finding out why you have so many ping losses might be part of the issue.
The second log snippet also shows the tail end of a WAN down-up event:
May 4 14:05:50 check_reload_status 401 Restarting ipsec tunnels
May 4 14:05:50 check_reload_status 401 Restarting OpenVPN tunnels/interfaces
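If you want to sift through a long log for these events instead of scrolling past hundreds of socket errors, something like the following Python sketch might help. It assumes the line layout exactly as pasted in this thread (timestamp, process, PID, message); the raw log files on the box may be laid out differently, so the regexes would need adjusting. The function name `summarize` and the `SAMPLE` list are just illustrative.

```python
import re
from collections import Counter

# Assumed layout, taken from the lines pasted in this thread:
#   "May 4 14:05:46 check_reload_status 401 Could not connect to ..."
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+[\d:]{8})\s+(?P<proc>\S+)\s+(?P<pid>\d+)\s+(?P<msg>.*)$"
)
# Fields inside a "Gateway alarm" message as it appears in this thread.
ALARM_RE = re.compile(
    r"Gateway alarm: (?P<gw>\S+) \(Addr:(?P<addr>\S+) Alarm:(?P<alarm>\d+) "
    r"RTT:(?P<rtt>[\d.]+)ms RTTsd:(?P<rttsd>[\d.]+)ms Loss:(?P<loss>\d+)%\)"
)


def summarize(lines):
    """Count php-fpm socket failures and collect link events and gateway alarms."""
    socket_failures = 0
    link_events = []
    alarms = []
    other = Counter()
    for raw in lines:
        m = LINE_RE.match(raw.strip())
        if m is None:
            continue  # line is not in the assumed layout; skip it
        msg = m["msg"]
        alarm = ALARM_RE.search(msg)
        if "Could not connect to /var/run/php-fpm.socket" in msg:
            socket_failures += 1
        elif "DEVD Ethernet" in msg:
            link_events.append((m["ts"], msg))
        elif alarm:
            alarms.append({
                "ts": m["ts"],
                "gateway": alarm["gw"],
                "rtt_ms": float(alarm["rtt"]),
                "loss_pct": int(alarm["loss"]),
            })
        else:
            other[m["proc"]] += 1
    return {
        "socket_failures": socket_failures,
        "link_events": link_events,
        "alarms": alarms,
        "other": other,
    }


# Sample lines copied from the posts above.
SAMPLE = [
    "May 4 14:05:46 check_reload_status 401 Could not connect to /var/run/php-fpm.socket",
    "May 4 14:05:46 php-fpm 39077 /rc.linkup: DEVD Ethernet detached event for wan",
    "May 4 14:05:47 check_reload_status 401 Could not connect to /var/run/php-fpm.socket",
    "May 4 14:05:50 rc.gateway_alarm 40704 >>> Gateway alarm: WAN_DHCP "
    "(Addr:69.47.200.1 Alarm:1 RTT:14.584ms RTTsd:7.253ms Loss:28%)",
    "May 4 14:05:50 check_reload_status 401 Restarting ipsec tunnels",
]

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

Lining up the timestamps of the link events and gateway alarms against the times the GUI froze should show whether each outage starts with the WAN going down.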
Do this test : http://www.dslreports.com/speedtest
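To keep an eye on the ping loss itself, a rough Python sketch like this could be run from a machine on the LAN (not necessarily on the SG-1100). It assumes Python 3.7+ and a `ping` that accepts `-c` for the packet count and prints a summary line containing "% packet loss", as the FreeBSD, Linux and macOS versions do. The function names and the 8.8.8.8 default are only illustrative.

```python
import re
import subprocess

# Matches e.g. "20.0% packet loss" (FreeBSD/macOS) or "20% packet loss" (Linux).
LOSS_RE = re.compile(r"(\d+(?:\.\d+)?)% packet loss")


def parse_loss(ping_output):
    """Return the packet loss percentage from ping's summary, or None if absent."""
    m = LOSS_RE.search(ping_output)
    return float(m.group(1)) if m else None


def measure_loss(host="8.8.8.8", count=100):
    """Run ping and return the reported loss percentage (None if it cannot be parsed)."""
    result = subprocess.run(
        ["ping", "-c", str(count), host],
        capture_output=True,
        text=True,
    )
    return parse_loss(result.stdout)


if __name__ == "__main__":
    print("loss %:", measure_loss(count=20))
```

Running it against both the gateway address from the alarm line and an outside host helps tell whether the loss is on the ISP link or further out.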
Also:
Cable up your 'console' connection right away.
And use the console. It's your lifeline.
Remember: the GUI is only for the days when everything works and is OK. When it isn't, the GUI is pretty worthless.
Also activate SSH access. It's the next best way in.
You'll see the same menu as on the console.
When you notice the GUI doesn't react anymore, use the console or SSH, enter option 8 (Shell) and type
top
You can see in great detail what is running, and also (use your brain) what's not running any more.
From the main menu, you can restart the GUI (web server and PHP).
Etc.
You could and should use the console or SSH access to shut down or reboot the SG-1100.
Never ever again rip out the power. Check out the many other forum posts about what can happen if you do that. Just ... believe me, don't remove the power like that.
pfSense is not a light bulb.
Can you confirm that you are using a vanilla pfSense?
You're not using any of those pfSense packages that can really stress all the resources?
After all: the 1100 uses a very small processor. It's a device that proves you can run a 'real' firewall router on a 2 square millimeter processor.
-
@gertjan Thank you for the detail and suggestions.
I will cable up the console so I'm ready next time.
Understood about politely shutting down the hardware.
Yes, I'm running vanilla pfSense. I've run the update from the GUI once or twice when there was a new version.
I think that "darkstat" is the only extra package I have explicitly installed (and that was just last night). Just looking for something to display bandwidth in use.
Yes, the SG-1100 is the low end of the Netgate boxes, but it should be more than adequate for my SOHO network (only 60 Mbps service from ISP).