Netgate SG-3100 LEDs
-
I've updated my script for the gatewaystatus returned by pfSense+ 21.05 and to allow the specification of the gateways to monitor on the command line. It's no longer necessary to edit the script.
In my case the cron command looks like:
/root/gw_leds -v -a WAN_OTTC_DHCP -b WAN_EA_DHCP
Which means that LED a (left-most) monitors the WAN_OTTC_DHCP gateway and LED b (middle) monitors the WAN_EA_DHCP gateway.
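For readers curious how a command line like this can map LEDs to gateways, here is a minimal sketch using POSIX getopts. It is not the actual gw_leds code: the function name parse_leds, the variable names, and the lowercase -c option for the third LED are assumptions made for illustration.

```sh
#!/bin/sh
# Hypothetical sketch, NOT the real gw_leds: map -a/-b/-c to gateway names.
# The -c option (third LED) is an assumption; the thread only shows -a and -b.
parse_leds() {
    OPTIND=1                      # reset so the function can be called repeatedly
    led_a=""; led_b=""; led_c=""; verbose=0
    while getopts "va:b:c:" opt; do
        case "$opt" in
            v) verbose=1 ;;
            a) led_a=$OPTARG ;;   # left-most LED
            b) led_b=$OPTARG ;;   # middle LED
            c) led_c=$OPTARG ;;   # assumed: right-most LED
            *) echo "usage: parse_leds [-v] [-a GW] [-b GW] [-c GW]" >&2
               return 2 ;;
        esac
    done
    echo "a=${led_a:-unset} b=${led_b:-unset} c=${led_c:-unset} verbose=$verbose"
}

parse_leds -v -a WAN_OTTC_DHCP -b WAN_EA_DHCP
```

An LED with no gateway assigned is reported as unset here; what the real script does with an unassigned LED is not stated in the thread.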
-
A bit more hacking on the script today.
I made it more modular.
I disable PWM mode on any LEDs we are using which disables the slow blink. Maybe I should make that configurable.
I also added -A, -B and -C options to set a fixed color for a given LED, given a list of 3 comma-separated numbers. Mostly for testing.
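As an illustration of the comma-separated format, the sketch below splits a value such as 0,0,16 into three numbers and rejects malformed input. The helper name parse_color and the 0-255 upper bound are assumptions; the thread does not say which range the LED driver accepts.

```sh
#!/bin/sh
# Hypothetical helper, not taken from gw_leds: validate and split "R,G,B".
# The 0-255 limit is an assumption; the real LED range may differ.
parse_color() {
    case "$1" in
        ""|*[!0-9,]*|*,*,*,*)     # empty, non-numeric, or too many commas
            echo "bad color: $1" >&2
            return 1 ;;
    esac
    old_ifs=$IFS
    IFS=,
    set -- $1                     # split on commas into $1 $2 $3
    IFS=$old_ifs
    [ "$#" -eq 3 ] || { echo "need exactly 3 values" >&2; return 1; }
    for v in "$@"; do
        [ -n "$v" ] && [ "$v" -le 255 ] || {
            echo "value out of range: $v" >&2
            return 1
        }
    done
    echo "r=$1 g=$2 b=$3"
}

parse_color 0,0,16
```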
Off to do non-computer stuff on a Saturday.
-
New script worked great. I only have one gateway so I used
/root/gw_leds -b WAN_DHCP -A 0,0,16 -C 0,0,16
This gave me light blue (undetermined) for the first and last LEDs and the current status on the center LED.
The only issue I had was that I couldn't remember how to upload a file to root.
SFTP of course. I just don't do it very often.
-
@wgstarks I decided not to be too clear on that to make sure people knew enough of what they were doing to figure that out.
-
@jchonig I raised an issue on GitHub but I’m posting here as well. Looks like this script may be causing problems with large numbers of pipes being left open which results in a constant stream of errors
kern.ipc.maxpipekva exceeded; see tuning (7)
More details here and especially here.
-
@wgstarks I think this will work to keep the script from starting if the previous run did not complete. This isn't a fix, but it will keep the system from failing.
Change the cronjob to start with
/usr/bin/lockf -s -t 1 /var/run/gw_leds.lock
For example:
/usr/bin/lockf -s -t 1 /var/run/gw_leds.lock /root/gw_leds -a WAN_OTTC_DHCP -b WAN_EA_DHCP
Or stop running the script.
The issue seems to be that the sysctl command is hanging and cron is starting the script again in 60 seconds.
-
@jchonig said in Netgate SG-3100 LEDs:
@wgstarks I think this will work to keep the script from starting if the previous run did not complete. This isn't a fix, but it will keep the system from failing
Change the cronjob to start with
/usr/bin/lockf /var/run/gw_leds.lock
For example:
/usr/bin/lockf /var/run/gw_leds.lock /root/gw_leds -a WAN_OTTC_DHCP -b WAN_EA_DHCP
Or stop running the script.
The issue seems to be that the sysctl command is hanging and cron is starting the script again in 60 seconds.
Just to be sure I understand, this will kill one run before it starts the next?
-
@wgstarks No, this will prevent another from starting. And see the edit I'm going to make in a minute to add a timeout.
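To make the behaviour concrete: FreeBSD's lockf(1) takes an exclusive lock on the named file, runs the command while holding it, and releases the lock when the command exits. The -t option sets how many seconds to wait for the lock, and -s suppresses the message when the lock cannot be acquired. A crontab entry along these lines (a sketch only; the five-star schedule and the root user field assume the system crontab format, and the gateway names are the ones used earlier in the thread) would skip a run instead of stacking up behind a hung one:

```
# Run every minute; give up at once (-t 0) if the previous run still holds the lock.
* * * * * root /usr/bin/lockf -s -t 0 /var/run/gw_leds.lock /root/gw_leds -a WAN_OTTC_DHCP -b WAN_EA_DHCP
```

When the lock is already held, lockf exits with status 75 (EX_TEMPFAIL) without starting the command. It never kills the run that holds the lock, so a run that is hung stays hung.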
-
Great script! Thanks a lot for that!
I am now monitoring my VDSL and LTE lines. I would be happy if the last LED could pulsate as before :-)
It would also be great if there were a mechanism that prevents the described „deadlock“, as I would like to decrease the cronjob interval to maybe 15 or 30 seconds.
-
Update from my side: after almost 12 hours of running the cron job, the system became unresponsive.
-
@renegade Are you using lockf in your cron script? That's supposed to prevent it from consuming resources.
I'm pretty sure the root problem is a kernel bug causing the sysctl and gpioctl commands to hang. I need to find the time to do some debugging.
-
@jchonig Not yet - I will give it a try I suppose :)
-
@jchonig said in Netgate SG-3100 LEDs:
@renegade Are you using lockf in your cron script? That's supposed to prevent it from consuming resources.
I'm pretty sure the root problem is a kernel bug causing the sysctl and gpioctl commands to hang. I need to find the time to do some debugging.
This worked for me for about 18 hours, but now the system is completely locked up with the same error, so lockf doesn’t appear to do the trick.
Edit: Here is the command I was using (just for reference):
/usr/bin/lockf /var/run/gw_leds.lock /root/gw_leds -b WAN_DHCP -A 0,0,16 -C 0,0,16