[FIXED] SG-1000 crashes and reboots with traffic shaping on



  • I set up a new SG-1000 for a remote client. He has just one OpenVPN tunnel running and no packages installed. Just yesterday I configured traffic shaping for him, and the box seems to crash and reboot when there is heavy traffic through it. His internet speed is around 70 Mbps down and 19 Mbps up.

    It has crashed about 3 times so far in 24 hours, and since it is being used remotely I'm not able to get any crash logs either. Anyone else having this issue?



  • It's happening every now and then, but I think it is the watchdogd timer that reboots the machine. Is there any way I can get a crash dump or something that I can post here?



  • Maybe a low memory problem; monitor RAM usage.



  • I'm using a RAM disk and I suspected the same, so for now I have disabled the RAM disk. Let's see what happens, but considering I have no packages installed and a pretty basic setup, I doubt that is the issue. I'll report back my findings.

    I don't even store RRD data or any other logs.


  • Netgate Administrator

    You mean it just stops logging completely and stops recording any monitoring data? That was working previously I assume?

    Can you set up a console to log that?

    Steve



  • Well after running for a few hours it just reboots randomly and this started happening after I configured traffic shaping with just a few queues and rules. So far it was running var and tmp in ram so nothing was saved but for now I have disabled ramdisk so let's see if something gets saved in the logs when it crashes.

    This is in a remote location a few thousand miles away, so I have no means to connect a console cable and keep monitoring. I hope that with the RAM disk disabled it saves any crash data to the log. For now, though, this will be the last unit of this underpowered device I'll be ordering; the APU2 performs a lot better in this regard :)



  • It crashed again and there is still nothing in the logs. RAM usage isn't much, probably just 19%. My next guess is that the watchdog timer might not be working correctly, so for now I have disabled that too.



  • It still keeps rebooting and I'm out of ideas. Can anyone tell me what's wrong? Can't this box even handle traffic shaping?

    I'll try disabling traffic shaping and see if it still happens, because all this started only after configuring that.



  • Now it crashed and corrupted the system before I could disable traffic shaping, so next comes the painstaking procedure of recovering it.



  • After setting it up from scratch and testing it thoroughly, it seems the traffic shaper is broken on the SG-1000 and that is the single cause of it rebooting abnormally.



  • It seems this happens on the APU2 also, but in multi-WAN and single-LAN type setups. I have 2 APU2s: one has a single WAN/LAN and it doesn't happen there, but it happens on the other box. My second guess is that it might happen on single WAN/LAN also, provided the WAN download speed is over 50 Mbps. All boxes have the exact same floating match rules and the exact same queue lengths; just the download and upload speeds are different.

    As soon as I run a speedtest on any one machine which saturates the line, the firewall reboots. Probably HFSC isn't able to handle it, or it is something else. Any pointers would be appreciated.

    Do I need to change any kernel timings etc., or are the defaults fine for the SG-1000 and the APU2 4 GB RAM model?


  • Netgate Administrator

    Still no logged errors or crash report from the APU2 either?

    Steve



  • Nothing at all. The strange thing is that if the internet download is below 50 Mbps it doesn't happen. With anything above 50 or 60 Mbps it happens instantly, and every time, so it isn't a hit-and-miss situation.


  • Netgate Administrator

    Nothing shown on the console when that happens?

    There have been some issues with traffic shaping, particularly with igb (are they igb on the APU2?) but it always creates a crash report.

    Spontaneous rebooting with no logs and no report is usually a hardware issue though that seems unlikely if you only see it when shaping is enabled.

    Steve



  • Try swapping the PSU. I don't know what PSU these devices use, I guess it is an external PSU adaptor, but in computers a spontaneous reboot is often power delivery related.



  • At first I also thought it could be a hardware or power related issue, and I tried a different power supply, but that didn't solve it. I first discovered this issue on the SG-1000, which is hardly 3 months old. I tried on my APU2 and it didn't happen because my download speed is just 12 Mbps. Then recently I bought a new APU2 for a client and configured it just a week back, and it happened to him as well once download speeds went over 50 Mbps. It happens without fail, every single time I exceed that speed. Now it can't be that all the devices are faulty or have a power related issue, considering the shaper and floating match rules are exactly the same on all of them. Yes, the APU2s have igb NICs, but the SG-1000 doesn't and it happens there as well. Currently, after disabling traffic shaping, all 3 devices run without issues for days; the SG-1000 and the first APU2 have been working well for weeks without a reboot.



  • I'll get the client to attach a serial console and try to get crash logs if any are generated.
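
    For whoever ends up capturing that console output, here is a minimal sketch of one way to keep it, assuming the serial console can be read line by line from some already-configured stream. The function name log_with_timestamps and its parameters are made up for illustration, it is not a pfSense tool, and it does not set up the serial port itself.

```python
import io
import sys
import time


def log_with_timestamps(source, sink, clock=time.time):
    """Copy lines from source to sink, prefixing each with a UNIX timestamp.

    Each line is flushed right away so the last lines before a sudden
    reboot are more likely to be kept. Returns the number of lines copied.
    """
    count = 0
    for line in source:
        sink.write(f"{clock():.0f} {line.rstrip()}\n")
        sink.flush()
        count += 1
    return count


if __name__ == "__main__":
    # Stand-in for a real console stream; replace with the actual source.
    fake_console = io.StringIO("line one\nline two\n")
    log_with_timestamps(fake_console, sys.stdout)
```

    The source can be any object that yields lines, and the sink any writable text file, so the same function works for a quick test and for a long-running capture.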


  • Netgate Administrator

    What shaper settings are you actually using there? Same on both SG-1000 and APU2?

    Nothing on the console of the APU2 either?

    Steve



  • There was a lightning strike, so the client's VLAN switch is busted. As soon as his APU2 is up I'll get the log from the serial console. Regarding the shaper queue config, all the boxes have this setup; just the download and upload speeds are different:

     altq on pppoe1 hfsc bandwidth 30Mb queue {  qInternet  } 
     queue qInternet on pppoe1 bandwidth 30Mb hfsc (  codel  , linkshare 30Mb  , upperlimit 30Mb  )  {  qACK,  qOthersDefault,  qP2P,  qVoIP,  qOthersHigh  } 
     queue qACK on pppoe1 bandwidth 20% hfsc (  codel  , linkshare 20%  )  
     queue qOthersDefault on pppoe1 bandwidth 20% hfsc (  codel  , linkshare 20%  )  
     queue qP2P on pppoe1 bandwidth 5% hfsc (  codel  , default  , linkshare 5%  )  
     queue qVoIP on pppoe1 bandwidth 27% hfsc (  realtime (447Kb, 5, 224Kb)  , linkshare 27%  )  
     queue qOthersHigh on pppoe1 bandwidth 28% hfsc (  codel  , linkshare 28%  )  
    
     altq on igb0 hfsc bandwidth 70Mb queue {  qInternet  } 
     queue qInternet on igb0 bandwidth 70Mb hfsc (  codel  , linkshare 70Mb  , upperlimit 70Mb  )  {  qACK,  qOthersDefault,  qP2P,  qVoIP,  qOthersHigh  } 
     queue qACK on igb0 bandwidth 15% hfsc (  codel  , linkshare 15%  )  
     queue qOthersDefault on igb0 bandwidth 40% hfsc (  codel  , linkshare 40%  )  
     queue qP2P on igb0 bandwidth 10% hfsc (  codel  , default  , linkshare 10%  )  
     queue qVoIP on igb0 bandwidth 10% hfsc (  realtime (447Kb, 5, 224Kb)  , linkshare 10%  )  
     queue qOthersHigh on igb0 bandwidth 25% hfsc (  codel  , linkshare 25%  )  
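
    As a quick sanity check on the rules above, here is a minimal sketch in Python (not part of pfSense) that sums the percentage bandwidths of the child queues on each interface. The regex and the name percent_totals are my own, and it only understands the "bandwidth N%" form shown in this post. It checks the arithmetic only and says nothing about why the box reboots.

```python
import re

# Child queue rules copied from the post above (whitespace tidied).
RULES = """
queue qACK on pppoe1 bandwidth 20% hfsc ( codel , linkshare 20% )
queue qOthersDefault on pppoe1 bandwidth 20% hfsc ( codel , linkshare 20% )
queue qP2P on pppoe1 bandwidth 5% hfsc ( codel , default , linkshare 5% )
queue qVoIP on pppoe1 bandwidth 27% hfsc ( realtime (447Kb, 5, 224Kb) , linkshare 27% )
queue qOthersHigh on pppoe1 bandwidth 28% hfsc ( codel , linkshare 28% )
queue qACK on igb0 bandwidth 15% hfsc ( codel , linkshare 15% )
queue qOthersDefault on igb0 bandwidth 40% hfsc ( codel , linkshare 40% )
queue qP2P on igb0 bandwidth 10% hfsc ( codel , default , linkshare 10% )
queue qVoIP on igb0 bandwidth 10% hfsc ( realtime (447Kb, 5, 224Kb) , linkshare 10% )
queue qOthersHigh on igb0 bandwidth 25% hfsc ( codel , linkshare 25% )
"""

# Matches "queue <name> on <iface> bandwidth <N>%"; rules that give an
# absolute bandwidth such as 30Mb do not match and are skipped.
QUEUE_RE = re.compile(r"queue\s+(\S+)\s+on\s+(\S+)\s+bandwidth\s+(\d+)%")


def percent_totals(rules):
    """Return {interface: sum of percentage bandwidths} for the given rules."""
    totals = {}
    for line in rules.splitlines():
        match = QUEUE_RE.search(line)
        if match:
            _name, iface, pct = match.groups()
            totals[iface] = totals.get(iface, 0) + int(pct)
    return totals


if __name__ == "__main__":
    for iface, total in percent_totals(RULES).items():
        print(f"{iface}: child queues add up to {total}%")
```

    For the rules posted here, the child queues add up to 100% on both pppoe1 and igb0, so the percentages themselves are at least consistent.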
    

  • Netgate Administrator

    The SG-1000 running PPPoE also?



  • Yes, same ISP for the SG-1000 and the APU2 I bought recently.



  • This got fixed on the APU2, so I'm assuming it works on the SG-1000 as well, because both run the exact same config and both gave the same symptoms.

