Help? Netgate router is receiving frequent gateway alarms and resetting, causing lost connections



  • Hi,
    I have a Netgate SOHO router with the latest version of pfSense.
    For the past few weeks, my sessions have been getting disconnected: my VPN into work, some games, websites, etc.

    I see this in the system logs around the time it may be occurring:
    Sep 18 11:02:35 rc.gateway_alarm 3471 >>> Gateway alarm: WAN_DHCP (Addr:75.133.112.1 Alarm:1 RTT:7938ms RTTsd:1019ms Loss:22%)
    Sep 18 11:02:35 check_reload_status updating dyndns WAN_DHCP
    Sep 18 11:02:35 check_reload_status Restarting ipsec tunnels
    Sep 18 11:02:35 check_reload_status Restarting OpenVPN tunnels/interfaces
    Sep 18 11:02:35 check_reload_status Reloading filter
    Sep 18 11:03:39 rc.gateway_alarm 64926 >>> Gateway alarm: WAN_DHCP (Addr:75.133.112.1 Alarm:0 RTT:8141ms RTTsd:1420ms Loss:5%)
    Sep 18 11:03:39 check_reload_status updating dyndns WAN_DHCP
    Sep 18 11:03:39 check_reload_status Restarting ipsec tunnels
    Sep 18 11:03:39 check_reload_status Restarting OpenVPN tunnels/interfaces

    I have Charter Spectrum internet. It's a 60 Mb down / 4 Mb up plan. I have a Cisco DPC3008 modem.

    Not sure why this is happening, but I saw a somewhat similar issue where it was recommended to enable traffic shaping using PRIQ, so I set my speeds to 58 Mb down / 4 Mb up. It seems to work, but is this an issue with pfSense? Should I downgrade? Or do I need to find out from Spectrum why they are sending DHCP requests/refreshes to my modem and thus causing the router to reload things?
    I had the cable line from the pole to the house replaced. The tech found there was some kind of surge that was causing timeouts on the line. Not sure if this is related to a much larger issue.

    Any help would be appreciated. Thank you.



  • I had this same problem on Comcast Business and was unable to resolve it. I ended up switching to a Netgear router temporarily and the problem went away. It appears pfSense is very sensitive to packet loss on the WAN interface and will reset every so often when it issues a WAN alarm. Bug or feature? I don't know.


  • Netgate

    You are seeing buffer bloat at your ISP. Setting the firewall to send at a rate less than what buffers upstream keeps the latency down and keeps the circuit happy. It has nothing to do with "downgrading" or the version of pfSense you are running.



  • @derelict I think I understand this, since there is a certain amount of bloat based on some of the speed tests I've run on dslreports.com.

    So where would I make these changes on my router? I currently have the traffic shaper enabled with PRIQ, with priority for my VoIP modem. I've also disabled gateway monitoring and the monitoring action under System > Routing. But it seems I'm still seeing the issue. Do I reduce the upload/download settings in the traffic shaper? It's currently set for 100 Mb down and 10 Mb up. Do I reduce that to something like 95 Mb down / 7 Mb up?

    Thanks.


  • Netgate

    When I tune these I start at about 90% of what speed tests show.

    It is almost certainly your upload, not your download. I would concentrate on that.
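
As a rough illustration of the 90% rule of thumb above, the starting shaper rates can be worked out from measured speed-test results. This is only a sketch: the function name is made up, and the sample figures (60 down / 4 up) are the plan speeds mentioned earlier in the thread, not measured values. Substitute what your own speed tests show.

```python
def shaper_start_rates(measured_down_mbps, measured_up_mbps, fraction=0.90):
    """Return (down, up) starting rates in Mbit/s at `fraction` of the measured speeds."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return (
        round(measured_down_mbps * fraction, 2),
        round(measured_up_mbps * fraction, 2),
    )


if __name__ == "__main__":
    # Hypothetical input: plan speeds used in place of real speed-test numbers.
    down, up = shaper_start_rates(60, 4)
    print(f"Start the shaper near {down} Mbit/s down and {up} Mbit/s up")
```

From there the upload figure is the one to adjust first, lowering it in small steps until the latency under load stops climbing.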



  • @raellic This is something I was going to try too, because I don't remember seeing these issues with a non-Netgate router. I used to have a custom PC running pfSense and don't recall seeing these issues.

    Thank you.


  • Netgate

    The buffering is almost certainly at the ISP, not in pfSense. I say this because without shaping enabled (the default) there are very few buffers. None that would cause 1000ms of latency.



  • @derelict
    @johnpoz I've set gateway monitoring to NOT take action on issues, but it appears that something is still happening: extreme lag is occurring on my connection. The events in the log output below actually caused the connection to lag:

    Oct 1 21:55:24 dpinger WAN_DHCP 75.133.112.1: Alarm latency 9423us stddev 1594us loss 21%
    Oct 1 21:56:42 dpinger WAN_DHCP 75.133.112.1: Clear latency 9486us stddev 1748us loss 5%
    Oct 1 21:57:19 dpinger WAN_DHCP 75.133.112.1: Alarm latency 9482us stddev 1888us loss 21%
    Oct 1 21:59:13 dpinger WAN_DHCP 75.133.112.1: Clear latency 9955us stddev 3258us loss 5%
    Oct 1 22:00:09 dpinger WAN_DHCP 75.133.112.1: Alarm latency 9197us stddev 2324us loss 21%
    Oct 1 22:01:39 dpinger WAN_DHCP 75.133.112.1: Clear latency 11927us stddev 5882us loss 5%

    Not sure what else I need to tweak. My traffic shaper upload has been set to 7 Mb/s out of 10 Mb/s. I'm still seeing lag, but no disconnections.
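
To make log lines like the ones above easier to read, here is a minimal Python sketch that pulls the fields out of a dpinger alarm/clear line. The regular expression and the function name are assumptions based only on the lines pasted in this thread, not on any documented log format. Note that the latency values carry a `us` (microsecond) suffix, so 9423us is about 9.4 ms, and every Alarm line here coincides with 21% loss, which would be consistent with a packet-loss threshold rather than a latency threshold being what trips the alarm (an interpretation, not something confirmed in the thread).

```python
import re

# Pattern inferred from the dpinger lines pasted in this thread (an assumption).
DPINGER_RE = re.compile(
    r"dpinger (?P<gateway>\S+) (?P<addr>[\d.]+): (?P<state>Alarm|Clear) "
    r"latency (?P<latency_us>\d+)us stddev (?P<stddev_us>\d+)us loss (?P<loss>\d+)%"
)


def parse_dpinger(line):
    """Return a dict of fields from one dpinger log line, or None if it does not match."""
    m = DPINGER_RE.search(line)
    if m is None:
        return None
    return {
        "gateway": m["gateway"],
        "addr": m["addr"],
        "state": m["state"],
        "latency_ms": int(m["latency_us"]) / 1000,  # microseconds -> milliseconds
        "stddev_ms": int(m["stddev_us"]) / 1000,
        "loss_pct": int(m["loss"]),
    }


if __name__ == "__main__":
    sample = ("Oct 1 21:55:24 dpinger WAN_DHCP 75.133.112.1: "
              "Alarm latency 9423us stddev 1594us loss 21%")
    print(parse_dpinger(sample))
```

Running the whole log through this and listing `state`, `latency_ms`, and `loss_pct` side by side makes it easier to see which value changes between the Alarm and Clear events.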



  • @r0sebush
    My VPN session to my job got disconnected, even though I have the monitoring action disabled:
    Oct 2 16:20:07 dpinger WAN_DHCP 75.133.112.1: Alarm latency 7708us stddev 1903us loss 21%
    Oct 2 16:21:21 dpinger WAN_DHCP 75.133.112.1: Clear latency 7791us stddev 2353us loss 5%



  • Just curious, what happens when you disable traffic shaping? Also, I've been a tech at many of the larger telecoms. Did the tech replace your residential gateway (RG)? I've seen some hardware do stuff like this with other router brands. Does it happen around the same time of day or night? When it rains, when it gets really hot, or anywhere in between hot and cold days?



  • @snowaks I haven't tried it without traffic shaping. I'm using traffic shaping for my VoIP; it performs better and isn't impacted by other traffic (VPN, gaming, surfing). I'll try disabling it, but I'd like to make sure my VoIP device is getting the necessary bandwidth. Would I use a limiter for that?

    The issues appear to occur randomly. Maybe I should have my ISP (Spectrum) re-check the outside "box" on the pole, since my inside wiring etc. is excellent and they already replaced the coax from the pole to my house.

    Ugh. Too bad the techs at my ISP suck ass. I'd like to get to the bottom of this, and I'm leaning towards the ISP and something on their end, but they like to deny shit.


  • Rebel Alliance Developer Netgate

    Is your WAN DHCP? Since you said you had a Cable provider, it probably is.

    If so, check for this: https://redmine.pfsense.org/issues/8507#note-23



  • Yes, they do. Also, they only give techs 30-60 minutes to fix your problem, so it's kind of hard to troubleshoot a random problem that doesn't show up within 30 minutes. When it happens again, try connecting directly to the RG/CM and see if the packet loss is still there. That way you know where your problem starts. It could also be the RG, and resetting the connection could fix it. I had this same problem happen to me: 30% packet loss turned out to be a bad RG and premise amp. Coax also ages, and the stuff from big-box stores is cheaper than what Charter would install. Have they replaced everything from the NID to your CM? If they have, it could be something beyond your tap.

    PS. What happens when you leave your MTU blank and do not set a value?



  • @snowaks MTU for the WAN interface is blank by default. Same for LAN. But under Status > Interfaces, they're both 1500.



  • Yesterday, I also noticed that I had UPnP enabled, though I hadn't configured any of its settings. So I disabled it yesterday (along with gateway monitoring and the monitoring action) and noticed that I didn't have any disconnections.
    I also contacted my ISP to get somebody out today, 10/6, but they were a no-show.
    I re-enabled gateway monitoring (without the monitoring action). Let's see if we get disconnections today or before they come out in a few days.



  • A no-show sounds like they are overbooking their techs again.



  • @r0sebush Update: the tech came out, but again, they are somewhat useless. He found no issue on the line going from the house to the pole/amplifier, or from the modem to the outside. Not enough time was spent, due to the time limit allotted per customer.
    I wonder if I have a bad WAN port on my Netgate SG-2440. I still see disconnections, a little less than before, but they still happen.
    I'm going to enable the OPT1 port and reconfigure firewall NAT, outgoing filters, etc. to support OPT1. Maybe this will remedy my problems too.

    I can't believe I have this kind of issue with a $200+ router/firewall. I may just get another type of router, Ubiquiti perhaps? I loved pfSense when I had it running on a custom-built PC... but the SG-2440, ugh... is this a defective model? Should I buy the next one they're selling, because the 2440 is end of life? WTF.

    Oh well, seems like I'm getting "it" from both ends: Spectrum, with their lack of movement on getting me a network specialist to help me out, versus the Netgate SG-2440 and how it's not really plug and play, but plug in and guess how to reconfigure your router for your SOHO... :) Maybe they're forcing me to buy one of their pfSense books, which I have....

    peace.


  • Netgate

    Highly doubtful it is hardware. You can blame the hardware for ISP issues all you want. Assign WAN to another port and connect the ISP there to check. You don't have to make a new port. Just reassign it to a different hardware interface in Interfaces > Assignments.



  • @derelict I tried changing to the OPT1 port. The only issue was that, before I could check for disconnects, I couldn't nslookup from my PC. It looks like DNS wasn't being served correctly to my LAN from OPT1 or the router, even after a reboot.

    Ugh. This is annoying. Thanks for your input; maybe I can research getting DNS working on OPT1 -> LAN. WAN -> LAN works with no issue, but I still get disconnects (with or without the traffic shaper enabled).

    What I don't understand is that even though I have WAN monitoring disabled and set to take no action, the system still disconnects some sessions, like my VPN to work. The problem persists. Not sure if Spectrum support would escalate to a manager. Maybe that is what I need, since the techs/linemen can't help with this kind of problem.