    Netgate Discussion Forum
    Frequent internet loss - need help figuring out where and why? Maybe pfSense, Modem, ISP, or all 3?

    General pfSense Questions
      mmiller7 last edited by mmiller7

      I'm not sure where to begin with this; it's been an increasingly frequent recurring problem for me over the last couple of months. It gets worse under load (e.g. uploading files to "the cloud" or streaming video). I have not changed the config for the better part of a year, and it had been rock-solid stable.

      Scenario (repeats every 1-10 days):

      1. Internet starts having high latency (~60-100 ms compared to ~10 ms to anything)
      2. After an unknown period of time, I lose all internet connectivity
      3. I find the web GUI unresponsive
      4. I log in via SSH and see it has no WAN IP, and in the logs lots of re0 up/down/up/down and watchdog alerts
      5. If I restart PHP-FPM I can get back to the web GUI for a brief period of time, but I still can't get it to pull a DHCP IP on the WAN
      6. I reboot the whole router
      7. It comes up and appears to work fine, internet fully restored
      8. I look at the modem (192.168.100.1) and see high numbers of uncorrectable frames on all channels
      9. I observe pfSense now has a WAN IP and works fine.

      What I've tried:

      1. Rebooting the cable modem only - no difference, still no WAN IP in pfSense
      2. Used a different NIC for the WAN interface (an Amazon Basics USB 3.0 NIC), speculating that the Realtek chipset was at fault. This made no difference, which may make sense, because the LAN NIC (also Realtek) keeps working well enough for me to SSH into the router and restart it
      3. I clean installed both pfSense 2.4.3 (what I'd been on for a long time) and then tried 2.4.4 now that it is out. No difference.
      4. I have painfully rebuilt my configuration from scratch (vs backup/restore) after a clean install. Again, no difference.
      5. I have powered down all my ham radio gear (thinking RFI was causing an issue). Again, no difference; it still dies after a few days.
      6. I have moved my MoCA LAN bridge to a totally separate run of coax thinking it was interfering with the modem. Again, no change.
      7. I bought a key tool so I could open up the demarc box on the side of my house and remove/reseat the cable from the cable-co feed. I only have internet, so it's a straight shot off their pole into the demarc and up to my modem with only a couple of barrel connectors (no splits). No difference.
      8. I have replaced all the cables that I can both coax and ethernet - only ones I have not changed out are in the walls (I rent, not my house) and up the telephone pole (cable-co side). Again, no change.
      9. I have tried replacing and removing my coax surge protector. No difference either way.

      Here's my hardware:
      ZOTAC ZBOX C Series CI323 Nano Passive Cooling Mini PC with Intel Celeron N3150 Quad-Core CPU Intel HD Graphics Barebones System with 4GB RAM and 16GB SSD (about 21 months old)
      Arris Surfboard SB6193 modem (probably about 18-24 months old)

      My ISP is Metrocast/Atlantic Broadband and I have 200x15Mbps service (when it's working right it achieves about 195x16Mbps)

      I saved off /var/log folder as a tar-gzip before rebooting pfSense so I can pull files out but I don't know what to look for at this point - nothing is jumping out at me.

      Here are screenshots from my modem:

      [Attached screenshots: modem log.png, modem issue status.png, modem info.png]

      I'm at my wits' end. I have people suggesting I should "just buy a normal router", and I'm at the point where I am considering just replacing hardware at random, hoping it does something. I'm not sure I want to call my ISP, because they will tell me to reboot my modem and plug a computer in, then say "it works, what is your problem?" and blame my computer/router.

      PLEASE help with any ideas?

        bfeitell last edited by

        On a hunch, what is the MTU of your WAN connection, and are you seeing any arpresolve errors?

        I ran into a problem with a small MTU of 576 being handed out by my friend's ISP's cable modem termination system. In prior versions of pfSense the interface-mtu field passed with the lease was ignored by dhclient. The new dhclient code respects the interface-mtu field, and forces tiny packets, which can break things.

        I posted about my experience and the fix here:
        https://forum.netgate.com/topic/136089/solved-and-revised-2-4-4-release-arpresolve-can-t-allocate-llinfo-for-gateway-on-interface0-dhcp-mtu-576
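To put "tiny packets" in numbers, here is a minimal sketch of the arithmetic. It assumes plain IPv4 and TCP headers of 20 bytes each with no options; the function names are made up for illustration and nothing here is pfSense code.

```python
import math

# Assumption: IPv4 and TCP headers without options are 20 bytes each.
IPV4_HEADER = 20
TCP_HEADER = 20


def tcp_payload_per_packet(mtu: int) -> int:
    """Largest TCP payload that fits in one packet at the given MTU."""
    return mtu - IPV4_HEADER - TCP_HEADER


def packets_needed(total_bytes: int, mtu: int) -> int:
    """Number of full-size packets needed to carry total_bytes of payload."""
    return math.ceil(total_bytes / tcp_payload_per_packet(mtu))


def header_overhead(mtu: int) -> float:
    """Fraction of each full-size packet spent on IPv4 + TCP headers."""
    return (IPV4_HEADER + TCP_HEADER) / mtu


if __name__ == "__main__":
    for mtu in (576, 1500):
        print(
            f"MTU {mtu}: {tcp_payload_per_packet(mtu)} payload bytes/packet, "
            f"{packets_needed(1_000_000, mtu)} packets per 1,000,000 bytes, "
            f"{header_overhead(mtu):.1%} header overhead"
        )
```

Under those assumptions, an MTU of 576 leaves 536 payload bytes per packet versus 1460 at an MTU of 1500, so the same transfer takes roughly 2.7 times as many packets.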

          mmiller7 last edited by mmiller7

          According to Interfaces > Status my WAN MTU is 1500 currently. I have not modified it so I assume it's using whatever it got from DHCP or default.

          I do get some lines like that, with what looks like the IP of either the ISP's gateway or the ISP's DHCP server (unsure which), as well as my modem's IP at times. I didn't think much of it, because sometimes I see those lines and it's still working fine until the interface starts going up and down.

          Here's a snippet from dmesg, from when it finished booting until it died on one of those occasions, if that helps.
          re0 = WAN
          re1 = LAN
          Others = VLANS or OpenVPN

            Origin="GenuineIntel"  Id=0x406c3  Family=0x6  Model=0x4c  Stepping=3
            Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
            Features2=0x43d8e3bf<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,SSE4.2,MOVBE,POPCNT,TSCDLT,AESNI,RDRAND>
            AMD Features=0x28100800<SYSCALL,NX,RDTSCP,LM>
            AMD Features2=0x101<LAHF,Prefetch>
            Structured Extended Features=0x2282<TSCADJ,SMEP,ERMS,NFPUSG>
            VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID
            TSC: P-state invariant, performance statistics
          padlock0: No ACE support.
          aesni0: <AES-CBC,AES-XTS,AES-GCM,AES-ICM> on motherboard
          re1: link state changed to DOWN
          vlan0: changing name to 're1.2'
          vlan1: changing name to 're1.3'
          re0: link state changed to DOWN
          re1: link state changed to UP
          re1.2: link state changed to UP
          re1.3: link state changed to UP
          re0: link state changed to UP
          tun1: changing name to 'ovpns1'
          ovpns1: link state changed to UP
          tun2: changing name to 'ovpns2'
          ovpns2: link state changed to UP
          pflog0: promiscuous mode enabled
          re0: link state changed to DOWN
          re0: link state changed to UP
          arpresolve: can't allocate llinfo for 74.214.49.1 on re0
          arpresolve: can't allocate llinfo for 74.214.49.1 on re0
          arpresolve: can't allocate llinfo for 74.214.49.1 on re0
          arpresolve: can't allocate llinfo for 74.214.49.1 on re0
          arpresolve: can't allocate llinfo for 74.214.49.1 on re0
          arpresolve: can't allocate llinfo for 74.214.49.1 on re0
          arpresolve: can't allocate llinfo for 74.214.49.1 on re0
          arpresolve: can't allocate llinfo for 74.214.49.1 on re0
          arpresolve: can't allocate llinfo for 74.214.49.1 on re0
          arpresolve: can't allocate llinfo for 74.214.49.1 on re0
          arpresolve: can't allocate llinfo for 74.214.49.1 on re0
          arpresolve: can't allocate llinfo for 74.214.49.1 on re0
          arpresolve: can't allocate llinfo for 74.214.49.1 on re0
          arpresolve: can't allocate llinfo for 74.214.49.1 on re0
          arpresolve: can't allocate llinfo for 74.214.49.1 on re0
          arpresolve: can't allocate llinfo for 74.214.49.1 on re0
          arpresolve: can't allocate llinfo for 74.214.49.1 on re0
          arpresolve: can't allocate llinfo for 74.214.49.1 on re0
          arpresolve: can't allocate llinfo for 74.214.49.1 on re0
          arpresolve: can't allocate llinfo for 192.168.100.1 on re0
          arpresolve: can't allocate llinfo for 192.168.100.1 on re0
          arpresolve: can't allocate llinfo for 192.168.100.1 on re0
          arpresolve: can't allocate llinfo for 192.168.100.1 on re0
          arpresolve: can't allocate llinfo for 192.168.100.1 on re0
          arpresolve: can't allocate llinfo for 192.168.100.1 on re0
          ovpns1: link state changed to DOWN
          ovpns1: link state changed to UP
          ovpns2: link state changed to DOWN
          ovpns2: link state changed to UP
          ovpns1: link state changed to DOWN
          ovpns1: link state changed to UP
          ovpns2: link state changed to DOWN
          ovpns2: link state changed to UP
          ugen0.5: <vendor 0x8087 product 0x07dc> at usbus0 (disconnected)
          ugen0.5: <vendor 0x8087 product 0x07dc> at usbus0
          ugen0.5: <vendor 0x8087 product 0x07dc> at usbus0 (disconnected)
          ugen0.5: <vendor 0x8087 product 0x07dc> at usbus0
          ugen0.5: <vendor 0x8087 product 0x07dc> at usbus0 (disconnected)
          ugen0.5: <vendor 0x8087 product 0x07dc> at usbus0
          ugen0.5: <vendor 0x8087 product 0x07dc> at usbus0 (disconnected)
          ugen0.5: <vendor 0x8087 product 0x07dc> at usbus0
          re0: watchdog timeout
          re0: link state changed to DOWN
          re0: link state changed to UP
          ovpns1: link state changed to DOWN
          ovpns1: link state changed to UP
          ovpns2: link state changed to DOWN
          ovpns2: link state changed to UP
          re0: watchdog timeout
          re0: link state changed to DOWN
          re0: link state changed to UP
          ovpns1: link state changed to DOWN
          ovpns1: link state changed to UP
          ovpns2: link state changed to DOWN
          ovpns2: link state changed to UP
          re0: watchdog timeout
          re0: link state changed to DOWN
          re0: link state changed to UP
          ovpns1: link state changed to DOWN
          ovpns1: link state changed to UP
          ovpns2: link state changed to DOWN
          ovpns2: link state changed to UP
          re0: watchdog timeout
          re0: link state changed to DOWN
          re0: link state changed to UP
          re0: watchdog timeout
          re0: link state changed to DOWN
          re0: link state changed to UP
          re0: watchdog timeout
          re0: link state changed to DOWN
          re0: link state changed to UP
          re0: watchdog timeout
          re0: link state changed to DOWN
          re0: link state changed to UP
          re0: watchdog timeout
          re0: link state changed to DOWN
          re0: link state changed to UP
          re0: watchdog timeout
          re0: link state changed to DOWN
          re0: link state changed to UP
          re0: watchdog timeout
          re0: link state changed to DOWN
          re0: link state changed to UP
          re0: watchdog timeout
          re0: link state changed to DOWN
          re0: link state changed to UP
          re0: watchdog timeout
          re0: link state changed to DOWN
          re0: link state changed to UP
          re0: watchdog timeout
          re0: link state changed to DOWN
          re0: link state changed to UP
          re0: watchdog timeout
          re0: link state changed to DOWN
          re0: link state changed to UP
          re0: watchdog timeout
          re0: link state changed to DOWN
          re0: link state changed to UP
          re0: watchdog timeout
          re0: link state changed to DOWN
          re0: link state changed to UP
          re0: watchdog timeout
          re0: link state changed to DOWN
          re0: link state changed to UP
          re0: watchdog timeout
          re0: link state changed to DOWN
          re0: link state changed to UP
          re0: watchdog timeout
          re0: link state changed to DOWN
          re0: link state changed to UP
          re0: watchdog timeout
          re0: link state changed to DOWN
          re0: link state changed to UP
          re0: watchdog timeout
          re0: link state changed to DOWN
          re0: link state changed to UP
          re0: watchdog timeout
          re0: link state changed to DOWN
          re0: link state changed to UP
          re0: watchdog timeout
          re0: link state changed to DOWN
          re0: link state changed to UP
          re0: link state changed to DOWN
          re0: link state changed to UP
          re0: watchdog timeout
          re0: link state changed to DOWN
          re0: link state changed to UP
          re0: watchdog timeout
          re0: link state changed to DOWN
          re0: link state changed to UP
          re0: watchdog timeout
          re0: link state changed to DOWN
          re0: link state changed to UP
          re0: watchdog timeout
          re0: link state changed to DOWN
          re0: link state changed to UP
          re0: watchdog timeout
          re0: link state changed to DOWN
          re0: link state changed to UP
          re0: watchdog timeout
          re0: link state changed to DOWN
          re0: link state changed to UP
          re0: watchdog timeout
          re0: link state changed to DOWN
          re0: link state changed to UP
          re0: watchdog timeout
          re0: link state changed to DOWN
          re0: link state changed to UP
          re0: watchdog timeout
          re0: link state changed to DOWN
          re0: link state changed to UP
          re0: watchdog timeout
          re0: link state changed to DOWN
          re0: link state changed to UP
          re0: watchdog timeout
          re0: link state changed to DOWN
          re0: link state changed to UP
          re0: watchdog timeout
          re0: link state changed to DOWN
          re0: link state changed to UP
          re0: watchdog timeout
          re0: link state changed to DOWN
          re0: link state changed to UP
          re0: watchdog timeout
          re0: link state changed to DOWN
          re0: link state changed to UP
          re0: watchdog timeout
          re0: link state changed to DOWN
          re0: link state changed to UP
          re0: watchdog timeout
          re0: link state changed to DOWN
          re0: link state changed to UP
          re0: watchdog timeout
          re0: link state changed to DOWN
          re0: link state changed to UP
          re0: watchdog timeout
          re0: link state changed to DOWN
          re0: link state changed to UP
          re0: watchdog timeout
          re0: link state changed to DOWN
          re0: link state changed to UP
          re0: watchdog timeout
          re0: link state changed to DOWN
          re0: link state changed to UP
          re0: watchdog timeout
          re0: link state changed to DOWN
          re0: link state changed to UP
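A long paste like the one above is easier to compare between outages when it is tallied per interface. The following is a minimal sketch, assuming the dmesg text has been saved off the router and that the lines look exactly like the ones pasted here; `summarize` and the two regexes are hypothetical helpers, not part of pfSense or FreeBSD.

```python
import re
from collections import Counter

# Line shapes copied from the dmesg paste in this thread.
LINK_RE = re.compile(r"^(\w+(?:\.\d+)?): link state changed to (UP|DOWN)$")
WATCHDOG_RE = re.compile(r"^(\w+): watchdog timeout$")


def summarize(dmesg_text: str) -> Counter:
    """Count link-state changes and watchdog timeouts per interface."""
    counts: Counter = Counter()
    for raw in dmesg_text.splitlines():
        line = raw.strip()
        match = LINK_RE.match(line)
        if match:
            counts[(match.group(1), "link " + match.group(2))] += 1
            continue
        match = WATCHDOG_RE.match(line)
        if match:
            counts[(match.group(1), "watchdog timeout")] += 1
    return counts


if __name__ == "__main__":
    sample = """\
re0: link state changed to UP
arpresolve: can't allocate llinfo for 74.214.49.1 on re0
re0: watchdog timeout
re0: link state changed to DOWN
re0: link state changed to UP
re1.2: link state changed to UP
"""
    for (iface, event), n in sorted(summarize(sample).items()):
        print(f"{iface:8} {event:18} {n}")
```

A sharp rise in `watchdog timeout` counts on re0 while re1 stays quiet would match the description of the WAN dying while the LAN remains reachable over SSH.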
          
          
            mmiller7 last edited by

            Also possibly relevant (as I read your thread) I found tips about Realtek issues and I have disabled "Hardware Checksum Offloading" though I saw no difference doing that.

              bfeitell last edited by

              Could you take a look at your actual DHCP lease? It will be found in /var/db/dhclient.leases.re0. Does it contain an interface-mtu field?

              You might want to anonymize your WAN IP rather than posting it here.

                mmiller7 @bfeitell last edited by mmiller7

                @bfeitell said in Frequent internet loss - need help figuring out where and why? Maybe pfSense, Modem, ISP, or all 3?:

                var/db/dhclient.leases.re0

                Yes - It does look like it contains that field

                lease {
                  interface "re0";
                  fixed-address 74.214.49.x;
                  option subnet-mask 255.255.255.0;
                  option routers 74.214.49.1;
                  option domain-name-servers 173.44.120.56,173.44.120.57;
                  option host-name "pfSense";
                  option interface-mtu 576;
                  option broadcast-address 255.255.255.255;
                  option dhcp-lease-time 86400;
                  option dhcp-message-type 5;
                  option dhcp-server-identifier 65.175.141.7;
                  renew 2 2018/10/2 14:23:45;
                  rebind 2 2018/10/2 23:23:45;
                  expire 3 2018/10/3 02:23:45;
                }
                lease {
                  interface "re0";
                  fixed-address 74.214.49.x;
                  option subnet-mask 255.255.255.0;
                  option routers 74.214.49.1;
                  option domain-name-servers 173.44.120.56,173.44.120.57;
                  option host-name "pfSense";
                  option interface-mtu 576;
                  option broadcast-address 255.255.255.255;
                  option dhcp-lease-time 86400;
                  option dhcp-message-type 5;
                  option dhcp-server-identifier 65.175.141.7;
                  renew 2 2018/10/2 15:05:05;
                  rebind 3 2018/10/3 00:05:05;
                  expire 3 2018/10/3 03:05:05;
                }
                [2.4.4-RELEASE][root@pfSense.apt]/root: 
                
                

                WAN IP...eh, it changes often enough anonymizing the last bit is probably fine.
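Checking a lease file for this option by eye gets tedious once it holds many leases. Here is a minimal sketch that pulls the `interface-mtu` value out of each `lease { ... }` block. It assumes the file has the same layout as the paste above and that the text has been copied somewhere Python is available; `lease_mtus` is a hypothetical helper, not a dhclient tool.

```python
import re

# Each lease is a "lease { ... }" block; the body contains no nested braces.
LEASE_RE = re.compile(r"lease\s*\{(.*?)\}", re.DOTALL)
MTU_RE = re.compile(r"option\s+interface-mtu\s+(\d+)\s*;")


def lease_mtus(lease_text: str) -> list:
    """Return the interface-mtu of every lease block (None if the option is absent)."""
    result = []
    for body in LEASE_RE.findall(lease_text):
        match = MTU_RE.search(body)
        result.append(int(match.group(1)) if match else None)
    return result


if __name__ == "__main__":
    # Abbreviated sample modelled on the leases in this thread;
    # the second lease is invented to show the "option absent" case.
    sample = """
lease {
  interface "re0";
  option routers 74.214.49.1;
  option interface-mtu 576;
}
lease {
  interface "re0";
  option routers 74.214.49.1;
}
"""
    print(lease_mtus(sample))  # [576, None]
```

Any value well below 1500 in the result would be worth a closer look, as with the 576 seen in both leases above.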

                  bfeitell last edited by bfeitell

                  EDITED TO CORRECT POINTER TO CORRECT FIELD IN "Lease Requirements and Requests"

                  Try the fix I discussed in the prior post I referenced. You are probably hitting the same issue I saw.

                  In the "Lease Requirements and Requests" section for WAN DHCP, in the "Option modifiers" field, add the following text, without any quotes: supersede interface-mtu 0

                    mmiller7 last edited by

                    I shall give that a shot! Do I need a reboot/reload after that change is made?

                    Probably be at least 2 weeks until I declare "fixed" but I may know sooner if it dies. Interesting that they would set such a small MTU, I've never really seen anything under 1500 used "in the wild"

                      bfeitell last edited by

                      Please report back your results, especially with regard to whether the 'arpresolve can't allocate llinfo for $GATEWAY' errors go away with the fix in place. The granted lease details won't change, but the default pfSense MTU of 1500 should be in effect after the fix.

                        mmiller7 last edited by

                        Maybe I'm still missing something. I don't see that field listed where I expected it to be under Interfaces > WAN

                        Am I missing something? This is 2.4.4

                        [Attached screenshot: wan setting.png]

                          bfeitell last edited by bfeitell

                          Yes, scroll down a bit more. "Option modifiers" appears below in the "Lease Requirements and Requests" section.

                          I would reboot, but saving and applying the fix should resolve things. You can check the MTU before and after the fix at the command prompt with 'ifconfig re0'.
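The before-and-after check can also be scripted. This is a minimal sketch that extracts the MTU from saved `ifconfig re0` output; the sample lines are illustrative of FreeBSD's usual first line rather than copied from this router, and `mtu_from_ifconfig` is a hypothetical helper.

```python
import re


def mtu_from_ifconfig(output: str):
    """Return the first 'mtu N' value found in ifconfig output, or None."""
    match = re.search(r"\bmtu\s+(\d+)", output)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    # Illustrative samples, not captured from the machine in this thread.
    before = "re0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 576"
    after = "re0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500"
    print(mtu_from_ifconfig(before), mtu_from_ifconfig(after))  # 576 1500
```

If the value is still 576 after saving, applying, and rebooting, the option modifier has probably not taken effect.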

                            mmiller7 last edited by

                            Ah, I found it as you said under the "Lease Requirements and Requests" heading. I was looking for a keyword heading "Advanced" or checkbox "Advanced". I'll do a reboot for good measure, it seems reasonable enough.

                            I will certainly report back either way - if it is still having problems or showing those lines or if I think it's fixed after a while.

                            Thanks!

                              bfeitell last edited by bfeitell

                              I have edited the post above, and my prior post to reflect the correct location of "Option modifiers" in the "Lease Requirements and Requests" section.

                              I really think this fix needs to be documented, as the underlying problem causes all sorts of flakiness beyond DHCP renewal/ARP quirks, including the failure of certain web sites, like newyorker.com, to load.

                                mmiller7 last edited by mmiller7

                                Interesting note - prior to the reboot after applying that I had intermittent connectivity and odd entries in my dmesg output.

                                I think after a reboot it settled out but in case you are interested here is what it showed:

                                arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                re0: link state changed to UP
                                arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                re0: link state changed to DOWN
                                re0: link state changed to UP
                                nd6_setmtu0: new link MTU on re0 (576) is too small for IPv6
                                re0: link state changed to DOWN
                                arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                re0: link state changed to UP
                                re0: link state changed to DOWN
                                re0: link state changed to UP
                                re0: link state changed to DOWN
                                nd6_setmtu0: new link MTU on re0 (576) is too small for IPv6
                                arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                re0: link state changed to UP
                                arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                re0: link state changed to DOWN
                                re0: link state changed to UP
                                nd6_setmtu0: new link MTU on re0 (576) is too small for IPv6
                                re0: link state changed to DOWN
                                re0: link state changed to UP
                                arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                re0: link state changed to DOWN
                                re0: link state changed to UP
                                nd6_setmtu0: new link MTU on re0 (576) is too small for IPv6
                                re0: link state changed to DOWN
                                arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                re0: link state changed to UP
                                re0: link state changed to DOWN
                                re0: link state changed to UP
                                re0: link state changed to DOWN
                                nd6_setmtu0: new link MTU on re0 (576) is too small for IPv6
                                re0: link state changed to UP
                                re0: link state changed to DOWN
                                re0: link state changed to UP
                                nd6_setmtu0: new link MTU on re0 (576) is too small for IPv6
                                re0: link state changed to DOWN
                                arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                re0: link state changed to UP
                                re0: link state changed to DOWN
                                re0: link state changed to UP
                                nd6_setmtu0: new link MTU on re0 (576) is too small for IPv6
                                re0: link state changed to DOWN
                                arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                re0: link state changed to UP
                                re0: link state changed to DOWN
                                re0: link state changed to UP
                                nd6_setmtu0: new link MTU on re0 (576) is too small for IPv6
                                re0: link state changed to DOWN
                                arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                re0: link state changed to UP
                                re0: link state changed to DOWN
                                re0: link state changed to UP
                                nd6_setmtu0: new link MTU on re0 (576) is too small for IPv6
                                re0: link state changed to DOWN
                                arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                re0: link state changed to UP
                                re0: link state changed to DOWN
                                re0: link state changed to UP
                                nd6_setmtu0: new link MTU on re0 (576) is too small for IPv6
                                re0: link state changed to DOWN
                                arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                re0: link state changed to UP
                                arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                re0: link state changed to DOWN
                                re0: link state changed to UP
                                nd6_setmtu0: new link MTU on re0 (576) is too small for IPv6
                                re0: link state changed to DOWN
                                arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                

                                AFAIK my ISP does not support IPv6 and I don't appear to have any IPv6 connectivity. I'm assuming it just got confused with changing things until it was rebooted.
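One likely reading of the nd6_setmtu0 lines: IPv6 requires a link MTU of at least 1280 bytes, so the kernel complains whenever the interface is set to 576, whether or not the ISP offers IPv6. A minimal sketch that picks those warnings out of saved dmesg text follows; the helper names are hypothetical and the message format is taken from the paste above.

```python
import re

# Minimum link MTU required by the IPv6 specification.
IPV6_MIN_MTU = 1280

ND6_RE = re.compile(
    r"nd6_setmtu0: new link MTU on (\S+) \((\d+)\) is too small for IPv6"
)


def too_small_for_ipv6(mtu: int) -> bool:
    """True if the MTU is below the IPv6 minimum link MTU."""
    return mtu < IPV6_MIN_MTU


def nd6_warnings(dmesg_text: str) -> list:
    """Return the unique (interface, mtu) pairs reported by nd6_setmtu0."""
    pairs = {(iface, int(mtu)) for iface, mtu in ND6_RE.findall(dmesg_text)}
    return sorted(pairs)


if __name__ == "__main__":
    sample = """\
re0: link state changed to UP
nd6_setmtu0: new link MTU on re0 (576) is too small for IPv6
re0: link state changed to DOWN
nd6_setmtu0: new link MTU on re0 (576) is too small for IPv6
"""
    for iface, mtu in nd6_warnings(sample):
        print(iface, mtu, too_small_for_ipv6(mtu))  # re0 576 True
```

Seen this way, the warnings are another sign that the 576 MTU was still being applied to re0 at that point.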

                                  bfeitell last edited by

                                  Yes, I think this pretty clearly shows you hit this 576 MTU oddity. For my friend it broke The New Yorker website and iHeart Radio. I was more concerned with the firewall unpredictably falling off the net. It turned out that the two problems had the same root cause: an MTU that was too small.

                                    mmiller7 last edited by mmiller7

                                    Not sure why the forum thinks I'm spamming posting the output of ifconfig but it does.

                                    After a reboot, ifconfig re0 is still showing mtu 576.

                                    Any thoughts? Maybe I need to manually enter 1500 in the WAN settings?

                                    [Attached screenshot: options.png]

                                      bfeitell last edited by

                                      Please post a screen shot of the "Lease Requirements and Requests" as you have it filled out. If you have the supersede statement in there correctly, it may be a quirk of the (re) driver interacting with dhclient. Setting 1500 explicitly for the MTU might fix it, but I would try a cold boot after confirming you have the incantation entered correctly. In the meanwhile, I will double check how I have it set on my friend's machine...

                                        mmiller7 last edited by

                                        Just re-re-re read your post and you said without quotes. Maybe I should have gone to bed while it was still yesterday. :)

                                        Took out my " " from the option modifiers, rebooted AGAIN and now I see mtu 1500 in ifconfig re0!

                                        Seems like a good sign that it's at least now doing what you (and I) were expecting.

                                          bfeitell last edited by

                                          Excellent! Please follow up on whether this fixes it. I think that Netgate needs to put this in the formal documentation. It is a sneaky little quirk from upstream. pfSense and dhclient are doing the right thing by following the DHCP lease parameters issued, but the cable modem hardware from the ISP is giving out bad settings for setting up the connection.

                                            mmiller7 last edited by mmiller7

                                            Negative success, just had a total outage tonight. Had to reboot pfSense to get it to come back online.

                                              Origin="GenuineIntel"  Id=0x406c3  Family=0x6  Model=0x4c  Stepping=3
                                              Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
                                              Features2=0x43d8e3bf<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,SSE4.2,MOVBE,POPCNT,TSCDLT,AESNI,RDRAND>
                                              AMD Features=0x28100800<SYSCALL,NX,RDTSCP,LM>
                                              AMD Features2=0x101<LAHF,Prefetch>
                                              Structured Extended Features=0x2282<TSCADJ,SMEP,ERMS,NFPUSG>
                                              VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID
                                              TSC: P-state invariant, performance statistics
                                            padlock0: No ACE support.
                                            aesni0: <AES-CBC,AES-XTS,AES-GCM,AES-ICM> on motherboard
                                            re1: link state changed to DOWN
                                            vlan0: changing name to 're1.2'
                                            vlan1: changing name to 're1.3'
                                            re0: link state changed to DOWN
                                            re1: link state changed to UP
                                            re1.2: link state changed to UP
                                            re1.3: link state changed to UP
                                            re0: link state changed to UP
                                            tun1: changing name to 'ovpns1'
                                            ovpns1: link state changed to UP
                                            tun2: changing name to 'ovpns2'
                                            ovpns2: link state changed to UP
                                            pflog0: promiscuous mode enabled
                                            ugen0.5: <vendor 0x8087 product 0x07dc> at usbus0 (disconnected)
                                            ugen0.5: <vendor 0x8087 product 0x07dc> at usbus0
                                            re0: link state changed to DOWN
                                            re0: link state changed to UP
                                            ovpns1: link state changed to DOWN
                                            ovpns1: link state changed to UP
                                            ovpns2: link state changed to DOWN
                                            ovpns2: link state changed to UP
                                            re0: watchdog timeout
                                            re0: link state changed to DOWN
                                            re0: link state changed to UP
                                            ovpns1: link state changed to DOWN
                                            ovpns1: link state changed to UP
                                            ovpns2: link state changed to DOWN
                                            ovpns2: link state changed to UP
                                            ugen0.5: <vendor 0x8087 product 0x07dc> at usbus0 (disconnected)
                                            ugen0.5: <vendor 0x8087 product 0x07dc> at usbus0
                                            ugen0.5: <vendor 0x8087 product 0x07dc> at usbus0 (disconnected)
                                            ugen0.5: <vendor 0x8087 product 0x07dc> at usbus0
                                            ugen0.2: <American Power Conversion Back-UPS ES 750 FW841.I3 .D USB FWI3> at usbus0 (disconnected)
                                            ugen0.2: <American Power Conversion Back-UPS ES 750 FW841.I3 .D USB FWI3> at usbus0
                                            re0: watchdog timeout
                                            re0: link state changed to DOWN
                                            re0: link state changed to UP
                                            arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                            ovpns1: link state changed to DOWN
                                            ovpns1: link state changed to UP
                                            ovpns2: link state changed to DOWN
                                            ovpns2: link state changed to UP
                                            ugen0.5: <vendor 0x8087 product 0x07dc> at usbus0 (disconnected)
                                            ugen0.5: <vendor 0x8087 product 0x07dc> at usbus0
                                            re0: watchdog timeout
                                            re0: link state changed to DOWN
                                            arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                            arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                            re0: link state changed to UP
                                            arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                            arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                            ovpns1: link state changed to DOWN
                                            ovpns1: link state changed to UP
                                            ovpns2: link state changed to DOWN
                                            ovpns2: link state changed to UP
                                            re0: watchdog timeout
                                            re0: link state changed to DOWN
                                            arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                            re0: link state changed to UP
                                            ovpns1: link state changed to DOWN
                                            ovpns1: link state changed to UP
                                            ovpns2: link state changed to DOWN
                                            ovpns2: link state changed to UP
                                            re0: watchdog timeout
                                            re0: link state changed to DOWN
                                            arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                            arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                            arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                            re0: link state changed to UP
                                            arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                            arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                            arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                            arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                            arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                            arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                            arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                            re0: watchdog timeout
                                            re0: link state changed to DOWN
                                            arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                            arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                            re0: link state changed to UP
                                            arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                            arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                            arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                            arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                            arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                            arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                            arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                            arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                            arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                            arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                            arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                            arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                            re0: watchdog timeout
                                            re0: link state changed to DOWN
                                            arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                            arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                            re0: link state changed to UP
                                            arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                            arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                            re0: watchdog timeout
                                            re0: link state changed to DOWN
                                            re0: link state changed to UP
                                            arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                            arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                            arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                            arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                            re0: watchdog timeout
                                            re0: link state changed to DOWN
                                            arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                            re0: link state changed to UP
                                            arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                            re0: watchdog timeout
                                            re0: link state changed to DOWN
                                            re0: link state changed to UP
                                            re0: watchdog timeout
                                            re0: link state changed to DOWN
                                            arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                            re0: link state changed to UP
                                            re0: watchdog timeout
                                            re0: link state changed to DOWN
                                            re0: link state changed to UP
                                            re0: watchdog timeout
                                            re0: link state changed to DOWN
                                            re0: link state changed to UP
                                            arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                            re0: watchdog timeout
                                            re0: link state changed to DOWN
                                            re0: link state changed to UP
                                            re0: watchdog timeout
                                            re0: link state changed to DOWN
                                            re0: link state changed to UP
                                            arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                            re0: watchdog timeout
                                            re0: link state changed to DOWN
                                            re0: link state changed to UP
                                            re0: watchdog timeout
                                            re0: link state changed to DOWN
                                            re0: link state changed to UP
                                            arpresolve: can't allocate llinfo for 74.214.49.1 on re0
                                            re0: watchdog timeout
                                            re0: link state changed to DOWN
                                            re0: link state changed to UP
                                            re0: watchdog timeout
                                            re0: link state changed to DOWN
                                            re0: link state changed to UP
                                            re0: watchdog timeout
                                            re0: link state changed to DOWN
                                            

                                            0_1538795475631_monitoring.png

                                            Any other thoughts? I see my modem errors are creeping up again (though only in the few hundreds this time, not up to 1000 yet).
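A long dmesg excerpt like the one above is easier to reason about once the events are tallied per interface. The following is a minimal Python sketch, assuming the three message shapes seen in the log (watchdog timeout, link state change, and the arpresolve/llinfo error). The function name `summarize` and the short `SAMPLE` text are illustrative only and not part of pfSense.

```python
import re
from collections import Counter

# Patterns for the message shapes that appear in the dmesg excerpt.
WATCHDOG = re.compile(r"^\s*(\w+): watchdog timeout")
LINK = re.compile(r"^\s*([\w.]+): link state changed to (UP|DOWN)")
ARP = re.compile(r"arpresolve: can't allocate llinfo for \S+ on (\w+)")


def summarize(dmesg_text):
    """Count watchdog timeouts, link-down events and llinfo errors per interface."""
    counts = {"watchdog": Counter(), "link_down": Counter(), "llinfo": Counter()}
    for line in dmesg_text.splitlines():
        if m := WATCHDOG.match(line):
            counts["watchdog"][m.group(1)] += 1
        elif m := LINK.match(line):
            # Only DOWN transitions are counted, so one flap counts once.
            if m.group(2) == "DOWN":
                counts["link_down"][m.group(1)] += 1
        elif m := ARP.search(line):
            counts["llinfo"][m.group(1)] += 1
    return counts


# A few lines in the same shape as the log, used only to exercise the parser.
SAMPLE = """\
re1: link state changed to DOWN
re1: link state changed to UP
re0: watchdog timeout
re0: link state changed to DOWN
re0: link state changed to UP
arpresolve: can't allocate llinfo for 74.214.49.1 on re0
"""

if __name__ == "__main__":
    for kind, per_interface in summarize(SAMPLE).items():
        print(kind, dict(per_interface))
```

Saving the real `dmesg` output to a file and passing its contents to `summarize` would show at a glance whether the watchdog timeouts are confined to re0 or also touch re1.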

                                            • Grimson
                                              Grimson Banned @mmiller7 last edited by

                                              @mmiller7 said in Frequent internet loss - need help figuring out where and why? Maybe pfSense, Modem, ISP, or all 3?:

                                              re0: watchdog timeout

                                              That's a very common issue with crap Realtek NICs. You can try the official Realtek driver (hint: look in the Hardware section) or, better yet, switch to Intel NICs.

                                              • B
                                                bfeitell last edited by

                                                I agree with Grimson. The Realtek NICs can be very dodgy to work with. You should make sure that you have disabled all three of:

                                                Hardware Checksum Offloading
                                                Hardware TCP Segmentation Offloading
                                                Hardware Large Receive Offloading

                                                at the bottom of System/Advanced/Networking

                                                The "supersede interface-mtu 0" fix remains necessary for you if you were having the arpresolve/llinfo errors and frequent drops without it. The fix is now referenced in the upgrade guide for 2.4.4 for cases where the advanced options section has been touched.

                                                You can look through the network card tuning recommendations, and try variations on the MSI/MSIX fixes you see there by adapting them for (re) cards.

                                                https://www.netgate.com/docs/pfsense/hardware/tuning-and-troubleshooting-network-cards.html

                                                For example, adding something like these in /boot/loader.conf.local

                                                net.inet.tcp.tso=0
                                                hw.pci.enable_msix=0
                                                hw.pci.enable_msi=0
                                                hw.re.tso_enable=0

                                                Take a look through the forums, and you will see that many people have problems with Realtek hardware.

                                                I hope this helps.
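To see which of the example tunables above are already set in /boot/loader.conf.local, a small check along these lines may help. It is a minimal Python sketch, assuming the file consists of simple name=value lines (values optionally in double quotes, `#` starting a comment). The tunable names are copied from the list above as-is and are not verified against any particular driver version. `parse_loader_conf` and `diff_against_suggested` are hypothetical helper names.

```python
# Example values copied from the post above; whether each tunable is
# honored depends on the FreeBSD version and driver in use (assumption).
SUGGESTED = {
    "net.inet.tcp.tso": "0",
    "hw.pci.enable_msix": "0",
    "hw.pci.enable_msi": "0",
    "hw.re.tso_enable": "0",
}


def parse_loader_conf(text):
    """Parse name=value lines into a dict, dropping comments and quotes."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # strip trailing comments
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip().strip('"')
    return settings


def diff_against_suggested(text, suggested=SUGGESTED):
    """Return {name: (current_value_or_None, suggested_value)} for mismatches."""
    current = parse_loader_conf(text)
    return {
        name: (current.get(name), want)
        for name, want in suggested.items()
        if current.get(name) != want
    }


if __name__ == "__main__":
    example = 'hw.pci.enable_msix="0"\n# a comment\nhw.pci.enable_msi=1\n'
    for name, (have, want) in diff_against_suggested(example).items():
        print(f"{name}: have {have!r}, suggested {want!r}")
```

Reading the real file with `open("/boot/loader.conf.local").read()` and passing the text to `diff_against_suggested` lists only the entries that are missing or set differently.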

                                                • M
                                                  mmiller7 @Grimson last edited by mmiller7

                                                  @grimson The Zotac is a NUC-style low power mini box so the NICs can't be changed and it has no expansion slots, I did try a USB NIC already (AX88179) and it had the exact same problem. Actually I think everything I own that isn't a laptop has Realtek NICs on the motherboard.

                                                  Both my WAN and LAN are Realtek chips. re1 (the LAN) never seems to blink (metaphorically, that is); it's only re0 (WAN) that is blowing up. re1 even carries more throughput, because it has 3 VLANs going through it while re0 has none.

                                                  • M
                                                    mmiller7 @bfeitell last edited by mmiller7

                                                    @bfeitell

                                                    @bfeitell said in Frequent internet loss - need help figuring out where and why? Maybe pfSense, Modem, ISP, or all 3?:

                                                    I agree with Grimson. The Realtek NICs can be very dodgy to work with. You should make sure that you have disabled all three of:

                                                    Hardware Checksum Offloading
                                                    Hardware TCP Segmentation Offloading
                                                    Hardware Large Receive Offloading

                                                    at the bottom of System/Advanced/Networking

                                                    All 3 of those were already disabled

                                                    The "supersede interface-mtu 0" fix remains necessary for you if you were having the arpresolve/llinfo errors and frequent drops without it. The fix is now referenced in the upgrade guide for 2.4.4 for cases where the advanced options section has been touched.

                                                    I'll leave it in - I think it did (slightly) help my speeds even if it didn't help my reliability. I was seeing that prior to 2.4.4 (I hoped upgrading would help things).

                                                    You can look through the network card tuning recommendations, and try variations on the MSI/MSIX fixes you see there by adapting them for (re) cards.

                                                    https://www.netgate.com/docs/pfsense/hardware/tuning-and-troubleshooting-network-cards.html

                                                    For example, adding something like these in /boot/loader.conf.local

                                                    net.inet.tcp.tso=0
                                                    hw.pci.enable_msix=0
                                                    hw.pci.enable_msi=0
                                                    hw.re.tso_enable=0

                                                    Take a look through the forums, and you will see that many people have problems with Realtek hardware.

                                                    I hope this helps.

                                                    I'll look thru those and see what I can add.

                                                    I'm still wondering about the modem. Does anyone think it could be going bad, given those errors that keep jumping to high numbers shortly before it dies? I just don't understand why pfSense would crash if the modem stops passing data for a while. And since it ran stable for over a year and only now is unstable, a hardware incompatibility seems like an odd explanation.

                                                    • Grimson
                                                      Grimson Banned @mmiller7 last edited by

                                                      @mmiller7 said in Frequent internet loss - need help figuring out where and why? Maybe pfSense, Modem, ISP, or all 3?:

                                                      @grimson The Zotac is a NUC-style low power mini box so the NICs can't be changed and it has no expansion slots, I did try a USB NIC already (AX88179) and it had the exact same problem.

                                                      Then it is simply a bad choice for a pfSense installation.

                                                      @mmiller7 said in Frequent internet loss - need help figuring out where and why? Maybe pfSense, Modem, ISP, or all 3?:

                                                      Actually I think everything I own that isn't a laptop has Realtek NICs on the motherboard.

                                                      And those are all consumer grade devices, primarily intended to run Windows where the Realtek NICs work halfway decent (in a consumer use-case). pfSense is designed to run on enterprise grade hardware and based on FreeBSD.

                                                      The Realtek drivers from FreeBSD are pretty bad, the FreeBSD drivers from Realtek themselves are a bit better, but far from the quality of Intel (or Broadcom) drivers. If you want a stable and reliable pfSense installation you need to switch hardware. That's simply how it is, if you don't believe me check the hardware section and the FreeBSD forums. If you don't believe them and still insist on using Realtek NICs you'll have to live with crashes and issues related to those interfaces.

                                                      Those are the facts, and for me there is no reason to discuss this any further.

                                                      • M
                                                        mmiller7 last edited by

                                                        There may be some issues, but I can tell you there are also plenty of people successfully using this same Zotac box with pfSense, based on the reviews I was reading and multiple others I personally know who are running pfSense on it with no problems. It also worked for well over a year without any problems for me, so it seemingly can't be that bad if I'm only starting to see issues in the past month with the same configuration. It also doesn't explain why the SAME chipset works totally fine on the LAN interface even when the WAN crashes out.

                                                        I've used other FreeBSD-based "appliances" including m0n0wall and FreeNAS (pre-0.8). I know there can be issues with drivers; I have seen it. This doesn't fit the pattern I've seen with other incompatible devices, though. In all those cases it was unstable or slow out of the box, rather than working for a long period of time and then blowing up.

                                                        • M
                                                          mmiller7 last edited by

                                                          Had another drop-out tonight even with the extra options in there tweaking stuff.

                                                          At this point I'm going to try a new modem (one that supports 32x8 channels vs 16x4) and see if that helps any. When I called my ISP, the tech dug around a bit and thought I could just be dropping offline because of too many errors from an over-saturated node. I do see I'm up to 24x3 channels bonded vs 16x3 with the old modem; maybe the extra few channels will help if they over-subscribed the network.

                                                          • stephenw10
                                                            stephenw10 Netgate Administrator last edited by

                                                            If you're seeing those watchdog errors on the re NIC then the only solution that has been reported to work is the alternative driver. Lots of users with Zotac boxes have hit that issue. I wouldn't bother doing anything else until you try that:
                                                            https://forum.netgate.com/topic/135850/official-realtek-driver-binary-1-95-for-2-4-4-release

                                                            Steve

                                                            • M
                                                              mmiller7 last edited by mmiller7

                                                              Both NICs are the same model; if it's a driver problem, why would only one be affected?

                                                              re1 has far more traffic (routing between VLANs including IP-cameras to servers) than re0...yet re0 is the only one that seems to choke?

                                                              • chpalmer
                                                                chpalmer last edited by chpalmer

                                                                @mmiller7 said in Frequent internet loss - need help figuring out where and why? Maybe pfSense, Modem, ISP, or all 3?:

                                                                SB6193

                                                                Your modem is probably an SB6183. Correct me if I'm wrong.

                                                                I'm not a big fan of Arris products anymore. Do you have another modem to try?

                                                                6183s can get really hot. If they get too hot, I've seen them start to error out. Not every one, not every customer, but enough that we do not keep them in service for our customers anymore.

                                                                You probably mentioned it, but who is your ISP and what region are you in? Edit: found it. Metrocast/Atlantic Broadband

                                                                Triggering snowflakes one by one..

                                                                • M
                                                                  mmiller7 last edited by mmiller7

                                                                  The modem was an SB6183, that's correct. My thread-starter post has screenshots attached of the modem status/config/log pages (192.168.100.1) with the signals, errors, and logs.

                                                                  And yes, it got what I consider to be "very hot": I measured the exterior of the case at better than 120F with an IR thermometer. I tried having a case fan blow through it; that brought the errors down from several thousands to several hundreds, but it was still generating errors after a while, especially in the evenings.

                                                                  At one point I found a forum post suggesting cellular interference, so I tried disconnecting my femtocell (which is the only way to get usable cell service in this area) to rule that out, since it sits near where the coax comes into the house. That did not make an appreciable difference.

                                                                  The only other modem I already owned is an ancient one that I used for maybe 6-8 years in college, but it's too old to be supported (it may not even be DOCSIS 2.0; I only had about 3Mbps back then). Saturday evening I finally got mad at it and replaced it with an Arris SB6190 (which also has a fan blowing through the case slots). When I called the ISP to have them register the new modem, the technician I spoke to thought it could just be that at peak times the node I'm on is over capacity and throwing errors.

                                                                  There could be some merit to the cable tech's theory on node saturation. As I think about it, "most of the time" when I've seen errors piling up before dropping offline, it has started around 10PM local time. In my tests I have been able to nearly saturate both upload and download (and a 150-200Mbps downlink is not easy to saturate) for 8-9 hours while I'm at work, just running an infinite loop, and it has never been offline when I got home.
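One way to test the node-saturation theory is to bucket the times of each dropout by hour of day and see how many fall in the evening. Below is a minimal Python sketch under that assumption. The timestamps in `EXAMPLE_TIMES` are placeholders for illustration only, not data from this thread, and both function names are hypothetical.

```python
from collections import Counter
from datetime import datetime


def outages_by_hour(timestamps, fmt="%Y-%m-%d %H:%M"):
    """Return a Counter mapping hour of day (0-23) to number of outages."""
    return Counter(datetime.strptime(ts, fmt).hour for ts in timestamps)


def share_in_window(by_hour, start_hour, end_hour):
    """Fraction of outages whose hour falls in [start_hour, end_hour)."""
    total = sum(by_hour.values())
    if total == 0:
        return 0.0
    inside = sum(n for hour, n in by_hour.items() if start_hour <= hour < end_hour)
    return inside / total


# Placeholder timestamps, NOT real measurements; replace with times
# noted from the logs when each dropout began.
EXAMPLE_TIMES = [
    "2000-01-01 22:05",
    "2000-01-02 22:40",
    "2000-01-03 14:10",
    "2000-01-04 23:15",
]

if __name__ == "__main__":
    by_hour = outages_by_hour(EXAMPLE_TIMES)
    print("outages per hour:", dict(sorted(by_hour.items())))
    print("share between 21:00 and 24:00:", share_in_window(by_hour, 21, 24))
```

If most real dropouts cluster in the same evening window regardless of local load, that would point toward the ISP side rather than the router.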

                                                                  Since I really want a separate stand-alone modem that isn't an ISP-managed "all in one router" it looks like Arris is about the only modem available.

                                                                  Side note: when I called the ISP for the modem swap I had an unusual experience. I got a tech who was not only partially familiar with Linux setups but was also a ham radio operator, and I was able to have an intelligent conversation about why I wanted to swap modems, what concerns I had about the instability, and how I'd checked signals (he also verified from his end that the signals looked good when it was not dropped offline). He also described the management system he used to set up my new modem as "archaic", for whatever that is worth. It cannot just "plug n play" a new modem like some ISPs where you can log in and self-register; there's no captive portal, and the only way to swap modems is to call tech support and have them replace the modem MAC address in the head-end config from their end.

                                                                  I wish I had other ISP options or fiber...only thing my parents have had on FiOS is the "North American Fiber-Seeking Backhoe" eating the optical cable every year or two. But alas, out here I have exactly one ISP and unusable cellular, too many trees for satellite (and I rent so no cutting them down).

                                                                  • A
                                                                    acascianelli last edited by

                                                                    I’ve been having a very similar problem. I’m using a SB6183 as well, but with a PCEngines APU2 as my firewall.

                                                                    PC Engines APU2C4

                                                                    • M
                                                                      mmiller7 last edited by mmiller7

                                                                      Just wanted to post an update. While it's only been 4 days since I got the new modem, so far I have not had any more lockups/dropouts, even pushing 200GB per day of transfer (I've been trying to run frequent speed tests, several pings, plus normal traffic). Also, while I have some channels with "corrected" frames on the modem, it's only 10 or so at most, with 0 uncorrectable (down from many thousands).

                                                                      On pfSense, Status > Monitoring reports only 0.21% maximum packet loss and 0% average, and my ping has stayed below 50ms even under load and immediately returns to <10ms when the load lets up.

                                                                      The dmesg output shows no unexpected messages, no flapping, no "watchdog" errors and no "llinfo" errors. It seems stable once again. Hopefully I didn't just jinx it.

                                                                      EDIT: 7 days now going strong.
