Netgate Discussion Forum

    Losing internet since this morning, packet loss and gateway offline

    General pfSense Questions
    • pftdm007

      Thanks Bill ! But that kinda raises more questions 😳

      • I understand that dpinger probes the WAN periodically, but this kind of issue has never happened in many, many years (if ever) since I've lived here... I've lost the web a few times (5-6 times in the last 9 years), but each time, power cycling the modem did the trick (with consumer-grade gear, that happens). Is there a way to better isolate the issue? Perhaps a command to run or a tool to use on pfSense the next time this happens? At the moment it's all working fine, but I expect it to happen again... unless the issue is on the ISP's side and they fix their stuff. I will call them to ask.

      • I don't want to change settings such as thresholds just to make the issue "go away". It has always worked well; I see that more as a band-aid and would rather fix the underlying problem than "mask it"...

      • I take it that the issues I've presented above are all benign? What about the following?

      Oct 13 12:29:57 vnstatd 14114 Error: pidfile "/var/run/vnstat/vnstat.pid" lock failed (Resource temporarily unavailable), exiting.

      Oct 13 12:29:37 ntopng [HTTPserver.cpp:1005] ERROR: [HTTP] set_ports_option: cannot bind to 3000s: Address already in use

      Oct 13 12:28:01 dhcpleases Could not deliver signal HUP to process because its pidfile (/var/run/unbound.pid) does not exist, No such process.
      Oct 13 12:28:07 php-cgi servicewatchdog_cron.php: The command '/usr/local/sbin/unbound -c /var/unbound/unbound.conf' returned exit code '1', the output was '[1602606487] unbound[88054:0] error: bind: address already in use [1602606487] unbound[88054:0] fatal error: could not open ports'

      • bmeeks @pftdm007

        @pftdm007 said in Losing internet since this morning, packet loss and gateway offline:

        Oct 13 12:28:07 php-cgi servicewatchdog_cron.php: The command '/usr/local/sbin/unbound -c /var/unbound/unbound.conf' returned exit code '1', the output was '[1602606487] unbound[88054:0] error: bind: address already in use [1602606487] unbound[88054:0] fatal error: could not open ports'

        Let me start by saying the very first thing you need to do is uninstall the Service Watchdog package. It will conflict and interfere with the actions initiated by dpinger when it thinks the gateway is "down". There is no real need for the Service Watchdog package. Remove it! It is causing those problems you posted (the errors about PID files and such) because it will try to restart things that are already being restarted by the process kicked off by dpinger. So with multiple restart requests those packages and processes can sort of lose their mind, so to speak.
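To illustrate the "address already in use" part of that log line: the following is only a minimal sketch, not how unbound or Service Watchdog actually work internally. It shows the general operating-system behavior when a second process tries to bind a port that an earlier instance still holds, which is the kind of collision that overlapping restart requests can produce. The function name `second_bind_errno` is made up for the example.

```python
import errno
import socket
from typing import Optional


def second_bind_errno() -> Optional[int]:
    """Bind one UDP socket, then try to bind a second one to the same address.

    Returns the errno of the failed second bind, or None if it succeeded.
    """
    first = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        first.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
        address = first.getsockname()  # (ip, port) actually assigned
        try:
            second.bind(address)       # same ip and port as `first`
        except OSError as exc:
            return exc.errno
        return None
    finally:
        first.close()
        second.close()


if __name__ == "__main__":
    err = second_bind_errno()
    # Print the symbolic errno name if the second bind failed.
    print(errno.errorcode.get(err, err))
```

The second bind fails because neither socket asked for address reuse, so the first one owns the port until it is closed.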

        Perhaps you have never had either as much loss, or for as long, as you are currently experiencing. The point remains that dpinger logged a gateway alarm with 25% packet loss. That is likely a real event. If you can log into your cable modem (usually at 192.168.100.1), you can check your upstream and downstream signal levels and signal-to-noise ratio values to see if they show any issues. You can also look at the gateway monitoring portion of the pfSense system log to see if dpinger is periodically seeing packet loss.

        • AKEGEC

          @pftdm007, Internet outages can be caused by local, regional, or ISP issues. For example, the ISP may be under continuous attack (some attacks go on, others stop within a day), or the ISP may have changed its protocols. Try running without security packages for a day.

          • pftdm007

            @bmeeks : The Watchdog service is gone. Let's see if it helps reduce the number of errors in the logs, and hopefully eliminates the chance of it interfering with the rest of the system..... I wonder why the package isn't deprecated in the package store?

            I cannot log in to my cable modem because I haven't found out how to yet. It's upstream of pfSense and seems to be configured in bridge mode, since pfSense's WAN IP is the one assigned by my ISP and not an IP in the 192.168.... subnet. However, System Logs > Gateways shows entries with 192.168.100.10 and 192.168.100.1...

            Not sure how this is configured. I haven't configured this modem in ages, at least 7 years now, only power cycling it when it's acting up. The other IPs masked by "xxx" are my public IP.

            Other than that, these are the entries in the logs. The newest ones date back to Oct 13 (when I posted here), and there are no newer errors or entries. I believe this is a sign the issues are resolved...

            Oct 13 12:38:26 	dpinger 		send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 1 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr xxx.xxx.xxx.xxx bind_addr xxx.xxx.xxx.xxx identifier "WAN_HW_DHCP "
            Oct 13 12:38:17 	dpinger 		send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 1 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr xxx.xxx.xxx.xxx bind_addr xxx.xxx.xxx.xxx identifier "WAN_HW_DHCP "
            Oct 13 12:37:47 	dpinger 		send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 1 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr 192.168.100.1 bind_addr 192.168.100.10 identifier "WAN_HW_DHCP "
            Oct 13 12:37:44 	dpinger 		WAN_HW_DHCP xxx.xxx.xxx.xxx: sendto error: 65
            Oct 13 12:37:43 	dpinger 		WAN_HW_DHCP xxx.xxx.xxx.xxx: sendto error: 65
            Oct 13 12:37:42 	dpinger 		WAN_HW_DHCP xxx.xxx.xxx.xxx: sendto error: 65
            Oct 13 12:37:42 	dpinger 		WAN_HW_DHCP xxx.xxx.xxx.xxx: sendto error: 65
            Oct 13 12:37:41 	dpinger 		WAN_HW_DHCP xxx.xxx.xxx.xxx: sendto error: 65
            Oct 13 12:37:41 	dpinger 		WAN_HW_DHCP xxx.xxx.xxx.xxx: sendto error: 65
            Oct 13 12:37:40 	dpinger 		WAN_HW_DHCP xxx.xxx.xxx.xxx: sendto error: 65
            Oct 13 12:37:40 	dpinger 		WAN_HW_DHCP xxx.xxx.xxx.xxx: sendto error: 65
            Oct 13 12:37:39 	dpinger 		WAN_HW_DHCP xxx.xxx.xxx.xxx: sendto error: 65
            Oct 13 12:37:39 	dpinger 		WAN_HW_DHCP xxx.xxx.xxx.xxx: sendto error: 65
            Oct 13 12:37:38 	dpinger 		WAN_HW_DHCP xxx.xxx.xxx.xxx: sendto error: 65
            Oct 13 12:37:38 	dpinger 		WAN_HW_DHCP xxx.xxx.xxx.xxx: sendto error: 65
            Oct 13 12:37:37 	dpinger 		WAN_HW_DHCP xxx.xxx.xxx.xxx: sendto error: 65
            Oct 13 12:37:37 	dpinger 		WAN_HW_DHCP xxx.xxx.xxx.xxx: sendto error: 65
            Oct 13 12:37:36 	dpinger 		WAN_HW_DHCP xxx.xxx.xxx.xxx: sendto error: 65 
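For anyone who wants to summarize a log excerpt like the one above instead of reading it line by line, here is a small sketch. It assumes the log format is exactly as pasted here, and it assumes FreeBSD errno numbering (pfSense is FreeBSD-based), where 65 is `EHOSTUNREACH`, "No route to host". The regular expression, the helper name `count_sendto_errors`, and the one-entry errno table are illustrative, not part of dpinger or pfSense.

```python
import re
from collections import Counter
from typing import Iterable

# Matches lines such as:
#   Oct 13 12:37:44  dpinger  WAN_HW_DHCP xxx.xxx.xxx.xxx: sendto error: 65
SENDTO_RE = re.compile(r"dpinger\s+(\S+) \S+: sendto error: (\d+)")

# FreeBSD errno names (assumption: the log came from a FreeBSD-based system).
FREEBSD_ERRNO_NAMES = {65: "EHOSTUNREACH"}


def count_sendto_errors(lines: Iterable[str]) -> Counter:
    """Count dpinger sendto errors per (gateway name, errno)."""
    counts: Counter = Counter()
    for line in lines:
        match = SENDTO_RE.search(line)
        if match:
            counts[(match.group(1), int(match.group(2)))] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "Oct 13 12:37:44 \tdpinger \t\tWAN_HW_DHCP xxx.xxx.xxx.xxx: sendto error: 65",
        "Oct 13 12:37:43 \tdpinger \t\tWAN_HW_DHCP xxx.xxx.xxx.xxx: sendto error: 65",
    ]
    for (gateway, code), n in count_sendto_errors(sample).items():
        print(gateway, FREEBSD_ERRNO_NAMES.get(code, code), n)
```

A burst of these errors in a few seconds would suggest that pfSense had no usable route to the monitor address at that moment, rather than the probes being lost somewhere upstream.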
            

            @AKEGEC : Thanks for the advice, that has always been my default approach to isolate an issue: deactivate packages, then firewall rules, then HW troubleshooting. I got there quick ;)

            • bmeeks

              You have a problem with your cable Internet connection. But that problem is then causing you a couple of different issues in pfSense. I'll try to briefly explain.

              When everything on the cable company's side is working correctly, your cable modem in bridge mode will respond to DHCP requests from pfSense on your WAN and assign the public IP provided by the cable company to pfSense. To be more technically precise, your modem will pass the DHCP traffic through to the equipment at the cable company's headend. Then your Internet works normally.

              You have something going wrong now and then with your cable connection such that packet loss on your line increases. This causes dpinger to execute its "gateway down" actions. That consists of sending all packages and services that use the WAN interface "restart" commands.
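To put rough numbers on that, here is a back-of-the-envelope sketch using the values from the dpinger log lines in this thread (send_interval 500ms, time_period 60000ms, loss_alarm 20%). It is a simplification and an assumption on my part, not dpinger's actual algorithm: the real daemon also waits loss_interval before counting a probe as lost, and whether its comparison is strict or not is not something this sketch claims. The function names are made up.

```python
def probes_per_window(send_interval_ms: int, time_period_ms: int) -> int:
    """Approximate number of probes sent during one averaging window."""
    return time_period_ms // send_interval_ms


def loss_percent(lost: int, sent: int) -> float:
    """Share of probes that went unanswered, as a percentage."""
    return 100.0 * lost / sent


def is_loss_alarm(lost: int, sent: int, loss_alarm_pct: float) -> bool:
    """Assumed rule: alarm once loss reaches the configured threshold."""
    return loss_percent(lost, sent) >= loss_alarm_pct


if __name__ == "__main__":
    sent = probes_per_window(send_interval_ms=500, time_period_ms=60000)
    # 25% loss, the figure mentioned earlier in the thread.
    lost = sent // 4
    print(sent, lost, is_loss_alarm(lost, sent, loss_alarm_pct=20))
```

With those settings a window holds about 120 probes, so roughly 24 unanswered probes within a minute is already enough to reach a 20% threshold.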

              If the loss on your cable signal side is bad enough, your modem will lose sync with the piece of Internet equipment at the headend called a CMTS (Cable Modem Termination System). Once your cable modem loses sync with the CMTS, the modem will send pfSense a DHCP release/renew sequence and will assign pfSense a temporary RFC1918 address in the 192.168.100.0/24 subnet from a local DHCP server that lives on the cable modem. That's where that .10 IP address is likely coming from. The .1 IP address is the cable modem itself.

              Once pfSense gets that new WAN IP (the 192.168.100.10, for example), then pfSense is happy. The problem, though, is that IP address is not routable on the Internet and you can't talk to anything with that IP. What should happen is that when your cable modem re-establishes sync with the CMTS, it notifies pfSense that it needs to change its WAN IP to the public IP coming down from the CMTS. Unfortunately, with some cable modems, that second handshake with pfSense does not go well and pfSense happily retains the non-routable RFC1918 address it was given when communication with the CMTS was lost. When you physically reboot the cable modem, that breaks things loose and a normal DHCP request/offer sequence happens between pfSense and the CMTS and your connectivity returns.
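If you want a quick way to tell which situation you are in, the check boils down to asking whether the current WAN address falls inside the modem's 192.168.100.0/24 range. A minimal sketch using Python's standard `ipaddress` module follows; the function name and the three labels are made up for the example, and 203.0.113.7 is a documentation address standing in for a real public IP.

```python
import ipaddress

# Range handed out by the cable modem's own DHCP server, per the explanation above.
MODEM_FALLBACK_NET = ipaddress.ip_network("192.168.100.0/24")


def classify_wan_address(addr: str) -> str:
    """Label a WAN address as modem fallback, other private, or public."""
    ip = ipaddress.ip_address(addr)
    if ip in MODEM_FALLBACK_NET:
        return "modem-fallback"
    # is_private covers RFC1918 plus other non-global ranges (loopback, etc.).
    if ip.is_private:
        return "private"
    return "public"


if __name__ == "__main__":
    for candidate in ("192.168.100.10", "10.1.2.3", "8.8.8.8"):
        print(candidate, classify_wan_address(candidate))
```

Seeing "modem-fallback" on the WAN after the modem has resynced would match the stuck-lease case described above.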

              Most of us prevent our cable modems from giving pfSense that non-routable RFC1918 address by putting the IP of the cable modem (192.168.100.1) in the Reject Leases From field of the DHCP Client configuration for the WAN. See below --

              DHCP_client_reject_leases.png

              Telling pfSense to reject leases coming from the cable modem itself results in pfSense constantly sending DHCP requests out via the WAN. Once the CMTS and the modem sync back up, the modem will send the pfSense DHCP request to the CMTS and the CMTS will respond with a DHCP offer containing your public IP.
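Conceptually, the Reject Leases From setting is just a filter on who made the offer. The sketch below is a toy model of that idea and not the DHCP client's real code: the `Offer` class, the function name, and the server address 10.20.0.1 (standing in for whatever DHCP server sits behind the CMTS) are all hypothetical, and 203.0.113.7 is a documentation address used in place of a real public IP.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set


@dataclass(frozen=True)
class Offer:
    server: str   # IP of the DHCP server that made the offer
    address: str  # IP address being offered to the client


def first_acceptable_offer(offers: Iterable[Offer],
                           reject_from: Set[str]) -> Optional[Offer]:
    """Return the first offer whose server is not on the reject list."""
    for offer in offers:
        if offer.server in reject_from:
            continue  # ignore the modem's own lease and keep waiting
        return offer
    return None  # nothing usable yet; the client keeps requesting


if __name__ == "__main__":
    offers = [
        Offer(server="192.168.100.1", address="192.168.100.10"),  # modem
        Offer(server="10.20.0.1", address="203.0.113.7"),         # upstream
    ]
    print(first_acceptable_offer(offers, reject_from={"192.168.100.1"}))
```

While only the modem is answering, the function returns `None`, which corresponds to pfSense continuing to send requests until the upstream server responds.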

              If you can't connect to your cable modem at 192.168.100.1, then perhaps you have checked the box to block RFC1918 and bogons for the WAN. You will need to uncheck the RFC1918 box to talk to the cable modem. You should be able to connect and see a page where you can get signal stats. As an example, here are the Downstream RF stats from my cable modem:

              cable_modem_ds_stats.png

              In addition to signal level stats, you should have access to some type of event logging that will give you some clues about whether or not the modem is losing sync with the CMTS. Referring to my screenshot above, the logs are on the Event Log tab. Most cable modems have basically the same types of screens and data available. You can Google the specific brand and model of your cable modem for details. The cable company also has this data when they want to look at it, so they could be proactive and see the line loss and address it. But few do. They wait for customers to call and complain.

              • johnpoz LAYER 8 Global Moderator

                @bmeeks how long has your modem been up - those are some pretty high numbers for Uncorrectables.. I take it your modem has been up a really long time?

                An intelligent man is sometimes forced to be drunk to spend time with his fools
                If you get confused: Listen to the Music Play
                Please don't Chat/PM me for help, unless mod related
                SG-4860 24.11 | Lab VMs 2.8, 24.11

                • bmeeks @johnpoz

                  @johnpoz said in Losing internet since this morning, packet loss and gateway offline:

                  @bmeeks how long has your modem been up - those are some pretty high numbers for Uncorrectables.. I take it your modem has been up a really long time?

                  Yes, it has been up for quite a while. But my house also sits 350 feet off the street on a large lot, so I have a very long run of buried RG6 coax from the street drop to my home. I installed my own bi-directional line amp at the bottom of the pole where my drop enters the ground. My speed tests always max out, though, at 105 and 11 megabits/sec. I pay for what is currently the top package they offer in my small town (100/10).

                  Before installing the line amp I had frequent loss of sync -- like several times a week. Now I never lose sync unless the company is doing late night maintenance. I would like to see the error numbers lower, but we do lots of Netflix and Amazon streaming and never noticed any issues. And speed tests have always maxed out no matter when I test. So it works even if the stats are not fantastic.

                  • pftdm007

                    @bmeeks : I added the modem's "internal" IP to the Reject IP list in the WAN interface settings page. Let's see if things happen in a "cleaner" way the next time I lose internet connectivity... At least if pfSense is "pounding" the WAN for a new IP via DHCP requests until it gets a valid one, I will clearly see it in the logs and will know what's happening...

                    As for accessing the modem upstream of pfSense, I unchecked both boxes (Block bogon networks and Block private networks) under the WAN's settings, but when I try to access the modem, Firefox quickly throws a "Connection Reset" error page...

                    Not a big deal but that'd be interesting to see what's going on at the modem level...

                    • johnpoz LAYER 8 Global Moderator

                      To access your modem, you may need to create a VIP on your modem's network, say 192.168.100.2, and use that VIP via outbound NAT to access the modem status page.

                      vip.png

                      The source in mine is my local LAN, 192.168.9.0/24... So when a client on my LAN wants to connect to the modem status page, pfSense NATs that traffic to the VIP that was set.. So the modem sees traffic from 192.168.100.2

                      You may or may not need to do that.. Really depends on the modem, etc.
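As a rough model of what that outbound NAT rule does, the sketch below picks the source address the modem would see. It is illustrative only and not pfSense code: it reads the LAN as 192.168.9.0/24, uses the example VIP 192.168.100.2 from this post, assumes everything else leaves with the normal WAN address, and uses the documentation address 203.0.113.7 in place of a real public IP. The function name is made up.

```python
import ipaddress

LAN_NET = ipaddress.ip_network("192.168.9.0/24")      # LAN from this post
MODEM_NET = ipaddress.ip_network("192.168.100.0/24")  # modem's management net
VIP = "192.168.100.2"                                 # example VIP on the WAN


def translated_source(src: str, dst: str, wan_ip: str) -> str:
    """Return the source address a packet would carry after outbound NAT."""
    from_lan = ipaddress.ip_address(src) in LAN_NET
    to_modem = ipaddress.ip_address(dst) in MODEM_NET
    if from_lan and to_modem:
        return VIP      # modem sees a neighbor on its own subnet
    return wan_ip       # assumed default: NAT to the WAN address


if __name__ == "__main__":
    print(translated_source("192.168.9.50", "192.168.100.1", "203.0.113.7"))
    print(translated_source("192.168.9.50", "1.1.1.1", "203.0.113.7"))
```

The point of the VIP is that the modem gets a source address on its own subnet, so it can reply without needing a route back to a public or LAN address.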

                      • bmeeks

                        @johnpoz is correct. Some modems need an accommodation on the WAN in order to access them when they are in bridge mode. My current Arris modem does not, and neither did my prior Motorola.

                        • johnpoz LAYER 8 Global Moderator

                          Yeah, I can access mine without it as well. I just have that set up since it doesn't really hurt anything.. And I can use it to show the setup to others..

                          If you're having issues - I suggest you try setting it up..

                          Edit:
                          Here - with it turned off, you can see me talking to the modem from my public WAN IP.. Then I turned it back on, and you can see it using the VIP to talk.

                          modem.png

                          • Raffi_ @johnpoz

                            @johnpoz said in Losing internet since this morning, packet loss and gateway offline:

                            To access your modem, you may need to create a VIP on your modem's network, say 192.168.100.2, and use that VIP via outbound NAT to access the modem status page.

                            vip.png

                            The source in mine is my local LAN, 192.168.9.0/24... So when a client on my LAN wants to connect to the modem status page, pfSense NATs that traffic to the VIP that was set.. So the modem sees traffic from 192.168.100.2

                            You may or may not need to do that.. Really depends on the modem, etc.

                            Didn't know about this setting. In my case, I had to add an Alias IPv4 address under the interface to access my 4G LTE modem GUI.
                            cfd5e601-d2c9-4131-8883-494e7da82aa3-image.png

                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.