Netgate Discussion Forum

    Running pfSense 2.7.0-release (amd64) and it randomly fails, losing connection to ISP

    General pfSense Questions
    2.7.0-rel
    42 Posts 3 Posters 7.0k Views
    • Wylbur

      Note: I will be away at a convention and will have this system powered down while I am gone. I will be back 10-OCT-23, but I will have some access to email in the meantime.

      Wylbur.

      • Wylbur @Wylbur

        @Wylbur I made the switch from the dual-port Realtek card (MAC 00:e0:4c:61:b4:94) to the port on the MOBO (chipset unknown). It took me a bit to figure out a few changes that had to be made, but so far so good (I did this about 30 minutes ago).

        This was forced by a weird problem I am having with a government site, so I decided now was the time to do this, just in case the problem is with the WAN port.

        Nice idea while it lasted. -- But things are otherwise working with this swap.

        I don't see collisions, but I do see a large number of interrupts. I don't exactly know what those are (I work with interrupt driven mainframes, so I expect to see interrupts coming out my ears. Every I/O is at least 1 interrupt, then there are various caused by instruction streams, the system timer generates many time interrupts for dispatcher processing...).

        Wylbur.

        • stephenw10 Netgate Administrator @Wylbur

          @Wylbur said in Running pfSense 2.7.0-release (amd64) and it randomly fails, losing connection to ISP:

          I don't see collisions, but I do see a large number of interrupts. I don't exactly know what those are (I work with interrupt driven mainframes, so I expect to see interrupts coming out my ears. Every I/O is at least 1 interrupt, then there are various caused by instruction streams, the system timer generates many time interrupts for dispatcher processing...).

          Exactly, interrupts are required for the NIC to function so that's not a problem unless the rate is very high.

          Steve
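One rough way to judge whether an interrupt rate is high is to look at the rate column that `vmstat -i` prints on FreeBSD. The sketch below parses text in that layout and lists sources at or above a chosen rate. The sample text, the function names, and the 500 per second threshold are all illustrative assumptions rather than figures from this thread, and what counts as "very high" depends on the hardware.

```python
# Illustrative sample in the "interrupt / total / rate" layout of `vmstat -i`.
# These numbers are made up for the example; they are not from this system.
SAMPLE_VMSTAT = """\
interrupt                          total       rate
irq16: re0                        812345         45
cpu0:timer                      18001234       1000
Total                           18813579       1045
"""


def parse_vmstat_i(text):
    """Map interrupt source name -> (total, rate) from vmstat -i style text."""
    rows = {}
    for line in text.splitlines():
        # Source names can contain spaces, so split the two numbers off the right.
        parts = line.rsplit(None, 2)
        if len(parts) != 3:
            continue
        name, total, rate = parts
        if total.isdigit() and rate.isdigit():  # skips the header line
            rows[name.strip()] = (int(total), int(rate))
    return rows


def busy_sources(text, threshold_per_sec):
    """Return {source: rate} for sources at or above threshold_per_sec."""
    return {
        name: rate
        for name, (_total, rate) in parse_vmstat_i(text).items()
        if name != "Total" and rate >= threshold_per_sec
    }


if __name__ == "__main__":
    # 500/s is an arbitrary cut-off chosen only for this example.
    print(busy_sources(SAMPLE_VMSTAT, 500))
```

A timer source showing up here is normal; the thing to watch for would be a NIC source at a rate far above what the traffic level explains.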

          • Wylbur @stephenw10

            @stephenw10 It has finally locked up twice now since the change to using the MOBO ethernet port. This is what I captured before a reboot (and I do not understand the error):

            Oct 28 04:37:18 kernel .done.
            Oct 28 04:37:22 php-cgi 482 rc.bootup: Creating rrd update script
            Oct 28 04:37:22 kernel done.
            Oct 28 04:37:23 syslogd exiting on signal 15
            Oct 28 04:37:23 syslogd kernel boot file is /boot/kernel/kernel
            Oct 28 04:37:23 php-fpm 382 /rc.start_packages: Restarting/Starting all packages.
            Oct 28 04:37:23 php-fpm 382 /rc.start_packages: [zeek] Removing cronjobs ...
            Oct 28 04:37:23 root 45144 Bootup complete
            Oct 28 04:37:25 login 56559 login on ttyv0 as root
            Oct 28 04:37:25 sshguard 65433 Now monitoring attacks.
            Oct 28 04:44:00 sshguard 65433 Exiting on signal.
            Oct 28 04:44:00 sshguard 11155 Now monitoring attacks.
            Oct 28 07:04:00 sshguard 11155 Exiting on signal.
            Oct 28 07:04:00 sshguard 70639 Now monitoring attacks.
            Oct 28 19:46:00 sshguard 70639 Exiting on signal.
            Oct 28 19:46:00 sshguard 70135 Now monitoring attacks.
            Oct 28 20:48:00 sshguard 70135 Exiting on signal.
            Oct 28 20:48:00 sshguard 95525 Now monitoring attacks.
            Oct 29 08:36:00 sshguard 95525 Exiting on signal.
            Oct 29 08:36:00 sshguard 22645 Now monitoring attacks.
            Oct 29 12:44:00 sshguard 22645 Exiting on signal.
            Oct 29 12:44:00 sshguard 65234 Now monitoring attacks.
            Oct 29 17:36:52 rc.gateway_alarm 27177 >>> Gateway alarm: WAN_DHCP (Addr:8.8.8.8 Alarm:1 RTT:15.698ms RTTsd:.952ms Loss:22%)
            Oct 29 17:36:52 check_reload_status 443 updating dyndns WAN_DHCP
            Oct 29 17:36:52 check_reload_status 443 Restarting IPsec tunnels
            Oct 29 17:36:52 check_reload_status 443 Restarting OpenVPN tunnels/interfaces
            Oct 29 17:36:52 check_reload_status 443 Reloading filter
            Oct 29 17:36:53 php-fpm 382 /rc.openvpn: Gateway, none 'available' for inet, use the first one configured. 'WAN_DHCP'
            Oct 29 17:36:53 php-fpm 382 /rc.openvpn: Gateway, NONE AVAILABLE
            Oct 29 19:29:43 php-fpm 382 /index.php: Session timed out for user 'admin' from: 192.168.1.21 (Local Database)
            Oct 29 19:29:48 php-fpm 382 /index.php: Successful login for user 'admin' from: 192.168.1.21 (Local Database)

            • Wylbur @Wylbur

              @Wylbur re1 is a Realtek 8168/8111, as is re0. re0 is now on the MOBO.

              What does anyone recommend for a better adapter? I'd like to try to replace the dual port card I have.

              Wylbur.

              • stephenw10 Netgate Administrator

                Any Intel 1G NIC would be far better.

                Nothing really logged there beyond the packet loss alarm. No watchdog timeouts logged.
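For comparing several incidents, the gateway alarm lines can be pulled out of a saved system log with a small script. This is a minimal sketch that assumes the alarm lines keep the exact layout quoted in this thread; the function names are hypothetical, and the "watchdog timeout" wording searched for is an assumption about how a driver would log it.

```python
import re

# Matches the rc.gateway_alarm lines as they appear in the log quoted above.
ALARM_RE = re.compile(
    r"Gateway alarm: (?P<gateway>\S+) \(Addr:(?P<addr>\S+) Alarm:(?P<alarm>\d+) "
    r"RTT:(?P<rtt>[\d.]+)ms RTTsd:(?P<rttsd>[\d.]+)ms Loss:(?P<loss>\d+)%\)"
)

# Two alarm lines copied from the logs posted in this thread.
SAMPLE_LOG = """\
Oct 29 17:36:52 rc.gateway_alarm 27177 >>> Gateway alarm: WAN_DHCP (Addr:8.8.8.8 Alarm:1 RTT:15.698ms RTTsd:.952ms Loss:22%)
Oct 29 17:36:52 check_reload_status 443 Reloading filter
Nov 21 11:36:52 rc.gateway_alarm 70343 >>> Gateway alarm: WAN_DHCP (Addr:8.8.8.8 Alarm:1 RTT:24.697ms RTTsd:.955ms Loss:21%)
"""


def parse_gateway_alarms(log_text):
    """Return one dict per gateway alarm line found in log_text."""
    alarms = []
    for line in log_text.splitlines():
        match = ALARM_RE.search(line)
        if match is None:
            continue
        alarms.append({
            "timestamp": line.strip()[:15],  # e.g. "Oct 29 17:36:52"
            "gateway": match["gateway"],
            "addr": match["addr"],
            "alarm": int(match["alarm"]),
            "rtt_ms": float(match["rtt"]),
            "rttsd_ms": float(match["rttsd"]),
            "loss_pct": int(match["loss"]),
        })
    return alarms


def has_watchdog_timeout(log_text):
    """True if any line mentions a watchdog timeout (assumed wording)."""
    return "watchdog timeout" in log_text.lower()


if __name__ == "__main__":
    for alarm in parse_gateway_alarms(SAMPLE_LOG):
        print(alarm["timestamp"], alarm["gateway"], alarm["loss_pct"], "% loss")
    print("watchdog timeout seen:", has_watchdog_timeout(SAMPLE_LOG))
```

Both alarms quoted in the thread monitor 8.8.8.8 and report loss just over 20% with a normal RTT, which a side-by-side listing like this makes easy to spot.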

                • Wylbur @stephenw10

                  @stephenw10

                  The problem has happened again: loss of connection with the ISP (within the last 15 minutes), and this time I have an Intel-based dual-port 1Gb Ethernet card. The following is what the syslog shows (I did a reroot reboot):

                  Nov 20 19:46:04 php-fpm 60708 [Snort] Snort STOP for WAN(igb1)...
                  Nov 20 19:46:05 snort 68178 *** Caught Term-Signal
                  Nov 20 19:46:05 kernel igb1: promiscuous mode disabled
                  Nov 20 20:03:00 sshguard 85448 Exiting on signal.
                  Nov 20 20:03:00 sshguard 55524 Now monitoring attacks.
                  Nov 20 21:11:00 sshguard 55524 Exiting on signal.
                  Nov 20 21:11:00 sshguard 38547 Now monitoring attacks.
                  Nov 20 21:16:00 sshguard 38547 Exiting on signal.
                  Nov 20 21:16:00 sshguard 43043 Now monitoring attacks.
                  Nov 21 00:20:00 kernel pid 77019 (php), jid 0, uid 0: exited on signal 6 (core dumped)
                  Nov 21 01:10:00 sshguard 43043 Exiting on signal.
                  Nov 21 01:10:00 sshguard 18059 Now monitoring attacks.
                  Nov 21 05:33:00 sshguard 18059 Exiting on signal.
                  Nov 21 05:33:00 sshguard 6020 Now monitoring attacks.
                  Nov 21 10:09:00 sshguard 6020 Exiting on signal.
                  Nov 21 10:09:00 sshguard 6274 Now monitoring attacks.
                  Nov 21 10:47:00 sshguard 6274 Exiting on signal.
                  Nov 21 10:47:00 sshguard 32946 Now monitoring attacks.
                  Nov 21 11:36:52 rc.gateway_alarm 70343 >>> Gateway alarm: WAN_DHCP (Addr:8.8.8.8 Alarm:1 RTT:24.697ms RTTsd:.955ms Loss:21%)
                  Nov 21 11:36:52 check_reload_status 443 updating dyndns WAN_DHCP
                  Nov 21 11:36:52 check_reload_status 443 Restarting IPsec tunnels
                  Nov 21 11:36:52 check_reload_status 443 Restarting OpenVPN tunnels/interfaces
                  Nov 21 11:36:52 check_reload_status 443 Reloading filter
                  Nov 21 11:36:53 php-fpm 60708 /rc.openvpn: Gateway, none 'available' for inet, use the first one configured. 'WAN_DHCP'
                  Nov 21 11:36:53 php-fpm 60708 /rc.openvpn: Gateway, NONE AVAILABLE
                  Nov 21 11:39:47 php-fpm 60708 /status_dhcp_leases.php: Session timed out for user 'admin' from: 192.168.1.37 (Local Database)
                  Nov 21 11:39:49 php-fpm 60708 /status_dhcp_leases.php: Successful login for user 'admin' from: 192.168.1.37 (Local Database)
                  Nov 21 11:41:16 php-fpm 83302 /diag_reboot.php: Stopping all packages.


                  Note that I stopped Snort because of some anomalies with a US Gov't web site. Snort was not causing them; I just haven't turned it back on.

                  • stephenw10 Netgate Administrator

                    Are you still running the Realtek NICs?

                    • Wylbur @stephenw10

                      @stephenw10

                      Negative. I am running an INTEL dual port NIC. Both LAN and WAN go through that card. The MOBO has a port, but it is Realtek based so I decided to not use it.

                      • stephenw10 Netgate Administrator

                        Ah OK. So the WAN shows a gateway alarm there then you logged in and rebooted. I assume after reboot the WAN gateway shows as up? And if you did not reboot it stays down?

                        • Wylbur @stephenw10

                          @stephenw10

                          That is correct.

                          So I know something is not right when a streaming device stops. Or at your desk you tell your email client to fetch mail and it says it can't connect to .... Or a browser says the server is no longer responding....

                          Then I open the pfSense tab, check the status, and look at the log. When I see that message, I know the only way out (at this time) is reboot, so I also select reroot. Maybe that is overkill, but things come back up quickly after that and nothing is hung.

                          • stephenw10 Netgate Administrator @Wylbur

                            @Wylbur said in Running pfSense 2.7.0-release (amd64) and it randomly fails, losing connection to ISP:

                            When I see that message, I know the only way out (at this time) is reboot

                            Which message specifically? The dpinger packet loss alarm?

                            • Wylbur @stephenw10

                              @stephenw10

                              This one or one like it:
                              Gateway alarm: WAN_DHCP (Addr:8.8.8.8 Alarm:1 RTT:24.697ms RTTsd:.955ms Loss:21%) <<< Loss will be at 21% or higher

                              • stephenw10 Netgate Administrator

                                OK, I think we're going to need to dig into exactly what is failing here. Since there's nothing else logged when this happens, it doesn't appear to be a NIC link issue, a routing change, etc.

                                I think I would try running a packet capture on the WAN when it's in that state. See what's actually leaving there and if anything is coming back.

                                • Wylbur @stephenw10

                                  @stephenw10

                                  I've been looking at tracing and packet captures, but I'm not seeing what I expected, possibly because of a difference in terminology. With NDM or Connect:Direct (a managed file transfer product), I would turn on tracing for a specific thing, such as the handshake or the TCP|UDP packets for a specific address. In this case it is the WAN port that I need to trace. Is this Dataplane packet tracing? Also note that I have blocked IPv6 in and out for our environment, in case that is a possible problem. And if I understand correctly, this is all CLI and can't be set up from the GUI, right?

                                  • stephenw10 Netgate Administrator

                                    Here you are just capturing all packets on the WAN to see what, if anything, is there. I expect to see either the gateway monitoring pings or ARP requests at least. Probably not much else.
                                    But if you see the upstream gateway sending ARP requests for example that gives us a clue.
                                    You can run the packet capture from the webgui:
                                    https://docs.netgate.com/pfsense/en/latest/diagnostics/packetcapture/webgui.html
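Once a capture has been downloaded, a quick count of ARP and ICMP frames shows whether the monitoring pings are leaving and whether anything comes back. The following is a minimal sketch, assuming the download is a classic libpcap file with Ethernet framing and no VLAN tags; the function names and the `wan.cap` file name are hypothetical. Wireshark or tcpdump will give the same answer with far more detail.

```python
import struct
from collections import Counter

ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_ARP = 0x0806


def classify_frame(frame):
    """Label one Ethernet frame as ARP, ICMP echo, other IPv4, or other."""
    if len(frame) < 14:
        return "truncated"
    ethertype = struct.unpack("!H", frame[12:14])[0]
    if ethertype == ETHERTYPE_ARP:
        if len(frame) < 22:
            return "truncated"
        opcode = struct.unpack("!H", frame[20:22])[0]
        return {1: "arp_request", 2: "arp_reply"}.get(opcode, "arp_other")
    if ethertype == ETHERTYPE_IPV4:
        if len(frame) < 34:
            return "truncated"
        header_len = (frame[14] & 0x0F) * 4  # IPv4 IHL field, in bytes
        if frame[23] != 1:                   # protocol 1 is ICMP
            return "ipv4_other"
        icmp_offset = 14 + header_len
        if len(frame) <= icmp_offset:
            return "truncated"
        icmp_type = frame[icmp_offset]
        return {8: "icmp_echo_request", 0: "icmp_echo_reply"}.get(
            icmp_type, "icmp_other")
    return "other"


def summarize_pcap(data):
    """Count frame types in the bytes of a classic (non-pcapng) capture."""
    magic = data[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"
    else:
        raise ValueError("not a classic microsecond-resolution pcap file")
    if struct.unpack(endian + "I", data[20:24])[0] != 1:
        raise ValueError("only Ethernet (link type 1) captures are handled")
    counts = Counter()
    offset = 24  # size of the global header
    while offset + 16 <= len(data):
        _sec, _usec, incl_len, _orig_len = struct.unpack(
            endian + "IIII", data[offset:offset + 16])
        frame = data[offset + 16:offset + 16 + incl_len]
        counts[classify_frame(frame)] += 1
        offset += 16 + incl_len
    return counts


# Usage, with a hypothetical file name:
#   with open("wan.cap", "rb") as f:
#       print(summarize_pcap(f.read()))
```

Echo requests with no echo replies would suggest the pings go out and nothing returns, while a stream of ARP requests from the upstream gateway would be the clue mentioned above.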

                                    Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.