Running pfSense 2.7.0-release (amd64) and it randomly fails, losing connection to ISP
-
Any Intel 1G NIC would be far better.
Nothing really logged there beyond the packet loss alarm. No watchdog timeouts logged.
-
I have had this problem happen again -- loss of connection to the ISP (within the last 15 minutes) -- and I have an Intel-chip dual-port 1Gb Ethernet card. The following is what the syslog shows (I did a reroot reboot):
Nov 20 19:46:04 php-fpm 60708 [Snort] Snort STOP for WAN(igb1)...
Nov 20 19:46:05 snort 68178 *** Caught Term-Signal
Nov 20 19:46:05 kernel igb1: promiscuous mode disabled
Nov 20 20:03:00 sshguard 85448 Exiting on signal.
Nov 20 20:03:00 sshguard 55524 Now monitoring attacks.
Nov 20 21:11:00 sshguard 55524 Exiting on signal.
Nov 20 21:11:00 sshguard 38547 Now monitoring attacks.
Nov 20 21:16:00 sshguard 38547 Exiting on signal.
Nov 20 21:16:00 sshguard 43043 Now monitoring attacks.
Nov 21 00:20:00 kernel pid 77019 (php), jid 0, uid 0: exited on signal 6 (core dumped)
Nov 21 01:10:00 sshguard 43043 Exiting on signal.
Nov 21 01:10:00 sshguard 18059 Now monitoring attacks.
Nov 21 05:33:00 sshguard 18059 Exiting on signal.
Nov 21 05:33:00 sshguard 6020 Now monitoring attacks.
Nov 21 10:09:00 sshguard 6020 Exiting on signal.
Nov 21 10:09:00 sshguard 6274 Now monitoring attacks.
Nov 21 10:47:00 sshguard 6274 Exiting on signal.
Nov 21 10:47:00 sshguard 32946 Now monitoring attacks.
Nov 21 11:36:52 rc.gateway_alarm 70343 >>> Gateway alarm: WAN_DHCP (Addr:8.8.8.8 Alarm:1 RTT:24.697ms RTTsd:.955ms Loss:21%)
Nov 21 11:36:52 check_reload_status 443 updating dyndns WAN_DHCP
Nov 21 11:36:52 check_reload_status 443 Restarting IPsec tunnels
Nov 21 11:36:52 check_reload_status 443 Restarting OpenVPN tunnels/interfaces
Nov 21 11:36:52 check_reload_status 443 Reloading filter
Nov 21 11:36:53 php-fpm 60708 /rc.openvpn: Gateway, none 'available' for inet, use the first one configured. 'WAN_DHCP'
Nov 21 11:36:53 php-fpm 60708 /rc.openvpn: Gateway, NONE AVAILABLE
Nov 21 11:39:47 php-fpm 60708 /status_dhcp_leases.php: Session timed out for user 'admin' from: 192.168.1.37 (Local Database)
Nov 21 11:39:49 php-fpm 60708 /status_dhcp_leases.php: Successful login for user 'admin' from: 192.168.1.37 (Local Database)
Nov 21 11:41:16 php-fpm 83302 /diag_reboot.php: Stopping all packages.
Note that I stopped Snort because of some anomalies with a US Gov't web site. Snort was not causing the problem; I just haven't turned it back on.
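For anyone wanting to pull the gateway alarm lines out of a longer system log, here is a rough Python sketch. It assumes the alarm line keeps the same layout as the rc.gateway_alarm entry in the log above; the function name and the dictionary keys are made up for illustration and are not part of pfSense.

```python
import re

# Regex modelled on the single rc.gateway_alarm line shown in the log above.
# Assumption: other alarm lines follow the same layout.
ALARM_RE = re.compile(
    r"Gateway alarm: (?P<gateway>\S+) \(Addr:(?P<addr>\S+) Alarm:(?P<alarm>\d+) "
    r"RTT:(?P<rtt>[\d.]+)ms RTTsd:(?P<rttsd>[\d.]+)ms Loss:(?P<loss>\d+)%\)"
)


def parse_gateway_alarm(line):
    """Return the alarm fields as a dict, or None if the line is not an alarm."""
    m = ALARM_RE.search(line)
    if m is None:
        return None
    return {
        "gateway": m.group("gateway"),
        "addr": m.group("addr"),
        "alarm": int(m.group("alarm")),
        "rtt_ms": float(m.group("rtt")),
        "rttsd_ms": float(m.group("rttsd")),
        "loss_pct": int(m.group("loss")),
    }


# The alarm line copied from the log above.
LOG_LINE = (
    "Nov 21 11:36:52 rc.gateway_alarm 70343 >>> Gateway alarm: WAN_DHCP "
    "(Addr:8.8.8.8 Alarm:1 RTT:24.697ms RTTsd:.955ms Loss:21%)"
)

if __name__ == "__main__":
    print(parse_gateway_alarm(LOG_LINE))
```

Feeding it every line of an exported log and keeping the non-None results gives a quick list of when the loss alarms fired and how high the loss was each time.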
-
Are you still running the Realtek NICs?
-
Negative. I am running an INTEL dual port NIC. Both LAN and WAN go through that card. The MOBO has a port, but it is Realtek based so I decided to not use it.
-
Ah OK. So the WAN shows a gateway alarm there then you logged in and rebooted. I assume after reboot the WAN gateway shows as up? And if you did not reboot it stays down?
-
That is correct.
So I know something is not right when a streaming device stops. Or at your desk you tell your email client to fetch mail and it says it can't connect to .... Or a browser says the server is no longer responding....
Then I go and pop up the pfSense tab, check the status, and look at the log. When I see that message, I know the only way out (at this time) is a reboot, so I also select reroot. Maybe that is overkill, but things come back up quickly after that and nothing is hung.
-
@Wylbur said in Running pfSense 2.7.0-release (amd64) and it randomly fails, losing connection to ISP:
When I see that message, I know the only way out (at this time) is reboot
Which message specifically? The dpinger packet loss alarm?
-
This one or one like it:
Gateway alarm: WAN_DHCP (Addr:8.8.8.8 Alarm:1 RTT:24.697ms RTTsd:.955ms Loss:21 <<< Loss will be at 21 or higher
-
OK, I think we're going to need to dig into exactly what is failing here. Since there's nothing else logged when this happens, it doesn't appear to be a NIC link issue, a routing change, etc.
I think I would try running a packet capture on the WAN when it's in that state. See what's actually leaving there and if anything is coming back.
-
I've been looking at tracing and packet captures, but I'm not seeing what I would have expected. That may be because of a difference in terminology. For NDM or Connect:Direct (a Managed File XFER product) I would turn on tracing for a specific thing, such as the handshake or the TCP|UDP packets for a specific address. In this case it is the WAN port that I need to trace. Is this Dataplane packet tracing? Also note, I have blocked IPv6 in/out for our environment, in case that could be a problem. And if I understand correctly this is all CLI, so it can't be set up from the GUI, right?
-
Here you are just capturing all packets on the WAN to see what, if anything, is there. I expect to see either the gateway monitoring pings or ARP requests at least. Probably not much else.
But if you see the upstream gateway sending ARP requests for example that gives us a clue.
You can run the packet capture from the webgui:
https://docs.netgate.com/pfsense/en/latest/diagnostics/packetcapture/webgui.html
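
If the capture turns out to be long, a rough Python sketch like the one below can tally what is in it. It assumes the capture is viewed as tcpdump-style text lines; the sample lines are invented for illustration (documentation addresses, not from a real capture), and the function names are hypothetical.

```python
from collections import Counter


def classify(line):
    """Bucket one tcpdump-style text line into a coarse packet type."""
    if "ARP, " in line:
        return "arp_request" if "Request who-has" in line else "arp_reply"
    if "ICMP echo request" in line:
        return "icmp_echo_request"
    if "ICMP echo reply" in line:
        return "icmp_echo_reply"
    return "other"


def summarize(lines):
    """Count packet types over all non-empty capture lines."""
    return Counter(classify(line) for line in lines if line.strip())


# Invented sample lines, only to show the expected shape of the input.
SAMPLE = [
    "11:36:50.000001 IP 198.51.100.2 > 8.8.8.8: ICMP echo request, id 1, seq 10, length 9",
    "11:36:50.500001 IP 198.51.100.2 > 8.8.8.8: ICMP echo request, id 1, seq 11, length 9",
    "11:36:51.000001 ARP, Request who-has 198.51.100.2 tell 198.51.100.1, length 46",
]

if __name__ == "__main__":
    for kind, count in summarize(SAMPLE).items():
        print(kind, count)
```

Echo requests going out with no echo replies coming back would suggest the monitoring pings are leaving but nothing is answering, and ARP requests from the upstream gateway would be the kind of clue mentioned above.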