pfSense loses connection every 28-30 days.
-
Thanks Stephen.
In my experience, NICs rarely fail. Apart from a hardware problem, do you have any idea what is causing this issue every 28-30 days?
-
@johnpoz Thanks for the input.
I originally blamed BIND as well, but with the complete loss of the IP lease yesterday, I assumed other factors are at work.
"How is that clients behind not resolve but pfsense can? Are you pointing pfsense to other than itself? Maybe your bind just went offline if that is what is providing your users dns?"
I am assuming this has something to do with BIND. But more importantly, it appears to be related to the DHCP lease in some way. I only disabled dpinger 30 days ago. I assume dpinger worked its magic to restore the IP address, but routing tables were hosed in the process.
Yesterday dpinger had been disabled already, so I assume this is why the IP collapsed completely.
-
@jacksnack2 said in pfSense loses connection every 28-30 days.:
Looks like dhclient is having issues.
You gave a list with mostly DHCP server logs, from the process that hands out IPs on your LAN.
These are the details from the DHCP client:
Apr 30 17:18:31 hostanme dhclient[8918]: send_packet: No route to host
Apr 30 17:18:33 hostanme dhclient[8918]: send_packet: No route to host
Apr 30 17:18:35 hostanme dhclient[8918]: send_packet: No route to host
Apr 30 17:18:39 hostanme dhclient[8918]: send_packet: No route to host
Apr 30 17:18:50 hostanme dhclient[8918]: send_packet: No route to host
Apr 30 17:19:28 hostanme dhclient[8918]: send_packet: No route to host
Conclusion: the WAN interface is down (doesn't exist).
Edit: by the way, why does your LAN client called "PHN-LPC06TU0C" have to repeat its DHCP request (DISCOVER) several times? pfSense received it and replied with an OFFER, yet the LAN client waits 5 seconds and sends out another DISCOVER.
After several tries the LAN client finally accepts (receives ??) the OFFER and acknowledges with an ACK ....
A set of bad network cables? Network overload? VLAN issues? Bad wifi connection?
-
Look at the DHCP logs, filter on process dhclient, and post what's there. The interval from before it fails to after it recovers would be most telling.
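If the log has been exported to a text file, the same filtering can be done offline. Below is a minimal Python sketch, not a pfSense tool: the regex assumes the classic syslog layout seen in the dhclient lines quoted earlier in this thread, `filter_process` is a made-up helper name, and the `otherproc` line in the sample is a hypothetical placeholder added only so there is something to filter out.

```python
import re

# Classic syslog layout: "Mon DD HH:MM:SS host process[pid]: message"
SYSLOG_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s\d{2}:\d{2}:\d{2})\s"
    r"(?P<host>\S+)\s"
    r"(?P<proc>[^\s\[:]+)(?:\[(?P<pid>\d+)\])?:\s"
    r"(?P<msg>.*)$"
)


def filter_process(log_text, process="dhclient"):
    """Return (timestamp, message) pairs for lines logged by `process`."""
    entries = []
    for line in log_text.splitlines():
        match = SYSLOG_RE.match(line.strip())
        if match and match.group("proc") == process:
            entries.append((match.group("ts"), match.group("msg")))
    return entries


# dhclient lines are copied from this thread; the "otherproc" line is a
# hypothetical placeholder, not real pfSense output.
SAMPLE = """\
Apr 30 17:18:31 hostanme dhclient[8918]: send_packet: No route to host
Apr 30 17:18:32 hostanme otherproc[1]: hypothetical non-dhclient line
Apr 30 17:18:33 hostanme dhclient[8918]: send_packet: No route to host
Apr 30 17:19:28 hostanme dhclient[8918]: send_packet: No route to host
"""

if __name__ == "__main__":
    entries = filter_process(SAMPLE)
    print(f"{len(entries)} dhclient lines, {entries[0][0]} to {entries[-1][0]}")
    for ts, msg in entries:
        print(ts, msg)
```

Reading the real file in and passing its contents to `filter_process` would give the interval from the first to the last dhclient message around the failure.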
My hunch is the WAN port is asking for a renewal and not getting it, then reverting to a full DHCP request and not getting that either. Then the lease expires and it just stops working, which would probably be an ISP/modem problem. All pfSense can do is ask for a renewal. The server has to respond to it.
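For reference, that sequence follows the standard DHCP client timers from RFC 2131: by default the client starts unicast renewal at T1 (50% of the lease), falls back to broadcast rebinding at T2 (87.5% of the lease), and must stop using the address once the lease expires. A rough Python sketch of that timeline follows. The 4-hour lease is purely an assumed value for illustration (the actual lease time is not stated here), and `lease_state` is a made-up name.

```python
def lease_state(elapsed_s, lease_s, t1_frac=0.5, t2_frac=0.875):
    """Return the DHCP client state after `elapsed_s` seconds with no reply.

    Uses the RFC 2131 default timers: T1 = 0.5 * lease, T2 = 0.875 * lease.
    """
    if elapsed_s < 0 or lease_s <= 0:
        raise ValueError("elapsed_s must be >= 0 and lease_s must be > 0")
    t1 = lease_s * t1_frac
    t2 = lease_s * t2_frac
    if elapsed_s < t1:
        return "BOUND"      # lease is fine, nothing to do yet
    if elapsed_s < t2:
        return "RENEWING"   # unicast DHCPREQUEST to the leasing server
    if elapsed_s < lease_s:
        return "REBINDING"  # broadcast DHCPREQUEST to any server
    return "EXPIRED"        # address must be dropped, back to DHCPDISCOVER


if __name__ == "__main__":
    ASSUMED_LEASE_S = 4 * 3600  # assumption for illustration only
    for hours in (0, 1, 2, 3, 3.5, 4):
        print(f"{hours:>4} h: {lease_state(hours * 3600, ASSUMED_LEASE_S)}")
```

If the server never answers, the client walks through all four states within a single lease period, which is why the outage only shows up some hours after the server actually went quiet.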
Something else you might want to do is just start a Diagnostics > Packet Capture on WAN for UDP 67, set it for something like 1000000 packets and just let it run. Stop it after it fails. Even better would be to try something like disconnect/reconnect ethernet or restart the modem to see if you can get a capture including a recovery too.
-
@Derelict I have attached filtered DHCP logs as you suggested: dhcp.log.filtered.txt
Thanks all for your help.
-
Please use wireshark to filter what you want to show and upload the actual pcap.
-
Thanks again all,
I have enabled Wireshark and will report back when more information is available.
I should also note that the previous router, a Netgear, did not seem to have these issues. I replaced it with pfSense because the pfSense box sports faster interfaces and more functionality.
-
@jacksnack2 said in pfSense loses connection every 28-30 days.:
I have enabled Wireshark and will report back when more information is available.
What dude - just download the pcap from pfSense into Wireshark, filter out what you don't want to show with Wireshark, save the pcap, and upload it.
-
Sorry, I don't know what "What dude" means.
I understand how to use Wireshark.
Derelict advised to "Stop it after it fails". It may take weeks before another failure event takes place.
Until then Wireshark will show normal traffic. I doubt this is of any use.
-
That's why I said just let it run for 1000000 packets on WAN filtered on UDP port 67.
If you want to tie up a laptop or something running wireshark that's cool too.
-
Hello,
It was about 35 days this time.
pcap uploaded: filtered.pcap
The pfsense machine was not rebooted, only the modem.
After a reboot, a new IP address was assigned. Connections are now normal.
Thanks Again.
-
Another case of the ISP device simply stopping responses to DHCP Requests and DHCP Discovers.
It looks to me like there is another MAC address out there making DHCP requests.
1051 2019-06-04 16:20:05.338922 0.0.0.0 68 255.255.255.255 67 DHCP 64 538 9.496981000 0x0000 (0) DHCP Discover - Transaction ID 0x4b38b221
Ethernet II, Src: ac:ec:80:79:2e:77, Dst: ff:ff:ff:ff:ff:ff
I'm assuming your WAN port MAC address is:
Ethernet II, Src: 38:60:77:04:e8:2c, Dst: ac:ec:80:79:2e:75
I do not know why what looks like the ISP modem would be making DHCP requests on that network but it seems fishy to me.
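One quick way to sanity-check the hunch that the extra requester is the modem itself is to compare the MAC addresses from the capture. The Python sketch below is only an illustration: the helper names are made up, and the heuristic (same vendor prefix plus nearly consecutive addresses suggests the same physical device) is a common pattern, not a guarantee.

```python
def mac_to_int(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' (or dash-separated) to an integer."""
    return int(mac.replace(":", "").replace("-", ""), 16)


def same_oui(mac_a, mac_b):
    """True if both MACs share the first three octets (the vendor prefix)."""
    return mac_to_int(mac_a) >> 24 == mac_to_int(mac_b) >> 24


def mac_distance(mac_a, mac_b):
    """Absolute numeric difference between two MAC addresses."""
    return abs(mac_to_int(mac_a) - mac_to_int(mac_b))


# Addresses taken from the frames quoted above.
UNKNOWN_REQUESTER = "ac:ec:80:79:2e:77"  # source of the broadcast Discover
MODEM_SIDE = "ac:ec:80:79:2e:75"         # destination of the pfSense frames
PFSENSE_WAN = "38:60:77:04:e8:2c"        # assumed pfSense WAN MAC

if __name__ == "__main__":
    print("same vendor prefix:", same_oui(UNKNOWN_REQUESTER, MODEM_SIDE))
    print("address distance:", mac_distance(UNKNOWN_REQUESTER, MODEM_SIDE))
    print("matches pfSense prefix:", same_oui(UNKNOWN_REQUESTER, PFSENSE_WAN))
```

The requester and the modem-side address share a vendor prefix and differ only in the last octet, which is consistent with both belonging to the modem.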
If pfSense renews the lease every 2 hours for a month and then just stops getting a response when issuing identical requests, it is a problem with the modem, not pfSense. Look for yourself. What is the difference between the requests in frames 1, 3, 5, and 7 that did receive a response and the requests in 9, 10, 11, 12, etc. that did not?
I would certainly put 192.168.100.1 in the 'Reject leases from' field on the WAN configuration if it is not already there.
I don't see that pfSense is doing anything wrong here. The modem simply stops responding, apparently.
-
Ok, so you can see in the pcap that it renews the IP a few times but then the DHCP server stops responding. It keeps trying and eventually starts broadcasting for any DHCP servers. Then it loses its own DHCP lease but keeps sending requests.
Then after some time the modem starts up its DHCP server and gives the pfSense WAN a private IP.
You probably want to prevent that happening by adding 192.168.100.1 to the 'Reject leases from' field on the WAN DHCP setup.
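To make the idea concrete, here is a small Python model of that decision. It is only a sketch of the logic, not how pfSense or dhclient implement it: the function names are invented, and apart from 192.168.100.1 the addresses are documentation or hypothetical values, since the lease the modem actually handed out is not given here.

```python
import ipaddress

# RFC 1918 private ranges; a WAN lease from one of these usually means the
# modem's own DHCP server answered instead of the ISP's.
RFC1918_NETS = [
    ipaddress.ip_network(n)
    for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def is_rfc1918(addr):
    """True if `addr` falls inside one of the RFC 1918 private ranges."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in RFC1918_NETS)


def accept_offer(server_id, reject_from):
    """Model of a reject list: ignore offers from the listed DHCP servers."""
    rejected = {ipaddress.ip_address(s) for s in reject_from}
    return ipaddress.ip_address(server_id) not in rejected


if __name__ == "__main__":
    reject_from = ["192.168.100.1"]  # the modem address named in this thread

    # (server, offered address); only 192.168.100.1 comes from the thread,
    # the other values are hypothetical examples.
    offers = [
        ("192.168.100.1", "192.168.100.10"),
        ("198.51.100.1", "203.0.113.5"),
    ]
    for server, offered in offers:
        print(
            f"offer of {offered} from {server}: "
            f"accept={accept_offer(server, reject_from)}, "
            f"private={is_rfc1918(offered)}"
        )
```

With the modem in the reject list, the WAN keeps asking until the upstream server answers instead of settling for the modem's private lease.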
However that doesn't explain why the remote dhcp server stopped responding.
Steve
-
@jacksnack2 said in pfSense loses connection every 28-30 days.:
Arris TM822G
The Arris TM822G, while not listed below, is a Puma 5 modem and has issues. Quite old, actually, if I remember right. It's a telephone modem, so it would get an address of its own, bridge mode or not.
Badmodems.com
POS.
Replace the old sparkplugs before any further diagnosis is the norm in my book.
-
Thank you all for the quick feedback.
I am a Linux Admin by trade, although networking does occasionally fall under my purview :>
I blocked the modem IP as suggested. Also, I have re-enabled dpinger, as this allows the router to re-obtain a lease. However, the issue is that while obtaining a lease via dpinger, DNS resolution fails for internal clients. A router reboot is required.
I don't see a solution here. But again, I do appreciate the help.
-
@jacksnack2 said in pfSense loses connection every 28-30 days.:
while obtaining a lease via dpinger,
Huh? Dpinger doesn't have anything to do with renewing a dhcp lease??
-
@johnpoz dpinger does not directly deal with leases, but it does fire actions:
/usr/local/sbin/pfSctl \
    -c "service reload dyndns ${GW}" \
    -c "service reload ipsecdns" \
    -c "service reload openvpn ${GW}" \
    -c "filter reload" >/dev/null 2>&1
I can state that when dpinger was enabled, the router held an IP address, even though DNS did not work internally.
This allowed me to ssh into the machine and reboot.
Once I disabled dpinger, no IP address existed for the WAN.
Are you saying this is a coincidence?
-
None of that would have anything to do with dhcp lease renew..
-
Indeed the dhclient is independent of dpinger. Something it triggered may have restarted the dhclient perhaps but if it was able to pull a lease it would have done so anyway.
You said that rebooting the modem also allowed it to come back up. I would try simply pulling the WAN cable from either the modem or pfSense and reconnecting it. Does that also bring back the connection?
Are you running 2.4.4p3 now? It's possible you're hitting this: https://redmine.pfsense.org/issues/9267
That is fixed in current 2.5 snapshots if you're able to test one.
Steve
-