metronet fiber, internet goes down roughly every 24 hours
-
i turned off gateway monitoring entirely yesterday and it still went down early this morning
i go to interface tab. release, renew and the connection instantly comes back online
here are my findings the Nokia ONT is changing the local address of the device every 24 hours. when it changes Pfsense says the wan goes down, and everything stops working. i have no way to communicate to the ONT by webqui or ftp etc. since they willl not release that to me.
the last setting i just changed under wan is : Block private networks and loopback addresses UNCHECKED
-
Ok, what does the system log show when that happens? Or the DHCP log?
-
the 10.92 device is the ONT modem. which as you can see the address is changing
Feb 23 14:26:39 kernel sonewconn: pcb 0xc6ba38f0: Listen queue overflow: 2 already in queue awaiting acceptance (1 occurrences) Feb 23 14:29:32 kernel arpresolve: can't allocate llinfo for 100.92.192.1 on mvneta2 Feb 23 14:29:32 kernel arpresolve: can't allocate llinfo for 100.92.192.1 on mvneta2 Feb 23 14:29:32 php-fpm 363 /status_interfaces.php: Shutting down Router Advertisment daemon cleanly Feb 23 14:29:33 kernel arpresolve: can't allocate llinfo for 100.92.192.1 on mvneta2 Feb 23 14:29:33 kernel arpresolve: can't allocate llinfo for 100.92.192.1 on mvneta2 Feb 23 14:29:39 check_reload_status rc.newwanip starting mvneta2 Feb 23 14:29:39 php-fpm 24613 /status_interfaces.php: calling interface_dhcpv6_configure. Feb 23 14:29:39 php-fpm 24613 /status_interfaces.php: Accept router advertisements on interface mvneta2 Feb 23 14:29:39 php-fpm 24613 /status_interfaces.php: Starting rtsold process Feb 23 14:29:40 php-fpm 9259 /rc.newwanip: rc.newwanip: Info: starting on mvneta2. Feb 23 14:29:40 php-fpm 9259 /rc.newwanip: rc.newwanip: on (IP address: 100.92.204.194) (interface: WAN[wan]) (real interface: mvneta2). Feb 23 09:29:41 rtsold 51876 <sendpacket> sendmsg on mvneta2: Permission denied Feb 23 14:29:42 kernel ovpnc1: link state changed to DOWN Feb 23 14:29:42 check_reload_status Reloading filter Feb 23 14:29:43 kernel ovpnc1: link state changed to UP Feb 23 14:29:43 check_reload_status rc.newwanip starting ovpnc1 Feb 23 14:29:44 php-fpm 362 /rc.newwanip: rc.newwanip: Info: starting on ovpnc1. Feb 23 14:29:44 php-fpm 362 /rc.newwanip: rc.newwanip: on (IP address: 10.32.118.189) (interface: AIRVPN[opt3]) (real interface: ovpnc1). Feb 23 14:29:44 php-fpm 362 /rc.newwanip: IP Address has changed, killing states on former IP Address 10.27.202.21. Feb 23 09:29:45 rtsold 51876 <sendpacket> sendmsg on mvneta2: Permission denied Feb 23 14:29:45 php-fpm 9259 /rc.newwanip: Resyncing OpenVPN instances for interface WAN. Feb 23 14:29:46 php-fpm 9259 OpenVPN terminate old pid: 84804 Feb 23 14:29:46 kernel ovpnc1: link state changed to DOWN Feb 23 14:29:46 check_reload_status Reloading filter Feb 23 14:29:46 php-fpm 9259 OpenVPN PID written: 49485 Feb 23 14:29:46 check_reload_status Reloading filter Feb 23 14:29:46 php-fpm 9259 /rc.newwanip: Creating rrd update script Feb 23 14:29:48 php-fpm 9259 /rc.newwanip: pfSense package system has detected an IP change or dynamic WAN reconnection - 100.92.204.194 -> 100.92.204.194 - Restarting packages. Feb 23 14:29:48 check_reload_status Starting packages Feb 23 09:29:49 rc.gateway_alarm 26316 >>> Gateway alarm: AIRVPN_VPNV4 (Addr:10.32.118.189 Alarm:1 RTT:.898ms RTTsd:1.084ms Loss:25%) Feb 23 14:29:49 check_reload_status updating dyndns AIRVPN_VPNV4 Feb 23 14:29:49 check_reload_status Restarting ipsec tunnels Feb 23 14:29:49 check_reload_status Restarting OpenVPN tunnels/interfaces Feb 23 14:29:49 check_reload_status Reloading filter Feb 23 09:29:49 rtsold 51876 <sendpacket> sendmsg on mvneta2: Permission denied
-
-
I don't see a 10.100.x.x IP address those there. you mean the 100.92.x.x address? That looks like CGN.
It looks like the WAN address is changing, which is not unusual. The addresses involved there seem to be the WAN and it's gateway, I would not expect to see an IP for the local modem device at all.
The DHCP logs are hard to read in that format. We are only interested in the dhclient entries. They might be in UTC vs whatever timezone your other logs are in. Really we need to see the dhclient entries in the dhcp log and the system log entries covering that same time span to see what triggered the renewal, or lack of it.
If you're hitting that bug I expect to see the link fail or the lease force renewed for some reason in the system log.
Then the dhclient start in the dhcp log but fail to get an IP and timeout.
Then some error in the system log.
Then nothing until you manually restart dhclient.I'm not seeing that though, from what we have so far it's more like the remote dhcp server is handing you a ludicrously long lease and then expecting it to renew sooner. This entry:
dhclient 93554 bound to 100.92.204.194 -- renewal in 43200 seconds.
Shows a 12h lease though. pfSense would normally attempt to renew it after half the lease time. If you filter the dhcp log for only dhclient process entries I expect to see it renewing every 6h.
Steve
-
fixed the IP. this is carrier grade nat...
is there a way i can get you the logs you need by the var/log area easier?
Feb 23 09:35:29 pfSense dhclient[29691]: connection closed
Feb 23 09:35:29 pfSense dhclient[29691]: exiting.
Feb 23 09:35:31 pfSense dhclient: PREINIT
Feb 23 09:35:32 pfSense dhclient[54752]: DHCPREQUEST on mvneta2 to 255.255.255.255 port 67
Feb 23 09:35:32 pfSense dhclient[54752]: DHCPACK from 100.92.192.3
Feb 23 09:35:32 pfSense dhclient: REBOOT
Feb 23 09:35:32 pfSense dhclient: Starting add_new_address()
Feb 23 09:35:32 pfSense dhclient: ifconfig mvneta2 inet 100.92.204.194 netmask 255.255.224.0 broadcast 100.92.223.255
Feb 23 09:35:32 pfSense dhclient: New IP Address (mvneta2): 100.92.204.194
Feb 23 09:35:32 pfSense dhclient: New Subnet Mask (mvneta2): 255.255.224.0
Feb 23 09:35:32 pfSense dhclient: New Broadcast Address (mvneta2): 100.92.223.255
Feb 23 09:35:32 pfSense dhclient: New Routers (mvneta2): 100.92.192.1
Feb 23 09:35:32 pfSense dhclient: Adding new routes to interface: mvneta2
Feb 23 09:35:32 pfSense dhclient: Creating resolv.conf
Feb 23 09:35:32 pfSense dhclient[54752]: bound to 100.92.204.194 -- renewal in 43200another:
Feb 23 14:29:03 pfSense dhclient[8995]: send_packet: No route to host
Feb 23 14:29:30 pfSense dhclient[8995]: connection closed
Feb 23 14:29:30 pfSense dhclient[8995]: exiting.
Feb 23 09:29:39 pfSense dhclient: PREINIT
Feb 23 09:29:39 pfSense dhclient[29287]: DHCPREQUEST on mvneta2 to 255.255.255.255 port 67
Feb 23 09:29:39 pfSense dhclient[29287]: DHCPACK from 100.92.192.3
Feb 23 09:29:39 pfSense dhclient: REBOOT
Feb 23 09:29:39 pfSense dhclient: Starting add_new_address()
Feb 23 09:29:39 pfSense dhclient: ifconfig mvneta2 inet 100.92.204.194 netmask 255.255.224.0 broadcast 100.92.223.255
Feb 23 09:29:39 pfSense dhclient: New IP Address (mvneta2): 100.92.204.194
Feb 23 09:29:39 pfSense dhclient: New Subnet Mask (mvneta2): 255.255.224.0
Feb 23 09:29:39 pfSense dhclient: New Broadcast Address (mvneta2): 100.92.223.255
Feb 23 09:29:39 pfSense dhclient: New Routers (mvneta2): 100.92.192.1
Feb 23 09:29:39 pfSense dhclient: Adding new routes to interface: mvneta2
Feb 23 09:29:39 pfSense dhclient: /sbin/route add default 100.92.192.1
Feb 23 09:29:39 pfSense dhclient: Creating resolv.conf
Feb 23 09:29:39 pfSense dhclient[29287]: bound to 100.92.204.194 -- renewal in 1800 seconds. -
Is that two different devices? On two separate connections? They have the same IP address but only 6mins apart....
One is given a 12h lease but the other only 30mins.
I think you may have some conflict there....
-
same equipment. the new sg 3100.
is there something i can change or look at to remedy this?
-
So that is two excerpts from the same dhcp log on the same device?
The time stamps are confusing, what order should those be read in?
Both those show successful renewal though which is not what is expected from the dhcp bug we referenced.
Steve
-
same 3100. same device
i pulled them directly from var\logs i copied them to a word file. ctrl + f what you wanted to see and pasted it here
read top to bottom
i am really hoping a static IP address from the provider will resolve this
called my isp to see if they could enable the static today. not possible i have to wait until tomorrow at 9am EST :(
-
In the dhcp logs view you can filter by the dhclient process and then just copy/paste them here directly without going through Word (or any other editor).
Some of those logs show the gateway not responding to ARP which probably won't be solved by using a static IP. If you can get one though if will obviously solve any dhcp issues.Steve
-
static ip has been set and active for over 24 hours now. NO issues whatsoever. first time ever with this new internet service.
i will look forward to when Internet Systems Consortium DHCP Server 4.3.6-P1 is updated to 4.4.2 within Pfsense per https://www.isc.org/dhcp/
so all of this is fixed for users.
so to continue using Pfsense i will be paying 10 dollars extra a month until the release..
-
It's dhclient not the dhcp server. It will be in 2.5 when that is released, it isn't yet in 2.5 snaps as they are currently built on 12.0.
Steve
-
As I said (in the wrong thread)....
stephenw10 Netgate Administrator about 16 hours ago
Ok, good news. The binary part of the fix for this is now in 2.4.5 snapshots:
https://github.com/pfsense/FreeBSD-src/commits/RELENG_2_4_5/sbin/dhclient/dhclient.cThe full fix also requires changes to the dhclient script which can be applied via the system patches package. I have briefly tested that and it didn't seem to break anything.
That patch is here: https://redmine.pfsense.org/attachments/download/2682/pfsense-dhclient-script-patch.txt
If you're able to test it we may be able to include it in 2.4.5.
Steve
-
no problem Sir.
at this time i have no way to test as i am locked in a one year agreement with a static WAN ip address. my issue resolved.
not sure i provided you any good information. feel free to lock this thread and work with the other gentlemen if that seems best
-
I'm going to test locally but I can only try to simulate a failed dhcp server. It is definitely a bug that would be very good to squash. I'd love to hear from anyone who is hitting it 'in the field'.
Steve
-
@stephenw10 Hello, I feel like I'm running into this.
Quick overview on my situation:
- Currently running ver 22.05-RELEASE on a 2100. (recently updated)
- Ran over 2 years w/o ever having to reboot appliance on Xfinity internet.
- Recent switch to MetroNet Fiber - Internet goes down every 24-36 hours.
- Has happened 3 times so far, feels like it is reproducing reliably within these time frames.
- Manually releasing WAN / renew WAN action on WAN interface - Restores Internet
Trying to figure out a good path forward:
- Pay $10/month for a static IP (I'd rather not)
- Can we script something to automate the release/renew action during the night on a daily basis? (Feels like a hack)
- Is this a bug that can be fixed? (best solution IMO)
-
What do you see logged when it happens?
The actual bug referenced here was fixed in 2.4.5:
https://redmine.pfsense.org/issues/9267I suggest you may well be hitting this:
https://forum.netgate.com/post/1063443So specifically MetroNets broken DHCP relay behaviour. The workaround shown there should prevent it if so.
Steve
-
I haven't caught logs when the issue happens yet. I'm hoping I can catch on the next reproduce since I "sort of" know what to look for. Below is what I have on the most recent restore of services.
Any logging in particular you would need?
I feel like you're probably right about it being an issue with MetroNet. I called them today and they just rebooted everything and suggested I move to another piece of equipment that they can support.
Appreciate you linking me to that HOWTO post. I'll probably end up implementing that after I reproduce one more time.
This was on the latest restore:
(Post was flagged for SPAM, so created a pastebin)
https://pastebin.com/04r3wLek -
Yeah I bet it is that issue. The only way to be sure there is to packet capture the DHCP requests
and check that unicast packets are being ignored but broadcasts see responses.