metronet fiber, internet goes down roughly every 24 hours
-
it is currently my backup router in case my main router goes down. if it will not work on 2.4.5 i will need to find another backup solution
metronet does not seem to like my netgate minnowboard mbt 4220 so i am having to run the sg2220 until i prove metronet is the issue
-
Yes.
Though there is no 2.4.5 release yet only RC snapshots.Steve
-
thank you
i am about to boot up the minnowboard. do you have any idea what i can be looking for in the logs for a failing wan port?
example:
the first line is when the internet went down. 100.92.192.1 is the gateway address of the modem itself
Feb 16 05:55:27 dpinger WAN_DHCP 100.92.192.1: Alarm latency 2326us stddev 2231us loss 21% Feb 16 06:03:57 dpinger send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 0 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr 100.92.192.1 bind_addr 100.92.205.91 identifier "WAN_DHCP " Feb 16 06:03:57 dpinger send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 0 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr 100.92.192.1 bind_addr 100.92.205.91 identifier "WAN_DHCP " Feb 16 06:04:05 dpinger send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 0 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr 100.92.192.1 bind_addr 100.92.205.91 identifier "WAN_DHCP " Feb 16 06:04:05 dpinger send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 0 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr 100.92.192.1 bind_addr 100.92.205.91 identifier "WAN_DHCP " Feb 16 06:07:25 dpinger send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 0 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr 100.92.192.1 bind_addr 100.92.205.91 identifier "WAN_DHCP " Feb 16 06:07:25 dpinger send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 0 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr 100.92.192.1 bind_addr 100.92.205.91 identifier "WAN_DHCP " Feb 16 06:07:33 dpinger send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 0 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr 100.92.192.1 bind_addr 100.92.205.91 identifier "WAN_DHCP " Feb 16 06:07:36 dpinger send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 0 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr 100.92.192.1 bind_addr 100.92.205.91 identifier "WAN_DHCP "
-
All of those are dpinger starting apart from the first line which is an alarm when it hit >20% packet loss.
There's nothing in the system to show why it kept restarting? Flapping link would be my first guess.
Steve
-
Understood.
Would you agree that the device Hooked up to the pfsense router is what is going offline though
Which tab are you suggesting I review logs from
-
The main system log would show the link going up and down if it is in fact doing that.
If not it should show what it triggering dpinger to restart continually.Steve
-
The device was powered off so it does not show any logs under system - general before 11am the last time I got the previous log I posted
Checking car/ log -
Order 101916 has been placed
I am not going to use the minnowboard as my primary device anymore
Thank you for great support and products
-
Well I can't object. But it would be good to know why that was happening. You might try swapping the LAN and WAN assignment and see if the error follows the interface or stays on the port.
Steve
-
i tried loading ubuntu LTS. 2 version in fact, to test the nics. i don't have a hub to get both a keyboard and mouse working, only one would work at a time. a keyboard with a hub built in... still both would not work. could not find any good documentation
then i find the minnowboard project has been abandoned pretty much, so i'll scratch that 300 dollar purchase off as a lesson (nothing against Netgate i know it wasn't an official product)
time to move on to bigger and better you know
-
That 4220 might still be covered under warranty. I’m pretty sure it was released after the SG-2220.
Jeff
-
I bought it in august of 2018
The whole minnowboard project has been abandoned it appears
-
@bcruze I would still contact them, it's probably under warranty.
https://go.netgate.com/support/login
Jeff
-
@akuma1x said in is the Netgate sg-2220 capable of release 2.4.5:
@bcruze I would still contact them, it's probably under warranty.
https://go.netgate.com/support/login
Jeff
looks like i have my answer in this thread. it is a Pfsense software issue : https://forum.netgate.com/topic/150568/problems-reestablishing-the-connection/2
-
since my wan continues to drop(supposedly Metronet says its not). and Pfsense will not reconnect:
under interfaces > wan > dhcp configuration > presets.
correct me if i am wrong. but will that not force the device to reconnect per the presets and fix this issue?
-
If you are actually hitting https://redmine.pfsense.org/issues/9267 then the problem is that if dhclient hits it's timeout values it mishandles the error, crashes out and never retries. Others have worked around that by setting the timeout to something much higher like 900s. https://forum.netgate.com/post/854007
That works well for the situation where the modem reboots but takes longer than the timeout to do so.
Steve
-
its enabled now. it was not before. 24 hours will tell, thanks for the replies Stephen
more logs, that show the gateway goes down for whatever reason.
Feb 19 18:13:27 rc.gateway_alarm 2403 >>> Gateway alarm: WAN_DHCP (Addr:100.92.192.1 Alarm:1 RTT:1.135ms RTTsd:1.512ms Loss:21%) Feb 19 18:13:27 check_reload_status updating dyndns WAN_DHCP Feb 19 18:13:27 check_reload_status Restarting ipsec tunnels Feb 19 18:13:27 check_reload_status Restarting OpenVPN tunnels/interfaces Feb 19 18:13:27 check_reload_status Reloading filter Feb 19 18:13:28 php-fpm 66863 /rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed its IP. Reloading endpoints that may use WAN_DHCP. Feb 19 18:16:59 php-fpm 53040 /index.php: Successful login for user 'admin' from: 192.168.1.205 (Local Database) Feb 19 18:21:59 php-fpm 53040 /status_interfaces.php: The command '/usr/local/sbin/dhclient {$ipv} -d -r -lf '/var/db/dhclient.leases.igb0' -cf '/var/etc/dhclient_wan.conf' -sf '/usr/local/sbin/pfSense-dhclient-script'' returned exit code '1', the output was 'Internet Systems Consortium DHCP Client 4.3.6-P1 Copyright 2004-2018 Internet Systems Consortium. All rights reserved. For info, please visit https://www.isc.org/software/dhcp/ Listening on BPF/igb0/00:08:a2:09:e9:be Sending on BPF/igb0/00:08:a2:09:e9:be Can't attach interface {} to bpf device /dev/bpf0: Device not configured If you think you have received this message due to a bug rather than a configuration issue please read the section on submitting bugs on either our web page at www.isc.org or in the README file before submitting a bug. These pages explain the proper process and the information we find helpful for debugging. exiting.' Feb 19 18:22:07 check_reload_status rc.newwanip starting igb0 Feb 19 18:22:08 php-fpm 66863 /rc.newwanip: rc.newwanip: Info: starting on igb0. Feb 19 18:22:08 php-fpm 66863 /rc.newwanip: rc.newwanip: on (IP address: 100.92.220.245) (interface: WAN[wan]) (real interface: igb0). Feb 19 18:22:08 php-fpm 66863 /rc.newwanip: The command '/sbin/route delete -host ' returned exit code '64', the output was 'route: destination parameter required route: usage: route [-46dnqtv] command [[modifiers] args]' Feb 19 18:22:08 php-fpm 66863 /rc.newwanip: The command '/sbin/route delete -host ' returned exit code '64', the output was 'route: destination parameter required route: usage: route [-46dnqtv] command [[modifiers] args]' Feb 19 18:22:13 php-fpm 66863 /rc.newwanip: Resyncing OpenVPN instances for interface WAN. Feb 19 18:22:13 php-fpm 66863 OpenVPN terminate old pid: 98658 Feb 19 18:22:14 kernel sonewconn: pcb 0xfffff8000a55db40: Listen queue overflow: 2 already in queue awaiting acceptance (1 occurrences) Feb 19 18:22:16 php-fpm 66863 /rc.newwanip: OpenVPN ID client3 PID 98658 still running, killing. Feb 19 18:22:17 kernel ovpnc3: link state changed to DOWN Feb 19 18:22:17 php-fpm 66863 OpenVPN PID written: 84973 Feb 19 18:22:17 check_reload_status Reloading filter Feb 19 18:22:17 php-fpm 66863 /rc.newwanip: Creating rrd update script Feb 19 18:22:19 kernel ovpnc3: link state changed to UP Feb 19 18:22:19 check_reload_status rc.newwanip starting ovpnc3 Feb 19 18:22:19 php-fpm 66863 /rc.newwanip: pfSense package system has detected an IP change or dynamic WAN reconnection - 100.92.220.245 -> 100.92.220.245 - Restarting packages. Feb 19 18:22:19 check_reload_status Starting packages Feb 19 18:22:20 php-fpm 339 /rc.newwanip: rc.newwanip: Info: starting on ovpnc3.
-
Still went down, I called they reset the ont and it came back instantly
I’ll have a static ip Monday
And a new modem Monday...
I setup the new 3100 today and it runs great!
Anything specific to setup a static address on pfsense?
-
Not really, if it's actually a static IP set the WAN to static and configure the IP, subnet and gateway.
If it's supplied to you a static dhcp lease then you don't have to do anything.
Steve
-
Every 24 hours this is beyond frustrating
3 pfsense routers all the same issue now
So according to the redmine fix isn’t released until version 2.5?
-
It's not at all clear that's what you're hitting. When it goes down what do you do to re-stablish it?
If that's your only WAN connection you can just disable the 'gateway monitoring action' on the gateway. That will keep the monitoring stats to prevent it reloading everything whenever the gateway stops responding.
You should probably also set the monitoring IP to something other than the gateway. The gateway may well drops pings under load.Steve
-
i turned off gateway monitoring entirely yesterday and it still went down early this morning
i go to interface tab. release, renew and the connection instantly comes back online
here are my findings the Nokia ONT is changing the local address of the device every 24 hours. when it changes Pfsense says the wan goes down, and everything stops working. i have no way to communicate to the ONT by webqui or ftp etc. since they willl not release that to me.
the last setting i just changed under wan is : Block private networks and loopback addresses UNCHECKED
-
Ok, what does the system log show when that happens? Or the DHCP log?
-
the 10.92 device is the ONT modem. which as you can see the address is changing
Feb 23 14:26:39 kernel sonewconn: pcb 0xc6ba38f0: Listen queue overflow: 2 already in queue awaiting acceptance (1 occurrences) Feb 23 14:29:32 kernel arpresolve: can't allocate llinfo for 100.92.192.1 on mvneta2 Feb 23 14:29:32 kernel arpresolve: can't allocate llinfo for 100.92.192.1 on mvneta2 Feb 23 14:29:32 php-fpm 363 /status_interfaces.php: Shutting down Router Advertisment daemon cleanly Feb 23 14:29:33 kernel arpresolve: can't allocate llinfo for 100.92.192.1 on mvneta2 Feb 23 14:29:33 kernel arpresolve: can't allocate llinfo for 100.92.192.1 on mvneta2 Feb 23 14:29:39 check_reload_status rc.newwanip starting mvneta2 Feb 23 14:29:39 php-fpm 24613 /status_interfaces.php: calling interface_dhcpv6_configure. Feb 23 14:29:39 php-fpm 24613 /status_interfaces.php: Accept router advertisements on interface mvneta2 Feb 23 14:29:39 php-fpm 24613 /status_interfaces.php: Starting rtsold process Feb 23 14:29:40 php-fpm 9259 /rc.newwanip: rc.newwanip: Info: starting on mvneta2. Feb 23 14:29:40 php-fpm 9259 /rc.newwanip: rc.newwanip: on (IP address: 100.92.204.194) (interface: WAN[wan]) (real interface: mvneta2). Feb 23 09:29:41 rtsold 51876 <sendpacket> sendmsg on mvneta2: Permission denied Feb 23 14:29:42 kernel ovpnc1: link state changed to DOWN Feb 23 14:29:42 check_reload_status Reloading filter Feb 23 14:29:43 kernel ovpnc1: link state changed to UP Feb 23 14:29:43 check_reload_status rc.newwanip starting ovpnc1 Feb 23 14:29:44 php-fpm 362 /rc.newwanip: rc.newwanip: Info: starting on ovpnc1. Feb 23 14:29:44 php-fpm 362 /rc.newwanip: rc.newwanip: on (IP address: 10.32.118.189) (interface: AIRVPN[opt3]) (real interface: ovpnc1). Feb 23 14:29:44 php-fpm 362 /rc.newwanip: IP Address has changed, killing states on former IP Address 10.27.202.21. Feb 23 09:29:45 rtsold 51876 <sendpacket> sendmsg on mvneta2: Permission denied Feb 23 14:29:45 php-fpm 9259 /rc.newwanip: Resyncing OpenVPN instances for interface WAN. Feb 23 14:29:46 php-fpm 9259 OpenVPN terminate old pid: 84804 Feb 23 14:29:46 kernel ovpnc1: link state changed to DOWN Feb 23 14:29:46 check_reload_status Reloading filter Feb 23 14:29:46 php-fpm 9259 OpenVPN PID written: 49485 Feb 23 14:29:46 check_reload_status Reloading filter Feb 23 14:29:46 php-fpm 9259 /rc.newwanip: Creating rrd update script Feb 23 14:29:48 php-fpm 9259 /rc.newwanip: pfSense package system has detected an IP change or dynamic WAN reconnection - 100.92.204.194 -> 100.92.204.194 - Restarting packages. Feb 23 14:29:48 check_reload_status Starting packages Feb 23 09:29:49 rc.gateway_alarm 26316 >>> Gateway alarm: AIRVPN_VPNV4 (Addr:10.32.118.189 Alarm:1 RTT:.898ms RTTsd:1.084ms Loss:25%) Feb 23 14:29:49 check_reload_status updating dyndns AIRVPN_VPNV4 Feb 23 14:29:49 check_reload_status Restarting ipsec tunnels Feb 23 14:29:49 check_reload_status Restarting OpenVPN tunnels/interfaces Feb 23 14:29:49 check_reload_status Reloading filter Feb 23 09:29:49 rtsold 51876 <sendpacket> sendmsg on mvneta2: Permission denied
-
-
I don't see a 10.100.x.x IP address those there. you mean the 100.92.x.x address? That looks like CGN.
It looks like the WAN address is changing, which is not unusual. The addresses involved there seem to be the WAN and it's gateway, I would not expect to see an IP for the local modem device at all.
The DHCP logs are hard to read in that format. We are only interested in the dhclient entries. They might be in UTC vs whatever timezone your other logs are in. Really we need to see the dhclient entries in the dhcp log and the system log entries covering that same time span to see what triggered the renewal, or lack of it.
If you're hitting that bug I expect to see the link fail or the lease force renewed for some reason in the system log.
Then the dhclient start in the dhcp log but fail to get an IP and timeout.
Then some error in the system log.
Then nothing until you manually restart dhclient.I'm not seeing that though, from what we have so far it's more like the remote dhcp server is handing you a ludicrously long lease and then expecting it to renew sooner. This entry:
dhclient 93554 bound to 100.92.204.194 -- renewal in 43200 seconds.
Shows a 12h lease though. pfSense would normally attempt to renew it after half the lease time. If you filter the dhcp log for only dhclient process entries I expect to see it renewing every 6h.
Steve
-
fixed the IP. this is carrier grade nat...
is there a way i can get you the logs you need by the var/log area easier?
Feb 23 09:35:29 pfSense dhclient[29691]: connection closed
Feb 23 09:35:29 pfSense dhclient[29691]: exiting.
Feb 23 09:35:31 pfSense dhclient: PREINIT
Feb 23 09:35:32 pfSense dhclient[54752]: DHCPREQUEST on mvneta2 to 255.255.255.255 port 67
Feb 23 09:35:32 pfSense dhclient[54752]: DHCPACK from 100.92.192.3
Feb 23 09:35:32 pfSense dhclient: REBOOT
Feb 23 09:35:32 pfSense dhclient: Starting add_new_address()
Feb 23 09:35:32 pfSense dhclient: ifconfig mvneta2 inet 100.92.204.194 netmask 255.255.224.0 broadcast 100.92.223.255
Feb 23 09:35:32 pfSense dhclient: New IP Address (mvneta2): 100.92.204.194
Feb 23 09:35:32 pfSense dhclient: New Subnet Mask (mvneta2): 255.255.224.0
Feb 23 09:35:32 pfSense dhclient: New Broadcast Address (mvneta2): 100.92.223.255
Feb 23 09:35:32 pfSense dhclient: New Routers (mvneta2): 100.92.192.1
Feb 23 09:35:32 pfSense dhclient: Adding new routes to interface: mvneta2
Feb 23 09:35:32 pfSense dhclient: Creating resolv.conf
Feb 23 09:35:32 pfSense dhclient[54752]: bound to 100.92.204.194 -- renewal in 43200another:
Feb 23 14:29:03 pfSense dhclient[8995]: send_packet: No route to host
Feb 23 14:29:30 pfSense dhclient[8995]: connection closed
Feb 23 14:29:30 pfSense dhclient[8995]: exiting.
Feb 23 09:29:39 pfSense dhclient: PREINIT
Feb 23 09:29:39 pfSense dhclient[29287]: DHCPREQUEST on mvneta2 to 255.255.255.255 port 67
Feb 23 09:29:39 pfSense dhclient[29287]: DHCPACK from 100.92.192.3
Feb 23 09:29:39 pfSense dhclient: REBOOT
Feb 23 09:29:39 pfSense dhclient: Starting add_new_address()
Feb 23 09:29:39 pfSense dhclient: ifconfig mvneta2 inet 100.92.204.194 netmask 255.255.224.0 broadcast 100.92.223.255
Feb 23 09:29:39 pfSense dhclient: New IP Address (mvneta2): 100.92.204.194
Feb 23 09:29:39 pfSense dhclient: New Subnet Mask (mvneta2): 255.255.224.0
Feb 23 09:29:39 pfSense dhclient: New Broadcast Address (mvneta2): 100.92.223.255
Feb 23 09:29:39 pfSense dhclient: New Routers (mvneta2): 100.92.192.1
Feb 23 09:29:39 pfSense dhclient: Adding new routes to interface: mvneta2
Feb 23 09:29:39 pfSense dhclient: /sbin/route add default 100.92.192.1
Feb 23 09:29:39 pfSense dhclient: Creating resolv.conf
Feb 23 09:29:39 pfSense dhclient[29287]: bound to 100.92.204.194 -- renewal in 1800 seconds. -
Is that two different devices? On two separate connections? They have the same IP address but only 6mins apart....
One is given a 12h lease but the other only 30mins.
I think you may have some conflict there....
-
same equipment. the new sg 3100.
is there something i can change or look at to remedy this?
-
So that is two excerpts from the same dhcp log on the same device?
The time stamps are confusing, what order should those be read in?
Both those show successful renewal though which is not what is expected from the dhcp bug we referenced.
Steve
-
same 3100. same device
i pulled them directly from var\logs i copied them to a word file. ctrl + f what you wanted to see and pasted it here
read top to bottom
i am really hoping a static IP address from the provider will resolve this
called my isp to see if they could enable the static today. not possible i have to wait until tomorrow at 9am EST :(
-
In the dhcp logs view you can filter by the dhclient process and then just copy/paste them here directly without going through Word (or any other editor).
Some of those logs show the gateway not responding to ARP which probably won't be solved by using a static IP. If you can get one though if will obviously solve any dhcp issues.Steve
-
static ip has been set and active for over 24 hours now. NO issues whatsoever. first time ever with this new internet service.
i will look forward to when Internet Systems Consortium DHCP Server 4.3.6-P1 is updated to 4.4.2 within Pfsense per https://www.isc.org/dhcp/
so all of this is fixed for users.
so to continue using Pfsense i will be paying 10 dollars extra a month until the release..
-
It's dhclient not the dhcp server. It will be in 2.5 when that is released, it isn't yet in 2.5 snaps as they are currently built on 12.0.
Steve
-
As I said (in the wrong thread)....
stephenw10 Netgate Administrator about 16 hours ago
Ok, good news. The binary part of the fix for this is now in 2.4.5 snapshots:
https://github.com/pfsense/FreeBSD-src/commits/RELENG_2_4_5/sbin/dhclient/dhclient.cThe full fix also requires changes to the dhclient script which can be applied via the system patches package. I have briefly tested that and it didn't seem to break anything.
That patch is here: https://redmine.pfsense.org/attachments/download/2682/pfsense-dhclient-script-patch.txt
If you're able to test it we may be able to include it in 2.4.5.
Steve
-
no problem Sir.
at this time i have no way to test as i am locked in a one year agreement with a static WAN ip address. my issue resolved.
not sure i provided you any good information. feel free to lock this thread and work with the other gentlemen if that seems best
-
I'm going to test locally but I can only try to simulate a failed dhcp server. It is definitely a bug that would be very good to squash. I'd love to hear from anyone who is hitting it 'in the field'.
Steve
-
@stephenw10 Hello, I feel like I'm running into this.
Quick overview on my situation:
- Currently running ver 22.05-RELEASE on a 2100. (recently updated)
- Ran over 2 years w/o ever having to reboot appliance on Xfinity internet.
- Recent switch to MetroNet Fiber - Internet goes down every 24-36 hours.
- Has happened 3 times so far, feels like it is reproducing reliably within these time frames.
- Manually releasing WAN / renew WAN action on WAN interface - Restores Internet
Trying to figure out a good path forward:
- Pay $10/month for a static IP (I'd rather not)
- Can we script something to automate the release/renew action during the night on a daily basis? (Feels like a hack)
- Is this a bug that can be fixed? (best solution IMO)
-
What do you see logged when it happens?
The actual bug referenced here was fixed in 2.4.5:
https://redmine.pfsense.org/issues/9267I suggest you may well be hitting this:
https://forum.netgate.com/post/1063443So specifically MetroNets broken DHCP relay behaviour. The workaround shown there should prevent it if so.
Steve
-
I haven't caught logs when the issue happens yet. I'm hoping I can catch on the next reproduce since I "sort of" know what to look for. Below is what I have on the most recent restore of services.
Any logging in particular you would need?
I feel like you're probably right about it being an issue with MetroNet. I called them today and they just rebooted everything and suggested I move to another piece of equipment that they can support.
Appreciate you linking me to that HOWTO post. I'll probably end up implementing that after I reproduce one more time.
This was on the latest restore:
(Post was flagged for SPAM, so created a pastebin)
https://pastebin.com/04r3wLek