metronet fiber, internet goes down roughly every 24 hours



  • it is currently my backup router in case my main router goes down. if it will not work on 2.4.5 i will need to find another backup solution

    metronet does not seem to like my netgate minnowboard mbt 4220 so i am having to run the sg2220 until i prove metronet is the issue


  • Netgate Administrator

    Yes.
    Though there is no 2.4.5 release yet only RC snapshots.

    Steve



  • thank you

    i am about to boot up the minnowboard. do you have any idea what i can be looking for in the logs for a failing wan port?

    example:

    the first line is when the internet went down. 100.92.192.1 is the gateway address of the modem itself

    Feb 16 05:55:27 	dpinger 		WAN_DHCP 100.92.192.1: Alarm latency 2326us stddev 2231us loss 21%
    Feb 16 06:03:57 	dpinger 		send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 0 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr 100.92.192.1 bind_addr 100.92.205.91 identifier "WAN_DHCP "
    Feb 16 06:03:57 	dpinger 		send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 0 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr 100.92.192.1 bind_addr 100.92.205.91 identifier "WAN_DHCP "
    Feb 16 06:04:05 	dpinger 		send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 0 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr 100.92.192.1 bind_addr 100.92.205.91 identifier "WAN_DHCP "
    Feb 16 06:04:05 	dpinger 		send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 0 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr 100.92.192.1 bind_addr 100.92.205.91 identifier "WAN_DHCP "
    Feb 16 06:07:25 	dpinger 		send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 0 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr 100.92.192.1 bind_addr 100.92.205.91 identifier "WAN_DHCP "
    Feb 16 06:07:25 	dpinger 		send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 0 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr 100.92.192.1 bind_addr 100.92.205.91 identifier "WAN_DHCP "
    Feb 16 06:07:33 	dpinger 		send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 0 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr 100.92.192.1 bind_addr 100.92.205.91 identifier "WAN_DHCP "
    Feb 16 06:07:36 	dpinger 		send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 0 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr 100.92.192.1 bind_addr 100.92.205.91 identifier "WAN_DHCP "
    

  • Netgate Administrator

    All of those are dpinger starting apart from the first line which is an alarm when it hit >20% packet loss.

    There's nothing in the system to show why it kept restarting? Flapping link would be my first guess.

    Steve
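
A quick way to confirm that reading of the log is to sort the dpinger lines into alarms and startups. The sketch below is only illustrative: the function names are hypothetical, the sample lines are copied from the log above with whitespace normalized, and it assumes that a line listing `send_interval` and `loss_alarm` is a dpinger startup banner, as described in the reply.

```python
import re

# Sample lines copied from the log above (tabs collapsed to single spaces).
LOG = """\
Feb 16 05:55:27 dpinger WAN_DHCP 100.92.192.1: Alarm latency 2326us stddev 2231us loss 21%
Feb 16 06:03:57 dpinger send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 0 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr 100.92.192.1 bind_addr 100.92.205.91 identifier "WAN_DHCP "
Feb 16 06:04:05 dpinger send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 0 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr 100.92.192.1 bind_addr 100.92.205.91 identifier "WAN_DHCP "
"""

ALARM = re.compile(r"Alarm latency (\d+)us stddev (\d+)us loss (\d+)%")


def classify(line):
    """Return ("alarm", loss_percent), ("start", None) or ("other", None)."""
    match = ALARM.search(line)
    if match:
        return "alarm", int(match.group(3))
    # Assumption: a line that lists the monitoring parameters is a startup banner.
    if "send_interval" in line and "loss_alarm" in line:
        return "start", None
    return "other", None


def summarize(text):
    """Count how many non-empty lines fall into each category."""
    counts = {"alarm": 0, "start": 0, "other": 0}
    for line in text.splitlines():
        if line.strip():
            kind, _ = classify(line)
            counts[kind] += 1
    return counts


print(summarize(LOG))
```

Many "start" lines after a single alarm would match the restart loop described above; the cause of the restarts still has to come from the main system log.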



  • Understood.

    Would you agree that the device hooked up to the pfsense router is what is going offline, though?

    Which tab are you suggesting I review logs from?


  • Netgate Administrator

    The main system log would show the link going up and down if it is in fact doing that.
    If not it should show what is triggering dpinger to restart continually.

    Steve



  • The device was powered off, so it does not show any logs under System > General from before 11am, the last time I pulled the log I posted previously.
    Checking /var/log



  • Order 101916 has been placed

    I am not going to use the minnowboard as my primary device anymore

    Thank you for great support and products


  • Netgate Administrator

    Well I can't object. 😉 But it would be good to know why that was happening. You might try swapping the LAN and WAN assignment and see if the error follows the interface or stays on the port.

    Steve



  • i tried loading ubuntu LTS, 2 versions in fact, to test the nics. i don't have a hub to get both a keyboard and mouse working, so only one would work at a time. i tried a keyboard with a hub built in... still both would not work. could not find any good documentation

    then i find the minnowboard project has been abandoned pretty much, so i'll scratch that 300 dollar purchase off as a lesson (nothing against Netgate i know it wasn't an official product)

    time to move on to bigger and better you know



  • That 4220 might still be covered under warranty. I’m pretty sure it was released after the SG-2220.

    Jeff



  • I bought it in august of 2018

    The whole minnowboard project has been abandoned it appears



  • @bcruze I would still contact them, it's probably under warranty.

    https://go.netgate.com/support/login

    Jeff



  • looks like i have my answer in this thread. it is a Pfsense software issue: https://forum.netgate.com/topic/150568/problems-reestablishing-the-connection/2



  • since my wan continues to drop (Metronet says it is not dropping) and Pfsense will not reconnect:

    under interfaces > wan > dhcp configuration > presets.

    correct me if i am wrong, but will that not force the device to reconnect per the presets and fix this issue?


  • Netgate Administrator

    If you are actually hitting https://redmine.pfsense.org/issues/9267 then the problem is that if dhclient hits its timeout values it mishandles the error, crashes out and never retries. Others have worked around that by setting the timeout to something much higher like 900s. https://forum.netgate.com/post/854007

    That works well for the situation where the modem reboots but takes longer than the timeout to do so.

    Steve



  • it's enabled now. it was not before. 24 hours will tell, thanks for the replies Stephen

    more logs, that show the gateway goes down for whatever reason.

    Feb 19 18:13:27	rc.gateway_alarm	2403	>>> Gateway alarm: WAN_DHCP (Addr:100.92.192.1 Alarm:1 RTT:1.135ms RTTsd:1.512ms Loss:21%)
    Feb 19 18:13:27	check_reload_status		updating dyndns WAN_DHCP
    Feb 19 18:13:27	check_reload_status		Restarting ipsec tunnels
    Feb 19 18:13:27	check_reload_status		Restarting OpenVPN tunnels/interfaces
    Feb 19 18:13:27	check_reload_status		Reloading filter
    Feb 19 18:13:28	php-fpm	66863	/rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed its IP. Reloading endpoints that may use WAN_DHCP.
    Feb 19 18:16:59	php-fpm	53040	/index.php: Successful login for user 'admin' from: 192.168.1.205 (Local Database)
    Feb 19 18:21:59	php-fpm	53040	/status_interfaces.php: The command '/usr/local/sbin/dhclient {$ipv} -d -r -lf '/var/db/dhclient.leases.igb0' -cf '/var/etc/dhclient_wan.conf' -sf '/usr/local/sbin/pfSense-dhclient-script'' returned exit code '1', the output was 'Internet Systems Consortium DHCP Client 4.3.6-P1 Copyright 2004-2018 Internet Systems Consortium. All rights reserved. For info, please visit https://www.isc.org/software/dhcp/ Listening on BPF/igb0/00:08:a2:09:e9:be Sending on BPF/igb0/00:08:a2:09:e9:be Can't attach interface {} to bpf device /dev/bpf0: Device not configured If you think you have received this message due to a bug rather than a configuration issue please read the section on submitting bugs on either our web page at www.isc.org or in the README file before submitting a bug. These pages explain the proper process and the information we find helpful for debugging. exiting.'
    Feb 19 18:22:07	check_reload_status		rc.newwanip starting igb0
    Feb 19 18:22:08	php-fpm	66863	/rc.newwanip: rc.newwanip: Info: starting on igb0.
    Feb 19 18:22:08	php-fpm	66863	/rc.newwanip: rc.newwanip: on (IP address: 100.92.220.245) (interface: WAN[wan]) (real interface: igb0).
    Feb 19 18:22:08	php-fpm	66863	/rc.newwanip: The command '/sbin/route delete -host ' returned exit code '64', the output was 'route: destination parameter required route: usage: route [-46dnqtv] command [[modifiers] args]'
    Feb 19 18:22:08	php-fpm	66863	/rc.newwanip: The command '/sbin/route delete -host ' returned exit code '64', the output was 'route: destination parameter required route: usage: route [-46dnqtv] command [[modifiers] args]'
    Feb 19 18:22:13	php-fpm	66863	/rc.newwanip: Resyncing OpenVPN instances for interface WAN.
    Feb 19 18:22:13	php-fpm	66863	OpenVPN terminate old pid: 98658
    Feb 19 18:22:14	kernel		sonewconn: pcb 0xfffff8000a55db40: Listen queue overflow: 2 already in queue awaiting acceptance (1 occurrences)
    Feb 19 18:22:16	php-fpm	66863	/rc.newwanip: OpenVPN ID client3 PID 98658 still running, killing.
    Feb 19 18:22:17	kernel		ovpnc3: link state changed to DOWN
    Feb 19 18:22:17	php-fpm	66863	OpenVPN PID written: 84973
    Feb 19 18:22:17	check_reload_status		Reloading filter
    Feb 19 18:22:17	php-fpm	66863	/rc.newwanip: Creating rrd update script
    Feb 19 18:22:19	kernel		ovpnc3: link state changed to UP
    Feb 19 18:22:19	check_reload_status		rc.newwanip starting ovpnc3
    Feb 19 18:22:19	php-fpm	66863	/rc.newwanip: pfSense package system has detected an IP change or dynamic WAN reconnection - 100.92.220.245 -> 100.92.220.245 - Restarting packages.
    Feb 19 18:22:19	check_reload_status		Starting packages
    Feb 19 18:22:20	php-fpm	339	/rc.newwanip: rc.newwanip: Info: starting on ovpnc3.
    


  • Still went down. I called, they reset the ONT, and it came back instantly

    I’ll have a static ip Monday

    And a new modem Monday...

    I setup the new 3100 today and it runs great!

    Anything specific to setup a static address on pfsense?


  • Netgate Administrator

    Not really, if it's actually a static IP set the WAN to static and configure the IP, subnet and gateway.

    If it's supplied to you as a static DHCP lease then you don't have to do anything.

    Steve
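
Before typing a static IP, subnet and gateway into the WAN settings, it can help to check that the three values agree with each other. The sketch below is a minimal sanity check using Python's standard `ipaddress` module. The helper name is hypothetical, and the addresses are placeholders borrowed from the DHCP lease shown later in this thread, not the actual static assignment from the ISP.

```python
import ipaddress


def check_static_wan(ip, netmask, gateway):
    """Return the WAN network and a list of problems found (empty if none)."""
    iface = ipaddress.ip_interface(f"{ip}/{netmask}")
    gw = ipaddress.ip_address(gateway)
    net = iface.network
    problems = []
    if gw not in net:
        problems.append("gateway is outside the WAN subnet")
    if gw == iface.ip:
        problems.append("gateway equals the WAN address")
    if iface.ip in (net.network_address, net.broadcast_address):
        problems.append("WAN address is the network or broadcast address")
    return net, problems


# Placeholder values taken from the dhclient log in this thread.
network, issues = check_static_wan("100.92.204.194", "255.255.224.0", "100.92.192.1")
print(network, network.broadcast_address, issues)
```

With those placeholder values the computed broadcast address matches the one dhclient reports in the later log excerpt.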



  • Every 24 hours this is beyond frustrating

    3 pfsense routers all the same issue now

    So according to the redmine, the fix isn’t released until version 2.5?


  • Netgate Administrator

    It's not at all clear that's what you're hitting. When it goes down what do you do to re-establish it?

    If that's your only WAN connection you can just disable the 'gateway monitoring action' on the gateway. That will keep the monitoring stats but prevent it reloading everything whenever the gateway stops responding.
    You should probably also set the monitoring IP to something other than the gateway. The gateway may well drop pings under load.

    Steve



  • i turned off gateway monitoring entirely yesterday and it still went down early this morning

    i go to interface tab. release, renew and the connection instantly comes back online

    here are my findings: the Nokia ONT is changing the local address of the device every 24 hours. when it changes, Pfsense says the wan goes down and everything stops working. i have no way to communicate with the ONT by webGUI or ftp etc. since they will not release that to me.

    the last setting i just changed under wan is : Block private networks and loopback addresses UNCHECKED


  • Netgate Administrator

    Ok, what does the system log show when that happens? Or the DHCP log?



  • the 10.92 device is the ONT modem. which as you can see the address is changing

    Feb 23 14:26:39	kernel		sonewconn: pcb 0xc6ba38f0: Listen queue overflow: 2 already in queue awaiting acceptance (1 occurrences)
    Feb 23 14:29:32	kernel		arpresolve: can't allocate llinfo for 100.92.192.1 on mvneta2
    Feb 23 14:29:32	kernel		arpresolve: can't allocate llinfo for 100.92.192.1 on mvneta2
    Feb 23 14:29:32	php-fpm	363	/status_interfaces.php: Shutting down Router Advertisment daemon cleanly
    Feb 23 14:29:33	kernel		arpresolve: can't allocate llinfo for 100.92.192.1 on mvneta2
    Feb 23 14:29:33	kernel		arpresolve: can't allocate llinfo for 100.92.192.1 on mvneta2
    Feb 23 14:29:39	check_reload_status		rc.newwanip starting mvneta2
    Feb 23 14:29:39	php-fpm	24613	/status_interfaces.php: calling interface_dhcpv6_configure.
    Feb 23 14:29:39	php-fpm	24613	/status_interfaces.php: Accept router advertisements on interface mvneta2
    Feb 23 14:29:39	php-fpm	24613	/status_interfaces.php: Starting rtsold process
    Feb 23 14:29:40	php-fpm	9259	/rc.newwanip: rc.newwanip: Info: starting on mvneta2.
    Feb 23 14:29:40	php-fpm	9259	/rc.newwanip: rc.newwanip: on (IP address: 100.92.204.194) (interface: WAN[wan]) (real interface: mvneta2).
    Feb 23 09:29:41	rtsold	51876	<sendpacket> sendmsg on mvneta2: Permission denied
    Feb 23 14:29:42	kernel		ovpnc1: link state changed to DOWN
    Feb 23 14:29:42	check_reload_status		Reloading filter
    Feb 23 14:29:43	kernel		ovpnc1: link state changed to UP
    Feb 23 14:29:43	check_reload_status		rc.newwanip starting ovpnc1
    Feb 23 14:29:44	php-fpm	362	/rc.newwanip: rc.newwanip: Info: starting on ovpnc1.
    Feb 23 14:29:44	php-fpm	362	/rc.newwanip: rc.newwanip: on (IP address: 10.32.118.189) (interface: AIRVPN[opt3]) (real interface: ovpnc1).
    Feb 23 14:29:44	php-fpm	362	/rc.newwanip: IP Address has changed, killing states on former IP Address 10.27.202.21.
    Feb 23 09:29:45	rtsold	51876	<sendpacket> sendmsg on mvneta2: Permission denied
    Feb 23 14:29:45	php-fpm	9259	/rc.newwanip: Resyncing OpenVPN instances for interface WAN.
    Feb 23 14:29:46	php-fpm	9259	OpenVPN terminate old pid: 84804
    Feb 23 14:29:46	kernel		ovpnc1: link state changed to DOWN
    Feb 23 14:29:46	check_reload_status		Reloading filter
    Feb 23 14:29:46	php-fpm	9259	OpenVPN PID written: 49485
    Feb 23 14:29:46	check_reload_status		Reloading filter
    Feb 23 14:29:46	php-fpm	9259	/rc.newwanip: Creating rrd update script
    Feb 23 14:29:48	php-fpm	9259	/rc.newwanip: pfSense package system has detected an IP change or dynamic WAN reconnection - 100.92.204.194 -> 100.92.204.194 - Restarting packages.
    Feb 23 14:29:48	check_reload_status		Starting packages
    Feb 23 09:29:49	rc.gateway_alarm	26316	>>> Gateway alarm: AIRVPN_VPNV4 (Addr:10.32.118.189 Alarm:1 RTT:.898ms RTTsd:1.084ms Loss:25%)
    Feb 23 14:29:49	check_reload_status		updating dyndns AIRVPN_VPNV4
    Feb 23 14:29:49	check_reload_status		Restarting ipsec tunnels
    Feb 23 14:29:49	check_reload_status		Restarting OpenVPN tunnels/interfaces
    Feb 23 14:29:49	check_reload_status		Reloading filter
    Feb 23 09:29:49	rtsold	51876	<sendpacket> sendmsg on mvneta2: Permission denied
    



  • Netgate Administrator

    I don't see a 10.100.x.x IP address in those logs. You mean the 100.92.x.x address? That looks like CGN.

    It looks like the WAN address is changing, which is not unusual. The addresses involved there seem to be the WAN and its gateway, I would not expect to see an IP for the local modem device at all.

    The DHCP logs are hard to read in that format. We are only interested in the dhclient entries. They might be in UTC vs whatever timezone your other logs are in. Really we need to see the dhclient entries in the dhcp log and the system log entries covering that same time span to see what triggered the renewal, or lack of it.

    If you're hitting that bug I expect to see the link fail or the lease force renewed for some reason in the system log.
    Then dhclient starting in the dhcp log but failing to get an IP and timing out.
    Then some error in the system log.
    Then nothing until you manually restart dhclient.

    I'm not seeing that though, from what we have so far it's more like the remote dhcp server is handing you a ludicrously long lease and then expecting it to renew sooner. This entry:

    dhclient
    93554
    bound to 100.92.204.194 -- renewal in 43200 seconds.
    

    Shows a 12h lease though. pfSense would normally attempt to renew it after half the lease time. If you filter the dhcp log for only dhclient process entries I expect to see it renewing every 6h.

    Steve
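
On the CGN remark above: carrier-grade NAT normally uses the shared address space 100.64.0.0/10 set aside by RFC 6598, so the guess can be checked mechanically. This is a small sketch with a hypothetical helper name; the addresses are the ones that appear in the logs in this thread.

```python
import ipaddress

# Shared address space reserved for carrier-grade NAT (RFC 6598).
CGN_SHARED_SPACE = ipaddress.ip_network("100.64.0.0/10")


def is_cgn(addr):
    """True if addr falls inside the RFC 6598 shared address space."""
    return ipaddress.ip_address(addr) in CGN_SHARED_SPACE


# WAN address, WAN gateway, and the VPN tunnel address from the logs.
for addr in ("100.92.204.194", "100.92.192.1", "10.32.118.189"):
    print(addr, is_cgn(addr))
```

Both 100.92.x.x addresses fall inside that block, while the 10.32.x.x VPN address is ordinary private space.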



  • fixed the IP. this is carrier grade nat...

    is there an easier way i can get you the logs you need from the /var/log area?

    Feb 23 09:35:29 pfSense dhclient[29691]: connection closed
    Feb 23 09:35:29 pfSense dhclient[29691]: exiting.
    Feb 23 09:35:31 pfSense dhclient: PREINIT
    Feb 23 09:35:32 pfSense dhclient[54752]: DHCPREQUEST on mvneta2 to 255.255.255.255 port 67
    Feb 23 09:35:32 pfSense dhclient[54752]: DHCPACK from 100.92.192.3
    Feb 23 09:35:32 pfSense dhclient: REBOOT
    Feb 23 09:35:32 pfSense dhclient: Starting add_new_address()
    Feb 23 09:35:32 pfSense dhclient: ifconfig mvneta2 inet 100.92.204.194 netmask 255.255.224.0 broadcast 100.92.223.255
    Feb 23 09:35:32 pfSense dhclient: New IP Address (mvneta2): 100.92.204.194
    Feb 23 09:35:32 pfSense dhclient: New Subnet Mask (mvneta2): 255.255.224.0
    Feb 23 09:35:32 pfSense dhclient: New Broadcast Address (mvneta2): 100.92.223.255
    Feb 23 09:35:32 pfSense dhclient: New Routers (mvneta2): 100.92.192.1
    Feb 23 09:35:32 pfSense dhclient: Adding new routes to interface: mvneta2
    Feb 23 09:35:32 pfSense dhclient: Creating resolv.conf
    Feb 23 09:35:32 pfSense dhclient[54752]: bound to 100.92.204.194 -- renewal in 43200

    another:
    Feb 23 14:29:03 pfSense dhclient[8995]: send_packet: No route to host
    Feb 23 14:29:30 pfSense dhclient[8995]: connection closed
    Feb 23 14:29:30 pfSense dhclient[8995]: exiting.
    Feb 23 09:29:39 pfSense dhclient: PREINIT
    Feb 23 09:29:39 pfSense dhclient[29287]: DHCPREQUEST on mvneta2 to 255.255.255.255 port 67
    Feb 23 09:29:39 pfSense dhclient[29287]: DHCPACK from 100.92.192.3
    Feb 23 09:29:39 pfSense dhclient: REBOOT
    Feb 23 09:29:39 pfSense dhclient: Starting add_new_address()
    Feb 23 09:29:39 pfSense dhclient: ifconfig mvneta2 inet 100.92.204.194 netmask 255.255.224.0 broadcast 100.92.223.255
    Feb 23 09:29:39 pfSense dhclient: New IP Address (mvneta2): 100.92.204.194
    Feb 23 09:29:39 pfSense dhclient: New Subnet Mask (mvneta2): 255.255.224.0
    Feb 23 09:29:39 pfSense dhclient: New Broadcast Address (mvneta2): 100.92.223.255
    Feb 23 09:29:39 pfSense dhclient: New Routers (mvneta2): 100.92.192.1
    Feb 23 09:29:39 pfSense dhclient: Adding new routes to interface: mvneta2
    Feb 23 09:29:39 pfSense dhclient: /sbin/route add default 100.92.192.1
    Feb 23 09:29:39 pfSense dhclient: Creating resolv.conf
    Feb 23 09:29:39 pfSense dhclient[29287]: bound to 100.92.204.194 -- renewal in 1800 seconds.


  • Netgate Administrator

    Is that two different devices? On two separate connections? They have the same IP address but only 6mins apart....

    One is given a 12h lease but the other only 30mins.

    I think you may have some conflict there....
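
For anyone following along, the two figures compared above come straight from the "renewal in N" values in the log excerpts. The sketch below only converts them to hours and minutes, and shows the half-lease point mentioned in the earlier reply. The names are hypothetical. Note that the log line itself says "renewal in", so whether the figure is the full lease or the renewal timer is left open here.

```python
from datetime import timedelta

# Figures copied from the two "bound to ... renewal in N" log lines above.
RENEWAL_SECONDS = {"first excerpt": 43200, "second excerpt": 1800}


def as_duration(seconds):
    """Format a number of seconds as H:MM:SS."""
    return str(timedelta(seconds=seconds))


for label, seconds in RENEWAL_SECONDS.items():
    print(f"{label}: {seconds} s = {as_duration(seconds)}")

# Half of the larger figure, the renewal point described in the earlier reply.
print("half of 43200 s =", as_duration(43200 // 2))
```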



  • same equipment. the new sg 3100.

    is there something i can change or look at to remedy this?


  • Netgate Administrator

    So that is two excerpts from the same dhcp log on the same device?

    The time stamps are confusing, what order should those be read in?

    Both those show successful renewal though which is not what is expected from the dhcp bug we referenced.

    Steve



  • same 3100. same device

    i pulled them directly from /var/log, copied them to a word file, used ctrl + f to find what you wanted to see and pasted it here

    read top to bottom

    i am really hoping a static IP address from the provider will resolve this

    called my isp to see if they could enable the static today. not possible i have to wait until tomorrow at 9am EST :(


  • Netgate Administrator

    In the dhcp logs view you can filter by the dhclient process and then just copy/paste them here directly without going through Word (or any other editor).
    Some of those logs show the gateway not responding to ARP which probably won't be solved by using a static IP. If you can get one though it will obviously solve any dhcp issues.

    Steve
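
If the log has already been copied out of /var/log as raw text, the same filtering can be done offline. The sketch below assumes the syslog-style layout seen in the excerpts pasted in this thread (`Mon DD HH:MM:SS host process[pid]: message`). The function name is hypothetical, and the one non-dhclient sample line is made up purely to show that it gets filtered out.

```python
import re

# Timestamp, host, process name, optional [pid], then the message.
LINE = re.compile(
    r"^(\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(\S+)\s+([^\s:\[]+)(?:\[(\d+)\])?:\s*(.*)$"
)

SAMPLE = """\
Feb 23 09:35:29 pfSense dhclient[29691]: connection closed
Feb 23 09:35:30 pfSense dhcpd: hypothetical non-dhclient line
Feb 23 09:35:31 pfSense dhclient: PREINIT
Feb 23 09:35:32 pfSense dhclient[54752]: DHCPACK from 100.92.192.3
"""


def dhclient_entries(text):
    """Return (timestamp, message) pairs for dhclient lines, in file order."""
    entries = []
    for line in text.splitlines():
        match = LINE.match(line)
        if match and match.group(3) == "dhclient":
            entries.append((match.group(1), match.group(5)))
    return entries


for stamp, message in dhclient_entries(SAMPLE):
    print(stamp, message)
```

Keeping the entries in file order rather than sorting by timestamp matters here, since the pasted excerpts mix clocks that are five hours apart.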



  • static ip has been set and active for over 24 hours now. NO issues whatsoever. first time ever with this new internet service.

    i will look forward to when Internet Systems Consortium DHCP Server 4.3.6-P1 is updated to 4.4.2 within Pfsense per https://www.isc.org/dhcp/

    so all of this is fixed for users.

    so to continue using Pfsense i will be paying 10 dollars extra a month until the release..


  • Netgate Administrator

    It's dhclient not the dhcp server. It will be in 2.5 when that is released, it isn't yet in 2.5 snaps as they are currently built on 12.0.

    Steve


  • Netgate Administrator

    As I said (in the wrong thread)....

    stephenw10 Netgate Administrator about 16 hours ago

    Ok, good news. The binary part of the fix for this is now in 2.4.5 snapshots:
    https://github.com/pfsense/FreeBSD-src/commits/RELENG_2_4_5/sbin/dhclient/dhclient.c

    The full fix also requires changes to the dhclient script which can be applied via the system patches package. I have briefly tested that and it didn't seem to break anything.

    That patch is here: https://redmine.pfsense.org/attachments/download/2682/pfsense-dhclient-script-patch.txt

    If you're able to test it we may be able to include it in 2.4.5.

    Steve



  • no problem Sir.

    at this time i have no way to test as i am locked in a one year agreement with a static WAN ip address. my issue is resolved.

    not sure i provided you any good information. feel free to lock this thread and work with the other gentlemen if that seems best


  • Netgate Administrator

    I'm going to test locally but I can only try to simulate a failed dhcp server. It is definitely a bug that would be very good to squash. I'd love to hear from anyone who is hitting it 'in the field'.

    Steve

