Australian NBN connection stops after random time
-
G'day,
I've only recently started using pfSense, and it's a nice setup.
I have it running on a generic multi-ethernet micro PC platform, one of these little things :
It gets pretty hot; the heatsink surface is almost too hot to touch, but the temperature displayed on the pfSense dashboard sits at 27.9 degrees. I think it's lying. Anyway ... I have a fan on order that I'm going to pop on top of it, which should help ...
It's set up with four networks: one out through the WAN port to an Australian NBNco modem/router, which provides DHCP and a link to the Internet; the other ports are local LANs, pretty standard. The only exception is that one of the LANs is a routable network (and is routed, no NAT, to the 'net), while the others use RFC 1918 addresses and are NAT'ed via the WAN port's IP address for outgoing traffic.
Again, nothing extraordinary here.
Every few days, at random, the connection to my Australian NBN router drops out. I've not had time to investigate it thoroughly, but I can connect to the pfSense box from my LANs just fine; it's just the connection to the NBNco box that seems to be failing "somewhere". A reboot brings it back up. I haven't yet tried bringing the interface down and back up to see if that would fix it. I suspect, but am not certain, that the NBNco modem is a toy and has issues, but I can't confirm that.
Once I find a reliable way to detect and fix the issue, whether that's a reboot, or bouncing the WAN port or whatever, what's the best way to plug a script into the pfSense setup so it will survive upgrades and so on?
Also, has anyone seen this issue or similar and has any suggestions?
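For the detect-and-fix question, a minimal watchdog sketch follows. It assumes a Python 3 interpreter is available on the firewall (not confirmed in this thread), that `em0` is the WAN port, and that bouncing the interface with `ifconfig` is enough to recover, which is still untested here. `PROBE_TARGET`, `FAIL_THRESHOLD`, and all function names are made up for illustration. The `ping` flags are the FreeBSD ones (`-c` count, `-t` overall timeout in seconds).

```python
import subprocess
from typing import Callable, List

WAN_IF = "em0"            # WAN interface name, as shown by ifconfig on this box
PROBE_TARGET = "8.8.8.8"  # hypothetical probe target; any host that reliably answers ping
FAIL_THRESHOLD = 3        # hypothetical: number of probes that must all fail before acting


def ping_once(target: str) -> bool:
    """Send one ICMP echo with FreeBSD ping; True if a reply came back."""
    result = subprocess.run(
        ["ping", "-c", "1", "-t", "2", target],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def wan_is_down(target: str, attempts: int,
                probe: Callable[[str], bool] = ping_once) -> bool:
    """Return True only if every one of `attempts` probes fails."""
    # any() stops at the first successful probe, so a healthy link costs one ping.
    return not any(probe(target) for _ in range(attempts))


def bounce_commands(interface: str) -> List[List[str]]:
    """Commands that take the interface down and bring it back up."""
    return [["ifconfig", interface, "down"], ["ifconfig", interface, "up"]]


def run_once() -> None:
    """One watchdog pass: probe, and bounce the WAN port only if all probes fail."""
    if wan_is_down(PROBE_TARGET, FAIL_THRESHOLD):
        for cmd in bounce_commands(WAN_IF):
            subprocess.run(cmd, check=True)
```

Calling `run_once()` from a periodic job would be the natural hook. Registering that job through something stored in the pfSense configuration, rather than a hand-edited system file, seems the safer bet for surviving upgrades, but that is an assumption to verify on your install.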
Thank you,
Carl
-
I'm an idiot, forgot to say versions etc, pfSense :
2.6.0-RELEASE (amd64)
built on Mon Jan 31 19:57:53 UTC 2022
FreeBSD 12.3-STABLE
-
The DHCP settings for the WAN port are currently "FreeBSD default". Would that make any difference?
-
@bleve What type of NICs does this device have? What's the driver appear as in pfSense?
-
@rcoleman-netgate dmesg says :
em0: <Intel(R) Gigabit CT 82574L> port 0xe000-0xe01f mem 0xdf540000-0xdf55ffff,0xdf560000-0xdf563fff irq 17 at device 0.0 on pci1
em0: EEPROM V1.9-0
em0: Using 1024 TX descriptors and 1024 RX descriptors
em0: Using 2 RX queues 2 TX queues
em0: Using MSI-X interrupts with 3 vectors
em0: Ethernet address: 00:f1:f3:21:7b:ed
em0: netmap queues/slots: TX 2/1024, RX 2/1024
pcib2: <ACPI PCI-PCI bridge> irq 18 at device 28.2 on pci0
pci2: <ACPI PCI bus> on pcib2
em1: <Intel(R) Gigabit CT 82574L> port 0xd000-0xd01f mem 0xdf440000-0xdf45ffff,0xdf460000-0xdf463fff irq 18 at device 0.0 on pci2
em1: EEPROM V1.9-0
em1: Using 1024 TX descriptors and 1024 RX descriptors
em1: Using 2 RX queues 2 TX queues
em1: Using MSI-X interrupts with 3 vectors
em1: Ethernet address: 00:f1:f3:21:7b:ee
em1: netmap queues/slots: TX 2/1024, RX 2/1024
pcib3: <ACPI PCI-PCI bridge> irq 19 at device 28.3 on pci0
pci3: <ACPI PCI bus> on pcib3
em2: <Intel(R) Gigabit CT 82574L> port 0xc000-0xc01f mem 0xdf340000-0xdf35ffff,0xdf360000-0xdf363fff irq 19 at device 0.0 on pci3
em2: EEPROM V1.9-0
em2: Using 1024 TX descriptors and 1024 RX descriptors
em2: Using 2 RX queues 2 TX queues
em2: Using MSI-X interrupts with 3 vectors
em2: Ethernet address: 00:f1:f3:21:7b:ef
em2: netmap queues/slots: TX 2/1024, RX 2/1024
pcib4: <ACPI PCI-PCI bridge> irq 16 at device 28.4 on pci0
pci4: <ACPI PCI bus> on pcib4
em3: <Intel(R) Gigabit CT 82574L> port 0xb000-0xb01f mem 0xdf240000-0xdf25ffff,0xdf260000-0xdf263fff irq 16 at device 0.0 on pci4
em3: EEPROM V1.9-0
em3: Using 1024 TX descriptors and 1024 RX descriptors
em3: Using 2 RX queues 2 TX queues
em3: Using MSI-X interrupts with 3 vectors
em3: Ethernet address: 00:f1:f3:21:7b:f0
-
ifconfig reports :
[2.6.0-RELEASE][carl@barry.aboc.net.au]/home/carl: ifconfig em0
em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
description: WAN
options=81209b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC,VLAN_HWFILTER>
ether 00:f1:f3:21:7b:ed
inet6 fe80::2f1:f3ff:fe21:7bed%em0 prefixlen 64 scopeid 0x1
inet 167.179.136.192 netmask 0xfffffc00 broadcast 167.179.139.255
media: Ethernet autoselect (1000baseT <full-duplex>)
status: active
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
-
@bleve OK good, not Realtek.
When it goes down what do you see at Status->Gateways?
-
@rcoleman-netgate yes, not realtek!
I'll have to wait and try to catch it again. Unless it may have been logged? Is this relevant?
Jan 17 17:15:35 dpinger 50348 send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 1 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr 167.179.136.1 bind_addr 167.179.136.192 identifier "WAN_DHCP "
Jan 17 17:15:36 dpinger 54506 send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 1 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr 167.179.136.1 bind_addr 167.179.136.192 identifier "WAN_DHCP "
There's also a pile of these :
Jan 17 17:11:26 dpinger 54594 WAN_DHCP 167.179.136.1: sendto error: 64
Jan 17 17:11:26 dpinger 54594 WAN_DHCP 167.179.136.1: sendto error: 64
Jan 17 17:11:27 dpinger 54594 WAN_DHCP 167.179.136.1: sendto error: 65
Jan 17 17:11:27 dpinger 54594 WAN_DHCP 167.179.136.1: sendto error: 65
Jan 17 17:11:28 dpinger 54594 WAN_DHCP 167.179.136.1: sendto error: 65
These happen before the last errors
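The numbers in the `sendto error` lines are errno values from the `sendto()` call. A small sketch for decoding them is below. The table is hard-coded with the FreeBSD definitions (64 is `EHOSTDOWN`, 65 is `EHOSTUNREACH`) because Python's own `errno` module reflects whatever OS the script runs on, not the firewall. The regex and the function name are assumptions made for this example, based only on the log format shown here.

```python
import re
from typing import Optional, Tuple

# errno values as defined on FreeBSD; other systems number these differently.
FREEBSD_ERRNO = {
    64: ("EHOSTDOWN", "Host is down"),
    65: ("EHOSTUNREACH", "No route to host"),
}

# Matches lines like:
#   Jan 17 17:11:26 dpinger 54594 WAN_DHCP 167.179.136.1: sendto error: 64
LINE_RE = re.compile(
    r"dpinger\s+\d+\s+(?P<gateway>\S+)\s+(?P<addr>[\d.]+): sendto error: (?P<code>\d+)"
)


def decode_sendto_error(line: str) -> Optional[Tuple[str, str, int, str, str]]:
    """Return (gateway, address, code, errno name, message), or None if no match."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    code = int(match.group("code"))
    name, message = FREEBSD_ERRNO.get(code, ("UNKNOWN", "not in this table"))
    return match.group("gateway"), match.group("addr"), code, name, message


sample = "Jan 17 17:11:26 dpinger 54594 WAN_DHCP 167.179.136.1: sendto error: 64"
print(decode_sendto_error(sample))
```

Both codes are raised locally when the probe is sent, before any reply could be dropped upstream. That may point at the link or at neighbour resolution toward 167.179.136.1 rather than at unanswered pings, though that reading is an inference from the error names, not something the logs prove.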
-
@bleve Yep, so what's your System->Routing setting for the Monitoring IP on your WAN set to?
-
WAN_DHCP WAN 167.179.136.1 167.179.136.1 Interface WAN_DHCP Gateway
That's the next hop, confirmed by traceroute :
[2.6.0-RELEASE][carl@barry.aboc.net.au]/home/carl: traceroute www.sun.com
traceroute: Warning: www.sun.com has multiple addresses; using 23.214.90.91
traceroute to e120265.dscx.akamaiedge.net (23.214.90.91), 64 hops max, 40 byte packets
1 loop1671791360.bng.mel.aussiebb.net (167.179.136.1) 2.708 ms 3.123 ms 1.961 ms
2 10.241.4.108 (10.241.4.108) 1.813 ms 1.851 ms 2.029 ms
.
.
.
-
@bleve So that's your upstream IP and that's the default action. But if you change it to a public IP that always replies (Google DNS, CloudFlare, any other like that which replies to a ping) you will likely stay online.
This is from an email I sent to a TAC Professional customer earlier this evening regarding a request for a configuration review on this very issue:
I recommend changing your Gateway Monitoring IP from {blank} to something on the internet that will always respond to a ping; typically a DNS server fits this need. If you don't specify one, your upstream device is used, and ISPs often treat a once-a-second ping as an attempted Denial-of-Service (DoS) attack and block it. The gateway is then marked down even though the ISP is still routing all the traffic, and you are down for 5-15 minutes depending on the ISP's policies. I recommend setting this to either Google's or CloudFlare's DNS server IP.
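To put a number on the probe rate, the dpinger startup line logged earlier in this thread can be parsed directly. This is a rough sketch: the function names are invented for the example, and it only reads the `NNNms` and `NN%` fields exactly as they appear in that log line.

```python
import re
from typing import Dict

# Parameter string copied from the dpinger startup line posted in this thread.
SAMPLE = (
    'send_interval 500ms loss_interval 2000ms time_period 60000ms '
    'report_interval 0ms data_len 1 alert_interval 1000ms '
    'latency_alarm 500ms loss_alarm 20% dest_addr 167.179.136.1 '
    'bind_addr 167.179.136.192 identifier "WAN_DHCP "'
)


def parse_dpinger_params(line: str) -> Dict[str, int]:
    """Collect every `name NNNms` or `name NN%` pair as an integer."""
    return {name: int(value) for name, value in re.findall(r"(\w+) (\d+)(?:ms|%)", line)}


def probes_per_second(params: Dict[str, int]) -> float:
    """Probes sent per second, given send_interval in milliseconds."""
    return 1000 / params["send_interval"]


def probes_per_window(params: Dict[str, int]) -> int:
    """Probes that fall inside one time_period averaging window."""
    return params["time_period"] // params["send_interval"]


params = parse_dpinger_params(SAMPLE)
print(probes_per_second(params))  # 2.0
print(probes_per_window(params))  # 120
```

With the settings logged on this box, that works out to two probes a second toward the monitor address, so roughly 120 probes per 60-second window.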
-
@rcoleman-netgate So you're suggesting that my ISP is blocking the monitor, and then the DHCP fails? What about just getting rid of the monitor entirely? I see that's an option. Silly or not?
-
@bleve I would keep dpinger working and just set the IP to something outside the ISP's purview. That's easier than trying to guess whether the internet is down because you don't have a monitor at all anymore.
-
@rcoleman-netgate Thank you, I've set it to the poor much-hammered-on 8.8.8.8, will wait & see.
Thank you for your help!
Any idea for how I can get the temp sensor to behave? I don't believe it's 27.9 degrees all the time!
Carl
-
@bleve said in Australian NBN connection stops after random time:
I don't believe it's 27.9 degrees all the time!
That sounds kinda low, honestly. Could be reading the wrong detail. I can't comment on the third party hardware -- that's outside of the scope of my work.
-
@rcoleman-netgate It's 100% wrong!
Under all this is a FreeBSD 12.3 box. Is it safe to install mbmon and see if it'll work?
-
@bleve We don't recommend side-loading software, but if you want to, there's nothing to stop you.
-
@rcoleman-netgate I'll skip it, if it's not recommended.
Thank you again for your time.
-
I'm guessing but it looks like you're with Aussie Broadband - based on that gateway IP address.
What sort of NBN connection do you have? Just wondering what your "NBN modem/router" is.
I'm with Aussie on HFC. I haven't had gateway monitoring on for more than three years. You could just try turning it off and see what happens. ABB-allocated IP addresses are very "sticky".
-
@biggsy said in Australian NBN connection stops after random time:
I'm guessing but it looks like you're with Aussie Broadband - based on that gateway IP address.
Yes, FTTP, with a static IP and a /24 behind it. The IP address won't change. Does it still want some sort of monitoring? Yes. A ping every second? Probably more than it needs!