WAN alarm triggers complete loss of internal routing

korgua

Hi,

I am probably missing something real simple but I am puzzled why this event:

Sep 7 14:58:12 dpinger 77623 send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 1 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr 81.1.113.55 bind_addr 89.240.248.21 identifier "WAN_PPPOE "

results in a 2.6.0 pfSense becoming unreachable on the LAN. A cold boot seems to be the only cure.

I have since disabled Gateway monitoring - not sure if this will help?

Thank you for any assistance / insight offered.

stephenw10

That log entry is dpinger starting when the PPPoE WAN comes up.

That implies it went down.
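The startup line also lists every threshold dpinger is running with, so it is worth reading the values rather than treating the entry as an alarm. A minimal sketch for pulling one setting out of such a line; the helper name dpinger_param is made up here, and it assumes the space-separated "name value" layout seen in the log entry above.

```sh
#!/bin/sh
# Hypothetical helper: print the value that follows a given parameter name
# in a dpinger startup line read from stdin.
dpinger_param() {
    awk -v key="$1" '{
        for (i = 1; i < NF; i++)
            if ($i == key) { print $(i + 1); exit }
    }'
}

# Usage on the firewall (the log path is an assumption, adjust as needed):
#   grep dpinger /var/log/gateways.log | tail -1 | dpinger_param loss_alarm
```

Reading the parameters this way makes it easy to confirm which monitor address and bind address were in use when the WAN came back up.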

By far the most common cause of issues like this is that there are multiple gateways defined and the default IPv4 gateway is still set to automatic.

Go to System > Routing > Gateways and set the Default v4 gateway to WAN_PPPoE.

You probably have invalid gateway(s) configured on internal interfaces. Unless you are actually routing internally.
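To see which default route is actually installed, rather than what the GUI shows, something along these lines can be run from an SSH shell. It is only a sketch: default_route is a made-up helper name, and it assumes the usual FreeBSD netstat -rn column order (Destination, Gateway, Flags, Netif).

```sh
#!/bin/sh
# Hypothetical helper: read routing table text on stdin and print the
# gateway and interface of the default route (columns 2 and 4).
default_route() {
    awk '$1 == "default" { print $2, $4 }'
}

# Usage on the firewall:
#   netstat -rn -f inet | default_route
```

If the interface printed is not the PPPoE one, the default gateway selection is worth a second look.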

Steve

korgua

@stephenw10

Thanks Steve.

Under System / Routing / Gateways there has only ever been one gateway, WAN_PPPOE. No static routes or gateway groups.

All the internal L3 routing is via the pfSense, 9x /24 interfaces defined. IPv4 upstream gateway is set to none on all of these interfaces.

When the WAN drops, none of the VLAN IPs are reachable from devices in those subnets. Reboot the pfSense and connectivity is restored.

I would expect the internal routing to work regardless of WAN connectivity.

johnpoz

@korgua said in WAN alarm triggers complete loss of internal routing:

I would expect the internal routing to work regardless of WAN connectivity.

Yeah, that would/should be the case. You're saying that when the WAN goes down you cannot even ping the pfSense IP from a device in that segment?

And you can ping it before, right? It's not like you have some rules on the interface blocking ping? Do you have some policy routing set up? What do your rules look like? With WAN monitoring, it is possible that when you have rules set up with a gateway and that gateway is down, those rules are not loaded. But that shouldn't prevent local routing: if you policy route out your WAN, then to get to other local networks you would need rules above that anyway, and those shouldn't go away.

system / advanced / misc

Is it possible that when the WAN goes down, pfSense crashes completely?

stephenw10

Yup, not the common issue then. More digging required!

korgua

@johnpoz

Thanks John.

Yes, ping is allowed as is web access to the FW from some trusted subnets. On reboot the ping and web access resumes as does all the L3 routing.

All the rules have * under "Gateway"

No policy routing is setup.

pfSense has been working faultlessly for a few months....two WAN wobbles and LAN routing stops.

It's a headless mini PC with a single NIC, that requires a cold boot to get working again.

johnpoz

@korgua said in WAN alarm triggers complete loss of internal routing:

It's a headless mini PC with a single NIC

Ah - so this is just a single NIC, and your other networks are just VLANs on this physical NIC. That throws a wrench into it.

Are you saying that before, when your WAN went down, you were still able to route between your VLANs?

korgua

@johnpoz

We had one outage when the FTTP was first installed - put that down to 'other' events.

Since then, no WAN issues but two today and LAN routing stops...so, to answer your question, I think that from day 1, WAN outage would cause LAN routing to stop.

stephenw10

So hosts in each VLAN can still reach pfSense but not any other subnet?

korgua

@stephenw10

No, pfSense will not communicate with any device, i.e. pinging the local gateway or web browsing to it does not work. Apologies, I can see why my answers may have suggested this.

I have done the same configuration / setup on VMs before and they do not exhibit this issue.

stephenw10

Hmm, so actually it loses connectivity on all interfaces?

What are those interfaces, how are they connected?

Nothing else is logged?

johnpoz

@stephenw10 said in WAN alarm triggers complete loss of internal routing:

What are those interfaces, how are they connected?

Looks like this is a 1 nic device..

It's a headless mini PC with a single NIC

stephenw10

Ah, that would do it!

Is it a Realtek NIC?....

korgua

@stephenw10

It is

re0@pci0:2:0:0: class=0x020000 card=0x012310ec chip=0x816810ec rev=0x15 hdr=0x00
    vendor     = 'Realtek Semiconductor Co., Ltd.'
    device     = 'RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller'
    class      = network
    subclass   = ethernet

it has been performant and trouble free - until the WAN wobbles.

Other than swapping out the hardware is there something else to configure / check?

Addendum:

Id Refs Address                Size Name
 1   26 0xffffffff80200000  3aed878 kernel
 2    1 0xffffffff83cee000   39adb0 zfs.ko
 3    2 0xffffffff84089000     9860 opensolaris.ko
 4    1 0xffffffff84093000    d01c0 if_re.ko
 5    1 0xffffffff84321000     1000 cpuctl.ko
 6    1 0xffffffff84322000     2150 acpi_wmi.ko
 7    1 0xffffffff84325000     ab40 snd_uaudio.ko
 8    1 0xffffffff84330000     8e10 aesni.ko

Under System / Advanced / Networking

All ticked except ARP handling and Reset all states

stephenw10

Check the system logs for watchdog timeout messages from the re driver. If you see any, try using the alternative driver. Are you running 2.7?
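A minimal sketch of that log check, assuming the driver logs lines of the form "re0: watchdog timeout" and that the system log is plain text at /var/log/system.log (both are assumptions worth verifying on your install). The function name count_re_watchdog is illustrative only.

```sh
#!/bin/sh
# Hypothetical helper: count re-driver watchdog timeout lines in log text
# read from stdin.
count_re_watchdog() {
    # grep -c prints the number of matching lines; it exits non-zero when
    # the count is 0, so "|| true" keeps the helper usable under "set -e".
    grep -c -E 're[0-9]+: watchdog timeout' || true
}

# Usage on the firewall:
#   count_re_watchdog < /var/log/system.log
#   dmesg | count_re_watchdog
```

Any non-zero count around the time of the WAN drop would point at the NIC driver rather than at routing.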

korgua

@stephenw10

Looking at some notes I made, we had watchdog timeouts when performing speedtests on the new FTTP.

We did the following:

fetch -v https://pkg.opnsense.org/FreeBSD:12:amd64/snapshots/latest/All/realtek-re-kmod-196.04.txz
pkg install -f -y realtek-re-kmod-196.04.txz

and added
if_re_load="YES"
if_re_name="/boot/modules/if_re.ko"

to /boot/loader.conf.local

This resolved the issue with watchdog timeouts - not seen one since.
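For anyone repeating these steps, a rough way to confirm that both loader settings are present. This is a sketch only; check_re_loader is a made-up name, and the default path is the file mentioned above.

```sh
#!/bin/sh
# Hypothetical helper: succeed only if both if_re loader settings are
# present in the given file (defaults to /boot/loader.conf.local).
check_re_loader() {
    conf="${1:-/boot/loader.conf.local}"
    grep -q '^if_re_load="YES"' "$conf" &&
        grep -q '^if_re_name="/boot/modules/if_re.ko"' "$conf"
}

# Usage on the firewall:
#   check_re_loader && echo "loader settings present"
#   kldstat | grep if_re    # the module should appear in the list
```

The helper only checks the file contents; the kldstat line in the usage comment is what shows whether the module was actually loaded at boot.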

We are running 2.6

stephenw10

Well I'd recommend upgrading to 2.7 if you can. The re kmod driver is in our repo there and is at v1.98.

However, ultimately, you should use some other NIC.

korgua

@stephenw10

Thanks Steve. We will replace this mini-PC with a VM on an Intel NIC platform.

In every other respect the mini-PC has been great, but there is nothing to gain from experimenting on this hardware.