I have been doing some tests on my GNS3 lab, where I've replicated the production network.
On 23.09.1, whether or not I advertise the secondary default route, "netstat -rn" shows only one default route, out the WAN1 link, and "show ip route" in FRR shows only that one default route, with the 4 kernel routes present.
On 24.3, I've had either none, three or all four kernel routes show up, but I have to bounce the WAN links and shake things up a bit to get things to change. There does seem to be some kind of race condition in the learning and installing of routes.
When I have the secondary remote default route advertised, I see it in the "show ip route" output on 24.3 but not on 23.09.1, which has the kernel default route instead. An example of the route that shows up on 24.3:
O>* 0.0.0.0/0 [110/5] via 10.254.40.2, vmx1.40, weight 1, 00:07:47
And in netstat, I see two default routes.
Destination Gateway Flags Nhop# Mtu Netif Expire
default <ISP's WAN IP> UGS 0 1500 vmx2
default 10.254.40.2 UG1 0 1500 vmx1.40
It looks like it's load balancing, because some traceroutes work, and others ping-pong towards the firewall and back:
GT_Data> trace 1.2.3.4
trace to 1.2.3.4, 8 hops max, press Ctrl+C to stop
1 192.168.27.1 0.703 ms 0.554 ms 0.605 ms
2 10.254.40.1 1.323 ms 1.261 ms 1.767 ms
3 192.168.27.1 1.953 ms 1.248 ms 1.116 ms
4 10.254.40.1 2.337 ms 2.688 ms 1.876 ms
5 192.168.27.1 1.814 ms 2.516 ms 1.852 ms
6 10.254.40.1 3.325 ms 2.818 ms 2.670 ms
7 192.168.27.1 2.532 ms 3.280 ms *
8 10.254.40.1 4.716 ms 4.612 ms 2.889 ms
And so on. That's where the network breaks, as per the title of this thread. This doesn't happen in 23.09.1.
On 23.09.1, routing always converges to the same view no matter how or when I bounce various interfaces, but on 24.3 the "show ip route" output differs depending on when the various interfaces are bounced.
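For anyone wanting to check their own traces for the same ping-pong pattern, here is a rough Python sketch. The function name `find_loop` is my own invention, and it assumes you have already pulled the per-hop addresses out of the trace output into a list:

```python
def find_loop(hops):
    """Return the repeating cycle of hop addresses, or None if no address repeats."""
    seen = {}  # address -> index of the hop where it first appeared
    for index, addr in enumerate(hops):
        if addr in seen:
            # Everything from the first sighting up to this hop is the loop.
            return hops[seen[addr]:index]
        seen[addr] = index
    return None


# Hop addresses taken from the first four lines of the trace above.
hops = ["192.168.27.1", "10.254.40.1", "192.168.27.1", "10.254.40.1"]
print(find_loop(hops))
```

A trace that reaches its destination normally never revisits an address, so any non-None result here points at a forwarding loop.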
What's not satisfactory is that pfSense+ is load-balancing between a directly connected default route and one learned via OSPF. The connected one should always take precedence due to its lower administrative distance.
[https://docs.frrouting.org/en/latest/zebra.html#administrative-distance](https://docs.frrouting.org/en/latest/zebra.html#administrative-distance)
System, kernel and connected routes should win out over OSPF routes.
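To spell out what I'd expect, here is a minimal Python sketch of selection by administrative distance. It is only my mental model, not FRR's actual code. The distance values are the defaults as I read them from the zebra docs, assuming nothing overrides them in the config, and `ISP_WAN_GW` is a placeholder for the ISP gateway address I've redacted:

```python
# Default administrative distances as I understand them from the FRR zebra docs
# (assumption: no "distance" overrides are configured anywhere).
DEFAULT_DISTANCE = {"kernel": 0, "connected": 0, "static": 1, "ospf": 110}


def select_best(candidates):
    """Keep only the candidates for one prefix that share the lowest distance.

    candidates: list of (protocol, nexthop, metric) tuples for the same prefix.
    """
    best = min(DEFAULT_DISTANCE[proto] for proto, _, _ in candidates)
    return [c for c in candidates if DEFAULT_DISTANCE[c[0]] == best]


# The two default routes from the 24.3 box: the kernel one out of WAN1 and the
# OSPF one shown as [110/5] above. "ISP_WAN_GW" is a placeholder.
defaults = [
    ("kernel", "ISP_WAN_GW", 0),
    ("ospf", "10.254.40.2", 5),
]
print(select_best(defaults))
```

With those inputs only the kernel route survives, which matches what 23.09.1 installs. Two next hops for 0.0.0.0/0 should only be possible when both candidates carry the same distance.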
Looking at the FRR source files, there are build instructions only up to FreeBSD 14, and none for FreeBSD 15. Is FreeBSD 15 too experimental for third-party packages?!
The FRR package installed is "pfSense-pkg-frr-2.0.2_3". Is there a beta package containing FRR 9.1.1 or even 10.0 that I could test, to see whether this issue is still present in the newer versions?