VTI gateways not adding static routes in 24.03
-
I just remembered that I installed another new Netgate 4100 for a new client and that device isn't actively being used, so I can use it for testing. It was immediately updated to 24.03 and it is showing exactly the same behavior.
I tried deleting the existing static route and re-creating it; it is still not appearing in the routes table. No error messages in system logs -> routing.
My gut feeling is that the root cause of the bug is pfsense not considering 0.0.0.0/0 routing valid and thus not applying the static routes to the routes table.
-
@OhYeah-0 As mentioned above, mine are using a /30 transit network rather than the 0.0.0.0/0 config you have, but we seem to be seeing the same thing: the static route doesn't load. My curious gut says: is there a timing issue where the tunnel hasn't come up yet which makes the static route seem invalid? Seems like the logs are not telling us the whole story though.
--Larry
-
@LarryFahnoe said in VTI gateways in 24.03:
My curious gut says: is there a timing issue where the tunnel hasn't come up yet which makes the static route seem invalid?
IPSEC P1 instances have come online in both cases without problems for me.
EDIT: I think I might have slightly misunderstood your point. It's an interesting thought that it could be a timing issue, but I don't recall ever seeing such a problem with pfsense before.
-
Yeah it seems likely it fails to add the route because the gateway is not yet available. If you have a dynamic gateway like that it won't show as up until the link is established.
However I would expect it to then be able to add routes after the VTI and hence the gateway is up.
Using 0.0.0.0/0 means there is not a dynamic gateway so that could be a problem. I'm not sure why that would be any different in 23.09 though.
But I'm surprised the route command doesn't throw an error.
Can you manually add a route at the CLI?
-
@stephenw10 said in VTI gateways in 24.03:
Using 0.0.0.0/0 means there is not a dynamic gateway so that could be a problem.
BTW, just to clarify: using 0.0.0.0/0 routing, the gateway IP always showed as "dynamic" in previous versions (in GUI under System -> Routing -> Gateways). In the dashboard it shows as "n/a" as before.
-
Yeah so to add a static route there it would need to be via the interface directly. I'd have to dig into the syntax to test that.
Do you know how that static route appeared in the routing table in 23.09?
-
@stephenw10 said in VTI gateways in 24.03:
Can you manually add a route at the CLI?
Yes, but the question is when. With the system up and tunnel functioning I can add another route (to a bogus network for test):
# route add -net 192.168.15.0/24 192.168.8.2
add net 192.168.15.0: gateway 192.168.8.2
I can reboot later this morning but the tunnel comes up immediately, so it will likely not throw any error by the time I log in to add the route via the CLI. Again, on one system, rc.newwanip triggers the route to be added about 15 min after reboot.
Any insight to offer on my question about enabling debugging?
--Larry
-
Setting that would not give you any additional debug info AFAIK.
Hmm, yet resaving the static route does not create the route, even though that should run that exact same command....
-
@stephenw10 said in VTI gateways in 24.03:
Do you know how that static route appeared in the routing table in 23.09?
DESTINATION        GATEWAY   FLAGS  USES  MTU   INTERFACE
10.10.24.0/24      link#13   US     7     1400  ipsec2
192.168.24.0/24    link#13   US     7     1400  ipsec2
192.168.131.0/24   link#13   US     7     1400  ipsec2
(From GUI: Diagnostics -> Routes)
-
Right so via the link directly.
-
@stephenw10 said in VTI gateways in 24.03:
Hmm, yet resaving the static route does not create the route which should run that exact same command
Okay, so I just rebooted and then ssh'd in. The static route to 192.168.3.0/24 is missing, added it without issue. Lightly edited output removing the references to external addresses.
# netstat -rn4
Routing tables
Internet:
Destination        Gateway            Flags     Netif Expire
127.0.0.1          link#6             UH          lo0
192.168.0.2        link#6             UH          lo0
192.168.5.0/24     link#3             U          igc2
192.168.5.1        link#6             UHS         lo0
192.168.8.1        link#6             UHS         lo0
192.168.8.2        link#9             UH       ipsec1
192.168.10.1       link#6             UH          lo0
#
# route add -net 192.168.3.0/24 192.168.8.2
add net 192.168.3.0: gateway 192.168.8.2
#
# netstat -rn4
Routing tables
Internet:
Destination        Gateway            Flags     Netif Expire
127.0.0.1          link#6             UH          lo0
192.168.0.2        link#6             UH          lo0
192.168.3.0/24     192.168.8.2        UGS      ipsec1
192.168.5.0/24     link#3             U          igc2
192.168.5.1        link#6             UHS         lo0
192.168.8.1        link#6             UHS         lo0
192.168.8.2        link#9             UH       ipsec1
192.168.10.1       link#6             UH          lo0
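The by-eye check here (is 192.168.3.0/24 in the table or not) could also be scripted. What follows is only a minimal Python sketch and not something posted in the thread: `parse_routes` and `has_route` are made-up names, and it assumes the text comes from a single `netstat -rn4` call with the column layout shown here.

```python
def parse_routes(netstat_output):
    """Parse the text of one `netstat -rn4` call into a list of route dicts."""
    routes = []
    in_table = False
    for line in netstat_output.splitlines():
        fields = line.split()
        # The column header marks the start of the route rows.
        if fields[:2] == ["Destination", "Gateway"]:
            in_table = True
            continue
        # Rows have at least Destination, Gateway, Flags, Netif (Expire is optional).
        if in_table and len(fields) >= 4:
            routes.append({
                "destination": fields[0],
                "gateway": fields[1],
                "flags": fields[2],
                "netif": fields[3],
            })
    return routes


def has_route(netstat_output, destination):
    """Return True if `destination` appears in the Destination column."""
    return any(r["destination"] == destination
               for r in parse_routes(netstat_output))


# Sample trimmed from the post-reboot table, before the manual `route add`.
SAMPLE = """\
Routing tables
Internet:
Destination        Gateway            Flags     Netif Expire
127.0.0.1          link#6             UH          lo0
192.168.5.0/24     link#3             U          igc2
192.168.8.2        link#9             UH       ipsec1
"""

print(has_route(SAMPLE, "192.168.3.0/24"))  # False: the static route is missing
print(has_route(SAMPLE, "192.168.5.0/24"))  # True
```

Running something like this shortly after boot and again later would be one way to catch the delayed route load without watching the table by hand.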
From my position, the commonality here is that @OhYeah-0 and I both have systems with static routes not getting loaded. Beyond that there are variations on the theme:
- One of my systems does not get the static route on boot, but rc.newwanip triggers the route to be loaded about 15 min after boot
- Another of my systems now does get the static route loaded on boot, but this was a result of the steps Lev suggested in the redmine. I haven't been able to get Lev's steps to work on my other system
- It sounds like @OhYeah-0 has systems that do not get the static route loaded at all
--Larry
-
@stephenw10 said in VTI gateways in 24.03:
Right so via the link directly.
Hmm, so you've uncovered a new wrinkle, but I wonder if that might be due to @OhYeah-0 using the 0.0.0.0/0?
I have yet to roll back to 23.09.1 and look at how the route was loaded. I would assume however that since I am using a /30 transit network, the route would be via the gateway IP I provided; not sure if an interface route would make sense if the user provides a gateway IP.
Under 24.03 I did just add the route via the link & traffic passes as expected.
# route del -net 192.168.3.0/24 192.168.8.2
del net 192.168.3.0: gateway 192.168.8.2
#
# route add -net 192.168.3.0/24 -interface ipsec1
add net 192.168.3.0: gateway ipsec1
#
# netstat -rn4
Routing tables
Internet:
Destination        Gateway            Flags     Netif Expire
127.0.0.1          link#6             UH          lo0
192.168.0.2        link#6             UH          lo0
192.168.3.0/24     link#9             US       ipsec1
192.168.5.0/24     link#3             U          igc2
192.168.5.1        link#6             UHS         lo0
192.168.8.1        link#6             UHS         lo0
192.168.8.2        link#9             UH       ipsec1
192.168.10.1       link#6             UH          lo0
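The two forms of the route differ only in the Gateway and Flags columns: `192.168.8.2` with `UGS` when added via the gateway IP, `link#9` with `US` when added via the interface. As a rough illustration (not from the thread; `route_kind` is a hypothetical helper), a row can be classified like this in Python, relying on the FreeBSD `G` flag meaning "gateway":

```python
def route_kind(gateway, flags):
    """Classify one routing-table row by its Gateway and Flags columns."""
    if "G" in flags:
        return "gateway"    # next hop is an address, e.g. 192.168.8.2
    if gateway.startswith("link#"):
        return "interface"  # bound directly to a link, e.g. link#9
    return "other"


print(route_kind("192.168.8.2", "UGS"))  # gateway
print(route_kind("link#9", "US"))        # interface
```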
--Larry
-
Ok one thing to try here is editing and saving the VTI gateway in Sys > Routing > Gateways.
Doing that creates an entry for it in the config which can change its behaviour, especially at boot.
-
@stephenw10 said in VTI gateways in 24.03:
Ok one thing to try here is editing and saving the VTI gateway in Sys > Routing > Gateways.
Doing that creates an entry for it in the config which can change its behaviour, especially at boot.
Hmm, you're shedding light here Stephen, thank you. Before making further changes I looked more carefully at the config.
First off, while I am using a /30 transit network and routing via the gateway IP address, the config suggests it probably ought to be via the link since it does not record the IP:
<staticroutes>
  <route>
    <network>192.168.3.0/24</network>
    <gateway>MPLS_ALEX_VTIV4</gateway>
    <descr><![CDATA[Alex LAN]]></descr>
  </route>
</staticroutes>
Next, while I only see 2 gateways in the GUI, there are 3 defined in the config. The MPLS_ALEX_VTIV4 gateway is defined twice, once as dynamic, and again with the transit net IP:
<gateways>
  <gateway_item>
    <interface>wan</interface>
    <gateway>dynamic</gateway>
    <name>WAN_DHCP</name>
    <weight>1</weight>
    <ipprotocol>inet</ipprotocol>
    <interval>1000</interval>
    <descr><![CDATA[Via Quantum Fiber C5500XK]]></descr>
    <action_disable></action_disable>
    <gw_down_kill_states></gw_down_kill_states>
  </gateway_item>
  <gateway_item>
    <interface>opt3</interface>
    <gateway>192.168.8.2</gateway>
    <name>MPLS_ALEX_VTIV4</name>
    <weight>1</weight>
    <ipprotocol>inet</ipprotocol>
    <descr><![CDATA[Interface MPLS_ALEX_VTIV4 Gateway]]></descr>
    <gw_down_kill_states></gw_down_kill_states>
  </gateway_item>
  <gateway_item>
    <interface>opt3</interface>
    <gateway>dynamic</gateway>
    <name>MPLS_ALEX_VTIV4</name>
    <weight>1</weight>
    <ipprotocol>inet</ipprotocol>
    <interval>1000</interval>
    <descr><![CDATA[Interface MPLS_ALEX_VTIV4 Gateway]]></descr>
    <action_disable></action_disable>
    <gw_down_kill_states></gw_down_kill_states>
  </gateway_item>
  <defaultgw4>WAN_DHCP</defaultgw4>
  <defaultgw6></defaultgw6>
</gateways>
I do not have enough experience with pfSense to know what is normal.
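For anyone wanting to check their own config for the same thing (one gateway name defined more than once, which the GUI hides), here is a small Python sketch. It is not from the thread: `duplicate_gateways` is a made-up name, the sample is a trimmed copy of the section above wrapped in a root element for illustration, and it assumes you run it against a copy of your config.xml rather than on the firewall itself.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def duplicate_gateways(config_xml):
    """Map each gateway name defined more than once to its <gateway> values."""
    root = ET.fromstring(config_xml)
    by_name = defaultdict(list)
    # iter() walks the whole tree, so this works on a full config or a fragment.
    for item in root.iter("gateway_item"):
        name = item.findtext("name", default="")
        by_name[name].append(item.findtext("gateway", default=""))
    return {name: gws for name, gws in by_name.items() if len(gws) > 1}


# Trimmed illustration of the <gateways> section; most elements omitted.
SAMPLE = """
<pfsense>
  <gateways>
    <gateway_item>
      <interface>wan</interface>
      <gateway>dynamic</gateway>
      <name>WAN_DHCP</name>
    </gateway_item>
    <gateway_item>
      <interface>opt3</interface>
      <gateway>192.168.8.2</gateway>
      <name>MPLS_ALEX_VTIV4</name>
    </gateway_item>
    <gateway_item>
      <interface>opt3</interface>
      <gateway>dynamic</gateway>
      <name>MPLS_ALEX_VTIV4</name>
    </gateway_item>
  </gateways>
</pfsense>
"""

print(duplicate_gateways(SAMPLE))
# {'MPLS_ALEX_VTIV4': ['192.168.8.2', 'dynamic']}
```

An empty result would mean every gateway name is defined only once.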
Now, to your point about editing and saving the VTI gateway.
- Delete static route via MPLS_ALEX_VTIV4
- Edit gateway MPLS_ALEX_VTIV4 (no change), save
- Add static route via MPLS_ALEX_VTIV4
The gateways and staticroutes config sections remain the same. I'll reboot & expect the same 15 min delay before rc.newwanip triggers the route to be loaded.
Next test is to delete the static route AND the VTI gateway, then recreate both...
--Larry
-
Huh, that is interesting. Did you perhaps add a gateway manually as well as the dynamic gateway?
In a test instance here I only see the dynamic gateway. However, static routes are added and are shown via the real gateway address:
[24.03-RELEASE][admin@5100.stevew.lan]/root: netstat -rn4
Routing tables
Internet:
Destination        Gateway            Flags     Netif Expire
default            172.21.16.1        UGS        igb0
10.52.52.1         link#8             UHS         lo0
10.52.52.2         link#19            UH       ipsec3
10.86.8.0/24       link#21            U        ovpns1
10.86.8.1          link#8             UHS         lo0
10.110.20.0/26     link#11            U       tun_wg0
10.110.20.10       link#8             UHS         lo0
127.0.0.1          link#8             UH          lo0
172.21.16.0/24     link#1             U          igb0
172.21.16.1        link#1             UHS        igb0
172.21.16.21       link#8             UHS         lo0
172.21.16.149      172.21.16.1        UGHS       igb0
172.21.16.186      172.21.16.1        UGHS       igb0
192.168.21.0/24    link#2             U          igb1
192.168.21.1       link#8             UHS         lo0
192.168.21.5       link#8             UHS         lo0
192.168.144.0/24   10.52.52.2         UGS      ipsec3
192.168.221.0/24   link#14            U         lagg0
192.168.221.1      link#8             UHS         lo0
-
@stephenw10 After quite a bit more testing, I have narrowed the missing static route problem down to the non-dynamic <gateway_item> shown above. The real puzzler is that rolling back to 23.09.1 (BE right before the upgrade), I only have the two dynamic <gateway_items>. <staticroutes> are the same.
23.09.1 pre-upgrade:
<staticroutes>
  <route>
    <network>192.168.3.0/24</network>
    <gateway>MPLS_ALEX_VTIV4</gateway>
    <descr><![CDATA[Alex LAN]]></descr>
  </route>
</staticroutes>
<gateways>
  <gateway_item>
    <interface>opt3</interface>
    <gateway>dynamic</gateway>
    <name>MPLS_ALEX_VTIV4</name>
    <weight>1</weight>
    <ipprotocol>inet</ipprotocol>
    <descr><![CDATA[Interface MPLS_ALEX_VTIV4 Gateway]]></descr>
    <monitor_disable></monitor_disable>
    <gw_down_kill_states></gw_down_kill_states>
  </gateway_item>
  <gateway_item>
    <interface>wan</interface>
    <gateway>dynamic</gateway>
    <name>WAN_DHCP</name>
    <weight>1</weight>
    <ipprotocol>inet</ipprotocol>
    <interval>1000</interval>
    <descr><![CDATA[Via Quantum Fiber C5500XK]]></descr>
    <gw_down_kill_states></gw_down_kill_states>
  </gateway_item>
  <defaultgw4>WAN_DHCP</defaultgw4>
  <defaultgw6></defaultgw6>
</gateways>
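Comparing the pre-upgrade and post-upgrade gateway sections by hand is error-prone, so here is a hedged Python sketch of doing it mechanically. It is not from the thread; `gateway_set` and `added_gateways` are hypothetical names, and the two samples are cut-down versions of the 23.09.1 and 24.03 sections with most elements left out.

```python
import xml.etree.ElementTree as ET


def gateway_set(config_xml):
    """Return the set of (name, gateway) pairs found in a config or fragment."""
    root = ET.fromstring(config_xml)
    return {(item.findtext("name", ""), item.findtext("gateway", ""))
            for item in root.iter("gateway_item")}


def added_gateways(before_xml, after_xml):
    """List (name, gateway) pairs present after the upgrade but not before."""
    return sorted(gateway_set(after_xml) - gateway_set(before_xml))


BEFORE = """
<gateways>
  <gateway_item><gateway>dynamic</gateway><name>MPLS_ALEX_VTIV4</name></gateway_item>
  <gateway_item><gateway>dynamic</gateway><name>WAN_DHCP</name></gateway_item>
</gateways>
"""

AFTER = """
<gateways>
  <gateway_item><gateway>dynamic</gateway><name>WAN_DHCP</name></gateway_item>
  <gateway_item><gateway>192.168.8.2</gateway><name>MPLS_ALEX_VTIV4</name></gateway_item>
  <gateway_item><gateway>dynamic</gateway><name>MPLS_ALEX_VTIV4</name></gateway_item>
</gateways>
"""

print(added_gateways(BEFORE, AFTER))
# [('MPLS_ALEX_VTIV4', '192.168.8.2')]
```

This only compares name and gateway value; other per-gateway differences (for example `monitor_disable` vs `action_disable`) would need those fields added to the tuple.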
To clean up the errant <gateway_item> required tearing down and rebuilding much of the config:
- Delete static route to 192.168.3.0/24 via MPLS_ALEX_VTIV4
- Delete MPLS_ALEX_VTIV4 interface assignment
- Disable IPsec P1 and P2
- Delete gateway MPLS_ALEX_VTIV4
[ Gateway was grayed out (Gateway inactive, interface is missing) before attempting to delete and remains in this state after attempting to delete ]
- Delete the IPsec P2
- Delete the gateway MPLS_ALEX_VTIV4
[ Gateway is now deleted, and also removed from config.xml ]
- Recreate IPsec P2
- Enable IPsec P1 and P2
- Add interface ipsec1
- Enable interface OPT3 (skip renaming to MPLS_ALEX_VTIV4)
- OPT3_VTIV4 gateway is created automatically
- Add static route to 192.168.3.0/24 via OPT3_VTIV4
- Add the OPT3 rules for site to site traffic and gateway monitoring
- Reboot
Did this on both of my systems and they are both rebooting cleanly with the IPsec VTI coming up and passing traffic immediately.
I will add this update to the redmine, but it still does not explain where the non-dynamic <gateway_item> came from, and I'm not sure it addresses the problem that @OhYeah-0 is seeing.
--Larry
-
Nice troubleshooting.
Hmm, so the bug is potentially that an additional gateway is created at upgrade.
I'll try to find any other instances. I'd expect to see quite a few if so.
-
@stephenw10 said in VTI gateways in 24.03:
Hmm, so the bug is potentially that an additional gateway is created at upgrade.
I'm quite curious now as to the root cause & will look forward to hearing if you uncover more. Will also be interesting to see what @OhYeah-0 finds.
--Larry
-
@LarryFahnoe said in VTI gateways in 24.03:
Will also be interesting to see what @OhYeah-0 finds.
Well, this doesn't move us closer to a solution. I have only 2 gateways defined in the config file.
<gateways>
  <gateway_item>
    <interface>wan</interface>
    <gateway>xxx.xx.xxx.xx</gateway>
    <name>WANGW</name>
    <weight>1</weight>
    <descr><![CDATA[WAN Gateway]]></descr>
    <defaultgw></defaultgw>
  </gateway_item>
  <gateway_item>
    <interface>opt5</interface>
    <gateway>dynamic</gateway>
    <name>IPSEC_SWE_GW</name>
    <weight>1</weight>
    <ipprotocol>inet</ipprotocol>
    <descr><![CDATA[Test description]]></descr>
    <monitor_disable></monitor_disable>
    <action_disable></action_disable>
    <gw_down_kill_states></gw_down_kill_states>
  </gateway_item>
The problem is that I cannot remember if I performed the upgrade before I created the IPSEC tunnel or not.
-
Tried something a bit more drastic.
- Deleted everything: static routes, gateway, disabled interface, deleted assignment, deleted P2, deleted P1.
- Restart.
- Switch global states back to "floating" and IPSEC filter mode back to "on IPSEC tab".
- Restart.
- Add everything back in the same order as standard (but different names just to make sure something doesn't clash with cached or old entries).
- Restart.
Same status. P1 comes up, routes are not added to the routing table.