Amazon VPC Routing: OSPF and IPsec Backup
-
Hello all, I thought I'd stick my head in here, as I'm looking at an issue with our setup.
We have:
Two pfSense firewall/routers connected to our network.
Four interfaces:
em0 - Internet
em1 - VPLS layer 2 with OSPF routing to our provider that connects us to our VPC in Amazon.
em2 - local admin lan
em3 - pfSync

I've got OSPF routing set up in area 0, which is correctly routing to our Amazon virtual private cloud (VPC) via Amazon's Direct Connect. Our provider is connected to Direct Connect, and essentially peers with AWS on their side via BGP.
Amazon recommends that when you're using Direct Connect (DX for short) you set up an IPsec VPN backup from your device to your VPC in case your DX goes down. We've done and tested that, and we have seen a real outage (those who were in the UK and saw Telecity lose power last week will know).
1. My IPsec tunnel to the AWS VPN server is good. It's bound to our VPG (virtual private gateway) in our VPC, as is the BGP session with our provider; that part is largely transparent.
2. The AWS route back to our data centre is configured, and I can ping our DC LAN gateway without issues when I test the VPN-only connection by dropping the VPLS/OSPF interface on both firewalls, because there's then no OSPF traffic going over these links.
Now, here's the interesting part.
Scenario: shut down em1 (the VPLS interface) to test VPN failover, i.e. to check that traffic from the DC goes out via the VPN while the VPLS interface is not running.
If I do a netstat and look for the VPC subnet, I see the output below.
First, the VPLS interface:
[2.2.4-RELEASE][root@ee-dr-fw1-adm.fmlocal]/root: ifconfig em1
em1: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=209b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC>
        ether d2:34:c0:1e:6d:ac
        inet6 fe80::d034:c0ff:fe1e:6dac%em1 prefixlen 64 scopeid 0x2
        inet 192.168.5.132 netmask 0xffffff80 broadcast 192.168.5.255
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
(At this point I run ifconfig em1 down.)
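As a sanity check on the ifconfig output above, here is a minimal Python sketch that decodes the flags value and the hex netmask. It assumes the standard BSD flag bits from net/if.h and only lists the handful relevant here; the helper names are made up for illustration. Note that 0x8802 contains neither UP nor RUNNING, which is what I'd expect to see once the interface has been taken down.

```python
import ipaddress

# Interface flag bits as defined in BSD <net/if.h> (only a relevant subset).
IFF_FLAGS = {
    0x0001: "UP",
    0x0002: "BROADCAST",
    0x0040: "RUNNING",
    0x0800: "SIMPLEX",
    0x8000: "MULTICAST",
}


def decode_flags(value: int) -> list[str]:
    """Return the names of the known flag bits that are set in value."""
    return [name for bit, name in IFF_FLAGS.items() if value & bit]


def hex_netmask_to_network(addr: str, hex_mask: str) -> ipaddress.IPv4Network:
    """Combine an address and an ifconfig-style hex netmask into a network."""
    mask = ipaddress.IPv4Address(int(hex_mask, 16))
    return ipaddress.IPv4Network(f"{addr}/{mask}", strict=False)


flags = decode_flags(0x8802)
net = hex_netmask_to_network("192.168.5.132", "0xffffff80")
print(flags)                       # ['BROADCAST', 'SIMPLEX', 'MULTICAST']
print(net, net.broadcast_address)  # 192.168.5.128/25 192.168.5.255
```

So em1 sits in 192.168.5.128/25, which also contains the 192.168.5.129 next hop that OSPF uses when the VPLS link is healthy.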
EM2 is my internet interface..
I look at the routing table and see this:
[2.2.4-RELEASE][root@ee-dr-fw1-adm.fmlocal]/root: netstat -nra | grep 10.99
10.99.0.0/16 172.29.33.5 UG1 em2
So it's installing a route for the VPC subnet via the second firewall, which still has its VPLS interface up.
So I bring the interface (em1) up again:
[2.2.4-RELEASE][root@ee-dr-fw1-adm.fmlocal]/root: ifconfig em1 up
[2.2.4-RELEASE][root@ee-dr-fw1-adm.fmlocal]/root: netstat -nra | grep 10.99
10.99.0.0/16 172.29.33.5 UG1 em2
[2.2.4-RELEASE][root@ee-dr-fw1-adm.fmlocal]/root:
It's still showing the OSPF route via the second firewall.
In another SSH session, I run netstat repeatedly to see what's happening with my VPC route (10.99/16):
[2.2.4-RELEASE][root@ee-dr-fw1-adm.fmlocal]/root:
[2.2.4-RELEASE][root@ee-dr-fw1-adm.fmlocal]/root:
[2.2.4-RELEASE][root@ee-dr-fw1-adm.fmlocal]/root: netstat -nra | grep 10.99
{nothing there}
[2.2.4-RELEASE][root@ee-dr-fw1-adm.fmlocal]/root: netstat -nra | grep 10.99
[2.2.4-RELEASE][root@ee-dr-fw1-adm.fmlocal]/root: netstat -nra | grep 10.99
I check again
[2.2.4-RELEASE][root@ee-dr-fw1-adm.fmlocal]/root: netstat -nra | grep 10.99
10.99.0.0/16 192.168.5.129 UG1 em1
The VPLS interface is back up and OSPF has injected the route back via em1.
However, a tcpdump on enc0 (my IPsec VPN interface) still shows traffic leaving over the VPN:
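Rather than re-running netstat by hand as above, the same check could be scripted. This is only a rough sketch: it assumes Python 3 is available wherever you run it, that the route lines look like the four-column output shown above (destination, gateway, flags, interface), and the function names are mine, not anything from pfSense.

```python
import subprocess
import time
from typing import Callable, Optional


def find_route(netstat_output: str, prefix: str) -> Optional[tuple[str, str]]:
    """Return (gateway, interface) for the line whose destination equals prefix."""
    for line in netstat_output.splitlines():
        fields = line.split()
        # Columns as seen in the post: Destination Gateway Flags Netif
        if len(fields) >= 4 and fields[0] == prefix:
            return fields[1], fields[3]
    return None


def run_netstat() -> str:
    """Fetch the routing table (-n: numeric, -r: routes)."""
    result = subprocess.run(["netstat", "-nr"], capture_output=True, text=True)
    return result.stdout


def wait_for_interface(prefix: str, want_iface: str,
                       fetch: Callable[[], str] = run_netstat,
                       interval: float = 1.0,
                       timeout: float = 120.0) -> Optional[float]:
    """Poll until prefix is routed via want_iface.

    Returns the elapsed seconds, or None if the timeout expires first.
    """
    start = time.monotonic()
    while time.monotonic() - start < timeout:
        route = find_route(fetch(), prefix)
        if route is not None and route[1] == want_iface:
            return time.monotonic() - start
        time.sleep(interval)
    return None
```

Calling wait_for_interface("10.99.0.0/16", "em1") right after ifconfig em1 up would give a rough figure for how long OSPF takes to put the route back on the VPLS interface.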
[2.2.4-RELEASE][root@ee-dr-fw1-adm.fmlocal]/root: tcpdump -li enc0 -n| head
tcpdump: WARNING: enc0: no IPv4 address assigned
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on enc0, link-type ENC (OpenBSD encapsulated IP), capture size 65535 bytes
capability mode sandbox enabled
10:20:34.175102 (authentic,confidential): SPI 0x014e0213: IP 172.29.33.37.13541 > 10.99.101.90.7301: Flags [P.], seq 689148127:689148227, ack 3747528986, win 517, length 100
10:20:34.243513 (authentic,confidential): SPI 0x014e0213: IP 172.29.33.37.13541 > 10.99.101.90.7301: Flags [P.], seq 100:1107, ack 1, win 517, length 1007
10:20:35.006258 (authentic,confidential): SPI 0x014e0213: IP 172.29.33.37.13541 > 10.99.101.90.7301: Flags [P.], seq 1107:1203, ack 1, win 517, length 96
10:20:35.086412 (authentic,confidential): SPI 0x014e0213: IP 172.29.33.37.13541 > 10.99.101.90.7301: Flags [P.], seq 1203:1301, ack 1, win 517, length 98
And on the VPLS interface I see the ACK replies coming back:
[2.2.4-RELEASE][root@ee-dr-fw1-adm.fmlocal]/root: tcpdump -lni em1 net 10.99.101.0/24 and port 7301|head
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on em1, link-type EN10MB (Ethernet), capture size 65535 bytes
capability mode sandbox enabled
10:20:26.644723 IP 10.99.101.90.7301 > 172.29.33.37.13541: Flags [.], ack 689131649, win 516, length 0
10:20:26.652621 IP 10.99.101.90.7301 > 172.29.33.37.13541: Flags [.], ack 1459, win 517, length 0
10:20:26.708810 IP 10.99.101.90.7301 > 172.29.33.37.13541: Flags [.], ack 1762, win 515, length 0
10:20:27.101622 IP 10.99.101.90.7301 > 172.29.33.37.13541: Flags [.], ack 3249, win 517, length 0
10:20:27.102206 IP 10.99.101.90.7301 > 172.29.33.37.13541: Flags [.], ack 4644, win 517, length 0
10:20:27.114552 IP 10.99.101.90.7301 > 172.29.33.37.13541: Flags [.], ack 5247, win 514, length 0
10:20:27.220642 IP 10.99.101.90.7301 > 172.29.33.37.13541: Flags [.], ack 5348, win 514, length 0
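Put together, the two captures show one TCP connection whose outbound packets leave via enc0 while the replies arrive on em1. A small Python sketch of that comparison, using one line from each capture above as sample input (the regex and function names are just my own illustration):

```python
import re

# Matches the "IP src.port > dst.port:" part of a tcpdump -n line.
FLOW_RE = re.compile(
    r"IP (\d+\.\d+\.\d+\.\d+)\.(\d+) > (\d+\.\d+\.\d+\.\d+)\.(\d+):"
)


def flows(capture: str) -> set[tuple[str, ...]]:
    """Return the set of (src, sport, dst, dport) tuples seen in a capture."""
    return {m.groups() for m in FLOW_RE.finditer(capture)}


def is_split_path(out_capture: str, back_capture: str) -> bool:
    """True if a flow in out_capture has its reverse direction in back_capture."""
    reverse = {(d, dp, s, sp) for s, sp, d, dp in flows(back_capture)}
    return bool(flows(out_capture) & reverse)


enc0 = ("10:20:34.175102 (authentic,confidential): SPI 0x014e0213: "
        "IP 172.29.33.37.13541 > 10.99.101.90.7301: Flags [P.], "
        "seq 689148127:689148227, ack 3747528986, win 517, length 100")
em1 = ("10:20:26.644723 IP 10.99.101.90.7301 > 172.29.33.37.13541: "
       "Flags [.], ack 689131649, win 516, length 0")

print(is_split_path(enc0, em1))  # True
```

In other words the path is asymmetric: out over the IPsec tunnel, back over the VPLS.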
Yet the routes are still good, and I've not changed anything in the config.
So what I'm wondering is: is this symptomatic of what Amazon is doing, of our firewall, or both?
Can anyone tell me why the traffic does not fail back to em1 and stop going over the IPsec VPN? To stop it, I have to manually disable IPsec.
I have also noticed that when I shut down em1 on the primary firewall, OSPF re-routes traffic for 10.99/16 to the second firewall, because it still has a link to the target via that host. But once all the links are correctly up again, I see traffic for 10.99/16 on the VPN, the LAN and the VPLS interfaces. What is going on?
Whilst I know the traffic is going over the VPN, I can do a traceroute from a host on the admin LAN that's sending data to the VPC:
|------------------------------------------------------------------------------------------|
|                                    WinMTR statistics                                     |
|                                    Host -    % | Sent | Recv | Best | Avrg | Wrst | Last |
|------------------------------------------------|------|------|------|------|------|------|
|                  enf-dr-fw1-adm.fmlocal -    0 |    2 |    2 |    0 |    0 |    0 |    0 |
|                   No response from host -  100 |    1 |    0 |    0 |    0 |    0 |    0 |
|                   No response from host -  100 |    1 |    0 |    0 |    0 |    0 |    0 |
|                   No response from host -  100 |    1 |    0 |    0 |    0 |    0 |    0 |
|                 aws-dev-p-app-1.fmlocal -    0 |    2 |    2 |   12 |   12 |   13 |   13 |
|________________________________________________|______|______|______|______|______|______|
WinMTR v0.92 GPL V2 by Appnor MSP - Fully Managed Hosting & Cloud Provider

When the VPLS is working, it shows the routes going over the 8 hops to Amazon.

|------------------------------------------------------------------------------------------|
|                                    WinMTR statistics                                     |
|                                    Host -    % | Sent | Recv | Best | Avrg | Wrst | Last |
|------------------------------------------------|------|------|------|------|------|------|
|                  enf-dr-fw1-adm.fmlocal -    0 |    5 |    5 |    0 |    0 |    0 |    0 |
|         expo-e-router1-vpls-vrf.fmlocal -    0 |    5 |    5 |    1 |    2 |    7 |    1 |
|                             192.168.0.2 -    0 |    5 |    5 |    1 |    1 |    1 |    1 |
|                             192.168.0.1 -    0 |    5 |    5 |    1 |    1 |    1 |    1 |
|                            80.85.65.157 -    0 |    5 |    5 |    1 |    1 |    1 |    1 |
|                            80.85.65.158 -    0 |    5 |    5 |    1 |    1 |    1 |    1 |
|                 aws-dev-p-app-1.fmlocal -    0 |    5 |    5 |   11 |   11 |   11 |   11 |
|________________________________________________|______|______|______|______|______|______|
WinMTR v0.92 GPL V2 by Appnor MSP - Fully Managed Hosting & Cloud Provider