AWS ENA issues
-
Hey everyone!
I don't usually have pfSense-related issues, but here's a fun one.
I'm trying to run Netgate's 2.4.2p1 in an AWS instance and was hoping to use the new(ish) ENA support that was recently added to pfSense for AWS, but ran into some issues with it. I was starting to think I was going mad and had forgotten how to network, but it seems the problem wasn't in my config (yay…)
The problem in short:
ENA support does not work but somehow still works. Go figure.
So after too many hours spent troubleshooting, I finally figured something out but am stuck with it at the moment.
While ENA is enabled, I can ping anything locally from pfsaws01, but nothing from the other end of the tunnel can reach the instances in the VPC. After tcpdumping I realized that the packets just go missing :) So, if you leave a ping running from 172.16.0.100 -> 192.168.0.100, you can see (with tcpdump, ntop, whatever) both the request and the reply flowing through every interface they cross on both pfSense boxes. Meaning the request reaches my DC, a server there replies, and pfsaws01 gets the replies and puts them on the local interface, so all seems good. But the 172.16.0.100 server never receives the replies destined for it. I turned on AWS flow logging for both 172.16.0.0 interfaces but couldn't find the replies there, neither in the ACCEPT nor in the REJECT logs.
I switched between a couple of different instance flavours that support ENA, but the behaviour was the same.
Then I decided to try it without ENA… Disabled it via awscli while the instance was stopped. The first difference I noticed is the interface naming scheme - with ENA enabled, interfaces are reported as ena[0-9], while without ENA they are named xn[0-9].
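For anyone who wants to script that step instead of using awscli, here is a minimal sketch in Python. It assumes boto3 is installed and credentials are configured; the function names and the instance ID in the usage comment are hypothetical. The `ec2` argument is expected to be a boto3 EC2 client.

```python
def instance_state(ec2, instance_id):
    """Return the instance state name, e.g. 'running' or 'stopped'."""
    resp = ec2.describe_instances(InstanceIds=[instance_id])
    return resp["Reservations"][0]["Instances"][0]["State"]["Name"]


def ena_enabled(ec2, instance_id):
    """Return True if the enaSupport attribute is set on the instance."""
    resp = ec2.describe_instance_attribute(
        InstanceId=instance_id, Attribute="enaSupport"
    )
    return bool(resp.get("EnaSupport", {}).get("Value", False))


def set_ena(ec2, instance_id, enabled):
    """Set or clear ENA support; refuses to act unless the instance is stopped."""
    state = instance_state(ec2, instance_id)
    if state != "stopped":
        raise RuntimeError(f"instance is {state}, stop it before changing ENA")
    ec2.modify_instance_attribute(
        InstanceId=instance_id, EnaSupport={"Value": enabled}
    )


# Usage (hypothetical instance ID):
#   import boto3
#   ec2 = boto3.client("ec2")
#   set_ena(ec2, "i-0123456789abcdef0", enabled=False)
#   print(ena_enabled(ec2, "i-0123456789abcdef0"))
```

After the instance is started again, the interface names inside pfSense should tell you which driver is in use (ena[0-9] vs xn[0-9]).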
After ENA was gone, everything started working like a charm, but I only managed to squeeze 500 Mbps through the VPN. I blame AWS limitations for that (correct me if I'm wrong, but ENA-enabled instances should be able to go up to 25 Gbps).
So is this a bug or am I missing a setting in pfSense?
A few more details:
- OpenVPN establishes the connection just fine, p2p addresses pingable from pfSenses, internal addresses pingable from the pfSense, routing looks good
- With ENA enabled on the AWS instance, I can connect to the AWS pfSense, configure it, ping around the VPC just fine, all looks good
So from the server at 172.16.0.100:
- pings the internal IP of the pfSense (172.16.0.1) without a problem
- pings other machines in the same or different VPC subnets
- If I start pinging the tunnel p2p addresses or anything behind the tunnel - ping just times out
From the server at 192.168.0.100:
- the DC network works like it should
- pings from this server to any interface on pfsaws01 work just fine (it has only one at the moment, but I tested multiple scenarios)
- can't ping the server at 172.16.0.100
More info on the setup
AWS Side
a) Netgate official pfSense 2.4.2p1 found in the AWS Marketplace
- Single interface, with a local IP and a public Elastic IP bound to it
- tested a 2 interface, WAN/LAN setup just in case, same results
- source/destination checks disabled
- Manual NAT, all rules removed - the instances use AWS networking to access Internet
- Runs OpenVPN server
- Has a static route for the internal network through the interface
- Tested multiple flavors, was aiming for a r4.large
- SG set to Pass ANY ANY for testing
- All Firewall rules set to Pass ANY ANY
b) Server
Any kind of instance(s) in the VPC subnets… I used multiple instances to generate traffic for testing the tunnel
While testing, all firewalls were down or set to Pass ANY ANY
c) General setup
- All relevant VPC routes pointed to the pfSense Interface
- Default is set to igw/nat depending on the subnet
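One way to double-check the routing part of this setup is to ask the EC2 API which route each table would actually pick for an address behind the tunnel. The following is only a sketch, assuming boto3 and configured credentials; the function names and the IDs in the usage comment are hypothetical. It applies a simple longest-prefix match over the IPv4 routes of every route table in the VPC and ignores IPv6 and prefix-list routes.

```python
import ipaddress


def best_route(routes, dest_ip):
    """Return the most specific IPv4 route covering dest_ip, or None."""
    dest = ipaddress.ip_address(dest_ip)
    best = None
    for route in routes:
        cidr = route.get("DestinationCidrBlock")
        if cidr is None:
            continue  # IPv6 or prefix-list route, not handled in this sketch
        net = ipaddress.ip_network(cidr)
        if dest in net and (best is None or net.prefixlen > best[0]):
            best = (net.prefixlen, route)
    return None if best is None else best[1]


def tables_not_routing_via(ec2, vpc_id, eni_id, dest_ip):
    """Return IDs of route tables whose best route for dest_ip is not eni_id."""
    resp = ec2.describe_route_tables(
        Filters=[{"Name": "vpc-id", "Values": [vpc_id]}]
    )
    bad = []
    for table in resp["RouteTables"]:
        route = best_route(table["Routes"], dest_ip)
        if route is None or route.get("NetworkInterfaceId") != eni_id:
            bad.append(table["RouteTableId"])
    return bad


# Usage (hypothetical VPC and interface IDs):
#   import boto3
#   ec2 = boto3.client("ec2")
#   print(tables_not_routing_via(ec2, "vpc-0abc", "eni-0abc", "192.168.0.100"))
```

An empty list means every route table in the VPC sends traffic for that address to the pfSense interface; any table listed would hand it to some other target, such as the igw/nat default.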
DC side
a) pfSense XG-1540
- running 2.4.0 at the moment
- CARPed
- has a 1/1 Gbps line for WAN
- Hooked up to a 10G switch (it's bigger on the inside)
- multiple networks and interfaces for DMZ etc
- Internal routing is handled by another device
b) Server
A physical rig, e.g. a Dell 630 with 10G LAN cards
-
I was having the same issue. Here's what fixed it for me:
1. Disabling the dest/source checks from the Instances panel only disables the check for the primary network adapter
2. Go to Network & Security > Network Interfaces
3. Right-click on the LAN adapter (172.16.0.1/24) and choose Change Source/Dest. Check -> Disable
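The same fix can be scripted so that no adapter is missed. Below is a minimal sketch in Python, assuming boto3 is installed and credentials are configured; the function name and the instance ID in the usage comment are hypothetical. It looks up every network interface attached to the instance and disables the source/destination check on each one that still has it enabled.

```python
def disable_source_dest_check(ec2, instance_id):
    """Disable the source/dest check on every interface attached to the instance.

    Returns the IDs of the interfaces that were changed.
    """
    resp = ec2.describe_network_interfaces(
        Filters=[{"Name": "attachment.instance-id", "Values": [instance_id]}]
    )
    changed = []
    for eni in resp["NetworkInterfaces"]:
        # Treat a missing field as "check enabled" to stay on the safe side.
        if eni.get("SourceDestCheck", True):
            ec2.modify_network_interface_attribute(
                NetworkInterfaceId=eni["NetworkInterfaceId"],
                SourceDestCheck={"Value": False},
            )
            changed.append(eni["NetworkInterfaceId"])
    return changed


# Usage (hypothetical instance ID):
#   import boto3
#   ec2 = boto3.client("ec2")
#   print(disable_source_dest_check(ec2, "i-0123456789abcdef0"))
```

The check matters here because the pfSense instance forwards packets whose source or destination is not its own address (for example replies from 192.168.0.100 to 172.16.0.100), and with the check enabled on an interface AWS drops such traffic.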