Point-to-Multipoint OpenVPN not routing traffic between sites

coachmark2

Hello,

I posted this on the /r/pfsense subreddit but figured it would be best to also post it here.

So, this has been a multi-week odyssey with more hours devoted to it than I am comfortable admitting.

I have pfSense installations at multiple branch offices acting as edge firewall/router. Each one has Quagga OSPF installed. Each one has different private IP ranges on several internal interfaces. Each one of those internal interfaces is enabled in Quagga as a stub network.

Each installation is running an OpenVPN client that connects back to a pfSense installation in a Vultr VPS. This VPS acts as the OpenVPN server. It is also running Quagga OSPF.

All of the routers have found one another and can ping one another. Packet captures on their OpenVPN client interfaces (in the case of the branch offices) or OpenVPN server interface (in the case of the Vultr VPS) show a LOT of ICMP traffic between all the router IPs. This is due to OSPF seeing if each neighbor is still up.

All of the advertised private subnets are showing up in the routing tables of all of the pfSense routers. OSPF is working exactly as it should.

Problem: none of the pfSense routers can ping anything except for each other's OpenVPN client interface IP. All clients can ping the server's OpenVPN server interface IP and each other's client interface IP. However, they cannot ping each other's internal interfaces or anything on each other's internal LANs.

I have an IPv4 PASS rule on the OpenVPN server interface for the VPS and on each client's OpenVPN client interface. It's set to ANY protocol, ANY source, ANY destination.

What I've tested:

With a packet capture running on the VPS OpenVPN server interface, I opened Diagnostics > Ping in another tab and tried to ping one of the branch pfSense routers' internal interfaces or a device on an internal LAN. The capture shows the ICMP echo request on the VPS's OpenVPN server interface, but a capture running on the OpenVPN client interface of the endpoint that should be receiving the ping does NOT show the packet.

I've turned on firewall rule logging to the nines and there is no record of pings being dropped/blocked/rejected on the branch pfSense installations.

I've also tried TCP (in the form of trying to access an HTTP server) and UDP (by using DNS lookup requests) just to rule out some ICMP weirdness. No luck.

I've been working this issue for over a week now and am completely out of ideas and frustrated. Any assistance that anyone can render would be awesome.

mikeisfly

In the OpenVPN client section of the pfSense config, have you specified the remote network(s) that should be available to this tunnel? It could be a lot of configuration, but you can summarize the networks if possible. I'm curious: if you try to ping a remote network, do you get an ICMP packet back from a router saying no host found, and if so, which router is returning that packet to you?

coachmark2

Unfortunately, I don't get a No route to Host packet or anything like that. It seems to simply be dying in the tunnel as far as I can tell. There's a lot of ICMP noise due to OSPF running so I may be missing it if it's there, but I haven't been able to find it.

Do I need to put every subnet for every branch router into EVERY branch router's OpenVPN client configuration? I may be mistaken, but I thought the point of OSPF was to inform the router how to get to each network in a dynamic fashion. Explicitly listing the networks in the OpenVPN client section would be tantamount to static routing everything, no?

Derelict

The remote OpenVPN sites have to be told to send traffic for all of the other sites through the tunnel.

That is done with a "Local Networks" directive (or a client-specific override) on the server if configured so settings can be pushed or with a "Remote Networks" directive on the client.

If you want all sites to communicate with each other it is often best to design it with assignments for each site from a "supernet" so all sites can have just one large subnet to route that encompasses all sites. Then OpenVPN figures out where to send the traffic using iroutes.

mikeisfly

@coachmark2:

I agree that the purpose of a dynamic routing protocol is that you shouldn't have to specify which routes should use the tunnel. It would seem to me that if there is a route in the routing table and the next hop is the adjacent router's IP, which is part of the OpenVPN interface subnet, then the router should forward the packet through the tunnel, assuming all routers are part of the same area. To Derelict's point, though, this is not the behavior. I hope this is something that can be added in future pfSense versions. I believe pfSense is a firewall first and a router second. I would like to see a routing protocol priority established as well. I have not fully tested this since version 2.x or maybe even 1.x, but if you have both an IPsec tunnel and an OpenVPN tunnel set up, which tunnel does pfSense prefer, assuming they point to the same networks? Will the router use BGP before OSPF if both protocols have the route in their routing tables? I posted about this before but I don't remember getting an answer. I will try to find the post and add a link to it here.

https://forum.pfsense.org/index.php?topic=62551.msg337835#msg337835

Derelict

If you want to see the direction things are going, try the FRR package instead of Quagga.

coachmark2

@Derelict:

So I should add all of the private subnets that are at all of the sites to the OpenVPN clients' remote networks section? I will try that; the network is pretty small right now so it's not THAT much overhead, but it is a bit questionable as to why pfSense acts in this way.

Thank you very much for your great explanation.

On the "supersite" network suggestion, how would that work? Would it be something like this (obviously this gives no room for expansion, but just as example)?

"Main" network is 10.10.0.0/22

Corporate HQ uses 10.10.0.0/24
Branch 1 uses 10.10.1.0/24
Branch 2 uses 10.10.2.0/24
Branch 3 uses 10.10.3.0/24

And then each location uses their pfsense router as a member of an OpenVPN network to route between?

coachmark2

@mikeisfly:

Thanks for the reply. So this seems to be a limitation of the pfSense routing engine more than anything… Hmmm...

Thanks for the clarification

mikeisfly

@coachmark2:

On the "supersite" network suggestion, how would that work? Would it be something like this (obviously this gives no room for expansion, but just as example)?

"Main" network is 10.10.0.0/22

Corporate HQ uses 10.10.0.0/24
Branch 1 uses 10.10.1.0/24
Branch 2 uses 10.10.2.0/24
Branch 3 uses 10.10.3.0/24

And then each location uses their pfsense router as a member of an OpenVPN network to route between?

Yup, it is called route summarization or supernetting.
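
For anyone who wants to sanity-check a summary like this, here is a minimal sketch using Python's standard ipaddress module. It only uses the example networks from the question and is not tied to pfSense or OpenVPN in any way.

```python
import ipaddress

# Example site networks from the question (illustrative only).
sites = [
    ipaddress.ip_network("10.10.0.0/24"),  # Corporate HQ
    ipaddress.ip_network("10.10.1.0/24"),  # Branch 1
    ipaddress.ip_network("10.10.2.0/24"),  # Branch 2
    ipaddress.ip_network("10.10.3.0/24"),  # Branch 3
]

# collapse_addresses merges adjacent networks into the fewest covering prefixes.
summary = list(ipaddress.collapse_addresses(sites))
print(summary)

# Confirm that every site network sits inside the proposed "Main" network.
supernet = ipaddress.ip_network("10.10.0.0/22")
all_covered = all(site.subnet_of(supernet) for site in sites)
print(all_covered)
```

If the four /24s did not line up on a /22 boundary, the collapsed list would contain more than one prefix, which is a hint that the plan cannot be summarized as a single route.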

coachmark2

I think I'm still not getting something. Here's something I tried and expected to work but of course it didn't.

Laptop connects as a client to this VPN. Gets IP of 10.98.0.50.

One branch router has an OpenVPN client IP of 10.98.0.2. It also has a LAN IP address of 10.20.30.2/24

I have added a static route on my laptop for the 10.20.30.0/24 network with a gateway of 10.98.0.2.

My laptop can ping 10.98.0.2. It cannot ping 10.20.30.2. Both pings leave on the laptop's OpenVPN client interface. The ping to 10.98.0.2 shows up in packet captures on the branch router's web UI, but the pings to 10.20.30.2 do not even though the laptop has a static route pointing traffic to 10.20.30.2 to the gateway of 10.98.0.2.

I'm beyond confused and rather frustrated at this point. What is not going right here!?

mikeisfly

@coachmark2:

Need to see your OpenVPN configs. Do you have the option "Redirect IPv4 Gateway" (Force all client-generated IPv4 traffic through the tunnel) set?

What I suspect is happening is that, because you may not have that remote network specified in the remote network section of the client override, your computer is trying to use its default gateway instead of the tunnel even though you have the static route. I could be completely off base here; I need to see the configs to know for sure.

coachmark2

For this particular laptop, the "Force all traffic through the tunnel" option is selected. Regardless of which static route I put in (one pointing at 10.98.0.2 and another at 10.98.0.1), still no good.

I do NOT have the force all traffic option set for the branch firewalls because I want them to function split-tunnel where they only use the tunnel for what they have to.

Derelict

What type of OpenVPN server is this (SSL/TLS, Remote Access, Point-to-multipoint, Shared key)? What is the tunnel network mask?

coachmark2

It's SSL/TLS and the tunnel network is 10.98.0.0/21

Derelict

Tunnel network is a /21? Why? Expecting thousands of clients on one server?

You cannot arbitrarily route subnets across a network like that. You cannot run OSPF between endpoints on a network like that. If you want to use OpenVPN and OSPF you have to configure a different PtP tunnel process for every endpoint. (Shared-key mode or SSL/TLS with a tunnel network of /30). In PtP mode OpenVPN will accept traffic for any destination routed to it and shove it across the tunnel without looking for an iroute (explained later).

When you configure an OpenVPN server as SSL/TLS with a tunnel network larger than /30 it configures itself in "server" mode.

The networks work like this:

Server Side
Tunnel Network = The tunnel address of the client comes from this network
Remote Network = Server Kernel Route into OpenVPN Server Process
Local Network = Client kernel route pushed to client and installed as a kernel route into OpenVPN on that side.

Server Client-Specific Override
Remote Network = Internal OpenVPN route (iroute) telling the process into which tunnel to send traffic that is not addressed to the client's tunnel address
Local Network = Client kernel route pushed to client and installed as a kernel route into OpenVPN on that side in addition to the Local Network(s) configured in the server (if any)

So, in the case of the example of these remote networks if you want everyone to be able to talk to everyone:

Corporate HQ uses 10.10.0.0/24
Branch 1 uses 10.10.1.0/24
Branch 2 uses 10.10.2.0/24
Branch 3 uses 10.10.3.0/24

Server Configuration
Local Network(s): 10.10.0.0/22 (Pushes this route to clients, which install it into their routing tables. Each branch's own 10.10.[123].0/24 is a longer prefix, so it remains the best route locally.)
Remote Network(s): 10.10.0.0/22 (Installs this route in the server's local routing table. 10.10.0.0/24 is a longer prefix, so it remains the best route locally.)

Client Specific Overrides
Branch 1: Remote Network(s): 10.10.1.0/24 (Installs iroute in OpenVPN telling traffic to go out this tunnel)
Branch 2: Remote Network(s): 10.10.2.0/24 (ditto)
Branch 3: Remote Network(s): 10.10.3.0/24 (ditto)

Anyway that's what I would try…
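
To illustrate the two lookups described above, here is a simplified Python model. It is a sketch of the idea only, not OpenVPN's or the kernel's actual implementation, and the `best_route` helper and the site labels are made up for the example.

```python
import ipaddress

def best_route(destination, routes):
    """Return the most specific (longest-prefix) route containing destination, or None."""
    addr = ipaddress.ip_address(destination)
    candidates = [ipaddress.ip_network(r) for r in routes]
    matches = [net for net in candidates if addr in net]
    return max(matches, key=lambda net: net.prefixlen, default=None)

# Kernel routes on a hypothetical Branch 1 router: its own LAN plus the pushed /22.
branch1_kernel = ["10.10.1.0/24", "10.10.0.0/22"]

# iroutes from the client-specific overrides (site names are labels only).
iroutes = {
    "10.10.1.0/24": "Branch 1",
    "10.10.2.0/24": "Branch 2",
    "10.10.3.0/24": "Branch 3",
}

# Local traffic stays local because the /24 is more specific than the /22.
print(best_route("10.10.1.25", branch1_kernel))

# Traffic for another branch only matches the /22, so it enters the tunnel.
print(best_route("10.10.2.25", branch1_kernel))

# On the server, the iroute table decides which client tunnel gets the packet.
hit = best_route("10.10.2.25", iroutes)
print(iroutes[str(hit)] if hit else "no iroute")

# A destination with no iroute has nowhere to go inside the server process.
print(best_route("10.20.30.2", iroutes))
```

The last lookup mirrors the laptop test earlier in the thread: a static route on the client gets the packet into the tunnel, but without a matching iroute the server process has no client to hand it to.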

coachmark2

Thanks for the detailed reply. The large subnet was chosen because previous OpenVPN installations I've done have handed out every 4th IP to clients and I figured a /24 would run out eventually. /21 was probably too big, to be honest.

Let's start over and rethink this whole mess from the start.

I have a half dozen branch sites on various subnets. Off the top of my head, one is 10.0.0.0/16, one is 10.20.0.0/21, and one is 10.6.0.0/24. All of them have pfSense Firewalls acting as the firewall and router for those sites.

Knowing that some of these subnets are rather stupid (I didn't set them up…), how would you suggest uniting all of them into a network where a node on any subnet can communicate with any other? I'm not loyal to OpenVPN and could use L2TP/IPsec.

I realize this is a lot to ask of goodwill forum support, so let me continue to express my appreciation for your help in this.

Derelict

If the default is left selected in the server configuration, every site gets one address out of the pool. You have to manually select topology net30 there to get the old behavior.
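
As a rough feel for the difference, the arithmetic can be sketched in Python. This only counts addresses; the real number of usable clients is a little lower because the server consumes some of the pool, and the /24 shown is just a hypothetical replacement for the /21 mentioned above.

```python
import ipaddress

def capacity(cidr):
    """Return (usable host addresses, number of /30 blocks) for a tunnel network."""
    net = ipaddress.ip_network(cidr)
    usable_hosts = net.num_addresses - 2   # minus network and broadcast addresses
    slash30_blocks = net.num_addresses // 4  # net30 spends four addresses per client
    return usable_hosts, slash30_blocks

for cidr in ("10.98.0.0/24", "10.98.0.0/21"):
    hosts, blocks = capacity(cidr)
    print(f"{cidr}: about {hosts} addresses one-per-client, about {blocks} /30 blocks with net30")
```

Even with the old every-4th-IP behavior, a /24 leaves room for dozens of sites.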

The size of the subnet mask really doesn't matter. It is the number of hosts on the broadcast domain that actually matters. Using unnecessarily large subnets:

1. Increases the likelihood of "colliding" with another private site for VPN purposes, forcing someone (or both) to renumber or perform NAT - both undesirable.
2. Increases the likelihood of configuration errors because people and some gear tend to assume /24.
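
A quick overlap check along these lines can catch collisions before a tunnel is built. This is a sketch with Python's standard ipaddress module; the three networks are the ones mentioned earlier in the thread, and 10.0.5.0/24 is a made-up extra site added to show a collision.

```python
import ipaddress

def find_overlaps(cidrs):
    """Return every pair of networks in cidrs that overlap each other."""
    nets = [ipaddress.ip_network(c) for c in cidrs]
    return [
        (str(a), str(b))
        for i, a in enumerate(nets)
        for b in nets[i + 1:]
        if a.overlaps(b)
    ]

thread_sites = ["10.0.0.0/16", "10.20.0.0/21", "10.6.0.0/24"]
print(find_overlaps(thread_sites))

# A hypothetical new site that happens to sit inside the /16.
print(find_overlaps(thread_sites + ["10.0.5.0/24"]))
```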

With "just" a half-dozen sites you can consider creating a different Site-to-Site configuration for each one or use a single site with iroutes or a combination of both.

In the latter case you have more flexibility because you can "push" settings to the clients:

One is 10.0.0.0/16, one is 10.20.0.0/21, and one is 10.6.0.0/24.

Server configuration:
Tunnel Network: Something unused anywhere - probably a /24
Remote Networks: 10.0.0.0/16,10.20.0.0/21,10.6.0.0/24
Local Networks: [Insert Local Subnet/CIDR],10.0.0.0/16,10.20.0.0/21,10.6.0.0/24
Inter-Client Communication: Enabled.
Topology: subnet

Client-specific Overrides:
Site 1 Remote Network: 10.0.0.0/16
Site 2 Remote Network: 10.20.0.0/21
Site 3 Remote Network: 10.6.0.0/24

It is possible you might see some (almost always harmless) errors logged in that configuration when OpenVPN at a remote site tries to add the route for its own network because they are globally pushed to everyone. One such technique to mitigate that would be to take manual control of the client-specific overrides (using the advanced box) like this:

Site 1:
iroute 10.0.0.0 255.255.0.0;
push-reset;
push "route 10.20.0.0 255.255.248.0";
push "route 10.6.0.0 255.255.255.0";

Site 2:
iroute 10.20.0.0 255.255.248.0;
push-reset;
push "route 10.0.0.0 255.255.0.0";
push "route 10.6.0.0 255.255.255.0";

Etc.
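
The remaining sites follow the same pattern, so the override text can be generated rather than typed by hand. Below is a minimal Python sketch; `override_directives` is a hypothetical helper written for this example, and it simply formats the same iroute, push-reset and push "route ..." lines shown above.

```python
import ipaddress

def override_directives(own_cidr, all_cidrs):
    """Build the advanced-box lines for one site: its iroute plus pushed routes to the others."""
    own = ipaddress.ip_network(own_cidr)
    lines = [f"iroute {own.network_address} {own.netmask};", "push-reset;"]
    for cidr in all_cidrs:
        net = ipaddress.ip_network(cidr)
        if net != own:  # do not push a site's own network back to it
            lines.append(f'push "route {net.network_address} {net.netmask}";')
    return lines

sites = ["10.0.0.0/16", "10.20.0.0/21", "10.6.0.0/24"]

# Site 3, which the listing above leaves as "Etc."
for line in override_directives("10.6.0.0/24", sites):
    print(line)
```

The output should be checked against the Site 1 and Site 2 listings above before it is pasted into a client-specific override.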

You could also do something like this:

Server configuration:
Tunnel Network: Something unused anywhere - probably a /24
Remote Networks: [none]
Local Networks: [Insert Local Subnet/CIDR]
Inter-Client Communication: Enabled.
Topology: subnet
Custom options:
route 10.0.0.0 255.255.0.0;
route 10.20.0.0 255.255.248.0;
route 10.6.0.0 255.255.255.0;

Client-specific Overrides:
Site 1 Remote Network: 10.0.0.0/16
Site 1 Local Network/s: 10.20.0.0/21,10.6.0.0/24

Site 2 Remote Network: 10.20.0.0/21
Site 2 Local Network/s: 10.0.0.0/16,10.6.0.0/24

Site 3 Remote Network: 10.6.0.0/24
Site 3 Local Network/s: 10.0.0.0/16,10.20.0.0/21

A caveat here is by taking manual control of the routes on the server, pfSense will not know what the remote networks are so you will lose some things that are automated there such as source rules for Automatic Outbound NAT for the remote networks. They will also have to be manually-added.

Another consideration is firewalling. In this server, point-to-multipoint configuration you are relying on OpenVPN to pass all traffic between the sites. There is no way to firewall it other than the OpenVPN rules at the remote endpoint controlling what traffic is allowed (which might very well be sufficient). If you create a tunnel for each site you have control over what traffic can traverse between endpoints. You have even greater control if you assign interfaces for them and use per-instance rules. Each instance will also be assigned a CPU core as needed. Point-to-multipoint connections will all use the same core on the server side - at least I am pretty sure this is still true in OpenVPN 2.4 on pfSense.

You could also do a tunnel to each site with a /30 and use OSPF. I don't know if that is worth all of the configuration. It depends on how dynamic the routing table is I suppose. Probably a call you'll have to make.

ETA: Moved to OpenVPN

mikeisfly

Another thing to consider if you are reimagining your network is to consolidate your IP addressing into a contiguous block. Think about what you need today and what you may want tomorrow. If possible, start with one large network that covers everything, say a /18, then break it down into /20s, which can in turn be broken down into /22s or /23s (this makes summarizing routes easier). This could be a big project, but if you control all these sites it is something to start planning for, and it could save you years of headaches. The sites would not all have to be done at the same time. This is a wish-list item: first get all your sites talking, then work on better IP management.
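
A hierarchical plan like that can be sketched with Python's ipaddress module. The 10.64.0.0/18 block is purely an assumption for illustration, as are the "region" and "site" names; pick whatever unused range fits your environment.

```python
import ipaddress

# Hypothetical company-wide block; any unused /18 would do.
company = ipaddress.ip_network("10.64.0.0/18")

# Split the /18 into /20s (for example, one per region).
regions = list(company.subnets(new_prefix=20))
print([str(r) for r in regions])

# Split the first /20 into /22s (for example, one per site in that region).
first_region_sites = list(regions[0].subnets(new_prefix=22))
print([str(s) for s in first_region_sites])

# Every site still falls under the single /18 summary route.
print(all(s.subnet_of(company) for s in first_region_sites))
```

Because every level nests cleanly inside the one above it, a single summary route per region (or one for the whole company) is enough on the far side of a tunnel.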

The second thing you might want to consider is that point-to-multipoint gives you a single point of failure. While you will have better control over which sites communicate with which when everything goes through your main office pfSense box, if that box or site goes down, all sites lose connectivity unless you have a backup plan (an alternate route). The main office will also carry more traffic than it needs to whenever two branch offices communicate. A mesh network, while requiring more work, might be something to consider.

coachmark2

You guys are both fantastic. Thank you so much for helping to explain to me how all this works. This morning, I set things up as Derelict recommended:

Server configuration:
Tunnel Network: Something unused anywhere - probably a /24
Remote Networks: [none]
Local Networks: [Insert Local Subnet/CIDR]
Inter-Client Communication: Enabled.
Topology: subnet
Custom options:
route 10.0.0.0 255.255.0.0;
route 10.20.0.0 255.255.248.0;
route 10.6.0.0 255.255.255.0;

Client-specific Overrides:
Site 1 Remote Network: 10.0.0.0/16
Site 1 Local Network/s: 10.20.0.0/21,10.6.0.0/24

Site 2 Remote Network: 10.20.0.0/21
Site 2 Local Network/s: 10.0.0.0/16,10.6.0.0/24

Site 3 Remote Network: 10.6.0.0/24
Site 3 Local Network/s: 10.0.0.0/16,10.20.0.0/21

I can now access all resources on the subnets mentioned thanks to your help. I shall buy another SG-3100 in your honor and definitely buy you a beer next time you're in my area.

P.S. We can mark thread as solved if that's a thing