open VPN and vlans

AdmiralBTech

Hey everyone.
Got a weird one I'm hoping you all can help with!
Basically I have a range of between 3 and 5 sites which I'm trying to get up and running.
All my xg7100s are sat behind the ISP-provided routers (don't have a choice) at each site.
I'm hoping to be able to set up site 1 as the server and each other site as the client, and route everything through the server.
My problem lies in the fact that I need about 10 identical vlans on each site, with devices on each site able to talk to people on each vlan (eg site 1 vlan 10 to talk to devices on sites 2 through 5 on vlan 10), ideally on the same IP range.
Does anyone know if this is possible? If so, then I'd like to be able to use site 1 as the DHCP server too, and have the other xg7100s point their devices to the site 1 xg7100.
I was thinking about whether using QinQ would work, but I don't know whether you can do QinQ over a VPN or not.
Any ideas would be appreciated.
Cheers!

JeGr

TL;DR

Short: You don't want that. Like: at all and never ever ;)

Long:
@AdmiralBTech said in open VPN and vlans:

My problem lies in the fact that I need about 10 identical vlans on each site, with devices on each site able to talk to people on each vlan (eg site 1 vlan 10 to talk to devices on sites 2 through 5 on vlan 10), ideally on the same IP range.

The VLAN ID is nothing to worry about as they are L2 and local to the network there. But why would you even think about setting up a VPN-mesh-kinda-setup with

ideally on the same IP range.

that idea? If there's one thing you should avoid like the plague, it's having the same IP address space on both (or multiple) sides of the connection.

It - is - utter - chaos!

Simply DON'T do that at all! As the only way that would even remotely work is to have NAT over NAT over NAT, it would be an absolute (pardon me for the wording) clusterfuck ;)
Also you'd have to use NATted addresses to access those sites' IP ranges anyway, so why put yourself between a rock and a hard place with colliding IP spaces in the first place?

To explain with a simple 3-part-mesh and only 1 LAN: Central is "C". Branches are "A" and "B":

You'd set up an OVPN Site2Site tunnel from A (client) to C (server) and from B (client) to C. So you'd have two tunnels set up.
You want to use 10.0.0.0/24 as your LAN on every site.
From Central, if you want to access the 10.0.0.0/24 on site A or B, you have to use NATted addresses so the routing has a chance to work out which way to go. So you define 192.168.1.0/24 as the NAT range for site A and 192.168.2.0/24 as the NAT range for site B, and map those 1:1. Every address in 192.168.1.x then goes over the S2S tunnel to site A and is rewritten to 10.0.0.x there.
That game has to be played on all sites, so site A also has to use a NATted range for Central, say 192.168.0.x. From A, the range 192.168.2.x is sent via A's tunnel to C and from there into the tunnel to B.
For every VLAN you add on Central (and sites A and B) you'd have to add those NAT translation tables again on all sides.
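
To make the 1:1 mapping concrete, here is a minimal Python sketch of what such a BiNAT rule does to a single address. It is only an illustration of the arithmetic, not pfSense configuration; the helper name binat is made up for this example, and the ranges are the ones from the A/B/C example above.

```python
import ipaddress

def binat(addr: str, from_net: str, to_net: str) -> str:
    """Hypothetical helper: map addr 1:1 from from_net into to_net, keeping the host part."""
    src = ipaddress.ip_network(from_net)
    dst = ipaddress.ip_network(to_net)
    ip = ipaddress.ip_address(addr)
    if src.prefixlen != dst.prefixlen:
        raise ValueError("1:1 NAT needs two networks of the same size")
    if ip not in src:
        raise ValueError(f"{ip} is not inside {src}")
    # The host offset inside the alias range is reused inside the real range.
    offset = int(ip) - int(src.network_address)
    return str(dst.network_address + offset)

# Central talks to 192.168.1.50; at site A that is rewritten to the real LAN address.
print(binat("192.168.1.50", "192.168.1.0/24", "10.0.0.0/24"))  # 10.0.0.50
# Same host number, but the alias range for site B leads to a different physical device.
print(binat("192.168.2.50", "192.168.2.0/24", "10.0.0.0/24"))  # 10.0.0.50
```

Both calls end up at "10.0.0.50", which is exactly the problem: the only thing that tells the two devices apart is the alias range, and every site has to know every alias range.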

Now think about having 5 sites with 10 VLANs: every site would have to BiNAT the 10 VLANs of each of the 4 other sites, so that's 40(!) networks on every site of your VPN mesh setup. That's why I called it a huge clusterfuck waiting to happen! It's simply insane to get your head around, and it is a playground for mistakes and broken routing.
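
The count can be checked with a few lines of Python. This is just the arithmetic from the paragraph above, assuming every site needs one alias range per VLAN of every other site; the function name is made up for the example.

```python
def binat_ranges_per_site(sites: int, vlans: int) -> int:
    """Alias ranges one site has to maintain: one per VLAN of every *other* site."""
    return (sites - 1) * vlans

print(binat_ranges_per_site(3, 1))        # 2   (the A/B/C example with a single LAN)
print(binat_ranges_per_site(5, 10))       # 40  (5 sites, 10 VLANs)
print(5 * binat_ranges_per_site(5, 10))   # 200 mappings to keep consistent across the mesh
```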

So I'd propose going the route every sane network tech would go:

Define a network range for your main site where all routing will happen. Let's make that 172.17.y.x/24.
You can then add multiple VLANs to it, like 172.17.0.x, 172.17.1.x, .2.x, up to .9.x. That makes 10 VLANs.
You can use simple VLAN numbering that mirrors the IP ranges, like VLAN 1700-1709 for those networks. Then it's really easy to debug, too.
For your "client" sites, we'll take 4 different ranges, like:
- 10.21.y.x - Site 1
- 10.22.y.x - Site 2
- 10.23.y.x - Site 3
- 10.24.y.x - Site 4
Again, you can use 0-9 for "y" to have your 10 VLANs in these sites. Also you can simply use VLAN IDs like 2[1-4]yy (2100-2109, etc. etc.) as VLAN IDs - you hopefully get the drift ;)
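
As a rough sketch, the whole plan can be generated and sanity-checked in Python. The location names and the plan helper are made up for this example; the prefixes and VLAN numbering are the ones proposed above.

```python
import ipaddress

# (first two octets, VLAN ID prefix) per location, as proposed above
LOCATIONS = {
    "central": ("172.17", 17),
    "site1": ("10.21", 21),
    "site2": ("10.22", 22),
    "site3": ("10.23", 23),
    "site4": ("10.24", 24),
}

def plan(prefix: str, vlan_prefix: int, count: int = 10):
    """Return (VLAN ID, /24 network) pairs for y = 0..count-1."""
    return [
        (vlan_prefix * 100 + y, ipaddress.ip_network(f"{prefix}.{y}.0/24"))
        for y in range(count)
    ]

all_nets = [net for p, v in LOCATIONS.values() for _, net in plan(p, v)]

# Every network must be unique across all locations, otherwise routing breaks.
overlaps = [
    (a, b)
    for i, a in enumerate(all_nets)
    for b in all_nets[i + 1:]
    if a.overlaps(b)
]

for vlan_id, net in plan(*LOCATIONS["site2"])[:3]:
    print(vlan_id, net)
print("networks:", len(all_nets), "overlaps:", len(overlaps))
```

With the ranges above this gives 50 networks and no overlaps, so nothing ever needs to be NATted between sites.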

What does that achieve?

a) It makes debugging easier. If something is wrong with VLAN IDs on the switches, a simple tcpdump can show you which packets arrive, and from their IPs you quickly know that something is misconfigured if e.g. 172.17.4.1 shows up on VLAN 1701 (it should be 1704!).
b) It uses distinct networks all over your network. You can fully route your traffic between all locations without ever needing NAT between any site whatsoever.
c) It makes routing through the VPN tunnels simple. By packing the networks together in nice little /16 addressable segments (or you can use /20 to only address the first 16 /24 networks), you can ease the configuration of the S2S tunnels:
* C to anywhere just needs 172.17.0.0/16 as local network, 10.21.0.0/16 as remote for tunnel 1, 10.22.0.0/16 as remote for tunnel 2, etc.
* Site 1 (for example) has its 10.21.0.0/16 as local and 172.17.0.0/16,10.0.0.0/8 as remote (so all 10.x traffic that isn't local goes to Central too, which can then send it on to one of the other sites 2-4).
* Sites 2-4 are the same, each with its own local network and the same remote setting as site 1.
d) That makes it easy to add further VLANs on all sites without having to reconfigure the tunnel settings at all: as all tunnels have at least a /16 netmask configured, you can expand your networks on any site or on the main site simply by counting upwards with your networks.
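
To see why the /16 and /8 remote networks are enough, here is a small longest-prefix-match sketch in Python. It only models the route lookup, not OpenVPN itself; the next-hop labels ("tunnel-1", "WAN" and so on) are made-up names for this example.

```python
import ipaddress

def next_hop(dst: str, routes: dict, default: str = "WAN") -> str:
    """Pick the most specific (longest prefix) route that contains dst."""
    ip = ipaddress.ip_address(dst)
    matches = [
        (ipaddress.ip_network(net), hop)
        for net, hop in routes.items()
        if ip in ipaddress.ip_network(net)
    ]
    if not matches:
        return default
    return max(matches, key=lambda m: m[0].prefixlen)[1]

# Site 1: its own /16 is local, everything else in 10/8 and 172.17/16 goes to Central.
SITE1 = {
    "10.21.0.0/16": "local",
    "172.17.0.0/16": "tunnel-to-central",
    "10.0.0.0/8": "tunnel-to-central",
}
# Central: one /16 remote network per tunnel.
CENTRAL = {
    "172.17.0.0/16": "local",
    "10.21.0.0/16": "tunnel-1",
    "10.22.0.0/16": "tunnel-2",
    "10.23.0.0/16": "tunnel-3",
    "10.24.0.0/16": "tunnel-4",
}

# A host at site 1 talks to 10.23.4.7 (site 3, VLAN 2304):
print(next_hop("10.23.4.7", SITE1))    # tunnel-to-central
print(next_hop("10.23.4.7", CENTRAL))  # tunnel-3
# Local traffic stays local because /16 is more specific than /8:
print(next_hop("10.21.3.9", SITE1))    # local
```

Adding a VLAN such as 10.23.10.0/24 later changes nothing in these tables, since it already falls inside 10.23.0.0/16.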

If so, then I'd like to be able to use site 1 as the DHCP server too, and have the other xg7100s point their devices to the site 1 xg7100.

Also, using a central DHCP server would shoot you in the foot: devices on a far network with the same IP range would request IPs out of the same pool as the local devices, so the network quickly becomes too small. E.g. using 10.1.0.0/24 on all sites would leave you ~250 DHCP IPs that would have to be divided between 5 locations, so you could only address about 50 devices per site. And not knowing whether a client is local, or whether an L2 ARP/IP request from the same address range came in via the VPN, would IMHO again bring chaos very quickly.

Regards,
Jens

JKnott

@AdmiralBTech

You can't send L2 VLANs over L3 IP. However, you can route each subnet across as needed. They can all pass through the same tunnel. So, you could set up the various VLANs and then route as appropriate.

AdmiralBTech

@JeGr
Thanks for your comprehensive reply!
To answer some of your points

But why would you even think about setting up a VPN-mesh-kinda-setup with

The main reason I think I need to do this is that I am trying to set up event control systems (e.g. lighting, video, sound etc).
If I just use lighting for an example.
The lighting desk needs to be able to see the node at the other side to communicate with it. So the desk is either broadcasting to all the nodes on the network or sending via multicast. In a normal situation this is ok as everyone is local and nothing has to be routed.
My aim is to have the lighting desk in one location (e.g. a warehouse), and the lights in another location in the country (e.g. someone's home). I want to use the pfsense to route the packets between the desk and the lights.
So if the desk is on 10.1.1.50, it needs to be able to see the lighting node on the other side of the country on an ip address like 10.1.1.60.

Clusterfuck

I like it. Brit by any chance?

So using routing, how would I get my lighting desk at ip add 10.1.1.50 to have active communication between it and the node at 10.1.1.60?
Can I send you what I think my ip scheme should look like after reading your comment?

Cheers!

JeGr

@AdmiralBTech said in open VPN and vlans:

So if the desk is on 10.1.1.50, it needs to be able to see the lighting node on the other side of the country on an ip address like 10.1.1.60.

That may be, but you can't just route multicast or broadcasts from the same network over VPNs so that they arrive at some external location to be picked up and replied to. The answer would never reach the controller again, because every site thinks that the network 10.1.1.x is local to it and thus never sends the packet to the default router. That simply doesn't work. As I understand it, you're trying to "bridge" multiple locations and open up one big subnet 10.1.1.x across them. You'd need a L2 capable "VPN" for that. Otherwise I don't see it working out without proper routing.
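
The "every site thinks the network is local" part can be illustrated with the on-link decision every IP host makes before sending a packet. This is a simplified Python sketch of that decision, assuming the node uses a /24 mask; the function name and the 10.22.1.60 address are made up for the example.

```python
import ipaddress

def delivery(own_interface: str, dst: str) -> str:
    """Simplified host logic: same subnet means ARP directly, otherwise use the gateway."""
    iface = ipaddress.ip_interface(own_interface)
    if ipaddress.ip_address(dst) in iface.network:
        return "ARP on the local segment"
    return "send to the default gateway"

# The remote lighting node replying to the desk: both are in 10.1.1.0/24, so the node
# ARPs for 10.1.1.50 on its own LAN and the packet never reaches pfSense or the tunnel.
print(delivery("10.1.1.60/24", "10.1.1.50"))   # ARP on the local segment

# With distinct subnets per site the reply is handed to the router and can be routed.
print(delivery("10.22.1.60/24", "10.1.1.50"))  # send to the default gateway
```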

So using routing, how would I get my lighting desk at ip add 10.1.1.50 to have active communication between it and the node at 10.1.1.60?

That depends on whether your lighting desk can actually use real routing and IP addresses rather than just trying to detect everything via multicast or broadcasts. If it can't, what you'd like to do seems close to impossible to achieve with normal VPN software like OVPN.

AdmiralBTech

@JeGr
That's fair enough.

You'd need a L2 capable "VPN" for that

I was thinking of trying to use OpenVPN in TAP mode rather than TUN mode.

BTW, thanks for all the advice so far! Really appreciate it.

JeGr

@AdmiralBTech said in open VPN and vlans:

I was thinking of trying to use OpenVPN in TAP mode rather than TUN mode.

I wouldn't count on that. Even in TAP mode, some things are better left alone rather than opening Pandora's box ;)
I'd think more along the lines of tools like ZeroTier or anything similar that aims to provide a L2 capable VPN connection.

But really, if the soft-/hardware you have deals heavily with local broadcasts or multicasts and "autodiscovery" and similar "automagic" things rather than plain IP, I'd leave it alone, even if I understand the idea.