Best practice pfsense on ESXi, VLANs, Virtualisation lab
I would like to discuss something that’s been on my mind for a while: what is the best practice for pfSense on a hypervisor with 4 NICs and a managed switch? (HP DL360 G7, Netgear GS724T)
I have set it up (just to prove my concept) with all 4 NICs trunked (LAG, failover) and then completely separated my network via VLANs into WAN, LAN, Guest and Surveillance. Management is done over LAN. On the ESXi host I’m also going to have some sort of test lab or maybe even a NAS/file storage. It’s working, but is it the optimal setup in terms of security and performance? Or would it be better to move the WAN connection to a separate NIC? Or maybe something else I didn’t think of?
Thank you for sharing your opinion.
I don’t know if there is a best way. For my labs and production, I use a dedicated NIC for WAN and one for LAN/OPT1. I don’t use VLANs.
A basic VLAN is more like a switch routing technique (based on MAC or IP tables). Basic VLANs do not really add anything to security like encryption; they rely on things that can be spoofed (VLAN tags in the header, MACs, IPs, etc.). I’m not saying that some of the best and newest commercial VLAN switches cannot throw in encryption… but that’s not likely on affordable home and SOHO boxes. Heck, it’s probably not that affordable for medium-sized corps even if it’s available. That is a lot of gigabit encryption power on a decent-sized switch appropriate to that size of company.
So a quick surface analysis says your firewall capabilities go up in flames if you essentially have only 1 NIC for both WAN and LAN after aggregation. Nothing keeps outsiders from spoofing their way onto a different VLAN intended for LAN use.
But if you are only after VPN capabilities, then 1 NIC is fine if you pass all traffic, including the “outgoing” VPN tunnel, through another properly set up firewall.
Sorry, VLANs do benefit security and performance by limiting broadcast domains. But that is more a narrowing of publicly advertised information to smaller groups, rather than preventing spoofing (network identity theft) or preventing eavesdropping without proper authentication/credentials/keys once a legitimate network identity is spoofed. That is what a firewall separating physical WAN and LAN does.
*Yes a WAN using only a VPN on its VLAN would be secure on the same interface as VLANs used for LANs.
But that case requires that the WAN VPN on its VLAN be created on that hypothetical “proper firewall”. Or at least using a VPN tunnel to bridge an entire raw WAN interface to your pfsense box. The VPN bridge tunnel seems wasteful and depending on the configuration and VPN box maybe dangerously prone to losing VPN and spewing raw WAN data through its VLAN without aggressive alert.
After thinking it over, your single aggregate interface with VLANs might not be completely horrible…as long as your managed switch explicitly allows only a single VLAN on each of its ports (other than the 4 ports of your aggregate server trunk which has them all).
Then it all depends on how hard your switch is to hack. Unfortunately, some switches crack pretty easy. Many simply are NOT focused on being hardened like a firewall since they assume decent network security resides elsewhere. Some are simply focused on performance. Some simply bear symptoms of hasty readiness for market or plain poor workmanship in their code.
What I was thinking earlier was that things can be a lot worse if each switch port is configured to autolearn its VLAN configuration based on the received VLAN tags, IP addresses, MACs etc. That is often the default switch configuration – accept all traffic as good and set the best network switching from that. Maximum bad if the actual port attached to the WAN learns configuration from whatever packets the switch receives. In those cases you have primed the switch to accept spoofed traffic, identities, and even VLANs.
Let’s talk about the performance part of this, since you lagged the connections… 1+1+1+1 does not = 4… It just equals 4 x 1 links… where your traffic may or may not load balance across the physical links…
I would see little advantage to such a setup from a performance point of view, because you have no idea what physical path traffic would be taking… Much easier to assign your physical nics to specific vlans that are tied to specific vms on vlans or pfsense, etc.
If you need more than 1 network on that specific NIC, then sure, VLAN it…
But I do not see such a setup helping security or performance in any way… The more complex you make a setup, the more likely you are to make a security mistake… So if anything, such a setup could be seen as lowering security… And with such a setup you have no idea if the return traffic or other inter-VLAN traffic will take a different physical path, so you could have a bunch of hairpins, which could lower your overall performance…
Thank you for your replies.
First, I want to talk about the security part and how I have set it up.
The DSL modem, which is my WAN connection, is connected to a dedicated port on the switch. That port is an untagged member of the WAN VLAN, and incoming frames are tagged via the PVID setting. The port is not a member of any other VLAN, so packets with “spoofed” VLAN IDs should be dropped. The switch management interface is only reachable through my main network VLAN.
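As a rough illustration of the port behaviour described above, here is a minimal sketch of ingress VLAN filtering. It assumes the switch has ingress filtering enabled on that port; the `SwitchPort` class and the VLAN IDs 10 and 20 are made up for the example and do not describe the GS724T's actual firmware logic.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SwitchPort:
    """Toy model of an 802.1Q port with ingress filtering enabled (assumption)."""
    pvid: int                                  # VLAN given to untagged ingress frames
    member_vlans: set = field(default_factory=set)

    def classify(self, frame_vlan_tag: Optional[int]) -> Optional[int]:
        """Return the VLAN the frame is placed in, or None if it is dropped."""
        vlan = self.pvid if frame_vlan_tag is None else frame_vlan_tag
        return vlan if vlan in self.member_vlans else None


WAN_VLAN, LAN_VLAN = 10, 20                    # hypothetical VLAN IDs
modem_port = SwitchPort(pvid=WAN_VLAN, member_vlans={WAN_VLAN})

print(modem_port.classify(None))               # untagged frame from the modem -> 10
print(modem_port.classify(LAN_VLAN))           # frame with a spoofed LAN tag -> None
```

Whether a real switch behaves this way depends on its ingress filtering and "acceptable frame types" settings, so it is worth confirming those on the modem port rather than relying on defaults.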
For the performance part, I’m aware that I would not get a 4 Gbit connection to a single point, but I would usually have various data streams and therefore I should have a benefit of this setup.
When I separate the WAN to a single NIC however, I would lose the major part of its bandwidth, as my WAN connection is not anything near Gbit speed.
“but I would usually have various data streams and therefore I should have a benefit of this setup.”
Between what and what… What hashing is being used to put a session on physical path A vs path B, etc. Laggs are great when you want redundancy… Laggs are great when you have hundreds if not thousands of devices talking to lots of devices on the other end of the lagg/etherchannel, etc. etc. Take a look at the counters of your interfaces… Is the traffic being evenly distributed across them?
“When I separate the WAN to a single NIC however”
So you’re saying you exceed 3 Gbit of traffic over your lagg, and that if you broke out one gig interface for just your WAN, which is say only 100, you would be saturating your lagg? Sorry, but I find that highly unlikely… If you are using 3 gig of your 4 gig lagg… then you need a bigger pipe, plain and simple…
Maybe I wouldn’t max out the 3 Gbit LAG, but if it isn’t necessary for security reasons I would prefer the LAG over 4 ports.
However, I don’t understand the benefit of having a dedicated NIC per VLAN as uplink for a dedicated vSwitch rather than separating it through port groups on the same vSwitch. Or did I get that wrong in Post #6?
The load balancing is negotiated between the hypervisor and the switch. I’m not familiar with the underlying process.
But I’m quite confident that traffic will be evenly distributed over the interfaces. As I mentioned, I have only run this system as a proof of concept so far, not in “production use” so I have no “real” data to examine.
“not in “production use” so I have no “real” data to examine.”
Then you have no use for the lagg, to be honest… It makes no sense to create a more complex setup that removes your insight into the traffic flow pattern. There seems to be a misconception at a basic level about what a lagg, etherchannel, portchannel, or whatever term you want to use, actually accomplishes.
No single session is ever going to split traffic across the different physical paths… In such a setup you have no way to know which physical path your inter-VLAN traffic is going to take. So when you put all 4 VLANs on the lagg, you have no way to really know that traffic between VLAN A and B will take different physical paths. You could end up with a hairpin on the same physical path when talking from VLAN A to B, so instead of increasing your available bandwidth you actually /2 it… Now if you had VLAN A on physical path 1 and VLAN B on path 2, you are sure that you will not hairpin and not cut your bandwidth in half…
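To put rough numbers on the hairpin concern, here is a toy model, not a measurement. It assumes 1 Gbit full-duplex links, streams that each want line rate, and equal sharing on the busiest link direction. For simplicity it pins each VLAN to a link by hand, whereas a real lagg decides per flow by hash. The function name `stream_rate` and the VLAN labels are illustrative only.

```python
LINK_GBIT = 1.0  # assumed capacity of one link, per direction (full duplex)


def stream_rate(streams, vlan_to_link):
    """Rate each routed stream gets if the busiest link direction is shared equally.

    streams: list of (src_vlan, dst_vlan) pairs, each wanting line rate.
    vlan_to_link: which physical link carries each VLAN to and from the router.
    """
    load = {}
    for src, dst in streams:
        # Every routed stream goes up to the router on src's link
        # and comes back down on dst's link.
        for hop in ((vlan_to_link[src], "up"), (vlan_to_link[dst], "down")):
            load[hop] = load.get(hop, 0) + 1
    return LINK_GBIT / max(load.values())


both_ways = [("A", "B"), ("B", "A")]
shared = {"A": 0, "B": 0}      # both VLANs happen to use the same link
separate = {"A": 0, "B": 1}    # one VLAN per physical link

print(stream_rate(both_ways, shared))      # 0.5
print(stream_rate(both_ways, separate))    # 1.0
print(stream_rate([("A", "B")], shared))   # 1.0
```

Under these assumptions the halving shows up when both directions are busy at once on a shared link. A single one-way stream hairpinned over a full-duplex link still gets the full rate in this model, since it uses the receive side once and the transmit side once.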
How many VLANs do you have? Unless it’s something way higher than 4, it makes no sense to do what you’re doing. And even if you understand the traffic flow, it doesn’t make any sense either… Since if you do know what your traffic flow is for inter-VLAN traffic, you would want to make sure the heavier inter-VLAN traffic is on different physical paths. And the VLANs that don’t have a lot of traffic can all share the same path, etc…
Not sure what POC you were trying to do here… Yes, you can VLAN over a lagg; this has been around for years and years and years. This is not something that needs to be proven out. But adopting the setup without a clear understanding of the traffic flow, how you are going to leverage it, and why/how the number of clients could load share over the 4-interface lagg… Where are they going? You stated your internet is not even gig… So if you are flowing 4 gig into the router, and the internet is not 4 gig… where is the traffic going that could eat up the 4 gig? To other VLANs on the same lagg? You just cut your bandwidth in half, maybe, because you have no idea when you might hairpin vs using 2 physical paths when VLAN A is talking to B…
Such setup might make sense if you had a 10ge wan… And you had LOTS of clients and you were wanting to be able to use up as much as possible of the 10ge wan if needed, etc. But no single client could ever use more than the 1 gig anyway in a session.
Laggs really only make sense for redundant paths… People think that oh I lag 1+1 and now I have 2… Sorry it just doesn’t work that way… If you want 2, then you need to put in 2 Not 1+1…
Man, that’s a whole lot to read…
To start with, I already mentioned that I know I don’t get a 4 Gbit connection if I put 4x1. I know that traffic between a and b will always only use path a, but if I have traffic between a+b, c+d and e+f at the exact same time, it will most likely be split across 3 interfaces. That’s what I understand a LAG to be. Correct me if I’m wrong. What I definitely don’t care about is whether the connection between a+b is going over IF 1, 2, 3 or 4.
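The "three flows, three interfaces" expectation can be checked with a short enumeration. This is a sketch under one strong assumption: that the hash places each flow on any of the 4 links with equal probability, independently of the others. Real hashes on real addresses can be more or less favourable than that.

```python
from fractions import Fraction
from itertools import product

LINKS = 4   # physical links in the LAG
FLOWS = 3   # simultaneous flows (a+b, c+d, e+f)

# Enumerate every way the hash could assign each flow to a link,
# assuming all assignments are equally likely.
assignments = list(product(range(LINKS), repeat=FLOWS))
all_distinct = [a for a in assignments if len(set(a)) == FLOWS]

p_distinct = Fraction(len(all_distinct), len(assignments))
print(p_distinct, float(p_distinct))   # 3/8 0.375
```

So under the uniform assumption, the three flows land on three different links in 24 of 64 cases. In the remaining 62.5% of cases at least two of them share a link.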
Who said I wouldn’t put it into production? I just didn’t want to put the system into use before knowing it was “safe” or at least OK. What’s wrong with that?
Where did I say I have to route fast Internet? I mentioned Surveillance on a separate VLAN, which will need inter-VLAN routing to my NAS.
I mentioned VMs on the same host with iSCSI traffic over that LAG, guest network, access to storage on or connected to the ESXi host.
Why doesn’t it make sense to have more than one NIC per VLAN? If I wanted to transfer data at gigabit from two different computers on the same VLAN at a time, I would have more throughput with LAG. Don’t always tell me I don’t understand it, explain it to me!
Yes, maybe it’s a bit overkill but that’s not the question. I wanted to know if there are security issues or if the performance would be bad on this setup.
“I wanted to know if there are security issues or if the performance would be bad on this setup.”
How many vlans do you have??
“If I wanted to transfer data at gigabit from two different computers on the same VLAN at a time, I would have more throughput with LAG. Don’t always tell me I don’t understand it, explain it to me!”
If the computers were on the same vlan - they wouldn’t be talking to pfsense anyway… So now you have multiple computers on vlan A talking to Vlan B?? So which physical path do they take? For all you know since you lagged this and have no control over the hashing that determines when traffic uses a specific physical path… You might be going up and down the same physical path… So vs having 1 gig road from A to B… You drive down the same road Twice and now only have 1/2 your road…
I should have clarified, I didn’t mean you just don’t understand… It’s a whole lot of people, even people in the field… They are just so used to using lagg that they don’t really get that it’s not doing what they think it’s doing, etc. There seems to be this conception that with lagg, etherchannel, port channel, 1+1=2, which has never been the case… 1+1 does not = 2 in this sort of setup unless you’re sure that you have multiple sessions, and enough variance in the variables to spread the traffic over both vs sharing the same physical path…
What determines which physical path traffic from point A to B goes over? Is it same MAC, same IPs? What determines when a packet goes on path A or path B in the lagg? If you do not understand this, and do not have enough variance in those fields to make sure traffic does flow over different paths, you do not know for sure that you have full bandwidth…
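To make the question concrete, here is a minimal sketch of hash-based link selection. It is not the algorithm of any particular switch or hypervisor: CRC32 stands in for whatever hash the device really uses, `pick_link` is a made-up helper, and the IP and MAC addresses are hypothetical. The point is only that the chosen link depends on which header fields feed the hash.

```python
import zlib


def pick_link(key_fields, n_links):
    """Map a flow's header fields to one LAG member, deterministically."""
    key = "|".join(key_fields).encode()
    return zlib.crc32(key) % n_links      # CRC32 is a stand-in hash (assumption)


N_LINKS = 4
# Hypothetical source/destination IP pairs: three clients talking to one NAS.
flows = [
    ("192.168.10.5", "192.168.20.9"),
    ("192.168.10.6", "192.168.20.9"),
    ("192.168.10.7", "192.168.20.9"),
]

# IP-based policy: each src/dst pair is hashed on its own, so pairs may differ.
by_ip = [pick_link(f, N_LINKS) for f in flows]

# MAC-based policy on the routed leg: every frame travels between the same
# two MACs (router and NAS), so all three flows hash to the same link.
router_mac, nas_mac = "00:00:5e:00:53:01", "00:00:5e:00:53:02"
by_mac = [pick_link((router_mac, nas_mac), N_LINKS) for _ in flows]

print("per-flow link, IP hash: ", by_ip)
print("per-flow link, MAC hash:", by_mac)
```

With a MAC-only policy the routed traffic here always shares one link, however many clients are involved. With an IP-based policy the spread depends entirely on how many distinct address pairs there are and how they happen to hash.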
But if you use different physical paths for A and B… Then you are SURE that your flow will not be hairpinned down the same physical path and be a restriction on your bandwidth… This is my point…
You would lagg the kind of connection you’re talking about when you’re worried that one of paths A and B might fail… So you’re OK with maybe having a few hairpins now and then because you don’t want it to be down if 1 of the physical paths in the lagg fails, etc.
Does that make more sense? See the pics
You would not do a lagg unless you had lots and lots of VLANs, and lots and lots of stuff talking to each other, so that you’re fairly confident there is going to be a great spread of traffic across the lagg… AND!!! you’re worried about the loss of 1 of the legs in the path… Because if you’re not, then it’s probably better to set the VLANs manually on the physical paths, based on which will have the most/least traffic between them, to ensure!! that the heavier traffic cannot share the same physical path in a hairpin, etc.
In the 3rd attached pic… In which setup are you SURE that VLAN A talking to C will have a 1 gig road all the way there and back? And in which setup can you NOT rule out that A and C talking to each other go up and down the same physical path in a hairpin?
From a security point of view… If set up correctly there would be no issue, but you’re now making the setup more complex… Vs having a specific VLAN on a specific physical NIC on the host… you’re now sharing VLANs on multiple NICs into the same vSwitch with port groups? And on the switch trunk you have to allow or not allow each VLAN, etc. Since the setup is more complex, it’s more likely to have a mistake. Mistakes cause production outages, they are reasons for security issues, etc. KISS is your friend in more ways than not.
“If the computers were on the same vlan - they wouldn’t be talking to pfsense anyway…”
You’re missing something here… There are more machines on the hypervisor, so there is traffic on the same VLAN passing through the LAG, but not going to pfSense!
I will for now just assume that I configured everything correctly and test the performance with multiple data streams on my own.
So you’re running traffic from physical to VM… across the lagg. How many physical, how many VMs? What algorithm did you select for the load balancing… Keep in mind that this can only be done for outgoing traffic.
You will need to check the counters on the actual interfaces on the switch to see what kind of distribution you’re getting across the physical paths, etc…
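Once you have two readings of the per-port byte counters, turning them into a distribution is simple arithmetic. A small sketch follows; the port names `g1` to `g4` and all counter values are invented for illustration, and the helper `link_shares` ignores counter wrap-around.

```python
def link_shares(before, after):
    """Per-port share of the bytes counted between two counter readings."""
    deltas = {port: after[port] - before[port] for port in before}
    total = sum(deltas.values())
    return {port: (d / total if total else 0.0) for port, d in deltas.items()}


# Hypothetical byte counters for the four LAG ports, read twice during a test.
before = {"g1": 1_000, "g2": 1_000, "g3": 1_000, "g4": 1_000}
after = {"g1": 9_000, "g2": 1_500, "g3": 1_000, "g4": 2_500}

shares = link_shares(before, after)
for port, share in shares.items():
    print(f"{port}: {share:.0%}")
```

With these made-up numbers one port carries most of the traffic and one carries none, which is the kind of skew the counters would reveal. Take the readings for both transmit and receive, since each end of the lagg picks the link only for its own outgoing traffic.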