lagg bandwith issue
-
Not sure. I only use lag on pfsense in a school with around 600 client devices.
Even there I have trouble simulating aggregation on a small scale. In reality it does provide more than a gigabit in random situations when there is high load, like deployment of a dozen PCs while simultaneously keeping normal browsing working for the other hundreds of devices.
Maybe @johnpoz can pitch in?
-
@jbisson said in lagg bandwith issue:
bandwidth is split in two, ~400-500 Mbps each.
And who says that is not because of a hairpin.. When you're doing vlans over a lagg you really don't know which leg will be taken.
Why not just use specific uplinks for the vlans vs just throwing them all together in a lagg.. When you do intervlan routing like that you have no idea which physical path will be taken.. And yeah, if you hairpin traffic you could end up seeing less than what you would expect.
Remember lagg isn't 1+1 = 2, it's just 1 and 1.. You don't really have any control over which physical path traffic will take..
But if you have 4 different paths for the 4 vlans, then you are sure which path will be taken for the flow between vlans..
-
Fair point. But I'm not sure what you mean by "specific uplinks for the vlans". What exactly are you suggesting?
I did another test with two different interfaces (not lagged) between 4 PCs (2 on each LAN card, all using different vlans). I was able to get the full 1 Gbps on each connection and saw the CPU spike to around 10%, so the CPU is definitely able to handle the load.
I could put one vlan on a specific interface and the rest of the vlans on the other interface. This would give me the full 2 Gbps IF the requests came in on the two different interfaces.
I still think the lagg would be better, because that would give me all the combinations of vlans instead of separating them onto specific interfaces, but it is what it is I guess.
-
If you want control of the path traffic will take then you need to separate them.. Not throw everything into a lagg..
What does the lagg get you?? Other than no control over the traffic path? Sure, if one of the physical paths fails you would still have connectivity... Do you really think it's going to fail?? Is that your concern? Is redundancy in case of NIC failure a bigger concern than the ability to provide bandwidth to intervlan traffic?
If you had say 3 vlans and 2 interfaces - sure, throw it into a lagg and hope for the best. Or 4 vlans and 3 ports.. But you have 4 vlans and 4 ports.. Just put 1 vlan on each uplink.
Or when you have more networks than you have ports - control it by putting the vlans that do not talk to each other on the same interface, or the networks that have lower bandwidth requirements.. For example my wireless vlans are on 1 interface, because they don't talk to each other.. And no single client on any specific vlan would ever get close to gig.. Other vlans/networks are on their own uplink..
Laggs are great when you have 20 some vlans say and hey they need more than 1gig, you can balance all that different traffic across more physical paths to give you more bandwidth. But in a case where you want to make sure vlan x talking to y has the max possible bandwidth, no - throwing them both on the same lagg would not allow you to know for sure that there is not going to be a hairpin.
-
@johnpoz said in lagg bandwith issue:
Laggs are great when you have 20 some vlans say and hey they need more than 1gig, you can balance all that different traffic across more physical paths to give you more bandwidth.
My idea with the lagg is purely to allow two simultaneous 1 Gbps connections regardless of their VLAN, without having one uplink interface to the switch for each VLAN. Currently I have ~5 VLANs, but I might add more in the future, and my idea would be to just use a trunk on lagg0 instead of starting to put each of them on its own 1 Gbps uplink.
I'm still not sure I understand in which case the lagg would be good though. Isn't that what I'm trying to achieve here? Having a couple of VLANs sharing one trunk connection to allow a (total) bandwidth greater than 1 Gbps. I understand I would not get 2 Gbps for one connection, but you would think I could probably get two different connections at 1 Gbps each on that trunk.
I'm just trying to understand how a lagg should work without a hairpin. From what I could understand, in order to avoid a hairpin in a lagg, each VLAN should only have one possible gateway path. Would you have an architecture example of how you could support 20 VLANs in a lagg configuration without a hairpin?
Lots of new stuff and new knowledge for me, thanks for all the information so far. I feel I'm missing some bits, but hopefully this thread will be useful for others :)
-
@jbisson said in lagg bandwith issue:
I understand I would not get 2 Gbps for one connection, but you would think I could probably get two different connections at 1 Gbps each on that trunk.
It might, with 2 different clients talking to 2 different things across the lagg. But you really don't have a lot of control here..
Yes, if you have 100 clients talking to multiple things across the lagg, then your bandwidth will be split across the physical paths and you would be able to get up to your 2 gig (1 and 1)..
But when you have only a couple of clients talking, you really have no way of making sure that their connections will be split across the 2 physical paths. And especially if a client in vlan X is talking to something in vlan Y and both vlans go over the same lagg, you cannot be sure that the traffic is not hairpinned down the same physical path in the lagg.
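To put rough numbers on the "couple of clients vs 100 clients" point, here is a minimal sketch. It assumes an idealized hash that drops each conversation onto a link uniformly and independently, which real switch or pfSense hashing does not guarantee. The function name `prob_all_on_one_link` is made up for illustration.

```python
from fractions import Fraction


def prob_all_on_one_link(n_flows: int, n_links: int) -> Fraction:
    """Chance that every flow lands on the same physical link.

    Assumes (hypothetically) a uniform, independent hash per flow.
    """
    if n_flows < 1 or n_links < 1:
        raise ValueError("need at least one flow and one link")
    # Any of the n_links links can be the shared one; each flow picks
    # that link with probability 1/n_links.
    return n_links * Fraction(1, n_links) ** n_flows


if __name__ == "__main__":
    for flows in (2, 4, 10, 100):
        p = prob_all_on_one_link(flows, 2)
        print(f"{flows:>3} flows on 2 links: P(all on one leg) = {float(p):.3g}")
```

Under that assumption two conversations share a leg half the time, while with 10 conversations the chance of one leg sitting completely idle drops to about 0.2%. It says nothing about how evenly the load is split, only that both legs get used.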
If you have only 4 vlans and 4 physical ports, it doesn't make a lot of sense to lagg them unless your goal is redundancy.. If your goal is to make sure each vlan has the max possible bandwidth for your 1 gig connections, and clients are talking to each other across vlans that traverse this - then lagg doesn't make much sense.
edit: If you have the physical ports and interfaces, and your goal is making sure that vlan x and vlan y clients have max bandwidth talking to each other, it's best to use specific uplinks for each vlan.
If you have lots of clients talking to each other from vlan x to vlan y and 1 gig is not enough, then put 2 ports in a lagg for vlan X and 2 ports in a lagg for Y.. Or up the uplink bandwidth from 1 gig to say 2.5 or 5 or 10, or even higher if possible.
Optimal design of connections requires understanding the data flow, how much of it, from what to what. And the goal, be it max possible data flow, or redundancy and increased flow when lots of clients are talking to lots of other things.
If your goal is knowing that vlan x talking to y will always have full 1 gig between each other, then using different uplinks for each vlan is the better choice. But once you throw the connections into a lagg you lose control of which physical path might be taken for any specific conversation..
edit2: Maybe this helps.. When you have a physical path for each vlan you're SURE the traffic flow will be like the left side. When you lagg it, you don't really have any control, and while sure it could still take both physical paths, you cannot be sure. So you could end up with traffic like on the right, where you hairpin the traffic and the other physical path is not even used for specific conversations.
This is taken down to a basic level for sure.. My point is that for any "specific" conversation you cannot be sure which physical paths will be taken. So as the number of conversations and number of clients increases - sure, you will have load sharing across both links and all is good, it's easy to set up, and hey if 1 of those physical paths fails - your clients will still be able to talk to each other.
But you cannot be sure which path will be taken, whether all of them will be used for your conversation(s) or just 1 of them or some of them.. Not lagging gives you that specific sort of control, knowing exactly which physical path will be taken between vlan X and Y.. Sure, if you're talking vlans A,B,C ... Q and you only have 2 physical ports to use, then it becomes less of a concern that hey, when a client in C talks to something in P the physical path was a hairpin.. you just want to provide more bandwidth across all your clients talking to each other.
If you KNOW clients in A don't talk to X and Y doesn't talk to B.. then you might do something like
(A,X) (B,Y) for your 2 physical uplinks.
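A small sketch of how that (A,X) (B,Y) placement could be sanity-checked against a list of who talks to whom. The helper `hairpin_pairs` and the traffic list are hypothetical, just to illustrate the idea of keeping VLANs that talk to each other on different uplinks.

```python
from typing import Dict, Iterable, List, Tuple


def hairpin_pairs(
    uplink_of: Dict[str, int],
    talking_pairs: Iterable[Tuple[str, str]],
) -> List[Tuple[str, str]]:
    """Return the VLAN pairs whose traffic goes in and out on the same uplink."""
    return [(a, b) for a, b in talking_pairs if uplink_of[a] == uplink_of[b]]


# Placement from the post: (A,X) share uplink 0, (B,Y) share uplink 1.
placement = {"A": 0, "X": 0, "B": 1, "Y": 1}
# Hypothetical traffic list: A never talks to X, and Y never talks to B.
talks = [("A", "B"), ("A", "Y"), ("X", "B"), ("X", "Y")]

print(hairpin_pairs(placement, talks))  # -> []
```

Swapping the grouping to (A,B) (X,Y) with the same traffic list would report ("A", "B") and ("X", "Y") as hairpins, which is the situation the placement is meant to avoid.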
-
@johnpoz said in lagg bandwith issue:
how much of it, from what to what. And the goal, be it max possible data flow, or redundancy and increased flow when lots of clients are talking to lots of other things.
If your goal is knowing that vlan x talking to y will always have full 1 gig between each other...
Thanks for taking the time to explain, that is much appreciated - wow!
It makes perfect sense to me. I think the issue was my lack of understanding of how the LAGG (LACP) actually works. I assumed it would always be fully load-balanced between the two legs (or however many legs the lagg has). I'm wondering what the logic is for which one gets chosen?
It seems like it was not really random either, because in a random situation I would have been able to get 2 simultaneous full-speed connections 50% of the time, but in my case I never did. Would the same "hairpin" issue happen for traffic within the same VLAN though, where one connection could go on one leg but the other one might go on the same one? I'm assuming that with a large number of clients the lagg will be using both legs at some point, but technically you would have the same hairpin issue, correct?
I did follow your recommendation: I put the VLAN I'm most interested in on one uplink and use the other uplink for the rest of the VLANs (where I don't care too much), and that seems to be working fine :)
-
There are different methods that can be used to determine which path traffic will take over lacp. The variables include the dest mac, the source mac, the dest IP, the source IP, etc. A hash is created from these and then the devices involved use this hash to determine which physical path is taken.
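A minimal sketch of that idea, assuming the hash inputs are source/destination IP and port. CRC32 is only a stand-in here; it is not the hash pfSense or any particular switch actually uses, and `pick_link` and the addresses are made up for illustration.

```python
import zlib


def pick_link(src_ip: str, dst_ip: str, src_port: int, dst_port: int, n_links: int) -> int:
    """Map one conversation to a link index in [0, n_links).

    CRC32 is a stand-in; real LACP implementations use their own hashes.
    """
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}".encode()
    return zlib.crc32(key) % n_links


if __name__ == "__main__":
    # The same conversation always maps to the same link.
    first = pick_link("10.0.10.5", "10.0.20.7", 50000, 445, 2)
    again = pick_link("10.0.10.5", "10.0.20.7", 50000, 445, 2)
    print(first == again)  # True

    # Many conversations get spread out, but no single one can be steered.
    counts = [0, 0]
    for port in range(50000, 50100):
        counts[pick_link("10.0.10.5", "10.0.20.7", port, 445, 2)] += 1
    print(counts)
```

The point of the sketch is that the choice is fixed by the packet headers: a given conversation always lands on the same leg, and two conversations can easily land on the same one.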
You normally have more control over what will be used in a switch to switch sort of setup. And different switch makers have their own proprietary methods and algorithms.
You can get pretty deep down the rabbit hole pretty quickly if you want to ;) And you could run into out of order packets, which can be a real performance hit on tcp communications, depending on how you set up the connections. When packets can take different physical paths it is possible for packet A sent before B to get to its destination after B..
This is why sometimes the simpler approach to making sure you have the bandwidth you want for specific traffic flow patterns is not to lagg, and to use specific uplinks for your specific vlans or networks.
Sure, for example if you have say a server vlan and then wireless vlans - put the server vlan on its own dedicated interface and then lump the wireless vlans onto another uplink. You are now 100% sure that traffic to and from the server(s) will flow over both the server vlan uplink and the clients vlan uplink. And this intervlan traffic could never run into a hairpin sort of traffic flow.
But you run into the problem that hey, when these vlans you lumped onto the 1 physical interface want to talk to each other, you know it's going to be a hairpin.
This is why you need to understand your traffic flow patterns and amounts of traffic, who exactly will be talking to who and in what amounts, to be able to determine the optimal setup with your goals in mind. Sure, lagging all your interfaces and throwing all your vlans on the lagg makes for a simple setup, and sure overall you're going to have more bandwidth available for all your conversations. But for any specific conversation you might not have optimal flow paths ;)
So which way you do it for your specific needs might be different than how someone else would do it for their needs.
The problem is that many users think that hey, I lump 4 gig interfaces together so I have a big fat 4 gig pipe, and that just is not the case..
-
If you want to use the power of LACP, you need the same load balancing algo on both sides.
pfsense is L2,L3,L4, but what does your switch do, L2?
If you can use L3 or L3,L4 it balances both lines very nicely.
If you use L3,L4 on one side and L2 on the other, it will break the load balancing. Sorry, but Netgear is not the best in networking, rather the opposite.
-
@NOCling Seems like the switch is able to do those LACP load balancing modes. I did try them all but with the same result - however, I did NOT change it at the pfsense level. The switch I'm using is a Netgear GS324TP.
-
Mmm, more config options there than I would have expected.
-
Those config options are just what is used for the hash to figure out which path to take.. They have little to do with knowing which path any specific conversation will take..
As I stated, in the big picture your traffic will be spread across the physical paths in a lagg - but for any specific conversation there is no way to be sure the traffic will not hairpin..
-
@johnpoz yeah, exactly, I don't think any other switch would make it work as intended. LACP is just not meant to distribute the load equally, it's based on a hash, which you can't really predict or control...
-
Mode 6 is the correct one to work with your pfsense setting.
Restart both devices to be sure it will work.
Is there a firmware upgrade for the switch? Are there any LAG/LACP bugfixes in the release notes?