LAG & Interface issues & Now Packetloss - Solved
-
Ok I just configured LAG today between pfSense & my Cat3750 switch. Yes LACP is enabled & properly configured.
I don't know what magically happened or why pfSense shit the bed halfway through my conversion.
After everything was set up, I tested moving one VLAN subnet over to the LAG. I went to Interfaces -> Assign -> VLAN & switched VLAN 13 to interface LAGG0, then went back to Assign Interfaces & switched that interface to VLAN 13 on LAGG0.
I verified everything worked with that first subnet; the attached PC could access the internet with no problems. I started moving over other subnets & now, randomly, a few subnets will not hand out any IPs or do DHCP. On the default pfSense webpage, which lists the interfaces on the right, some subnets show nothing where an address such as 192.168.x.1 should be. Other subnets show the gateway as 192.168.x.1; all of those interfaces have been assigned to the LAGG & are working. I don't understand how all of my interfaces can be configured THE EXACT SAME WAY & yet some of them have broken. I've tried everything to make them work & they will not show any active gateway under Interfaces, even though the working and nonworking interfaces are configured exactly the same.
Anyone want to take a guess at what has magically happened?
EDIT: Just got it fixed. Developers should look into this, but I guess sometimes DHCP breaks or something when interfaces are moved over. I went into the problematic interface under the main "Interface" tab at the top, completely disabled the interface, applied changes & reset it to static IPv4 with the range I had previously configured. Now the gateway is displayed in the interface list on the homepage & I just got a PC to pull an IP.
EDIT2: Pic of problem. Notice the TRUNK interface has no gateway info like the rest of the interfaces do.
-
Let me reiterate. Something is severely broken between the LAGG connection & interfaces. I just spent 2 hours trying to get my network back online.
I was adjusting some settings on the Cisco switch, mainly trying to trim some VLANs from the LAGG that didn't need to be there anymore. Somehow the network just went down & PCs weren't pulling IPs. I VPN'd into pfSense & it showed autoselect & a gateway under each interface, so I assumed those were working. WRONG.
Just because the LAGG displays correctly, showing autoselect & a gateway on each interface, doesn't mean clients will pull IPs. Actually all IPs on my network are static, but they still wouldn't pull IP info. After 2+ hours I reset all the settings back to exactly the way they were earlier when it was working, & it still didn't work. On a whim I tried disabling an interface & re-enabling it. ANY interface that is part of the LAGG can be disabled & re-enabled, & ALL subnets will magically start working again.
Can any developers take a look at this issue? If someone doesn't have a way to VPN in to gain access (like I did from a phone), then you're pretty much locked out of your network & would have to mess with the interfaces through the CLI. I don't know how to enable/disable interfaces through the CLI, so I can't comment on that.
EDIT: Also want to note I did try a full reboot of the pfSense machine & the problem persisted even after the reboot. Only disabling & re-enabling the interface actually fixed the issue.
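On the CLI question: pfSense is built on FreeBSD, where an interface can be taken down and brought back up with "ifconfig <name> down" and "ifconfig <name> up". Below is a minimal Python sketch that builds (and optionally runs) those two commands. It assumes the interface is called lagg0 and that Python is available on the box; bounce_commands and bounce are made-up helper names. Whether this has the same effect as disabling & re-enabling the interface in the GUI is untested.

```python
import subprocess
from typing import List


def bounce_commands(ifname: str) -> List[List[str]]:
    """Return the two ifconfig commands: interface down, then up."""
    return [["ifconfig", ifname, "down"], ["ifconfig", ifname, "up"]]


def bounce(ifname: str, dry_run: bool = True) -> None:
    """Print the commands (dry run) or actually run them (needs root)."""
    for cmd in bounce_commands(ifname):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Assumption: the LAGG is named lagg0; dry run only prints the commands.
    bounce("lagg0")
```

Run from the console with dry_run=False only if you are not connected through the interface you are about to take down.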
-
Ok, my server load is low in the mornings, so I moved my server over to the LAG interface. The server runs a Teamspeak 3 server. I now have major packetloss going on. I had none at all before, so I know this is related to the LAG configuration.
I tried to do some reading, but I'm completely baffled by this. I did a Wireshark capture of traffic on the server in question. The data being sent is UDP. According to Google searches on using Wireshark to find UDP packetloss, you can't. So I'm confused: if Wireshark can't detect packetloss for UDP, how the hell can pfSense & Teamspeak 3 both measure & notice it?
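On the Wireshark question: a bare UDP header has no sequence number, so a capture on its own cannot tell that a datagram is missing. Anything that reports UDP loss has to number its own packets and look for gaps (or, like a ping-based monitor, count replies that never come back). Here is a minimal Python sketch of the gap-counting idea, using a made-up list of received sequence numbers; count_lost is a hypothetical helper, not anything taken from Teamspeak or pfSense.

```python
from typing import Iterable


def count_lost(seqs: Iterable[int]) -> int:
    """Count sequence numbers missing between the lowest and highest seen.

    Assumes the sender numbers packets consecutively with no wraparound.
    Reordered and duplicated packets are tolerated.
    """
    seen = set(seqs)
    if not seen:
        return 0
    expected = max(seen) - min(seen) + 1
    return expected - len(seen)


# Made-up example: packets 3 and 6 never arrived, 7 and 8 arrived swapped.
received = [0, 1, 2, 4, 5, 8, 7]
print("lost", count_lost(received), "packets")  # prints: lost 2 packets
```

The receiver only needs the counter the application put in the payload, which is why the application can see loss that a plain capture cannot.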
The only other thing I have found out is that the server is set up as a VM in ESXi. I read that the vSwitch needs to be set to IP hash load balancing. I turned that on & it didn't fix anything.
Anyone got any suggestions why LACP/LAG is causing packetloss?
EDIT: Ha, after lots of reading, apparently regular vSwitches can't handle LACP, only the special switches in vCenter can. Regular ESXi hosts can only do static EtherChannel. I will try to adjust my setup in the AM to EtherChannel only, see if that fixes the packetloss & post back to confirm. Though I am confused about how my network even works if the VMs don't support LACP. They are running fine, just with packetloss.
-
Did you read this? https://doc.pfsense.org/index.php/Upgrade_Guide#LAGG_LACP_Behavior_Change
-
Here is a simple network diagram
http://corfumedia.com/uploads/diagram.jpg
-
Did you read this? https://doc.pfsense.org/index.php/Upgrade_Guide#LAGG_LACP_Behavior_Change
No I did not see that link before. However I do appear to have solved the problem as I indicated in the edit of my previous post.
On my Cisco 3750, under global config, I set "port-channel load-balance src-dst-ip". Under each of my 4 LAG ports on the switch I set "channel-group 2 mode on" (static EtherChannel, no LACP), "switchport mode trunk", "switchport trunk encapsulation dot1q", & "switchport trunk allowed vlan X-X".
Under interface "port-channel 2" I set the same trunk options: "switchport mode trunk", "switchport trunk encapsulation dot1q", & "switchport trunk allowed vlan X-X".
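For anyone wondering what "src-dst-ip" means in practice: the switch picks the member link from the source and destination IP, so a given pair of hosts always uses the same link and different pairs get spread across the links. The Python sketch below is only a simplified illustration of that idea; it is not Cisco's actual hash, and pick_link and the addresses are made up.

```python
import ipaddress


def pick_link(src: str, dst: str, n_links: int) -> int:
    """Pick a member link index from the XOR of source and destination IP.

    Simplified illustration only; a real switch uses its own hash.
    """
    s = int(ipaddress.ip_address(src))
    d = int(ipaddress.ip_address(dst))
    return (s ^ d) % n_links


# Made-up hosts talking to the same destination over a 4-port channel.
print(pick_link("192.168.13.10", "8.8.8.8", 4))  # prints: 2
print(pick_link("192.168.13.11", "8.8.8.8", 4))  # prints: 3
```

One consequence is that a single conversation never gets more than one link's worth of bandwidth, however many ports are in the channel.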
Packetloss looks to have cleared up. Though I'm confused: if LACP isn't supported for the VMs without vCenter & a distributed switch, how did they have any connectivity at all?
Here are the official VMware links about this. I switched my LAG interface mode to loadbalance in pfSense. I did try to read up on it & am still confused about the difference between the FEC and loadbalance options. I believe they both do the same thing.
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2034807
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2006129
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2034277