LACP not working

cyberbot

Like both cross-chassis?

Maybe check the port settings in the running config then. Something is generating that mismatch. The incoming 'lag's peer info' looks normal, similar to what I see. There is almost nothing to set in pfSense beyond the strict setting, which should work anyway since the switch is using active LACP.

My Brocade knowledge is exhausted at this point though.

Steve

I appreciate it, Steve.

We have asked for pfSense help on Monday; they will check this with us.
I'll report back with the findings.

One thing I have noticed: two LAGs are sharing the same MAC address.
Is this normal?

=== LAG "NAS" ID 11 (dynamic Deployed) ===
LAG Configuration:
   Ports:         e 1/1/11 to 1/1/12 e 2/1/11 to 2/1/12
   Port Count:    4
   Primary Port:  1/1/11
   Trunk Type:    hash-based
   LACP Key:      20011
Deployment: HW Trunk ID 3
Port       Link    State   Dupl Speed Trunk Tag Pvid Pri MAC             Name
1/1/11     Up      Forward Full 1G    11    Yes 141  0   609c.9f4b.808c
1/1/12     Up      Forward Full 1G    11    Yes 141  0   609c.9f4b.808c
2/1/11     Up      Forward Full 1G    11    Yes 141  0   609c.9f4b.808c
2/1/12     Up      Forward Full 1G    11    Yes 141  0   609c.9f4b.808c

=== LAG "WAN" ID 1 (dynamic Deployed) ===
LAG Configuration:
   Ports:         e 1/1/1 e 2/1/1
   Port Count:    2
   Primary Port:  1/1/1
   Trunk Type:    hash-based
   LACP Key:      20001
   LACP Timeout:  long
Deployment: HW Trunk ID 1
Port       Link    State   Dupl Speed Trunk Tag Pvid Pri MAC             Name
1/1/1      Down    None    None None  1     No  141  0   609c.9f4b.808c  WAN1
2/1/1      Down    None    None None  1     No  141  0   609c.9f4b.808c  WAN2

stephenw10

That's the switch side MAC address so it could be the same. I would not expect that to be an issue on a separate link / layer 2 segment.

Steve

cyberbot

Tomorrow the pfSense support team will check this for us; I'll report back with the findings.
Today I have been doing some reading and found that the port flap dampening configuration on the switch can cause such behaviour.
Does that ring a bell?
While researching I came across this:
https://forum.netgate.com/topic/137927/interface-stopped-destributing-possable-flapping/7

stephenw10

Well, if you have an internal loop in the switches like that guy had and no STP, it would certainly do it!

Do you mean you have opened a support ticket with us? Do you have the ticket number? I can add notes there so whoever works it knows what to look for.

Steve

cyberbot

Do you mean I don't have spanning tree enabled on the switches, or the other way around, that I should disable spanning tree?

I believe it's already enabled on the LAG. I also see that MAC learning is enabled. We have two pfSense boxes connected to the switch; one is on and one is off. Both boxes run the same configuration so that if the first one goes down we can fire up the second one. Could the MAC learning be causing this?

GigabitEthernet1/1/2 is up, line protocol is down (LACP-BLOCKED)
  Port down (LACP-BLOCKED) for 1 day(s) 14 hour(s) 28 minute(s) 40 second(s)
  Hardware is GigabitEthernet, address is 609c.9f4b.808d (bia 609c.9f4b.808d)
  Configured speed auto, actual 1Gbit, configured duplex fdx, actual fdx
  Configured mdi mode AUTO, actual MDIX
  EEE Feature Disabled
  Member of 7 L2 VLANs, port is tagged, port state is BLOCKING
  BPDU guard is Disabled, ROOT protect is Disabled, Designated protect is Disabled
  Link Error Dampening is Enabled
  STP configured to ON, priority is level0, mac-learning is enabled
  Openflow is Disabled, Openflow Hybrid mode is Disabled,  Flow Control is config enabled, oper enabled, negotiation disabled
  Mirror disabled, Monitor disabled
  Mac-notification is disabled
  Member of active trunk ports 1/1/2,2/1/2, primary port is 1/1/2
  Member of configured trunk ports 1/1/2,2/1/2, primary port is 1/1/2
  Port name is LAN1
  IPG MII 96 bits-time, IPG GMII 96 bits-time
  MTU 10200 bytes, encapsulation ethernet
  300 second input rate: 0 bits/sec, 0 packets/sec, 0.00% utilization
  300 second output rate: 928 bits/sec, 0 packets/sec, 0.00% utilization
  15187 packets input, 1943872 bytes, 0 no buffer
  Received 1 broadcasts, 15186 multicasts, 0 unicasts
  0 input errors, 0 CRC, 0 frame, 0 ignored
  0 runts, 0 giants
  154231 packets output, 19755504 bytes, 0 underruns
  Transmitted 214 broadcasts, 153930 multicasts, 86 unicasts
  0 output errors, 0 collisions
  Relay Agent Information option: Disabled

stephenw10

I don't think you have an STP problem since I would expect to see that logged very clearly.

If you are getting support from us it will be highly beneficial if I can add notes to any ticket you have open.

Steve

cyberbot

Hi Steve, thank you. We are using local support as we are in Europe.
That company has an engineer available to investigate with us.
Are you guys active in Europe?

I don't think it's spanning tree either. I've disabled MAC learning on the LAG but it still shows this:

Dec  1 01:58:16:I:System: dynamic lag interface 2/1/2's peer info (priority=5,id=d067.e5e6.fe1a,key=0) mis-matches with lag's peer info (priority=32768,id=d067.e5e6.fe1a,key=363), set to mismatch Error
Dec  1 01:58:16:I:System: dynamic lag interface 1/1/2's peer info (priority=6,id=d067.e5e6.fe1a,key=0) mis-matches with lag's peer info (priority=32768,id=d067.e5e6.fe1a,key=363), set to mismatch Error
Dec  1 01:58:16:I:System: Logical link on dynamic lag interface ethernet 2/1/2 is down.
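
For anyone comparing these messages by eye, here is a minimal Python sketch that pulls the two peer-info tuples out of one mismatch line and reports which fields differ. It assumes the log format shown above; `diff_peer_info` is a hypothetical helper name, not a switch or pfSense tool.

```python
import re

# Matches each "(priority=..,id=..,key=..)" group in the mismatch message.
PEER_INFO = re.compile(r"\(priority=(\d+),id=([0-9a-f.]+),key=(\d+)\)")

def diff_peer_info(log_line):
    """Return {field: (port_value, lag_value)} for every field that differs."""
    groups = PEER_INFO.findall(log_line)
    if len(groups) != 2:
        raise ValueError("expected two peer-info groups in the log line")
    fields = ("priority", "id", "key")
    # First group is the member port's view, second is the LAG's view.
    port, lag = (dict(zip(fields, g)) for g in groups)
    return {f: (port[f], lag[f]) for f in fields if port[f] != lag[f]}

line = ("Dec  1 01:58:16:I:System: dynamic lag interface 2/1/2's peer info "
        "(priority=5,id=d067.e5e6.fe1a,key=0) mis-matches with lag's peer info "
        "(priority=32768,id=d067.e5e6.fe1a,key=363), set to mismatch Error")
print(diff_peer_info(line))
```

For the 2/1/2 line this reports a difference in priority and key only; the peer id is the same on both sides.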

stephenw10

Ok. You can purchase support from us but not on-site.

It's showing the peer as having a different ID/MAC address now. Did you swap the ports in use in pfSense? Different priorities there also, 5&6 vs 3&4 previously. Something has changed there.

Also, nowhere in Europe is it 2am yet. What timezone is that switch set to?

Steve

cyberbot

The switch is in Europe +1. Do you think the time is the cause?
We are based in Germany. Do you think a time difference would cause this?

I see the switch is a day ahead of the actual date: it shows 1 Dec.
Nothing has changed on the switch and the cables are still the same as before.

On Wednesday they have arranged for an engineer to come on site to check and assist. If he cannot help I can reach out to you guys, but the time difference is going to make that difficult.

stephenw10

I doubt the clock offset would cause a problem for lagg. It does show that the switch is either not configured for NTP or unable to reach it, though. Or it is just set to the wrong timezone, and not UTC.

Previously the switch logs were showing this:

lag's peer info (priority=32768,id=e839.3511.faab,key=715)

That's the expected ID, it matches the MAC address of lagg0 in pfSense.
So what is this new ID? Is that a MAC you recognise?

It throws doubt on what is physically connected to what.
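
Since the switch prints MACs in dotted groups of four while pfSense shows them colon-separated, a quick normalisation makes the comparison less error-prone. This is a minimal Python sketch; `normalize_mac` is a hypothetical helper, and the colon-form value is simply the expected ID above rewritten in that notation.

```python
def normalize_mac(mac):
    """Strip separators and lowercase so differently formatted MACs compare equal."""
    digits = "".join(c for c in mac.lower() if c in "0123456789abcdef")
    if len(digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    return digits

expected_peer_id = "e839.3511.faab"   # peer ID from the earlier switch log
lagg0_mac = "e8:39:35:11:fa:ab"       # same address in colon notation (assumed display format)
new_peer_id = "d067.e5e6.fe1a"        # peer ID from the latest switch log

print(normalize_mac(expected_peer_id) == normalize_mac(lagg0_mac))  # same address
print(normalize_mac(new_peer_id) == normalize_mac(lagg0_mac))       # different address
```

Running the same check against the MAC of every lagg and physical interface on the firewall would show which one the switch is actually seeing.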

Steve

cyberbot

This is possibly because I connected two different cables last time, to test.
I have ports em2 and em3 connected to port 8 on each switch, and ebc1 and etc 2 on port 2 on each switch.
Maybe that's why? The current LAG is running on port 2 on each switch. I can disable that LACP and try the port 8 LACP.
I have tried a different port group, port 8 on each switch, and this error shows up.
So it appears I have to configure a priority on the LAG, but I don't know where or how. I have never seen it before.

Nov 30 23:43:40:I:System: dynamic lag interface 1/1/8's peer info (priority=3,id=e839.3511.faab,key=0) mis-matches with lag's peer info (priority=32768,id=e839.3511.faab,key=363), set to mismatch Error

stephenw10

Seems more like you have to omit the priority in the switch/port to have it match what looks like the default value pfSense is sending. But it's not something I've had to do before.

Steve

stephenw10

In fact there is no way to set a priority in pfSense:

https://www.freebsd.org/cgi/man.cgi?lagg(4)#BUGS

     There is no way to configure LACP administrative variables, including
     system and port priorities.  The current implementation always performs
     active-mode LACP and uses 0x8000 as system and port priorities.

So that must be in the switch config somewhere. It looks like it's set to 0 in the output you have posted but there must be somewhere else it's pulling in that value.
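
As a quick sanity check on the numbers, 0x8000 is the 32768 that the switch logs as the lag's peer priority, while the per-port values it logged (3 to 6) are not. A minimal Python sketch, with the constant name chosen only for illustration:

```python
# Priority that lagg(4) says it always uses for system and port priorities.
LAGG_DEFAULT_PRIORITY = 0x8000

def matches_lagg_default(priority):
    """True if a priority value from the switch log equals the lagg(4) default."""
    return priority == LAGG_DEFAULT_PRIORITY

print(LAGG_DEFAULT_PRIORITY)  # decimal form of 0x8000
# Port priorities the switch reported for individual member ports in this thread.
for seen in (3, 4, 5, 6):
    print(seen, matches_lagg_default(seen))
```

None of the per-port values match, which is consistent with the priority coming from something other than pfSense.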

Steve

cyberbot

Hi Steve,
Today we had a call with a pfSense engineer; however, the issue still appears to be with the switch.

pfSense is sending LACP and binding the ports nicely,
but the switch is still blocking the ports and we cannot see any logs on either the switch or the firewall.
Unfortunately the issue remains unresolved.

stephenw10

Yes, it looks like a switch issue to me too.

Have you been able to test a lagg to a single switch?

Check the full switch config from both switches. It must be pulling in the port priority from somewhere.

Steve

cyberbot

Hi Steve,
We were able to fix the priority by specifying it explicitly, and the flapping errors are gone, but the switch is still blocking the LAG interfaces.

stephenw10

No longer logging the mismatch in the switch?

cyberbot

No, nothing on the switch and no flapping on the firewall,
but when we run show lag it still shows the LAG as blocked.

stephenw10

Can we see the current output from both sides?

cyberbot

@stephenw10

Yes, of course, only there is no log now; nothing is happening at all.