LACP not working
-
@stephenw10 said in LACP not working:
Well it should work with either as long as both sides are set the same. The switch ports are not showing Exp (expired) which I would expect if pfSense was still using Long timeouts.
I know Long works though.
The port operational state (Ope) is still showing Err:
Err: If there is a peer information mismatch, then that particular port is moved to the Error disable state (Err).
https://docs.commscope.com/bundle/fastiron-08095-commandref/page/GUID-06AFF73D-6957-44A5-AF25-3527B2BE1580.html
I would have expected that to be logged still.
Can we see the full 'show lag' output?
Steve
On the switch side I don't see any logs about LACP, only entries for when I log in using SSH, etc.
I have changed it to long but it still shows blocked.
=== LAG "LAN" ID 1 (dynamic Deployed) ===
LAG Configuration:
   Ports:         e 1/1/2 e 2/1/2
   Port Count:    2
   Primary Port:  1/1/2
   Trunk Type:    hash-based
   LACP Key:      20001
   LACP Timeout:  long
Deployment: HW Trunk ID 1
Port    Link  State    Dupl  Speed  Trunk  Tag  Pvid  Pri  MAC  Name
1/1/2   Up    Blocked  Full  1G     1      Yes  N/A   0         LAN1
2/1/2   Up    Blocked  Full  1G     1      Yes  N/A   0         LAN2
Port    [Sys P] [Port P] [ Key ] [Act][Tio][Agg][Syn][Col][Dis][Def][Exp][Ope]
1/1/2   1       1        20001   Yes  L    Agg  No   No   No   No   No   Ina
2/1/2   1       1        20001   Yes  L    Agg  Syn  Col  Dis  Def  No   Err
-
We need to see the full output, including the partner info.
The first port there is showing it's not seeing any LACP packets from the other side.
-
Hi Steve
What full output are you referring to? On pfSense or the switch?
On the switch side the log only shows who connected over SSH and at what time, nothing else.
-
The complete output from show lag on the switch contains the Partner info and PDU stats at the end, which show what is connected and on which port. It also shows the system-wide lag parameters at the top on my switch. It would be good to compare those.
Your switches are different to mine though; the output is very similar but not identical.
You posted a more complete output here: https://forum.netgate.com/post/947906
Steve
-
Hi Steve,
Are you referring to these?
=== LAG "LAN" ID 1 (dynamic Deployed) ===
LAG Configuration:
   Ports:         e 1/1/2 e 2/1/2
   Port Count:    2
   Primary Port:  1/1/2
   Trunk Type:    hash-based
   LACP Key:      20001
   LACP Timeout:  long
Deployment: HW Trunk ID 1
Port    Link  State    Dupl  Speed  Trunk  Tag  Pvid  Pri  MAC             Name
1/1/2   Up    Blocked  Full  1G     1      Yes  N/A   0    609c.9f4b.606d  LAN1
2/1/2   Up    Blocked  Full  1G     1      Yes  N/A   0    609c.9f4b.606d  LAN2
Port    [Sys P] [Port P] [ Key ] [Act][Tio][Agg][Syn][Col][Dis][Def][Exp][Ope]
1/1/2   1       1        20001   Yes  L    Agg  Syn  Col  Dis  Def  No   Err
2/1/2   1       1        20001   Yes  L    Agg  Syn  Col  Dis  Def  No   Err
Partner Info and PDU Statistics
Port    Partner               Partner  LACP      LACP
        System ID             Key      Rx Count  Tx Count
1/1/2   32768-e839.3511.faab  363      0         237550
2/1/2   32768-e839.3511.faab  363      0         237550

lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
    options=800008<VLAN_MTU>
    ether e8:39:35:11:fa:ab
    inet6 fe80::ea39:35ff:fe11:faab%lagg0 prefixlen 64 scopeid 0xb
    inet 192.168.73.1 netmask 0xffffff00 broadcast 192.168.73.255
    laggproto lacp lagghash l2,l3,l4
    lagg options:
        flags=10<LACP_STRICT>
        flowid_shift: 16
    lagg statistics:
        active ports: 2
        flapping: 291
    lag id: [(8000,E8-39-35-11-FA-AB,016B,0000,0000),
             (0001,60-9C-9F-4B-80-8C,4E21,0000,0000)]
    laggport: em2 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING> state=3d<ACTIVITY,AGGREGATION,SYNC,COLLECTING,DISTRIBUTING>
        [(8000,E8-39-35-11-FA-AB,016B,8000,0003),
         (0001,60-9C-9F-4B-80-8C,4E21,0001,0002)]
    laggport: em3 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING> state=3d<ACTIVITY,AGGREGATION,SYNC,COLLECTING,DISTRIBUTING>
        [(8000,E8-39-35-11-FA-AB,016B,8000,0004),
         (0001,60-9C-9F-4B-80-8C,4E21,0001,0102)]
    groups: lagg
    media: Ethernet autoselect
    status: active
    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
-
Ok, so from that we can see that the system ID the switch sees on both those ports matches the ID pfSense has set on its lagg ports.
However we cannot see the switch system ID. pfSense sees it as:
0001,60-9C-9F-4B-80-8C
That info is at the top of the show lag output from the switch as I showed above:
SSH@ICX6450-24P Switch>show lag
Total number of LAGs:          1
Total number of deployed LAGs: 1
Total number of trunks created:1 (123 available)
LACP System Priority / ID:     1 / 609c.9f54.14f2
LACP Long timeout:             90, default: 90
LACP Short timeout:            3, default: 3
It obviously should match but....
What we can see is that the switch has recorded precisely 0 LACPDUs received.
Steve
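For anyone comparing the two outputs by hand: ifconfig prints each LACP peer as (system priority, system MAC, key, port priority, port number) in hex, while the switch prints decimal priorities and keys and dotted MACs. Below is a minimal Python sketch of the conversion, using the values posted above. The function name decode_lagg_peer is made up for illustration and is not part of pfSense or FastIron.

```python
def decode_lagg_peer(peer):
    """Decode one '(prio,MAC,key,portprio,portno)' tuple as printed by ifconfig -v.

    All numeric fields are hexadecimal in the ifconfig output.
    """
    sys_prio, mac, key, port_prio, port_no = peer.strip("()").split(",")
    digits = mac.replace("-", "").lower()
    # Switch-style notation: three dot-separated groups of four hex digits
    dotted = ".".join(digits[i:i + 4] for i in range(0, 12, 4))
    return {
        "system_priority": int(sys_prio, 16),
        "system_id": dotted,
        "key": int(key, 16),
        "port_priority": int(port_prio, 16),
        "port_number": int(port_no, 16),
    }


# Values copied from the em2 laggport line in the ifconfig output above
actor = decode_lagg_peer("(8000,E8-39-35-11-FA-AB,016B,8000,0003)")
partner = decode_lagg_peer("(0001,60-9C-9F-4B-80-8C,4E21,0001,0002)")
print(actor)
print(partner)
```

The first tuple comes out as priority 32768, ID e839.3511.faab, key 363, which is what the switch lists as its partner. The second comes out as priority 1, ID 609c.9f4b.808c, key 20001, which is what pfSense believes the switch is.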
-
What are you suggesting?
If I connect a different switch, the ports come online and it shows LACP is fine.
-
To that switch or to pfSense?
I have once seen a similar issue to this that was eventually resolved by simply rebooting the switch stack. Something no-one had thought to do because generally switches do not require that sort of thing.
Beyond that I would try a lagg to ports on the same switch to remove the cross-chassis LACP as an issue.
Then I'm out of suggestions. You probably need to get Brocade/Ruckus/Commscope support involved at that point.
Steve
-
Hi Steve
I can do the reboot tonight. If I run the reboot while logged in, will it reboot both switches?
-
I can't answer that; I only have a single Brocade switch. I would hope some failover happens. I've never configured that.
Be sure to have saved the running config. Have a recovery plan etc.
Steve
-
I have rebooted the switch stack, but unfortunately it still shows blocked.
1000 MHz ARM processor ARMv7 88 MHz bus
8192 KB boot flash memory
2048 MB code flash memory
2048 MB DRAM
STACKID 1 system uptime is 3 minute(s) 34 second(s)
STACKID 2 system uptime is 3 minute(s) 31 second(s)
The system : started=warm start reloaded=by "reload"
My stack unit ID = 1, bootup role = active

=== LAG "LAN" ID 10 (dynamic Deployed) ===
LAG Configuration:
   Ports:         e 1/1/2 e 2/1/2
   Port Count:    2
   Primary Port:  1/1/2
   Trunk Type:    hash-based
   LACP Key:      20001
   LACP Timeout:  long
Deployment: HW Trunk ID 1
Port    Link  State    Dupl  Speed  Trunk  Tag  Pvid  Pri  MAC             Name
1/1/2   Up    Blocked  Full  1G     1      Yes  N/A   0    609c.9f4b.606d  LAN1
2/1/2   Up    Blocked  Full  1G     1      Yes  N/A   0    609c.9f4b.606d  LAN2
Port    [Sys P] [Port P] [ Key ] [Act][Tio][Agg][Syn][Col][Dis][Def][Exp][Ope]
1/1/2   1       1        20001   Yes  L    Agg  Syn  Col  Dis  Def  No   Err
2/1/2   1       1        20001   Yes  L    Agg  Syn  Col  Dis  Def  No   Err
Partner Info and PDU Statistics
Port    Partner               Partner  LACP      LACP
        System ID             Key      Rx Count  Tx Count
1/1/2   32768-e839.3511.faab  363      0         59
2/1/2   32768-e839.3511.faab  363      0         66
I saw the mismatch error:
Dec 7 23:47:37:I:Stack: Stack unit 2 has been assigned as STANDBY unit of the stack system
Dec 7 23:46:41:I:System: Logical link on dynamic lag interface ethernet 2/1/2 is down.
Dec 7 23:46:41:I:System: Logical link on dynamic lag interface ethernet 2/1/11 is up.
Dec 7 23:46:41:I:System: Interface ethernet 2/1/11, state up
Dec 7 23:46:40:I:System: Logical link on dynamic lag interface ethernet 2/1/12 is up.
Dec 7 23:46:40:I:System: Interface ethernet 2/1/12, state up
Dec 7 23:46:40:I:Trunk: Group (1/1/11, 1/1/12, 2/1/11, 2/1/12) created by 802.3ad link-aggregation module.
Dec 7 23:46:38:I:System: Logical link on dynamic lag interface ethernet 2/1/2 is down.
Dec 7 23:46:38:I:System: dynamic lag interface 2/1/2's peer info (priority=32768,id=e839.3511.faab,key=363) mis-matches with lag's peer info (priority=32768,id=609c.9f4b.808c,key=363), set to mismatch Error
Dec 7 23:46:36:I:System: Interface ethernet 2/1/48, state up
but after it disappeared it didn't come back.
I know FreeBSD is using key 363.
peer-info sys-mac MAC of the LACP sys-pri 32768 key 363
That fixed the mismatch problem, but the ports are still blocked.
I can't seem to see whether the lagg interface on pfSense is using the fast or the long timeout.
lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
    options=800008<VLAN_MTU>
    ether e8:39:35:11:fa:ab
    inet6 fe80::ea39:35ff:fe11:faab%lagg0 prefixlen 64 scopeid 0xb
    inet 192.168.73.1 netmask 0xffffff00 broadcast 192.168.73.255
    laggproto lacp lagghash l2,l3,l4
    laggport: em2 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
    laggport: em3 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
    groups: lagg
    media: Ethernet autoselect
    status: active
    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
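On the timeout question: the short ifconfig output does not show it, but the verbose output posted earlier in the thread has state=3d on each laggport. As far as I understand it, that byte is the LACP actor state from IEEE 802.1AX, where bit 1 is the timeout flag (set means short, clear means long). Here is a small Python sketch that decodes it. The bit names follow what ifconfig prints, and decode_lacp_state is only an illustrative name.

```python
# LACP actor/partner state bits, lowest bit first
LACP_STATE_BITS = [
    (0x01, "ACTIVITY"),
    (0x02, "TIMEOUT"),       # set = short timeout, clear = long timeout
    (0x04, "AGGREGATION"),
    (0x08, "SYNC"),
    (0x10, "COLLECTING"),
    (0x20, "DISTRIBUTING"),
    (0x40, "DEFAULTED"),
    (0x80, "EXPIRED"),
]


def decode_lacp_state(state):
    """Return the names of the bits set in an LACP state byte."""
    return [name for bit, name in LACP_STATE_BITS if state & bit]


state = 0x3d  # value from the earlier verbose ifconfig output
flags = decode_lacp_state(state)
timeout = "short" if "TIMEOUT" in flags else "long"
print(flags)
print("timeout:", timeout)
```

For 0x3d the TIMEOUT bit is clear, so if this reading is right the pfSense side was advertising the long timeout when that output was taken.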
-
The only thing I have not seen from your switch is the LACP ID value. My switch shows that at the top of the show lag output but either yours doesn't or you're just not including it here.
We can guess it's 609c.9f4b.808c because that's what the error logs and the ifconfig output from pfSense show. But if it isn't that, it would be mismatched.
Steve
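To pull the two peer-info tuples out of that mismatch log line and see which field actually differs, something like the following Python sketch works. The regex is written against the single log line posted above, so the format is an assumption and may not hold on other FastIron releases. The name mismatched_fields is hypothetical.

```python
import re

# The mismatch line exactly as posted in the switch log above
LOG = ("Dec 7 23:46:38:I:System: dynamic lag interface 2/1/2's peer info "
       "(priority=32768,id=e839.3511.faab,key=363) mis-matches with lag's "
       "peer info (priority=32768,id=609c.9f4b.808c,key=363), set to mismatch Error")

PEER_RE = re.compile(r"\(priority=(\d+),id=([0-9a-f.]+),key=(\d+)\)")


def mismatched_fields(line):
    """Return {field: (port_value, lag_value)} for every field that differs."""
    peers = PEER_RE.findall(line)
    if len(peers) != 2:
        raise ValueError("expected two peer-info tuples in the log line")
    port_peer, lag_peer = peers
    names = ("priority", "id", "key")
    return {n: (a, b) for n, a, b in zip(names, port_peer, lag_peer) if a != b}


diff = mismatched_fields(LOG)
print(diff)
```

Priority and key are the same on both sides of that line; only the id differs (e839.3511.faab on the port against 609c.9f4b.808c recorded for the lag).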
-
@stephenw10 said in LACP not working:
609c.9f4b.808c
Thank you for your answer Steve,
I have managed to get the 609c.9f4b.808c corrected. It was the second MAC of a different LAG.
I've removed that lag and the issue is fixed.
Both pfSense and the switch share the same LACP ID now.
What information do you need from the switch?
-
Ah, so the lag to pfSense came up correctly with the 2nd lag removed?
On my switch the first lines of output from show lag are:
SSH@ICX6450-24P Switch>show lag
Total number of LAGs:          1
Total number of deployed LAGs: 1
Total number of trunks created:1 (123 available)
LACP System Priority / ID:     1 / 609c.9f54.14f2
LACP Long timeout:             90, default: 90
LACP Short timeout:            3, default: 3
I was hoping to compare that output with what your switch(es) show. But as they are not the same as mine, it may not show that.
This seems like it could be something like different LACP IDs between the two switches in the stack. I have no way to test that.
Steve
-
Hi Steve,
I thought that info wasn't relevant, apologies.
See below.
SSH@Ruckus@Berlin#show lag
Total number of LAGs:          2
Total number of deployed LAGs: 2
Total number of trunks created:2 (254 available)
LACP System Priority / ID:     1 / 609c.9f4b.808c
LACP Long timeout:             120, default: 120
LACP Short timeout:            3, default: 3

=== LAG "LAN" ID 1 (dynamic Deployed) ===
LAG Configuration:
   Ports:         e 1/1/2 e 2/1/2
   Port Count:    2
   Primary Port:  1/1/2
   Trunk Type:    hash-based
   LACP Key:      20001
   LACP Timeout:  long
Deployment: HW Trunk ID 1
Port    Link  State    Dupl  Speed  Trunk  Tag  Pvid  Pri  MAC             Name
1/1/2   Up    Blocked  Full  1G     1      Yes  N/A   0    609c.9f4b.808d  LAN1
2/1/2   Up    Blocked  Full  1G     1      Yes  N/A   0    609c.9f4b.808d  LAN2
Port    [Sys P] [Port P] [ Key ] [Act][Tio][Agg][Syn][Col][Dis][Def][Exp][Ope]
1/1/2   1       1        20001   Yes  L    Agg  Syn  Col  Dis  Def  No   Err
2/1/2   1       1        20001   Yes  L    Agg  Syn  Col  Dis  Def  No   Err
Partner Info and PDU Statistics
Port    Partner               Partner  LACP      LACP
        System ID             Key      Rx Count  Tx Count
1/1/2   32768-e839.3511.faab  363      0         49297
2/1/2   32768-e839.3511.faab  363      0         49303
-
Ok, so it's still down there. Removing the other lag did not allow it to come up?
We see the LACP ID though and as expected it matches what pfSense is seeing.
-
Which lag are you referring to? Do you mean recreate the lag?
-
@cyberbot said in LACP not working:
I have managed to get the 609c.9f4b.808c corrected. It was the second MAC of a different LAG.
I've removed that lag and the issue is fixed.
Whatever you removed there, that fixed it.
-
I am not sure I can follow, sorry :)
I haven't removed anything and it's not fixed yet.
My question was: do you mean I have to remove the already created LAG and create it again?
-
You may well have to re-deploy it on the switches to have it use the new settings.
I can only make an educated guess at this point.
What exactly did you do before, when you said that was fixed? And what was it that was fixed?
Steve