LACP not working
-
@stephenw10 said in LACP not working:
Is that the console output rather than the system log? It looks like it.
You should see more than that, at least when it initially comes up. Check the system log in the webgui or use:
clog /var/log/system.log
at the CLI.
Steve
Thank you Steve,
I owe you plenty of
I have run the command and it shows a lot of information. I've collected the entries that mention LACP:
Nov 23 21:13:19 firewall /flowd_aggregate.py[92972]: start watching flowd
Nov 23 21:13:23 firewall /flowd_aggregate.py[92972]: flowparser failed to unpack flow_times (unpack requires a buffer of 8 bytes)
Nov 23 21:13:23 firewall /flowd_aggregate.py[92972]: flowparser failed to unpack as_info (unpack requires a buffer of 12 bytes)
Nov 23 21:13:23 firewall /flowd_aggregate.py[92972]: flowparser failed to unpack flow_engine_info (unpack requires a buffer of 12 bytes)
Nov 23 21:13:23 firewall /flowd_aggregate.py[92972]: flowparser failed to unpack srcdst_port (unpack requires a buffer of 4 bytes)
Nov 23 21:13:23 firewall /flowd_aggregate.py[92972]: flowparser failed to unpack flow_times (unpack requires a buffer of 8 bytes)
Nov 23 21:13:23 firewall /flowd_aggregate.py[92972]: flowparser failed to unpack flow_engine_info (unpack requires a buffer of 12 bytes)
Nov 23 21:13:23 firewall /flowd_aggregate.py[92972]: flowparser failed to unpack tag (unpack requires a buffer of 4 bytes)
Nov 23 21:13:23 firewall /flowd_aggregate.py[92972]: flowparser failed to unpack recv_time (unpack requires a buffer of 8 bytes)
Nov 23 21:13:23 firewall /flowd_aggregate.py[92972]: flowparser failed to unpack srcdst_port (unpack requires a buffer of 4 bytes)
Nov 23 21:13:23 firewall /flowd_aggregate.py[92972]: flowparser failed to unpack flow_times (unpack requires a buffer of 8 bytes)
Nov 23 21:13:23 firewall /flowd_aggregate.py[92972]: flowparser failed to unpack flow_engine_info (unpack requires a buffer of 12 bytes)
Nov 23 21:12:06 firewall kernel: pflog0: promiscuous mode disabled
Nov 23 21:12:06 firewall kernel: pflog0: promiscuous mode enabled
Nov 23 21:12:06 firewall /flowd_aggregate.py[92972]: startup, check database.
on the switch side I have this
=== LAG "Pfsense" ID 1 (dynamic Deployed) ===
LAG Configuration:
  Ports:         e 1/2/2 e 1/2/4
  Port Count:    2
  Primary Port:  1/2/2
  Trunk Type:    hash-based
  LACP Key:      20001
Deployment: HW Trunk ID 1
Port    Link  State    Dupl  Speed  Trunk  Tag  Pvid  Pri  MAC             Name
1/2/2   Up    Blocked  Full  1G     1      Yes  18    0    609c.9f3a.a488
1/2/4   Up    Blocked  Full  1G     1      Yes  18    0    609c.9f3a.a488
Port    [Sys P] [Port P] [ Key ] [Act][Tio][Agg][Syn][Col][Dis][Def][Exp][Ope]
1/2/2   1       1        20001   Yes  S    Agg  Syn  Col  Dis  Def  No   Ina
1/2/4   1       1        20001   Yes  S    Agg  Syn  No   No   Def  No   Ina
-
OK, the flowd stuff is not relevant. You need to look at the lacpdu output, which should look at least vaguely like my output above.
It's all from the 'kernel' process, so you can filter by that to lose all the netflow entries.
The fact the switch shows the ports as blocked is not good...
Steve
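For anyone following along, here is a minimal Python sketch of the "filter by the kernel process" idea. It assumes the log text has already been captured into a string (for example from the clog output) and that each entry follows the usual "Mon DD HH:MM:SS host process: message" layout seen in this thread. The function name `kernel_lines` is just an illustration, not part of pfSense.

```python
def kernel_lines(log_text):
    """Return only the log lines whose process field is 'kernel'."""
    kept = []
    for line in log_text.splitlines():
        # Assumed layout: "Mon DD HH:MM:SS host process: message"
        fields = line.split(None, 4)
        if len(fields) == 5 and fields[4].startswith("kernel:"):
            kept.append(line)
    return kept


# Three entries copied from the logs posted in this thread.
sample = """Nov 23 21:13:19 firewall /flowd_aggregate.py[92972]: start watching flowd
Nov 23 21:12:06 firewall kernel: pflog0: promiscuous mode disabled
Nov 24 00:13:35 firewall kernel: em2: lacpdu receive"""

for line in kernel_lines(sample):
    print(line)
```

This drops the flowd_aggregate.py entries and keeps only the kernel ones, which is where the lacpdu debug output appears.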
-
@stephenw10 said in LACP not working:
OK, the flowd stuff is not relevant. You need to look at the lacpdu output, which should look at least vaguely like my output above.
It's all from the 'kernel' process, so you can filter by that to lose all the netflow entries.
The fact the switch shows the ports as blocked is not good...
Steve
Thank you Steve,
If the switch shows the ports as blocked, is the cause more likely the switch or pfSense?
-
Hi Steve.
I have this log now, after connecting the cables:
Nov 24 00:13:35 firewall kernel: em2: lacpdu receive
Nov 24 00:13:35 firewall kernel: actor=(0001,60-9C-9F-4B-80-8C,4E22,0001,0002)
Nov 24 00:13:35 firewall kernel: actor.state=7<ACTIVITY,TIMEOUT,AGGREGATION>
Nov 24 00:13:35 firewall kernel: partner=(8000,E8-39-35-11-FA-AB,016B,8000,0003)
Nov 24 00:13:35 firewall kernel: partner.state=1d<ACTIVITY,AGGREGATION,SYNC,COLLECTING>
Nov 24 00:13:35 firewall kernel: maxdelay=0
Nov 24 00:13:35 firewall kernel: em2: old pstate cf<ACTIVITY,TIMEOUT,AGGREGATION,SYNC,DEFAULTED,EXPIRED>
Nov 24 00:13:35 firewall kernel: em2: new pstate f<ACTIVITY,TIMEOUT,AGGREGATION,SYNC>
Nov 24 00:13:35 firewall kernel: em3: lacpdu transmit
Nov 24 00:13:35 firewall kernel: actor=(8000,E8-39-35-11-FA-AB,016B,8000,0004)
Nov 24 00:13:35 firewall kernel: actor.state=1d<ACTIVITY,AGGREGATION,SYNC,COLLECTING>
Nov 24 00:13:35 firewall kernel: partner=(0001,60-9C-9F-4B-80-8C,4E22,0001,0102)
Nov 24 00:13:35 firewall kernel: partner.state=cf<ACTIVITY,TIMEOUT,AGGREGATION,SYNC,DEFAULTED,EXPIRED>
Nov 24 00:13:35 firewall kernel: maxdelay=0
Nov 24 00:13:35 firewall kernel: em2: lacpdu transmit
Nov 24 00:13:35 firewall kernel: actor=(8000,E8-39-35-11-FA-AB,016B,8000,0003)
Nov 24 00:13:35 firewall kernel: actor.state=1d<ACTIVITY,AGGREGATION,SYNC,COLLECTING>
Nov 24 00:13:35 firewall kernel: partner=(0001,60-9C-9F-4B-80-8C,4E22,0001,0002)
Nov 24 00:13:35 firewall kernel: partner.state=f<ACTIVITY,TIMEOUT,AGGREGATION,SYNC>
Nov 24 00:13:35 firewall kernel: maxdelay=0
The switch still shows the ports as blocked:
LAG Configuration:
  Ports:         e 1/1/2 e 2/1/2
  Port Count:    2
  Primary Port:  1/1/2
  Trunk Type:    hash-based
  LACP Key:      20002
Deployment: HW Trunk ID 2
Port    Link  State    Dupl  Speed  Trunk  Tag  Pvid  Pri  MAC             Name
1/1/2   Up    Blocked  Full  1G     2      Yes  N/A   0    609c.9f4b.808d  LAN1
2/1/2   Up    Blocked  Full  1G     2      Yes  N/A   0    609c.9f4b.808d  LAN2
Port    [Sys P] [Port P] [ Key ] [Act][Tio][Agg][Syn][Col][Dis][Def][Exp][Ope]
1/1/2   1       1        20002   Yes  S    Agg  No   No   No   No   No   Ina
2/1/2   1       1        20002   Yes  S    Agg  Syn  No   No   Def  Exp  Err
-
The switch probably blocked the ports for the same reason pfSense stopped the interfaces.
They were flapping for some reason and leaving them running like that could cause far more problems. We have to try to find out why.
I would disconnect and reconnect the ports and see what is logged.
Try connecting only one port and see if that prevents the flapping.
Steve
-
After connecting the cables one at a time it still shows flapping, and this error comes up:
Nov 24 01:07:31 firewall configctl[3820]: event @ 1606176450.60 exec: system event config_changed
Nov 24 01:07:38 firewall kernel: em2: Interface stopped DISTRIBUTING, possible flapping
Nov 24 01:07:38 firewall kernel: lagg0: link state changed to DOWN
Nov 24 01:07:43 firewall kernel: lagg0: link state changed to UP
Nov 24 01:07:44 firewall kernel: em2: Interface stopped DISTRIBUTING, possible flapping
Nov 24 01:07:44 firewall kernel: lagg0: link state changed to DOWN
Nov 24 01:07:49 firewall kernel: lagg0: link state changed to UP
Nov 24 01:07:51 firewall kernel: em2: Interface stopped DISTRIBUTING, possible flapping
Nov 24 01:07:51 firewall kernel: lagg0: link state changed to DOWN
Nov 24 01:07:56 firewall kernel: lagg0: link state changed to UP
Nov 24 01:07:57 firewall kernel: em2: Interface stopped DISTRIBUTING, possible flapping
Nov 24 01:07:57 firewall kernel: lagg0: link state changed to DOWN
Nov 24 01:08:02 firewall kernel: lagg0: link state changed to UP
Nov 24 01:08:03 firewall kernel: em2: Interface stopped DISTRIBUTING, possible flapping
Nov 24 01:08:03 firewall kernel: lagg0: link state changed to DOWN
Nov 24 01:08:08 firewall kernel: lagg0: link state changed to UP
Nov 24 01:08:09 firewall kernel: em2: Interface stopped DISTRIBUTING, possible flapping
Nov 24 01:08:09 firewall kernel: lagg0: link state changed to DOWN
Nov 24 01:08:14 firewall kernel: lagg0: link state changed to UP
Nov 24 01:08:15 firewall kernel: em2: Interface stopped DISTRIBUTING, possible flapping
Nov 24 01:08:15 firewall kernel: lagg0: link state changed to DOWN
Nov 24 01:08:20 firewall kernel: lagg0: link state changed to UP
ifconfig lagg0 shows:
lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
    options=850098<VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,VLAN_HWFILTER,VLAN_HWTSO>
    ether e8:39:35:11:fa:ab
    inet6 fe80::ea39:35ff:fe11:faab%lagg0 prefixlen 64 scopeid 0xb
    inet 192.168.55.1 netmask 0xffffff00 broadcast 192.168.55.255
    laggproto lacp lagghash l2,l3,l4
    laggport: em2 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
    laggport: em3 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
    groups: lagg
    media: Ethernet autoselect
    status: active
    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
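To see at a glance which member port is flapping and how often, the kernel messages can be tallied per interface. The following is only a rough Python sketch: it assumes the log text is available as a string and matches the exact "Interface stopped DISTRIBUTING" wording from the entries posted in this thread. The name `count_flaps` is hypothetical.

```python
import re
from collections import Counter

# Matches e.g. "kernel: em2: Interface stopped DISTRIBUTING, possible flapping"
FLAP_RE = re.compile(r"kernel: (\w+): Interface stopped DISTRIBUTING")


def count_flaps(log_text):
    """Count 'stopped DISTRIBUTING' events per interface name."""
    return Counter(FLAP_RE.findall(log_text))


# A few entries copied from the log posted in this thread.
sample = """Nov 24 01:07:38 firewall kernel: em2: Interface stopped DISTRIBUTING, possible flapping
Nov 24 01:07:38 firewall kernel: lagg0: link state changed to DOWN
Nov 24 01:07:43 firewall kernel: lagg0: link state changed to UP
Nov 24 01:07:44 firewall kernel: em2: Interface stopped DISTRIBUTING, possible flapping
Nov 24 01:07:44 firewall kernel: lagg0: link state changed to DOWN"""

for iface, flaps in count_flaps(sample).items():
    print(iface, flaps)
```

If only one interface ever appears in the tally while a second cable is connected, that narrows down which link to look at first.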
-
Maybe check debugging is still enabled. You should see some log entries from lacp there.
Edit: Sorry, missed a whole post of yours somehow. OK, I see logs showing both ends timing out.
And it still showed flapping with only one NIC connected? Is that em2 there in the logs?
What do you see if you run:
ifconfig -vv lagg0
Steve
-
Looking at the switch there it looks like you might have short timeouts set?
For reference:
SSH@ICX6450-24P Switch>show lag
Total number of LAGs:          1
Total number of deployed LAGs: 1
Total number of trunks created:1 (123 available)
LACP System Priority / ID:     1 / 609c.9f54.14f2
LACP Long timeout:             90, default: 90
LACP Short timeout:            3, default: 3
=== LAG "lacp1" ID 2047 (dynamic Deployed) ===
LAG Configuration:
  Ports:         e 1/2/1 e 1/2/3
  Port Count:    2
  Primary Port:  1/2/1
  Trunk Type:    hash-based
  LACP Key:      22047
  LACP Timeout:  long
Deployment: HW Trunk ID 1
Port    Link  State  Dupl  Speed  Trunk  Tag  Pvid  Pri  MAC             Name
1/2/1   Up    Learn  Full  10G    2047   No   1     0    609c.9f54.150b
1/2/3   Up    Learn  Full  10G    2047   No   1     0    609c.9f54.150b
Port    [Sys P] [Port P] [ Key ] [Act][Tio][Agg][Syn][Col][Dis][Def][Exp][Ope]
1/2/1   1       1        22047   Yes  L    Agg  Syn  Col  Dis  No   No   Ope
1/2/3   1       1        22047   Yes  L    Agg  Syn  Col  Dis  No   No   Ope
Partner Info and PDU Statistics
Port    Partner               Partner  LACP      LACP
        System ID             Key      Rx Count  Tx Count
1/2/1   32768-00e0.ed86.a68c  690      4         4
1/2/3   32768-00e0.ed86.a68c  690      3         5
-
@stephenw10 said in LACP not working:
Maybe check debugging is still enabled. You should see some log entries from lacp there.
Edit: Sorry, missed a whole post of yours somehow. OK, I see logs showing both ends timing out.
And it still showed flapping with only one NIC connected? Is that em2 there in the logs?
What do you see if you run:
ifconfig -vv lagg0
Steve
Hi Steve
Yes, whichever NIC I connect, em0 or em3, it shows the flapping error.
What do you mean by short timeouts being set?
root@pfsense:~ # ifconfig -vv lagg0
lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
    options=8520b8<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,WOL_MAGIC,VLAN_HWFILTER,VLAN_HWTSO>
    ether 00:08:a2:0c:99:7b
    inet6 fe80::208:a2ff:fe0c:997b%lagg0 prefixlen 64 scopeid 0x10
    inet 192.168.15.1 netmask 0xffffff00 broadcast 192.168.15.255
    laggproto lacp lagghash l2,l3,l4
    lagg options:
        flags=10<LACP_STRICT>
        flowid_shift: 16
    lagg statistics:
        active ports: 2
        flapping: 0
    lag id: [(8000,00-08-A2-0C-99-7B,020B,0000,0000), (8000,74-83-C2-48-2F-67,0042,0000,0000)]
    laggport: igb4 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING> state=3d<ACTIVITY,AGGREGATION,SYNC,COLLECTING,DISTRIBUTING>
        [(8000,00-08-A2-0C-99-7B,020B,8000,0005), (8000,74-83-C2-48-2F-67,0042,0080,0017)]
    laggport: igb5 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING> state=3d<ACTIVITY,AGGREGATION,SYNC,COLLECTING,DISTRIBUTING>
        [(8000,00-08-A2-0C-99-7B,020B,8000,0006), (8000,74-83-C2-48-2F-67,0042,0080,0018)]
    groups: lagg
    media: Ethernet autoselect
    status: active
    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
-
My switch is set for long LACP timeouts (the Tio flag) and yours is set to short.
Try changing that.
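The same mismatch can be read from the hex state values in the pfSense lacpdu log lines. Below is a small Python sketch that decodes a state byte into the flag names the kernel prints and compares the TIMEOUT bit of both ends. The bit layout follows the standard LACP state field, where the TIMEOUT bit being set means short timeout. The helper names are made up for illustration.

```python
# LACP state bits, lowest bit first, using the names the kernel log prints.
LACP_STATE_BITS = [
    (0x01, "ACTIVITY"),
    (0x02, "TIMEOUT"),       # set = short (fast) timeout, clear = long
    (0x04, "AGGREGATION"),
    (0x08, "SYNC"),
    (0x10, "COLLECTING"),
    (0x20, "DISTRIBUTING"),
    (0x40, "DEFAULTED"),
    (0x80, "EXPIRED"),
]


def decode_state(hex_value):
    """Turn a hex state string such as '1d' into a list of flag names."""
    value = int(hex_value, 16)
    return [name for bit, name in LACP_STATE_BITS if value & bit]


def timeouts_match(actor_hex, partner_hex):
    """True if both ends advertise the same timeout mode."""
    return (int(actor_hex, 16) & 0x02) == (int(partner_hex, 16) & 0x02)


# Values taken from the "em2: lacpdu receive" entry earlier in the thread:
# actor.state=7 (the switch) and partner.state=1d (pfSense).
print(decode_state("7"))
print(decode_state("1d"))
print(timeouts_match("7", "1d"))
```

With those two values the switch side has TIMEOUT set and the pfSense side does not, which is the short versus long difference described above.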
-
@stephenw10 said in LACP not working:
My switch is set for long LACP timeouts (the Tio flag) and yours is set to short.
Try changing that.
Are you using a broadcom switch?
I have changed the timeouts to long:
device(config)# lag blue dynamic
device(config-lag-blue)# lacp-timeout long
but it still shows blocked:
=== LAG "Pfsense WAN" ID 1 (dynamic Deployed) ===
LAG Configuration:
  Ports:         e 1/1/1 e 2/1/1
  Port Count:    2
  Primary Port:  1/1/1
  Trunk Type:    hash-based
  LACP Key:      20001
Deployment: HW Trunk ID 1
Port    Link  State    Dupl  Speed  Trunk  Tag  Pvid  Pri  MAC             Name
1/1/1   Up    Blocked  Full  1G     1      No   141   0    609c.9f4b.808c  WAN1
2/1/1   Up    Blocked  Full  1G     1      No   141   0    609c.9f4b.808c  WAN2
Port    [Sys P] [Port P] [ Key ] [Act][Tio][Agg][Syn][Col][Dis][Def][Exp][Ope]
1/1/1   1       1        20001   Yes  S    Agg  No   No   No   Def  Exp  Err
2/1/1   1       1        20001   Yes  S    Agg  No   No   No   Def  Exp  Err
Partner Info and PDU Statistics
Port    Partner            Partner  LACP      LACP
        System ID          Key      Rx Count  Tx Count
1/1/1   4-e839.3511.faab   0        0         1
2/1/1   3-e839.3511.faab   0        0         1
-
I'm using a Brocade ICX6450 there. It's a value you can set for each lag group:
SSH@ICX6450-24P Switch(config-lag-lacp1)#lacp-timeout
  long    Long timeout mode
  short   Short timeout mode
Steve
-
@stephenw10 said in LACP not working:
I'm using a Brocade ICX6450 there. It's a value you can set for each lag group:
SSH@ICX6450-24P Switch(config-lag-lacp1)#lacp-timeout
  long    Long timeout mode
  short   Short timeout mode
Steve
Thank you Steve, I set it to long; see above.
I appreciate your support.
-
You might have to redeploy it. The ports still show S in the Tio field.
That's not something I've ever tried on this switch.
-
@stephenw10 said in LACP not working:
You might have to redeploy it. The ports still show S in the Tio field.
That's not something I've ever tried on this switch.
Thank you Steve,
Do you mean remove the Ethernet ports from the LAG, or delete the LAG completely?
I am getting this error:
Error: LAG WAN is deployed, please undeploy it first.
-
It's been a while since I had to mess with anything on this switch but I believe, like that error implies, you cannot make changes to the lag while it's deployed. You'll have to 'undeploy' it to switch to long mode timeouts.
You could also try changing the pfSense LACP timeout mode to fast (short) to match which might be easier. In 2.5 that's a GUI setting, I'm not sure I've ever set that in 2.4.5....
-
Ah here we go:
[2.4.5-RELEASE][root@7100.stevew.lan]/root: ifconfig lagg0 lacp_fast_timeout
[2.4.5-RELEASE][root@7100.stevew.lan]/root: ifconfig -vv lagg0
lagg0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
    options=500b8<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWFILTER,VLAN_HWTSO>
    ether 00:e0:ed:86:a6:8c
    inet6 fe80::2e0:edff:fe86:a68c%lagg0 prefixlen 64 scopeid 0x15
    inet 172.21.16.206 netmask 0xffffff00 broadcast 172.21.16.255
    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
    media: Ethernet autoselect
    status: active
    groups: lagg
    laggproto lacp lagghash l2,l3,l4
    lagg options:
        flags=90<LACP_STRICT>
        flowid_shift: 16
    lagg statistics:
        active ports: 2
        flapping: 0
    lag id: [(8000,00-E0-ED-86-A6-8C,02B2,0000,0000), (0001,60-9C-9F-54-14-F2,561F,0000,0000)]
    laggport: ixl0 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING> state=3f<ACTIVITY,TIMEOUT,AGGREGATION,SYNC,COLLECTING,DISTRIBUTING>
        [(8000,00-E0-ED-86-A6-8C,02B2,8000,0001), (0001,60-9C-9F-54-14-F2,561F,0001,0041)]
    laggport: ixl1 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING> state=3f<ACTIVITY,TIMEOUT,AGGREGATION,SYNC,COLLECTING,DISTRIBUTING>
        [(8000,00-E0-ED-86-A6-8C,02B2,8000,0002), (0001,60-9C-9F-54-14-F2,561F,0001,0043)]
Though I note it did not cause the lagg to break with pfSense set as short and the switch as long...
-
@stephenw10 said in LACP not working:
ifconfig lagg0 lacp_fast_timeout
Thank you Steve,
Do you mean I have to do it on the pfSense side too?
ifconfig lagg0 lacp_fast_timeout
-
It would be better to change that on the switch if you can but, as a test, you can set fast(short) mode in pfSense and that should also match each end.
Steve
-
@stephenw10 said in LACP not working:
It would be better to change that on the switch if you can but, as a test, you can set fast(short) mode in pfSense and that should also match each end.
Steve
Thank you Steve, I've set the fast timeout on pfSense, but unfortunately it is still blocking the ports.
On the switch too:
=== LAG "wan" ID 5 (dynamic Not Deployed) ===
LAG Configuration:
  Ports:
  Port Count:    0
  Primary Port:  none
  Trunk Type:    hash-based
  LACP Key:      20005
  LACP Timeout:  long
Both switches are stacked; I may have forgotten to mention that, and it may be relevant.