LACP not working
-
In pfSense run this command to set it temporarily:
sysctl net.link.lagg.lacp.debug=1
Or add it in System > Advanced > System Tunables to set it more permanently.
Steve
-
Steve, I really appreciate it. I'll do it tomorrow when I get to the office. After I'm done debugging, can I disable it with the command below?
sysctl net.link.lagg.lacp.debug=0
-
Yes exactly.
Or just reboot since that sysctl is not set in the config.
Steve
-
Steve, I appreciate it. I'll give it a try tomorrow and report back.
I hope to see the cause in the debug output.
-
Good evening Steve,
I have enabled the debug, but when I connect the cables nothing really shows in the console log while I am connected to the box over SSH.
Am I supposed to do anything else to see those logs?
-
You should see a bunch of entries in the main system log once you enable that.
For example:
Nov 23 20:06:01 kernel ixl1: lacpdu transmit
Nov 23 20:06:01 kernel actor=(8000,00-E0-ED-86-A6-8C,02B2,8000,0002)
Nov 23 20:06:01 kernel actor.state=3d<ACTIVITY,AGGREGATION,SYNC,COLLECTING,DISTRIBUTING>
Nov 23 20:06:01 kernel partner=(8000,0C-80-63-69-C2-DE,0ADC,8000,001A)
Nov 23 20:06:01 kernel partner.state=3d<ACTIVITY,AGGREGATION,SYNC,COLLECTING,DISTRIBUTING>
Nov 23 20:06:01 kernel maxdelay=0
Nov 23 20:06:01 kernel ixl0: lacpdu transmit
Nov 23 20:06:01 kernel actor=(8000,00-E0-ED-86-A6-8C,02B2,8000,0001)
Nov 23 20:06:01 kernel actor.state=3d<ACTIVITY,AGGREGATION,SYNC,COLLECTING,DISTRIBUTING>
Nov 23 20:06:01 kernel partner=(8000,0C-80-63-69-C2-DE,0ADC,8000,0019)
Nov 23 20:06:01 kernel partner.state=3d<ACTIVITY,AGGREGATION,SYNC,COLLECTING,DISTRIBUTING>
Nov 23 20:06:01 kernel maxdelay=0
The problem with that setting is that it usually fills the system log with data, especially if there's a problem.
For reference, that is from an XG-7100 with the X710 expansion card connected to a Brocade ICX6450, and it is working correctly.
Steve
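For anyone reading the state fields in that output: the hex value in front of the angle brackets is a bitmask, and the names inside the brackets are simply the bits that are set. Below is a minimal sketch of that decoding in POSIX sh, assuming the standard LACP state bit order (ACTIVITY is the lowest bit, EXPIRED the highest). The function name decode_lacp_state is made up for illustration and is not part of pfSense or FreeBSD.

```shell
#!/bin/sh
# Decode a LACP state byte given in hex (as printed in "actor.state=3d<...>")
# into the flag names. decode_lacp_state is a hypothetical helper name.
decode_lacp_state() {
    value=$((0x$1))   # hex string -> integer
    out=""
    bit=1             # start at the lowest bit (ACTIVITY)
    for name in ACTIVITY TIMEOUT AGGREGATION SYNC COLLECTING DISTRIBUTING DEFAULTED EXPIRED; do
        if [ $((value & bit)) -ne 0 ]; then
            out="${out:+$out,}$name"   # append with a comma separator
        fi
        bit=$((bit * 2))
    done
    printf '%s\n' "$out"
}

decode_lacp_state 3d   # a healthy port, as in the example log
decode_lacp_state cf   # a port that has timed out and fallen back to defaults
```

A healthy member port normally ends up with SYNC, COLLECTING and DISTRIBUTING all set on both the actor and partner side. DEFAULTED or EXPIRED on either side suggests that LACPDUs from the other end are not being received in time.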
-
Sorry Steve,
I found it. I had to use Google SSH to show the logs; I don't know why PuTTY didn't want to show them.
lagg0: link state changed to DOWN
pflog0: promiscuous mode disabled
pflog0: promiscuous mode enabled
pflog0: promiscuous mode disabled
pflog0: promiscuous mode enabled
lagg0: link state changed to UP
em3: Interface stopped DISTRIBUTING, possible flapping
em2: Interface stopped DISTRIBUTING, possible flapping
lagg0: link state changed to DOWN
pflog0: promiscuous mode disabled
pflog0: promiscuous mode enabled
pflog0: promiscuous mode disabled
pflog0: promiscuous mode enabled
lagg0: link state changed to UP
em3: Interface stopped DISTRIBUTING, possible flapping
em2: Interface stopped DISTRIBUTING, possible flapping
lagg0: link state changed to DOWN
pflog0: promiscuous mode disabled
pflog0: promiscuous mode enabled
pflog0: promiscuous mode disabled
pflog0: promiscuous mode enabled
lagg0: link state changed to UP
em3: Interface stopped DISTRIBUTING, possible flapping
em2: Interface stopped DISTRIBUTING, possible flapping
-
Is that the console output rather than the system log? It looks like it.
You should see more than that, at least when it initially comes up. Check the system log in the webgui or use:
clog /var/log/system.log
at the CLI.
Steve
-
Thank you Steve,
I owe you plenty of
I ran the command and it shows a lot of information. I've collected the entries that mention LACP:
Nov 23 21:13:19 firewall /flowd_aggregate.py[92972]: start watching flowd
Nov 23 21:13:23 firewall /flowd_aggregate.py[92972]: flowparser failed to unpack flow_times (unpack requires a buffer of 8 bytes)
Nov 23 21:13:23 firewall /flowd_aggregate.py[92972]: flowparser failed to unpack as_info (unpack requires a buffer of 12 bytes)
Nov 23 21:13:23 firewall /flowd_aggregate.py[92972]: flowparser failed to unpack flow_engine_info (unpack requires a buffer of 12 bytes)
Nov 23 21:13:23 firewall /flowd_aggregate.py[92972]: flowparser failed to unpack srcdst_port (unpack requires a buffer of 4 bytes)
Nov 23 21:13:23 firewall /flowd_aggregate.py[92972]: flowparser failed to unpack flow_times (unpack requires a buffer of 8 bytes)
Nov 23 21:13:23 firewall /flowd_aggregate.py[92972]: flowparser failed to unpack flow_engine_info (unpack requires a buffer of 12 bytes)
Nov 23 21:13:23 firewall /flowd_aggregate.py[92972]: flowparser failed to unpack tag (unpack requires a buffer of 4 bytes)
Nov 23 21:13:23 firewall /flowd_aggregate.py[92972]: flowparser failed to unpack recv_time (unpack requires a buffer of 8 bytes)
Nov 23 21:13:23 firewall /flowd_aggregate.py[92972]: flowparser failed to unpack srcdst_port (unpack requires a buffer of 4 bytes)
Nov 23 21:13:23 firewall /flowd_aggregate.py[92972]: flowparser failed to unpack flow_times (unpack requires a buffer of 8 bytes)
Nov 23 21:13:23 firewall /flowd_aggregate.py[92972]: flowparser failed to unpack flow_engine_info (unpack requires a buffer of 12 bytes)
Nov 23 21:12:06 firewall kernel: pflog0: promiscuous mode disabled
Nov 23 21:12:06 firewall kernel: pflog0: promiscuous mode enabled
Nov 23 21:12:06 firewall /flowd_aggregate.py[92972]: startup, check database.
On the switch side I have this:
=== LAG "Pfsense" ID 1 (dynamic Deployed) ===
LAG Configuration:
   Ports:         e 1/2/2 e 1/2/4
   Port Count:    2
   Primary Port:  1/2/2
   Trunk Type:    hash-based
   LACP Key:      20001
Deployment: HW Trunk ID 1
Port    Link  State    Dupl  Speed  Trunk  Tag  Pvid  Pri  MAC             Name
1/2/2   Up    Blocked  Full  1G     1      Yes  18    0    609c.9f3a.a488
1/2/4   Up    Blocked  Full  1G     1      Yes  18    0    609c.9f3a.a488
Port    [Sys P] [Port P] [ Key ] [Act][Tio][Agg][Syn][Col][Dis][Def][Exp][Ope]
1/2/2         1        1   20001   Yes   S   Agg  Syn  Col  Dis  Def  No   Ina
1/2/4         1        1   20001   Yes   S   Agg  Syn  No   No   Def  No   Ina
-
OK, the flowd stuff is not relevant. You need to look at the lacpdu output, which should look at least vaguely like my output above.
It's all from the 'kernel' process so you can filter by that to lose all the netflow entries.
The fact the switch shows the ports as blocked is not good...
Steve
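One way to do that filtering at the CLI is to pipe the log through a small awk filter. This is only a sketch: kernel_lines is a made-up helper name, and it assumes the process name sits in the fourth or fifth whitespace-separated field, which matches the two log layouts pasted in this thread (with and without a hostname) but may not match every syslog format.

```shell
#!/bin/sh
# Keep only syslog lines whose process field is "kernel" or "kernel:".
# kernel_lines is a hypothetical helper; the field positions are an
# assumption based on the log samples in this thread.
kernel_lines() {
    awk '$4 ~ /^kernel:?$/ || $5 ~ /^kernel:?$/'
}

# Demo on three sample lines taken from the thread; on the firewall itself
# the input would come from: clog /var/log/system.log | kernel_lines
kernel_lines <<'EOF'
Nov 24 00:13:35 firewall kernel: em2: lacpdu receive
Nov 23 21:13:23 firewall /flowd_aggregate.py[92972]: flowparser failed to unpack tag (unpack requires a buffer of 4 bytes)
Nov 23 20:06:01 kernel ixl1: lacpdu transmit
EOF
```

The flowd line is dropped and the two kernel lines pass through unchanged, so the lacpdu entries are much easier to spot.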
-
Thank you Steve,
If the switch shows the ports as blocked, is the cause most likely the switch or pfSense?
-
Hi Steve.
I have this log now after I connect the cables:
Nov 24 00:13:35 firewall kernel: em2: lacpdu receive
Nov 24 00:13:35 firewall kernel: actor=(0001,60-9C-9F-4B-80-8C,4E22,0001,0002)
Nov 24 00:13:35 firewall kernel: actor.state=7<ACTIVITY,TIMEOUT,AGGREGATION>
Nov 24 00:13:35 firewall kernel: partner=(8000,E8-39-35-11-FA-AB,016B,8000,0003)
Nov 24 00:13:35 firewall kernel: partner.state=1d<ACTIVITY,AGGREGATION,SYNC,COLLECTING>
Nov 24 00:13:35 firewall kernel: maxdelay=0
Nov 24 00:13:35 firewall kernel: em2: old pstate cf<ACTIVITY,TIMEOUT,AGGREGATION,SYNC,DEFAULTED,EXPIRED>
Nov 24 00:13:35 firewall kernel: em2: new pstate f<ACTIVITY,TIMEOUT,AGGREGATION,SYNC>
Nov 24 00:13:35 firewall kernel: em3: lacpdu transmit
Nov 24 00:13:35 firewall kernel: actor=(8000,E8-39-35-11-FA-AB,016B,8000,0004)
Nov 24 00:13:35 firewall kernel: actor.state=1d<ACTIVITY,AGGREGATION,SYNC,COLLECTING>
Nov 24 00:13:35 firewall kernel: partner=(0001,60-9C-9F-4B-80-8C,4E22,0001,0102)
Nov 24 00:13:35 firewall kernel: partner.state=cf<ACTIVITY,TIMEOUT,AGGREGATION,SYNC,DEFAULTED,EXPIRED>
Nov 24 00:13:35 firewall kernel: maxdelay=0
Nov 24 00:13:35 firewall kernel: em2: lacpdu transmit
Nov 24 00:13:35 firewall kernel: actor=(8000,E8-39-35-11-FA-AB,016B,8000,0003)
Nov 24 00:13:35 firewall kernel: actor.state=1d<ACTIVITY,AGGREGATION,SYNC,COLLECTING>
Nov 24 00:13:35 firewall kernel: partner=(0001,60-9C-9F-4B-80-8C,4E22,0001,0002)
Nov 24 00:13:35 firewall kernel: partner.state=f<ACTIVITY,TIMEOUT,AGGREGATION,SYNC>
Nov 24 00:13:35 firewall kernel: maxdelay=0
And the switch still shows the ports as blocked:
LAG Configuration:
   Ports:         e 1/1/2 e 2/1/2
   Port Count:    2
   Primary Port:  1/1/2
   Trunk Type:    hash-based
   LACP Key:      20002
Deployment: HW Trunk ID 2
Port    Link  State    Dupl  Speed  Trunk  Tag  Pvid  Pri  MAC             Name
1/1/2   Up    Blocked  Full  1G     2      Yes  N/A   0    609c.9f4b.808d  LAN1
2/1/2   Up    Blocked  Full  1G     2      Yes  N/A   0    609c.9f4b.808d  LAN2
Port    [Sys P] [Port P] [ Key ] [Act][Tio][Agg][Syn][Col][Dis][Def][Exp][Ope]
1/1/2         1        1   20002   Yes   S   Agg  No   No   No   No   No   Ina
2/1/2         1        1   20002   Yes   S   Agg  Syn  No   No   Def  Exp  Err
-
The switch probably blocked the ports for the same reason pfSense stopped the interfaces.
They were flapping for some reason, and leaving them running like that could cause far more problems. We have to try to find out why.
I would disconnect and reconnect the ports and see what is logged.
Try connecting only one port and see if that prevents the flapping.
Steve
-
After I connected the cables one by one it still shows the flapping, and the same error comes up:
Nov 24 01:07:31 firewall configctl[3820]: event @ 1606176450.60 exec: system event config_changed
Nov 24 01:07:38 firewall kernel: em2: Interface stopped DISTRIBUTING, possible flapping
Nov 24 01:07:38 firewall kernel: lagg0: link state changed to DOWN
Nov 24 01:07:43 firewall kernel: lagg0: link state changed to UP
Nov 24 01:07:44 firewall kernel: em2: Interface stopped DISTRIBUTING, possible flapping
Nov 24 01:07:44 firewall kernel: lagg0: link state changed to DOWN
Nov 24 01:07:49 firewall kernel: lagg0: link state changed to UP
Nov 24 01:07:51 firewall kernel: em2: Interface stopped DISTRIBUTING, possible flapping
Nov 24 01:07:51 firewall kernel: lagg0: link state changed to DOWN
Nov 24 01:07:56 firewall kernel: lagg0: link state changed to UP
Nov 24 01:07:57 firewall kernel: em2: Interface stopped DISTRIBUTING, possible flapping
Nov 24 01:07:57 firewall kernel: lagg0: link state changed to DOWN
Nov 24 01:08:02 firewall kernel: lagg0: link state changed to UP
Nov 24 01:08:03 firewall kernel: em2: Interface stopped DISTRIBUTING, possible flapping
Nov 24 01:08:03 firewall kernel: lagg0: link state changed to DOWN
Nov 24 01:08:08 firewall kernel: lagg0: link state changed to UP
Nov 24 01:08:09 firewall kernel: em2: Interface stopped DISTRIBUTING, possible flapping
Nov 24 01:08:09 firewall kernel: lagg0: link state changed to DOWN
Nov 24 01:08:14 firewall kernel: lagg0: link state changed to UP
Nov 24 01:08:15 firewall kernel: em2: Interface stopped DISTRIBUTING, possible flapping
Nov 24 01:08:15 firewall kernel: lagg0: link state changed to DOWN
Nov 24 01:08:20 firewall kernel: lagg0: link state changed to UP
ifconfig lagg0 shows:
lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=850098<VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,VLAN_HWFILTER,VLAN_HWTSO>
        ether e8:39:35:11:fa:ab
        inet6 fe80::ea39:35ff:fe11:faab%lagg0 prefixlen 64 scopeid 0xb
        inet 192.168.55.1 netmask 0xffffff00 broadcast 192.168.55.255
        laggproto lacp lagghash l2,l3,l4
        laggport: em2 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
        laggport: em3 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
        groups: lagg
        media: Ethernet autoselect
        status: active
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
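When comparing one cable against two, it can help to count the "possible flapping" messages per interface instead of reading the log by eye. A rough sketch follows, assuming the message format shown in the log above, where the interface name is the word immediately before "Interface". The helper name count_flaps is invented for this example.

```shell
#!/bin/sh
# Count "Interface stopped DISTRIBUTING" messages per interface.
# count_flaps is a hypothetical helper; it assumes the interface name
# (for example "em2:") is the field right before the word "Interface".
count_flaps() {
    awk '/Interface stopped DISTRIBUTING/ {
        for (i = 2; i <= NF; i++) {
            if ($i == "Interface") {
                ifname = $(i - 1)
                sub(/:$/, "", ifname)   # strip the trailing colon
                n[ifname]++
            }
        }
    }
    END { for (k in n) print k, n[k] }' | sort
}

# Demo on a few lines in the same format as the log above.
count_flaps <<'EOF'
Nov 24 01:07:38 firewall kernel: em2: Interface stopped DISTRIBUTING, possible flapping
Nov 24 01:07:38 firewall kernel: lagg0: link state changed to DOWN
Nov 24 01:07:44 firewall kernel: em2: Interface stopped DISTRIBUTING, possible flapping
em3: Interface stopped DISTRIBUTING, possible flapping
EOF
```

The same function works on console output without timestamps, since it only looks at the word before "Interface".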
-
Maybe check debugging is still enabled. You should see some log entries from lacp there.
Edit: Sorry, I missed a whole post of yours somehow. OK, I see logs showing both ends timing out.
And it still showed flapping with only one NIC connected? Is that em2 there in the logs?
What do you see if you run:
ifconfig -vv lagg0
?
Steve
-
Looking at the switch there it looks like you might have short timeouts set?
For reference:
SSH@ICX6450-24P Switch>show lag
Total number of LAGs: 1
Total number of deployed LAGs: 1
Total number of trunks created:1 (123 available)
LACP System Priority / ID: 1 / 609c.9f54.14f2
LACP Long timeout: 90, default: 90
LACP Short timeout: 3, default: 3
=== LAG "lacp1" ID 2047 (dynamic Deployed) ===
LAG Configuration:
   Ports:         e 1/2/1 e 1/2/3
   Port Count:    2
   Primary Port:  1/2/1
   Trunk Type:    hash-based
   LACP Key:      22047
   LACP Timeout:  long
Deployment: HW Trunk ID 1
Port    Link  State  Dupl  Speed  Trunk  Tag  Pvid  Pri  MAC             Name
1/2/1   Up    Learn  Full  10G    2047   No   1     0    609c.9f54.150b
1/2/3   Up    Learn  Full  10G    2047   No   1     0    609c.9f54.150b
Port    [Sys P] [Port P] [ Key ] [Act][Tio][Agg][Syn][Col][Dis][Def][Exp][Ope]
1/2/1         1        1   22047   Yes   L   Agg  Syn  Col  Dis  No   No   Ope
1/2/3         1        1   22047   Yes   L   Agg  Syn  Col  Dis  No   No   Ope
Partner Info and PDU Statistics
Port    Partner                Partner  LACP      LACP
        System ID              Key      Rx Count  Tx Count
1/2/1   32768-00e0.ed86.a68c   690      4         4
1/2/3   32768-00e0.ed86.a68c   690      3         5
-
Hi Steve,
Yes, whichever NIC I connect, em0 or em3, it shows the flapping error.
What do you mean by the timeouts being set?
root@pfsense:~ # ifconfig -vv lagg0
lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8520b8<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,WOL_MAGIC,VLAN_HWFILTER,VLAN_HWTSO>
        ether 00:08:a2:0c:99:7b
        inet6 fe80::208:a2ff:fe0c:997b%lagg0 prefixlen 64 scopeid 0x10
        inet 192.168.15.1 netmask 0xffffff00 broadcast 192.168.15.255
        laggproto lacp lagghash l2,l3,l4
        lagg options:
                flags=10<LACP_STRICT>
                flowid_shift: 16
        lagg statistics:
                active ports: 2
                flapping: 0
        lag id: [(8000,00-08-A2-0C-99-7B,020B,0000,0000),
                 (8000,74-83-C2-48-2F-67,0042,0000,0000)]
        laggport: igb4 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING> state=3d<ACTIVITY,AGGREGATION,SYNC,COLLECTING,DISTRIBUTING>
                [(8000,00-08-A2-0C-99-7B,020B,8000,0005),
                 (8000,74-83-C2-48-2F-67,0042,0080,0017)]
        laggport: igb5 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING> state=3d<ACTIVITY,AGGREGATION,SYNC,COLLECTING,DISTRIBUTING>
                [(8000,00-08-A2-0C-99-7B,020B,8000,0006),
                 (8000,74-83-C2-48-2F-67,0042,0080,0018)]
        groups: lagg
        media: Ethernet autoselect
        status: active
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
-
My switch is set for long LACP timeouts (the Tio flag) and yours is set short.
Try changing that.
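The same information is visible from the pfSense side, because each end advertises its timeout choice in the TIMEOUT bit (0x02) of the state byte it sends: set means short, clear means long. A small sketch for checking a state value copied from the debug log follows. The helper name lacp_timeout_mode is made up for illustration.

```shell
#!/bin/sh
# Report whether a LACP state byte (hex) advertises short or long timeout.
# Bit 0x02 is the TIMEOUT flag: set = short, clear = long.
# lacp_timeout_mode is a hypothetical helper name.
lacp_timeout_mode() {
    if [ $((0x$1 & 0x02)) -ne 0 ]; then
        echo short
    else
        echo long
    fi
}

lacp_timeout_mode 7    # state the switch sent in the debug log above
lacp_timeout_mode 1d   # state pfSense sent in the same log
```

In the debug log earlier in the thread the switch's state includes TIMEOUT while pfSense's does not, which lines up with the S shown in the switch's Tio column.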
-
Are you using a Broadcom switch?
I have changed the timeouts to long:
device(config)# lag blue dynamic
device(config-lag-blue)# lacp-timeout long
but it still shows blocked:
=== LAG "Pfsense WAN" ID 1 (dynamic Deployed) ===
LAG Configuration:
   Ports:         e 1/1/1 e 2/1/1
   Port Count:    2
   Primary Port:  1/1/1
   Trunk Type:    hash-based
   LACP Key:      20001
Deployment: HW Trunk ID 1
Port    Link  State    Dupl  Speed  Trunk  Tag  Pvid  Pri  MAC             Name
1/1/1   Up    Blocked  Full  1G     1      No   141   0    609c.9f4b.808c  WAN1
2/1/1   Up    Blocked  Full  1G     1      No   141   0    609c.9f4b.808c  WAN2
Port    [Sys P] [Port P] [ Key ] [Act][Tio][Agg][Syn][Col][Dis][Def][Exp][Ope]
1/1/1         1        1   20001   Yes   S   Agg  No   No   No   Def  Exp  Err
2/1/1         1        1   20001   Yes   S   Agg  No   No   No   Def  Exp  Err
Partner Info and PDU Statistics
Port    Partner                Partner  LACP      LACP
        System ID              Key      Rx Count  Tx Count
1/1/1   4-e839.3511.faab       0        0         1
2/1/1   3-e839.3511.faab       0        0         1
-
I'm using a Brocade ICX6450 there. It's a value you can set for each lag group:
SSH@ICX6450-24P Switch(config-lag-lacp1)#lacp-timeout
  long    Long timeout mode
  short   Short timeout mode
Steve