10G copper connection drops on XG-7100 1U with Intel X540-T2 adapter


  • Hello everyone,

    I have a Supermicro server connected to my XG-7100 1U over 10G copper, and the link drops at random (mostly under load, e.g. while backups are running).

    When a drop occurs, the link shows as down on both ends, and I usually just reboot the Netgate appliance.

    The server side uses an X552/X557-AT adapter; the pfSense side uses an Intel X540-T2 adapter.

    Any recommendations for troubleshooting this further, or any known issues that match these symptoms?

    Regards,


  • Some extra info:
    I am running pfSense 2.4.5-RELEASE-p1 (amd64).

    If I move the VLAN interfaces off the 10G copper port (ix0) and onto the built-in 1G lagg0 (the internal switch), connectivity is stable.
    I followed the advice at https://docs.netgate.com/pfsense/en/latest/hardware/tune.html#intel-ix-4-cards so my /boot/loader.conf.local currently looks like this:

    hw.intr_storm_threshold=10000
    hw.ix.flow_control=0
    

    The settings below are also recommended in the doc, but they were already present in /boot/loader.conf:

    kern.ipc.nmbclusters="1000000"
    kern.ipc.nmbjumbop="524288"
    

    TSO, LRO, and hardware checksum offloading are all disabled in the GUI.
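
    Since the GUI setting and the actual interface state can differ, one way to double-check from a shell is to look at the options= line that ifconfig prints for the interface. The sketch below is only an illustration: enabled_offloads is a hypothetical helper (not a pfSense tool), and it assumes the usual FreeBSD capability names (RXCSUM, TXCSUM, TSO4, TSO6, LRO).

    ```shell
    # Hypothetical helper (not part of pfSense): reads ifconfig output on stdin
    # and prints, one per line, any offload capabilities still listed as enabled.
    enabled_offloads() {
        # -m1 keeps only the first options= line, so a later "nd6 options=" line is ignored.
        grep -m1 'options=' \
            | tr '<>,' '\n\n\n' \
            | grep -E '^(RXCSUM|TXCSUM|TSO4|TSO6|LRO)$'
    }

    # Usage on the firewall (assumes the 10G port is ix0):
    #   ifconfig ix0 | enabled_offloads
    # No output means none of the listed offloads are active on that interface.
    ```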

    On latency: even though my WAN connection (PPPoE) is on a different NIC (ix5), it shows latency above the average for this link (RTT around 40 ms, RTTsd around 80 ms).
    The latency goes away if I stop using the 10G port and move my server to a 1G port on the XG-7100's built-in switch (connected via lagg0).

    When the physical interface ix0 goes down, the other physical interfaces stay up and the pfSense box is still reachable through other logical interfaces (e.g. a VLAN interface on the lagg0 built-in switch).
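
    To line the drops up with backup windows, it may help to count how often the kernel logs a link-down event for ix0. This is a rough sketch: count_link_down is a hypothetical helper, and it assumes the standard FreeBSD message text "ix0: link state changed to DOWN" appears in the system log.

    ```shell
    # Hypothetical helper: counts kernel link-down messages for one interface
    # in log text read from stdin. $1 is the interface name, e.g. ix0.
    count_link_down() {
        # The leading space and trailing colon keep VLAN children such as
        # ix0.100 from being counted as ix0.
        grep -c " $1: link state changed to DOWN"
    }

    # Usage on the firewall (assumption: pfSense 2.4.x keeps the system log
    # in circular clog format, so it is read with clog rather than cat):
    #   clog /var/log/system.log | count_link_down ix0
    ```

    Piping the same log through grep for "link state changed" without -c shows the timestamps, which can then be compared against the backup schedule.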