Netgate 6100 and ONT Link Negotiation
-
I'm having no luck getting my 6100 and ONT to establish a 2.5 GbE link.
If the 6100 is set to autoselect, they eventually establish a 1 GbE link, but I can see that they appear to at least try to negotiate the higher speed. If I force the 6100 interface to 2500Base-T then one of the following happens:
- the 6100 crashes
- the 6100 hangs but shows a 2.5 GbE light on the port
- the 6100 displays 2.5 GbE on the interface page but actually links at 1 GbE (as shown on the dashboard widget and the physical port)
The ONT is an Adtran SDX 611Q 2.5GbE, provided by Openreach. It is devoid of any indication of the link speed but typically shows an established LAN connection even when the 6100 hangs or crashes.
igc3: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1508
        description: ONT
        options=4e020bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,WOL_MAGIC,RXCSUM_IPV6,TXCSUM_IPV6,NOMAP>
        ether 90:ec:77:1b:xx:xx
        inet6 fe80::92ec:77ff:fe1b:xxxx%igc3 prefixlen 64 scopeid 0x4
        inet 10.0.0.1 netmask 0xffffff00 broadcast 10.0.0.255
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
My WAN port is set on igc3 and the connection is via PPPoE. There are no other ports in use on the igc block and the LANs/VLAN are all on ix1 with a SFP+ DAC running at 10 GbE.
What trick am I missing (and why is pfSense so sensitive when interface settings are manipulated) to get a full 2.5 GbE link?
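For anyone wanting to compare the configured media with what the port actually negotiated, here is a minimal Python sketch that pulls both values out of the `media:` line of ifconfig output like the above. It is not part of pfSense; the function name `parse_media` is hypothetical, and it assumes the usual FreeBSD layout of `media: Ethernet <requested> (<negotiated> <duplex>)`.

```python
import re

# Matches e.g. "media: Ethernet autoselect (1000baseT <full-duplex>)".
# The parenthesised part is optional: it may be absent when the line
# only shows a single media type.
MEDIA_RE = re.compile(r"media:\s+Ethernet\s+(\S+)(?:\s+\((\S+)[^)]*\))?")

def parse_media(ifconfig_text):
    """Return (requested, negotiated) from ifconfig text, or None if no media line."""
    match = MEDIA_RE.search(ifconfig_text)
    if match is None:
        return None
    requested = match.group(1)
    # Fall back to the requested media when no negotiated value is shown.
    negotiated = match.group(2) or requested
    return requested, negotiated

if __name__ == "__main__":
    sample = "media: Ethernet autoselect (1000baseT <full-duplex>)"
    print(parse_media(sample))
```

With the output above, this would report `autoselect` as requested and `1000baseT` as negotiated, which matches the 1 GbE link described.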
-
@robbiett You say your 6100 crashes in some circumstances.
Do you have a crashdump or info.0 file from the crash that says what happened? Perhaps an MCA (Machine Check Architecture) fault on the CPU?
I have that happening unexpectedly on 23.01, and support says it's dead hardware, but I never had an issue on 22.05. So I was just wondering if this kind of error can be introduced by NIC drivers, as I'm using SFPs that are outside the normal 6100 tests when validating 23.01 stability.
-
@keyser - Some of the hangs/crashes do produce an info.0 file, along with the yellow warning banner at the top of the dashboard on the next boot. Almost all are zero-byte files though.
The only exception was this one earlier today:
Dump header from device: /dev/nvd0p3
  Architecture: amd64
  Architecture Version: 4
  Dump Length: 247808
  Blocksize: 512
  Compression: none
  Dumptime: 2023-03-20 12:22:05 +0000
  Hostname: Router-8.xxxxxx.me
  Magic: FreeBSD Text Dump
  Version String: FreeBSD 14.0-CURRENT #0 plus-RELENG_23_01-n256037-6e914874a5e: Fri Feb 10 20:30:29 UTC 2023
    root@freebsd:/var/jenkins/workspace/pfSense-Plus-snapshots-23_01-main/obj/amd64/VDZvZksF/var/j
  Panic String: page fault
  Dump Parity: 3372578104
  Bounds: 0
  Dump Status: good
I've no idea if the above is a true representation of the fault. In less formal words, I find that changing interface settings can cause the 6100 to hang. These events can be reduced by taking the interface off-line before changing a setting and re-enabling it. These events are independent of the link speed negotiation when trying to achieve a 2.5 GbE link but they can get in the way.
Of course, I am having these issues with the regular copper ports, as opposed to your issue on an SFP port.
Is there anywhere else in the logs I should look at or extra logs to enable (I am very new to pfSense)?
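Since most of the info files here are zero-byte, a quick way to see which ones contain anything useful is to list them with their panic string. The Python sketch below does that for a directory of `info.N` files. It is only an illustration: the function name is hypothetical, and the directory to pass in (commonly `/var/crash` on FreeBSD-based systems) is an assumption to check on your own box.

```python
from pathlib import Path

def summarize_info_files(crash_dir):
    """Map each info.* file name to its panic string, or None if none is found."""
    summary = {}
    for path in sorted(Path(crash_dir).glob("info.*")):
        panic = None
        # Header fields are "Key: value" pairs, one per line.
        for line in path.read_text(errors="replace").splitlines():
            key, sep, value = line.strip().partition(":")
            if sep and key == "Panic String":
                panic = value.strip()
                break
        # Zero-byte files (or files without the field) map to None.
        summary[path.name] = panic
    return summary

if __name__ == "__main__":
    # Assumed location; adjust to wherever the info files were saved.
    for name, panic in summarize_info_files("/var/crash").items():
        print(name, "->", panic or "(empty or no panic string)")
```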
-
@robbiett Yes, the yellow crash banner is the key. When you open it, it will display the crashdump files, which you can also save as an archive by clicking download crashdump.xx.x.
My first info.0 contained the MCA error in my case, but the second crash did not; it was an unrecoverable machine check exception instead. But if you search through the crashdump it will show lines like:
Unrecoverable machine check exception
and
MCA: Bank 5, Status 0xba00000028000402
MCA: Bank 5, Status 0xba00000028000402
MCA: Bank 5, Status 0xba00000028000402
MCA: Bank 5, Status 0xba00000028000402
MCA: Global Cap 0x0000000000000c09, Status 0x0000000000000004
MCA: Global Cap 0x0000000000000c09, Status 0x0000000000000004
MCA: Vendor "GenuineIntel", ID 0x506f1, APIC ID 16
MCA: CPU 2 UNCOR EN PCC internal error 2
MCA: Misc 0x0
panic: Unrecoverable machine check exception
cpuid = 2
If you save the crashdump, the upper one will be in panic.txt; the lower is part of msgbuf.txt.
Very interested to hear if you are seeing this as a part of your crashes when they happen.
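To check a saved crashdump for these markers without reading it by eye, something like the following Python sketch could be run over the text of panic.txt or msgbuf.txt. It is a rough triage helper under my own assumptions, not a Netgate tool: the name `classify_crash` and the category labels are made up for illustration, and it only looks for the strings quoted in this thread.

```python
import re

# Matches lines such as "MCA: Bank 5, Status 0xba00000028000402".
MCA_BANK = re.compile(r"MCA: Bank (\d+), Status (0x[0-9a-fA-F]+)")

def classify_crash(text):
    """Return (kind, mca_banks) for crashdump text.

    kind is "machine-check", "page-fault", or "unknown";
    mca_banks lists the distinct MCA bank numbers that were reported.
    """
    if "machine check exception" in text.lower() or MCA_BANK.search(text):
        banks = sorted({int(bank) for bank, _status in MCA_BANK.findall(text)})
        return "machine-check", banks
    if re.search(r"page fault", text, re.IGNORECASE):
        return "page-fault", []
    return "unknown", []

if __name__ == "__main__":
    sample = (
        "MCA: Bank 5, Status 0xba00000028000402\n"
        "panic: Unrecoverable machine check exception\n"
    )
    print(classify_crash(sample))
```

A dump whose only marker is `Panic String: page fault` would come back as `page-fault`, which is a different class of failure from the MCA lines above.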
-
@robbiett But since yours is a page fault - I doubt you are seeing the same issue as me.
Perhaps it is just my hardware that coincidentally decided to die off the same week I installed 23.01
-
We'd need to see the actual crash report not just the info file to know more there. It should be linked in the same place. Obviously it shouldn't happen though.
One thing to be aware of there is that the igc NICs can only link using auto-negotiation. They cannot use a fixed speed/duplex. The available settings there restrict the link speeds allowed in the negotiation, so they can limit it to a single speed, but it still has to negotiate it. So if it fails to negotiate 2.5G without any restrictions it will probably still fail when set to 2.5G only.
Have you confirmed the ONT actually can link at 2.5G with something else? I think this is the first time we are seeing a 2.5G Openreach-supplied device.
Steve
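The point about restricting rather than forcing the speed can be pictured with a deliberately simplified Python model: each end advertises a set of speeds and the link comes up at the fastest speed both sides advertise. This is a sketch of the idea only, not the igc driver's logic, and the `negotiate` function and the speed sets are illustrative assumptions.

```python
def negotiate(local_advertised, partner_advertised):
    """Return the fastest speed (Mb/s) advertised by both ends, or None if no overlap."""
    common = set(local_advertised) & set(partner_advertised)
    return max(common) if common else None

if __name__ == "__main__":
    # Hypothetical partner that, for whatever reason, is not offering 2.5G.
    partner = {100, 1000}

    # "autoselect": the local port offers everything it supports.
    print(negotiate({10, 100, 1000, 2500}, partner))

    # Restricted to 2.5G only: there is nothing in common, so no link.
    print(negotiate({2500}, partner))
```

In this model, narrowing the local set to 2.5G only can never produce a 2.5G link that the unrestricted set could not; it can only remove the slower fallback.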
-
@stephenw10 Thanks Steve - that is all I got at the time, unless it is hiding somewhere I can access post-event.
Yes, the 2.5 GbE ONTs are very new here. The first few went out on a trial in Oct & Nov last year and they only started shipping to some customers in late January. I happened to be one of the first.
When the ONT arrived I did bung it directly into a de-populated switch (either a UniFi US-24 Pro or an XG-24) just to confirm that it linked at 2.5 GbE. I was minded to do that again, just to re-validate the cable run. Not had chance yet but thinking afresh I do have a Netgear NBase-T switch hiding somewhere, so I should try that...
I also tried power-cycling the ONT to see if a cold-boot would help it achieve a 2.5 GbE link to the 6100 on autoselect.
I didn't appreciate the all-or-nothing 'autoselect' feature for the igc. On the ix ports I had to fix 10 GbE at both ends in order for my switch to establish a link. The 'autoselect' refused to work, with the port lights showing the attempts and brief links and the UniFi switch alternating between '10 GbE' or 'SFP+ Port Down'. That said, the UniFi SFP+ cages can be fickle in their own right.
I will look through the 'network hoarder's' pile of old network kit and find the 2.5 GbE capable Netgear switch to answer your question above.
-
Can I assume that, like other supplied ADVA devices, there is no customer access to its management interface?
If you have switches available for testing I would try putting one between the 6100 and the ONT. That would confirm it's a link negotiation issue.
Unfortunately I imagine it will be a while until those units appear on ebay at a price I can justify for testing.
-
- Tried the Netgear switch, fresh cable and connected to the ONT: 1GbE
- Tried ONT boot with Netgear switch connected: 1GbE
So I went back to my original working 2.5GbE link on my main 2.5GbE UniFi Switch and that result has changed, with only a 1GbE link established.
I can only presume that a subsequent ONT firmware update has removed the 2.5GbE capability, at least for now. Bummer. The joys of Openreach and their strapped-down approach to everything.
-
Ah, OK. Yeah sucks that there's no access. Yet.
Edit: Checked my usual source for this sort of thing; you're already there.
-
@stephenw10 said in Netgate 6100 and ONT Link Negotiation:
there
Ahhh yes, the halcyon days of 2 months ago, when the ONT did link at 2.5GbE. Times change, scream if you want to go faster!
-
Mmm, I can only dream of the day Openreach finally decide to put fibre where I am. Despite the fact I'm only ~300m from the exchange and was in one of the original FTTC trial areas. Grrr.
Stuck at the slow end of g.fast for now.
-
@stephenw10 - I hear you. Having enjoyed rapid roll-outs when I lived in the south I was rather shocked to see the state of things when I moved up here.
Having been on the rump of every Openreach infrastructure change (including the debacle of ECI G.fast) you could have knocked me down with a feather when they announced proper fibre plans for my area. Apparently we were on a list of 'arse-end areas we usually ignore'* and attracted additional funding under a lacklustre government scheme.
Small mercies.
*perhaps not a direct quote but you get the idea
-
They are rolling out Fidium fiber in our area. Their website states we can't use our own equipment, not even our own WiFi, but someone from the city posted on Reddit showing that you can. I wonder whether, if I got an SFP adapter for my SG-2100, I could skip the ONT modem and authenticate directly on the 2100? Or just use the ONT as the fiber-to-RJ45 converter and go that route. Who knows. I got a letter saying they are discontinuing DSL in our area soon and that I should upgrade now. A manager came to my house a couple of times asking why I haven't upgraded yet, as they installed the neighborhood's main distribution connection on the power pole in my backyard a couple of months ago. I showed him my pfSense Snort logs and explained my past work world, the reason I have a firewall, and all the issues I had in the past without one. He confirmed that yes, I can use my own router; it's not an issue, and I should have the installer call him if the installer tells me I can't. So when they disconnect the DSL I will update you on my adventure with the ONT, or with just using their modem without the Zyxel box. I just like my DSL from Consolidated; it has worked perfectly for years. I am kind of waiting it out until I am forced to upgrade.
-
My connection with Fidium has been great ... 1G plan and they credit me $10/month for having my own router. 2.5G RJ45 on the ONT to my 6100.