Netgate Discussion Forum

    tun_wg0 reports (via SNMP) a number of Ierrs and Oerrs (mostly Oerrs) and triggers Nagios-like warnings

    • lvrmscL
      lvrmsc
      last edited by lvrmsc

      On an existing setup of two remote pfSense+ instances with a WireGuard tunnel between the sites, over fibre, which runs flawlessly (no issues ever suspected), we switched our resource monitoring to "checkmk" (distantly related to Nagios; a fork, as far as I understand).

      The new monitoring has started to trigger warnings and sometimes critical alerts, as well as a hint that the interface might be "flapping", only on the tun_wg0 interface.

      On the human visible side of things, everything is still running fine.
      However, SNMP polling of pfSense reports error counts on tun_wg0 that are slightly above the warning threshold and sometimes above the critical one.

      I've checked everything I can think of and can't find the cause of these oerrs and some ierrs on the tun_wg0 software interface. They certainly don't match similar errors on the underlying physical interfaces.

      How do I go about debugging this? Does anyone recognise this as a known problem? Are these real packet issues or false error counts?

      I will probably configure checkmk to record the error counters on that interface without alerting on them. If the errors are real and there is something I could tweak to fix them, that would be the better solution.

      • lvrmscL
        lvrmsc @lvrmsc
        last edited by lvrmsc

        Could it be that normal discards on synthetic WireGuard interfaces such as tun_wg0 are incorrectly counted as errors?

        [tun_wg0]
        Operational state: up
        Speed: unknown
        In: 78.3 kB/s
        Out: 12.9 kB/s
        Errors in: 0%
        Discards in: 0 packets/s
        Multicast in: 0 packets/s
        Broadcast in: 0 packets/s
        Unicast in: 130.38 packets/s
        Non-unicast in: 0 packets/s
        Errors out: 0.202% (warn/crit at 0.01%/0.1%) CRIT
        Discards out: 0 packets/s
        Multicast out: 0 packets/s
        Broadcast out: 0 packets/s
        Unicast out: 80.84 packets/s
        Non-unicast out: 0 packets/s
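
To put the "Errors out" figure in perspective, here is a minimal sketch of how such a percentage could be derived from two SNMP polls. It assumes the rate is the error delta divided by the sum of packet and error deltas; that formula is an assumption, not checkmk's confirmed calculation. The field names mirror the IF-MIB objects ifOutUcastPkts, ifOutNUcastPkts and ifOutErrors, the `IfCounters` class and both functions are hypothetical, and the sample numbers are illustrative rather than taken from the polled device. Counter wrap-around is ignored.

```python
from dataclasses import dataclass


@dataclass
class IfCounters:
    """One SNMP poll of the outbound counters of an interface (hypothetical holder)."""
    out_ucast_pkts: int   # ifOutUcastPkts
    out_nucast_pkts: int  # ifOutNUcastPkts
    out_errors: int       # ifOutErrors


def out_error_percent(prev: IfCounters, curr: IfCounters) -> float:
    """Percentage of outbound packets counted as errors between two polls.

    Assumed formula: errors / (packets + errors). Counter wraps are not handled.
    """
    pkts = (curr.out_ucast_pkts - prev.out_ucast_pkts) + (
        curr.out_nucast_pkts - prev.out_nucast_pkts
    )
    errs = curr.out_errors - prev.out_errors
    total = pkts + errs
    if total <= 0:
        return 0.0
    return 100.0 * errs / total


def classify(percent: float, warn: float = 0.01, crit: float = 0.1) -> str:
    """Apply the warn/crit thresholds shown in the check output above."""
    if percent >= crit:
        return "CRIT"
    if percent >= warn:
        return "WARN"
    return "OK"


if __name__ == "__main__":
    # Illustrative numbers only: 10 errors against 4990 packets sent.
    prev = IfCounters(0, 0, 0)
    curr = IfCounters(4990, 0, 10)
    pct = out_error_percent(prev, curr)
    print(f"{pct:.3f}% -> {classify(pct)}")
```

With thresholds this tight, a handful of error increments per polling interval is enough to cross the critical limit at roughly 80 packets/s, which would be consistent with alerts firing while the tunnel looks healthy.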

        • lvrmscL
          lvrmsc @lvrmsc
          last edited by

          Here are two graphs of the SNMP-reported data.
          First, the WAN interface that carries the WireGuard tunnel, among other traffic. No errors at all.
          wan.png

          Next is the tun_wg0 interface (wg), which shows those outgoing errors, and they follow a visible pattern.
          two.png

          I cannot make sense of it.

          • lvrmscL
            lvrmsc @lvrmsc
            last edited by lvrmsc

            What's the theory here? Suppose a packet enters pfSense through, say, a LAN interface with an MTU of 1500 and is routed out through a WireGuard interface such as tun_wg0 (MTU 1432, for example) to reach the other side of the tunnel. Are the oversized packets properly fragmented, or are they treated as errors at that point, possibly with an ICMP unreachable/too-big message returned to the sender on the LAN side? In other words, what if the packets counted as errors on tun_wg0 are not actually errors and should not be counted as such? Any PMTUD attempt from the LAN to a remote destination through WireGuard would then accumulate "errors" in those counters when it shouldn't.
            Pure conjecture. I'm just trying to make sense of it.
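
For reference, the standard IPv4 forwarding rule this conjecture relies on can be sketched as follows. This is only the generic router behaviour (fragment if allowed, otherwise drop and send ICMP "fragmentation needed"); the function name is hypothetical, and the sketch says nothing about whether pfSense or the WireGuard driver counts such drops in the Oerrs counter of tun_wg0, which is exactly the open question.

```python
def forward_ipv4(packet_len: int, dont_fragment: bool, egress_mtu: int) -> str:
    """Generic IPv4 forwarding decision for a packet leaving via an interface.

    packet_len    : total IP packet length in bytes
    dont_fragment : True if the DF bit is set in the IP header
    egress_mtu    : MTU of the outgoing interface (e.g. 1432 in the example above)
    """
    if packet_len <= egress_mtu:
        return "forward"
    if dont_fragment:
        # PMTUD case: the router drops the packet and reports the smaller MTU.
        return "drop, send ICMP type 3 code 4 (fragmentation needed)"
    return "fragment and forward"


if __name__ == "__main__":
    # A full-size LAN packet (1500 bytes) routed into a 1432-byte tunnel MTU.
    print(forward_ipv4(1500, dont_fragment=True, egress_mtu=1432))
    print(forward_ipv4(1500, dont_fragment=False, egress_mtu=1432))
    print(forward_ipv4(1400, dont_fragment=True, egress_mtu=1432))
```

If drops in the DF branch are what ends up in the error counter, the error rate should track the share of full-size, DF-marked packets entering the tunnel. That could be checked by comparing the counter against a packet capture filtered on ICMP type 3 code 4 on the LAN side.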

            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.