Netgate Discussion Forum

    dpinger not reliable - ping request/replies

    Routing and Multi WAN
    9 Posts 3 Posters 901 Views
    • siegmarb

      Dear Users,

      The running dpinger process seems to stall for no apparent reason and reports the gateway as down, although the gateway is up:

      root 62188 0.0 0.0 17736 2808 - Is Sat13 0:27.94 /usr/local/bin/dpinger -S -r 0 -i GW_KD -B 10.8.0.2 -p /var/run/dpinger_GW_KD_DH~10.8.0.2~1.1.1.1.pid -u /var/run/dpinger_GW_KD_DH~10.8.0.2~1.1.1.1.sock -C /etc/rc.gateway_alarm -d 1 -s 500 -l 2000
      
      # tcpdump -ni vtnet2 host 1.1.1.1
      07:31:08.014179 IP 10.8.0.2 > 1.1.1.1: ICMP echo request, id 37365, seq 53675, length 9
      07:31:08.523369 IP 10.8.0.2 > 1.1.1.1: ICMP echo request, id 37365, seq 53676, length 9
      07:31:09.033452 IP 10.8.0.2 > 1.1.1.1: ICMP echo request, id 37365, seq 53677, length 9
      07:31:09.544022 IP 10.8.0.2 > 1.1.1.1: ICMP echo request, id 37365, seq 53678, length 9
      07:31:10.053855 IP 10.8.0.2 > 1.1.1.1: ICMP echo request, id 37365, seq 53679, length 9
      07:31:10.563363 IP 10.8.0.2 > 1.1.1.1: ICMP echo request, id 37365, seq 53680, length 9
      

      Saving the gateway configuration without any changes "restarts" dpinger, and the pings work again:

      07:31:10.826086 IP 10.8.0.2 > 1.1.1.1: ICMP echo request, id 23910, seq 0, length 9
      07:31:10.841466 IP 1.1.1.1 > 10.8.0.2: ICMP echo reply, id 23910, seq 0, length 9
      07:31:11.333371 IP 10.8.0.2 > 1.1.1.1: ICMP echo request, id 23910, seq 1, length 9
      07:31:11.347465 IP 1.1.1.1 > 10.8.0.2: ICMP echo reply, id 23910, seq 1, length 9
      07:31:11.843354 IP 10.8.0.2 > 1.1.1.1: ICMP echo request, id 23910, seq 2, length 9
      07:31:11.902847 IP 1.1.1.1 > 10.8.0.2: ICMP echo reply, id 23910, seq 2, length 9
      07:31:12.353358 IP 10.8.0.2 > 1.1.1.1: ICMP echo request, id 23910, seq 3, length 9
      07:31:12.369040 IP 1.1.1.1 > 10.8.0.2: ICMP echo reply, id 23910, seq 3, length 9
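
A quick way to check captures like the two above for echo requests that never got a reply. This is only a minimal sketch: it assumes the tcpdump text format shown in this post, and the function name `unanswered_requests` is made up for illustration.

```python
import re

# Matches tcpdump lines such as:
# 07:31:08.014179 IP 10.8.0.2 > 1.1.1.1: ICMP echo request, id 37365, seq 53675, length 9
LINE_RE = re.compile(
    r"^\s*(?P<ts>\d\d:\d\d:\d\d\.\d+) IP (?P<src>\S+) > (?P<dst>[^:\s]+): "
    r"ICMP echo (?P<kind>request|reply), id (?P<id>\d+), seq (?P<seq>\d+)"
)


def unanswered_requests(lines):
    """Return the (id, seq) pairs of echo requests that have no matching reply."""
    requests, replies = [], set()
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # skip anything that is not an ICMP echo line
        key = (int(m["id"]), int(m["seq"]))
        if m["kind"] == "request":
            requests.append(key)
        else:
            replies.add(key)
    return [key for key in requests if key not in replies]


# Three lines taken from the captures above: the last request of the old
# dpinger process (id 37365) and the first exchange of the new one (id 23910).
capture = """\
07:31:10.563363 IP 10.8.0.2 > 1.1.1.1: ICMP echo request, id 37365, seq 53680, length 9
07:31:10.826086 IP 10.8.0.2 > 1.1.1.1: ICMP echo request, id 23910, seq 0, length 9
07:31:10.841466 IP 1.1.1.1 > 10.8.0.2: ICMP echo reply, id 23910, seq 0, length 9
""".splitlines()

print(unanswered_requests(capture))  # [(37365, 53680)]
```

Only the request sent with the old ICMP ID is left unanswered in this sample; the request sent with the new ID gets its reply.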
      

      Checking the firewall states before saving the gateway settings again shows:

      3127d333-494e-4ec3-83d1-2043249f2cb4-2025-04-07_09-28.png

      After saving the gw settings again:

      9a589d6b-0df6-4786-9b69-28fb2b37d0d7-2025-04-07_09-34.png

      Somehow, the existing firewall state seems to be stale.

      Any help is greatly appreciated.

      • Gertjan @siegmarb

        @siegmarb

        Is 10.8.0.2 your pfSense WAN interface?
        Static setup or DHCP?

        Once you've saved and everything is fine, when does it start to fail?
        What was going on at that moment, or just before it? (system logs)

        No "help me" PM's please. Use the forum, the community will thank you.
        Edit : and where are the logs ??

        • siegmarb @Gertjan

          @Gertjan

          Correct, 10.8.0.2 is our pfSense WAN interface. Static setup.

          It starts to fail at random. This is the log from right after I hit 'Save' again:

          Apr 7 09:31:10	dpinger	37780	exiting on signal 15
          Apr 7 09:31:10	dpinger	89446	send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 1 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% alarm_hold 10000ms dest_addr 1.1.1.1 bind_addr 10.8.0.2 identifier "GW_KD_DH "
          Apr 8 07:07:12	dpinger	89446	GW_KD_DH 1.1.1.1: Alarm latency 27627us stddev 14386us loss 21%
          Apr 9 11:07:14	dpinger	89446	GW_KD_DH 1.1.1.1: Clear latency 22638us stddev 53358us loss 5%
          Apr 15 01:00:42	dpinger	89446	GW_KD_DH 1.1.1.1: Alarm latency 21509us stddev 5210us loss 22%
          Apr 15 01:06:52	dpinger	89446	GW_KD_DH 1.1.1.1: Alarm latency 1293341us stddev 881246us loss 95%
          Apr 15 01:07:07	dpinger	89446	GW_KD_DH 1.1.1.1: Alarm latency 243671us stddev 600909us loss 70%
          Apr 15 01:07:48	dpinger	89446	GW_KD_DH 1.1.1.1: Clear latency 93734us stddev 349188us loss 5%
          Apr 15 06:30:21	dpinger	89446	GW_KD_DH 1.1.1.1: Alarm latency 28365us stddev 7087us loss 21%
          Apr 15 06:34:35	dpinger	89446	GW_KD_DH 1.1.1.1: Clear latency 28547us stddev 86447us loss 5%
          Apr 16 10:44:07	dpinger	89446	GW_KD_DH 1.1.1.1: Alarm latency 28632us stddev 35032us loss 22%
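
To see which threshold each alarm crossed, the log lines above can be checked against the values from the startup line (loss_alarm 20%, latency_alarm 500ms). This is a minimal sketch: the strict "greater than" comparison is an assumption and not verified against the dpinger source, and the function name `classify` is made up for illustration.

```python
import re

# Matches the event part of a dpinger syslog line such as:
# "GW_KD_DH 1.1.1.1: Alarm latency 27627us stddev 14386us loss 21%"
EVENT_RE = re.compile(
    r"(?P<gw>\S+) (?P<dest>\S+): (?P<state>Alarm|Clear) "
    r"latency (?P<latency>\d+)us stddev (?P<stddev>\d+)us loss (?P<loss>\d+)%"
)

LOSS_ALARM_PCT = 20         # from the startup line: loss_alarm 20%
LATENCY_ALARM_US = 500_000  # from the startup line: latency_alarm 500ms


def classify(line):
    """Return (state, latency_us, loss_pct, exceeded_thresholds) or None."""
    m = EVENT_RE.search(line)
    if m is None:
        return None
    latency_us, loss_pct = int(m["latency"]), int(m["loss"])
    exceeded = []
    # Assumption: an alarm needs a value strictly above the threshold.
    if loss_pct > LOSS_ALARM_PCT:
        exceeded.append("loss")
    if latency_us > LATENCY_ALARM_US:
        exceeded.append("latency")
    return m["state"], latency_us, loss_pct, exceeded


log = [
    "Apr 8 07:07:12 dpinger 89446 GW_KD_DH 1.1.1.1: Alarm latency 27627us stddev 14386us loss 21%",
    "Apr 9 11:07:14 dpinger 89446 GW_KD_DH 1.1.1.1: Clear latency 22638us stddev 53358us loss 5%",
    "Apr 15 01:06:52 dpinger 89446 GW_KD_DH 1.1.1.1: Alarm latency 1293341us stddev 881246us loss 95%",
]

for entry in log:
    print(classify(entry))
```

In this log, most alarms fire at 21% or 22% loss with latency far below 500 ms, so they are loss alarms that only just cross the 20% threshold.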
          

          I see nothing else special in the logs. It's our primary firewall and, aside from the dpinger issue, it behaves "normally":

          Uptime 118 Days 04 Hours 27 Minutes 09 Seconds

          • Gertjan @siegmarb

            @siegmarb said in dpinger not reliable - ping request/replies:

            Apr 8 07:07:12 dpinger 89446 GW_KD_DH 1.1.1.1: Alarm latency 27627us stddev 14386us loss 21%
            Apr 9 11:07:14 dpinger 89446 GW_KD_DH 1.1.1.1: Clear latency 22638us stddev 53358us loss 5%
            Apr 15 01:00:42 dpinger 89446 GW_KD_DH 1.1.1.1: Alarm latency 21509us stddev 5210us loss 22%
            Apr 15 01:06:52 dpinger 89446 GW_KD_DH 1.1.1.1: Alarm latency 1293341us stddev 881246us loss 95%
            Apr 15 01:07:07 dpinger 89446 GW_KD_DH 1.1.1.1: Alarm latency 243671us stddev 600909us loss 70%
            Apr 15 01:07:48 dpinger 89446 GW_KD_DH 1.1.1.1: Clear latency 93734us stddev 349188us loss 5%
            Apr 15 06:30:21 dpinger 89446 GW_KD_DH 1.1.1.1: Alarm latency 28365us stddev 7087us loss 21%
            Apr 15 06:34:35 dpinger 89446 GW_KD_DH 1.1.1.1: Clear latency 28547us stddev 86447us loss 5%
            Apr 16 10:44:07 dpinger 89446 GW_KD_DH 1.1.1.1: Alarm latency 28632us stddev 35032us loss 22%

            dpinger not reliable - ping request/replies

            You can remove the word 'not'. 😊
            Test for yourself: go to Diagnostics > Packet Capture,
            select your WAN interface under "Capture Options",
            set "View Options" to High,
            set PROTOCOL to PING and ETHERTYPE to IPv4,
            and hit the green Start button.

            From then on, you'll see "ICMP echo requests" being sent. It's the dpinger process that pings ^^
            These "ICMP echo requests" are sent to an upstream gateway (you'll see the IP in the capture as well) and, if all goes well, an "ICMP echo reply" comes back.
            The time between the moment a packet is sent and the moment the answer comes back is shown as:

            89ddd9ad-0612-4ee8-afa5-c61c05cdd4cb-image.png

            You'll see the average time it took, and the variation.
            The simple fact that packets do come back is enough to mark the interface as "Online".

            So, now you know dpinger is reliable ^^
            Less reliable, probably, is your connection, as you've shown yourself: ICMP packets are (always) sent, but not all of them come back. That said, a couple over several days ... that's not that bad.
            Or maybe the gateway the packets were sent to was very busy and missed a packet, so it didn't reply.

            Be aware that ICMP packets have lower priority than TCP or UDP packets, so if an ICMP packet gets discarded, that's not the end of the world. It can happen.
            If your connection is saturated, it's normal that an ICMP packet doesn't make it back.
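
The three numbers discussed here (average latency, its variation, and loss) can be approximated from a capture in tcpdump's text format. This is only a rough sketch: dpinger computes its own figures over a rolling time period, which this does not reproduce, and the function name `rtt_stats` is made up for illustration. The sample lines are copied from the tcpdump output in the first post.

```python
import re
from datetime import datetime, timedelta
from statistics import mean, pstdev

LINE_RE = re.compile(
    r"^\s*(?P<ts>\d\d:\d\d:\d\d\.\d+) IP \S+ > [^:\s]+: "
    r"ICMP echo (?P<kind>request|reply), id (?P<id>\d+), seq (?P<seq>\d+)"
)


def rtt_stats(lines):
    """Return average RTT, RTT standard deviation and loss for one capture."""
    requests, replies = {}, {}
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue
        ts = datetime.strptime(m["ts"], "%H:%M:%S.%f")
        key = (m["id"], m["seq"])
        (requests if m["kind"] == "request" else replies)[key] = ts
    if not requests:
        return None
    # Round-trip time in microseconds for every request that got a reply.
    rtts_us = [
        (replies[k] - requests[k]) // timedelta(microseconds=1)
        for k in requests
        if k in replies
    ]
    loss_pct = 100 * (len(requests) - len(rtts_us)) // len(requests)
    return {
        "avg_us": round(mean(rtts_us)) if rtts_us else None,
        "stddev_us": round(pstdev(rtts_us)) if rtts_us else None,
        "loss_pct": loss_pct,
    }


capture = """\
07:31:10.563363 IP 10.8.0.2 > 1.1.1.1: ICMP echo request, id 37365, seq 53680, length 9
07:31:10.826086 IP 10.8.0.2 > 1.1.1.1: ICMP echo request, id 23910, seq 0, length 9
07:31:10.841466 IP 1.1.1.1 > 10.8.0.2: ICMP echo reply, id 23910, seq 0, length 9
07:31:11.333371 IP 10.8.0.2 > 1.1.1.1: ICMP echo request, id 23910, seq 1, length 9
07:31:11.347465 IP 1.1.1.1 > 10.8.0.2: ICMP echo reply, id 23910, seq 1, length 9
07:31:11.843354 IP 10.8.0.2 > 1.1.1.1: ICMP echo request, id 23910, seq 2, length 9
07:31:11.902847 IP 1.1.1.1 > 10.8.0.2: ICMP echo reply, id 23910, seq 2, length 9
07:31:12.353358 IP 10.8.0.2 > 1.1.1.1: ICMP echo request, id 23910, seq 3, length 9
07:31:12.369040 IP 1.1.1.1 > 10.8.0.2: ICMP echo reply, id 23910, seq 3, length 9
""".splitlines()

stats = rtt_stats(capture)
print(stats)
```

With these nine lines, four of five requests are answered, which gives 20% loss and an average round trip of roughly 26 ms.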

            • siegmarb @Gertjan

              @Gertjan

              Thank you for your answer. Further debugging shows that dpinger does not recover correctly.

              I restarted dpinger and the replies are there again:

              10:00:07.359483 IP 10.8.0.2 > 1.1.1.1: ICMP echo request, id 28786, seq 35848, length 9
              10:00:07.836358 IP 10.8.0.2 > 1.1.1.1: ICMP echo request, id 8356, seq 0, length 9
              10:00:07.886237 IP 1.1.1.1 > 10.8.0.2: ICMP echo reply, id 8356, seq 0, length 9
              

              After ~ 10 hours:

              07:56:44.384429 IP 10.8.0.2 > 1.1.1.1: ICMP echo request, id 23931, seq 29764, length 9
              07:56:44.404683 IP 1.1.1.1 > 10.8.0.2: ICMP echo reply, id 23931, seq 29764, length 9 
              07:56:44.894107 IP 10.8.0.2 > 1.1.1.1: ICMP echo request, id 23931, seq 29765, length 9
              07:56:44.916906 IP 1.1.1.1 > 10.8.0.2: ICMP echo reply, id 23931, seq 29765, length 9
              07:56:45.433620 IP 10.8.0.2 > 1.1.1.1: ICMP echo request, id 23931, seq 29766, length 9
              07:56:45.942312 IP 10.8.0.2 > 1.1.1.1: ICMP echo request, id 23931, seq 29767, length 9
              07:56:46.454289 IP 10.8.0.2 > 1.1.1.1: ICMP echo request, id 23931, seq 29768, length 9
              

              dpinger detects higher latency and loss, but does not recover:

              Apr 24 07:56:59	dpinger	23931	GW_KD_DH 1.1.1.1: Alarm latency 23815us stddev 7223us loss 21%
              

              Pinging manually from the pfSense shell at the same time shows that 1.1.1.1 is reachable:

              /root: ping 1.1.1.1
              PING 1.1.1.1 (1.1.1.1): 56 data bytes
              64 bytes from 1.1.1.1: icmp_seq=0 ttl=56 time=21.769 ms
              64 bytes from 1.1.1.1: icmp_seq=1 ttl=56 time=12.738 ms
              64 bytes from 1.1.1.1: icmp_seq=2 ttl=56 time=27.216 ms
              64 bytes from 1.1.1.1: icmp_seq=3 ttl=56 time=12.617 ms
              64 bytes from 1.1.1.1: icmp_seq=4 ttl=56 time=13.614 ms
              64 bytes from 1.1.1.1: icmp_seq=5 ttl=56 time=22.943 ms

              Still looks like a dpinger issue to me.

              • patient0 @siegmarb

                @siegmarb what pfSense version are you working with?

                What surprises me a bit is that the source and destination ICMP IDs are the same. There is nothing wrong with that, but it's not standard. Have you set it on purpose?

                10.8.0.2:23910 -> 1.1.1.1:23910
                ...
                10.8.0.2:37365 -> 1.1.1.1:37365
                

                For me the source ID/port is random:

                WAN 	icmp 	<WAN IP>:12790 -> <monitoring IP>:8 	0:0 	211.625K / 211.625K 	5.85 MiB / 5.85 MiB
                
                • siegmarb @patient0

                  @patient0

                  2.7.2-RELEASE (amd64)
                  built on Fri Dec 8 21:55:00 CET 2023
                  FreeBSD 14.0-CURRENT

                  No, I did not set the ID manually.

                  • Gertjan @siegmarb

                    @siegmarb

                    Right now, tens (hundreds) of thousands of pfSense installs use "2.7.2". I'm not saying this proves it's 'perfect', but if the WAN were flaky on every pfSense install, pfSense wouldn't exist anymore by the end of this year.
                    The good news is: it's your setup ^^

                    What about this: 2.8.0 is out. True, it's a beta, but it has been out for nearly a month now and there are no big issues. So: go 2.8.0.

                    And again: you can disable the dpinger action, so it won't touch your WAN connection anymore. If the interface still goes down, it wasn't dpinger doing so. dpinger will still "ping", but only so that stats get generated and "on-line" gets shown on the dashboard.

                    • patient0 @siegmarb

                      @siegmarb said in dpinger not reliable - ping request/replies:

                      no, i did not set the id manually

                      OK, I'm seeing the same on 2.7.2 (I'm on 25.03-BETA on prod), so that's normal then.

                      Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.