Netgate Discussion Forum

    Gateway WAN keeps on having packet loss

    General pfSense Questions
    • stephenw10 Netgate Administrator

      Ok can we assume that without any other traffic using the connection the WAN gateways show as up and without packet loss?

      If you connect via SSH and run top -aSH at the command line, do you see any CPU cores running at 100% (idle processes at or close to 0%)?

      That CPU appears to be well capable of 1Gbps, but its single-thread performance is not fantastic and em NICs run with 1 queue. I still wouldn't expect anything like this though.

      • cheapie408 @stephenw10

        @stephenw10
        The WAN interface never really gets reported as being down, which is very weird, but the gateway status does show it as offline. It also seems that IPv4 shows as offline more often than IPv6.

        With nothing else in the mix, I still experience packet loss.

        top -aSH didn't pull up any CPU info.

        Diagnostics > System Activity does show that the WCPU cores are all running at or close to 100%:
        11 root 155 ki31 0B 64K CPU0 0 87:04 98.88% [idle{idle: cpu0}]
        11 root 155 ki31 0B 64K RUN 3 87:12 98.68% [idle{idle: cpu3}]
        11 root 155 ki31 0B 64K CPU1 1 87:31 95.36% [idle{idle: cpu1}]
        11 root 155 ki31 0B 64K CPU2 2 87:14 93.07% [idle{idle: cpu2}]

        • stephenw10 Netgate Administrator

          top -aSH at the command line looks like:

          last pid: 89077;  load averages:  0.35,  0.55,  0.54                                            up 0+01:13:06  19:15:26
          179 threads:   3 running, 150 sleeping, 26 waiting
          CPU:  0.8% user,  0.0% nice,  1.4% system,  0.0% interrupt, 97.9% idle
          Mem: 55M Active, 976M Inact, 368M Wired, 195M Buf, 521M Free
          Swap: 1894M Total, 1894M Free
          
            PID USERNAME    PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
             11 root        155 ki31     0B    32K RUN      0  71:51  99.09% [idle{idle: cpu0}]
             11 root        155 ki31     0B    32K CPU1     1  71:32  96.63% [idle{idle: cpu1}]
           1720 root         52    0   132M    55M accept   1   0:05   0.16% php-fpm: pool nginx (php-fpm)
              0 root        -76    -     0B   464K -        1   0:05   0.09% [kernel{if_config_tqg_0}]
          84138 root         20    0    13M  4028K CPU0     0   0:00   0.07% top -aSH
             12 root        -60    -     0B   416K WAIT     0   0:01   0.06% [intr{swi4: clock (0)}]
          80075 root         20    0    28M  9636K kqread   0   0:01   0.02% nginx: worker process (nginx)
             18 root        -16    -     0B    16K pftm     1   0:01   0.02% [pf purge]
              0 root        -76    -     0B   464K -        0   0:00   0.01% [kernel{if_io_tqg_0}]
          41729 root         20    0    11M  2840K select   1   0:00   0.01% /usr/sbin/syslogd -s -c -c -l /var/dhcpd/var/run/log
          

          You can see both CPU cores are mostly idle there. If anything was using a lot of CPU it would show above that.
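The check described above (a core is busy when its idle thread sits near 0%) can be sketched in Python. This is only an illustration that assumes the row layout shown in the sample output above; the function name `busy_cores` and the 10% threshold are hypothetical choices, not part of pfSense or top.

```python
import re

# Matches idle-thread rows from `top -aSH`, for example:
#   11 root 155 ki31 0B 32K RUN 0 71:51 99.09% [idle{idle: cpu0}]
IDLE_ROW = re.compile(r"(\d+(?:\.\d+)?)%\s+\[idle\{idle: cpu(\d+)\}\]")


def busy_cores(top_output, idle_threshold=10.0):
    """Return {core: idle_percent} for cores whose idle thread is below the threshold."""
    busy = {}
    for line in top_output.splitlines():
        match = IDLE_ROW.search(line)
        if match:
            idle_pct = float(match.group(1))
            core = int(match.group(2))
            if idle_pct < idle_threshold:
                busy[core] = idle_pct
    return busy


# Rows copied from the sample output in this post.
sample = """\
   11 root        155 ki31     0B    32K RUN      0  71:51  99.09% [idle{idle: cpu0}]
   11 root        155 ki31     0B    32K CPU1     1  71:32  96.63% [idle{idle: cpu1}]
 1720 root         52    0   132M    55M accept   1   0:05   0.16% php-fpm: pool nginx (php-fpm)
"""

print(busy_cores(sample))  # prints {} because both cores are almost fully idle
```

A high WCPU value on an `[idle{idle: cpuN}]` row therefore means the core has spare capacity, not that it is loaded.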

          • cheapie408 @stephenw10

            @stephenw10

            NVM, it seems it's case-sensitive. Here's my result:

            [screenshot]

            • stephenw10 Netgate Administrator

              Hmm, that was whilst you were passing traffic? Like running a speedtest?

              I would expect to see far more CPU usage than that. Nothing there looks like an issue though.

              • cheapie408 @stephenw10

                @stephenw10

                Sorry, that was with it idling. Here's one while I'm running a speed test. Notice how I drop to around 500Mbps now.

                [screenshot]

                • stephenw10 Netgate Administrator

                  Mmm, nothing unusual there either. No CPU core maxed out.

                  I guess I would be running a packet capture on the WAN at this point to see what's actually happening. Are there a load of retransmissions or packet fragments, etc.?

                  Steve
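As a rough first pass before opening a capture in Wireshark, the packet count, time span, and number of IPv4 fragments can be read straight from the file. The sketch below assumes a classic little-endian pcap file with microsecond timestamps and untagged Ethernet framing; `summarize_pcap` is a hypothetical helper name, and it does not detect TCP retransmissions, which need real protocol analysis.

```python
import struct


def summarize_pcap(data):
    """Summarize classic little-endian pcap bytes with Ethernet framing.

    Returns (packet_count, duration_seconds, ipv4_fragment_count).
    """
    (magic,) = struct.unpack_from("<I", data, 0)
    if magic != 0xA1B2C3D4:
        raise ValueError("not a little-endian microsecond pcap file")
    (linktype,) = struct.unpack_from("<I", data, 20)
    if linktype != 1:
        raise ValueError("only Ethernet (linktype 1) is handled in this sketch")

    offset = 24  # length of the pcap global header
    count = 0
    fragments = 0
    first_us = last_us = None
    while offset + 16 <= len(data):
        ts_sec, ts_usec, incl_len, _orig_len = struct.unpack_from("<IIII", data, offset)
        offset += 16
        frame = data[offset:offset + incl_len]
        offset += incl_len

        # Keep timestamps as integer microseconds to avoid float rounding.
        stamp_us = ts_sec * 1_000_000 + ts_usec
        if first_us is None:
            first_us = stamp_us
        last_us = stamp_us
        count += 1

        # Ethernet header is 14 bytes; EtherType 0x0800 means IPv4.
        if len(frame) >= 14 + 20 and frame[12:14] == b"\x08\x00":
            (flags_frag,) = struct.unpack_from("!H", frame, 14 + 6)
            more_fragments = bool(flags_frag & 0x2000)
            fragment_offset = flags_frag & 0x1FFF
            if more_fragments or fragment_offset:
                fragments += 1

    duration = (last_us - first_us) / 1_000_000 if count else 0.0
    return count, duration, fragments
```

It would be called on the raw bytes of a downloaded capture, for example `summarize_pcap(open(path, "rb").read())`, where `path` is wherever the capture file was saved.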

                  • cheapie408 @stephenw10

                    @stephenw10
                    I'll try to do that. How long should I be running the packet capture for?

                    Here's a screen capture of the entire speedtest process.

                    https://vimeo.com/manage/videos/661294436/f74c230e65

                    • stephenw10 Netgate Administrator

                      I would start with 1000 packets showing the beginning of the test. If there's something broken there it should show pretty quickly.

                      Steve

                      • cheapie408 @stephenw10

                        @stephenw10 Here's what it captured. Not sure how to decode this information.

                        packet capture.txt

                        • stephenw10 Netgate Administrator

                          You need to look at the actual pcap file in Wireshark to see anything useful really.

                          • cheapie408 @stephenw10

                            @stephenw10 Attached is the cap file. I can see that there are ICMP errors and some extremely long response times in the traffic, but I'm not smart enough to analyze it and identify the exact issue. :(

                            https://drive.google.com/file/d/1l-6VkFO8zfGs8sUBnX7Spltxgp10trQo/view?usp=sharing

                            • stephenw10 Netgate Administrator

                              Hmm, well the WAN was quite busy at that time. The 1000 packets covered only 2.3s.
                              Load of random port UDP traffic in there. Clients behind torrenting maybe?

                              The WAN sent 5 pings in that time to what I assume is the gateway IP and no replies came back. Did you set the monitoring back to the gateway IP?

                              Did you enable promiscuous mode when doing that? That's usually a good idea as some things can be hidden otherwise.

                              Overall apart from the lack of ping responses it doesn't look too terrible.

                              Are you able to retry that with the WAN in promisc mode and preferably without the LAN side client spewing UDP traffic?

                              Steve
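The figures quoted in this post can be turned into a rate and a loss percentage with simple arithmetic. A minimal sketch, using only the numbers stated above (1000 packets in 2.3 s, 5 pings sent, 0 replies); `capture_stats` is a hypothetical helper name.

```python
def capture_stats(packets, seconds, pings_sent, pings_answered):
    """Return (packets per second, ping loss percent) for a capture window."""
    rate = packets / seconds
    loss = 100.0 * (pings_sent - pings_answered) / pings_sent
    return rate, loss


rate, loss = capture_stats(packets=1000, seconds=2.3, pings_sent=5, pings_answered=0)
print(f"{rate:.0f} packets/s, {loss:.0f}% ping loss")  # prints "435 packets/s, 100% ping loss"
```

A window this short only contains a handful of monitoring pings, so a longer capture gives a more reliable loss figure.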

                              • cheapie408 @stephenw10

                                I'll have to try again in the morning when the wife and kids don't need the internet. When they're up, everyone is online.

                              • A Former User @stephenw10

                                  @stephenw10, I looked at the capture and I am seeing Windows Update running. In my case this has caused problems.

                                    • cheapie408 @Guest

                                      @silence But for half of this morning no one was online, and it still happened even when my computer was the only thing connected.

                                      And it still doesn't explain why I don't get ping timeouts when I bypass the pfSense box.

                                      • cheapie408

                                        Took the Xfinity router out of bridge mode and let the whole house run off of it. No timed-out pings to any external IPs. It also resolved the issue of MyQ not staying online.

                                        So I've factory reset the pfSense box to defaults, which didn't fix it. I tested all ports for both WAN and LAN, and pings would only fail when pinging outside. That means my NIC is good and all ports are good, or else I would be seeing failed pings when I ping the gateway as well and not just external IPs.

                                        That really leaves it being a software issue.
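The comparison used here (ping the gateway, ping an external address, see which one loses packets) can be sketched as a small script that reads the summary line printed by ping. The function names and the 1% threshold are hypothetical, and the sample summary lines are illustrative inputs, not output captured in this thread.

```python
import re

# Both FreeBSD and Linux ping print a summary containing "N% packet loss".
LOSS = re.compile(r"(\d+(?:\.\d+)?)% packet loss")


def loss_percent(ping_output):
    """Extract the packet loss percentage from ping summary text."""
    match = LOSS.search(ping_output)
    if match is None:
        raise ValueError("no packet loss summary found")
    return float(match.group(1))


def locate_loss(gateway_output, external_output, threshold=1.0):
    """Say where loss shows up, given ping output for the gateway and an external host."""
    if loss_percent(gateway_output) > threshold:
        return "local link or gateway"
    if loss_percent(external_output) > threshold:
        return "beyond the gateway"
    return "no loss seen"


# Illustrative summary lines (assumed, not taken from the thread).
gateway = "10 packets transmitted, 10 packets received, 0.0% packet loss"
external = "10 packets transmitted, 6 packets received, 40.0% packet loss"
print(locate_loss(gateway, external))  # prints "beyond the gateway"
```

Clean pings to the gateway with loss only to external hosts point away from the local cabling and ports, but they do not by themselves show which component past that point is dropping packets.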

                                        Do you guys think completely reinstalling the image is any different from doing a factory reset from the device?

                                        • cheapie408

                                          With a fresh install and zero changes to the default settings, it would first boot up with everything looking good, but after about 5 minutes it would start dropping packets on IPv4 again.

                                          At this point, is it worth picking up a new NIC to see if the problem is still there?

                                        What would be a good NIC?
                                        I currently have the NIC below
                                        https://www.amazon.com/IBM-39Y6138-1000-Server-Adapter/dp/B016YK2NAY

                                          • stephenw10 Netgate Administrator

                                          I would look for something newer that uses igb based NICs at least. em NICs only use a single queue so don't utilise your CPU as well as igb devices would.

                                          Steve

                                            • cheapie408 @stephenw10

                                              @stephenw10 I spent a good 6 or 7 hours migrating all my static IP devices to the Xfinity gateway. It was a PITA. I hate anything provided by the ISP, but this time it is the only thing that works. :(

                                              I've spent enough time on this, so I'm going to power down pfSense. Everything that I need to work is currently working. Going to go enjoy New Year's.

                                              I might get a 10GbE NIC if I do decide to spin up the pfSense box again.

                                              BTW, I wanted to add that I ran part of this morning with the onboard NIC as WAN and still experienced the symptoms. I doubt a new NIC will fix my issues.

                                              Thank you, everyone, for your efforts.

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.