Netgate Discussion Forum

    Wan inbound stalls

    General pfSense Questions
    21 Posts 7 Posters 2.9k Views
    • M
      markn62

      I've tried two separate adapter sets with no improvement.  I don't see much value in trying a third pair of adapters.  I've been shotgunning this for 6 months and have gotten nowhere.  That is the premise of this post: to hopefully learn a better way to troubleshoot this with statistics rather than additional shots in the dark.  It's already burned up an inordinate amount of time.  Changing out the MB is a substantial effort, and again there are no statistics or logs to suggest it's the problem.

      Btw, I've also tried removing all rules and shapers, no help. And removing all packages, no help. Also replaced the GigE Lan switch, no help.

      Is it possible a "System: Advanced: System Tunable" may be responsible?  Attached is a PRTG image showing the drop as it occurs.  It happens about 3 to 10 times in a 24-hour period and is not related to traffic load or time of day.

      IMG_0210.PNG

      • H
        Harvy66

        How are you measuring the drop? Remote service? Service on the firewall? Service attached to the same WAN/LAN segments?

        Do you know for a fact if it's the router or the upstream?

        • M
          markn62

          Isolated to the router using:

          PRTG throughput graphing of the router LAN from a PC via a layer-2 switch, see image 10a.
          Nessoft MultiPing session against the router LAN from a PC via a layer-2 switch, see image 11a.
          Nessoft MultiPing session against the modem gateway from a laptop wired to the 2nd modem port, bypassing the router, see image 12a.

          You'll notice that behind the router (PC) the throughput drop and the latency spike coincide. Ahead of the router (laptop) latency is not impacted.

          ScreenShot010a.jpg
          ScreenShot011a.JPG
          ScreenShot012a.jpg

          • KOMK
            KOM

            @markn62:

            I've tried two separate adapter sets with no improvement.  I don't see much value in trying a third pair of adapters.

            My suggestion was for you to try a different PC altogether.  Rule things out one by one if you can.

            • M
              markn62

              Gonna try another Ethernet adapter to see if it has any influence. I would have bought a quad-port PCI-X card, but it won't fit my case, so I settled on the Intel dual-port PRO/1000 MT, as it's in the FreeBSD 10.1 hardware list.  It uses a different driver, em instead of igb, which is where I'm putting most of my hope, not in the hardware.

              • M
                markn62

                I upgraded to 2.2.4-R and swapped a dual-port Intel adapter using the igb driver for an Intel PRO/1000 MT Dual Port Server Adapter (82546) using the em driver.  With jperf TCP speed tests on the LAN port, the new adapter runs at ~300 Mbps in both directions.  I'm still getting WAN stalls at the same frequency as before: short, but several throughout the day.

                Running out of ideas…

                • DerelictD
                  Derelict LAYER 8 Netgate

                  I would put a managed switch between the modem and the WAN port on an otherwise unused VLAN, mirror the modem's switch port, and run a looping tcpdump capture on the mirror port to see what happens when it stalls.  If you just stop getting packets from the ISP, you know what your next call is - and you'll at least have some evidence level 2 support can use.

                  If you don't see anything on the modem port, mirror the WAN port and run the same capture.

                  Chattanooga, Tennessee, USA
                  A comprehensive network diagram is worth 10,000 words and 15 conference calls.
                  DO NOT set a source address/port in a port forward or firewall rule unless you KNOW you need it!
                  Do Not Chat For Help! NO_WAN_EGRESS(TM)

                  • F
                    firewalluser

                    @markn62:

                    I have one dual port GigE PCIX intel 350 adapter, one Intel GigE PCI adapter and another PCI GigE adapter, not sure the brand.  Also one on-board disabled.  I tried placing both Wan and Lan on the dual port adapter and drops got worse. I then tried placing the Wan/Lan on the other two PCI adapters and drops still occur albeit less frequently.  All cards test @ +350mbps up/down with jperf.

                    I tried an unfiltered packet capture but the 60Gig SSD fills up in ~15 minutes so it's difficult to capture the event.  No logs or diagnostics that I've tried relate to the timing of the events.  Sure looks like a buffer or cache is backing up, stalling, then recovering.

                    Any other troubleshooting ideas?

                    Piping tcpdump to another machine with a 4TB or bigger hard drive might give you some breathing space, something like this:

                    http://socpuppet.blogspot.co.uk/2013/05/using-netcat-to-push-dumped-traffic-to.html
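A minimal sketch of that netcat approach, assuming a BSD-style nc, a capture interface named em0, and a receiving machine at 192.0.2.10 listening on port 9999. All three values are placeholders, not details from this thread. The two functions only print the command lines so they can be reviewed before being run as root on each machine:

```shell
#!/bin/sh
# Print the command to run on the machine with the big disk (receiver).
# Assumes BSD-style netcat syntax; traditional netcat needs "nc -l -p PORT".
receiver_cmd() {
    port=$1; outfile=$2
    printf 'nc -l %s > %s\n' "$port" "$outfile"
}

# Print the command to run on the firewall (sender).
# -U flushes each packet as it is captured, -w - writes pcap data to stdout,
# and the filter keeps the netcat stream itself out of the capture.
sender_cmd() {
    iface=$1; host=$2; port=$3
    printf "tcpdump -i %s -n -U -w - 'not port %s' | nc %s %s\n" \
        "$iface" "$port" "$host" "$port"
}

receiver_cmd 9999 wan.pcap
sender_cmd em0 192.0.2.10 9999
```

Start the receiver first, then the sender. The resulting wan.pcap on the receiver can be opened in Wireshark once the capture is stopped.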

                    Capitalism, currently The World's best Entertainment Control System and YOU cant buy it! But you can buy this, or some of this or some of these

                    Asch Conformity, mainly the blind leading the blind.

                    • DerelictD
                      Derelict LAYER 8 Netgate

                      You can tell tcpdump to use a certain number of files of a certain size then overwrite them in a loop.  The "buffer" only has to be long enough for you to stop the dump after a drop occurs to get the info you need.

                      tcpdump -C 20 -w filename -W 100 -i eth0

                      That saves 100 files of 20 MB each, named filename followed by a sequence number, as a rolling capture (about 2 GB of disk in total).

                      Initially, the file size should be something wireshark can comfortably load for you.  Later, when you know what you're looking for, you can capture larger files and filter them with tcpdump for the info you're looking for.
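As a rough sketch of sizing such a rolling capture, the helper below multiplies file size by file count to get the disk budget and prints a command line for review. The interface name em0, the output prefix wanstall, and the 128-byte snap length (headers only, to stretch the buffer further) are assumptions for illustration, not values from this thread:

```shell
#!/bin/sh
# Total disk space a rolling capture can occupy, in MB.
# tcpdump's -C unit is 1,000,000 bytes, so MB here means 10^6 bytes.
ring_size_mb() {
    file_mb=$1; file_count=$2
    echo $(( file_mb * file_count ))
}

# Print a rolling-capture command line for review.
# -s limits the bytes saved per packet, -n skips name lookups.
ring_capture_cmd() {
    iface=$1; file_mb=$2; file_count=$3; snaplen=$4
    printf 'tcpdump -i %s -n -s %s -C %s -W %s -w wanstall\n' \
        "$iface" "$snaplen" "$file_mb" "$file_count"
}

ring_size_mb 20 100
ring_capture_cmd em0 20 100 128
```

With 20 MB files and 100 of them, the first line printed is 2000, so the loop never needs more than about 2 GB however long it runs.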


                      • M
                        markn62

                        Great suggestions guys, of them I like Derelict's tcpdump loop. I'll have to give it a try.  I did find that the latest managed switch firmware now supports a mirror port and will soon support a packet header only mirror, so it remains an option.

                        But before I go this route, I made a recent discovery about something I thought I had ruled out but which appears relevant.  I've kept an eye out for drop patterns and now see that, although random, the drops hit on 15-minute increments such as 4:22pm, 4:37pm, 5:07, 6:07, etc., although nothing in the log corresponds.  However, cron has only one entry on a 0,15,30,45 schedule, and that's /etc/rc.filter_configure_sync.  I changed the interval to */60 and now the drops don't occur more than once every 60 minutes.  So what is this for, and does it have to run at such frequent intervals?  Perhaps a better question is how it might cause the drop, so I can modify or remove the root cause?
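One way to check that drop times really share a 15-minute cadence is to take each time's minute modulo 15; a constant remainder means they sit at the same phase of the cycle. A small sketch using the four times quoted above (the function name minute_phase is only illustrative):

```shell
#!/bin/sh
# Print the minute-of-hour modulo a period for an H:MM timestamp.
minute_phase() {
    mm=${1#*:}      # keep the part after the colon
    mm=${mm#0}      # drop one leading zero so 07 is not read as octal
    echo $(( $mm % $2 ))
}

# Print each drop time with its phase in a 15-minute cycle.
for t in 4:22 4:37 5:07 6:07; do
    echo "$t $(minute_phase "$t" 15)"
done
```

All four times come out with the same remainder, 7, which is consistent with a job that fires every 15 minutes. The fixed offset from the 0,15,30,45 marks may simply reflect clock or graph-interval differences between the monitoring PC and the firewall, though that is a guess.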

                        Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.