Netgate Discussion Forum

    Wan inbound stalls

    General pfSense Questions
    21 Posts 7 Posters 2.9k Views
      markn62

      I'm not sure exactly when it began, as the modem and switch were the first suspects.  With them ruled out, I don't know where to start debugging WAN inbound stalls that occur 5-10 times a day.  They drop real-time traffic like VoIP.  Outbound and LAN traffic are not affected.  I looked at system.log and the interface stats and found nothing glaring.  Each stall lasts about 5-10 seconds, then traffic recovers; the timing looks random and doesn't line up with cron or anything else I can see, though stalls tend to occur more often when throughput is above 50 Mbps.  Running basic NAT with a few QoS and other rules, plus a few packages, on version 2.2 amd64.

      Any pointers where else to look would be most helpful.

        lmartinez073

        Try disabling the QoS.

          KOM

          What's in your System log (Status - System logs) when this happens?  Is this virtual or physical, appliance or PC?

            markn62

            Drops still occur with the shaper disabled and the limiter enabled.  ICMP replies through pfSense stall at the same moments as the throughput drops.  It's a physical PC with an AMD A6-6400K.  Nothing in the system log ever coincides in time.  Troubleshooting is somewhat constrained since it's a production box.  I'm stumped on where to begin.

              KOM

              Replace the WAN NIC and see if the problem persists.

                markn62

                Ethernet stats don't indicate any errors or dropped packets, but I can try a replacement.  Currently using an Intel Pro/1000 PCI-X dual-port adapter.  Isn't there a more detailed debug log that can be enabled?

                  KOM

                  Anything in your Gateway log?  Anything in Status - RRD Graphs - Quality?  If nothing, I'm out of ideas other than a NIC & a prayer.

                    markn62

                    A prayer or lots of luck. Intermittents are a PITA. Graphs & logs are clean.  Guess I'll try a card swap.

                      hda

                      @markn62:

                      …
                      Isnt there a more detailed debug log that can be enabled?

                      Or Diagnostics: Packet Capture?

                      Is the problem related to throughput?  Can all the NICs handle 50 Mbps easily?

                        markn62

                        I have one dual-port GigE PCI-X Intel 350 adapter, one Intel GigE PCI adapter, and another PCI GigE adapter of unknown brand.  There is also one on-board port, disabled.  I tried placing both WAN and LAN on the dual-port adapter and the drops got worse.  I then tried placing WAN and LAN on the other two PCI adapters, and the drops still occur, albeit less frequently.  All cards test at over 350 Mbps up and down with jperf.

                        I tried an unfiltered packet capture, but the 60 GB SSD fills up in ~15 minutes, so it's difficult to capture the event.  No logs or diagnostics that I've tried line up with the timing of the events.  It sure looks like a buffer or cache is backing up, stalling, then recovering.

                        Any other troubleshooting ideas?

                          KOM

                          Grab an El Cheapo PC from a landfill or your neighbour's basement and try with that.  Isolate the problem as best you can.  Maybe a noisy bus on your mb is giving your NICs the sharts.

                            markn62

                            I've tried two separate adapter sets with no improvement, so I don't see much value in trying a third pair.  I've been shotgunning this for 6 months and have gotten nowhere.  That is the premise of this post: to hopefully learn a way to troubleshoot this more statistically rather than take additional shots in the dark.  It has already burned up an inordinate amount of time.  Changing out the motherboard is a substantial effort, and again there are no statistics or logs to suggest it's the problem.

                            Btw, I've also tried removing all rules and shapers, removing all packages, and replacing the GigE LAN switch.  None of it helped.

                            Is it possible that a setting under "System: Advanced: System Tunables" is responsible?  Attached is a PRTG image showing a drop.  It occurs about 3 to 10 times in a 24-hour period, unrelated to traffic load or time of day.

                            IMG_0210.PNG

                              Harvy66

                              How are you measuring the drop? Remote service? Service on the firewall? Service attached to the same WAN/LAN segments?

                              Do you know for a fact if it's the router or the upstream?

                                markn62

                                Isolated to the router using:

                                PRTG throughput graphing of the router LAN, from a PC via a layer-2 switch; see image 10a.
                                Nessoft MultiPing session against the router LAN, from a PC via a layer-2 switch; see image 11a.
                                Nessoft MultiPing session against the modem gateway, from a laptop wired to the 2nd modem port, bypassing the router; see image 12a.

                                You'll notice that behind the router (PC) the throughput drops and the latency spikes coincide.  Ahead of the router (laptop), latency is not impacted.

                                ScreenShot010a.jpg
                                ScreenShot011a.JPG
                                ScreenShot012a.jpg
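
One way to make the comparison above more statistical is to log timestamped ping results from both vantage points and extract the stall windows, so they can be lined up against logs and captures. The following is a minimal sketch, not something used in the thread: it assumes each probe's results have already been exported as (timestamp, rtt) pairs with None for a lost reply, and the names find_stalls and router_only_stalls are made up for illustration.

```python
def find_stalls(samples, min_lost=3):
    """Return (start, end, lost_count) for each run of lost replies.

    samples: iterable of (timestamp, rtt) pairs in time order, where rtt is
    None when the reply was lost. Runs shorter than min_lost are ignored.
    """
    stalls = []
    run = []
    for timestamp, rtt in samples:
        if rtt is None:
            run.append(timestamp)
            continue
        if len(run) >= min_lost:
            stalls.append((run[0], run[-1], len(run)))
        run = []
    # A stall may still be in progress when the log ends.
    if len(run) >= min_lost:
        stalls.append((run[0], run[-1], len(run)))
    return stalls


def router_only_stalls(behind, ahead):
    """Stalls seen behind the router that overlap no stall seen ahead of it."""
    return [
        s for s in behind
        if not any(s[0] <= a[1] and a[0] <= s[1] for a in ahead)
    ]
```

Stalls that show up only in the probe behind the router point at the router or its NICs; stalls that also appear ahead of it point upstream.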

                                  KOM

                                  @markn62:

                                  I've tried two separate adapter sets with no improvement.  I don't see much value in trying a third pair of adapters.

                                  My suggestion was for you to try a different PC altogether.  Rule things out one by one if you can.

                                    markn62

                                    Going to try another Ethernet adapter to see if it has any influence.  I would have bought a quad-port PCI-X card, but it won't fit my case, so I settled on the Intel dual-port PRO/1000 MT, as it's in the FreeBSD 10.1 hardware list.  It uses a different driver, em instead of igb, which is where I'm putting most of my hope, rather than in the hardware.

                                      markn62

                                      I upgraded to 2.2.4-R and swapped the dual-port Intel adapter using the igb driver for an Intel PRO/1000 MT Dual Port Server Adapter (82546) using the em driver.  In jperf TCP speed tests, the new adapter's LAN port runs at ~300 Mbps in both directions.  I'm still getting WAN stalls at the same frequency as before: short, but several throughout the day.

                                      Running out of ideas…

                                        Derelict LAYER 8 Netgate

                                        I would put a managed switch between the modem and the WAN port on an otherwise unused VLAN, mirror the modem's switch port, and run a looping tcpdump capture on the mirror to see what happens when it stalls.  If you simply stop getting packets from the ISP, you know what your next call is, and you'll at least have some evidence that level 2 support can use.

                                        If you don't see anything on the modem port, mirror the WAN port and run the same capture.

                                        Chattanooga, Tennessee, USA
                                        A comprehensive network diagram is worth 10,000 words and 15 conference calls.
                                        DO NOT set a source address/port in a port forward or firewall rule unless you KNOW you need it!
                                        Do Not Chat For Help! NO_WAN_EGRESS(TM)

                                          firewalluser

                                          @markn62:

                                          …
                                          I tried an unfiltered packet capture but the 60Gig SSD fills up in ~15 minutes so it's difficult to capture the event.
                                          …

                                          Tcpdump piped to another machine with a 4 TB or bigger hard drive might give you some breathing space.  Something like this:

                                          http://socpuppet.blogspot.co.uk/2013/05/using-netcat-to-push-dumped-traffic-to.html
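
As a rough illustration of the receiving side, here is a minimal sketch of a listener that accepts one connection and writes whatever arrives to a file on the machine with the big disk. It is not taken from the linked article; serve_once is a hypothetical helper, and the port, address, interface, and file name mentioned below are placeholders.

```python
import socket


def serve_once(listener, out_path, chunk_size=65536):
    """Accept one connection on `listener` and stream it to `out_path`.

    Returns the number of bytes written. `listener` must already be a
    bound, listening TCP socket.
    """
    total = 0
    conn, _addr = listener.accept()
    with conn, open(out_path, "wb") as out:
        while True:
            data = conn.recv(chunk_size)
            if not data:  # the sender closed the connection
                break
            out.write(data)
            total += len(data)
    return total
```

On the storage machine this could be started with a listener from socket.create_server(("0.0.0.0", 9999)) and an output path such as wan-capture.pcap. On the firewall, something like `tcpdump -i em0 -U -w - | nc 192.0.2.10 9999` would then feed it, assuming em0 is the WAN interface and the stream leaves over the LAN side so the capture does not record its own traffic.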


                                            Derelict LAYER 8 Netgate

                                            You can tell tcpdump to use a certain number of files of a certain size, then overwrite them in a loop.  The "buffer" only has to be long enough for you to stop the dump after a drop occurs and still have the info you need.

                                            tcpdump -C 20 -w filename -W 100 -i eth0

                                            This saves a rolling capture of 100 files of 20 MB each (-C counts in units of 1,000,000 bytes), named filename00 through filename99.  Replace eth0 with the actual interface name; on pfSense it will be something like em0 or igb0.

                                            Initially, the file size should be something Wireshark can comfortably load for you.  Later, when you know what you're looking for, you can capture larger files and filter them with tcpdump for the info you need.
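
To judge whether the rolling buffer is long enough to react after a stall, a back-of-the-envelope estimate of how much history it holds can help. This is only a sketch: ring_capacity_seconds is a made-up name, the throughput figure is whatever you expect on the captured link, and pcap per-packet overhead is ignored, so the real window will be somewhat shorter.

```python
def ring_capacity_seconds(file_count, file_size_mb, throughput_mbps):
    """Approximate seconds of traffic a tcpdump -C/-W ring can hold.

    file_size_mb follows tcpdump's -C unit of 1,000,000 bytes;
    throughput_mbps is in megabits per second.
    """
    total_bytes = file_count * file_size_mb * 1_000_000
    bytes_per_second = throughput_mbps * 1_000_000 / 8
    return total_bytes / bytes_per_second


# 100 files of 20 MB at a sustained 50 Mbps
print(ring_capacity_seconds(100, 20, 50))  # 320.0
```

At the 50 Mbps mentioned earlier in the thread, the 100 x 20 MB ring holds a little over five minutes, so the capture has to be stopped within that window after a stall is noticed, or the ring made larger.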


                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.