Upstream traffic good. Downstream traffic bad. Ideas?



  • At each of six locations, we have the following setup: our ISP has a Cisco Catalyst switch on our premises that we plug a pfSense router into. Except for one of our locations, this works great. At the problematic location, incoming traffic is only about 1/8th of what it should be (3 vs. 24 Mb/s). The ISP swears that the problem lies on our end of their switch and that they are seeing dropped packets. I'm trying to confirm that fact.

    We plugged a workstation into their switch and found that incoming traffic tests went back to what we would expect (24 Mb/s). So now I'm scratching my head. From my end, looking at the pfSense traffic graphs and such, all I see is 3 Mb/s getting through. I see no dropped packets, but I wouldn't know where to look for drops caused by badly transmitted traffic.

    The weird thing is that the upstream traffic is fine and outgoing traffic benchmarks at 24Mb/s, just like it should.

    Any ideas? All I can think of is a bad cable, but I'm a three hour drive away and it just started snowing.

    Except for this problem, pfSense has been working very well for us over the last 18 months.

    By the way, we're running pfSense 2.0.2 on fairly decent Supermicro hardware with two built-in NICs plus an additional dual-port NIC. All the NICs are Intel. At least two of our locations are on identical hardware, but running pfSense 2.0.1 (2.0.2 is on my TODO list).



  • Status -> Interfaces will give you error counters for your interfaces.

    There might be a duplex mismatch between the pfSense interface and the switch: for example, the switch in full duplex and pfSense in half duplex.

    pfSense shell commands (these probably won't work through the pfSense GUI):

    ```
    systat -ip
    systat -tcp
    ```

    Quit systat by typing `:q`.
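Besides the GUI page, the same counters are visible from the pfSense (FreeBSD) shell with `netstat -i`. As a rough sketch, here is one way you might pull the Ierrs/Oerrs columns out of that output programmatically — the sample text, interface names, and addresses below are assumptions modeled on typical FreeBSD output, not captured from the box in question:

```python
# Hypothetical helper: parse `netstat -i` style output and report
# per-interface error counters. The columns printed vary between
# FreeBSD releases, so we locate Ierrs/Oerrs by header name rather
# than by fixed position.

def parse_iface_errors(netstat_output):
    lines = netstat_output.strip().splitlines()
    header = lines[0].split()
    i_err = header.index("Ierrs")
    o_err = header.index("Oerrs")
    errors = {}
    for line in lines[1:]:
        cols = line.split()
        if len(cols) <= max(i_err, o_err):
            continue  # skip short per-address continuation rows
        errors[cols[0]] = (int(cols[i_err]), int(cols[o_err]))
    return errors

# Assumed sample output (interface names and MACs are made up):
sample = """\
Name  Mtu Network       Address            Ipkts Ierrs    Opkts Oerrs  Coll
em0  1500 <Link#1>      00:25:90:aa:bb:cc 912345 12962098 845678     0     0
em1  1500 <Link#2>      00:25:90:aa:bb:cd 512345        0 645678     0     0
"""

print(parse_iface_errors(sample))
```

Locating the columns by header name is deliberate: the exact set of columns `netstat -i` prints differs between FreeBSD versions, so hard-coded indexes would be fragile.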


  • Thanks Wally, that will be the first thing I look at when I get in.

    Woke up early this morning mulling this problem over. If we didn't have freezing rain all the way between myself and the site, I'd already be making the (long) drive.

    The ISP swears that we are running into their policer. Basically, if we try to draw (or push) more than a certain rate, the policer starts throwing away packets. That rate is 24 Mb/s. The thing is, I'm only seeing a small fraction of that on my side. If the policer isn't smart about which packets it throws away (i.e. it discards SYN packets along with everything else), I guess it could be causing a huge loss of…goodput. But I'm at a loss as to how I would go about detecting this sort of thing. Hopefully the interface error counters will give me something useful to look at.

    Luckily, the site is empty today, so I can mess around with little risk of annoyed users.
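For what it's worth, the goodput collapse described above is consistent with the classic Mathis et al. approximation, which bounds steady-state TCP goodput at roughly MSS / (RTT · √p) for packet loss rate p. A quick sketch — the MSS and RTT figures are assumptions for illustration, not measurements from this link:

```python
import math

# Mathis et al. approximation: a single TCP flow's steady-state goodput
# is bounded by roughly MSS / (RTT * sqrt(p)), where p is the loss rate.
def mathis_goodput_bps(mss_bytes, rtt_s, loss_rate):
    return (mss_bytes * 8) / (rtt_s * math.sqrt(loss_rate))

# Assumed figures for illustration: 1460-byte MSS, 50 ms RTT.
for p in (0.0001, 0.001, 0.01):
    mbps = mathis_goodput_bps(1460, 0.050, p) / 1e6
    print(f"loss {p:.2%}: ~{mbps:.1f} Mb/s")
```

Under these assumed numbers, 0.01% loss still allows ~23 Mb/s per flow, but 1% loss caps a flow around 2–3 Mb/s — which is in the same neighborhood as the 3 vs. 24 Mb/s gap being reported.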



  • @wallabybob:

    Status -> Interfaces will give you error counters for your interfaces.
    […]

    And there you have it:

    In/out errors: 12962098/0

    I'm thinking we have a bad cable, a bad NIC on the pfSense box, or a bad port on our ISP's switch. Very interesting.

    EDIT: Actually, I'm not sure that it's a bad cable, NIC, or port. I'm thinking that we are trying to draw in too much traffic and the policer is throwing away packets. Because of the way Cisco's policer works, throwing away packets indiscriminately, the amount of goodput drops well below what it would otherwise be.


  • Netgate Administrator

    @Rural:

    the way Cisco's policer works, throwing away packets indiscriminately

    Really? That seems incredibly crude.
    You could easily test this theory by using pfSense's traffic shaping features to limit the outgoing traffic to, say, 20 Mb/s. Pretty sure that does not just throw packets away at random!  ;)

    Steve



  • Yup. It is crude. The model of switch our ISP uses is one of the cheapest that Cisco offered at the time (but it's still Cisco, so it probably costs three times as much as our pfSense boxes). The switch isn't capable of acting reasonably (i.e. using a small buffer and giving priority to packets that are more important in a protocol sense). The result is that we get decent throughput right up until we trigger the policer, then everything falls off a cliff. With a small number of users, each TCP stack throttles itself somewhat reasonably. Unfortunately, at our sites with many clients, this isn't the case.

    Doing traffic shaping properly is one of the big two reasons we looked into pfSense. Fortunately for us, our ISP increased our cap to the point that we haven't had to worry about the problem. As demand increases, we can expect it to return soon enough.

    I definitely understand what you are suggesting, but it's not the outgoing traffic that's the problem. It's the incoming. However, I believe this is within pfSense's capabilities as well. Looking into this has been on my TODO list for about 18 months.

    @stephenw10:

    […]
    Really? That seems incredibly crude.
    You could easily test this theory by using pfSense's traffic shaping features to limit the outgoing traffic to, say, 20 Mb/s. Pretty sure that does not just throw packets away at random!  ;)

    Steve
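A toy model of why a hard policer behaves so much worse than a shaper: it has no queue, so anything over the contracted rate is dropped outright, TCP control packets included, whereas a shaper would buffer and delay the excess. The figures below (offered load, bucket depth, packet size) are assumptions for illustration, not the ISP's actual policer config:

```python
# Toy token-bucket policer: traffic that arrives when the bucket is
# empty is simply dropped, with no queueing. This is a simplified
# sketch, not Cisco's actual implementation.

def police(packet_bits, rate_bps, bucket_bits, interval_s):
    """Return (delivered, dropped) for packets arriving every interval_s."""
    tokens = float(bucket_bits)
    delivered = dropped = 0
    for size in packet_bits:
        tokens = min(bucket_bits, tokens + rate_bps * interval_s)
        if size <= tokens:
            tokens -= size
            delivered += 1
        else:
            dropped += 1  # indiscriminate: SYNs and data segments alike
    return delivered, dropped

# Offer 30 Mb/s of 1500-byte (12000-bit) packets to a 24 Mb/s policer
# with a shallow two-packet bucket: one packet every 0.4 ms.
delivered, dropped = police([12000] * 1000, 24e6, 24000, 0.0004)
print(delivered, dropped)
```

With the offered load at 30 Mb/s against a 24 Mb/s policer, roughly one packet in five gets dropped, and a sustained loss rate anywhere near that is more than enough to crush TCP goodput well below the policed rate. A shaper with even a small queue avoids most of those drops by delaying packets instead.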



  • By the way, we think the problem is just a cabling issue. I remember wanting a couple more inches of cable when I hooked it up…three weeks ago.

