High ping and packet loss in local network



  • @NOYB:

    I have a similar problem.  Though I've never been able to reproduce it with a manual ping.

    What I see is that the gateway monitor of a local router will suddenly begin having intermittent packet loss after pfSense has been online anywhere from a few days to a couple of weeks.  Restarting pfSense, with no action taken on the target gateway, fixes it for a while.
    […]
    2.1-RELEASE  (i386)
    built on Wed Sep 11 18:16:50 EDT 2013

    FreeBSD 8.3-RELEASE-p11

    I think your problem isn't a real problem but a virtual one: when you have packet loss, your latency is still almost good! In 2.1 there is a problem with fake packet loss; see this thread: http://forum.pfsense.org/index.php?topic=66328.0. If you want to be sure that the packet loss is fake, try the smokeping utility on another machine.
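
    As a rough alternative to smokeping (only a sketch; the target here is simply the gateway address discussed in this thread), a long plain ping from a second machine on the same segment is usually enough to tell real loss from a reporting artifact:

    # Run from another box on the same L2 segment while the gateway monitor
    # reports loss; if this summary shows 0% loss, the reported loss is likely fake.
    ping -c 600 -i 1 192.168.2.1 | tail -n 3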

    Also - your pings there are in the .03 ms range to 2.1 - are you sure you were not pinging yourself??  That is a really FAST response, even for a LAN..

    Yes, sorry, my error with the VM ::)
    This is the correct ping:

    ping -S 192.168.2.10 -c10 192.168.2.1
    PING 192.168.2.1 (192.168.2.1) 56(84) bytes of data.
    64 bytes from 192.168.2.1: icmp_req=1 ttl=64 time=0.603 ms
    64 bytes from 192.168.2.1: icmp_req=2 ttl=64 time=0.636 ms
    64 bytes from 192.168.2.1: icmp_req=3 ttl=64 time=0.637 ms
    64 bytes from 192.168.2.1: icmp_req=4 ttl=64 time=0.645 ms
    64 bytes from 192.168.2.1: icmp_req=5 ttl=64 time=0.653 ms
    64 bytes from 192.168.2.1: icmp_req=6 ttl=64 time=0.617 ms
    64 bytes from 192.168.2.1: icmp_req=7 ttl=64 time=0.604 ms
    64 bytes from 192.168.2.1: icmp_req=8 ttl=64 time=0.633 ms
    64 bytes from 192.168.2.1: icmp_req=9 ttl=64 time=0.969 ms
    64 bytes from 192.168.2.1: icmp_req=10 ttl=64 time=0.602 ms
    
    --- 192.168.2.1 ping statistics ---
    10 packets transmitted, 10 received, 0% packet loss, time 9001ms
    rtt min/avg/max/mdev = 0.602/0.659/0.969/0.110 ms
    

    My hardware configuration is very simple; that's why I think the problem is pfSense.
    The diagram is in the attachments.
    This isn't a simplification but my real network (WAN side), where every arrow represents a cable.

    It just seems strange to me that the response time of a ping from pfSense to X would point to an issue on pfSense.

    pfSense puts the ping on the wire, and there is a response..  Would pfSense having issues modify these times?  Let's say the response comes in .5 seconds (500 ms); how would pfSense, as it moves it up the stack, change that time to 10 ms?

    It's strange for me too, but I think it's the most likely explanation. Maybe pfSense increases the time it takes to put the ping on the wire, and consequently the latency increases..




  • Hi to all. Today I see improvements.

    This is my current load:

    
                        /0   /1   /2   /3   /4   /5   /6   /7   /8   /9   /10
         Load Average
    
          Interface           Traffic               Peak                Total
         em1_vlan13  in     12.636 KB/s          1.346 MB/s          946.988 MB
                     out     9.207 KB/s        101.088 KB/s           69.776 MB
    
         em0_vlan10  in      1.307 KB/s         58.705 KB/s           21.818 MB
                     out     7.329 KB/s          3.202 MB/s          517.675 MB
    
                lo0  in      0.065 KB/s          0.464 KB/s           47.049 KB
                     out     0.065 KB/s          0.464 KB/s           47.049 KB
    
                em3  in     22.275 KB/s        151.768 KB/s          181.590 MB
                     out    39.036 KB/s          1.423 MB/s            2.242 GB
    
                em2  in      3.303 KB/s         11.335 KB/s            3.329 MB
                     out     3.437 KB/s         38.118 KB/s            4.351 MB
    
                em1  in     37.779 KB/s          1.889 MB/s            2.460 GB
                     out    17.342 KB/s        127.957 KB/s          161.994 MB
    
                em0  in      4.159 KB/s        141.876 KB/s           34.396 MB
                     out     6.426 KB/s          1.606 MB/s          267.269 MB
    
    

    Yesterday I also modified my sysctl.conf and my loader.conf.local, but the benefits I'm seeing are probably due to the very low traffic. (A quick way to verify that the sysctl values took effect is sketched right after the list below.)

    /etc/sysctl.conf

    
    kern.ipc.somaxconn=1024  # (default 128)
    kern.ipc.maxsockbuf=16777216
    net.inet.tcp.mssdflt=1460  # (default 536)
    net.inet.tcp.sendbuf_max=16777216
    net.inet.tcp.recvbuf_max=16777216
    
    net.inet.tcp.sendbuf_inc=262144  # (default 8192 )
    net.inet.tcp.recvbuf_inc=262144  # (default 16384)
    
    net.inet.tcp.cc.algorithm=htcp
    
    # Reduce the amount of SYN/ACKs we will re-transmit to an unresponsive client.
    net.inet.tcp.syncache.rexmtlimit=1  # (default 3)
    
    # Lessen max segment life to conserve resources
    # ACK waiting time in milliseconds
    # (default: 30000. RFC from 1979 recommends 120000)
    net.inet.tcp.msl=5000
    
    # As of 15 Apr 2009, Igor Sysoev says that nolocaltimewait has a buggy implementation.
    # So disable it for now until it gets fixed.
    net.inet.tcp.nolocaltimewait=0
    
    # Protocol decoding in the interrupt thread.
    # If you have a NIC that automatically sets flow_id then it's better not to
    # use direct_force, and to take advantage of multithreaded netisr(9).
    # If you have Yandex drivers you are better off with `net.isr.direct_force=1` and
    # `net.inet.tcp.read_locking=0`, otherwise you may run into some TCP related
    # problems.
    # Note: If you have an old NIC that doesn't set flow_ids you may need to
    # patch `ip_input` to manually set FLOW_ID via `nh_m2flow`.
    #
    # FreeBSD 8+
    net.isr.direct=1 
    
    net.isr.direct_force=1 
    # Explicit Congestion Notification
    # (See http://en.wikipedia.org/wiki/Explicit_Congestion_Notification)
    #net.inet.tcp.ecn.enable=1 
    
    # Flowtable - flow caching mechanism
    # Useful for routers
    net.inet.flowtable.enable=1
    net.inet.flowtable.nmbflows=65535
    vm.pmap.shpgperproc=2048
    
    net.inet.tcp.recvspace=1024000
    net.inet.tcp.sendspace=1024000
    
    net.inet.ip.forwarding=1      # (default 0)
    net.inet.ip.fastforwarding=1  # (default 0)
    
    # General Security and DoS mitigation.
    net.inet.ip.check_interface=1         # verify packet arrives on correct interface (default 0)
    net.inet.ip.portrange.randomized=1    # randomize outgoing upper ports (default 1)
    net.inet.ip.process_options=0         # IP options in the incoming packets will be ignored (default 1)
    net.inet.ip.random_id=1               # assign a random IP_ID to each packet leaving the system (default 0)
    net.inet.ip.redirect=0                # do not send IP redirects (default 1)
    net.inet.ip.accept_sourceroute=0      # drop source routed packets since they can not be trusted (default 0)
    net.inet.ip.sourceroute=0             # if source routed packets are accepted the route data is ignored (default 0)
    net.inet.ip.stealth=1                 # do not reduce the TTL by one(1) when a packet goes through the firewall (default 0)
    net.inet.icmp.bmcastecho=0            # do not respond to ICMP packets sent to IP broadcast addresses (default 0)
    net.inet.icmp.maskfake=0              # do not fake reply to ICMP Address Mask Request packets (default 0)
    net.inet.icmp.maskrepl=0              # replies are not sent for ICMP address mask requests (default 0)
    net.inet.icmp.log_redirect=0          # do not log redirected ICMP packet attempts (default 0)
    net.inet.icmp.drop_redirect=1         # no redirected ICMP packets (default 0)
    net.inet.icmp.icmplim=10              # number of ICMP/RST packets/sec to limit returned packet bursts during a DoS. (default 200)
    net.inet.icmp.icmplim_output=1        # show "Limiting open port RST response" messages (default 1)
    net.inet.tcp.drop_synfin=1            # SYN/FIN packets get dropped on initial connection (default 0)
    net.inet.tcp.ecn.enable=0            # explicit congestion notification (ecn) warning: some ISP routers abuse it (default 0)
    net.inet.tcp.fast_finwait2_recycle=1  # recycle FIN/WAIT states quickly (helps against DoS, but may cause false RST) (default 0)
    net.inet.tcp.icmp_may_rst=0           # icmp may not send RST to avoid spoofed icmp/udp floods (default 1)
    #net.inet.tcp.maxtcptw=15000          # max number of tcp time_wait states for closing connections (default 5120)
    net.inet.tcp.msl=3000                 # 3s maximum segment life waiting for an ACK in reply to a SYN-ACK or FIN-ACK (default 30000)
    net.inet.tcp.path_mtu_discovery=0     # disable MTU discovery since most ICMP type 3 packets are dropped by others (default 1)
    net.inet.tcp.rfc3042=0                # disable limited transmit mechanism which can slow burst transmissions (default 1)
    net.inet.tcp.sack.enable=1            # TCP Selective Acknowledgments are needed for high throughput (default 1)
    net.inet.udp.blackhole=1              # drop udp packets destined for closed sockets (default 0)
    net.inet.tcp.blackhole=2              # drop tcp packets destined for closed ports (default 0)
    #net.route.netisr_maxqlen=4096        # route queue length (rtsock using "netstat -Q") (default 256)
    security.bsd.see_other_uids=0         # only allow users to see their own processes. root can see all (default 1)
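
    A quick sanity check that these values actually took effect (just a sketch; note that pfSense normally manages tunables under System > Advanced > System Tunables rather than a hand-edited /etc/sysctl.conf):

    # Print the description and current value of a few of the OIDs set above.
    sysctl -d net.inet.tcp.msl
    sysctl net.inet.tcp.msl net.inet.ip.fastforwarding kern.ipc.somaxconn
    # /etc/sysctl.conf is only read at boot; to apply a single value immediately:
    sysctl net.inet.tcp.msl=5000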
    

    /boot/loader.conf.local

    
    legal.intel_wpi.license_ack=1 # accept the Intel license
    legal.intel_ipw.license_ack=1
    
    aio_load="YES"                     # Async IO system calls
    autoboot_delay="3"                 # reduce boot menu delay from 10 to 3 seconds. 
    
    cc_htcp_load="YES" 
    
    kern.ipc.nmbclusters="262144"
    kern.ipc.somaxconn="4096"
    kern.ipc.maxsockets="204800"
    
    hw.em.rxd="4096"
    hw.em.txd="4096"
    hw.em.fc_setting="0"
    hw.em.num_queues="4"
    
    kern.sched.slice="1"
    
    # new section starts here
    
    # Some useful netisr tunables. See sysctl net.isr
    net.isr.maxthreads=4
    net.isr.defaultqlimit=10240
    net.isr.maxqlimit=10240
    # Bind netisr threads to CPUs
    net.isr.bindthreads=1
    
    # Also for my notebook, but may be used with Opteron
    #device         amdtemp
    # Same for Intel processors
    device         coretemp
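
    These are boot-time tunables, so they only take effect after a reboot (one caveat: "device coretemp" is kernel-config syntax; in loader.conf the module would normally be loaded with coretemp_load="YES"). A sketch of how to cross-check them afterwards:

    # Loader-set variables show up in the kernel environment after a reboot.
    kenv | grep -E 'hw\.em|nmbclusters|somaxconn'
    # Read-only tunables can also be read back through sysctl.
    sysctl kern.ipc.nmbclusters kern.ipc.maxsockets kern.ipc.somaxconn
    # Confirm the modules requested above were actually loaded.
    kldstat | grep -E 'aio|htcp'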
    

    And this is the ping from pfSense:

    PING 192.168.2.1 (192.168.2.1) from 192.168.2.5: 56 data bytes
    64 bytes from 192.168.2.1: icmp_seq=0 ttl=64 time=0.203 ms
    64 bytes from 192.168.2.1: icmp_seq=1 ttl=64 time=0.149 ms
    64 bytes from 192.168.2.1: icmp_seq=2 ttl=64 time=0.142 ms
    64 bytes from 192.168.2.1: icmp_seq=3 ttl=64 time=20.249 ms
    64 bytes from 192.168.2.1: icmp_seq=4 ttl=64 time=1.627 ms
    64 bytes from 192.168.2.1: icmp_seq=5 ttl=64 time=0.177 ms
    64 bytes from 192.168.2.1: icmp_seq=6 ttl=64 time=0.158 ms
    64 bytes from 192.168.2.1: icmp_seq=7 ttl=64 time=0.101 ms
    64 bytes from 192.168.2.1: icmp_seq=8 ttl=64 time=0.219 ms
    64 bytes from 192.168.2.1: icmp_seq=9 ttl=64 time=0.149 ms
    
    --- 192.168.2.1 ping statistics ---
    10 packets transmitted, 10 packets received, 0.0% packet loss
    round-trip min/avg/max/stddev = 0.101/2.317/20.249/5.993 ms
    
    PING 192.168.1.254 (192.168.1.254) from 192.168.1.130: 56 data bytes
    64 bytes from 192.168.1.254: icmp_seq=0 ttl=64 time=0.858 ms
    64 bytes from 192.168.1.254: icmp_seq=1 ttl=64 time=0.821 ms
    64 bytes from 192.168.1.254: icmp_seq=2 ttl=64 time=0.686 ms
    64 bytes from 192.168.1.254: icmp_seq=3 ttl=64 time=0.805 ms
    64 bytes from 192.168.1.254: icmp_seq=4 ttl=64 time=0.672 ms
    64 bytes from 192.168.1.254: icmp_seq=5 ttl=64 time=0.667 ms
    64 bytes from 192.168.1.254: icmp_seq=6 ttl=64 time=1.909 ms
    64 bytes from 192.168.1.254: icmp_seq=7 ttl=64 time=6.646 ms
    64 bytes from 192.168.1.254: icmp_seq=8 ttl=64 time=0.638 ms
    64 bytes from 192.168.1.254: icmp_seq=9 ttl=64 time=0.767 ms
    
    --- 192.168.1.254 ping statistics ---
    10 packets transmitted, 10 packets received, 0.0% packet loss
    round-trip min/avg/max/stddev = 0.638/1.447/6.646/1.769 ms
    

    We will report back on Monday on how it behaves under more load.


  • LAYER 8 Global Moderator

    So are these physical wires and devices everywhere in this setup, or is this all virtual networks and VMs?

    Sorry, but even for 2 boxes connected together with a wire..  These just seem too low.

    64 bytes from 192.168.2.1: icmp_seq=0 ttl=64 time=0.203 ms
    64 bytes from 192.168.2.1: icmp_seq=1 ttl=64 time=0.149 ms
    64 bytes from 192.168.2.1: icmp_seq=2 ttl=64 time=0.142 ms

    .15 to .2 ms is freakishly FAST..  And then bounces to 20ms ??
    64 bytes from 192.168.2.1: icmp_seq=3 ttl=64 time=20.249 ms

    This seems more realistic for normal LAN pings - me pinging a box on my network:
    From 192.168.1.99: bytes=60 seq=0001 TTL=64 ID=54ef time=0.494ms
    From 192.168.1.99: bytes=60 seq=0002 TTL=64 ID=54f0 time=0.415ms
    From 192.168.1.99: bytes=60 seq=0003 TTL=64 ID=54f1 time=0.407ms
    From 192.168.1.99: bytes=60 seq=0004 TTL=64 ID=54f2 time=0.404ms

    Ok low 3's – but sub .2  -- I don't think I have ever seen such speeds.

    64 bytes from 192.168.2.1: icmp_seq=7 ttl=64 time=0.101 ms

    You sure you're not pinging your own IP address again? ;)

    Here is my box pinging itself
    From 192.168.1.100: bytes=60 seq=0001 TTL=128 ID=29d0 time=0.122ms
    From 192.168.1.100: bytes=60 seq=0002 TTL=128 ID=29d2 time=0.159ms
    From 192.168.1.100: bytes=60 seq=0003 TTL=128 ID=29d4 time=0.144ms
    From 192.168.1.100: bytes=60 seq=0004 TTL=128 ID=29d6 time=0.142ms

    Now sure, I can understand those speeds pinging your own IP.

    So here is a question for you - are you having actual operational issues, with real applications affected by packet loss? Or are you just seeing weird stuff when you're pinging?



  • So are these physical wires and devices everywhere in this setup, or is this all virtual networks and VMs?

    I've used a VM (VirtualBox on Windows) only for testing ping from another Linux box (I hate Windows ping); my pfSense setup is physical hardware!

    64 bytes from 192.168.2.1: icmp_seq=0 ttl=64 time=0.203 ms
    64 bytes from 192.168.2.1: icmp_seq=1 ttl=64 time=0.149 ms
    64 bytes from 192.168.2.1: icmp_seq=2 ttl=64 time=0.142 ms

    .15 to .2 ms is freakishly FAST..  And then bounces to 20ms ??
    64 bytes from 192.168.2.1: icmp_seq=3 ttl=64 time=20.249 ms

    You finally hit the problem!! ;D How is it possible?!  :o
    I don't know why it is so fast, but this is a production server, with good cables (Cat 5e, 6) and a decent switch (but not an excellent one).

    Ok low 3's – but sub .2  -- I don't think I have ever seen such speeds.

    The 192.168.2.1 box is a Zeroshell router; before putting it in I didn't have such low pings.

    You sure you're not pinging your own IP address again? ;)

    I'm sure because of this:

    PING 192.168.2.1 (192.168.2.1) from 192.168.2.5: 56 data bytes
    64 bytes from 192.168.2.1: icmp_seq=0 ttl=64 time=0.203 ms
    64 bytes from 192.168.2.1: icmp_seq=1 ttl=64 time=0.149 ms

    and:

    Interface configuration:

     WAN (wan)       -> em1        -> v4: 192.168.1.130/24
     DMZ (lan)       -> em0        -> v4: 172.16.30.5/24
     WAN2 (opt1)     -> em1_vlan13 -> v4: 192.168.2.5/24
     WIBRI (opt2)    -> em3        -> v4: 192.168.168.5/24
     SEDE (opt3)     -> em0_vlan10 -> v4: 192.168.132.5/24
     WAN3 (opt4)     -> em2        -> v4: 217.xx.xx.30/27
    

    So here is a question for you - are you having actual operational issues, with real applications affected by packet loss? Or are you just seeing weird stuff when you're pinging?

    I started monitoring my ping because of issues with VoIP, which is more susceptible to packet loss and high latency. I was getting a lot of dropped calls or very poor call quality.
    Then I PHYSICALLY switched WAN3 (where the VoIP goes out) from em0 (em0 and em1 are on the same physical dual-port card) to em2 (the motherboard's NIC), AND I moved the other traffic that exits from WAN3 over to WAN1, SO that network now works well. At that point I thought I had a broken or misbehaving NIC, but I tried another hardware configuration and the issue persisted. So, on my production server, I tried to put all the WANs on the NIC I was sure worked well: em3. I created and configured 2 WANs on 2 VLANs plus 1 WAN without a VLAN: DISASTER  :o ALL the WANs had the same problem, even worse. I also tried changing the switch!

    With my current configuration I don't have problems with VoIP, but web browsing is worse and slower because of this lag/packet loss.

    PS. thanks for your interest  :D


  • LAYER 8 Global Moderator

    And what MACs are you seeing for those IPs from arp -a on 2.5 when pinging 2.1?

    I will do some testing at work from a high-end Cisco switch connected to another high-end Cisco switch..  It's just that sub-.2 ms seems like one screaming LAN, or you're just pinging yourself..

    And you're sub .2 and then out of the blue 20 ms - then the next ping is back to sub .2; that just seems not right.
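
    For reference, a minimal way to check that (interface and addresses taken from the config posted earlier; adjust as needed) is to compare the MAC that 2.1 resolves to against the MAC of the interface holding 2.5:

    # On the pfSense box: these two MACs should differ if 192.168.2.1 really is
    # another device and not one of pfSense's own addresses.
    arp -an | grep '192\.168\.2\.1'
    ifconfig em1_vlan13 | grep ether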


  • Netgate Administrator

    Those ping times look OK to me, aside from the sudden jump to 20ms.
    My test pfSense box is set up behind my home pfSense box, connected directly by a 0.5m cat5e cable. The 'normal' ping response is <0.2ms. See attached RRD graph.
    That is attached to some bridged ports on the pfSense box as well, so I would expect that to add some time.

    Though your ping times are still lower than mine and you are using a switch.

    In your diagram above you seem to have two boxes labelled 192.168.1.254. Typo? Just indicating the subnet? Am I not understanding your diagram?

    Steve




  • @johnpoz:

    And what MACs are you seeing for those IPs from arp -a on 2.5 when pinging 2.1?

    I will do some testing at work from a high-end Cisco switch connected to another high-end Cisco switch..  It's just that sub-.2 ms seems like one screaming LAN, or you're just pinging yourself..

    And you're sub .2 and then out of the blue 20 ms - then the next ping is back to sub .2; that just seems not right.

    For short cable runs on gigabit, that is rather normal.  I usually run ping tests after setting up structured cabling (lost packets are more indicative of issues than throughput tests, which vary widely).  Sub .3ms is very normal even for longer runs (>50m).  The varying factor is usually the load on the end-point systems.
    e.g.
    This is a ping test for a Gigabit connected access point that is currently actively streaming HD videos over WLAN:

    --- 192.168.0.11 ping statistics ---
    5 packets transmitted, 5 packets received, 0.0% packet loss
    round-trip min/avg/max/stddev = 0.463/4.956/9.644/3.571 ms
    

    This is for a 10/100-connected access point that is currently inactive (no clients connected):

    --- 192.168.0.10 ping statistics ---
    5 packets transmitted, 5 packets received, 0.0% packet loss
    round-trip min/avg/max/stddev = 0.257/0.275/0.327/0.026 ms
    

    And this is a ping test for my Gigabit connected PC that's hardly doing much other than streaming a youtube video or two:

    --- 192.168.0.2 ping statistics ---
    5 packets transmitted, 5 packets received, 0.0% packet loss
    round-trip min/avg/max/stddev = 0.247/0.262/0.312/0.025 ms
    

  • Netgate Administrator

    Forgot to say that my boxes above are both using fxp 10/100 NICs.  ;)

    Steve



  • It's really normal to see <0.5ms pings on a local network run (even through dumb switches).  My point is that the latencies will vary a great deal based on the end-point devices and their load.  A short spike may just indicate that the end device is under load at the time.

    In this case, the devices the OP is pinging may simply be under load at the time (since they are technically routers doing their job).



  • @dreamslacker:

    My point is that the latencies will vary a great deal based on the end-point devices and their load.  A short spike may just indicate that the end device is under load at the time.
    In this case, the devices the OP is pinging may simply be under load at the time (since they are technically routers doing their job).

    It might seem plausible, but if that were the case I would expect to saturate 100 Mbit, or the end-point devices to be at 100% CPU or something like that. It also can't be the explanation because when I see this latency from pfSense, if I ping the same router from another box at the SAME TIME, I get a fast and stable ping time.
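
    The comparison I mean is roughly this (a sketch; the file names are just examples): start both captures in the same minute and compare the per-packet times and the summaries.

    # On pfSense (SSH or Diagnostics > Command Prompt):
    ping -c 60 192.168.2.1 > /tmp/ping_from_pfsense.txt
    # At the same time, on the other box plugged into the same switch:
    ping -c 60 192.168.2.1 > /tmp/ping_from_other_box.txt
    # If the pfSense capture shows the spikes/loss while the other one is flat,
    # load on the target router is an unlikely explanation.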

    @stephenw10:

    In your diagram above you seem to have two boxes labelled 192.168.1.254. Typo? Just indicating the subnet? Am I not understanding your diagram?

    Sorry, my error: em1 is 192.168.1.130/24, not 192.168.1.254.



  • @bullet92:

    It might seem plausible, but if that were the case I would expect to saturate 100 Mbit, or the end-point devices to be at 100% CPU or something like that. It also can't be the explanation because when I see this latency from pfSense, if I ping the same router from another box at the SAME TIME, I get a fast and stable ping time.

    Is this machine on the same network as pfSense?

    Presumably, pfSense is pinging the said router on its 'LAN' or 'DMZ' interface.  Is the machine you're using also attached to the same interface or a different interface?

    BTW, do you have traffic shaping or QOS enabled on pfSense or the target router(s)?


  • LAYER 8 Global Moderator

    "It's really normal to see <0.5ms pings on a local network run"

    Agreed..  .4 to .5 is very common; I see this all the time in the LAN and expect it..  It's just .1 to .2 that I don't see - I wouldn't call our switches overworked or anything, but the only time I recall seeing such low numbers is pinging locally..

    When I'm at work tomorrow I'm going to ping around the datacenter to see what kind of low times I can find ;)



  • @dreamslacker:

    Is this machine on the same network as pfSense?
    Presumably, pfSense is pinging the said router on its 'LAN' or 'DMZ' interface.  Is the machine you're using also attached to the same interface or a different interface?
    BTW, do you have traffic shaping or QOS enabled on pfSense or the target router(s)?

    Yes, I've put this machine on the same switch (so the same router interface) as pfSense. Yes, traffic shaping is currently enabled on pfSense and disabled on the target routers.



  • @bullet92:

    @dreamslacker:

    Is this machine on the same network as pfSense?
    Presumably, pfSense is pinging the said router on its 'LAN' or 'DMZ' interface.  Is the machine you're using also attached to the same interface or a different interface?
    BTW, do you have traffic shaping or QOS enabled on pfSense or the target router(s)?

    Yes, I've put this machine on the same switch (so the same router interface) as pfSense. Yes, traffic shaping is currently enabled on pfSense and disabled on the target routers.

    Try prioritizing ICMP using the floating rules and see what you get. In practical terms it does little, but if it helps then you know what to do to get the best effect on your setup.
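
    If you try that, one way to confirm the ICMP is really landing in the high-priority queue (and to see whether any queue is dropping) is to watch the ALTQ counters from the shell while a ping runs. This is only a sketch; the queue names are whatever your shaper configuration generated:

    # Per-queue packet and drop counters; run it a few times while pinging and
    # watch which queue's counters move.
    pfctl -vsq
    # Or loop it for a crude live view:
    while true; do clear; pfctl -vsq; sleep 2; done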



  • @johnpoz: Don't bother to do so on my account. Lol. I do see 0.2 ms round trips on wired connections now and then during line testing, but this is with fully idle systems. I.e. an idling system pinging a smart switch on the other end with no other connected devices.
    Nevertheless, a lightly loaded system should still give 0.4-0.5 ms pings.


  • LAYER 8 Global Moderator

    I'm talking about pinging, say, the console or mgmt switches from another switch - neither would be very active; idly sucking juice is about all they would be doing.. Now in a DC the run might be 100 ft, or quite possibly they are next to each other in the rack, but most likely connected through patch panels anyway.

    I wouldn't be doing it for anyone other than myself - I don't recall ever seeing those kinds of speeds in any network I have worked on in 20+ years..  But then again maybe I just never pinged anything directly connected ;)  Quite possible.



  • I only do that to quickly test structured runs, really.  It's not indicative of real-world practical applications, but any major issues or interference generally show.



  • @dreamslacker:

    Try prioritizing ICMP using the floating rules and see what you get. In practical terms it does little, but if it helps then you know what to do to get the best effect on your setup.

    @192.168.2.1-QOS_ENABLED:

    TEST1
    ping -c50 192.168.2.1
    --- 192.168.2.1 ping statistics ---
    50 packets transmitted, 50 packets received, 0.0% packet loss
    round-trip min/avg/max/stddev = 0.117/1.591/14.669/3.278 ms
    TEST2
    --- 192.168.2.1 ping statistics ---
    50 packets transmitted, 50 packets received, 0.0% packet loss
    round-trip min/avg/max/stddev = 0.132/1.868/19.644/3.629 ms

    @192.168.1.254-QOS_ENABLED:

    TEST1
    ping -c50 192.168.1.254
    --- 192.168.1.254 ping statistics ---
    50 packets transmitted, 50 packets received, 0.0% packet loss
    round-trip min/avg/max/stddev = 0.640/2.117/10.636/2.417 ms
    TEST2
    --- 192.168.1.254 ping statistics ---
    50 packets transmitted, 50 packets received, 0.0% packet loss
    round-trip min/avg/max/stddev = 0.609/1.666/12.005/1.936 ms

    @192.168.2.1-QOS_DISABLED:

    TEST1
    --- 192.168.2.1 ping statistics ---
    50 packets transmitted, 48 packets received, 4.0% packet loss
    round-trip min/avg/max/stddev = 0.105/4.136/46.755/8.910 ms

    TEST2
    --- 192.168.2.1 ping statistics ---
    50 packets transmitted, 50 packets received, 0.0% packet loss
    round-trip min/avg/max/stddev = 0.116/2.888/20.880/5.121 ms

    @192.168.1.254-QOS_DISABLED:

    TEST1
    ping -c50 192.168.1.254
    --- 192.168.1.254 ping statistics ---
    50 packets transmitted, 50 packets received, 0.0% packet loss
    round-trip min/avg/max/stddev = 0.614/3.516/78.450/10.933 ms
    TEST2
    --- 192.168.1.254 ping statistics ---
    50 packets transmitted, 50 packets received, 0.0% packet loss
    round-trip min/avg/max/stddev = 0.646/2.797/30.649/4.756 ms

    @johnpoz:

    And what MACs are you seeing for those IPs from arp -a on 2.5 when pinging 2.1?

    arp -a
    ? (192.168.2.1) at 00:0d:61:79:54:d8 on em1_vlan13 expires in 1175 seconds [vlan]
    ? (192.168.2.5) at 00:26:55:e3:3f:67 on em1_vlan13 permanent [vlan]

    Yesterday I also updated to a 2.1.1 snapshot, but no change.
    I think the prioritization gives some benefit, but the problem isn't the ping itself (obviously) but the internet traffic that goes through these links, which ends up slower.. So if QoS is working, is there a bottleneck somewhere in my box? It seems strange; the load is low and so are the connections..
    Moreover, even if QoS reduces the problem, it is still present; it is inconceivable to go from 0.117 ms latency to 14.669 ms. :/

    P.S. At this time (12.00am) every day the ping and packet loss are worse:

    PING 192.168.2.1 (192.168.2.1): 56 data bytes
    64 bytes from 192.168.2.1: icmp_seq=0 ttl=64 time=49.547 ms
    64 bytes from 192.168.2.1: icmp_seq=1 ttl=64 time=54.244 ms
    64 bytes from 192.168.2.1: icmp_seq=2 ttl=64 time=51.494 ms
    64 bytes from 192.168.2.1: icmp_seq=3 ttl=64 time=27.504 ms
    64 bytes from 192.168.2.1: icmp_seq=4 ttl=64 time=119.251 ms
    64 bytes from 192.168.2.1: icmp_seq=5 ttl=64 time=43.744 ms
    64 bytes from 192.168.2.1: icmp_seq=7 ttl=64 time=8.241 ms
    64 bytes from 192.168.2.1: icmp_seq=8 ttl=64 time=0.118 ms
    64 bytes from 192.168.2.1: icmp_seq=9 ttl=64 time=90.099 ms
    
    --- 192.168.2.1 ping statistics ---
    10 packets transmitted, 9 packets received, 10.0% packet loss
    round-trip min/avg/max/stddev = 0.118/49.360/119.251/35.273 ms
    

    top

    last pid: 55487;  load averages:  0.00,  0.03,  0.01    up 0+19:20:19  12:23:35
    39 processes:  1 running, 38 sleeping
    CPU:  0.0% user,  0.0% nice,  0.4% system,  0.0% interrupt, 99.6% idle
    Mem: 63M Active, 213M Inact, 143M Wired, 1052K Cache, 112M Buf, 1570M Free
    

    Swap:

    vmstat -i

    interrupt                          total       rate
    irq1: atkbd0                           3          0
    irq14: ata0                           57          0
    irq19: uhci1+                     135782          1
    cpu0: timer                     27847076        399
    irq256: em0                      4073157         58
    irq257: em1                     44002413        631
    irq258: em2                       818545         11
    irq259: em3                     40608091        583
    cpu3: timer                     27847053        399
    cpu2: timer                     27847052        399
    cpu1: timer                     27847052        399
    Total                          201026281       2886
    

                        /0   /1   /2   /3   /4   /5   /6   /7   /8   /9   /10
         Load Average

          Interface           Traffic               Peak                Total
         em1_vlan13  in    216.004 KB/s        233.244 KB/s            2.379 GB
                     out    10.831 KB/s         11.537 KB/s            1.078 GB

         em0_vlan10  in      0.146 KB/s          2.091 KB/s          175.861 MB
                     out     1.430 KB/s         13.097 KB/s            3.474 GB

                lo0  in      0.000 KB/s          0.000 KB/s          957.725 KB
                     out     0.000 KB/s          0.000 KB/s          957.725 KB

                em3  in    126.926 KB/s        165.866 KB/s            2.608 GB
                     out   316.054 KB/s        316.054 KB/s          702.344 MB

                em2  in      0.600 KB/s          0.600 KB/s           70.714 MB
                     out     0.744 KB/s          0.744 KB/s          286.328 MB

                em1  in    316.593 KB/s        316.593 KB/s            2.063 GB
                     out   123.062 KB/s        129.154 KB/s            2.080 GB

                em0  in      1.521 KB/s          4.879 KB/s          376.255 MB

    netstat -i -b -n -I em1_vlan13

    Name               Mtu Network       Address              Ipkts Ierrs Idrop     Ibytes    Opkts Oerrs     Obytes  Coll
    em1_vlan13        1496 <link#11>     00:26:55:e3:3f:67 14875307     0     0 2561511047  8441171 72643 1157981020     0
    em1_vlan13        1496 fe80::226:55f fe80::226:55ff:fe        0     -     -          0        2     -        152     -
    em1_vlan13        1496 192.168.2.0/2 192.168.2.5           7945     -     -     508860       20     -       1680     -
    

    netstat -i -b -n -I em1

    Name               Mtu Network       Address              Ipkts Ierrs Idrop     Ibytes    Opkts Oerrs     Obytes  Coll
    em1               1500 <link#2>      00:26:55:e3:3f:67 32198966     0     0 2232366523 19473909     0 2238003408     0
    em1               1500 fe80::226:55f fe80::226:55ff:fe        0     -     -          0        1     -         96     -
    em1               1500 192.168.1.0/2 192.168.1.130        25921     -     -    3280518       10     -        840     -
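
    There is also a non-zero Oerrs count on em1_vlan13 above; a way to watch whether those counters keep climbing while the latency is bad (just a sketch):

    # Per-second packets, errors and drops for the VLAN and its parent NIC.
    netstat -w 1 -I em1_vlan13
    netstat -w 1 -I em1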
    


  • @bullet: I meant prioritizing the ping packets rather than disabling the QoS as a whole. That is, for the purpose of testing, set floating rules on each interface with direction out, protocol ICMP, source address of the interface, and place them in the highest-priority queue.

    Edit:  By any chance, do you have Upperlimit, or Limiter, or PowerD with P4TCC, or any combination of those enabled?
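
    For the PowerD/P4TCC part, that can be checked from the shell (a sketch; the exact frequency levels depend on the CPU):

    # Is powerd running at all?
    ps ax | grep '[p]owerd'
    # Current CPU frequency and the available levels; if p4tcc is active it adds
    # many small throttling steps on top of the normal SpeedStep levels.
    sysctl dev.cpu.0.freq dev.cpu.0.freq_levels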



  • @dreamslacker
    Yes, I understand that; in fact that is what I've done!

    I set up the traffic shaper with the wizard and it automatically set an Upperlimit.

    So now I tried removing the shaper entirely, and this is what I get:

    --- 192.168.2.1 ping statistics ---
    50 packets transmitted, 50 packets received, 0.0% packet loss
    round-trip min/avg/max/stddev = 0.100/0.174/0.251/0.039 ms

    SOLVED!!!!  ;D ;D ;D ;D

    PS: So now I only need to configure the traffic shaping correctly.


  • LAYER 8 Global Moderator

    Glad you got your issue solved.. I want to do some more testing in my own DC, trying to find these sub-.2 ms response times..  So with a couple of Cisco switches yesterday I pinged from one to the other.. Now the IPs were SVIs on a VLAN, but the switches weren't doing anything and were directly connected, and I was seeing roughly .4 ms..  Which is a bit of a difference from .1 ms ;)

    I will be keeping an eye out.. I just personally don't recall seeing such low response times, even on a local LAN with directly connected equipment, unless you were pinging loopback, a local IP, etc.

    But I will be paying more attention in the future on the hunt ;)



  • @bullet92:

    @dreamslacker
    Yes, I understand that; in fact that is what I've done!

    I set up the traffic shaper with the wizard and it automatically set an Upperlimit.

    So now I tried removing the shaper entirely, and this is what I get:

    --- 192.168.2.1 ping statistics ---
    50 packets transmitted, 50 packets received, 0.0% packet loss
    round-trip min/avg/max/stddev = 0.100/0.174/0.251/0.039 ms

    SOLVED!!!!  ;D ;D ;D ;D

    PS: So now I only need to configure the traffic shaping correctly.

    I dislike the wizard; it never gives an optimal solution (nor should it, since every use case is different).

    Nevertheless, you can run the wizard but remove all upperlimits in all queues once it's done.  I've found that it increases latencies across the board (usually when it's active on the queue and occasionally at low loads) for some unknown reason.

    I've settled on manually setting up my own queues and settings damn nearly since day 1 (pfSense 1.2rc2).


  • Netgate Administrator

    Good call.  :)

    Time for a low ping contest Johnpoz?

    Steve



  • @dreamslacker:

    I dislike the wizard; it never gives an optimal solution (nor should it, since every use case is different).
    Nevertheless, you can run the wizard but remove all upperlimits in all queues once it's done.  I've found that it increases latencies across the board (usually when it's active on the queue and occasionally at low loads) for some unknown reason.
    I've settled on manually setting up my own queues and settings damn nearly since day 1 (pfSense 1.2rc2).

    I will try your advice, thanks ;)

    Glad you got your issue solved.. I want to do some more testing in my own DC, trying to find these sub-.2 ms response times..  So with a couple of Cisco switches yesterday I pinged from one to the other.. Now the IPs were SVIs on a VLAN, but the switches weren't doing anything and were directly connected, and I was seeing roughly .4 ms..  Which is a bit of a difference from .1 ms ;)
    I will be keeping an eye out.. I just personally don't recall seeing such low response times, even on a local LAN with directly connected equipment, unless you were pinging loopback, a local IP, etc.
    But I will be paying more attention in the future on the hunt ;)

    Thx :) I suggest you try pinging a Zeroshell box! Just out of curiosity :D

