Packet loss and jitter issue



  • Hello and first off, apologies if this has been answered before; I could not find anything with a quick search.

    So, I have two issues, the first being packet loss. Almost exactly every hour on the minute, I get packet loss.


    It's not the end of the world. We're literally talking about 40 out of almost 22000 pings that fail to come through. What bothers me is how clockwork-precise it seems.
    I noticed that at 12:30 I didn't have any, which is odd. It's also the only time I was truly active on the computer that ran the test.

    I'm very new to optimizing networks, and I ended up turning to pfsense in the first place because I have a 4G connection and 7 people active on the network. I needed a scheduled bandwidth limit per person to make it even remotely usable, since the speeds vary a lot over the day (80Mbit/s in the mornings, 10Mbit/s during the afternoon). My D-link router was incapable of doing this and my Huawei router failed as well. I had a computer standing around, so I set up the Huawei to act more like a modem and just let everything through to the pfsense computer, using that as the router instead, and I've had pretty good results.

    However…
    As I said, the packet loss above hasn't really posed any issue that I've noticed, but I have had some jitter issues.
    Practically every game I play is telling me "Yo, your latency is really up and down, might wanna have that looked at". The two primary ones being Overwatch and BF4 which both have fancy symbols that pop up in game to give you a heads up that stuff isn't working properly. Both are telling me that I have severe ping variations. Now I wouldn't mind just saying "4G" and being done with it, but before I got pfsense I can't say I noticed it happening.
    I can however also say that I don't think it has impacted me a lot during online gaming. Since I am running 4G, I'm used to having 50-60 ms latency, whereas other people in my town can run 20-30. That's... just 4G.
    Unfortunately, fiber hasn't quite reached my area yet and the standard ADSL could only supply roughly 8Mbit/1Mbit, whereas the 4G can at least do 10-80Mbit down and 15-30Mbit Up.
    For the first year or so I had huge stability issues on the 4G, but buying an external antenna helped, and eventually my ISP got things to a decent place. I've not really had any stability issues from their side (as far as I can tell) since.

    Sorry for the sidetrack.
    TL;DR
    Have almost scheduled Packet Loss, any ideas?
    My latency seems to have become somewhat unstable as of getting pfsense. Ideas?

    Bonus points:
    Any way to truly prioritise TS3 traffic (as a client, not a server)?
    It seems to work poorly at best.

    Extra information:
    The Dashboard System Information:

    Dashboard Gateway widget:

    I've run the traffic shaping wizard attempting to prioritize the games I play, as well as the TS3 misc option.
    I have a limiter that uses 5 schedules to make the network usable for everyone during the day.
    I do not notice any difference on the pingplotter when I use "my allowance". Whether I max it or use a few KB/s doesn't seem to impact it.

    Just ask if you need/want more information.
    Thanks in advance.

    I noticed the images are huge on here. Please tell me if you want me to just post links instead of images and I'll sort that ASAP.


  • LAYER 8 Netgate

    Coincidence does not prove causation.

    I don't see anything in the graphs you posted to indicate anything occurring on the hour as you suggest. To assert that, you would have to discard all the higher-latency samples that are not on the hour.

    If you are seeing something else, please share.



  • I believe that the poster's "on the hour" refers to the period between packet loss events rather than wall clock time. The periodic nature of the loss events looks pretty clear on the graph.



  • Dennypage is correct. There's no increase in latency every hour, but the red bar is packet loss, and it's pretty spot on every hour.

    Edit: I can see the confusion now. English is not my first language, sorry. What I meant with "On the hour" was 'With an hour between' each event. My bad.
    It seems to occur around minute 32 of each hour. That is to say, 10:32, 11:32, etc.

    Edit2:
    To add, here's another screenshot that shows the packet loss.

    (I made this one just a link to avoid cluttering the thread too much.)
    The timing is almost exactly the same on the other hours. At minute 31, seconds 30-45, it drops a few packets.
    I ping every 2 seconds by the way.
    So far I have had the pingplotter running for ~13 hours and it seems to drop 3-4 pings every hour. (48 dropped on 12 events, the 12:30 one seemingly absent for some odd reason)
    http://image.prntscr.com/image/031dab9411044ca0be8b64a08c9bc80f.png
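
    As a rough illustration of how the "same minute every hour" pattern could be checked from a log of lost pings, the sketch below buckets loss timestamps by minute of the hour. The function name `loss_by_minute` and the sample timestamps are hypothetical; they are not taken from the PingPlotter data in this thread.

```python
from collections import Counter
from datetime import datetime


def loss_by_minute(lost_timestamps):
    """Count lost pings per minute-of-hour (0-59)."""
    return Counter(ts.minute for ts in lost_timestamps)


# Hypothetical sample: three lost pings in each of three different hours,
# all falling inside minute 31 (the date is arbitrary).
lost = [
    datetime(2017, 1, 1, hour, 31, second)
    for hour in (10, 11, 13)
    for second in (32, 34, 36)
]

counts = loss_by_minute(lost)
print(counts.most_common(1))  # [(31, 9)]
```

    If the losses were random, the counts would be spread across many minutes; a single dominant bucket points at something scheduled.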


  • LAYER 8 Netgate

    What packages are you running?



  • I have no additional packages installed.

    I have had Darkstat and RRD_Summary installed before, but not any more. (And they haven't been installed since I started logging, by which I mean pingplotting.)


  • LAYER 8 Netgate

    I could install a thousand different networks and not see what you are seeing.

    I would run Diagnostics > Packet Capture on LAN, limited to the IP address of the pinging host (Host Address) and Protocol ICMP, and see what it shows. Set the Count to something like 100000.

    With it being so regular, it ought to be simple to capture the event. Perhaps the packets aren't even making it to the firewall for some reason.



  • Is this correct? Do I just wait now?
    http://image.prntscr.com/image/35aa487775714dc3a08cdc8b0f214a9a.png

    Edit: While looking I also noticed that I've had a second period where the latency is much better.
    http://image.prntscr.com/image/797c4eae213c494895992b3fa4d16edb.png
    For some reason the latency drops down to much better values and remains there for several minutes. I have no idea what caused this either. (Not that I mind, it's amazing and I would love to make it the norm.)


  • LAYER 8 Netgate

    I don't know, how about you attach images to the post like everyone else?

    What are you pinging and from where? What is 192.168.1.1? Since that is a device outside WAN then change your packet capture to be on WAN with a Host address of 192.168.1.1 protocol ICMP.

    Since that is also your gateway monitoring IP address you will probably need to disable gateway monitoring to eliminate those pings from your capture or pull the capture file into wireshark to make sense of it.



  • My apologies, I use a screenshotting service so I rarely bother saving the images to my computer, I will do so and attach them from now on.

    192.168.1.1 is my 4G router and I'm pinging it from my computer.
    Network is set up as follows:

    4G router -> pfsense router -> network (including 192.168.0.10, which is where I am pinging from).

    I have never used wireshark, but I will download it and check it out.

    I attached the images from my previous post.
    Edit: actually, the second image is with the new settings you proposed.

    Appreciate the attempt to help me a lot. Being new to all this makes it rather daunting.






  • So, I ran wireshark and pingplotter side by side and noticed something odd, so I decided to pull the capture on pfsense early to try and figure it out and instead I got more confused.

    At 23:31:38 I had another packet loss event… Pingplotter lost all packets sent, but... The live capture on wireshark doesn't even have the packets there? It just jumps from 23:31:38 to 23:31:50.
    So I figured I'd check what the pfsense packet capture says and.... apparently it kept pinging?
    The confusion for me is that Pingplotter says the packets were lost. The Wireshark live capture seems to say they were never sent and never expected a reply, and yet the pfsense capture seems to say that the pinging continued, it just didn't pass back down the line...

    Unfortunately Wireshark crashed soon after so I lost the livecapture data.

    I feel like I'm way out of my depth, but I'm very willing to learn if anyone would like to guide me through this?
    Am I missing something or doing something wrong?

    edit:
    for clarification.

    192.168.1.1 is the 4G router. 192.168.1.2 is the pfsense router (also 192.168.0.1).
    192.168.0.10 is the pinging machine.







  • LAYER 8 Netgate

    Those TTL exceeded messages mean the packet was likely ping-ponging between two devices that each think 192.168.0.10 is on the other device. Definitely not normal.

    What interface were those captures taken on?
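
    To make the ping-pong explanation concrete, here is a minimal sketch of how a routing loop between two devices ends in a TTL exceeded message: every hop decrements the TTL, and whichever device brings it to zero drops the packet and reports it. The function `bounce_until_expired` and the device names are hypothetical, and this is a simplified model of IP forwarding rather than a description of what pfsense does internally.

```python
def bounce_until_expired(initial_ttl, devices=("device A", "device B")):
    """Simulate a packet looping between devices that each forward it to the other.

    Returns (number_of_forwards, device_that_dropped_the_packet).
    """
    ttl = initial_ttl
    forwards = 0
    while True:
        current = devices[forwards % len(devices)]
        ttl -= 1  # each device decrements the TTL before forwarding
        if ttl <= 0:
            # TTL ran out here: the packet is dropped and an
            # ICMP "time exceeded" would be sent back to the source.
            return forwards, current
        forwards += 1


forwards, dropped_by = bounce_until_expired(64)
print(f"forwarded {forwards} times, dropped by {dropped_by}")
```

    With a starting TTL of 64 the packet is forwarded 63 times before it is discarded, which is why a loop shows up as TTL exceeded rather than as a plain timeout.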



  • Let's see. The one that mentions the TTL exceeding is Wireshark running a live capture on the pinging machine (192.168.0.10) itself. It's not from pfsense's packet capture.
    Guess that would mean it's running on LAN? This is where me being out of my depth really doesn't help.

    The TTL exceeded messages go from the pfsense to the pinging machine. Here's another screencap of Wireshark running a live capture.
    I've named the IPs and limited it to ICMP only. Other than that, Wireshark is running all default.

    Edit:
    Another weird thing happened around 00:35 (about an hour before this post).
    Added another pingplotter screenshot that shows it.
    The average ping time dropped significantly.
    Before it was running between 0-8ms with an average of ~4ms…
    After 00:35 it just drops down to 0-3ms with an average of ~1ms

    Edit2:
    I try to name the attachments to make their source more obvious, but here's a key.

    LiveCaptureWireshark is wireshark running on 192.168.0.10 (the machine that is doing the pinging, also referred to as "Pinging Machine").
    PfsenseCapture is pfsense's packet capture log that I opened in wireshark. This runs on the WAN interface, ICMP protocol and Host Address 192.168.1.1 (the 4G router).
    PingPlotCapture is pingplotter running on 192.168.0.10 (Pinging Machine).






  • So, I've found out why the latency of the pings gets better…
    It hasn't made me any wiser though.
    If I start a file transfer, the latency of the pings goes down. If I pause the download, the latency goes up.
    Seems like it fast-tracks the ping packets when there's other traffic, but for some reason doesn't when there isn't.

    Yeah... I have not a clue. I can replicate this though. As soon as I noticed, I decided to try it by starting and pausing a file transfer, and I can observe the ping latency drop when I start and increase when I pause, which is exactly the opposite of what I would expect.

    Edit:
    It is now 02:30 where I live so I will be heading to bed. I'll try to respond to any ideas first thing in the morning. Thanks so far.



  • LAYER 8 Netgate

    If livecapture is running on the pinging host then it is that machine that did not send pings between :38 and :50 in the capture above. If it does not send the requests, there will be no replies.

    You have something funky going on.
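
    One way to spot a stretch like that in a capture is to look for gaps between consecutive echo requests that are longer than the ping interval. The sketch below assumes the request timestamps have already been exported from the capture into a list; `find_gaps`, the tolerance factor and the sample times are hypothetical and only mirror the 2-second interval and the :38 to :50 gap described above.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, expected_interval=timedelta(seconds=2), tolerance=1.5):
    """Return (start, end) pairs where consecutive packets are further
    apart than expected_interval * tolerance."""
    limit = expected_interval * tolerance
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > limit]


# Hypothetical echo-request times: one every 2 s starting at 23:31:30,
# with nothing sent between 23:31:38 and 23:31:50 (the date is arbitrary).
base = datetime(2017, 1, 1, 23, 31, 30)
sent = [base + timedelta(seconds=s) for s in (0, 2, 4, 6, 8, 20, 22, 24)]

gaps = find_gaps(sent)
for start, end in gaps:
    print(start.time(), "->", end.time(), (end - start).total_seconds(), "s")
```

    Running the same check on the capture taken on the pinging host and on the one taken on the firewall shows on which side the requests stop appearing.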



  • It might be counterintuitive at first, but the decrease in latency under load is normal behavior. As a general rule, a system processing a small number of packets per second will show much higher latency than a system processing a larger number of packets. There are a couple of reasons for this. The first reason is power save on the components involved in processing packets. This can be CPU, memory, bus, network chip, etc. If there is nothing to do, components are generally placed into a lower power state to save energy. Once a component enters a lower power state, it takes time to bring the component back out to resume processing.

    The other reason is the use of short term spin locks in drivers. It is common these days to spin for a few microseconds after processing a packet to see if another packet arrives before returning to dependence upon asynchronous signaling methods such as interrupts. If another packet arrives within the duration of the spin, it's a huge win in terms of latency, and only a minor energy cost if it doesn't.

    None of this is likely related to your packet loss issue.




  • Mate I am using this application Kill Ping. It shows you how much jitter or ping has been reduced during any session on any game. I play overwatch so it shows results for overwatch.

    I hope this helps you. I know VPN should be the last resort but if nothing is helping you this should get you through.



  • @dennypage:

    That's really interesting, didn't know that. I assumed that they would work independently of each other.

    I find programs that make bold claims quite sketchy and I doubt that routing is my issue, but I will have a look, cheers Imzhell.

    In other weird news… I viewed the full pfsense packet capture... I can't find any anomalies in it. It starts just after midnight and continues to around 05:48. During those hours I had packet loss around minute 31, second 40 just as before, but I can't find anything that would indicate it at all in the log. The log seems to say that the pings kept on ticking despite pingplotter (the program that does the pinging) saying that they failed.
    I tried to attach it to the post, but it's too big. I can't find any option to cut it into bits?

    I also kept having packet loss until around 05:30... after which it disappeared for about 8 hours... and came back around 13:35... It's only 13:44 now, so I will keep my eyes open to see if it continues or if this last one was a coincidence.




  • I'm starting to believe that it isn't on my side…
    I ran the pingplotter for ~24 hours and did some packet capturing, but I couldn't find anything in particular. So I tried restarting pfsense and mucking about with a few things to see if I could get any sort of response (I didn't), and I'm still getting packet loss at the same minute. Not a few minutes later or with any other delay.
    Exact same minute. Minute 31, seconds 30-45.

    Think I'm going to call my ISP tomorrow and see if they can spot anything.

