PFsense Blocking Some Traffic



  • OK, i have been through the "connectivity troubleshooting" page as well as the "unable to access some websites" page.
    nothing looks out of the ordinary, everything seems ok.

    the issue.....

    • unable to access amazon.co.uk from wife's phone via wifi but amazon.com loads (pc and other phones work fine)

    • unable to access virgin tv go services from android tv box, all other phones/pc's work fine

    • unable to load ebay's app landing/home page on wife's phone via wifi but searching for items as well as watching/buying lists all load ok, all other phones/pc's are fine

    my phone loads virginmedia home page very sluggishly as do ALL devices

    all the above issues vanish when using 4g data on phones (can't test on hardwired devices)

    traceroute all resolve to the same IP for amazon and ebay and virgintvgo from all devices.
    ping responds from all devices when pinging amazon,virgintvgo and ebay

    network is set up as such....
    virgin superhub3 in modem mode into netgear switch
    pfsense thin client into netgear switch

    the above 2 ports vlan'd (only 1 NIC on pfsense hardware)

    all pc's and android tv boxes are hardwired to netgear switch.
    TP link wifi access point hardwired to netgear switch
    wifi is ONLY used for phones/tablets and printer

    everything seems to be running fine apart from the above issues, these issues ONLY start when PFsense is running (if i unplug pfsense and run straight from superhub3 back in router mode none of these issues happen)

    things i have tried:

    • updating to pfsense 2.4.4.1 (no change to issues)

    • rebooting pfsense and modem and all devices (no change to issues)

    • fresh clean install of pfsense in its default configuration (no change to issues)

    • using googles DNS servers instead of virgin media's standard/autodetect ones (no change to issues)

    • checked MTU, both wan and lan listed as 1500

    • used grc dns benchmark tool as requested in another thread, no issues reported everything resolved

    pfsense hardware:
    10zig 58xx thin client, intel atom d2550, 2gb ram, 16gb ssd

    the only thing ive been able to find that could be causing this is the WAN quality graphs (image: 0_1544734271241_graph.jpg)
    this seems to have improved after a wipe and clean install but still shows issues? (image: 0_1544734366848_after reload.jpg)

    any other logs/info i can give you guys to help???


  • LAYER 8 Global Moderator

    Dude nothing is going to work with packet loss like that!!



  • @johnpoz funny you should say that... as everything is actually working.. speed tests, website browsing. video streaming, youtube it all works fine.......

    so how do i find out whats causing that packet loss?


  • LAYER 8 Global Moderator

    If you're saying it's working then it might be just dropping pings to your gateway... But when quality sees high loss and thinks internet is down it can reset all your states..

    That is not good condition.. Are you pinging just your gateway - or did you change it to something else.



  • if standard is to ping the gateway then thats all dpinger is doing, i have a log running from think-broadband pinging my public IP address and its not showing the same packet loss that pfsense is :/
    the strange thing is that i was surfing the web and watching youtube (and mrs on facebook) while pfsense was claiming 100% packet loss for about 3 mins solid!
    so is this a "false flag"? that pfsense is reporting high packet loss because its pinging its own gateway?
    from what i can tell i can still browse the internet no problems although pfsense is reporting 100% packet loss :/


  • LAYER 8 Global Moderator

    While it might be false flag - ie pfsense gateway just not answering the PING..

    But when that happens pfsense can think internet is down and reset all states - then you lose connection.

    Under Advanced, Misc
    0_1544784586517_flushstates.png

    Or set gateway to always be up and disable monitoring.



  • @johnpoz ill check that tonight and see if its flushing the states.
    0_1544797787436_012bc9b315492987c25d811eae73d37560abd658-14-12-2018.png
    the spike of packet loss at around 10pm was before i allowed ICMP ping requests as a rule and the 2nd spike at around 11pm was when i rebooted pfsense

    never used thinkbroadband monitor before so no idea if those ping spikes/results are normal for my line (speedtest.net is ALWAYS below 20ms ping and 2ms jitter but that test is with an idle connection so best case scenario)
    think broadband pings once every second no matter the load on bandwidth



  • It doesn't flush/reset all states (box is unticked) and I've disabled gateway monitoring.
    But that was more of a side quest.....
    Anyone got any ideas about the original issue? It's still happening.
    Not being able to watch virgin TV go app on my android TV box is a killer (for the Mrs) and if I can't resolve it I'll have to rip out pfsense and ditch it.



  • Anyone got any other ideas?



  • @noob said in PFsense Blocking Some Traffic:

    Anyone got any other ideas?

    Yeah, a few questions:

    You are saying that your configuration is: modem->switch->pfSense? And other devices are connected to the switch? Just want to confirm this.

    What or where is the gateway monitor? Is it the modem? Google's DNS servers? Someplace else?

    The issues you're experiencing are more than likely DNS-related issues. How do you have DHCP and DNS configured in pfSense? Please post screen shots.



  • Yes virgin superhub3 into switch. Pfsense into switch (vlan'd) so all traffic passes through pfsense.
    DHCP is handled exclusively by pfsense.
    DNS is also via pfsense.
    Pfsense currently has virgin media's own DNS servers set.... 194.168.4.100 and 194.168.8.100
    These were filled in automatically (not by me)
    I have tried manually changing them to Google's DNS servers 8.8.8.8 and 8.8.4.4
    Same issues remained.

    Running traceroute and pinging the servers that wouldn't load via wife's phone works fine when pinging/tracing via pfsense.

    If it's a DNS issue why would the websites load on some devices but not others??
    I'll post screen shots later as I've had to take pfsense down for now



  • @tim-mcmanus 0_1545600362573_Interfaces.jpg 0_1545600370051_DHCP.jpg
    i hope these screen shots contain the info you requested?
    the DNS servers listed have all been automatically assigned (i assume via DHCP from the modem)

    the gateway listed, belongs to virgin media and again has been automatically filled in, however this is not my modems public IP address.
    i had to disable gateway monitoring as it was throwing up false information, claiming my gateway was offline yet i was still online with no issues (i'm guessing because it was monitoring virgin media's gateway and not my public IP?)

    MTU is automatically set to 1500 on both wan/lan.

    my virgin media superhub 3 is in modem only mode so DHCP is not active, neither is NAT or any form of firewalling



  • Thanks for posting those screen shots.

    What I didn't see was which DNS servers your DHCP server is giving out.

    Also, when you say pfSense is doing DNS, are you running the DNS resolver?



  • @tim-mcmanus 0_1545601740130_services.jpg 0_1545601744691_dns servers.jpg
    the DNS resolver is running (i have not changed this, so default configuration must be to have this on)

    did you need a screen shot of the DNS resolver general settings page?



  • No, this is good. Can you go to Diagnostics->DNS Lookup and run some queries for the sites you are having problems with? I am interested to see if Resolver (127.0.0.1) is timing out on any of those lookups.

    What can happen is this: Your ISP may be blocking DNS lookups to the root servers, which is what pfSense's resolver would normally query. That delay can cause a client timeout when looking for a site, and that client won't be able to get to that site temporarily. You'd need to do a second lookup, and then the query would be cached for any additional client lookups.

    What's happening in your situation, if I understand correctly, only some devices have a problem, and it's sporadic. It could be a symptom of lookups failing or timing out, and then the next device gets a working/cached DNS result from a subsequent and successful lookup.



  • @tim-mcmanus

    (three attached screenshots of the DNS lookup results)

    all the DNS lookups i have tried show similar results, 127.0.0.1 being quicker than the others



  • When you have a device that cannot connect, run a DNS query from that device.
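    A minimal sketch of what such a check could look like, assuming the device (or a laptop on the same wifi) can run Python. It times a lookup through the operating system's resolver, so a locally cached answer will come back faster than a fresh one. The hostnames in HOSTS are only illustrative guesses based on the sites named in this thread, and timed_lookup is a made-up helper name.

```python
import socket
import time

# Illustrative hostnames only (assumed, based on the sites mentioned in the thread).
HOSTS = ("www.amazon.co.uk", "www.amazon.com", "www.ebay.co.uk")


def timed_lookup(hostname, resolver=socket.getaddrinfo):
    """Resolve hostname and report the addresses, any error, and the time taken."""
    start = time.perf_counter()
    try:
        # Each getaddrinfo entry is (family, type, proto, canonname, sockaddr);
        # sockaddr[0] is the IP address as a string.
        addresses = sorted({info[4][0] for info in resolver(hostname, None)})
        error = None
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        addresses, error = [], str(exc)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return {"host": hostname, "addresses": addresses, "error": error, "ms": elapsed_ms}


def main():
    """Look up every host in HOSTS and print one result line per host."""
    for host in HOSTS:
        print(timed_lookup(host))
```

    Calling main() while the problem is happening, once on the failing device and once on a working one, would show whether the two get different answers or whether one of them times out.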

    This is an elusive issue.


  • LAYER 8 Global Moderator

    It's going to be VERY elusive if you have such high packet loss.. Sorry but dns is going to be a crapshoot over such a connection because it's going to be hit or miss..

    Most dns is always going to be UDP.. So you throw the ball over the fence and hope the person catches it but you don't know... Unless you get an answer - and with such a high loss connection he might have answered but you never get it.

    What is the average packet loss you're seeing... Look on your quality graph..

    0_1545648946537_quaity.png

    And yeah pulling from a local cache is always going to be way faster than doing an actual query to some remote NS...

    With such high packet loss - I would expect horrible everything.. Sure tcp will retrans, but it's going to be a horrible experience overall with such high packet loss if it actually is loss and not just your gateway not answering pings... Do a sniff on your wan traffic.. Are you seeing lots of retransmits?



  • @johnpoz the packet loss from another post was a red herring, pfsense was not monitoring my modem, it was monitoring virgin media's gateway which is way out of my control.
    i setup "think broadband" to monitor my public IP (and so monitoring my own gateway) and packet loss was 0.11% max
    i have disabled pfsense gateway monitor, as it was monitoring the wrong thing and giving irrelevant info, and was easier than getting pfsense to monitor the correct gateway


  • LAYER 8 Global Moderator

    Well if you believe the problem is dns related.. look at your timing and any loss in unbound... Dump your stats

    unbound-control -c /var/unbound/unbound.conf stats_noreset

    What is your recursion time average, etc..
    What sort of % hit on cache are you getting, etc. etc.
    if you're not getting a high amount of cache hits, you probably want to turn on prefetch and zero ttl. These can help with problems with long recursion times and/or timeouts.

    total.recursion.time.avg=0.158804
    total.recursion.time.median=0.0505461

    total.num.queries=126887
    total.num.cachehits=110479

    So I'm at about 87% cache hit rate...
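    For anyone wanting to repeat that arithmetic on their own dump, here is a rough sketch. It assumes the stats have been pasted in as plain key=value lines, as in the output above; parse_unbound_stats and cache_hit_rate are made-up helper names, not part of unbound.

```python
def parse_unbound_stats(text):
    """Parse key=value lines (as printed by unbound-control stats_noreset) into a dict."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if not sep:
            continue  # not a key=value line
        try:
            stats[key] = float(value)
        except ValueError:
            pass  # skip values that are not numeric
    return stats


def cache_hit_rate(stats):
    """Return cache hits as a percentage of all queries (0.0 when there are no queries)."""
    queries = stats.get("total.num.queries", 0)
    if queries == 0:
        return 0.0
    return 100.0 * stats.get("total.num.cachehits", 0) / queries


# The figures quoted in the post above.
sample = """\
total.recursion.time.avg=0.158804
total.recursion.time.median=0.0505461
total.num.queries=126887
total.num.cachehits=110479
"""

print(f"{cache_hit_rate(parse_unbound_stats(sample)):.1f}% cache hits")
```

    With the numbers quoted here that comes out at roughly 87%.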

    Look at the stats page in the gui.
    Status / DNS Resolver

    Are you seeing timeouts? You really should have all Zeros



  • @noob said in PFsense Blocking Some Traffic:

    @johnpoz the packet loss from another post was a red herring, pfsense was not monitoring my modem, it was monitoring virgin media's gateway which is way out of my control.
    i setup "think broadband" to monitor my public IP (and so monitoring my own gateway) and packet loss was 0.11% max
    i have disabled pfsense gateway monitor, as it was monitoring the wrong thing and giving irrelevant info, and was easier than getting pfsense to monitor the correct gateway

    I actually have pfSense monitoring a point on the Internet, not my modem. When I am experiencing issues, I want to test a point off of my ISP's network. Yes, on occasion it will trigger some false-positives, but generally speaking, I won't "feel" that issue on the network. When I am suspicious that my network is having issues, then I can check the monitor to see if/what the loss is.

    If you want to monitor the quality of your connection, try this smokeping tool: https://www.dslreports.com/smokeping


  • LAYER 8 Global Moderator

    heheeh - yeah I think I know how to monitor my connection... But thanks ;)



  • @johnpoz said in PFsense Blocking Some Traffic:

    heheeh - yeah I think I know how to monitor my connection... But thanks ;)

    Not you, the other guy. Although, I didn't want to assume... ;)



  • @johnpoz total.recursion.time.avg=0.125316
    total.recursion.time.median=0.0505173
    just booted up pfsense as i took it down again last night
    total.num.queries=55
    total.num.cachehits=4

    i have just turned prefetch on to see what difference it makes

    DNS Resolver timeout A, timeout AAAA and timeout other are all zeros


  • LAYER 8 Global Moderator

    Well stats right after it boots are not going to point to any sort of problem.



  • @johnpoz how long would you like me to leave it before re-posting stats?
    hours, days? i dont know how long ill need to collect data for before it becomes of any use for fault diagnosis


  • LAYER 8 Global Moderator

    After you have been seeing a dns related problem.

    Total number of queries 55.. There is nothing on your network doing anything at that point.. Notice mine was 126,000



  • @johnpoz

    im not sure this is even a DNS issue, i have no idea what is causing the issue, that smokeping test tool posted further up is currently reporting 0.00% packet loss across all 3 servers that are pinging me.

    the issue with ebay app landing page not loading on wife's phone is constant (it never loads unless we switch to 3g/4g data or remove pfsense) all the other pages/search/buy functions work all the time.... ebay app loads on other devices every single time no issues.
    the issue with the android tv box and virgin tv go allowing me to login, loading menus and previews and up to date live tv guide but not playing actual programs is a constant while pfsense is running, but virgin tv go on all other devices works even with pfsense in place.

    i can't see anything in the logs to suggest traffic is being blocked, makes no sense as to why i would be blocking only certain devices

    total.num.queries=1047
    total.num.queries_ip_ratelimited=0
    total.num.cachehits=158
    total.num.cachemiss=889
    total.num.prefetch=14
    total.num.zero_ttl=0
    total.recursion.time.avg=0.138458
    total.recursion.time.median=0.0890953

    i would love to learn more about pfsense (which is why i got it to start with) but these issues dont seem to make any sense.

    i did notice you had over 100k queries but i have no idea how long your box has been up and running, could be months

    from the first few mins of booting pfsense up to now 2 hours uptime, the hit rate seems to be hovering steady at 14-16%

    is there anything i should be looking at on the devices in question??


  • Netgate Administrator

    You might want to grab a packet capture on the LAN filtered by the IP of the offending device.
    Try to do as little as possible on the phone just to minimise the traffic. Once the menus have failed to load check the pcap.

    Steve


  • LAYER 8 Global Moderator

    Current uptime: 22 Days 03 Hours 48 Minutes 58 Seconds

    That would have been since updated to p1, current stats show... But that is not always related to when unbound restarted..

    1047 queries is not a lot of queries.. Do you not have your stuff pointing to pfsense? Do you only have like 1 device on your network or something? How long has unbound been up to get your 1047 queries?



  • It was only me on my pc and phone last night.
    The system uptime was 2 hours 12 mins when I saw the 1000 queries which is approx 500 per hour.
    Your 126k divided by 22 days up time is approx 240 per hour.
    I have no idea how long unbound was running, I was just going by system up time.

    All traffic should be going through pfsense as all traffic to and from the modem is tagged via vlan
    I've taken pfsense down again at the moment, will boot it up again tonight and leave it running for a few days (and do the Lan side packet sniffing suggested above)
    How can I find unbound's "uptime" if it's different from system uptime?
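    The per-hour comparison above can be rechecked with a few lines. This is only a sketch of the arithmetic, and like the post it assumes system uptime is a fair stand-in for how long unbound has been running, which may not hold. queries_per_hour is a made-up helper name.

```python
def queries_per_hour(total_queries, uptime_seconds):
    """Average number of DNS queries per hour over the given uptime."""
    if uptime_seconds <= 0:
        raise ValueError("uptime must be positive")
    return total_queries * 3600.0 / uptime_seconds


# 1047 queries over 2 hours 12 minutes of system uptime.
mine = queries_per_hour(1047, (2 * 60 + 12) * 60)

# 126887 queries over roughly 22 days of system uptime.
theirs = queries_per_hour(126887, 22 * 24 * 3600)

print(f"mine: {mine:.0f}/hour, theirs: {theirs:.0f}/hour")
```

    That gives about 476 per hour against about 240 per hour, so the "approx 500" and "approx 240" figures hold up under that assumption.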



  • @noob said in PFsense Blocking Some Traffic:

    How can I find unbound's "uptime" if it's different from system uptime?

    Easy : check the DNS log ! Or ask the system : ps ax | grep 'unbound'
    unbound is a service that is restarted rather often.


  • LAYER 8 Global Moderator

    time up will also be in the stats
    time.up=81609.360209

    Which would be in seconds.
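    As a quick illustration, a seconds value like that can be turned into something readable with the standard library. format_uptime is a made-up helper name and the input is the time.up figure quoted above.

```python
from datetime import timedelta


def format_uptime(seconds):
    """Render a seconds count (such as unbound's time.up) as [D day(s), ]H:MM:SS."""
    # Fractions of a second are dropped to keep the output short.
    return str(timedelta(seconds=int(seconds)))


print(format_uptime(81609.360209))
```

    For 81609 seconds that works out to a little under 23 hours.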

    And to be honest most everything on my network points to downstream pihole, so that reduces the number of queries unbound sees because pihole only asks unbound for stuff that has not been blocked, and also it caches.. So if say 3 things asked for xyz.com unbound would only see the 1 from pihole, then pihole would serve the answer up to the clients via its cache.

