PfBlockerNG SoNewConn Issues



  • Hi all,

    Since the latest patch to pfSense I've been getting these errors:

    
    Jan 5 12:07:51	kernel		sonewconn: pcb 0xfffff80158f7a740: Listen queue overflow: 1537 already in queue awaiting acceptance (228 occurrences)
    
    

    They occur once a minute and start some time after a reboot, depending on how high I set sonewconn. I can usually tell when it starts because pfSense DNS resolution dies; it takes minutes to load a website. I went all the way up to 15k, at which point it took about 3 days before the errors started, but I couldn't get rid of them except by turning off pfBlockerNG.
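
    For anyone who wants to pull the numbers out of these messages, here is a minimal Python sketch that parses the kernel line quoted above. The field layout is taken from that single sample line. The `implied_backlog` helper rests on an assumption about FreeBSD's behaviour (that the overflow is logged once the queue exceeds 1.5x the socket's listen backlog), so treat its result as a hint rather than a fact. All function names are made up for illustration.

```python
import re

# Pattern built from the one sample message in this post; other kernel
# versions may word the message differently.
SONEWCONN_RE = re.compile(
    r"sonewconn: pcb (?P<pcb>0x[0-9a-f]+): Listen queue overflow: "
    r"(?P<queued>\d+) already in queue awaiting acceptance"
    r"(?: \((?P<occurrences>\d+) occurrences\))?"
)


def parse_sonewconn(line):
    """Return the pcb address, queued count and occurrence count, or None."""
    match = SONEWCONN_RE.search(line)
    if match is None:
        return None
    return {
        "pcb": match.group("pcb"),
        "queued": int(match.group("queued")),
        # The "(N occurrences)" suffix is treated as optional here; default to 1.
        "occurrences": int(match.group("occurrences") or 1),
    }


def implied_backlog(queued):
    """Estimate the listen backlog of the overflowing socket.

    ASSUMPTION: the kernel reports an overflow as soon as the queue length
    exceeds 3 * backlog / 2, so the first reported length is that bound + 1.
    """
    return (queued - 1) * 2 // 3
```

    Under that assumption, the 1537 in the message above would correspond to a backlog of 1024 on the affected socket, which may be worth comparing against whatever limit is actually configured.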

    It took me a long time to isolate this issue to pfBlockerNG as there is nothing in Google search that points to it, and doing netstat on that pcb shows no entries. I had to go back to the basics and just start turning off plugins one at a time to see what isolated it, and it's pfBlockerNG.

    System Build:
    Version 2.4.2-RELEASE-p1 (amd64)
    CPU Type Intel(R) Atom(TM) CPU C2758 @ 2.40GHz
    System Super Micro C2758

    pfBlockerNG net 2.1.2_2

    I'm using the 4 onboard NICs as well as an addon 520-DA1 with vLANs and configured per wiki for ix and igb ethernet interfaces. Other plugins in use: ACME, HAProxy, Cron, and pfBlockerNG; no others.





  • I'm aware of that glitch, but the gateway itself is still accessible on my end, especially if I use IP addresses. I don't actually get a timeout, just long loading times over DNS when a name is requested, I'm guessing because the listen queues are filled. Turning off pfBlockerNG immediately solves the issue. I am not convinced this is the same issue as the 502 gateway bug mentioned in the other thread.



  • Did you look at the pfBlockerNG logs? The System and Resolver logs, etc.?

    What's the size of your DNSBL database relative to your memory?
    I have about 1M DNSBL entries with 8 GB of memory. When I was running on a 2.5 GB system, I had to limit it to about 400K.

    What is your Resolver configuration?



  • I have 32GB of memory, so that is not an issue (though I do have 2.5M-ish entries in DNSBL).

    The other logs do not show anything with this particular error. Certainly no smoking guns. The sonewconn error does take up the whole of that particular log file though since it basically prints once a minute.



  • Do you have the Dashboard open all the time?
    Or the pfBlockerNG Alerts tab with auto-refresh?



  • I don't have it open all the time; I hardly look at pfSense. Well, except for times like now where it gives me grief, but lately the screens I've been swapping back and forth are the system tunables, reboot, and logs. The alerts tab does have auto refresh and autoresolve ticked, but does that matter if you're not on the page?

    Scrolling through all the logs again, the only oddity I can find (most of them are empty, with the last entries from the reboot date or the daily/DHCP/DNS tasks) is in the DNSBL log, which contains nothing but the same entries repeated over and over:

    
    DNSBL Reject HTTPS,Jan 05 20:34:51,events.gfe.nvidia.com
    DNSBL Reject HTTPS,Jan 05 20:34:52,events.gfe.nvidia.com
    DNSBL Reject HTTPS,Jan 05 20:34:53,events.gfe.nvidia.com
    DNSBL Reject HTTPS,Jan 05 20:34:56,events.gfe.nvidia.com
    DNSBL Reject HTTPS,Jan 05 20:34:57,events.gfe.nvidia.com
    DNSBL Reject HTTPS,Jan 05 20:34:58,events.gfe.nvidia.com
    DNSBL Reject HTTPS,Jan 05 20:34:59,mobile.pipe.aria.microsoft.com
    DNSBL Reject HTTPS,Jan 05 20:34:59,mobile.pipe.aria.microsoft.com
    DNSBL Reject HTTPS,Jan 05 20:34:59,mobile.pipe.aria.microsoft.com
    DNSBL Reject HTTPS,Jan 05 20:34:59,mobile.pipe.aria.microsoft.com
    DNSBL Reject HTTPS,Jan 05 20:34:59,mobile.pipe.aria.microsoft.com
    DNSBL Reject HTTPS,Jan 05 20:34:59,mobile.pipe.aria.microsoft.com
    DNSBL Reject HTTPS,Jan 05 20:35:01,events.gfe.nvidia.com
    DNSBL Reject HTTPS,Jan 05 20:35:02,events.gfe.nvidia.com
    DNSBL Reject HTTPS,Jan 05 20:35:03,mobile.pipe.aria.microsoft.com
    DNSBL Reject HTTPS,Jan 05 20:35:03,mobile.pipe.aria.microsoft.com
    DNSBL Reject HTTPS,Jan 05 20:35:03,mobile.pipe.aria.microsoft.com
    DNSBL Reject HTTPS,Jan 05 20:35:03,mobile.pipe.aria.microsoft.com
    DNSBL Reject HTTPS,Jan 05 20:35:03,mobile.pipe.aria.microsoft.com
    DNSBL Reject HTTPS,Jan 05 20:35:03,mobile.pipe.aria.microsoft.com
    DNSBL Reject HTTPS,Jan 05 20:35:03,events.gfe.nvidia.com
    DNSBL Reject HTTPS,Jan 05 20:35:06,events.gfe.nvidia.com
    
    

    This continues ad nauseam for days as far as I can tell. Is it possible these repeaters / phone homes are causing the issue? There is nothing of note in the error or pfBlockerNG logs; they just state that the last reload / refresh was successful.
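
    To see which domains dominate a log like this without scrolling through it, a small Python sketch along these lines can tally the entries per domain. It assumes every entry has exactly three comma-separated fields (event type, timestamp, domain) as in the excerpt above; `count_rejects` is a name made up for this example, and the log would need to be copied somewhere Python is available.

```python
import csv
from collections import Counter


def count_rejects(lines):
    """Count DNSBL log entries per domain.

    Assumes three comma-separated fields per entry
    (event type, timestamp, domain), as in the excerpt above.
    """
    counts = Counter()
    for row in csv.reader(lines):
        fields = [field.strip() for field in row]
        if len(fields) != 3 or not fields[0].startswith("DNSBL"):
            continue  # skip blank or unexpected lines
        counts[fields[2]] += 1
    return counts
```

    Because it reads line by line, it can be fed an open file handle (for example `count_rejects(open(path, newline=""))`, where `path` is wherever the log was copied to), and `most_common()` on the result lists the noisiest domains first.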


  • Moderator

    That Domain is from MS Outlook mobile app:

    https://social.technet.microsoft.com/Forums/ie/en-US/c29a50e7-9433-4fa1-b2f3-24ee93299810/urls-needed-for-office-2016-online-help?forum=Office2016setupdeploy

    Not sure if it should be blocked or not, but if it's hitting DNSBL so frequently, you could add an Unbound host override and point it to 127.0.0.1. This will bypass DNSBL completely.



  • I don't have any mobile apps for Office but I do have an Office install. Nvidia is hitting it just as frequently so I added it as well and removed my sonewconn system tunable edits. It usually takes hours for the issue to make itself known though so I won't know for a bit if this fixes it.



  • So this made it about yay long before the error came back. Interestingly enough, the DNSBL log just terminates on the 7th; i.e., the service is still running but no further log entries are made. Also, this log file must be gigantic, because trying to view it from the web GUI slows the browser to a crawl if it doesn't just outright crash it.

    
    DNSBL Reject HTTPS,Jan 07 05:45:03,events.gfe.nvidia.com
    DNSBL Reject HTTPS,Jan 07 05:45:03,mobile.pipe.aria.microsoft.com
    DNSBL Reject HTTPS,Jan 07 05:45:03,mobile.pipe.aria.microsoft.com
    DNSBL Reject HTTPS,Jan 07 05:45:03,mobile.pipe.aria.microsoft.com
    DNSBL Reject HTTPS,Jan 07 05:45:04,mobile.pipe.aria.microsoft.com
    DNSBL Reject HTTPS,Jan 07 05:45:04,mobile.pipe.aria.microsoft.com
    DNSBL Reject HTTPS,Jan 07 05:45:04,mobile.pipe.aria.microsoft.com
    
    

    The other thing is, these entries came back? I have the unbound host overrides still configured:

    Domain Overrides
    Domain                          Lookup Server IP Address  Description           Actions
    mobile.pipe.aria.microsoft.com  127.0.0.1                 Override pfBlockerNG
    events.gfe.nvidia.com           127.0.0.1                 Override pfBlockerNG

    Again, turning off pfBlockerNG immediately fixes the issue.

    Edit:

    Something weird happened while I'm troubleshooting. For now I'm trying to ascertain if it's a set of DNSBL entries that are causing this. Turning pfBlockerNG back on using only EasyList throws this:

    
    Jan 9 19:55:56	php-fpm	90038	[pfBlockerNG] Starting cron process.
    Jan 9 19:55:56	php-fpm	90038	/pkg_edit.php: The command '/sbin/ifconfig 'ix0.520' delete '10.10.10.1'' returned exit code '1', the output was 'ifconfig: ioctl (SIOCDIFADDR): Can't assign requested address'
    
    

    pfBlockerNG is set up with firewall rules on vLANs 1, 520, and 540. The listening interface is set to vLAN 520.


  • Moderator

    Set the host override to "0.0.0.0" instead of "127.0.0.1"…



  • Did that; it had no effect, whether 0.0.0.0 or 127.0.0.1.

    Well, I did figure something out at least. For anyone else who runs into this issue: it has to be a new entry in one of my block lists. The lists worked fine in the past, but something got updated in a list somewhere and now it's causing this issue. I can't rule out the Nvidia or MS repeat events being the culprit, but adding them to Unbound doesn't have any effect. I stripped pfBlockerNG all the way down to just EasyList content and have not had any issues for the past week. Since it takes like 1-3 days before it starts happening, though, it's going to be a real bugbear to troubleshoot which particular list(s) are the culprits.

    My DNSBL feeds are basically all from Firebog (wally3k.github.io) for reference.
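
    Since each test run takes days, one way to keep the number of runs down is to bisect the feeds instead of re-enabling them one at a time. The Python sketch below only illustrates the idea. It assumes exactly one feed is responsible and that the problem reliably shows up whenever that feed is enabled; `reproduces` is a hypothetical callback standing in for "enable only these feeds on top of the known-good baseline, wait a few days, and report whether the overflow came back".

```python
def find_culprit(feeds, reproduces):
    """Binary-search for a single feed that triggers the problem.

    ASSUMPTIONS: exactly one feed is responsible, and reproduces(subset)
    reliably reports whether the overflow returns with only `subset`
    enabled (in practice a manual observation over several days).
    """
    candidates = list(feeds)
    while len(candidates) > 1:
        middle = len(candidates) // 2
        first_half = candidates[:middle]
        if reproduces(first_half):
            candidates = first_half
        else:
            # Under the single-culprit assumption it must be in the rest.
            candidates = candidates[middle:]
    return candidates[0] if candidates else None
```

    With this approach the number of multi-day test runs grows with the logarithm of the feed count rather than linearly: 8 feeds need 3 runs, 32 feeds need 5. If more than one list is involved, or the problem is intermittent, the result would need to be double-checked by enabling the suspected feed on its own.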


  • Moderator

    I sent you a PM…

