[SOLVED] Squid 0.4.44_25 / assertion failed: http.cc:1533: "!Comm::MonitorsRead(serverConnection->fd)"


  • Hi all,

    I know, never change a running system, but yesterday I upgraded to the latest version of the Squid package (0.4.44_25), and Squid is now crashing every few minutes with the message:

    assertion failed: http.cc:1533: "!Comm::MonitorsRead(serverConnection->fd)"

    This bug has been known since (I guess) 2018. The bug tracker of squid-cache.org is down, but from related posts on mailing lists I learned that the cause is unclear.

    Is there a way to downgrade Squid to the previous version? Or is there any workaround other than disabling HTTPS support?

    Thanks in advance

    Chris

    My current config:

    ===group

    # This file is automatically generated by pfSense

    # Do not edit manually !

    http_port 10.41.0.6:3128
    http_port 127.0.0.1:3128 intercept
    icp_port 0
    digest_generation off
    dns_v4_first on
    pid_filename /var/run/squid/squid.pid
    cache_effective_user squid
    cache_effective_group proxy
    error_default_language de
    icon_directory /usr/local/etc/squid/icons
    visible_hostname fw.hahn-gasfedern.de
    cache_mgr it-support@hahn-gasfedern.de
    access_log /var/squid/logs/access.log
    cache_log /var/squid/logs/cache.log
    cache_store_log none
    netdb_filename /var/squid/logs/netdb.state
    pinger_enable on
    pinger_program /usr/local/libexec/squid/pinger

    logfile_rotate 365
    debug_options rotate=365
    shutdown_lifetime 3 seconds

    # Allow local network(s) on interface(s)

    acl localnet src 10.41.0.0/16
    forwarded_for off
    via off
    httpd_suppress_version_string on
    uri_whitespace strip
    dns_nameservers 10.41.40.22 10.41.40.23
    acl dynamic urlpath_regex cgi-bin \?
    cache deny dynamic

    cache_mem 1024 MB
    maximum_object_size_in_memory 256 KB
    memory_replacement_policy heap GDSF
    cache_replacement_policy heap LFUDA
    minimum_object_size 0 KB
    maximum_object_size 4 MB
    cache_dir ufs /var/squid/cache 100 16 256
    offline_mode off
    cache_swap_low 90
    cache_swap_high 95
    acl donotcache dstdomain '/var/squid/acl/donotcache.acl'
    cache deny donotcache
    cache allow all

    # Add any of your own refresh_pattern entries above these.

    refresh_pattern ^ftp: 1440 20% 10080
    refresh_pattern ^gopher: 1440 0% 1440
    refresh_pattern -i (/cgi-bin/|\?) 0 0% 0
    refresh_pattern . 0 20% 4320

    #Remote proxies

    # Setup some default acls

    # ACLs all, manager, localhost, and to_localhost are predefined.

    acl allsrc src all
    acl safeports port 21 70 80 210 280 443 488 563 591 631 777 901 7377 3128 3129 1025-65535
    acl sslports port 443 563 7377

    acl purge method PURGE
    acl connect method CONNECT

    # Define protocols used for redirects

    acl HTTP proto HTTP
    acl HTTPS proto HTTPS
    acl allowed_subnets src 10.41.0.0/16
    acl unrestricted_hosts src '/var/squid/acl/unrestricted_hosts.acl'
    acl whitelist dstdom_regex -i '/var/squid/acl/whitelist.acl'
    acl block_reply_mime_type rep_mime_type -i '/var/squid/acl/block_reply_mime_type.acl'
    http_access allow manager localhost

    http_access deny manager
    http_access allow purge localhost
    http_access deny purge
    http_access deny !safeports
    http_access deny CONNECT !sslports

    # Always allow localhost connections

    http_access allow localhost

    request_body_max_size 0 KB
    delay_pools 1
    delay_class 1 2
    delay_parameters 1 -1/-1 -1/-1
    delay_initial_bucket_level 100

    # Do not throttle unrestricted hosts

    delay_access 1 deny unrestricted_hosts
    delay_access 1 allow allsrc

    # Reverse Proxy settings

    # Package Integration

    url_rewrite_program /usr/local/bin/squidGuard -c /usr/local/etc/squidGuard/squidGuard.conf
    url_rewrite_bypass off
    url_rewrite_children 100 startup=10 idle=4 concurrency=0

    # Custom options before auth

    acl noSSLInterception ssl::server_name_regex knownsite.com

    acl ssl_bypassIP dst xxx.200.255.96/32

    # These hosts do not have any restrictions

    http_access allow unrestricted_hosts

    # Always allow access to whitelist domains

    http_access allow whitelist

    # Block access with mime type in the reply

    http_reply_access deny block_reply_mime_type
    acl sglog url_regex -i sgr=ACCESSDENIED
    http_access deny sglog

    # Setup allowed ACLs

    # Allow local network(s) on interface(s)

    http_access allow allowed_subnets
    http_access allow localnet

    # Default block all to be sure

    http_access deny allsrc

    ===


  • Well this was a real challenge but finally we solved it. We asked Netgate Support (Enterprise) for assistance and those guys were pretty responsive but in the end, we were on our own to make this thing work. They checked our setup, found no issues and told us to wait for the next update of Squid.

    We installed a new box with pfSense and set up almost everything from scratch, except for the users and certificates, which we restored from the old box (which was on its own a *itch to complete).

    A few minutes after we were back online, oh heck, we had the same crashes. I was pretty close to starting work as a butcher or gardener.

    The logs were full of errors like these, tons of them:

    1591073788.173 0 10.41.49.28 NONE/409 4015 CONNECT init.itunes.apple.com:443 - HIER_NONE/- text/html
    1591073788.173 0 10.41.49.28 NONE/000 0 NONE error:transaction-end-before-headers - HIER_NONE/- -
    1591073788.188 0 10.41.49.28 NONE/200 0 CONNECT 104.125.6.88:443 - HIER_NONE/- -
    1591073788.188 0 10.41.49.28 NONE/409 4015 CONNECT init.itunes.apple.com:443 - HIER_NONE/- text/html
    1591073788.188 0 10.41.49.28 NONE/000 0 NONE error:transaction-end-before-headers - HIER_NONE/- -

    We stopped HTTPS/SSL interception and I was pretty close to forgetting about it, but it kept me up all night and I started digging in the mud. After searching for the "NONE/409" errors I found the remarks about DNS in the Netgate docs, which led us to the rest, and in the end we got this baby up and running more or less smoothly.

    This does not fix the existing bug in Squid, but it changes the environment so that the bug no longer escalates to the point where Squid crashes.
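
For anyone hunting the same "NONE/409" entries, a minimal Python sketch for tallying them per client and destination might look like the following. It assumes Squid's default native access.log layout, as in the sample lines from this post; the names `parse_access_line` and `count_409` are made up for this example.

```python
from collections import Counter


def parse_access_line(line):
    """Split one native-format access.log line into named fields.

    Returns None for lines that do not have the expected ten columns.
    """
    parts = line.split()
    if len(parts) < 10:
        return None
    result_code, _, status = parts[3].partition("/")
    return {
        "time": float(parts[0]),
        "elapsed_ms": int(parts[1]),
        "client": parts[2],
        "result": result_code,
        "status": status,
        "bytes": int(parts[4]),
        "method": parts[5],
        "url": parts[6],
    }


def count_409(lines):
    """Count HTTP 409 responses per (client, requested URL) pair."""
    counts = Counter()
    for line in lines:
        entry = parse_access_line(line)
        if entry and entry["status"] == "409":
            counts[(entry["client"], entry["url"])] += 1
    return counts


# Sample lines copied from the log excerpt in this thread.
SAMPLE = """\
1591073788.173 0 10.41.49.28 NONE/409 4015 CONNECT init.itunes.apple.com:443 - HIER_NONE/- text/html
1591073788.173 0 10.41.49.28 NONE/000 0 NONE error:transaction-end-before-headers - HIER_NONE/- -
1591073788.188 0 10.41.49.28 NONE/200 0 CONNECT 104.125.6.88:443 - HIER_NONE/- -
1591073788.188 0 10.41.49.28 NONE/409 4015 CONNECT init.itunes.apple.com:443 - HIER_NONE/- text/html
""".splitlines()

for (client, url), n in count_409(SAMPLE).most_common():
    print(f"{n:4d}  {client}  {url}")
```

On a real box the lines would come from the access.log path set in the config instead of the embedded sample. Host names that dominate the tally are good candidates for a DNS check.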

    We had 3 issues:

    1. DNS

    2. DNS Resolver

    3. Squid: ACL regexp errors & missing options

    1. DNS (the most important thing)

    Before, we used Google DNS (8.8.8.8/8.8.4.4) and two others from OpenDNS as our main nameservers. We changed them to the ones from our ISP. If you have a lot of traffic to CDN providers, this is key.

    We also have two DNS servers in our AD. We switched off the DNS cache on them via the registry and disabled all DNS caching on the clients via GPO. The "do not cache" policy doesn't work well on Win2k8R2, so we created a script that deletes the cached entries via the scheduler every 5 minutes.

    Both AD DNS servers now use the DNS Resolver as their forwarder. Make sure that the DNS Resolver and the AD DNS servers are always in sync and respond to queries with the same IP addresses. Verify that your client has DNS caching disabled and resolves the same IP as your pfSense and AD DNS.
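
The reasoning behind the DNS advice, going by Squid's documented host verification for intercepted traffic: Squid resolves the requested host name itself and answers with a 409 when the IP the client connected to is not among its own DNS answers. The Python sketch below only models that comparison. It is not Squid's code, the function name is hypothetical, and the 203.0.113.x addresses are placeholder values.

```python
def host_header_plausible(client_dst_ip, proxy_resolved_ips):
    """Return True when the IP the client connected to is among the
    addresses the proxy itself resolved for the requested host name."""
    return client_dst_ip in set(proxy_resolved_ips)


# The client resolved a CDN host name and connected to this address
# (taken from the access.log excerpt in this thread).
client_dst = "104.125.6.88"

# Placeholder answers: the proxy asked a different resolver and the CDN
# handed out a different set of edge servers.
proxy_answer_other_dns = ["203.0.113.10", "203.0.113.11"]

# Placeholder answers: the proxy and the client share one resolver.
proxy_answer_same_dns = ["104.125.6.88", "203.0.113.12"]

print(host_header_plausible(client_dst, proxy_answer_other_dns))  # False
print(host_header_plausible(client_dst, proxy_answer_same_dns))   # True
```

This is why clients, AD DNS and pfSense all need to see the same answers: with CDNs, two different resolvers can return two disjoint address sets for the same name.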

    2. DNS Resolver:

    Make sure that the DNS Resolver is working and is used by your internal DNS servers. Since CDNs change IP addresses almost every few minutes, we set the "Minimum TTL" to 43200 seconds (12 hours). This value is maybe too high, but we have had pretty good experience with it so far. Enable "DNS Query Forwarding" so your upstream DNS servers are used.

    3. Squid

    We used "Splice Whitelist, Bump otherwise" as the MITM mode and had a bunch of domains listed in the ACL whitelist area in a style like ".whatsapp.com". This almost never worked, and when I looked at the whitelist.acl file it appeared to be empty. It was not really empty, but none of the lines starting with a . were visible in vi. After we changed the domains to (^|\.)whatsapp\.com$ the file seemed to work much better.
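
One caveat on this pattern style: in a regular expression an unescaped `.` matches any character, so the dots need a backslash if they should match only a literal dot. The sketch below uses Python's `re` module purely for illustration. Squid uses its own regex library, but the meaning of `.` is the same here. The test host names are invented.

```python
import re

# Pattern without backslashes: every "." matches any single character.
unescaped = re.compile(r"(^|.)whatsapp.com$")

# Pattern with escaped dots: matches the domain itself and its subdomains only.
escaped = re.compile(r"(^|\.)whatsapp\.com$")

# Invented host names to show the difference.
names = ["whatsapp.com", "web.whatsapp.com", "notwhatsapp.com", "whatsappxcom"]

for name in names:
    print(f"{name:20s} unescaped={bool(unescaped.search(name))!s:5s} "
          f"escaped={bool(escaped.search(name))}")
```

The unescaped pattern also matches look-alike names such as "notwhatsapp.com", which would then be spliced instead of bumped.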


    We are now in "Custom Mode" with the following config (added to Custom Options SSL/MITM):

    acl DiscoverSNIHost at_step SslBump1
    acl step1 at_step SslBump1

    acl noSSLInterception ssl::server_name_regex (^|\.)apple\.com$
    acl noSSLInterception ssl::server_name_regex (^|\.)cdn-apple\.com$
    acl noSSLInterception ssl::server_name_regex (^|\.)icloud\.com$
    acl noSSLInterception ssl::server_name_regex (^|\.)icloud-content\.com$
    acl noSSLInterception ssl::server_name_regex (^|\.)itunes\.com$
    acl noSSLInterception ssl::server_name_regex (^|\.)mzstatic\.com$
    ...
    acl noSSLip dst xxx.123.xxx.96/32
    acl noSSLip dst 149.xxx.xxx.0/22
    acl noSSLip dst xxx.xxx.172.0/22
    ...
    ssl_bump peek step1
    ssl_bump splice noSSLInterception
    ssl_bump splice noSSLip
    ssl_bump peek DiscoverSNIHost
    ssl_bump bump all


    And in Custom Options (before auth)

    client_persistent_connections off

    Also add 127.0.0.1 to "Use Alternate DNS Servers for the Proxy Server". The DNS Resolver should respond on 127.0.0.1.

    Enable "Resolve DNS IPv4 first".


    After those changes Squid is working almost as usual, the SSL errors are gone and our business no longer gets interrupted. Even the Apple App Store, GoToMeeting, Teams and Adobe CC are working flawlessly. Awesome.

    It might not be the perfect setup, but we are still testing and improving the settings.

    Chris


  • Also:

    If you are using SquidGuard, disable "Clean Advertising" when your pfSense GUI is running on HTTPS. SquidGuard replaces advertisements with a pixel that is loaded from the pfSense box, like "http://[IP of your box]/sgerror.php...

    This breaks HTTPS, and if your web interface is running on a non-standard port, nginx reports errors in your system log.

    Chris


  • @CaliPilot said in [SOLVED] Squid 0.4.44_25 / assertion failed: http.cc:1533: "!Comm::MonitorsRead(serverConnection->fd)":

    Squid

    We used "Splice Whitelist, Bump otherwise" as the MITM mode and had a bunch of domains listed in the ACL whitelist area in a style like ".whatsapp.com". This almost never worked, and when I looked at the whitelist.acl file it appeared to be empty. It was not really empty, but none of the lines starting with a . were visible in vi. After we changed the domains to (^|\.)whatsapp\.com$ the file seemed to work much better.

    Thanks for the info, redmine issue created: https://redmine.pfsense.org/issues/10654


  • @CaliPilot said in [SOLVED] Squid 0.4.44_25 / assertion failed: http.cc:1533: "!Comm::MonitorsRead(serverConnection->fd)":

    We used "Splice Whitelist, Bump otherwise" as the MITM mode and had a bunch of domains listed in the ACL whitelist area in a style like ".whatsapp.com". This almost never worked, and when I looked at the whitelist.acl file it appeared to be empty. It was not really empty, but none of the lines starting with a . were visible in vi. After we changed the domains to (^|\.)whatsapp\.com$ the file seemed to work much better.

    Fixed in the latest Squid package.
    Please update.


  • @CaliPilot
    Not sure if you have already read through this, but here it is:
    https://forum.netgate.com/topic/100342/guide-to-filtering-web-content-http-and-https-with-pfsense-2-3

    To prevent these issues you need to use the following:
    WPAD (or manual proxy settings)
    Transparent Proxy to catch HTTP traffic that WPAD misses
    SSL Man In the Middle Filtering with SPLICE ALL to catch HTTPS traffic that WPAD misses


  • @aGeekhere Sorry for the late response. My setup has now been running for weeks without WPAD or anything like that, and I have no issues. The key was to have solid DNS settings on pfSense, Windows DNS and our clients, and now it works like a charm. Sometimes we see SSL errors on sites running on Akamai (or other CDNs), but only for a few minutes.

    Chris


  • I have the same problem and it is driving me nuts. Every day when office hours begin, Squid crashes with this error. On 2.4.4p3 Squid was rock solid... :/
    The only thing I can do from the UI is delete the cache, after which Squid starts; otherwise it will not start from Services.
    I have no DNS issues.

    
    2020-07-31 08:46:56 [45559] loading dbfile /var/db/squidGuard/Misc/domains.db
    2020-07-31 08:46:56 [45559] logfile not allowed in acl other than default
    2020/07/31 09:02:56 kid1| assertion failed: http.cc:1533: "!Comm::MonitorsRead(serverConnection->fd)"
    2020/07/31 09:02:56 kid1| Starting Squid Cache version 4.10 for amd64-portbld-freebsd11.3...
    2020/07/31 09:02:56 kid1| Service Name: squid
    2020-07-31 09:02:56 [53246] (squidGuard): can't write to logfile /var/log/squidGuard/squidGuard.log
    2020-07-31 09:02:56 [53246] New setting: logdir: /var/squidGuard/log
    2020-07-31 09:02:56 [53246] New setting: dbhome: /var/db/squidGuard
    2020-07-31 09:02:56 [53246] init domainlist /var/db/squidGuard/blk_blacklists_ads/domains
    2020-07-31 09:02:56 [53246] loading dbfile /var/db/squidGuard/blk_blacklists_ads/domains.db
    2020-07-31 09:02:56 [53246] init urllist /var/db/squidGuard/blk_blacklists_ads/urls
    
    
    
    Jul 31 09:02:56 	kernel 		pid 43401 (squid), jid 0, uid 100: exited on signal 6
    Jul 31 09:02:57 	kernel 		pid 52412 (squid), jid 0, uid 100: exited on signal 6
    Jul 31 09:02:58 	kernel 		pid 55101 (squid), jid 0, uid 100: exited on signal 6
    Jul 31 09:02:59 	kernel 		pid 58638 (squid), jid 0, uid 100: exited on signal 6
    Jul 31 09:03:00 	kernel 		pid 61188 (squid), jid 0, uid 100: exited on signal 6
    Jul 31 09:03:01 	kernel 		pid 63750 (squid), jid 0, uid 100: exited on signal 6
    Jul 31 09:03:17 	Squid_Alarm 	68674 	Squid has exited. Reconfiguring filter.
    Jul 31 09:03:17 	Squid_Alarm 	68975 	Attempting restart...
    Jul 31 09:03:20 	Squid_Alarm 	71372 	Reconfiguring filter...
    Jul 31 09:03:20 	check_reload_status 		Reloading filter
    Jul 31 09:03:22 	php-fpm 	28232 	/rc.filter_configure_sync: [squid] Installed but not started. Not installing 'nat' rules.
    Jul 31 09:03:22 	php-fpm 	28232 	/rc.filter_configure_sync: [squid] Installed but not started. Not installing 'pfearly' rules.
    Jul 31 09:03:22 	php-fpm 	28232 	/rc.filter_configure_sync: [squid] Installed but not started. Not installing 'filter' rules. 
    

    Help please.. 😰


  • @madalacu Were you able to resolve this? I have started getting this issue and it has become a nightmare for me. Please help.


  • @vijay7 Try updating the Squid package to the latest version and see...
    For me the problem remains, but in the latest version the Squid processes are able to restart, so it is working...


  • Tried that already, no difference. At least twice a day the Squid service stops.


  • Hello all

    This is absolutely NOT a solved problem. Someone should change this. I have several Netgate devices with Squid and SquidGuard installed. All of them have this problem. The Squid service, along with the SquidGuard service, stops several times a day. I have been using Squid/SquidGuard since 2015. This problem started in 2019 after an upgrade. With the latest upgrade of the pfSense firmware and Squid/SquidGuard it has become terrible. I have to start the services manually several times a day. For us, using pfSense without Squid is not an option, and my staff is really questioning why we continue with Netgate. The solutions above were not a solution for us. It's still the same.
    The error message in the Squid logs:
    assertion failed: http.cc:1533: "!Comm::MonitorsRead(serverConnection->fd)"
    Does anyone have any idea about this? I would hate, for the first time in a very long time, to be forced to move to other routers.

    /Toby


  • Last week was terrible for me again, same problem. I was guessing at sites and blindly blacklisting them, trying to solve this problem. It's a never-ending story.
    Very, very annoying and time-consuming!


  • Can you test it on the latest 2.5 snapshot?


  • Finally, I have moved to a standalone Squid proxy, and I am not getting any issues there. We have another machine running pfSense and Squid, though, and it behaves the same as in my case. Our company is asking about this issue. I don't know why Squid goes down every morning at 9 AM. I had to assign someone for the day to monitor it continuously, because even the watchdog is not able to start it.


  • My Squid has also started crashing.
    On Friday the 13th :) in the evening, the Squid and SquidGuard services stopped working.
    When I started Squid from Services, web pages opened, but after a few seconds the Squid service stopped again.
    I rebooted the server, but no luck: after a few seconds Squid stopped working and pages didn't load.
    Then I disabled MITM and Squid stopped crashing.
    We are using pfSense 2.4.5-RELEASE-p1 and Squid 0.4.44_35 with SquidGuard 1.16.18_9.
    Which logs should I check to find out what causes this crashing?


  • This morning the problem is really bad. Squid has stopped several times. We had no other choice than to reinstall our old Fortigate and pay for the proxy licenses. Has anyone heard of a solution to this issue?
    Someone mentioned testing a 2.5 snapshot. That may be a bit risky in a production environment.


  • Today I enabled MITM mode. After 2 hours Squid stopped working. Which logs should I check?


  • Is this issue fixed yet?



  • @tobyswe Where did you see these errors? In which logs? Can you tell me? I want to check whether I have the same errors or different ones...



  • Hi,

    After I updated the squid to the 0.4.44_36 version it started to crash and show the same message inside the cache.log file.

    I may be wrong, but I saw a correlation between IP 13.224.211.126 (Amazon) and the problem. So, I bypassed this network from the proxy to see if the problem can be fixed.

    pfsense: 2.4.5-RELEASE-p1
    squid: 0.4.44_36

    2021/01/12 13:46:33 kid1| assertion failed: http.cc:1533: "!Comm::MonitorsRead(serverConnection->fd)"
    2021/01/12 13:46:35 kid1| assertion failed: http.cc:1533: "!Comm::MonitorsRead(serverConnection->fd)"
    2021/01/12 13:46:37 kid1| assertion failed: http.cc:1533: "!Comm::MonitorsRead(serverConnection->fd)"
    2021/01/12 13:46:38 kid1| assertion failed: http.cc:1533: "!Comm::MonitorsRead(serverConnection->fd)"
    2021/01/12 14:58:10 kid1| assertion failed: http.cc:1533: "!Comm::MonitorsRead(serverConnection->fd)"
    2021/01/12 14:58:12 kid1| assertion failed: http.cc:1533: "!Comm::MonitorsRead(serverConnection->fd)"
    2021/01/12 14:58:14 kid1| assertion failed: http.cc:1533: "!Comm::MonitorsRead(serverConnection->fd)"
    2021/01/12 14:58:16 kid1| assertion failed: http.cc:1533: "!Comm::MonitorsRead(serverConnection->fd)"
    2021/01/12 14:58:17 kid1| assertion failed: http.cc:1533: "!Comm::MonitorsRead(serverConnection->fd)"
    2021/01/12 14:58:19 kid1| assertion failed: http.cc:1533: "!Comm::MonitorsRead(serverConnection->fd)"

    [Tue Jan 12 13:46:35 2021].337 438 10.15.31.22 NONE/200 0 CONNECT 13.224.211.105:443 - ORIGINAL_DST/13.224.211.105 -
    [Tue Jan 12 13:46:38 2021].682 475 10.15.31.22 NONE/200 0 CONNECT 13.224.211.105:443 - ORIGINAL_DST/13.224.211.105 -
    [Tue Jan 12 14:58:12 2021].412 477 10.15.31.22 NONE/200 0 CONNECT 13.224.211.126:443 - ORIGINAL_DST/13.224.211.126 -
    [Tue Jan 12 14:58:16 2021].011 424 10.15.31.22 NONE/200 0 CONNECT 13.224.211.126:443 - ORIGINAL_DST/13.224.211.126 -


  • @volnei Hey, what made you think that the Amazon IP is the culprit?


  • So... as I said, I'm not sure about that. It was mostly the coincidence in access times.
    For example:
    2021/01/12 13:46:38 kid1| assertion failed: http.cc:1533: "!Comm::MonitorsRead(serverConnection->fd)"
    [Tue Jan 12 13:46:38 2021].682 475 10.15.31.22 NONE/200 0 CONNECT 13.224.211.105:443 - ORIGINAL_DST/13.224.211.105 -

    After I bypassed the proxy for that network, the problem no longer occurred.
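
The timestamp comparison described here can be automated. The following is a minimal Python sketch under two assumptions: cache.log uses the `YYYY/MM/DD HH:MM:SS kidN|` prefix and access.log uses the bracketed human-readable timestamp, both as in the excerpts in this thread. The function names are made up for the example.

```python
import re
from collections import Counter
from datetime import datetime, timedelta

CACHE_RE = re.compile(
    r"^(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) kid\d+\| assertion failed")
ACCESS_RE = re.compile(
    r"^\[(\w{3} \w{3} +\d+ \d{2}:\d{2}:\d{2} \d{4})\]\.\d+ .* CONNECT (\S+) ")


def crash_times(cache_lines):
    """Collect the timestamps of all 'assertion failed' lines in cache.log."""
    times = []
    for line in cache_lines:
        m = CACHE_RE.match(line)
        if m:
            times.append(datetime.strptime(m.group(1), "%Y/%m/%d %H:%M:%S"))
    return times


def connects_near_crashes(cache_lines, access_lines, window_s=1):
    """Count CONNECT destinations logged within window_s seconds of a crash."""
    crashes = crash_times(cache_lines)
    window = timedelta(seconds=window_s)
    hits = Counter()
    for line in access_lines:
        m = ACCESS_RE.match(line)
        if not m:
            continue
        when = datetime.strptime(m.group(1), "%a %b %d %H:%M:%S %Y")
        if any(abs(when - crash) <= window for crash in crashes):
            hits[m.group(2)] += 1
    return hits


# Lines copied from the excerpts posted in this thread.
CACHE = """\
2021/01/12 13:46:35 kid1| assertion failed: http.cc:1533: "!Comm::MonitorsRead(serverConnection->fd)"
2021/01/12 13:46:38 kid1| assertion failed: http.cc:1533: "!Comm::MonitorsRead(serverConnection->fd)"
2021/01/12 14:58:12 kid1| assertion failed: http.cc:1533: "!Comm::MonitorsRead(serverConnection->fd)"
2021/01/12 14:58:16 kid1| assertion failed: http.cc:1533: "!Comm::MonitorsRead(serverConnection->fd)"
""".splitlines()

ACCESS = """\
[Tue Jan 12 13:46:35 2021].337 438 10.15.31.22 NONE/200 0 CONNECT 13.224.211.105:443 - ORIGINAL_DST/13.224.211.105 -
[Tue Jan 12 13:46:38 2021].682 475 10.15.31.22 NONE/200 0 CONNECT 13.224.211.105:443 - ORIGINAL_DST/13.224.211.105 -
[Tue Jan 12 14:58:12 2021].412 477 10.15.31.22 NONE/200 0 CONNECT 13.224.211.126:443 - ORIGINAL_DST/13.224.211.126 -
[Tue Jan 12 14:58:16 2021].011 424 10.15.31.22 NONE/200 0 CONNECT 13.224.211.126:443 - ORIGINAL_DST/13.224.211.126 -
""".splitlines()

for dest, n in connects_near_crashes(CACHE, ACCESS).most_common():
    print(f"{n:3d}  {dest}")
```

A destination that shows up next to many crashes is only a suspect, not a proven cause. On a busy proxy many unrelated requests fall into the same second, so the full access.log should be used and the result checked by bypassing the suspect network.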


  • @volnei I have the same problem. Even if the problem had been solved, it has resurfaced in the latest Squid package versions 0.4.44_35 and 0.4.44_36.

    2021/01/14 09:30:03 kid1| assertion failed: http.cc:1533: "!Comm::MonitorsRead(serverConnection->fd)"
    2021/01/14 09:30:03 kid1| Starting Squid Cache version 4.10 for amd64-portbld-freebsd11.3...
    2021/01/14 09:30:03 kid1| Service Name: squid


  • @a18g3 Could you test it on the latest pfSense 2.5 snapshot?
    It uses Squid 4.13.