• Categories
  • Recent
  • Tags
  • Popular
  • Users
  • Search
  • Register
  • Login
Netgate Discussion Forum
  • Categories
  • Recent
  • Tags
  • Popular
  • Users
  • Search
  • Register
  • Login

So many filterdns instances…

2.1 Snapshot Feedback and Problems - RETIRED
10
57
18.8k
Loading More Posts
  • Oldest to Newest
  • Newest to Oldest
  • Most Votes
Reply
  • Reply as topic
Log in to reply
This topic has been deleted. Only users with topic management privileges can see it.
  • D
    dhatz
    last edited by Jan 5, 2013, 11:36 PM Jan 5, 2013, 11:06 PM

    Seems that running latest snapshot filterdns still has some issues

    clog system.log | tail

    Jan  6 00:58:26 fw php: : Creating rrd update script
    Jan  6 00:58:28 fw php: : Forcefully reloading IPsec racoon daemon
    Jan  6 00:58:28 fw php: : Restarting/Starting all packages.
    Jan  6 00:58:30 fw dhclient[17095]: DHCPREQUEST on em0 to x.y.z.w port 67
    Jan  6 00:58:30 fw dhclient[17095]: DHCPACK from x.y.z.w
    Jan  6 00:58:30 fw dhclient: RENEW
    Jan  6 00:58:30 fw dhclient: Creating resolv.conf
    Jan  6 00:58:30 fw dhclient[17095]: bound to x.y.z.201 – renewal in 43200 seconds.
    Jan  6 00:58:31 fw php: : Resyncing OpenVPN instances for interface WAN.
    Jan  6 00:58:31 fw kernel: pid 50069 (filterdns), uid 0: exited on signal 11 (core dumped)
    Jan  6 00:58:32 fw php: : IPSEC: One or more IPsec tunnel endpoints has changed its IP. Refreshing.
    Jan  6 00:58:34 fw login: login on ttyv0 as root
    Jan  6 00:58:36 fw check_reload_status: Updating all dyndns
    Jan  6 00:58:36 fw check_reload_status: Restarting ipsec tunnels
    Jan  6 00:58:36 fw check_reload_status: Restarting OpenVPN tunnels/interfaces
    Jan  6 00:58:36 fw check_reload_status: Reloading filter
    Jan  6 00:58:43 fw php: : IPSEC: One or more IPsec tunnel endpoints has changed its IP. Refreshing.
    Jan  6 00:58:47 fw kernel: pid 87410 (filterdns), uid 0: exited on signal 11 (core dumped)
    Jan  6 01:01:22 fw php: /firewall_rules.php: Successful login for user 'admin' from: 192.168.100.12
    Jan  6 01:01:22 fw php: /firewall_rules.php: Successful login for user 'admin' from: 192.168.100.12

    uname -a

    FreeBSD fw.localdomain 8.3-RELEASE-p5 FreeBSD 8.3-RELEASE-p5 #1: Sat Jan  5 13:23:58 EST 2013     root@snapshots-8_3-i386.builders.pfsense.org:/usr/obj./usr/pfSensesrc/src/sys/pfSense_SMP.8  i386

    It had the same issue with previous snapshots:

    Jan  5 00:17:03 fw kernel: pid 48375 (filterdns), uid 0: exited on signal 11 (core dumped)
    Jan  5 03:25:34 fw kernel: pid 45341 (filterdns), uid 0: exited on signal 11 (core dumped)
    Jan  5 03:36:13 fw filterdns: host_dns: failed looking up "(null)": hostname nor servname provided, or not known
    Jan  5 03:46:55 fw filterdns: host_dns: failed looking up "(null)": hostname nor servname provided, or not known
    Jan  5 03:57:37 fw filterdns: host_dns: failed looking up "(null)": hostname nor servname provided, or not known
    Jan  5 04:08:19 fw filterdns: host_dns: failed looking up "(null)": hostname nor servname provided, or not known
    Jan  5 04:19:01 fw filterdns: host_dns: failed looking up "(null)": hostname nor servname provided, or not known
    Jan  5 04:29:44 fw filterdns: host_dns: failed looking up "(null)": hostname nor servname provided, or not known
    Jan  5 04:40:26 fw filterdns: host_dns: failed looking up "(null)": hostname nor servname provided, or not known
    Jan  5 04:51:08 fw filterdns: host_dns: failed looking up "(null)": hostname nor servname provided, or not known
    Jan  5 05:02:15 fw filterdns: host_dns: failed looking up "(null)": hostname nor servname provided, or not known
    Jan  6 00:58:31 fw kernel: pid 50069 (filterdns), uid 0: exited on signal 11 (core dumped)
    Jan  6 00:58:47 fw kernel: pid 87410 (filterdns), uid 0: exited on signal 11 (core dumped)
    Jan  6 01:08:57 fw kernel: pid 24930 (filterdns), uid 0: exited on signal 11 (core dumped)

    ls -la /filterdns.core

    -rw–-----  1 root  wheel  4661248 Jan  6 01:08 /filterdns.core

    1 Reply Last reply Reply Quote 0
    • P
      phil.davis
      last edited by Jan 6, 2013, 6:12 AM

      2.1-BETA1 (i386)
      built on Sat Jan 5 17:06:02 EST 2013
      FreeBSD 8.3-RELEASE-p5
      Now I should definitely have all the recent filterdns code changes. Still have the same symptoms, the table gets the correct 11 IP addresses translated from the names at boot. 5 minutes later, filterdns dies:

      [2.1-BETA1][admin@imp-rt-01.imp.infn]/var/log(6): clog system.log | grep filterdns
      Jan  6 11:55:27 imp-rt-01 kernel: pid 27624 (filterdns), uid 0: exited on signal 11
      
      

      As the Greek philosopher Isosceles used to say, "There are 3 sides to every triangle."
      If I helped you, then help someone else - buy someone a gift from the INF catalog http://secure.inf.org/gifts/usd/

      1 Reply Last reply Reply Quote 0
      • E
        eri--
        last edited by Jan 6, 2013, 5:30 PM

        Hrm strange that you see that.
        5 minutes is the default update interval for rechecking names.

        I have run test here with 5 seconds and 10 second update intervals but no issues in that regard!
        That makes still thing the snaps do not have the latest version of filterdns.

        Can you make a md5 of your filterdns ?

        1 Reply Last reply Reply Quote 0
        • D
          dhatz
          last edited by Jan 6, 2013, 6:50 PM

          @ermal:

          Can you make a md5 of your filterdns ?

          MD5 (/usr/local/sbin/filterdns) = b25470f1942956d6f887ff87c99761c4

          1 Reply Last reply Reply Quote 0
          • P
            phil.davis
            last edited by Jan 7, 2013, 2:26 AM

            2.1-BETA1 (i386)
            built on Sun Jan 6 11:15:50 EST 2013
            FreeBSD 8.3-RELEASE-p5

            MD5 (/usr/local/sbin/filterdns) = b25470f1942956d6f887ff87c99761c4
            

            5 minutes after startup:

            [2.1-BETA1][admin@rt-01.mydomain]/root(2): clog /var/log/system.log | grep filterdns
            Jan  7 08:07:02 rt-01 kernel: pid 28781 (filterdns), uid 0: exited on signal 11
            
            

            As the Greek philosopher Isosceles used to say, "There are 3 sides to every triangle."
            If I helped you, then help someone else - buy someone a gift from the INF catalog http://secure.inf.org/gifts/usd/

            1 Reply Last reply Reply Quote 0
            • D
              dhatz
              last edited by Jan 11, 2013, 10:18 PM

              Just bumping up this thread, since filterdns is still exiting + dumping core (note: I had just updated to latest 2.1-BETA1 snapshot)

              1 Reply Last reply Reply Quote 0
              • P
                phil.davis
                last edited by Jan 14, 2013, 7:22 AM

                Bump from me also, now on:
                2.1-BETA1 (i386)
                built on Sun Jan 13 19:34:21 EST 2013
                FreeBSD 8.3-RELEASE-p5
                and still getting:

                Jan 14 12:09:19 imp-rt-01 kernel: pid 34114 (filterdns), uid 0: exited on signal 11 (core dumped)
                

                As the Greek philosopher Isosceles used to say, "There are 3 sides to every triangle."
                If I helped you, then help someone else - buy someone a gift from the INF catalog http://secure.inf.org/gifts/usd/

                1 Reply Last reply Reply Quote 0
                • P
                  phil.davis
                  last edited by Jan 16, 2013, 7:16 PM

                  Some more information. filterdns only crashes if SIGHUP is received and it goes through the "Cleaning up previous hostnames" code:

                  Jan 16 08:57:26 imp-rt-01 filterdns: Received signal SIGHUP(1).
                  Jan 16 08:57:26 imp-rt-01 filterdns: Cleaning up previous hostnames
                  

                  This happens as various interfaces and OpenVPN links come up during startup - filter reloads happen a few times, and are fed to filterdns. It dies with sig 11 at the next scheduled 5 minute wakeup.
                  Something in the reload of filterdns.conf and attempted preservation of existing threads, removal of threads no longer needed, and addition of threads to monitor new IPs, is freeing memory that is still needed. In filterdns.c, merge_config calls clear_config:

                  static void
                  clear_config(struct thread_list *thrlist)
                  {
                  	struct thread_data *thr;
                  
                  	pthread_mutex_lock(&sig_mtx);
                  	while ((thr = TAILQ_FIRST(thrlist)) != NULL) {
                  		if (debug >= 4)
                  			syslog(LOG_ERR, "Cleaning up hostname %s", thr->hostname);
                  		TAILQ_REMOVE(thrlist, thr, next);
                  		if (thr->thr_pid != 0)
                  			pthread_cancel(thr->thr_pid);
                  		clear_hostname_addresses(thr);
                  		if (thr->hostname)
                  			free(thr->hostname);
                  		if (thr->tablename)
                  			free(thr->tablename);
                  		free(thr);
                  	}
                  	pthread_rwlock_unlock(&main_lock);
                  }
                  
                  

                  merge_config sets thr_pid to 0 for threads that should continue on (do not need to be cancelled). But clear_config frees various data for the thread (hostname and tablename) and the thread data itself, even when the thread is not cancelled.
                  When the thread awakes in check_hostname at the 5 minute timer, it will have lost its thr data structure - reference to it will cause sig 11.
                  Perhaps it just needs this code for clear_config:

                  static void
                  clear_config(struct thread_list *thrlist)
                  {
                  	struct thread_data *thr;
                  
                  	pthread_mutex_lock(&sig_mtx);
                  	while ((thr = TAILQ_FIRST(thrlist)) != NULL) {
                  		if (debug >= 4)
                  			syslog(LOG_ERR, "Cleaning up hostname %s", thr->hostname);
                  		TAILQ_REMOVE(thrlist, thr, next);
                  		if (thr->thr_pid != 0) {
                  			pthread_cancel(thr->thr_pid);
                  			clear_hostname_addresses(thr);
                  			if (thr->hostname)
                  				free(thr->hostname);
                  			if (thr->tablename)
                  				free(thr->tablename);
                  			free(thr);
                  		}
                  	}
                  	pthread_rwlock_unlock(&main_lock);
                  }
                  

                  Also, "pthread_rwlock_unlock(&main_lock);" at the end seems odd. Shouldn't it be "pthread_mutex_unlock(&sig_mtx);" - to match the "pthread_mutex_lock(&sig_mtx);" at the start of the routine?
                  @ermal: I don't have an environment to compile in, but this might give enough clues for you to track this down.

                  As the Greek philosopher Isosceles used to say, "There are 3 sides to every triangle."
                  If I helped you, then help someone else - buy someone a gift from the INF catalog http://secure.inf.org/gifts/usd/

                  1 Reply Last reply Reply Quote 0
                  • E
                    eri--
                    last edited by Jan 16, 2013, 9:32 PM

                    Thanks for the analysis pushed a fix.

                    1 Reply Last reply Reply Quote 0
                    • P
                      phil.davis
                      last edited by Jan 17, 2013, 6:32 AM

                      Thanks, now it doesn't crash. But somewhere in the boot process, with OpenVPN links etc coming up, it has a point where it deletes all the table entries then does not recover them again. After boot, my table that should have 11 IP addresses is empty. The log indicates entries being deleted at one point.
                      As a side issue:

                      syslog(LOG_WARNING, "\t DELETED %d addresses(%d) to table %s.", io.pfrio_nadd, address->sa_family, pfd->tablename);
                      

                      should be:

                      syslog(LOG_WARNING, "\t DELETED %d addresses(%d) to table %s.", io.pfrio_ndel, address->sa_family, pfd->tablename);
                      

                      (the debug line is reporting pfrio_nadd when it needs to report pfrio_ndel)

                      If I restart filterdns (kill it by hand, then use Diagnostics:Execute Command:PHP Execute to do:

                      mwexec("/usr/local/sbin/filterdns -p {$g['varrun_path']}/filterdns.pid -i 300 -c {$g['varetc_path']}/filterdns.conf -d 10");
                      

                      It comes up nicely and puts all 11 IPs in the table.
                      After this, the entries survive when I stop and start an OpenVPN client process - the log looks good.
                      @ermal: I will PM you a log of filterdns behaviour at boot with -d 10 set.

                      As the Greek philosopher Isosceles used to say, "There are 3 sides to every triangle."
                      If I helped you, then help someone else - buy someone a gift from the INF catalog http://secure.inf.org/gifts/usd/

                      1 Reply Last reply Reply Quote 0
                      • P
                        phil.davis
                        last edited by Jan 17, 2013, 7:26 AM

                        Also, in filterdns.c main, it:
                        a) reads the config, filling in thread_list
                        b) loops creating a check_hostname thread for each host
                        c) inits main_lock
                        d) creates the thread for merge_config

                        	TAILQ_FOREACH(thr, &thread_list, next) {
                        		error = pthread_create(&thr->thr_pid, &attr, check_hostname, thr);
                        		if (error != 0) {
                        			if (debug >= 1)
                        				syslog(LOG_ERR, "Unable to create monitoring thread for host %s", thr->hostname);
                        		}
                        		pthread_set_name_np(thr->thr_pid, thr->hostname);
                        	}
                        
                        	pthread_rwlock_init(&main_lock, NULL);
                        	sig_mtx = PTHREAD_MUTEX_INITIALIZER;
                                sig_condvar = PTHREAD_COND_INITIALIZER;
                        	error = pthread_create(&sig_thr, &attr, merge_config, NULL);
                        	if (error != 0) {
                        		if (debug >= 1)
                        			syslog(LOG_ERR, "Unable to create signal thread %s", thr->hostname);
                        	}
                        	pthread_set_name_np(sig_thr, "signal-thread");
                        
                        

                        But check_hostname uses main_lock. So it is possible that main_lock is not initialized when check_hostname runs the first time.
                        Maybe that could cause some early accesses to thread_list to be inconsistent?
                        Maybe:

                        pthread_rwlock_init(&main_lock, NULL);
                        

                        should be moved earlier in main.

                        As the Greek philosopher Isosceles used to say, "There are 3 sides to every triangle."
                        If I helped you, then help someone else - buy someone a gift from the INF catalog http://secure.inf.org/gifts/usd/

                        1 Reply Last reply Reply Quote 0
                        • E
                          eri--
                          last edited by Jan 17, 2013, 9:35 PM

                          I did make the code correct but i think the issue was mostly related to getaddrinfo code not reporting correctly the EAGAIN error.
                          This made entries expire, though it does not explain why it does not reenter them.

                          1 Reply Last reply Reply Quote 0
                          • P
                            phil.davis
                            last edited by Jan 18, 2013, 2:49 PM

                            Upgraded to:
                            2.1-BETA1 (i386)
                            built on Fri Jan 18 03:21:43 EST 2013
                            It puts the 11 IP address entries in the table at the start, then sometime over the next few minutes, the addresses are all deleted from the table. The problem comes from when this message is reported 11 times (site names 1 to 11):

                            Jan 18 20:19:43 imp-rt-01 filterdns: Creating a new thread for host site1.dyndns-ip.com!
                            

                            It already has all 11 threads for the 11 names in the table. Then, for whatever reason, it decides to create 11 new threads. In the process, it ends up clearing out the 11 table entries and never actually putting them back.
                            @ermal: I will send another full debug log.

                            As the Greek philosopher Isosceles used to say, "There are 3 sides to every triangle."
                            If I helped you, then help someone else - buy someone a gift from the INF catalog http://secure.inf.org/gifts/usd/

                            1 Reply Last reply Reply Quote 0
                            • D
                              dhatz
                              last edited by Jan 18, 2013, 5:21 PM

                              Upgraded to today's latest snapshot, I'm still getting "exited on signal 11 (core dumped)" and I see only one filterdns process running (whereas in the past there used to be more filterdns processes – for ipsec / CP / etc)

                              1 Reply Last reply Reply Quote 0
                              • I
                                iamzam
                                last edited by Jan 18, 2013, 5:50 PM

                                I have been following this thread because of similar problems with filterdns crash/core dumps and I have an observation:

                                My problem seems to be related to the filterdns that gets started through the vpn/ipsec stuff.

                                After updating to the latest snapshot today:

                                2.1-BETA1 (amd64)
                                built on Fri Jan 18 04:21:30 EST 2013
                                FreeBSD 8.3-RELEASE-p5

                                • I increased the filterdns debug level to 10 (in vpn.inc, line 984, '-d 10' switch) and clicked save on the VPN -> IPsec page to restart the filterdns process monitoring the vpn endpoints.

                                Here is the log output I get after this:

                                Jan 18 12:29:51 pfs check_reload_status: Syncing firewall
                                Jan 18 12:29:51 pfs filterdns: Found hostname vpn.net.loc with netmask 32.
                                Jan 18 12:29:51 pfs filterdns: found entry 10.5.0.6 for (null)
                                Jan 18 12:29:51 pfs filterdns: found entry 10.5.0.6 for (null)
                                Jan 18 12:29:51 pfs filterdns: entry 10.5.0.6 exists in table (null)
                                Jan 18 12:29:51 pfs filterdns: found entry 10.5.0.6 for (null)
                                Jan 18 12:29:51 pfs filterdns: entry 10.5.0.6 exists in table (null)
                                Jan 18 12:29:51 pfs filterdns: Found 1 entries for vpn.net.loc
                                Jan 18 12:29:51 pfs check_reload_status: Restarting ipsec tunnels
                                Jan 18 12:29:51 pfs filterdns: Ran command /usr/local/sbin/pfSctl -c "service reload ipsecdns" with exit status 0 because a dns change on hostname vpn.net.loc was detected.
                                Jan 18 12:29:53 pfs php: : IPSEC: One or more IPsec tunnel endpoints has changed its IP. Refreshing.
                                Jan 18 12:29:58 pfs php: : Could not determine VPN endpoint for 'WAN IPv4 IPsec Mobile Phase1 '
                                Jan 18 12:30:03 pfs php: : Could not determine VPN endpoint for 'WAN IPv4 IPsec Mobile Phase1 '
                                Jan 18 12:30:03 pfs filterdns: Received signal SIGHUP(1).
                                Jan 18 12:30:03 pfs kernel: pid 61925 (filterdns), uid 0: exited on signal 11 (core dumped)

                                This is probably not causing any real problems on my system because my remote vpn endpoint dns doesn't change or if it's related to the mobile ipsec phase1 not having an endpoint I am not sure how that would affect me, but I have noticed the core dump syslog messages and I have read that there can be up to three running filterdns processes (filter, vpn, captiveportal).

                                Hope this helps…

                                1 Reply Last reply Reply Quote 0
                                • E
                                  eri--
                                  last edited by Jan 19, 2013, 3:21 PM

                                  I think all this happens because a filter reload will clear the contents of the table with what the filter config sends in.
                                  I changed filterdns again to force update of addresses on table when a SIGHUP happens.

                                  Hopefully by monday snapshot the updated filterdns will be there.

                                  1 Reply Last reply Reply Quote 0
                                  • P
                                    phil.davis
                                    last edited by Jan 20, 2013, 5:28 PM

                                    2.1-BETA1 (i386)
                                    built on Sat Jan 19 20:44:40 EST 2013
                                    Looking good - Alix nanoBSD test system has been up 9 hours. The table that should translate 11 names to 11 IPs now has 14 IP address entries. (3 of the names have dynamically switched IP in this time.) filterdns is adding to the table and not removing old entries, but I don't really care about that (feature or bug?)

                                    As the Greek philosopher Isosceles used to say, "There are 3 sides to every triangle."
                                    If I helped you, then help someone else - buy someone a gift from the INF catalog http://secure.inf.org/gifts/usd/

                                    1 Reply Last reply Reply Quote 0
                                    • B
                                      bardelot
                                      last edited by Jan 20, 2013, 6:28 PM

                                      @phil.davis:

                                      2.1-BETA1 (i386)
                                      built on Sat Jan 19 20:44:40 EST 2013

                                      There have been a few more changes after that date, you will have to try again tomorrow or so with a newer snapshot.

                                      1 Reply Last reply Reply Quote 0
                                      • D
                                        dhatz
                                        last edited by Jan 20, 2013, 11:31 PM

                                        I just upgraded to latest snapshot but still get filterdns problems:

                                        FreeBSD fw.localdomain 8.3-RELEASE-p5 FreeBSD 8.3-RELEASE-p5 #1: Sat Jan 19 21:12:44 EST 2013     root@snapshots-8_3-i386.builders.pfsense.org:/usr/obj./usr/pfSensesrc/src/sys/pfSense_SMP.8  i386

                                        MD5 (/usr/local/sbin/filterdns) = 6949816348947b7762586fe3c59b356e

                                        …
                                        Jan 21 00:05:28 fw kernel: pid 47308 (filterdns), uid 0: exited on signal 11 (core dumped)
                                        Jan 21 00:05:29 fw check_reload_status: Restarting ipsec tunnels
                                        Jan 21 00:05:30 fw login: login on ttyv0 as root
                                        Jan 21 00:05:36 fw php: : IPSEC: One or more IPsec tunnel endpoints has changed its IP. Refreshing.
                                        Jan 21 00:05:37 fw check_reload_status: Updating all dyndns
                                        Jan 21 00:05:37 fw check_reload_status: Restarting OpenVPN tunnels/interfaces
                                        Jan 21 00:05:38 fw check_reload_status: Reloading filter
                                        Jan 21 00:05:40 fw kernel: pid 83611 (filterdns), uid 0: exited on signal 11 (core dumped)

                                        1 Reply Last reply Reply Quote 0
                                        • E
                                          eri--
                                          last edited by Jan 21, 2013, 8:35 AM

                                          dhatz that happens probably because of upgrade is not replacing the filterdns process.
                                          Can you kill all you filterdns processes before running an upgrade and try again or
                                          extract the archive of the upgrade and install manually the filterdns binary, it is located on usr/local/sbin iirc.

                                          I am tracking even this issue of upgrade not replacing binaries at some time.

                                          1 Reply Last reply Reply Quote 0
                                          27 out of 57
                                          • First post
                                            27/57
                                            Last post
                                          Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.