Netgate Discussion Forum

    So many filterdns instances…

    2.1 Snapshot Feedback and Problems - RETIRED
    57 Posts 10 Posters 20.3k Views
    • E
      eri--
      last edited by

      Upgrade to today's snapshot (Jan 3); it should be better. Maybe you caught a snap with intermediate changes.

      Also, if you run top -H you should see the hostname that each thread is running for.

      • P
        phil.davis
        last edited by

        A later/Jan 3 snap is not up yet. I will upgrade and report back when Jan 3 snap appears.

        As the Greek philosopher Isosceles used to say, "There are 3 sides to every triangle."
        If I helped you, then help someone else - buy someone a gift from the INF catalog http://secure.inf.org/gifts/usd/

        • P
          phil.davis
          last edited by

          2.1-BETA1 (i386)
          built on Thu Jan 3 02:32:11 EST 2013
          FreeBSD 8.3-RELEASE-p5
          still has the same failed looking up "(null)" message every 5 minutes.
          There is another snap up now 06:39 - I'll load that now and see…

          • P
            phil.davis
            last edited by

            2.1-BETA1 (i386)
            built on Thu Jan 3 19:04:10 EST 2013
            FreeBSD 8.3-RELEASE-p5

            Jan  4 08:51:17 imp-rt-01 filterdns: host_dns: failed looking up "(null)": hostname nor servname provided, or not known
            Jan  4 08:56:17 imp-rt-01 filterdns: host_dns: failed looking up "(null)": hostname nor servname provided, or not known
            Jan  4 09:01:18 imp-rt-01 filterdns: host_dns: failed looking up "(null)": hostname nor servname provided, or not known
            Jan  4 09:06:18 imp-rt-01 filterdns: host_dns: failed looking up "(null)": hostname nor servname provided, or not known
            
            

            This message is still logged every 5 minutes.

            • P
              phil.davis
              last edited by

              I checked in Diagnostics:Tables.

              1. On a system that is running Mon Dec 31 12:20:48 EST 2012 snap (before the recent filterdns changes), my INF_iinet_ips table is long - it has the current 11 IP addresses that go with the 11 names in the table, and also has lots of old IP addresses that were dynamically allocated in the past.
                (I think the recent filterdns changes will now be clearing up old entries)

              2. On the system running Thu Jan 3 19:04:10 EST 2013 snap, there are exactly 11 IP addresses in the table, but they are out-of-date compared to the addresses I get with nslookup from my desktop. I rebooted and the 11 IP addresses are now current (so filterdns must be looking them up OK when it starts). I will monitor the table and see if the addresses go out-of-date over time.

              filterdns: host_dns: failed looking up "(null)": hostname nor servname provided, or not known
              

              still in syslog every 5 minutes.

              • E
                eri--
                last edited by

                Should be corrected with tomorrow's snapshot.

                • P
                  phil.davis
                  last edited by

                  2.1-BETA1 (i386)
                  built on Fri Jan 4 17:38:46 EST 2013
                  FreeBSD 8.3-RELEASE-p5
                  Alix 32-bit nanoBSD
                  filterdns starts at bootup and successfully gets the current IP addresses for the 11 names in my alias table.
                  5 minutes later it dies (when it wakes up to check again, I suppose), with this in syslog:

                  kernel: pid 24638 (filterdns), uid 0: exited on signal 11
                  

                  ps ax | grep filterdns
                  reveals that there is no filterdns process any more.
                  I rebooted, and the same behaviour is repeatable.

                  • C
                    cybercare
                    last edited by

                    You probably need to try today's snap. He said yesterday that the fix would be in today's snapshot, and you are still listing a Jan 4 snap.

                    • D
                      dhatz
                      last edited by

                      It seems that filterdns still has some issues on the latest snapshot.

                      clog system.log | tail

                      Jan  6 00:58:26 fw php: : Creating rrd update script
                      Jan  6 00:58:28 fw php: : Forcefully reloading IPsec racoon daemon
                      Jan  6 00:58:28 fw php: : Restarting/Starting all packages.
                      Jan  6 00:58:30 fw dhclient[17095]: DHCPREQUEST on em0 to x.y.z.w port 67
                      Jan  6 00:58:30 fw dhclient[17095]: DHCPACK from x.y.z.w
                      Jan  6 00:58:30 fw dhclient: RENEW
                      Jan  6 00:58:30 fw dhclient: Creating resolv.conf
                      Jan  6 00:58:30 fw dhclient[17095]: bound to x.y.z.201 -- renewal in 43200 seconds.
                      Jan  6 00:58:31 fw php: : Resyncing OpenVPN instances for interface WAN.
                      Jan  6 00:58:31 fw kernel: pid 50069 (filterdns), uid 0: exited on signal 11 (core dumped)
                      Jan  6 00:58:32 fw php: : IPSEC: One or more IPsec tunnel endpoints has changed its IP. Refreshing.
                      Jan  6 00:58:34 fw login: login on ttyv0 as root
                      Jan  6 00:58:36 fw check_reload_status: Updating all dyndns
                      Jan  6 00:58:36 fw check_reload_status: Restarting ipsec tunnels
                      Jan  6 00:58:36 fw check_reload_status: Restarting OpenVPN tunnels/interfaces
                      Jan  6 00:58:36 fw check_reload_status: Reloading filter
                      Jan  6 00:58:43 fw php: : IPSEC: One or more IPsec tunnel endpoints has changed its IP. Refreshing.
                      Jan  6 00:58:47 fw kernel: pid 87410 (filterdns), uid 0: exited on signal 11 (core dumped)
                      Jan  6 01:01:22 fw php: /firewall_rules.php: Successful login for user 'admin' from: 192.168.100.12
                      Jan  6 01:01:22 fw php: /firewall_rules.php: Successful login for user 'admin' from: 192.168.100.12

                      uname -a

                      FreeBSD fw.localdomain 8.3-RELEASE-p5 FreeBSD 8.3-RELEASE-p5 #1: Sat Jan  5 13:23:58 EST 2013     root@snapshots-8_3-i386.builders.pfsense.org:/usr/obj./usr/pfSensesrc/src/sys/pfSense_SMP.8  i386

                      It had the same issue with previous snapshots:

                      Jan  5 00:17:03 fw kernel: pid 48375 (filterdns), uid 0: exited on signal 11 (core dumped)
                      Jan  5 03:25:34 fw kernel: pid 45341 (filterdns), uid 0: exited on signal 11 (core dumped)
                      Jan  5 03:36:13 fw filterdns: host_dns: failed looking up "(null)": hostname nor servname provided, or not known
                      Jan  5 03:46:55 fw filterdns: host_dns: failed looking up "(null)": hostname nor servname provided, or not known
                      Jan  5 03:57:37 fw filterdns: host_dns: failed looking up "(null)": hostname nor servname provided, or not known
                      Jan  5 04:08:19 fw filterdns: host_dns: failed looking up "(null)": hostname nor servname provided, or not known
                      Jan  5 04:19:01 fw filterdns: host_dns: failed looking up "(null)": hostname nor servname provided, or not known
                      Jan  5 04:29:44 fw filterdns: host_dns: failed looking up "(null)": hostname nor servname provided, or not known
                      Jan  5 04:40:26 fw filterdns: host_dns: failed looking up "(null)": hostname nor servname provided, or not known
                      Jan  5 04:51:08 fw filterdns: host_dns: failed looking up "(null)": hostname nor servname provided, or not known
                      Jan  5 05:02:15 fw filterdns: host_dns: failed looking up "(null)": hostname nor servname provided, or not known
                      Jan  6 00:58:31 fw kernel: pid 50069 (filterdns), uid 0: exited on signal 11 (core dumped)
                      Jan  6 00:58:47 fw kernel: pid 87410 (filterdns), uid 0: exited on signal 11 (core dumped)
                      Jan  6 01:08:57 fw kernel: pid 24930 (filterdns), uid 0: exited on signal 11 (core dumped)

                      ls -la /filterdns.core

                      -rw-------  1 root  wheel  4661248 Jan  6 01:08 /filterdns.core

                      • P
                        phil.davis
                        last edited by

                        2.1-BETA1 (i386)
                        built on Sat Jan 5 17:06:02 EST 2013
                        FreeBSD 8.3-RELEASE-p5
                        Now I should definitely have all the recent filterdns code changes. I still have the same symptoms: the table gets the correct 11 IP addresses resolved from the names at boot, and 5 minutes later filterdns dies:

                        [2.1-BETA1][admin@imp-rt-01.imp.infn]/var/log(6): clog system.log | grep filterdns
                        Jan  6 11:55:27 imp-rt-01 kernel: pid 27624 (filterdns), uid 0: exited on signal 11
                        
                        

                        • E
                          eri--
                          last edited by

                          Hmm, strange that you see that.
                          5 minutes is the default update interval for rechecking names.

                          I have run tests here with 5-second and 10-second update intervals and saw no issues in that regard.
                          That still makes me think the snaps do not have the latest version of filterdns.

                          Can you make a md5 of your filterdns ?

                          • D
                            dhatz
                            last edited by

                            @ermal:

                            Can you make a md5 of your filterdns ?

                            MD5 (/usr/local/sbin/filterdns) = b25470f1942956d6f887ff87c99761c4

                            • P
                              phil.davis
                              last edited by

                              2.1-BETA1 (i386)
                              built on Sun Jan 6 11:15:50 EST 2013
                              FreeBSD 8.3-RELEASE-p5

                              MD5 (/usr/local/sbin/filterdns) = b25470f1942956d6f887ff87c99761c4
                              

                              5 minutes after startup:

                              [2.1-BETA1][admin@rt-01.mydomain]/root(2): clog /var/log/system.log | grep filterdns
                              Jan  7 08:07:02 rt-01 kernel: pid 28781 (filterdns), uid 0: exited on signal 11
                              
                              

                              • D
                                dhatz
                                last edited by

                                Just bumping up this thread, since filterdns is still exiting + dumping core (note: I had just updated to latest 2.1-BETA1 snapshot)

                                • P
                                  phil.davis
                                  last edited by

                                  Bump from me also, now on:
                                  2.1-BETA1 (i386)
                                  built on Sun Jan 13 19:34:21 EST 2013
                                  FreeBSD 8.3-RELEASE-p5
                                  and still getting:

                                  Jan 14 12:09:19 imp-rt-01 kernel: pid 34114 (filterdns), uid 0: exited on signal 11 (core dumped)
                                  

                                  • P
                                    phil.davis
                                    last edited by

                                    Some more information. filterdns only crashes if SIGHUP is received and it goes through the "Cleaning up previous hostnames" code:

                                    Jan 16 08:57:26 imp-rt-01 filterdns: Received signal SIGHUP(1).
                                    Jan 16 08:57:26 imp-rt-01 filterdns: Cleaning up previous hostnames
                                    

                                    This happens as various interfaces and OpenVPN links come up during startup - filter reloads happen a few times, and are fed to filterdns. It dies with sig 11 at the next scheduled 5 minute wakeup.
                                    Something in the reload of filterdns.conf and attempted preservation of existing threads, removal of threads no longer needed, and addition of threads to monitor new IPs, is freeing memory that is still needed. In filterdns.c, merge_config calls clear_config:

                                    static void
                                    clear_config(struct thread_list *thrlist)
                                    {
                                    	struct thread_data *thr;
                                    
                                    	pthread_mutex_lock(&sig_mtx);
                                    	while ((thr = TAILQ_FIRST(thrlist)) != NULL) {
                                    		if (debug >= 4)
                                    			syslog(LOG_ERR, "Cleaning up hostname %s", thr->hostname);
                                    		TAILQ_REMOVE(thrlist, thr, next);
                                    		if (thr->thr_pid != 0)
                                    			pthread_cancel(thr->thr_pid);
                                    		clear_hostname_addresses(thr);
                                    		if (thr->hostname)
                                    			free(thr->hostname);
                                    		if (thr->tablename)
                                    			free(thr->tablename);
                                    		free(thr);
                                    	}
                                    	pthread_rwlock_unlock(&main_lock);
                                    }
                                    
                                    

                                    merge_config sets thr_pid to 0 for threads that should continue on (do not need to be cancelled). But clear_config frees various data for the thread (hostname and tablename) and the thread data itself, even when the thread is not cancelled.
                                    When the thread awakes in check_hostname at the 5 minute timer, it will have lost its thr data structure - reference to it will cause sig 11.
                                    Perhaps it just needs this code for clear_config:

                                    static void
                                    clear_config(struct thread_list *thrlist)
                                    {
                                    	struct thread_data *thr;
                                    
                                    	pthread_mutex_lock(&sig_mtx);
                                    	while ((thr = TAILQ_FIRST(thrlist)) != NULL) {
                                    		if (debug >= 4)
                                    			syslog(LOG_ERR, "Cleaning up hostname %s", thr->hostname);
                                    		TAILQ_REMOVE(thrlist, thr, next);
                                    		if (thr->thr_pid != 0) {
                                    			pthread_cancel(thr->thr_pid);
                                    			clear_hostname_addresses(thr);
                                    			if (thr->hostname)
                                    				free(thr->hostname);
                                    			if (thr->tablename)
                                    				free(thr->tablename);
                                    			free(thr);
                                    		}
                                    	}
                                    	pthread_rwlock_unlock(&main_lock);
                                    }
                                    

                                    Also, "pthread_rwlock_unlock(&main_lock);" at the end seems odd. Shouldn't it be "pthread_mutex_unlock(&sig_mtx);" - to match the "pthread_mutex_lock(&sig_mtx);" at the start of the routine?
                                    @ermal: I don't have an environment to compile in, but this might give enough clues for you to track this down.
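
The ownership rule described in this post can be sketched in isolation. This is only an illustration under stated assumptions, not the filterdns source: `struct host_entry`, the `keep` flag (standing in for the `thr_pid == 0` convention), `list_mtx`, `new_entry` and `clear_list` are all hypothetical names. The sketch frees only the entries that are not being carried over, and it releases the same mutex it acquired.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <sys/queue.h>

/* Hypothetical stand-in for the per-host thread data in filterdns. */
struct host_entry {
	char *hostname;
	int keep;	/* nonzero: a worker still owns this entry */
	TAILQ_ENTRY(host_entry) next;
};
TAILQ_HEAD(host_list, host_entry);

pthread_mutex_t list_mtx = PTHREAD_MUTEX_INITIALIZER;

/* Allocate an entry holding its own copy of name; NULL on failure. */
struct host_entry *
new_entry(const char *name, int keep)
{
	struct host_entry *e;
	size_t len;

	assert(name != NULL);
	len = strlen(name) + 1;
	if ((e = calloc(1, sizeof(*e))) == NULL)
		return (NULL);
	if ((e->hostname = malloc(len)) == NULL) {
		free(e);
		return (NULL);
	}
	memcpy(e->hostname, name, len);
	e->keep = keep;
	return (e);
}

/*
 * Unlink every entry from the list, freeing only those that nobody
 * else still uses. Returns the number of entries freed.
 */
int
clear_list(struct host_list *list)
{
	struct host_entry *e;
	int freed = 0;

	assert(list != NULL);
	pthread_mutex_lock(&list_mtx);
	while ((e = TAILQ_FIRST(list)) != NULL) {
		TAILQ_REMOVE(list, e, next);
		if (e->keep)
			continue;	/* still referenced elsewhere: do not free */
		free(e->hostname);
		free(e);
		freed++;
	}
	pthread_mutex_unlock(&list_mtx);	/* matches the lock taken above */
	return (freed);
}
```

In threaded code, an entry whose thread was cancelled should only be freed once that thread is known to have stopped (for example after `pthread_join`), because `pthread_cancel` only requests cancellation and returns immediately.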

                                    • E
                                      eri--
                                      last edited by

                                      Thanks for the analysis; I pushed a fix.

                                      • P
                                        phil.davis
                                        last edited by

                                        Thanks, now it doesn't crash. But somewhere in the boot process, with OpenVPN links etc coming up, it has a point where it deletes all the table entries then does not recover them again. After boot, my table that should have 11 IP addresses is empty. The log indicates entries being deleted at one point.
                                        As a side issue:

                                        syslog(LOG_WARNING, "\t DELETED %d addresses(%d) to table %s.", io.pfrio_nadd, address->sa_family, pfd->tablename);
                                        

                                        should be:

                                        syslog(LOG_WARNING, "\t DELETED %d addresses(%d) to table %s.", io.pfrio_ndel, address->sa_family, pfd->tablename);
                                        

                                        (the debug line is reporting pfrio_nadd when it needs to report pfrio_ndel)

                                        If I restart filterdns by hand (kill it, then use Diagnostics:Execute Command:PHP Execute to run the command below), it comes up nicely and puts all 11 IPs in the table:

                                        mwexec("/usr/local/sbin/filterdns -p {$g['varrun_path']}/filterdns.pid -i 300 -c {$g['varetc_path']}/filterdns.conf -d 10");

                                        After this, the entries survive when I stop and start an OpenVPN client process - the log looks good.
                                        @ermal: I will PM you a log of filterdns behaviour at boot with -d 10 set.

                                        • P
                                          phil.davis
                                          last edited by

                                          Also, in filterdns.c main, it:
                                          a) reads the config, filling in thread_list
                                          b) loops creating a check_hostname thread for each host
                                          c) inits main_lock
                                          d) creates the thread for merge_config

                                          	TAILQ_FOREACH(thr, &thread_list, next) {
                                          		error = pthread_create(&thr->thr_pid, &attr, check_hostname, thr);
                                          		if (error != 0) {
                                          			if (debug >= 1)
                                          				syslog(LOG_ERR, "Unable to create monitoring thread for host %s", thr->hostname);
                                          		}
                                          		pthread_set_name_np(thr->thr_pid, thr->hostname);
                                          	}
                                          
                                          	pthread_rwlock_init(&main_lock, NULL);
                                          	sig_mtx = PTHREAD_MUTEX_INITIALIZER;
                                          	sig_condvar = PTHREAD_COND_INITIALIZER;
                                          	error = pthread_create(&sig_thr, &attr, merge_config, NULL);
                                          	if (error != 0) {
                                          		if (debug >= 1)
                                          			syslog(LOG_ERR, "Unable to create signal thread %s", thr->hostname);
                                          	}
                                          	pthread_set_name_np(sig_thr, "signal-thread");
                                          
                                          

                                          But check_hostname uses main_lock. So it is possible that main_lock is not initialized when check_hostname runs the first time.
                                          Maybe that could cause some early accesses to thread_list to be inconsistent?
                                          Maybe:

                                          pthread_rwlock_init(&main_lock, NULL);
                                          

                                          should be moved earlier in main.

                                          • E
                                            eri--
                                            last edited by

                                            I did correct the code, but I think the issue was mostly related to the getaddrinfo code not correctly reporting the EAGAIN error.
                                            This made entries expire, though it does not explain why it does not re-enter them.
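
One way to keep a temporary resolver failure from expiring table entries is to classify the `getaddrinfo` return code before acting on it. The sketch below is an assumption-laden illustration, not the pushed fix: it assumes the EAGAIN mentioned above corresponds to the `EAI_AGAIN` return code of `getaddrinfo`, and `enum lookup_result`, `classify_gai` and `resolve_host` are hypothetical names. It also rejects a NULL hostname before the lookup is attempted.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

/* Hypothetical outcome codes for one lookup attempt. */
enum lookup_result { LOOKUP_OK, LOOKUP_RETRY, LOOKUP_FAILED };

/* Map a getaddrinfo() return code onto a retry decision. */
enum lookup_result
classify_gai(int gai_err)
{
	if (gai_err == 0)
		return (LOOKUP_OK);
	if (gai_err == EAI_AGAIN)
		return (LOOKUP_RETRY);	/* temporary: keep the old addresses */
	return (LOOKUP_FAILED);
}

/*
 * Resolve hostname. On LOOKUP_OK, *res holds the result list and the
 * caller must release it with freeaddrinfo(). Otherwise *res is NULL.
 */
enum lookup_result
resolve_host(const char *hostname, int ai_flags, struct addrinfo **res)
{
	struct addrinfo hints;
	enum lookup_result r;

	assert(res != NULL);
	*res = NULL;
	if (hostname == NULL)
		return (LOOKUP_FAILED);	/* never pass NULL to the resolver */

	memset(&hints, 0, sizeof(hints));
	hints.ai_family = AF_UNSPEC;
	hints.ai_socktype = SOCK_STREAM;
	hints.ai_flags = ai_flags;

	r = classify_gai(getaddrinfo(hostname, NULL, &hints, res));
	if (r != LOOKUP_OK)
		*res = NULL;
	return (r);
}
```

A caller following this approach would leave the existing table entries untouched on `LOOKUP_RETRY` and try again at the next interval, replacing them only after a `LOOKUP_OK`.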

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.