Upgrade 2.4.0: firewall rule with alias and FQDN not working anymore



  • Can you read?

    Diagnostic->Ping is working.

    And it worked before update!



  • Information requested below:

    Alias:
    IP_Syncthing_Clients - Type: Hosts - Entries: (contains many local computer names all registered in DNS by pfSense DHCP Server) my-desktop
    IP_NAS - Type: Hosts - Entries: nas.fqdn.private
    Port_Syncthing_Server_TCP - Type: Ports - Entries: 22000
    

    Looking at the table alias for IP_Syncthing_Clients confirms that the IP address for my-desktop is in there.
    The table alias for IP_NAS says there are no entries in the table. I have tried amending both the description and added a new host name to prompt it to refresh it but still it reports there are no entries in the table.
    Port_Syncthing_Server_TCP doesn't appear in the tables list (I'm assuming only IP ones will?)
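
The same check can be made from a shell instead of Diagnostics -> Tables. This is only a sketch: it assumes the pf table has the same name as the alias, and `count_table_entries` is a made-up helper name, not a pfSense command.

```shell
#!/bin/sh
# Hypothetical helper (not part of pfSense): count the addresses in the
# output of `pfctl -t <table> -T show`, read from stdin.
count_table_entries() {
    # Every non-blank line is one table entry; print 0 for empty input.
    awk 'NF { n++ } END { print n + 0 }'
}

# Example, run as root on the firewall (table name assumed to match the alias):
#   pfctl -t IP_NAS -T show | count_table_entries
```

A result of 0 for an alias whose host names resolve fine matches the symptom described in this post.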

    DNS Resolver Settings:
    General:
    Enable: Ticked
    Port: Default (53)
    Network Interfaces: selected the correct interfaces (LAN and the network the NAS is on)
    Outgoing Interfaces: WAN
    System Domain Local Zone Type: Transparent (default)
    Enable Forwarding Mode: Ticked
    Register the DHCP Leases in the DNS Resolver: Ticked
    Register DHCP static mappings in the DNS Resolver: Ticked
    No Domain Overrides
    
    Advanced:
    Hide Identity: Ticked
    Hide Version: Ticked
    Everything else either unticked or left at defaults
    
    Access Lists: Empty
    
    Rule:
    Action: Pass
    Interface: LAN
    Address Family: IPv4
    Protocol: TCP
    Source: Single Host or Alias: IP_Syncthing_Clients
    Destination: Single Host or Alias: IP_NAS
    Destination Port Range: From (other): Port_Syncthing_Server_TCP, To (other): Port_Syncthing_Server_TCP
    Log packets handled by this rule: Ticked
    Everything else left as default
    
    

    Packets destined to port 22000 on nas.fqdn.private from my-desktop are blocked. If I change the rule and replace IP_NAS with the IP address of the NAS it works fine.

    So it looks like a rule can't match when its alias table has no entries, and the upgrade is hosing some of the alias tables (there are a lot of empty ones). Which raises the question of how to recreate the alias tables without starting from scratch.

    These rules have been in place since 19/2/16 without issue. There are also other rules I have with the same problems. This is just one.



  • Are you using Domain Overrides and query them in your alias table?



  • @ggzengel:

    Are you using Domain Overrides and query them in your alias table?

    No. As I said above there are no Domain Overrides in the DNS Resolver.

    Just to be clear as well the nas.fqdn.private and my-desktop both resolve to the correct IP when using Diagnostics -> Ping.


  • Rebel Alliance Global Moderator

    Dude, post up screenshots of your alias and your diagnostic table… How the hell is pfSense going to resolve smtp.domain.local since that is not a public..

    So you're saying that is a reservation in your DHCP that you register in your forwarder?  Or are you just registering DHCP clients?  If you're not doing an override..

    If you're saying pfSense can resolve it, then it would be in the TABLE.. If it's not in the table then no, your alias would not work.

    Cannot duplicate this.. Plain and simple.. If pfSense can resolve an FQDN, then it shows up in the table.. Be it a local entry or a public entry..




  • @johnpoz:

    Dude, post up screenshots of your alias and your diagnostic table… How the hell is pfSense going to resolve smtp.domain.local since that is not a public..

    As you're replying to ggzengel (as he has smtp.domain.local) I will let him answer. If you're referring to me then let me know.



  • Strange:
    Since update on Friday until yesterday the firewall was blocking the smtp port.
    Yesterday I saved this table entry again in the hope it would work, but it always blocked this port.
    Only changing to IP resolved this problem.
    Today after trying multiple entries with google it's working again.
    Now I have a FQDN entry and the firewall is open again. WTF?



  • @ggzengel:

    Strange:
    Since update on Friday until yesterday the firewall was blocking the smtp port.
    Yesterday I saved this table entry again in the hope it would work, but it always blocked this port.
    Only changing to IP resolved this problem.
    Today after trying multiple entries with google it's working again.
    Now I have a FQDN entry and the firewall is open again. WTF?

    Glad you got yours sorted. I only upgraded yesterday so hopefully I don't need to wait 4 days before it starts working again!

    I have a mixture of internal and external addresses that are in the aliases. All resolve through Diagnostics -> Ping so pfSense knows how to resolve them. But their tables are empty.



  • This is an externally located perimeter firewall, connected to the core network over OpenVPN.
    The SMTP and DNS servers are in the core network. The domain.local TLD is forwarded with a Domain Override.
    This solution (OpenVPN, DNS forwarding, FQDN alias) has been working for years.

    I don't know what happened after the update to disturb this setup so much.
    Normally the tables should be reloaded on interface changes and everything should be all right.

    1. guess: It didn't refresh the alias table, even on saving old entries.
    2. guess: It looks like there was a negative DNS cache entry for the alias tables which doesn't expire as long as it keeps being used. While booting, the FQDN couldn't be resolved.

    Perhaps tonight I can reboot the pfSense and see what happens.



  • Can you test a FQDN you never used before?
    Only to see if it's a caching problem.



  • You mean just to ping?

    I just tried to Diagnostics -> Ping 'hello.fqdn.private' and just 'hello' and both failed as you'd expect.

    UPDATED: Also tried this from the console itself with the same error (again as you'd expect). I rebooted pfSense earlier today and also about 15 minutes ago (in case the aliases 'spring' to life after a reboot - I can but hope).



  • What does Status/System Logs/System/DNS Resolver say?

    Before update it was working:

    
    Sep 22 22:47:42 	filterdns 		adding entry 10.19.4.250 to table smtp_server on host smtp.domain.local
    Sep 22 22:42:48 	filterdns 		failed to resolve host smtp.domain.local will retry later again.
    Sep 22 22:18:56 	dnsmasq 	43335 	using nameserver 8.8.4.4#53
    Sep 22 22:18:56 	dnsmasq 	43335 	using nameserver 8.8.8.8#53
    Sep 22 22:18:56 	dnsmasq 	43335 	ignoring nameserver 127.0.0.1 - local interface 
    
    

    After the update it was working too:

    
    Oct 12 20:41:53 	filterdns 		adding entry 10.19.4.250 to pf table smtp_server for host smtp.domain.local
    Oct 12 20:41:53 	filterdns 		clearing entry 10.19.4.250 from pf table smtp_server on host smtp.domain.local
    Oct 12 20:41:46 	filterdns 		adding entry 10.19.4.250 to pf table smtp_server for host smtp.domain.local
    Oct 12 20:41:46 	filterdns 		clearing entry 10.19.4.250 from pf table smtp_server on host smtp.domain.local
    Oct 12 20:41:45 	filterdns 		adding entry 10.19.4.250 to pf table smtp_server for host smtp.domain.local
    Oct 12 20:41:45 	filterdns 		failed to resolve host smtp.domain.local will retry later again.
    Oct 12 20:26:06 	dnsmasq 	860 	using nameserver 8.8.4.4#53
    Oct 12 20:26:06 	dnsmasq 	860 	using nameserver 8.8.8.8#53
    Oct 12 20:26:06 	dnsmasq 	860 	ignoring nameserver 127.0.0.1 - local interface 
    
    

    Suddenly on Saturday it didn't update this entry any more:

    
    Oct 17 13:30:37 	filterdns 		adding entry 216.58.210.3 to pf table Host for host www.google.de
    Oct 17 13:30:37 	filterdns 		adding entry 10.19.4.250 to pf table Host for host smtp.domain.local
    Oct 14 06:45:01 	filterdns 		clearing entry 10.19.4.250 from pf table smtp_server on host smtp.domain.local
    Oct 14 06:30:01 	filterdns 		adding entry 10.19.4.250 to pf table smtp_server for host smtp.domain.local
    Oct 14 06:30:01 	filterdns 		clearing entry 10.19.4.250 from pf table smtp_server on host smtp.domain.local
    Oct 14 06:15:01 	filterdns 		adding entry 10.19.4.250 to pf table smtp_server for host smtp.domain.local
    Oct 14 06:15:01 	filterdns 		clearing entry 10.19.4.250 from pf table smtp_server on host smtp.domain.local
    Oct 14 06:00:01 	filterdns 		adding entry 10.19.4.250 to pf table smtp_server for host smtp.domain.local
    
    

    It only worked today because I added Google too.
    Yesterday, on Oct 16, I could successfully ping smtp.domain.local. So why didn't it update? Did the job crash?

    I think filterdns has a problem. I have 2 running, and since the second one started I have a fresh alias table:

    
    ps aux | grep filterdns
    root   19719   0.0  0.3  21492  3184  -  Is   13:30       0:00.03 /usr/local/sbin/filterdns -p /var/run/filterdns.pid -i 300 -c /var/etc/filterdns.conf -d 1
    root   58949   0.0  0.3  12784  2616  -  Is   Thu20       0:00.35 /usr/local/sbin/filterdns -p /var/run/filterdns.pid -i 300 -c /var/etc/filterdns.conf -d 1
    root   44060   0.0  0.2  14728  2444  0  S+   15:03       0:00.00 grep filterdns
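
Both instances above point at the same pidfile, so presumably only one filterdns is meant to run at a time. A minimal sketch for spotting duplicates, assuming `ps aux` output in the format shown; `count_filterdns` is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: count filterdns daemons in `ps aux` output read
# from stdin. Matching on the full binary path skips the
# `grep filterdns` line that shows up in the listing.
count_filterdns() {
    awk 'index($0, "/usr/local/sbin/filterdns") { n++ } END { print n + 0 }'
}

# Example, run on the firewall:
#   ps aux | count_filterdns
```

Anything above 1 would be worth a closer look, given how this thread develops.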
    
    


  • @ggzengel:

    
    Sep 22 22:47:42 	filterdns 		adding entry 10.19.4.250 to table smtp_server on host smtp.domain.local
    Sep 22 22:42:48 	filterdns 		failed to resolve host smtp.domain.local will retry later again.
    Sep 22 22:18:56 	dnsmasq 	43335 	using nameserver 8.8.4.4#53
    Sep 22 22:18:56 	dnsmasq 	43335 	using nameserver 8.8.8.8#53
    Sep 22 22:18:56 	dnsmasq 	43335 	ignoring nameserver 127.0.0.1 - local interface 
    
    

    Wait …
    You're asking 8.8.8.8 - 8.8.4.4 (Also known as Google) info about "smtp.domain.local" ?
    Well, yes, that will fail  ;D

    If "smtp.domain.local" has a static IP, add it to Services => DNS Forwarder => Host Overrides and you'll be fine.


  • Rebel Alliance Global Moderator

    yeah…

    Why is it ignoring 127.0.0.1?

    "Sep 22 22:18:56 dnsmasq 43335 ignoring nameserver 127.0.0.1 - local interface"

    edit:  This is forwarder, going to have to forward somewhere ;)  I have not used the forwarder since they enabled unbound.. Well really before that when unbound was just a package.  A resolver is just so much better than a forwarder.  Not sure why anyone still uses it to be honest ;)

    In a nutshell, if you have an alias that is not working, you need to check the table.  If the entries are not in the table, then you need to figure out why resolution of the missing FQDN is not working.  pfSense needs to be able to resolve the FQDN you put in there to be able to put it in the table..

    So normally such problems just come down to name resolution troubleshooting.. Which doesn't look like any was done before bug report filed ;)



  • Sep 22 22:18:56 dnsmasq 43335 using nameserver 8.8.4.4#53
    Sep 22 22:18:56 dnsmasq 43335 using nameserver 8.8.8.8#53
    Sep 22 22:18:56 dnsmasq 43335 ignoring nameserver 127.0.0.1 - local interface

    What you see here is dnsmasq and not filterdns.
    Dnsmasq listens on localhost, so it cannot add itself as an upstream nameserver. That would create a loop.

    When filterdns is running, it works fine:

    
    Oct 13 10:29:32 	filterdns 		adding entry 10.19.4.250 to pf table smtp_server for host smtp.domain.local
    Oct 13 10:29:32 	filterdns 		clearing entry 10.19.4.250 from pf table smtp_server on host smtp.domain.local
    Oct 13 10:15:01 	filterdns 		adding entry 10.19.4.250 to pf table smtp_server for host smtp.domain.local
    Oct 13 10:15:01 	filterdns 		clearing entry 10.19.4.250 from pf table smtp_server on host smtp.domain.local
    Oct 13 10:00:01 	filterdns 		adding entry 10.19.4.250 to pf table smtp_server for host smtp.domain.local
    Oct 13 10:00:01 	filterdns 		clearing entry 10.19.4.250 from pf table smtp_server on host smtp.domain.local
    
    

    But since update it talks too much:

    
    Oct 10 01:47:52 	filterdns 		failed to resolve host smtp.domain.local will retry later again.
    Sep 22 22:47:42 	filterdns 		adding entry 10.19.4.250 to table smtp_server on host smtp.domain.local
    Sep 22 22:42:48 	filterdns 		failed to resolve host smtp.domain.local will retry later again.
    Sep 22 22:18:56 	dnsmasq 	43335 	using nameserver 8.8.4.4#53
    Sep 22 22:18:56 	dnsmasq 	43335 	using nameserver 8.8.8.8#53
    Sep 22 22:18:56 	dnsmasq 	43335 	ignoring nameserver 127.0.0.1 - local interface 
    
    


  • @ggzengel:

    What does Status/System Logs/System/DNS Resolver say?

    DNS Resolver only has the 'unbound' process. There is nothing from filterdns or dnsmasq in there. There is also nothing in System|General for either filterdns or dnsmasq.

    Aren't you using the DNS Forwarder service rather than the DNS Resolver? I'm assuming there are different 'process' entries.

    I'm happy to check anything else out to try and resolve this.



  • edit:  This is forwarder, going to have to forward somewhere ;)  I have not used the forwarder since they enabled unbound.. Well really before that when unbound was just a package.  A resolver is just so much better than a forwarder.  Not sure why anyone still uses it to be honest ;)

    I tried to migrate to unbound last year but I had some problems: https://redmine.pfsense.org/issues/6065
    Because I have more than 40 Overrides I don't like to try it again on this pfsense.
    And there are still some unwanted effects with unbound: https://redmine.pfsense.org/issues/7884



  • @johnpoz:

    In a nutshell, if you have an alias that is not working, you need to check the table.  If the entries are not in the table, then you need to figure out why resolution of the missing FQDN is not working.  pfSense needs to be able to resolve the FQDN you put in there to be able to put it in the table..

    So normally such problems just come down to name resolution troubleshooting.. Which doesn't look like any was done before bug report filed ;)

    So what do you suggest beyond what has been done (by me)?



  • I have messages from filterdns in there. It's not dnsmasq and not unbound.

    Even with unbound on another pfSense I get this:

    
    Oct 13 01:59:09 	filterdns 		adding entry 79.1.2.3 to ipfw table for host dummy.dyndns.org
    Oct 13 00:59:06 	filterdns 		failed to resolve host dummy.dyndns.org will retry later again.
    Oct 12 22:24:47 	filterdns 		adding entry 87.1.2.3 to ipfw table for dummy.dyndns.org
    Oct 12 22:24:45 	unbound 	74165:0 	info: start of service (unbound 1.6.6).
    Oct 12 22:24:45 	unbound 	74165:0 	notice: init module 0: iterator 
    
    


  • I still had these two filterdns running.
    I killed both. For the older one it was enough to send kill. The second one needed kill -9 to stop.

    I removed the test entry with a FQDN inside and pfsense started a new filterdns.
    Now it's not spamming any more. Only on changes (add/delete entries or changing IPs) I see filterdns entries in log.



  • @ggzengel:

    I still had these two filterdns running.
    I killed both. For the older one it was enough to send kill. The second one needed kill -9 to stop.

    I removed the test entry with a FQDN inside and pfsense started a new filterdns.
    Now it's not spamming any more. Only on changes (add/delete entries or changing IPs) I see filterdns entries in log.

    I added a new alias with a new FQDN (www.barrymanilow.com), and on querying its table (Diagnostics -> Tables) it has no entries. I get no filterdns entries in the System|General or System|DNS Resolver logs.

    filterdns does exist and is running (Diagnostics -> Command -> ps -A | grep filterdns).

    I do get these errors in the System|General log file (which could have been there prior to the upgrade, so they may be a red herring):

    Oct 17 19:29:39 dhcpleases kqueue error: unkown
    Oct 17 19:29:38 dhcpleases Could not deliver signal HUP to process because its pidfile (/var/run/unbound.pid) does not exist, No such process.
    Oct 17 19:29:38 dhcpleases /etc/hosts changed size from original!

    The /var/run/unbound.pid does exist.

    I also did a 'cat /etc/hosts' and the nas.fqdn.private entry is in there. I think we can discount the 'if pfSense cannot resolve it, it won't be in a table' issue, as pfSense can not only resolve it, it has put it into its hosts file.

    So the issue is that aliases have no table entries.

    And I'll say again what I said earlier: the upgrade to 2.4 broke this.



  • So I'll update the post with the fix for this.

    The first DNS server listed in System -> General Setup was dead. Name resolution in general was still working fine, as there were another 3 DNS servers in there. As soon as I replaced this one server with a working one I started seeing 'filterdns' entries in the System|DNS Resolver log. I checked the rules and they have started working now as well.
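
Diagnostics -> Ping kept working here even though one server was dead, so probing each configured server on its own may surface this kind of problem sooner. A rough sketch under two assumptions: that the servers from System -> General Setup end up in /etc/resolv.conf, and that drill(1) is installed. Both helper names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the nameserver addresses found in
# resolv.conf-style text on stdin, in listed order.
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }'
}

# Hypothetical helper: ask one server ($1) for one name ($2).
# Assumes drill(1) is installed; any reply that carries an rcode,
# even NXDOMAIN, is taken as proof that the server is alive.
check_nameserver() {
    if drill "@$1" "$2" 2>/dev/null | grep -q 'rcode:'; then
        echo "$1 answered"
    else
        echo "$1 did NOT answer"
    fi
}

# Example, run on the firewall:
#   for ns in $(list_nameservers < /etc/resolv.conf); do
#       check_nameserver "$ns" www.google.de
#   done
```

A server that never answers is a candidate for replacement, which is what fixed the aliases in this case.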

    What I don't understand is:

    1. The majority of the aliases are for internal IP's and therefore don't need external DNS resolution;
    2. The majority of the aliases are for DHCP leases and are therefore registered by the DHCP service and appear in the pfSense Hosts file so again don't need external resolution;
    3. If you have 4 listed DNS servers and one breaks then why should this stop aliases working;
    4. What has changed that this issue did not appear before the upgrade to v2.4;

    If I have a working system and I upgrade it and parts of it stops working then that's a problem. It's a bug. A bug in the upgrade. A bug in the way something works. But it's a bug. Something that should work doesn't. That's clear from this.



  • If I understand it right?

    You have local DNS entries which appear in /etc/hosts?
    And now the first of the external DNS servers is not responding and local IPs (from /etc/hosts) are not resolved in alias table?

    Normally nsswitch.conf looks like:
    hosts: files dns

    What shows yours?



  • @ggzengel:

    If I understand it right?

    You have local DNS entries which appear in /etc/hosts?

    Yes. All of them were static DHCP leases. All of them resolved using Diagnostics -> Ping and Diagnostics -> DNS Lookup in pfSense.

    And now the first of the external DNS servers is not responding and local IPs (from /etc/hosts) are not resolved in alias table?

    They don't appear in the aliases tables. And filterdns entries were missing from the System|DNS Resolver logs.

    Normally nsswitch.conf looks like:
    hosts: files dns

    What shows yours?

    It is:

    hosts: files dns

    Exactly the same.



  • I think you should open a bug in redmine.



  • @ggzengel:

    I think you should open a bug in redmine.

    Normally I would. But there is already a bug open for this. The way it was completely dismissed without waiting for further information and pushed back to the forum means I'm not going to waste my time jumping through the hoops to do it. I appreciate that there was not a lot of information given on the issue raised, but the way it was handled was poor. Pre-empting an issue as not a bug 'because we don't see it here' is a naive viewpoint and does not encourage people to give feedback on the project.

    But I do appreciate your help in this, ggzengel. Between the pair of us it led me to find what I did. It's been much appreciated.



  • I restarted my pfsense and got only one filterdns and it's working.
    Now I will have a look how long it will be stable.



  • Hi All,

    I know this is an old topic, but I too, have noticed this issue occurring since an upgrade to 2.4.2. This definitely wasn't an issue previously, and very few config changes have been made since the upgrade.

    I don't fully understand the process used to build these FQDN aliases, but I'll provide as much info as possible, in the hope it helps narrow down the root cause.

    I've created a test Alias, called Host_Test, containing the FQDN 'www.test.com'.

    • Viewing the table entry for this alias shows an empty table.

    • DNS servers for the firewall are set to 8.8.8.8 and 8.8.4.4. DNS forwarders or resolver are not in use.

    • DNS resolution for this hostname is working fine for both DNS servers under status -> DNS Lookup.

    • Running 'ps -A | grep filterdns' shows there is a process running called filterdns.

    • If I view the log under System -> DNS Resolver, I can see that on the date of the upgrade (I assume on first boot after) there are entries such as the one below for almost all FQDN aliases configured on the firewall. No events have been logged in this log since.

    filterdns failed to resolve host s186.fmp12-hosting.co.uk will retry later again.

    This firewall has an HA partner, which doesn't seem to be experiencing the problem. Based on the total lack of logs since the primary firewall's initial boot, I'm wondering if the root cause is the process hanging (I assume 'filterdns' is the relevant process). Is it possible to safely kill and restart this process, or are there other considerations when doing this?
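
The 'failed to resolve' entries described above can be tallied per host from a log dump. A minimal sketch, assuming log lines in the format quoted in this thread; `failed_hosts` is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: list each host that filterdns reported as
# unresolvable, with the number of failures, from log text on stdin.
failed_hosts() {
    awk '/filterdns/ && /failed to resolve host/ {
        # The host name is the field right after the word "host".
        for (i = 1; i < NF; i++)
            if ($i == "host") { count[$(i + 1)]++; break }
    }
    END { for (h in count) print count[h], h }' | sort -k2
}

# Example (assumes the resolver log is a circular log at this path):
#   clog /var/log/resolver.log | failed_hosts
```

Hosts that fail once at boot and are never mentioned again fit the "process hung" theory better than hosts that keep failing on every retry.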



  • Quick followup. It looks like the process was hung. It's currently working after running "killall -9 filterdns" then saving and applying an Alias to restart the process.

    What's potentially concerning is how soon after bootup this process seems to have stopped responding. Not sure if this is a one off for me, or something peculiar that's happening since the upgrade. I'll update this post if I notice the issue reoccur, especially after the next reboot.
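
The workaround above could be wrapped in a small function for the shell. This is a sketch, not a tested fix: it assumes that running /etc/rc.filter_configure triggers the same filter reload that saving and applying an alias does, and `restart_filterdns` is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical wrapper around the workaround from this thread.
# Set DRY_RUN=1 to print the commands instead of running them.
restart_filterdns() {
    run=${DRY_RUN:+echo}
    # Kill every filterdns instance, including ones that ignore a plain kill.
    $run killall -9 filterdns
    # Assumption: a filter reload starts a fresh filterdns, like saving
    # and applying an alias in the GUI does.
    $run /etc/rc.filter_configure
}

# Example: preview first, then run for real as root.
#   DRY_RUN=1 restart_filterdns
#   restart_filterdns
```

The dry-run switch is there because `killall -9` gives the process no chance to clean up, so it is worth seeing what will run first.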



  • I can confirm the issue and workaround by ChrisCCC
    (The filterdns service hangs; killall -9 filterdns and then a Filter reload in the pfSense GUI solves the issue.)

    I got the same problem after upgrade to 2.4

    Currently running snapshot [2.4.3-DEVELOPMENT (amd64) built on Sun Jan 07 20:44:55 CST 2018]

    Things to consider (in no particular order), that might be causing it:

    • I have a substantial number of hostname records in different firewall aliases (hundreds)

    • After a while some hosts become obsolete (i.e. the hostname no longer resolves to an IP address)

    • Sometimes the DNS servers might not respond quickly (perhaps because of such a big volume of DNS queries)

    I guess something has been broken in the filterdns algorithm since release 2.4: either an incorrect response from a DNS server or the absence of a response causes it to hang.



  • Thanks for the workaround… this bug is driving me nuts too. Killing filterdns fixed the issue, at least temporarily for me. After a couple updates, it'll fail again.



  • It drove me crazy too. I wish I could have read this thread before I spent a few hours looking for what was wrong.

    Also,
    https://forum.pfsense.org/index.php?topic=141441.15 is same topic.
    Maybe Moderators can merge it?



  • Is anybody having this issue using pfBlocker? In my case, removing this package and replacing it with URL table aliases solved the issue.
    Filterdns was receiving several SIGHUP signals before hanging.



  • @Valeriy:

    • I have a substantial number of hostname records in different firewall aliases (hundreds)

    ^This seems to be the problem. I had an alias with a list of hundreds, if not thousands, of server host names. Deleting it, then going to the other aliases with empty tables and clicking Save to trigger a filter reload and a table update, fixed the issue.

    For everyone having this issue, I'd recommend checking your aliases to see whether you or someone else added one with a very large number of host names. If so, back it up, delete it, and re-apply the other affected aliases.

    If we get some people coming back saying this is the cause then the devs will know what to investigate and hopefully get a fix put out for it.

    I'm on 2.4.4-DEVELOPMENT, if anyone finds that info helpful.



  • I've just experienced this issue, and I think I have a solution.

    For a while (including before the upgrade) I'd been seeing errors saying something like:

    ```
    There were error(s) loading the rules: /tmp/rules.debug:24: cannot define table bogonsv6: Cannot allocate memory - The line in question reads [24]: table <bogonsv6> persist file "/etc/bogonsv6"
    @ 2018-05-16 19:13:38
    ```

    From [https://forum.pfsense.org/index.php?topic=145990](https://forum.pfsense.org/index.php?topic=145990) I learned that the size of the BogonsV6 file grew substantially recently. The thread suggests increasing maximum table entries. While investigating, I noticed that the tables for my aliases were empty (**Diagnostics** > **Tables**). After increasing the maximum table entries, my aliases were showing up in the tables view.
    
    I think this was probably a latent issue before the upgrade, caused by the size of the BogonsV6 table, and it's the reboot which has caused it, not the upgrade.
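
To see how close a system is to this limit, the current value can be read from pf and compared with the size of the bogon list. A minimal sketch, assuming `pfctl -s memory` prints a `table-entries hard limit` line; both helper names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: extract the table-entries hard limit from the
# output of `pfctl -s memory`, read from stdin.
table_entry_limit() {
    awk '$1 == "table-entries" { print $NF }'
}

# Hypothetical helper: succeed if a file with $2 lines fits within limit $1.
fits_in_limit() {
    [ "$2" -le "$1" ]
}

# Example, run as root on the firewall:
#   limit=$(pfctl -s memory | table_entry_limit)
#   lines=$(awk 'END { print NR }' /etc/bogonsv6)
#   fits_in_limit "$limit" "$lines" || echo "bogonsv6 alone exceeds the limit"
```

The limit appears to be shared across all tables, which would explain why an oversized bogon list left unrelated alias tables empty.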

 
