Is this a bug? Hostname Underscore
-
The problems aren't as bad as they used to be, but many higher-level systems (Java, Python, etc.) still have validators that fail when underscores are present in host names. This can result in some very obscure problems.
-
Maybe so, but that wouldn't have anything to do with the firewall or what it can/should accept.
From the perspective of a separate high level system, they should be valid as they would be dealing with hostnames as seen by DNS, unless I'm not understanding the context. Even for an IPAM system, it shouldn't be validating that way, unless it's optional to conform to a chosen site/org standard.
If a program is told to connect to a host using DNS and the DNS name contains an underscore and it fails, that's definitely a bug in the program at this point in time (thanks to RFC 2181).
-
If a program is told to connect to a host using DNS and the DNS name contains an underscore and it fails, that's definitely a bug in the program at this point in time (thanks to RFC 2181).
Nope, not a bug.
To be very clear, RFC 2181 does not remove the restriction on underscore in host names.
RFC 2181 applies to DNS itself, not to the data carried in DNS. DNS is a database. Host names are data in the database. People tend to equate DNS and host names because the vast majority of data carried in DNS is host name related. But they are not the same, and their restrictions are not the same. This is noted in RFC 2181.
The Wikipedia article on host name restrictions has a good description that calls out the difference:
https://en.wikipedia.org/wiki/Hostname#Restrictions_on_valid_host_names
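For anyone who wants to see what a strict check looks like in practice, here is a minimal sketch in Python of the letters-digits-hyphen rule for host names. The function name and the 253-character cap on the textual form are my own choices for illustration, not taken from any particular validator.

```python
import re

# One host name label: letters, digits and hyphens only, 1 to 63 characters,
# and no hyphen at the start or end. Underscore is deliberately absent.
LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_valid_hostname(name: str) -> bool:
    """Strict host name check (hypothetical helper, illustration only)."""
    if name.endswith("."):
        name = name[:-1]  # tolerate a fully qualified name with trailing dot
    if not name or len(name) > 253:  # assumed cap on the textual form
        return False
    return all(LABEL.fullmatch(label) for label in name.split("."))
```

With this kind of check, `is_valid_hostname("printer-01.example.com")` passes while `is_valid_hostname("hp_printer.example.com")` is rejected, which is the behavior people run into with strict validators.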
-
I'm aware of the distinction, but given how loose most people seem to be with validation... Again: Why bother?
Even Namecheap's DNS front end allows me to enter them for an A record, and I can resolve them fine all the way through to my workstation.
The classical reason it was rejected ("It was shift -, and DNS is not case sensitive") doesn't make much sense. It works in practice, and people are actively using it, so why should we reject it, except to strictly follow the RFC when practically nobody else is? We'd lose functionality, get complaints, etc. It's a losing proposition.
-
I am not saying that pfSense needs to expressly prohibit it. I am saying that it cannot be claimed to be a bug in others' code if problems arise from its use.
-
If they want to strictly point at the RFC and claim that, sure, but the fact is, the cat's out of the bag. People are already doing it, and to not accept that and allow it at this point is probably worth calling a bug. Consumers of the program will eventually demand they "fix" it, and depending on the program, resisting the change is most likely not in their best financial or general interest.
-
To clarify some points:
1. By the letter of the RFC sections that have not been replaced or superseded, underscore is not allowed in hostnames.
2. Despite #1, DNS servers, clients, host operating systems, etc. (including pfSense) have been allowing underscore in hostnames. That is not in line with the original RFC, but it matches what is in actual use and reflects reality. Most likely this is because underscores are now allowed in domain names and service record names, and people are generally lazy and don't want to overdo the validation code.
3. It works fine in many cases, and is in active use.
4. Therefore, anyone in this day and age sticking to the letter of the RFC is being pedantic, and likely doing a disservice to their users. I'd call that a bug, others may not. Agree to disagree.
IDN/IDNA, Unicode, and other standards most likely make it all even more confusing and hard to validate. The standards track documents like RFC 5890 call each section of the FQDN a "label" and treat them equally, while also referencing RFC 953 at times. It's as if nobody wants to come out and say it, but they know nobody is actually paying attention to the old spec now.
And this from RFC 2181:
The DNS itself places only one restriction on the particular labels
that can be used to identify resource records. That one restriction
relates to the length of the label and the full name. The length of
any one label is limited to between 1 and 63 octets. A full domain
name is limited to 255 octets (including the separators). The zero
length full name is defined as representing the root of the DNS tree,
and is typically written and displayed as ".". Those restrictions
aside, any binary string whatever can be used as the label of any
resource record. Similarly, any binary string can serve as the value
of any record that includes a domain name as some or all of its value
(SOA, NS, MX, PTR, CNAME, and any others that may be added).
Implementations of the DNS protocols must not place any restrictions
on the labels that can be used. In particular, DNS servers must not
refuse to serve a zone because it contains labels that might not be
acceptable to some DNS client programs. A DNS server may be
configurable to issue warnings when loading, or even to refuse to
load, a primary zone containing labels that might be considered
questionable, however this should not happen by default.
Using that interpretation, the hostname is just another label part of the record name, and the only restriction is length. And pfSense must not refuse to allow a record just because some client might not like it.
So if it works on your application or operating system, have at it. We won't get in the way. If you have something that refuses to let it work, you can accommodate it manually by not using underscores.
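To make "the only restriction is length" concrete, here is a rough Python sketch of a length-only check in the spirit of the quoted RFC 2181 text. The function name is made up for illustration. It also assumes the name arrives as dot-separated text, that labels are encoded as UTF-8, and that the 255-octet limit is counted the way the wire format does it (one length octet per label plus one for the root), so treat it as one reading rather than a reference implementation.

```python
def is_valid_dns_name(name: str) -> bool:
    """Length-only check: no opinion on which characters a label contains."""
    if name == ".":
        return True  # the zero-length full name is the root
    if name.endswith("."):
        name = name[:-1]  # drop the trailing dot of a fully qualified name
    # Assumption: labels are UTF-8 text; real labels can be any binary string.
    labels = [label.encode("utf-8") for label in name.split(".")]
    if any(not 1 <= len(label) <= 63 for label in labels):
        return False
    # Assumption: count one length octet per label plus the root's zero octet.
    wire_length = sum(len(label) + 1 for label in labels) + 1
    return wire_length <= 255
```

Under this check `is_valid_dns_name("my_printer.example.com")` is accepted, as is a service name such as `_sip._tcp.example.com`, while a 64-character label or an empty label is rejected.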
-
If it violates the RFC then it's a bug. The fact it is being widely used and accommodated does not change that. The devices, code, people, etc. that are using it and accommodating it are in error.
This is how de facto standards end up coming about and creating incompatibilities.
On the page referenced in the OP, only the "hostname" portion is explicitly being requested. Not a domain name, FQDN, label, or DNS record.
Now that a de facto standard has permeated the environment, it becomes a matter of practicality whether or not to enforce the industry standard. From the practicality perspective I personally don't really have a dog in the show one way or the other.
I do think though that there needs to be a clear, agreed-upon industry standard and that it should either be followed or changed. Not left up to people's whims about what they want to do.
If those not following the standard are negatively impacted by the standard being enforced, then that is on them, and if they don't like it then they should put in the work to get the standard changed to suit them.
Not enforcing the standard also impacts those who do follow it. Yes, because pfSense isn't enforcing the standard and accepted an invalid hostname containing underscores from a DHCP client, I had to troubleshoot an accessibility problem with an application that correctly enforces the standard.
-
How I read the section I quoted, in context with the other RFCs is:
1. If you are a host, you probably shouldn't allow _ in your own hostname. This is all you can realistically control.
2. If you are a DNS client, you should resolve whatever you're asked to resolve.
3. If you are a DNS server, you must allow anything anywhere in both a record name and data in any part (within the stated length restrictions).
But it's not clearly stated, and with having to correlate a bunch of RFCs (including some not formally accepted, like the standards track IDN/IDNA ones), it's very confusing for everyone.
Which is why I fall back to: There is no benefit to being strict at this point in time, and if we were to be strict after being lenient, it would only generate complaints and wouldn't accomplish anything meaningful.
-
There is a #4, which is all the infrastructure and applications that use host names. This is by far the biggest source of issues.
I agree that #2 and #3 are clear. DNS server and DNS resolver implementations are supposed to be blind to the meaning of the data being resolved and not impose any restrictions beyond RFC 2181. However, it is up to the higher layers to determine if the data returned is considered valid in context. In simple command line terms, dig is data blind and must return the result of the query without interpretation, however that doesn't mean that an application like ssh must honor the result as a valid host name.
Although DNS servers are supposed to be data blind, they aren't always. The most commonly used name server is still BIND. One of the reasons that it isn't liked by some is that it has a lot of features that extend beyond the basic function of answering queries. Host name validation is one of these features. BIND will actually inspect A and AAAA records for validity, warning about and optionally ignoring host names containing underscores. Some would consider this a convenient feature, others might consider this a violation of RFC 2181. :)
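As a rough illustration of that kind of per-record-type checking, here is a small Python sketch. It is not BIND's code or configuration; the function name, the "warn"/"fail"/"ignore" mode strings, and the record tuples are all hypothetical and only meant to show the idea of validating host-type records while leaving other record types alone.

```python
import re

# Letters-digits-hyphen label, 1 to 63 characters, no leading/trailing hyphen.
LDH_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")

# Only these record types have owner names treated as host names here.
HOST_RECORD_TYPES = {"A", "AAAA"}


def check_records(records, mode="warn"):
    """Return (accepted, warnings) for an iterable of (owner_name, rtype).

    mode="warn" keeps questionable records but reports them,
    mode="fail" drops them, mode="ignore" skips the check entirely.
    """
    accepted, warnings = [], []
    for owner, rtype in records:
        labels = owner.rstrip(".").split(".")
        questionable = rtype in HOST_RECORD_TYPES and not all(
            LDH_LABEL.fullmatch(label) for label in labels
        )
        if questionable and mode != "ignore":
            warnings.append(f"{owner} ({rtype}): not a valid host name")
            if mode == "fail":
                continue  # drop the record instead of serving it
        accepted.append((owner, rtype))
    return accepted, warnings


zone = [
    ("hp_printer.lan.", "A"),           # underscore in a host name
    ("_sip._tcp.example.com.", "SRV"),  # underscores are normal for SRV
    ("www.example.com.", "AAAA"),
]
accepted, warnings = check_records(zone, mode="warn")
```

In "warn" mode all three records are still served and only the A record for `hp_printer.lan.` produces a warning; the SRV name is never checked because it is not a host name. A real implementation would also have to deal with wildcards and other special owner names, which this sketch does not.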
-
Here is an interesting piece of history:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=176093
FreeBSD responding to Windows violating the RFCs.
What makes it particularly entertaining is that while the Windows operating system was actively violating the RFCs, Internet Explorer was actively enforcing them. I haven't used Windows in years so I can't confirm, but it appears that they still are enforcing the RFCs in IE:
-
Yes, because pfSense isn't enforcing the standard and accepted an invalid hostname containing underscores from a DHCP client, I had to troubleshoot an accessibility problem with an application that correctly enforces the standard.
Btw, what was the application? Is it Java based?
-
Yes, because pfSense isn't enforcing the standard and accepted an invalid hostname containing underscores from a DHCP client, I had to troubleshoot an accessibility problem with an application that correctly enforces the standard.
Btw, what was the application? Is it Java based?
You already know what the application was. You mentioned it and posted a link to the "non-bug" in your previous post.
https://connect.microsoft.com/IE/feedback/details/853796/internet-explorer-wont-save-cookies-on-domains-with-underscores-in-them
So consider it verified that IE is still enforcing the RFC. Like so many others should be.
In my case it was an HP printer that was issuing an invalid hostname containing underscores. IE would open the printer's built-in web page, but the page would not work correctly because IE wasn't saving the cookies. I had to work around it by accessing the printer by IP address instead, until I figured out what the issue was. It would have been much more obvious if pfSense had refused to register the invalid hostname provided by the client. Fortunately the latest printer firmware doesn't allow or use underscore in the hostname. In my opinion neither should pfSense accept underscore in hostnames. They are not valid, i.e. per spec. Just because people mistakenly/incorrectly/ill-advisedly use underscores in hostnames does not make them valid.
If people want to operate outside of spec then they should be ready and willing to bear the burden when the spec is enforced.