Bugs related to DNS, DHCP, Bridging and the GUI
-
Dear all,
I have been "playing around" with 2.0-BETA1 and have uncovered a few bugs related to DNS, DHCP, bridging and the GUI.
The description below is looooong, but I hope it will enable the superb development team to find the problem, or that someone can just tell me what I am doing wrong…
I am working on a Soekris net5501 with 8 ethernet ports and a single SSD disk.
My objective is to have one port connect by DHCP to the upstream WAN, and the other ports bridged together to serve LAN clients.
First I installed the 20100308 01:34 snapshot; full install with embedded kernel (not the embedded nano).
After seeing a lot of the problems mentioned below, I decided to start over and note down exactly what I did.
I did a console upgrade to the 20100319 04:58 snapshot and a console Reset to factory settings, followed by a reboot.
On serial console:
- Assign LAN and WAN interfaces
- Set LAN interface ip address (192.168.xx.222/24; enable DHCP)
All is good; server connects to upstream server on WAN; client gets IP address from DHCP; DNS is resolving.
On admin interface:
In setup wizard assign hostname and domain.
Leave everything else alone.

The final page says "A reload is now in progress […] You can click on the icon above to access the site more quickly."
But the icon links to https://192.168.xx.222/192.168.xx.222.
This results in a 404 error.
1 I THINK THIS IS A BUG - it should point to https://192.168.xx.222/
Leaving it for 2 minutes however redirects to the correct page.
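One possible explanation for the doubled address (an assumption, the wizard's actual markup is not shown in this thread) is that the icon's link target is the bare IP address without a scheme, which a browser resolves as a relative path. A minimal Python sketch of that resolution, where the page name wizard.php is hypothetical:

```python
from urllib.parse import urljoin

# Hypothetical page the final wizard step is served from (the path is an assumption).
base = "https://192.168.xx.222/wizard.php"

# A link target written without a scheme is resolved as a relative path.
broken = urljoin(base, "192.168.xx.222")

# A root-relative target resolves to the site root instead.
fixed = urljoin(base, "/")

print(broken)  # https://192.168.xx.222/192.168.xx.222
print(fixed)   # https://192.168.xx.222/
```

If this is what happens, writing the target as "/" or as a full https:// URL would give the expected link.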
On System->General Setup
[check] Allow DNS server list to be overridden by DHCP/PPP on WAN
[Save]

On Interfaces->Assign press [plus] multiple times to get all interfaces listed.
[Save] then [Apply Changes]

For each opt interface: Interfaces->OPTxx
[check] Enable interface
(leave type as "None")
[Save] then [Apply Changes]

Interfaces->Assign Bridges:
Add a bridge between LAN and the relevant OPTxx interfaces

On Interfaces->Assign press [plus] once again to get the BRIDGE interface listed.
Interfaces->OPTxx (for the bridge interface)
[check] Enable interface
Type: Static
IP: 192.168.xx.254/24
[Save] then [Apply Changes]

System->Advanced->System Tunables
net.link.bridge.pfil_member 0
net.link.bridge.pfil_bridge 1
[Apply Changes]

Firewall->Rules->BRIDGE
Protocol: any
Source: Bridge Subnet
[Save] then [Apply Changes]

Firewall->Rules->BRIDGE
Protocol: UDP
Source port range: 67-68
Destination port range: 67-68
[Save] then [Apply Changes]

Services -> DHCP Server - LAN
[] Enable DHCP server on LAN interface
[Save]

Interfaces -> LAN
Type: None
[Save] then [Apply Changes]

Now the GUI cannot be contacted (neither on .222 nor on .254)
2 I THINK THIS IS A BUG - at this time the GUI should be available on .254
Reboot
The GUI is now available at 192.168.xx.254
Services->DHCP server
The GUI shows the "Services: DHCP server" header and one tab: "BRIDGE" (in dark gray)
However, the first line says: "Enable DHCP server on LAN interface"
3 I THINK THIS IS A BUG - the tab is BRIDGE, so the interface should be BRIDGE
Pressing the "BRIDGE" tab makes it "light gray", and now we have
"Enable DHCP server on BRIDGE interface" as expected

Have client re-acquire IP by DHCP - works!
And resolving DNS from client works as well. Great.

Even now, if I go back to Services->DHCP server:
The browser shows the "Services: DHCP server" header and one tab: "BRIDGE" (in dark gray)
However, the first line says: "Enable DHCP server on LAN interface"
So the problem is still there…

Time for yet another reboot ...
... and have client re-acquire IP by DHCP.

I shift the client LAN cable to OPT1
And - lo and behold - it works :D

Now I install the "dns-server" package (TinyDNS)
Services->DNS forwarder
[] Enable DNS forwarder

Services->DNS Servers
Wizard...
Domain name: zz.lan
Primary Nameserver: ns.zz.lan
First A record: ns.zz.lan -> 192.168.xx.254

Status->Services shows
DNS Server (TinyDNS Server) running and dnsmasq DNS forwarder running
4 I THINK THIS IS A BUG - at this time, "DNS forwarder" aka dnsmasq has been disabled, so it should not run.
Going back to Services->DNS Forwarder shows the DNS forwarder still active.
Uncheck [] Enable DNS forwarder and [Save]

Only now does the Status->Services show no DNS forwarder.
On Services -> DNS Server - Settings:
[check]Enable DNS Forwarders
[Save]
GUI now shows IP Address = 127.0.0.1
and [check]Enable DNS Forwarder

However, resolving from client does not work, and pfsense /etc/resolv.conf shows
domain xx.yy
nameserver
(I think /etc/resolv.conf should have contained the WAN DNS servers)

So, once again: Reboot pfSense
Still, resolving from pfsense or client does not work.
However pfsense /etc/resolv.conf now shows the two upstream WAN DNS servers.

Services->DNS Server - settings shows:
IP Address 127.0.0.1
Enable DNS Forwarders [check]

Serial console shows that dig @127.0.0.1 xx works; but dig xx does not.
Apart from the bugs reported above, I think we have a general problem:
Services (DNS Forwarder, TinyDNS, etc.) want to bind to the LAN address.
But in a bridged environment, this is not what we want
5 I THINK THIS IS A BUG - It should be possible to bind services to ANY assigned static address
I need a working box, so:
Services -> DNS Server
Enable DNS Forwarders []

Services -> DNS Forwarder
Enable DNS Forwarders [check]

Still DNS resolving does not work.
Reboot
Force the client to acquire a new IP by DHCP.
And now DNS resolving from the client works !!
6 I THINK THIS IS A BUG - If it works now, it should have worked before reboot
/Henrik
-
Committed fixes for the first and third bugs. As for the second, in my opinion there are some design flaws that need to be resolved that are related to how you initially create and configure bridges, currently requiring that you connect to the web gui on an interface that will not be a member of the bridge (or at least not initially a member), change IP addresses from the console after a certain point, or reboot after a certain point. I might look into it sometime.
-
I can confirm that the "Services: DHCP server" tabs now work correctly (third bug).
Thank you very much for the fix :D :D
(I did not check the first bug).

Bugs 2, 4, and 6 have workarounds.
However, I have been looking a little more into the TinyDNS issues (5) and created two bug reports:
http://redmine.pfsense.org/issues/show/439
http://redmine.pfsense.org/issues/show/440
I have no workaround for those (although solution proposals are included in the bug reports)
-
http://redmine.pfsense.org/issues/show/440 has been updated with a patch.
Any help in testing and checking in this patch would be most appreciated.

See also http://redmine.pfsense.org/issues/show/442 which contains a patch for re-installation of TinyDNS
-
Should I create redmine bug reports for bugs 2, 4 and 6?
-
resolv.conf file generation should be ok on latest snaps.
-
I have been watching the git repo, but did not see the fix ???
Could you point me to where this was fixed?
-
Now the GUI cannot be contacted (neither on .222 nor on .254)
Did you give it time to become available?!
Usually it will take around 40-50 seconds for the gui to restart.

@kaparasoft
https://rcs.pfsense.org/projects/pfsense/repos/mainline/commits/1033de7481dacd83ee5a1a16078e89c7b4e9efd8
-
I need to do a complete reinstall, so I will try again, making sure I wait at least one minute for the GUI to come up.
I am not so sure the 1033de7481dacd83ee5a1a16078e89c7b4e9efd8 commit fixes the problem.
TinyDNS is reading and writing /etc/resolv.conf, not /var/etc/nameservers_*
But I will try it out!
-
Unfortunately, I can confirm that the fix did not work.
I did a clean install of 20100324-0246.
Even without TinyDNS, resolving is borked, as /etc/resolv.conf does not contain any nameservers!
(I will continue to see if I can pinpoint the problem…)
-
Can you please show me the output of ls /var/etc/nameserver_*
Do they have any IP addresses in them?
-
Found the problem!
It is in system.inc function get_nameservers. You have:
$master_list[] = $item;
but it should be:
$master_list[] = $dns;
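To illustrate why that one-variable mix-up leaves the list without usable servers, here is a rough Python analogue. The real structure of get_nameservers in system.inc is not shown in this thread, so the loop layout, the `buggy` switch and the sample addresses are all assumptions made for the sketch:

```python
def get_nameservers(files, buggy=False):
    """Collect unique nameserver entries from several files.

    `files` maps a file name to its text content (a hypothetical stand-in
    for the per-interface nameserver files).
    """
    master_list = []
    item = ""  # stand-in for an unrelated variable such as $item
    for content in files.values():
        for dns in content.splitlines():
            dns = dns.strip()
            if not dns:
                continue
            # The reported bug: the wrong variable is appended instead of dns.
            value = item if buggy else dns
            if value not in master_list:
                master_list.append(value)
    return master_list


# Sample data only; 192.0.2.x are documentation addresses, not the real servers.
sample = {"nameserver_wan": "192.0.2.1\n192.0.2.2\n"}
print(get_nameservers(sample))              # ['192.0.2.1', '192.0.2.2']
print(get_nameservers(sample, buggy=True))  # ['']
```

In the buggy variant the addresses are read correctly but never make it into the returned list.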
-
Regarding bug #2:
@ermal:
Now the GUI cannot be contacted (neither on .222 nor on .254)
Did you give it time to become available?!
Usually it will take around 40-50 seconds for the gui to restart.

I can confirm (20100324-0246 snapshot) that after 10 minutes, neither GUI nor SSH can be contacted on .222 or .254.
After reboot GUI and SSH are available on .254
-
An ifconfig and netstat -rn would be useful
-
netstat -f inet -l
shows nothing listening on any ports.

I will try the "netstat -rn" next time I am rebuilding the box.
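The same question can also be checked from the client side. A minimal Python sketch, assuming the GUI answers on the default HTTPS port 443 and SSH on the default port 22 (neither port is stated in this thread); the helper name port_open is made up for the example:

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False
```

Calling port_open with the .222 and .254 addresses and ports 443 and 22 separates "nothing is listening" from "the address is not reachable at all" only roughly, since both cases return False; the routing table on the box is still needed to tell them apart.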
-
kaarposoft: netstat -rn is for showing the routing table, not the status of ports.
-
I know, but I can only do it when I reinstall the box!
-
I tried again: Installing 20100324-0246, upgrading to 20100324-2048 from console.
Following the steps of the original post.

After activating the bridge and setting LAN to "none", I get:
netstat -rn
Destination        Gateway            Flags    Refs      Use  Netif Expire
default            87.52.xx.1         UGS         0       15    vr0
87.52.xx.0/24      link#1             U           0      708    vr0
87.52.xx.120       link#1             UHS         0        0    lo0
127.0.0.1          link#12            UH          0     1127    lo0
127.0.0.2          127.0.0.1          UHS         0        0    lo0
192.168.yy.254     link#13            UHS         0        0    lo0
(plus a lot of IPv6; xx and yy are my sanitizing).
Link#1 is the LAN, #12 is lo0 and #13 is the bridge.

After reboot I get:
Destination        Gateway            Flags    Refs      Use  Netif Expire
default            87.52.xx.1         UGS         0       47    vr0
87.52.xx.0/24      link#1             U           0      137    vr0
87.52.xx.120       link#1             UHS         0        0    lo0
127.0.0.1          link#12            UH          0       19    lo0
127.0.0.2          127.0.0.1          UHS         0        0    lo0
192.168.yy.0/24    link#13            U           0        1    bridge
192.168.yy.254     link#13            UHS         0        0    lo0
But this time, I get no working GUI even after reboot :o
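To make the difference between the two tables easier to spot, they can be compared by destination. A small Python sketch with the rows copied from the output above (the helper name destinations is just for illustration):

```python
before = """\
default 87.52.xx.1 UGS 0 15 vr0
87.52.xx.0/24 link#1 U 0 708 vr0
87.52.xx.120 link#1 UHS 0 0 lo0
127.0.0.1 link#12 UH 0 1127 lo0
127.0.0.2 127.0.0.1 UHS 0 0 lo0
192.168.yy.254 link#13 UHS 0 0 lo0
"""

after = """\
default 87.52.xx.1 UGS 0 47 vr0
87.52.xx.0/24 link#1 U 0 137 vr0
87.52.xx.120 link#1 UHS 0 0 lo0
127.0.0.1 link#12 UH 0 19 lo0
127.0.0.2 127.0.0.1 UHS 0 0 lo0
192.168.yy.0/24 link#13 U 0 1 bridge
192.168.yy.254 link#13 UHS 0 0 lo0
"""


def destinations(table):
    """Return the first column (destination) of each non-empty row."""
    return [line.split()[0] for line in table.splitlines() if line.strip()]


# Destinations present after the reboot but absent before it.
missing = [d for d in destinations(after) if d not in destinations(before)]
print(missing)  # ['192.168.yy.0/24']
```

So the only route that differs is the 192.168.yy.0/24 subnet route on the bridge, which is absent before the reboot. Whether that alone explains the unreachable GUI is not established here, since the GUI was also unreachable after this reboot.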