How to block SLAAC on a VLAN.
-
I am in learn mode here so please be gentle!
I recently started a task to replace my unmanaged switches with managed ones. We chose tp-Link because we have been quite happy with their business-grade WiFi access points.
First I installed four new switches replacing the unmanaged ones. The new managed switches work fine, in default mode, meaning VLAN 1 everywhere. With that done I set out to use some of the new features by adding my first VLAN. Here is what I did:
-
added a new VLAN definition to pfSense (SG-5100) on the LAN interface (igb1); WAN is igb0. This new VLAN is meant for IOT devices and guests. They should be able to access each other and reach the internet with no inbound ports open. That's it, no access to LAN.
-
defined the VLAN on the switch and assigned igb1's port to the VLAN.
-
added the VLAN to a second switch, used by the WiFi access points, tagging the trunk port from the first switch as well as the ports used by the three WiFi access points. No changes were made to the other two switches. They don't need to access the new VLAN.
-
added a new SSID and assigned it to the VLAN. It's important to note that at this time, VLAN 1 is still on for all switch ports. This is required because the WiFi access points host an SSID that bridges to the LAN interface.
-
activated the VLAN interface, configured DHCP, and firewall rules. After a bit of fibrillation, IPv4 works fine. However, even though the VLAN's IPv6 configuration type is "None", IPv6 is causing an issue.
For some reason, VLAN clients are receiving the SLAAC broadcasts being sent out by the "LAN's" DHCPv6 server. Some devices don't seem to care. However, cell phones and tablets are "IPv6 first", and they are having trouble making DNS queries.
So the first question is, if the pfSense VLAN has no IPv6 configuration, and no IPv6 DHCP server, why are the clients seeing the SLAAC broadcast?
The second question is how to resolve the issue? Can I block the traffic in the firewall? So far I can't figure out how to do that. I am guessing that taking the VLAN ports out of VLAN 1 will stop the behavior, but that will have the effect of killing the most important SSID that we have.
Any thoughts you might share will be of great help.
-
-
@bigtfromaz said in How to block SLAAC on a VLAN.:
So the first question is, if the pfSense VLAN has no IPv6 configuration, and no IPv6 DHCP server, why are the clients seeing the SLAAC broadcast?
Because those TP-Link switches you're fond of are probably leaking VLAN 1... They don't let you remove VLAN 1 from ports on some of their switches... Which makes them pure junk..
Their AP is probably doing the same thing.. I really don't think the company understands how VLANs are supposed to work to be honest ;)
-
@johnpoz I can remove VLAN 1 for sure but if I do that the production SSID that is currently not tagged will fail. I eventually want to negate VLAN 1 entirely but have to figure out how to safely move LAN to a VLAN and preserve all my configuration. LAN has a lot going on.
In the meantime, is there a way to block those broadcasts with a rule on the VLAN?
-
Just at a loss as to what you're talking about... Why would you think it's ok to run more than 1 untagged VLAN on a port?
If your device is in VLAN X, then it should not be in VLAN 1 as well.. The port should only be set for untagged VLAN X, and PVID of VLAN X... VLAN 1 doesn't come into play at all.
Now if sending traffic to your AP, sure, VLAN 1 could be the untagged one, and the other VLANs you're going to put out on different SSIDs would be tagged.
So your AP is leaking vlan 1 to other vlans - that is a problem!! And you shouldn't have to block them, because you should never have multiple vlans on the same layer 2 like that.
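To make the untagged/tagged/PVID rules above concrete, here is a minimal Python sketch of how an 802.1Q switch is expected to flood a broadcast or multicast frame. It is only a toy model under stated assumptions: the port names (`router`, `ap`, `iot`, `desk`, `leaky`) and the VLAN IDs 1 and 20 are illustrative, and it does not describe how any particular TP-Link firmware actually behaves.

```python
from dataclasses import dataclass, field

@dataclass
class Port:
    pvid: int                                    # VLAN given to untagged ingress frames
    untagged: set = field(default_factory=set)   # VLANs sent out without a tag
    tagged: set = field(default_factory=set)     # VLANs sent out with an 802.1Q tag

def classify(port, tag):
    """Return the VLAN an ingress frame belongs to, or None if it is dropped."""
    vlan = port.pvid if tag is None else tag
    return vlan if vlan in (port.untagged | port.tagged) else None

def flood(ports, in_name, tag):
    """Return {port_name: egress_tag_or_None} for a broadcast/multicast frame."""
    vlan = classify(ports[in_name], tag)
    if vlan is None:
        return {}
    out = {}
    for name, port in ports.items():
        if name == in_name:
            continue                 # never send a frame back out its ingress port
        if vlan in port.untagged:
            out[name] = None         # leaves the port without a tag
        elif vlan in port.tagged:
            out[name] = vlan         # leaves the port with its 802.1Q tag
    return out

# Hypothetical layout: trunk ports carry VLAN 1 untagged and VLAN 20 tagged,
# access ports belong to exactly one VLAN.
ports = {
    "router": Port(pvid=1, untagged={1}, tagged={20}),
    "ap":     Port(pvid=1, untagged={1}, tagged={20}),
    "iot":    Port(pvid=20, untagged={20}),
    "desk":   Port(pvid=1, untagged={1}),
}

# An untagged LAN frame (e.g. a router advertisement) never reaches "iot".
lan_flood = flood(ports, "router", None)

# A port left untagged in both VLAN 1 and VLAN 20 does receive it.
leaky_ports = dict(ports, leaky=Port(pvid=20, untagged={1, 20}))
leaky_flood = flood(leaky_ports, "router", None)
```

In this model the LAN frame only reaches ports that are members of VLAN 1, so a VLAN 20 client seeing it means either a port is still an untagged member of both VLANs or a device is not honoring the isolation.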
-
@bigtfromaz said in How to block SLAAC on a VLAN.:
For some reason, VLAN clients are receiving the SLAAC broadcasts being sent out by the "LAN's" DHCPv6 server
Here's the reason:
We chose tp-Link
A "feature" of some TP-Link gear is they don't handle VLANs properly, with multicasts leaking between networks. Their access points also have that problem. I can't set up a guest SSID on my TL-WA901N because of this.
-
@johnpoz Being new to a lot of this I may be using the wrong nomenclature.
On the switch's configuration pages, VLAN 20 shows that the trunk ports and WiFi ports are selected as "Tagged". The configuration page for VLAN 1 shows all ports as untagged. If I understand, that means the WiFi access points will only see untagged LAN, or tagged VLAN 20 traffic. Does any of that make sense?
On the access points, the new IOT SSID is tagged. The original production SSID is as it was before the new switches. The prod VLAN is off, which I assume means it is emitting untagged packets. This stands to reason because the access points also support unmanaged switches and those may not be able to handle tagged packets at all.
All the switches came with a default VLAN 1 configuration defining all ports as untagged. I assume that means the switch treats all VLAN 1 packets the same as untagged packets. True?
Putting all this together, it would seem that LAN packets are untagged and arriving at the WiFi port still untagged. I don't have a clue on how to check that.
In any event, the SLAAC broadcast packets are obviously showing up on VLAN 20 devices and that's a problem because they are being emitted by LAN. If an SSID is on VLAN 20 then it should be ignoring everything else. So it could be what you call leaking in the WiFi access points. An alternative to that is somehow, somewhere one of the switches, or the pfSense router is adding VLAN 20 tags to broadcast traffic, which makes no sense at all. I am pretty sure it's the access point too.
I am going to call tp-link and pose this issue to them.
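On the "how to check that" question above, one option is to capture frames on the wired side (for example from a mirrored switch port) and look at the raw bytes. The sketch below is a minimal, hedged example of that check in Python: it reports whether a raw Ethernet frame carries an 802.1Q tag and whether it is an ICMPv6 router advertisement (type 134). It assumes the IPv6 packet has no extension headers, and `build_ra_frame` is a hypothetical helper that only fabricates a test frame; it is not a capture tool. Note that some network cards strip VLAN tags before the capture software sees them, and WiFi clients never see 802.1Q tags at all.

```python
import struct

ETHERTYPE_VLAN = 0x8100   # 802.1Q tag protocol identifier
ETHERTYPE_IPV6 = 0x86DD
ICMPV6 = 58               # IPv6 next-header value for ICMPv6
ROUTER_ADVERT = 134       # ICMPv6 type for router advertisements

def describe_frame(frame: bytes):
    """Return (vlan_id_or_None, is_router_advertisement) for a raw Ethernet frame."""
    if len(frame) < 14:
        raise ValueError("frame too short for an Ethernet header")
    ethertype = struct.unpack("!H", frame[12:14])[0]
    offset = 14
    vlan = None
    if ethertype == ETHERTYPE_VLAN:
        if len(frame) < 18:
            raise ValueError("frame too short for an 802.1Q tag")
        tci, ethertype = struct.unpack("!HH", frame[14:18])
        vlan = tci & 0x0FFF          # low 12 bits of the tag are the VLAN ID
        offset = 18
    is_ra = False
    # Assumption: no IPv6 extension headers, so ICMPv6 starts 40 bytes in.
    if ethertype == ETHERTYPE_IPV6 and len(frame) >= offset + 41:
        next_header = frame[offset + 6]
        icmp_type = frame[offset + 40]
        is_ra = next_header == ICMPV6 and icmp_type == ROUTER_ADVERT
    return vlan, is_ra

def build_ra_frame(vlan=None):
    """Hypothetical helper: fabricate a minimal RA-like frame for testing."""
    dst = bytes.fromhex("333300000001")      # all-nodes multicast MAC
    src = bytes.fromhex("021122334455")      # made-up source MAC
    tag = b"" if vlan is None else struct.pack("!HH", ETHERTYPE_VLAN, vlan)
    ipv6 = (bytes([0x60, 0, 0, 0]) + struct.pack("!H", 16)
            + bytes([ICMPV6, 255]) + bytes(16) + bytes(16))
    icmp = bytes([ROUTER_ADVERT, 0]) + bytes(14)
    return dst + src + tag + struct.pack("!H", ETHERTYPE_IPV6) + ipv6 + icmp

untagged_result = describe_frame(build_ra_frame())
tagged_result = describe_frame(build_ra_frame(vlan=20))
```

If a capture on the trunk to the access point shows the LAN's router advertisements only as untagged frames, never with a VLAN 20 tag, that would point at the access point rather than the switch or pfSense as the place where the traffic crosses over.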
-
@JKnott It's funny you should mention multicast. My switches have a feature called Multicast filtering. I will look into that.
-
For some reason, VLAN clients are receiving the SLAAC broadcasts being sent out by the "LAN's" DHCPv6 server.
This is exactly the problem I was experiencing with my AP. I was talking to TP-Link support about it a few years ago. First level thought it was normal, but 2nd accepted it was a flaw. However, there never was a fix for my AP.
-
@JKnott I will let you know what I find.
-
@bigtfromaz said in How to block SLAAC on a VLAN.:
If an SSID is on VLAN 20 then it should be ignoring everything else.
It shouldn't have to ignore anything... if you have broadcast/multicast traffic from VLAN 1 leaking to VLAN 20, that is a problem!! That is TP-Link's lack of understanding of what VLANs are, if you ask me.
I would never use their products where vlans are required because of personal experience with them and with horror stories here and then my own research into the issues with multiple products of theirs..
I had actually purchased one of their low end switches because I thought the users here posting the problems they were seeing were just not setting them up correctly... And hey it was 35 bucks.. Could use a switch on the shelf that could do VLANs for that price sure.. Come to find out it's just plain a POS!!!! And you could not remove VLAN 1 at all.. And from the forum threads over on TP-Link, they thought this was normal and a feature - WTF!!!
They finally fixed it, from what the firmware notes say, for v3 of their hardware... but they never released anything for v2... Which I just found out recently you can actually install the v3 firmware on the v2 hardware... And now the GUI says you're running v3 hardware.. I have not had time to actually test that it works, but the GUI now allows you to remove VLAN 1 from ports.. So a step in the right direction.
But from this, and from @JKnott's horror stories with their AP... I could never suggest anyone buy any of their products if they have any plans on doing anything with VLANs.
So I would suggest you validate you can remove VLAN 1 - but you should never see bleed over from VLAN X to VLAN Y, be it broadcast or multicast - if you are... Then something is WRONG, be it with the firmware of that device or the config.. But that should never happen.. VLANs are isolation at layer 2; if you're seeing broadcast and/or multicast then you have not isolated at layer 2.. And you might as well just run your multiple layer 3 networks all on some dumb switch and not even worry about tags, etc..
But it could come down to their equipment just being flawed, and you will have to buy equipment that actually isolates the VLANs like they are supposed to be isolated. The UniFi APs seem to handle this fine, and I have had zero issues with any sort of VLAN leakage on them that I have run into.
-
Ok. The night before last, when the problem was occurring, I stayed late. After things calmed down on the network, I took two phones, mine and a test phone and disconnected them from all networks.
I brought up the test phone on the SSID that links to the VLAN: no SLAAC broadcast received, IPv4 configured fine and all was well. I let about a minute pass. Still no SLAAC. That was a bit of a surprise.
Then I brought up my phone on the LAN. It received the correct configuration, IPv4 and an IPv6 configuration supplied by SLAAC.
Then I was surprised again. The moment my phone received its IPv6 configuration from LAN, the test phone received one on the VLAN. I was tired and left it that way for the night.
Now here is the really confusing part. When I returned yesterday morning everything was fine! The test phone had a link-local IPv6 configuration, no SLAAC. All other phones are fine as well. In other words, the problem disappeared and it hasn't returned. At this point, I can no longer recreate the issue. I restarted all switches and APs last night and everything is still fine today.
Sigh...Obviously something changed but I can guarantee you it wasn't anything I did while sleeping. Any thoughts? Perhaps there is a cache, or table somewhere that refreshed overnight?
I had opened a ticket with tp-link and they had requested config files and a network diagram. They want to recreate the issue and correct my configuration or debug. I am going to close that ticket and will give them the above observations as well. As an FYI the tp-link switch models in use are T1600G-18TS (1) and T1500G-10PS (3). VLAN 1 can definitely be disabled port by port on the switches. The access points are EAP225 (2) and EAP445 (1). I still think the leak is/was in the access points.
In any event, I have no trust in the way things are. The end goal has always been to totally negate VLAN 1 so the Netgate 5100-G router will never send out a VLAN 1 packet, nor will it send any untagged packets. Any links that might help in that regard will be greatly appreciated. The part I don't understand is how to safely move my LAN from igb1 to a VLAN without losing my LAN interface configuration.
-
Just remove those TP-Link devices from your network - I do not trust them at all to understand isolation of VLANs... When they will not let you remove VLAN 1 from a port.. But let you assign another untagged VLAN to the port - they do not understand how VLANs are supposed to work.
Don't buy their products is the only way to get them to understand it seems.