23.01RC - Suricata stops working after Wireguard installed
-
@mrsunfire said in 23.01RC - Suricata stops working after Wireguard installed:
@bmeeks The /30 did the trick! Changing the tunnel network to /30 subnet prior to upgrade and suricata + wireguard is running fine with 23.01-RC.
Thanks for the confirmation. This is a Radix Tree insertion problem within the Suricata binary and my custom blocking plugin. If a /30 works for you now, just use that and I will work on a fix for the Radix Tree problem. Unfortunately it will require a change in the binary portion of the Suricata package, so not as fast and easy as changing something in the PHP GUI code.
How do you change the tunnel network to a /30 subnet? Do you mean that you set this for the Peer(s) in the Wireguard UI?
I switched over to 23.01 now, and changing things there has no effect whatsoever. Suricata is monitoring the interface, not the peers, and the passlist file doesn't change until you make a change in the interface.
However, changing the Interface VPN (tun_wg0) to a /30 subnet is not allowed (This IPv4 address is the network address and cannot be used). Changing it to /32 is allowed but breaks the tunnel...
Hmmm, I see now that this is because I'm using n.n.n.0 for the gateway (n.n.n.1 at the other site). I guess if I changed that to .1 & .2 I would be able to use /30 at this end... big risk messing with the remote server though...
-
@gblenn Well just change the tunnel interface to /30. Remember that you can't use a 0 at the end of an IP address within this subnet.
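The rule above can be checked with a few lines of Python. This is only a sketch of the general subnet rule, not pfSense's actual validation code; the function name `check_interface_address` is made up for illustration. It treats /31 as a point-to-point network where both addresses are usable, which is why n.n.n.0 was accepted with /31 but rejected with /30.

```python
import ipaddress

def check_interface_address(cidr):
    """Classify an interface address such as '10.6.210.0/30'.

    Hypothetical helper: mirrors the general subnet rule, not pfSense's code.
    """
    iface = ipaddress.ip_interface(cidr)
    net = iface.network
    # /31 (point-to-point) and /32 have no reserved network/broadcast address.
    if net.prefixlen >= 31:
        return "ok"
    if iface.ip == net.network_address:
        return "network address"
    if iface.ip == net.broadcast_address:
        return "broadcast address"
    return "ok"

for candidate in ("10.6.210.0/31", "10.6.210.0/30", "10.6.210.1/30", "10.6.210.2/30"):
    print(candidate, "->", check_interface_address(candidate))
```

With a /30 only the two middle addresses (.1 and .2 here) pass the check, so both ends of the tunnel have to be renumbered.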
-
@mrsunfire said in 23.01RC - Suricata stops working after Wireguard installed:
@gblenn Well just change the tunnel interface to /30. Remember that you can't use a 0 at the end of an IP address within this subnet.
Yes that's what I realized so I need to make changes at the other site as well, which represents a bit of a risk...
-
@gblenn I feel that. Had the same. Turns out it was very handy. First change the remote alias (if you have one for the tunnel). Then change the remote wg interface. Then make the change on your site and you should be good to go. That's what I did.
-
@mrsunfire Of course it stopped working, since I forgot to change routing on the remote server (as I changed gateway IP at home).... Tailscale to the rescue, I'm back in business...
Next step is to try it on 23.01...
-
And that worked, I'm up and running with 23.01.r.20230202.0019 with my full config.
-
@gblenn said in 23.01RC - Suricata stops working after Wireguard installed:
And that worked, I'm up and running with 23.01.r.20230202.0019 with my full config.
Glad you are back up and running. I will still look into the /31 issue with the Radix Tree, but it may turn out to be something upstream has to modify. The use of /31 subnet masks is still a bit iffy with software apps today. Not everything is fully onboard with that one yet.
-
@bmeeks said in 23.01RC - Suricata stops working after Wireguard installed:
Glad you are back up and running. I will still look into the /31 issue with the Radix Tree, but it may turn out to be something upstream has to modify. The use of /31 subnet masks is still a bit iffy with software apps today. Not everything is fully onboard with that one yet.
Thanks, good that you keep investigating. I have no idea how common it is to use /31 in a site to site config, but I think I was following one of many guides out there...
-
@gblenn said in 23.01RC - Suricata stops working after Wireguard installed:
Thanks, good that you keep investigating. I have no idea how common it is to use /31 in a site to site config, but I think I was following one of many guides out there...
I cannot seem to reproduce your original failure. Suricata is starting for me every time without error when I put in what I think are your old tunnel networks. I want to reliably reproduce the failure so I can know if my fix actually works or not.
Can you share your Wireguard tunnel and peer addresses? And if you can, the other firewall interface IP settings as well so that I can duplicate everything in my test setup. If you don't want to post that information in the open forum, can you share them via a DM (Forum chat message)?
-
@bmeeks I can provide more detail later on when I'm back home but I think these are the key elements.
- The wireguard Interface for the home site has the IP 10.6.210.0 /31 (the peer site has 10.6.210.1 /31)
- The allowed IP's for the wireguard peer(s) use 10.6.210.0 /31 at both ends. But I do think it is the Interface that is the actual trigger here. This setting is only an internal thing for wireguard as far as I understand...
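For anyone applying the /30 workaround to a setup like this one, the renumbering can be worked out as below. This is a sketch only; `propose_slash30` is a hypothetical helper, and which site gets which address is an arbitrary choice.

```python
import ipaddress

def propose_slash30(old_cidr):
    """Given a /31 (or /32) tunnel address, return the enclosing /30
    and its two usable host addresses.

    Hypothetical helper for illustration; it raises ValueError if the
    input prefix is already /30 or shorter.
    """
    old = ipaddress.ip_interface(old_cidr)
    new_net = old.network.supernet(new_prefix=30)
    # Skip the network address (first) and the broadcast address (last).
    first, second = list(new_net)[1:3]
    return new_net, first, second

net, home_ip, peer_ip = propose_slash30("10.6.210.0/31")
print(f"tunnel network: {net}")
print(f"home interface: {home_ip}/30")
print(f"peer interface: {peer_ip}/30")
```

Remember that routes, gateways and any aliases pointing at the old addresses need the same update at both sites.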
-
@gblenn said in 23.01RC - Suricata stops working after Wireguard installed:
@bmeeks I can provide more detail later on when I'm back home but I think these are the key elements.
- The wireguard Interface for the home site has the IP 10.6.210.0 /31 (the peer site has 10.6.210.1 /31)
- The allowed IP's for the wireguard peer(s) use 10.6.210.0 /31 at both ends. But I do think it is the Interface that is the actual trigger here. This setting is only an internal thing for wireguard as far as I understand...
Thank you for the details. I will try these settings in my virtual machine. I want to be able to reliably reproduce the issue so that I can test and then have confidence in any fix I come up with.
One small difference is my VM is pfSense 2.7.0 CE instead of 23.01 Plus, but I can't imagine that would matter. Suricata is exactly the same on both platforms.
-
@bmeeks I am also using VM and did make a quick test using 2.7.0 and had the same issues.
I got back home now and as I have my two tunnels up and running using /30 subnet, I decided to make a test with a third tunnel. I basically only created the tunnel, with 10.10.200.0 /31 which includes setting up the interface. I did not bother going into routing or creating a peer for that tunnel...
At first it did not seem to create the crash but I noticed that my passlist did not update.
In the drop down in Suricata I have three options: default, none and passlist_18215. I have had the 18215 selected and it wasn't until I changed over to default that I got the crash this time... So the passlist_18215 did not update but the default did... perhaps something for you to look into.
-
@gblenn:
One last request -- can you post the actual contents of both the default Pass List file (from the interface subdirectory), and then also post the contents of the passlist_18215 file? You can obfuscate the public WAN IP if you wish. I want to see if any other IP address in there is perhaps colliding with the tunnel IPs.
Edit: never mind -- I just got the failure. The trick in my case to reproduce is to stop Suricata, have the Wireguard setup in place, then attempt to restart Suricata. I believe I know where and why the failure is happening. It is within the custom blocking plugin and its interaction with the Radix Tree API in the Suricata binary. I am pretty sure I can fix it, though.
Here is the suricata.log file showing the failure:
6/2/2023 -- 09:38:29 - <Notice> -- This is Suricata version 6.0.8 RELEASE running in SYSTEM mode
6/2/2023 -- 09:38:29 - <Info> -- CPUs/cores online: 4
6/2/2023 -- 09:38:29 - <Info> -- HTTP memcap: 67108864
6/2/2023 -- 09:38:29 - <Info> -- alert-pf -> Creating automatic firewall interface IP address Pass List.
6/2/2023 -- 09:38:29 - <Info> -- alert-pf -> adding firewall interface em0 IPv6 address fe80:0000:0000:0000:020c:29ff:fe38:9fdd to automatic interface IP Pass List.
6/2/2023 -- 09:38:29 - <Info> -- alert-pf -> adding firewall interface em0 IPv4 address 192.168.10.29 to automatic interface IP Pass List.
6/2/2023 -- 09:38:29 - <Info> -- alert-pf -> adding firewall interface em1 IPv6 address fe80:0000:0000:0000:020c:29ff:fe38:9fe7 to automatic interface IP Pass List.
6/2/2023 -- 09:38:29 - <Info> -- alert-pf -> adding firewall interface em1 IPv4 address 192.168.233.10 to automatic interface IP Pass List.
6/2/2023 -- 09:38:29 - <Info> -- alert-pf -> adding firewall interface vmx0 IPv6 address fe80:0000:0000:0000:020c:29ff:fe38:9ff1 to automatic interface IP Pass List.
6/2/2023 -- 09:38:29 - <Info> -- alert-pf -> adding firewall interface vmx0 IPv4 address 192.168.2.1 to automatic interface IP Pass List.
6/2/2023 -- 09:38:29 - <Info> -- alert-pf -> adding firewall interface lo0 IPv6 address 0000:0000:0000:0000:0000:0000:0000:0001 to automatic interface IP Pass List.
6/2/2023 -- 09:38:29 - <Info> -- alert-pf -> adding firewall interface lo0 IPv6 address fe80:0000:0000:0000:0000:0000:0000:0001 to automatic interface IP Pass List.
6/2/2023 -- 09:38:29 - <Info> -- alert-pf -> adding firewall interface lo0 IPv4 address 127.0.0.1 to automatic interface IP Pass List.
6/2/2023 -- 09:38:29 - <Info> -- alert-pf -> adding firewall interface tun_wg0 IPv4 address 10.6.210.0 to automatic interface IP Pass List.
6/2/2023 -- 09:38:29 - <Info> -- alert-pf output device (regular) initialized: block.log
6/2/2023 -- 09:38:29 - <Info> -- alert-pf -> Loading and parsing Pass List from: /usr/local/etc/suricata/suricata_22480_em1/passlist.
6/2/2023 -- 09:38:29 - <Error> -- [ERRCODE: SC_ERR_FATAL(171)] - prefix or user NULL
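If you want to check your own suricata.log for the same pattern, a small script along these lines may help. It is a sketch that assumes the log lines look exactly like the ones posted above; the regular expressions and the `summarize` function are illustrative, not part of the Suricata package.

```python
import re

# A few lines copied from the log above, used as sample input.
SAMPLE_LOG = """\
6/2/2023 -- 09:38:29 - <Info> -- alert-pf -> adding firewall interface lo0 IPv4 address 127.0.0.1 to automatic interface IP Pass List.
6/2/2023 -- 09:38:29 - <Info> -- alert-pf -> adding firewall interface tun_wg0 IPv4 address 10.6.210.0 to automatic interface IP Pass List.
6/2/2023 -- 09:38:29 - <Info> -- alert-pf -> Loading and parsing Pass List from: /usr/local/etc/suricata/suricata_22480_em1/passlist.
6/2/2023 -- 09:38:29 - <Error> -- [ERRCODE: SC_ERR_FATAL(171)] - prefix or user NULL
"""

ADDED = re.compile(
    r"adding firewall interface (\S+) (IPv[46]) address (\S+) "
    r"to automatic interface IP Pass List"
)
ERROR = re.compile(r"<Error> -- \[ERRCODE: (\w+)\((\d+)\)\] - (.*)$")

def summarize(log_text):
    """Return (interface additions, errors) found in the log text."""
    added, errors = [], []
    for line in log_text.splitlines():
        match = ADDED.search(line)
        if match:
            added.append((match.group(1), match.group(3)))
            continue
        match = ERROR.search(line)
        if match:
            errors.append((match.group(1), int(match.group(2)), match.group(3).strip()))
    return added, errors

added, errors = summarize(SAMPLE_LOG)
for ifname, address in added:
    print(f"auto pass list: {ifname} {address}")
for name, code, message in errors:
    print(f"fatal: {name}({code}) {message}")
```

A tunnel interface address listed just before the "Loading and parsing Pass List" line, followed directly by the fatal error, matches the failure shown here.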
The problem happens when the addition of the VPN address is attempted after the automatic pass list logic has already registered the IP for the Wireguard interface itself. Notice that it fails immediately upon attempting to load the Pass List. The WG Tunnel IP is the very first entry in my Pass List file.
The internal logic of the custom blocking plugin first grabs all of the firewall interface IP addresses and inserts those into the Radix Tree so no firewall interface IP will get blocked. Then the code reads in the Pass List file and adds those IP addresses and/or subnets to the Radix Tree. The collision happens when the "get configured VPNs" call into pfSense returns a Wireguard tunnel IP value that is the same as the IP that already exists for the actual Wireguard interface. When insertion of that IP is attempted with the /31 mask, the Radix Tree complains. It wants additional data supplied to help it differentiate between the two IP addresses. I suspect the /31 mask is tripping it up.
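To make the two-phase ordering concrete, here is a toy model in Python. It is emphatically not Suricata's Radix Tree API or the plugin's real code; `ToyPassTable` and `load` are hypothetical names, and the collision rule (two entries sharing the same network address) is an assumption chosen to mirror the description above.

```python
import ipaddress

class ToyPassTable:
    """Hypothetical stand-in for the plugin's lookup table."""

    def __init__(self):
        self._entries = {}  # ip_network -> label of where it came from

    def insert(self, cidr, source):
        """Insert an entry; return a collision tuple or None."""
        net = ipaddress.ip_network(cidr, strict=False)
        for existing, existing_source in self._entries.items():
            # Assumed rule: same starting address means the entries collide.
            if existing.network_address == net.network_address:
                return (str(net), source, str(existing), existing_source)
        self._entries[net] = source
        return None

def load(interface_ips, passlist_entries):
    """Phase 1: interface IPs. Phase 2: pass list file entries."""
    table = ToyPassTable()
    problems = []
    for ip in interface_ips:
        hit = table.insert(ip, "interface scan")
        if hit:
            problems.append(hit)
    for entry in passlist_entries:
        hit = table.insert(entry, "pass list file")
        if hit:
            problems.append(hit)
    return problems

# /31 tunnel: the interface IP equals the tunnel network address.
print(load(["192.168.2.1", "10.6.210.0"], ["10.6.210.0/31"]))
# /30 tunnel: the interface sits on .1, so the two entries differ.
print(load(["192.168.2.1", "10.6.210.1"], ["10.6.210.0/30"]))
```

In this simplified picture the /30 workaround helps because the interface address (.1) no longer coincides with the tunnel network address (.0). Whether that is the exact mechanism inside the real Radix Tree code is not established here.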
-
@bmeeks Yes restarting Suricata might be needed for this to happen.
But what about only one of the lists being updated when I added the interface? It's like the issue I had on the test environment earlier...
The default list contains all the "auto generated things". Picked from System > General (DNS), Interfaces (VLAN and VPN) which includes wireguard. AND this list got updated when I created a new interface.
The passlist_18215 includes all the things I added via Suricata Passlist (GUI), but it did not pick up the new interface. And when I had that list selected, I could not recreate the problem...
-
@gblenn said in 23.01RC - Suricata stops working after Wireguard installed:
@bmeeks Yes restarting Suricata might be needed for this to happen.
But what about only one of the lists being updated when I added the interface? It's like the issue I had on the test environment earlier...
The default list contains all the "auto generated things". Picked from System > General (DNS), Interfaces (VLAN and VPN) which includes wireguard. AND this list got updated when I created a new interface.
The passlist_18215 includes all the things I added via Suricata Passlist (GUI), but it did not pick up the new interface. And when I had that list selected, I could not recreate the problem...
Pass List files themselves only get written to when you click Save on the interface settings tab, or in a few other select instances. And what you are seeing in the suricata.log with regard to entries being added to the pass list logic comes from two distinct sources. One is not related to the physical pass list file at all. All the entries that begin with "alert-pf -> adding firewall interface ..." are from the custom blocking plugin as it starts up and asks the firewall to send it all the configured interface IP addresses. It then inserts those into the Radix Tree. After that is done, the Pass List file is read in from the subdirectory and processed. This is during startup of Suricata only.
Once Suricata has been started, the Pass List file is never read from again until the next time Suricata starts. But, during operation, the automatic interface pass list option works by subscribing to a kernel routing socket message API whereby it receives notification each time a firewall interface IP changes. That thread updates entries in the Radix Tree corresponding to firewall interface IPs only.
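The difference between the two sources can be pictured with a small simulation. This is a sketch of the behaviour described in this post, not the plugin's code; the class and method names are invented for the example.

```python
class PassListState:
    """Toy model: interface IPs are tracked live, the file is read once."""

    def __init__(self):
        self.interface_ips = {}  # interface name -> IP address string
        self.file_entries = []   # entries loaded from the Pass List file
        self.file_reads = 0      # how many times the file has been read

    def startup(self, interface_ips, read_passlist_file):
        # Step 1: ask the firewall for all configured interface IPs.
        self.interface_ips = dict(interface_ips)
        # Step 2: read the Pass List file, once, at startup only.
        self.file_entries = list(read_passlist_file())
        self.file_reads += 1

    def on_interface_ip_change(self, ifname, new_ip):
        # Runtime notification: only the in-memory interface entries change.
        # The Pass List file is neither re-read nor rewritten here.
        self.interface_ips[ifname] = new_ip

    def passed(self):
        return set(self.interface_ips.values()) | set(self.file_entries)

state = PassListState()
state.startup({"em0": "192.168.10.29"}, lambda: ["192.168.2.0/24"])
state.on_interface_ip_change("tun_wg2", "10.10.210.0")
print(sorted(state.passed()))
print("file reads:", state.file_reads)
```

So a new tunnel interface can show up in the log as added to the automatic interface list while the Pass List file on disk still has its old contents.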
So, it's hard for me to answer your question because I do not know the exact sequence of events you were doing when you saw the claimed behavior. It's possible you did not trigger the sequence needed for an actual write to the Pass List file in the interface subdirectory.
What is happening is that during startup the Wireguard interface IP is read by that kernel routing subscription thread and gets added to the Radix Tree. This querying of firewall interface IPs is done BEFORE the Pass List file is processed. In your case, when the Pass List file is finally processed, it tries to add the same WG tunnel interface IP as already exists from the previous firewall interface scan. That's where the error happens. With a larger subnet such as a /30, this is normally not a problem. But with the /31 mask, the Radix Tree kicks off into a different subroutine that wants/expects additional "qualifying" data to help it keep track of the two IP addresses (which are really the same).
-
@bmeeks I did restart Suricata, several times actually, since at first I couldn't reproduce the problem...
However, I don't add the tunnel IP's into the list myself. They are all added automatically as they should, since I have all tickboxes filled under the Auto-Generate IP addresses section (as per default).
But I will need to test this again, more thoroughly, to be sure.
-
@gblenn said in 23.01RC - Suricata stops working after Wireguard installed:
@bmeeks I did restart Suricata, several times actually, since at first I couldn't reproduce the problem...
However, I don't add the tunnel IP's into the list myself. They are all added automatically as they should, since I have all tickboxes filled under the Auto-Generate IP addresses section (as per default).
But I will need to test this again, more thoroughly, to be sure.
To be sure a custom Pass List is used, you must select it using the drop-down selector on the INTERFACE SETTINGS tab, then click Save on that tab. And finally, restart Suricata after saving. That sequence is necessary in order for the newly assigned Pass List to be recognized and the corresponding text file in the interface subdirectory to be built for Suricata to read when starting up. Once you perform this sequence once, it will stay in place until you later change the Pass List assignment again. At that point you must repeat the same steps.
-
@bmeeks Precisely, and in my case I had been using that same passlist since I first installed Suricata.... Really strange that the addition of an interface didn't show up there then...
-
@bmeeks So I just went ahead and did the following.
- Checked that custom passlist was selected from the drop-down in Suricata interface settings. It was, but to be sure Save, and restart of Suricata
- Created a test tunnel in Wireguard, Save - Apply.
- Assigned interface to tunnel with IP 10.10.210.0 /31, Save - Apply.
- Suricata.log shows:
  -> Received notification of IP address change on firewall interface tun_wg2.
  -> added address 10.10.210.0 to automatic firewall interface IP Pass List.
- Restarted Suricata, no issues...
- View List (custom passlist) does NOT show the newly added IP. Selecting and viewing default passlist does show the new IP...
- Change to default passlist in drop-down, Save - restart Suricata
  Suricata NOT restarting.
  <Error> -- [ERRCODE: SC_ERR_FATAL(171)] - prefix or user NULL
- Change back to custom passlist, Save - restart Suricata
  NOT starting, same error...
- Deleting tunnel interface
- Suricata restarting without issues...
-
@gblenn said in 23.01RC - Suricata stops working after Wireguard installed:
@bmeeks So I just went ahead and did the following.
- Checked that custom passlist was selected from the drop-down in Suricata interface settings. It was, but to be sure Save, and restart of Suricata
- Created a test tunnel in Wireguard, Save - Apply.
- Assigned interface to tunnel with IP 10.10.210.0 /31, Save - Apply.
- Suricata.log shows:
  -> Received notification of IP address change on firewall interface tun_wg2.
  -> added address 10.10.210.0 to automatic firewall interface IP Pass List.
- Restarted Suricata, no issues...
- View List (custom passlist) does NOT show the newly added IP. Selecting and viewing default passlist does show the new IP...
- Change to default passlist in drop-down, Save - restart Suricata
  Suricata NOT restarting.
  <Error> -- [ERRCODE: SC_ERR_FATAL(171)] - prefix or user NULL
- Change back to custom passlist, Save - restart Suricata
  NOT starting, same error...
- Deleting tunnel interface
- Suricata restarting without issues...
I will look into the Pass List display issue, but it is not the underlying culprit here.
I have a fix prepared and tested that works. It requires changing the custom binary plugin. Since the /30 subnet mask is a usable workaround for now, the Netgate team has asked to hold off modifying the Suricata binary until AFTER pfSense Plus 23.01 goes to full RELEASE. This is to minimize changes during the final RC testing phase. After 23.01 goes RELEASE, I will submit the binary fix for review and merge. It will first go into 2.7.0 CE DEVEL, and if things look good there, it will be migrated to the production 23.01 branch. We will hold open Redmine Issue #13920 until then.