Netgate Discussion Forum

    "Tailscale is not online" problem

    Scheduled Pinned Locked Moved Tailscale
    55 Posts 14 Posters 17.4k Views 16 Watching
    • chudakC Offline
      chudak @mcury
      last edited by

      @mcury said in "Tailscale is not online" problem:

      @chudak did you create the key in the Tailscale console and then import it into pfSense?
      Also, set the key to not expire.

      I did set my key to not expire.
      I used the original key with pfSense and did not regenerate it,
      and I have seen it working normally after this error,
      so I suspect it’s unrelated.

      I’m hesitant to mess with the keys now, as it was working literally two days ago.

      M 1 Reply Last reply Reply Quote 0
      • M Offline
        mcury Rebel Alliance @chudak
        last edited by

        @chudak said in "Tailscale is not online" problem:

        @mcury said in "Tailscale is not online" problem:

        @chudak did you create the key in the Tailscale console and then import it into pfSense?
        Also, set the key to not expire.

        I did set my key to not expire.
        I used the original key with pfSense and did not regenerate it,
        and I have seen it working normally after this error,
        so I suspect it’s unrelated.

        I’m hesitant to mess with the keys now, as it was working literally two days ago.

        Ok, if the problem happens again, try to create a key, import it, and then set it to "don't expire".

        dead on arrival, nowhere to be found.

        M 1 Reply Last reply Reply Quote 0
        • M Offline
          mcury Rebel Alliance @mcury
          last edited by mcury

          I want to improve the script above to make it "force" direct connections.

          Another issue with this script is that it pings only once, and if that ping fails, it stops and then starts the service.

          I think it would be much better if the script pinged 10 times and restarted the service only if all 10 pings failed.
          This would make the script more reliable and, at the same time, encourage connections to leave the relay and connect directly.

          But I'm failing to do so. Any ideas for improving the code with the insights above in mind?

          Edit:

          I think I got it.

          1- It pings "headquarters" 10 times using tailscale.
          This helps connections through tailscale prefer "direct" instead of relay.
          2- If at least one of the tailscale pings works, it does nothing.
          This avoids bringing the service down every time.
          3- If all pings fail, it restarts the tailscale service.

          #!/bin/sh
          
          # Peer to probe over Tailscale
          DEST="headquarters"
          SUCCESS=0
          COUNT=0
          
          # Send 10 single tailscale pings and count how many succeed
          while [ "$COUNT" -lt 10 ]
          do
                  COUNT=$((COUNT + 1))
                  if tailscale ping --c 1 --timeout 1s "$DEST" >/dev/null 2>&1
                  then
                          SUCCESS=$((SUCCESS + 1))
                  fi
          done
          
          # At least one reply: leave the service alone
          if [ "$SUCCESS" -ge 1 ]
          then
                  exit 0
          fi
          
          # All pings failed: restart the tailscale service
          /usr/local/sbin/pfSsh.php playback svc stop tailscale
          sleep 5
          /usr/local/sbin/pfSsh.php playback svc start tailscale
          sleep 5
          exit 1
          

          One important observation: if there are more peers in the tailscale network, you can and should add them to this script.
          If you ping only one host and that host goes down, the script will take the entire tailscale service down, affecting the other hosts.

          Code for multiple hosts

          #!/bin/sh
          
          # Peers to probe over Tailscale (space-separated)
          HOSTS="server-1 server-2 server-3"
          SUCCESS=0
          COUNT=0
          
          # 10 rounds; each round sends one tailscale ping to every peer
          while [ "$COUNT" -lt 10 ]
          do
                  COUNT=$((COUNT + 1))
                  for HOST in $HOSTS
                  do
                          if tailscale ping --c 1 --timeout 1s "$HOST" >/dev/null 2>&1
                          then
                                  SUCCESS=$((SUCCESS + 1))
                          fi
                  done
          done
          
          # Any reply from any peer: leave the service alone
          if [ "$SUCCESS" -ge 1 ]
          then
                  exit 0
          fi
          
          # No peer answered in any round: restart the tailscale service
          /usr/local/sbin/pfSsh.php playback svc stop tailscale
          sleep 5
          /usr/local/sbin/pfSsh.php playback svc start tailscale
          sleep 5
          exit 1
          

          The code above increments the SUCCESS variable for every reply, and if any of the hosts answers, the tailscale service is considered UP and no action is taken.

          dead on arrival, nowhere to be found.

          1 Reply Last reply Reply Quote 0
          • Y Offline
            yobyot
            last edited by

            I'm late to the party but unless I misunderstand this thread it's not about Tailscale not starting up but instead about the auth key expiring.

            Auth keys are good for a maximum of 90 days. If you reboot pfSense on day 91, Tailscale will not come up and the "API" error will be generated (it's actually an auth key expired error).

            Thus, unless you never reboot pfSense, starting with the 91st day, you must re-generate an auth key and input it to Tailscale even if you have key expiry disabled.

            What makes this worse, IMHO, is that the longer you go between reboots, the more obscure the problem. So, Tailscale is not a reliable service because it cannot survive a reboot after 90 days.

            This occurs on both CE 2.7.2 in a Protectli Vault Proxmox VM and on a real SG-1100 running Plus 24.11 (packages as distributed with those releases).
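
The expiry arithmetic described above can be sketched in a few lines of sh. This is only an illustration: the function name key_days_left is made up, the 90-day default is taken from the post above, and timestamps are passed in as epoch seconds so that no particular date(1) syntax is assumed.

```shell
#!/bin/sh
# Sketch: days left before an auth key expires.
# Usage: key_days_left CREATED_EPOCH NOW_EPOCH [LIFETIME_DAYS]
# LIFETIME_DAYS defaults to 90 (assumption taken from the post above).
key_days_left() {
    created=$1
    now=$2
    lifetime=${3:-90}
    expires=$((created + lifetime * 86400))
    # Integer division: only whole days are reported
    echo $(( (expires - now) / 86400 ))
}

# Example: a key created 91 days before "now" is one day past expiry
key_days_left 0 $((91 * 86400))   # prints -1
```

A negative result means a reboot at that point would try to reuse an auth key that has already expired.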

            chudakC J 2 Replies Last reply Reply Quote 0
            • chudakC Offline
              chudak @yobyot
              last edited by

              @yobyot said in "Tailscale is not online" problem:

              I'm late to the party but unless I misunderstand this thread it's not about Tailscale not starting up but instead about the auth key expiring.

              Auth keys are good for a maximum of 90 days. If you reboot pfSense on day 91, Tailscale will not come up and the "API" error will be generated (it's actually an auth key expired error).

              Thus, unless you never reboot pfSense, starting with the 91st day, you must re-generate an auth key and input it to Tailscale even if you have key expiry disabled.

              What makes this worse, IMHO, is that the longer you go between reboots, the more obscure the problem. So, Tailscale is not a reliable service because it cannot survive a reboot after 90 days.

              This occurs on both CE 2.7.2 in a Protectli Vault Proxmox VM and on a real SG-1100 running Plus 24.11 (packages as distributed with those releases).

              I wish TS would introduce a new feature "a la Acme update" so that this would be done automatically.

              Y 1 Reply Last reply Reply Quote 0
              • J Offline
                Jim Coogan @yobyot
                last edited by

                @yobyot I think maybe it is the node key expiring at 180 days?

                fwiw I have discovered that running in shell

                tailscale down
                tailscale up --force-reauth
                

                will give you a URL that you can paste into a browser. It re-authenticates and gets pfSense back online as the same machine, and the status shows this in the pfSense Tailscale UI. This re-auths the node key.

                The node key shouldn't expire when you set it not to in the Tailscale admin console, but I just caught this note on https://tailscale.com/kb/1028/key-expiry
                "A change to the Key Expiry value applies to any devices that are logged in after you make the change. The key expiration for any devices that are already logged in remains unchanged, until the next time the device is logged in."

                So maybe when we set up pfSense Tailscale and then subsequently disable node key expiry, it doesn't take effect until a re-auth, which maybe doesn't happen until --force-reauth, and the problem doesn't become apparent until after 180 days?

                However, what I don't understand, and what undermines my theory somewhat, is that after the re-auth I at first noticed that restarting Tailscale in the pfSense UI caused me to be logged out again with the error "You are logged out. The last login error was: invalid key: API key does not exist"

                I manually updated Tailscale to 1.84.2 (see https://forum.netgate.com/topic/174525/how-to-update-to-the-latest-tailscale-version/155 but basically run pkg add -f https://pkg.freebsd.org/FreeBSD:15:amd64/latest/All/tailscale-1.84.2.pkg), then did tailscale down and tailscale up --force-reauth, and this time it made me re-sign the node (I have tailnet lock on) after authenticating it. Now restarting the service in the UI works.

                Not sure yet what is going on or what role the new Tailscale pkg played. One thing I suspect may also be a factor: on pfSense, the state file holding the node key is /usr/local/pkg/tailscale/state/tailscaled.state instead of the standard /var/db/tailscale/tailscaled.state, while /usr/local/etc/rc.d/tailscaled uses /var/db/tailscale/tailscaled.state as its state location. So maybe sometimes, somehow, Tailscale looks for state there and it doesn't exist.

                But this wouldn't really explain why everything is fine for a while initially (usually 90 to 180 days; I'm not exactly sure in my case). It may explain why it logs out on reboot if you use a RAM disk, though.
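
To see which of the two state files mentioned above actually exists on a given box, a small check along these lines might help. It is a sketch only: the function name list_state_files is made up, and the two default paths are simply the ones named in the post, not verified against any particular package version.

```shell
#!/bin/sh
# Sketch: report which candidate tailscaled state files exist.
# With no arguments it checks the two paths named in the post above.
list_state_files() {
    if [ "$#" -eq 0 ]; then
        set -- /usr/local/pkg/tailscale/state/tailscaled.state \
               /var/db/tailscale/tailscaled.state
    fi
    for f in "$@"; do
        if [ -f "$f" ]; then
            echo "present: $f"
        else
            echo "missing: $f"
        fi
    done
}
```

Running list_state_files before and after a reboot would show whether the state file disappears, which is what the RAM disk theory predicts.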

                M 1 Reply Last reply Reply Quote 2
                • Y Offline
                  yobyot @chudak
                  last edited by

                  @chudak Yup. That's exactly what we need for this to be reliable.

                  1 Reply Last reply Reply Quote 0
                  • A Offline
                    AlphaDog45
                    last edited by

                    I am also affected by this, and it happened upon a reboot within the 180-day key expiry. Running tailscale down and tailscale up --force-reauth, followed by pasting the URL into a browser, fixes it, but is far from ideal.

                    1 Reply Last reply Reply Quote 0
                    • M Offline
                      manupfdude
                      last edited by manupfdude

                      Hi all,

                      I wanted to share that I’m also experiencing issues with Tailscale (v1.86.4) on the latest pfSense CE (v2.8.0).
                      After either a pfSense reboot or a Tailscale service restart, my node is logged out of Tailscale — exactly as many of you have described here.

                      It bothered me enough that I opened a support ticket with the Tailscale team. I explained the problem, included details, and linked this thread as a reference. The good news is that support responded quickly and said they’re setting up a pfSense VM to reproduce the issue.

                      In the meantime, they suggested a workaround:

                      “Based on the behavior described here and in the thread, it looks like the auth key used for authentication may be expiring. Since OAuth clients don’t expire, switching to an OAuth Client rather than an Auth Key might resolve this.
                      See our docs here:
                      https://tailscale.com/kb/1215/oauth-clients#registering-new-nodes-using-oauth-credentials

                      I tested this, and after configuring Tailscale with OAuth, everything worked!
                      Tailscale now stays connected and authenticated even after a reboot or service restart.

                      That said, I did tell support that while OAuth solves the issue, it used to work out of the box with the simpler interactive login / auth key flow. Something seems to have changed on the Tailscale side, and I hope they identify and fix it. But until then, the OAuth workaround is a solid fix.

                      I’d love to hear how it’s working for the rest of you.
                      Anyone else try OAuth yet?

                      Cheers

                      Y 1 Reply Last reply Reply Quote 0
                      • Y Offline
                        yobyot @manupfdude
                        last edited by

                        @manupfdude This is a very inventive approach!

                        I'd like to try it but have a couple of questions about pfSense implementation.

                        Where did you enter the key and secret? And what scopes did you grant the Tailscale OAuth client?

                        M 1 Reply Last reply Reply Quote 0
                        • M Offline
                          manupfdude @yobyot
                          last edited by manupfdude

                          @yobyot
                          I SSHed into pfSense
                          and, for the sake of testing, simply ran the command:

                          tailscale up --auth-key=tskey-client-kQ_THE_REST_IS_A_SECRET\?preauthorized=true\&ephemeral=false --accept-dns=false --accept-routes --advertise-exit-node --advertise-routes=X.X.X.X/24 --advertise-tags=tag:pfsense
                          

                          Note the preauthorized=true and ephemeral=false.
                          I gave this key all permissions (temporarily, as I just wanted to verify it's working).
                          Of course, I also had to register the tag used in the ACL tags pane:
                          https://login.tailscale.com/admin/acls/visual/tags

                          so far so good
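
The backslashes in the command above are there to stop the shell from interpreting ? and &. A possible alternative, sketched here under the assumption that the command runs from sh, is to build the value inside quotes. The helper name oauth_auth_key and the placeholder secret are made up for illustration.

```shell
#!/bin/sh
# Sketch: build the --auth-key value for an OAuth client secret.
# Inside quotes, ? and & are literal, so no backslashes are needed.
oauth_auth_key() {
    printf '%s?preauthorized=true&ephemeral=false\n' "$1"
}

# Placeholder secret, not a real credential
oauth_auth_key "tskey-client-EXAMPLE"
# prints tskey-client-EXAMPLE?preauthorized=true&ephemeral=false

# Intended use (not run here; SECRET is a hypothetical variable):
#   tailscale up --auth-key="$(oauth_auth_key "$SECRET")" --advertise-tags=tag:pfsense
```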

                          L L 2 Replies Last reply Reply Quote 0
                          • L Offline
                            lbm_ @manupfdude
                            last edited by

                            @manupfdude Interesting. Thanks for the detailed answer. I'm also facing the same issue, unfortunately, where I need to auth again after a reboot.

                            With this approach, are you having issues with the advertised routes not working? I have to run the following manually after I log in, even though the routes are set in the UI (I haven't tried your approach yet to see if it fixes the issue):

                             tailscale set --advertise-routes=192.168.1.0/24,192.168.2.0/24
                            
                            1 Reply Last reply Reply Quote 0
                            • Y Offline
                              yobyot
                              last edited by

                              Well,

                              I spent some time tonight playing around with this and I think I have it.

                              Some suggestions for others:

                              1. Generate the OAuth client in the Tailscale admin before anything else.

                              2. Make sure to create the tag you'll need. One per pfSense instance (and clearly, one OAuth client per pfSense instance).

                              3. Give the OAuth client the permissions you think appropriate.

                              4. Very Important: make sure that you can generate an API key with the OAuth creds. The OAuth creds are, apparently, used by the CLI to generate an API key. The latter is what does the trick in tailscale up.
                                Do this from the pfSense console:
                                curl -d "client_id=kY5Mv4h8kQ11CNTRL" -d "client_secret=tskey-client-kY5[invalidchars]CNTRL-ZXo2FfBbb[moreinvalidchars]GVT" "https://api.tailscale.com/api/v2/oauth/token"
                                If you don't get back something like this, you'll never be able to get it to work:
                                {"access_token":"tskey-api-kM[lotsofinvalidchars]NTRL-[stillmoreinvalidchars]9YevL","token_type":"Bearer","expires_in":3600,"scope":"all"}

                              5. Here's what worked for me if the above returned an API token:
                                /usr/local/bin/tailscale up --auth-key=tskey-client-[greekedout]GVT\?ephemeral=false\&preauthorized=true --accept-dns=false --accept-routes --advertise-exit-node --advertise-routes=192.168.211.0/24 --advertise-tags=tag:[yourtaghere]

                              6. Make sure you have the cron package installed. Then add a @reboot entry using the full path (see above). I also added a cron entry that runs every six hours; if Tailscale is already up, this command does not interrupt or reset any sessions.

                              I've left some bytes of the creds in these examples to make it clearer where your full creds should go. The tailscale up command requires the escape symbol (\) before the ? and & in the parameters that will be passed to the control plane.
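
The check in step 4 can be scripted rather than eyeballed. The following is a rough sketch: the function name has_access_token is made up, and it assumes the response looks like the sample in step 4, with an access_token value starting with tskey-api-.

```shell
#!/bin/sh
# Sketch: succeed if a token-endpoint response on stdin carries an access token.
# Assumes the token value starts with "tskey-api-", as in the sample in step 4.
has_access_token() {
    grep -q '"access_token"[[:space:]]*:[[:space:]]*"tskey-api-'
}

# Intended use (not run here; ID and SECRET are hypothetical variables):
#   curl -s -d "client_id=$ID" -d "client_secret=$SECRET" \
#       "https://api.tailscale.com/api/v2/oauth/token" | has_access_token \
#       && echo "OAuth creds OK"
```

If the check fails, there is little point in moving on to step 5 until the client ID, secret, or DNS resolution on the router is sorted out.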

                              FWIW, I lost an hour or more because I had (God only knows why) set Tailscale on one pfSense instance to accept DNS. Do this and the router cannot resolve the control plane API endpoint. Dumb. And I own it.

                              I don't know if this "fixes" everything. But it's a lot of work, and it shouldn't be necessary. To be useful, this package needs to survive a reboot without the need to go to these lengths.

                              1 Reply Last reply Reply Quote 0
                              • L Offline
                                left4apple @manupfdude
                                last edited by

                                @manupfdude Thanks for the info. If I understand correctly, the pfSense Tailscale package ensures tailscaled is running all the time, even after a reboot, yet Tailscale actually uses your command to authenticate itself, right? The key entered in the pfSense Tailscale UI has already expired (or you can enter a fake key there).

                                M 1 Reply Last reply Reply Quote 0
                                • M Offline
                                  manupfdude @left4apple
                                  last edited by

                                  @left4apple Basically, yes.
                                  The key in the UI is no longer relevant (once you go the OAuth route).
                                  I'm on version 1.88.3 of Tailscale, and after several reboots (due to power outages...) TS is still connected and authenticated :-)

                                  1 Reply Last reply Reply Quote 0
                                  • K Offline
                                    KStarRunner
                                    last edited by

                                    This seems like a bug to me. On every other platform where TailScale is set up, the tskey-auth is used only once and then immediately discarded. Clients don't keep the original tskey-auth, but rather a separate credential that was bootstrapped by the tskey-auth.

                                    So why is pfSense storing this key? IMHO, the GUI should accept the key, and once bootstrapped, delete it. The persistent presence of the tskey-auth makes me think pfSense keeps attempting to re-use it each time the service restarts.

                                    The correct behavior would seemingly be to accept the tskey-auth in the GUI, and then once the bootstrap is completed, delete it from the stored configuration. Should a new value be entered into the GUI, presume that means a new bootstrap is needed and start fresh. But as long as that box remains blank, keep on using the existing credentials.

                                    Y 1 Reply Last reply Reply Quote 0
                                    • Y Offline
                                      yobyot @KStarRunner
                                      last edited by

                                      @KStarRunner Clearly, it’s a bug.

                                      K 1 Reply Last reply Reply Quote 0
                                      • K Offline
                                        KStarRunner @yobyot
                                        last edited by

                                        @yobyot Sorry, meant to say the bug appears to be in the way pfsense manages the TailScale client, vs a bug in the TailScale client itself.

                                        1 Reply Last reply Reply Quote 0
                                        • M Offline
                                          marcg @Jim Coogan
                                          last edited by marcg

                                          @Jim-Coogan said in "Tailscale is not online" problem:

                                          fwiw I have discovered that running in shell

                                          tailscale down
                                          tailscale up --force-reauth
                                          

                                          will give you a URL that you can paste into a browser. It re-authenticates and gets pfSense back online as the same machine, and the status shows this in the pfSense Tailscale UI. This re-auths the node key.

                                          @Jim-Coogan Looks like your fix is working for me with the packaged Tailscale 1.82.4 on 25.07. Thanks.

                                          Previously, if I rebooted pfSense or started/stopped tailscale, tailscale wouldn't reconnect even though key expiry was disabled. After following your steps with an expiry-disabled key, tailscale reconnects after restarts and reboots. Hoping that continues ... I'm sure I'll find out.

                                          1 Reply Last reply Reply Quote 0
                                          • V Online
                                            Vad-B
                                            last edited by Vad-B

                                            Faced the same "Tailscale is not online" issue today - 7 days after the initial setup (October 23rd).
                                            Resolved it by regenerating the key and authenticating again. I noticed the ongoing discussions about OAuth, but chose to dig deeper into this problem first.

                                            Here’s what I found:

                                            This appears to be a known issue affecting Tailscale across all FreeBSD-based systems, including vanilla FreeBSD, pfSense, and OPNsense.
                                            There is an open bug in the Tailscale repository with a solution for pfSense. I plan to test it today: https://github.com/tailscale/tailscale/issues/17047

                                            This is how it works (source):
                                            "Auth keys and node keys are two separate things. The auth key is used one to register the device and a node key is obtained. This node key is part of the tailscale state and authenticates this device. You can remove the expiration for this node key
                                            ...
                                            After the initial auth, you don't need the auth key anymore. The node key in the tailscale state (with expiration removed) will auth the node forever."

                                            But in reality, on boot or restart, the Tailscale package attempts to start Tailscale with "tailscale up --auth-key=..." (as defined in /usr/local/etc/rc.d/pfsense_tailscaled), which re-registers the node with the Tailscale network using the auth key (which is set to expire), overriding the valid existing node key that is configured not to expire.

                                            So to fix it, we should force tailscale up to run without the auth key, using one of the options from the GitHub comment above.
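
One way to express that idea, and not necessarily the fix proposed in the GitHub issue, is to pass the auth key only when the node actually reports that it needs a login. The sketch below assumes that tailscale status --json includes a BackendState field whose value is NeedsLogin when the node is logged out; the function name needs_login is made up.

```shell
#!/bin/sh
# Sketch: succeed if `tailscale status --json` output on stdin says the node
# is logged out. The field name and value are assumptions about that output.
needs_login() {
    grep -q '"BackendState"[[:space:]]*:[[:space:]]*"NeedsLogin"'
}

# Hypothetical start-up logic (not run here; KEY is a placeholder variable):
#   if tailscale status --json | needs_login; then
#       tailscale up --auth-key="$KEY" ...   # first registration only
#   else
#       tailscale up ...                     # same flags, without --auth-key
#   fi
```

With logic like this, the existing node key in the state file would keep being used across restarts, and the auth key would matter only for the initial registration.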


                                            I was still puzzled by this 7-day timeline: where does it come from? Following a link to the related OPNsense GitHub issue, I found this explanation:

                                            "Folks who create and use reusable auth keys won't see the issue until that key expires (default 90 days). Folks like me who created a one off key will see the issue after the first reboot after one week (default expiration for a one off key is a week.)"

                                            V 2 Replies Last reply Reply Quote 0
                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.