Netgate Discussion Forum

"Tailscale is not online" problem

Scheduled Pinned Locked Moved Tailscale
35 Posts 5 Posters 7.5k Views
  • C
    chudak @mathwilp1011
    last edited by Jul 29, 2024, 2:20 PM

    @mathwilp1011

    Today TS again shows "Tailscale is not online. Refresh or check the Tailscale status page."

    The script says tailscale has been started.

    But in fact TS is down :(

    WTH

    • M
      mcury @chudak
      last edited by Jul 29, 2024, 2:22 PM

      @chudak 4 months later; I think this time frame tells us something.
      The problem is probably with tailscale itself, or with the other node.

      dead on arrival, nowhere to be found.

      • C
        chudak @mcury
        last edited by Aug 8, 2024, 3:42 PM

        @mcury said in "Tailscale is not online" problem:

        @chudak 4 months later; I think this time frame tells us something.
        The problem is probably with tailscale itself, or with the other node.

        Frankly, there is nothing to update and I still did not get to the bottom of it.

        TS sometimes is up and running and very stable for long periods. And then it gets flaky and can't connect.

        I do run the "restart_tailscale" script from crontab, so I assume that is what gets it started again.

        But in general, I am puzzled ...

        Any clues?

        • M
          mcury @chudak
          last edited by mcury Aug 8, 2024, 3:56 PM Aug 8, 2024, 3:55 PM

          @chudak said in "Tailscale is not online" problem:

          But in general, I am puzzled ...

          Any clues?

          Are there any logs from when the problem starts?
          What happens when you try to ping the other node?
          What status does it show in the GUI?
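
A minimal sketch of one way to capture answers to these questions at the moment the problem occurs: it saves the output and exit code of a status command to a timestamped file. The names TS_CMD, OUT_DIR, and capture_snapshot are illustrative and do not come from this thread; the default command is the one that appears in the error message quoted later in the thread.

```shell
#!/bin/sh
# Sketch only: TS_CMD, OUT_DIR and capture_snapshot are hypothetical names.
# Both variables can be overridden from the environment.
TS_CMD="${TS_CMD:-/usr/local/bin/tailscale status}"
OUT_DIR="${OUT_DIR:-/tmp/ts-snapshots}"

# Run TS_CMD, store its output and exit code in a new file,
# and print the path of that file.
capture_snapshot() {
        mkdir -p "$OUT_DIR" || return 1
        snap="$OUT_DIR/snapshot-$(date +%Y%m%d-%H%M%S).txt"
        {
                echo "time: $(date)"
                echo "command: $TS_CMD"
                rc=0
                # Word splitting of $TS_CMD is intentional (command plus arguments).
                $TS_CMD 2>&1 || rc=$?
                echo "exit code: $rc"
        } > "$snap"
        echo "$snap"
}
```

Calling `capture_snapshot` from the restart script just before it restarts the service would leave one file per incident to compare afterwards.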

          I have been using that script with the following scenario, which works fine:

          I have a customer that runs multi-WAN at their headquarters.
          One of these links is behind CGNAT and the other is not.

          The branch office connects directly to the primary, non-CGNAT link (I have opened a port in the firewall for that connection).
          If a link failover happens at headquarters, it sometimes loses its connection to the TS network. That's when the script "fixes" the problem by forcing the headquarters firewall to restart the service, now over the CGNAT link, so it connects through the TS node rather than over a direct connection.

          The reverse is also true, I mean, when the primary link, which is not behind CGNAT, comes back online.

          • C
            chudak @mcury
            last edited by Aug 8, 2024, 4:03 PM

            @mcury

            I don't know exactly where to look :(

            At a high level, all I see is that the TS service is green (which is confusing, but that's covered in a different thread and unrelated) and the TS connection status is down.

            My use case is:

            pfS runs TS
            iPad runs TS
            iPhone runs TS
            Windows 11 VM runs TS

            So when pfS is down, the others actually work fine.
            I even noticed that routes get resolved.

            • M
              mcury @chudak
              last edited by Aug 8, 2024, 4:06 PM

              @chudak said in "Tailscale is not online" problem:

              TS connection status is down.

              Isn't the script handling that?
              The script tries to ping, and if the ping fails, it restarts the service.

              • C
                chudak @mcury
                last edited by Aug 8, 2024, 4:12 PM

                @mcury said in "Tailscale is not online" problem:

                @chudak said in "Tailscale is not online" problem:

                TS connection status is down.

                Isn't the script handling that?
                The script tries to ping, and if the ping fails, it restarts the service.

                That's an interesting part.
                Yesterday I found TS down

                I tried to start it manually, and switched Kea DHCP to ISC DHCP and back, removed /tmp/kea4-ctrl-socket.lock and could not make it start.

                Then today in the morning - everything is up and running normally

                ??!!

                • M
                  mcury @chudak
                  last edited by Aug 8, 2024, 4:15 PM

                  @chudak said in "Tailscale is not online" problem:

                  I tried to start it manually, and switched Kea DHCP to ISC DHCP and back, removed /tmp/kea4-ctrl-socket.lock and could not make it start.

                  Then today in the morning - everything is up and running normally

                  ??!!

                  I don't see how one could interfere with the other.

                  But, I'm still using ISC-DHCP for that customer.
                  Can't switch to KEA yet...

                  • C
                    chudak @mcury
                    last edited by Aug 8, 2024, 4:38 PM

                    @mcury said in "Tailscale is not online" problem:

                    @chudak said in "Tailscale is not online" problem:

                    I tried to start it manually, and switched Kea DHCP to ISC DHCP and back, removed /tmp/kea4-ctrl-socket.lock and could not make it start.

                    Then today in the morning - everything is up and running normally

                    ??!!

                    I don't see how one could interfere with the other.

                    But, I'm still using ISC-DHCP for that customer.
                    Can't switch to KEA yet...

                    Do you know by chance how to catch TS-related stop/start errors and warnings in the logs?

                    • M
                      mcury @chudak
                      last edited by Aug 8, 2024, 5:20 PM

                      @chudak said in "Tailscale is not online" problem:

                      Do you know by chance how to catch TS-related stop/start errors and warnings in the logs?

                      You can search the old system logs in the /var/log directory, if I'm not mistaken. It shouldn't be hard to find: if the problem started yesterday at 15:00, look at the entries around that time.
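
A rough sketch of such a search, under two assumptions that should be checked on the actual system: that the relevant messages contain the word "tailscale", and that rotated logs, if compressed, use bzip2. search_logs is a hypothetical helper name, not an existing pfSense tool.

```shell
#!/bin/sh
# Sketch only: search_logs is a hypothetical helper.
# Usage: search_logs DIRECTORY PATTERN
# Prints every matching line, prefixed with the name of the file it came from.
search_logs() {
        dir="$1"
        pattern="$2"
        for f in "$dir"/*.log "$dir"/*.log.*
        do
                # Skip glob patterns that matched nothing.
                [ -f "$f" ] || continue
                name="$(basename "$f")"
                case "$f" in
                        *.bz2) bzcat "$f" ;;   # assumed compression of rotated logs
                        *)     cat "$f" ;;
                esac | grep -i -- "$pattern" | sed "s|^|$name: |"
        done
        return 0
}
```

For example, `search_logs /var/log tailscale` lists candidate lines; the output can then be narrowed by eye or with another grep to the time the problem started.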

                      • C
                        chudak
                        last edited by Sep 9, 2024, 3:23 PM

                        The instability of TS in conjunction with pfS is really frustrating.
                        I’m travelling now and TS is simply down with the error:

                        Error executing command (/usr/local/bin/tailscale status)

                        Health check:

                        - not logged in, last login error=invalid key: API key does not exist

                        unexpected state: NoState

                        So far nothing I’ve done (rebooting, deleting a lock file) has helped.

                        Thx G.. I still have OpenVPN as a backup option.
                        I’m surprised no one else is complaining about this…
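
The error text above seems to point at a login or key problem rather than a connectivity problem, although that is not confirmed in this thread. A small sketch, assuming only the two strings shown in that output ("not logged in" and "NoState"), of how a script could tell the cases apart before deciding to restart the service. classify_status is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: classify_status is a hypothetical helper.
# $1 is text captured from the status command (stdout and stderr).
# The matched strings are taken from the error output quoted in this thread.
classify_status() {
        case "$1" in
                *"not logged in"*) echo "auth" ;;      # login or key problem
                *"NoState"*)       echo "nostate" ;;   # daemon reports no state
                *)                 echo "other" ;;
        esac
}
```

A possible use is `classify_status "$(/usr/local/bin/tailscale status 2>&1)"`, and logging the result, so that an "auth" case is not mistaken for a plain outage.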

                        • M
                          mcury @chudak
                          last edited by mcury Sep 9, 2024, 3:25 PM Sep 9, 2024, 3:25 PM

                          @chudak did you create the key in the tailscale console and then import it into pfSense?
                          Also, set the key to not expire.

                          • C
                            chudak @mcury
                            last edited by Sep 9, 2024, 3:35 PM

                            @mcury said in "Tailscale is not online" problem:

                            @chudak did you create the key in the tailscale console and then import it into pfSense?
                            Also, set the key to not expire.

                            I did set my key to not expire.
                            I have used the original key with pfS and did not regenerate it.
                            And I have seen it working normally after this error,
                            so I suspect it’s unrelated.

                            I’m hesitant to mess with the keys now, as it worked literally two days ago.

                            • M
                              mcury @chudak
                              last edited by Sep 9, 2024, 3:36 PM

                              @chudak said in "Tailscale is not online" problem:

                              @mcury said in "Tailscale is not online" problem:

                              @chudak did you create the key in the tailscale console and then import it into pfSense?
                              Also, set the key to not expire.

                              I did set my key to not expire.
                              I have used the original key with pfS and did not regenerate it.
                              And I have seen it working normally after this error,
                              so I suspect it’s unrelated.

                              I’m hesitant to mess with the keys now, as it worked literally two days ago.

                              Ok, if the problem happens again, try to create a key, import it, and then set it to "don't expire".

                              • M
                                mcury @mcury
                                last edited by mcury Jan 18, 2025, 2:51 PM Jan 16, 2025, 11:05 PM

                                I want to improve the script above to make it "force" direct connections.

                                Another issue with this script is that it pings only once, and if that ping fails, it stops and then starts the service.

                                I think it would be much better if the script pinged 10 times and restarted the service only if 10 out of 10 pings fail.
                                This would make the script more reliable and, at the same time, help connections leave the relay and connect directly.

                                But I'm failing to do so. Any ideas on how to improve the code with the points above in mind?

                                Edit:

                                I think I got it..

                                1- It pings "headquarters" 10 times using tailscale.
                                This helps connections through tailscale prefer "direct" over relay.
                                2- If at least one of the tailscale pings works, it does nothing.
                                This avoids bringing the service down every time.
                                3- If all pings fail, it restarts the tailscale service.

                                #!/bin/sh
                                # Restart tailscale only if 10 out of 10 tailscale pings to DEST fail.

                                DEST="headquarters"
                                SUCCESS=0
                                COUNT=0

                                while [ "$COUNT" -lt 10 ]
                                do
                                        COUNT=$((COUNT + 1))
                                        # One ping per iteration with a 1 second timeout; output is discarded.
                                        # ICMP alternative: ping -c 1 -t 100 "$DEST"
                                        if tailscale ping --c 1 --timeout 1s "$DEST" >/dev/null 2>&1
                                        then
                                                SUCCESS=$((SUCCESS + 1))
                                        fi
                                done

                                if [ "$SUCCESS" -ge 1 ]
                                then
                                        # At least one ping was answered: leave the service alone.
                                        exit 0
                                else
                                        # Every ping failed: stop and start the service.
                                        /usr/local/sbin/pfSsh.php playback svc stop tailscale
                                        sleep 5
                                        /usr/local/sbin/pfSsh.php playback svc start tailscale
                                        sleep 5
                                        exit 1
                                fi
                                

                                One important observation: if there are more peers in the tailscale network, you can and should add them to this script.
                                If you ping only one host and that host goes down, the script will take the entire tailscale service down, affecting the other hosts.

                                Code for multiple hosts

                                #!/bin/sh
                                # Restart tailscale only if none of the hosts answers in 10 rounds of pings.

                                # Space-separated list of tailscale peers to probe.
                                HOSTS="server-1 server-2 server-3"
                                SUCCESS=0
                                COUNT=0

                                while [ "$COUNT" -lt 10 ]
                                do
                                        COUNT=$((COUNT + 1))
                                        for DEST in $HOSTS
                                        do
                                                # One ping per host per round, 1 second timeout; output is discarded.
                                                # ICMP alternative: ping -c 1 -t 100 "$DEST"
                                                if tailscale ping --c 1 --timeout 1s "$DEST" >/dev/null 2>&1
                                                then
                                                        SUCCESS=$((SUCCESS + 1))
                                                fi
                                        done
                                done

                                if [ "$SUCCESS" -ge 1 ]
                                then
                                        # At least one host answered at least once: leave the service alone.
                                        exit 0
                                else
                                        # No host ever answered: stop and start the service.
                                        /usr/local/sbin/pfSsh.php playback svc stop tailscale
                                        sleep 5
                                        /usr/local/sbin/pfSsh.php playback svc start tailscale
                                        sleep 5
                                        exit 1
                                fi
                                

                                The code above sums the SUCCESS variable; if any of the hosts answers, the tailscale service is considered UP and no action is taken.
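
Since one stated goal is to get connections off the relay and onto a direct path, here is a small sketch for checking which path a ping reply actually used. It assumes the reply line from tailscale ping looks like "pong from NAME (IP) via DERP(region) in Nms" for relayed replies and "pong from NAME (IP) via IP:PORT in Nms" for direct ones; that format is an assumption and may differ between versions. pong_path is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: pong_path is a hypothetical helper.
# $1 is a single line of tailscale ping output.
# The "via DERP(" marker for relayed replies is an assumed output format.
pong_path() {
        case "$1" in
                *"pong from"*"via DERP("*) echo "relay" ;;
                *"pong from"*" via "*)     echo "direct" ;;
                *)                         echo "none" ;;
        esac
}
```

A possible use is `pong_path "$(tailscale ping --c 10 "$DEST" 2>&1 | tail -n 1)"`, which classifies the last reply of a run and shows whether the pings ended on a direct path.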

                                Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.