"Tailscale is not online" problem
-
Hi Guys,
For anyone interested: here is the script that I used that is working 100%.
The --timeout 2 is not a flag within the tailscale CLI commands.
SUBCOMMANDS for Tailscale
up Connect to Tailscale, logging in if needed
down Disconnect from Tailscale
set Change specified preferences
login Log in to a Tailscale account
logout Disconnect from Tailscale and expire current node key
switch Switches to a different Tailscale account
configure [ALPHA] Configure the host to enable more Tailscale features
netcheck Print an analysis of local network conditions
ip Show Tailscale IP addresses
status Show state of tailscaled and its connections
ping Ping a host at the Tailscale layer, see how it routed
nc Connect to a port on a host, connected to stdin/stdout
ssh SSH to a Tailscale machine
funnel Serve content and local servers on the internet
serve Serve content and local servers on your tailnet
version Print Tailscale version
web Run a web server for controlling Tailscale
file Send or receive files
bugreport Print a shareable identifier to help diagnose issues
cert Get TLS certs
lock Manage tailnet lock
licenses Get open source license information
exit-node
update [BETA] Update Tailscale to the latest/different version
whois Show the machine and user associated with a Tailscale IP (v4 or v6)
If anyone has comments, please leave them.
Note: you must make it executable with chmod +x. I just modified the above script to make it work for my use case. The Tailscale node keeps falling off (exit node unavailable), either after a reboot or after a few days of being online. I added an error-checking display message.
@cmcdonald, this is still occurring in the 24.03 BETA (latest revision) as you are aware.
============
Script:
#!/bin/sh
# Space-separated list of Tailscale peers to test against.
ALLDEST="tailscaleexternalNODE"
COUNT=1
while [ "$COUNT" -le 2 ]
do
    for DEST in $ALLDEST
    do
        # One Tailscale-layer ping; success means a peer is reachable.
        if tailscale ping --c 1 "$DEST" >/dev/null 2>&1
        then
            echo "Tailscale is up"
            exit 0
        fi
    done
    if [ "$COUNT" -le 1 ]
    then
        echo "Tailscale down"
        /usr/local/sbin/pfSsh.php playback svc stop tailscale
        sleep 2
        /usr/local/sbin/pfSsh.php playback svc start tailscale
        sleep 10
        echo "Tailscale is up"
        # Exit non-zero to signal that a restart was needed.
        exit 1
    fi
    COUNT=$(expr "$COUNT" + 1)
done
-
Today TS again shows:
Tailscale is not online. Refresh or check the Tailscale status page.
The script says
tailscale has been started.
But in fact TS is down :(
WTH
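One possible explanation for the mismatch: "tailscale has been started" probably only reports that the service was launched, and the posted script prints "Tailscale is up" without pinging again. Below is a minimal sketch, assuming the same ping and pfSsh.php commands as the posted script, that pings again after the restart and returns a distinct status. PING_CMD, RESTART_CMD, SETTLE_SECS and the function names are hypothetical, introduced only for this example.

```shell
#!/bin/sh
# Sketch: restart Tailscale only when no peer answers, then re-check.
# PING_CMD, RESTART_CMD and SETTLE_SECS are hypothetical overrides
# (handy for dry runs); they are not Tailscale or pfSense settings.
PING_CMD="${PING_CMD:-tailscale ping --c 1}"
RESTART_CMD="${RESTART_CMD:-restart_tailscale}"
SETTLE_SECS="${SETTLE_SECS:-10}"

# Same stop/start sequence as the script posted above.
restart_tailscale() {
    /usr/local/sbin/pfSsh.php playback svc stop tailscale
    sleep 2
    /usr/local/sbin/pfSsh.php playback svc start tailscale
}

# Return 0 if at least one peer given as argument answers a ping.
any_peer_up() {
    for dest in "$@"; do
        $PING_CMD "$dest" >/dev/null 2>&1 && return 0
    done
    return 1
}

# Return 0 = already up, 1 = recovered after restart, 2 = still down.
check_and_recover() {
    if any_peer_up "$@"; then
        echo "Tailscale is up"
        return 0
    fi
    echo "Tailscale down, restarting"
    $RESTART_CMD
    sleep "$SETTLE_SECS"
    if any_peer_up "$@"; then
        echo "Tailscale recovered after restart"
        return 1
    fi
    echo "Tailscale still down after restart"
    return 2
}
```

Calling `check_and_recover tailscaleexternalNODE` at the end of the file and running it from cron would then leave "still down after restart" in the output whenever the restart did not actually bring the connection back.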
-
@chudak 4 months later, I think this time frame tells us something.
The problem should be with Tailscale itself, or the other node.
-
@mcury said in "Tailscale is not online" problem:
@chudak 4 months later, I think this time frame tells us something.
The problem should be with Tailscale itself, or the other node.
Frankly, there is nothing to update and I still have not gotten to the bottom of it.
TS sometimes is up and running and very stable for long periods. And then it gets flaky and can't connect.
I do run "restart_tailscale" script in crontab, so I assume it makes it start.
But in general, I am puzzled ...
Any clues?
-
@chudak said in "Tailscale is not online" problem:
But in general, I am puzzled ...
Any clues?
Any logs when the problem starts ?
What happens when you try to ping the other node ?
What status does it show in the GUI?
I have been using that script in the following scenario, where it works fine:
I have a customer that runs multi-WAN at their headquarters.
One of these links is CGNAT and the other is not.
The branch office connects directly to the primary, non-CGNAT link (I have opened a port in the firewall for that connection).
If a link failover happens at the headquarters, it sometimes loses its connection to the TS network. That's when the script "fixes" the problem by forcing the headquarters firewall to restart the service, now over the CGNAT link, so it connects through the TS node rather than directly.
The reverse is also true when the primary, non-CGNAT link comes back online.
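The three questions above (logs, ping result, status) are easiest to answer if the state is captured at the moment of failure. Here is a minimal sketch that saves the output of `tailscale status`, `tailscale netcheck` and one `tailscale ping` to a timestamped file. TS_BIN, SNAP_DIR, the default directory and the function name are assumptions made for this example.

```shell
#!/bin/sh
# Sketch: save a timestamped snapshot of Tailscale diagnostics.
# TS_BIN, SNAP_DIR and snapshot_tailscale are hypothetical names;
# the default snapshot directory is only an example location.
TS_BIN="${TS_BIN:-/usr/local/bin/tailscale}"
SNAP_DIR="${SNAP_DIR:-/root/ts-snapshots}"

# Usage: snapshot_tailscale PEER
# Writes one file per call and prints its path.
snapshot_tailscale() {
    peer="$1"
    mkdir -p "$SNAP_DIR" || return 1
    out="$SNAP_DIR/ts-$(date +%Y%m%d-%H%M%S).txt"
    {
        echo "== date =="
        date
        echo "== tailscale status =="
        "$TS_BIN" status 2>&1 || echo "(exit code $?)"
        echo "== tailscale netcheck =="
        "$TS_BIN" netcheck 2>&1 || echo "(exit code $?)"
        echo "== tailscale ping $peer =="
        "$TS_BIN" ping --c 1 "$peer" 2>&1 || echo "(exit code $?)"
    } > "$out"
    echo "$out"
}
```

Calling `snapshot_tailscale tailscaleexternalNODE` just before the service restart would keep a record of what the node looked like while it was down.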
-
I don't know exactly where to look :(
At a high level, all I see is that the TS service is green (which is confusing, but that's in a different thread and unrelated) and the TS connection status is down.
My use case is:
pfS runs TS
iPad runs TS
iPhone runs TS
Windows 11 VM runs TS
So when pfS is down, the others actually work fine.
I even noticed that routes get resolved.
-
@chudak said in "Tailscale is not online" problem:
TS connection status is down.
Isn't the script handling that?
The script tries to ping and, if that fails, restarts the service.
-
@mcury said in "Tailscale is not online" problem:
@chudak said in "Tailscale is not online" problem:
TS connection status is down.
Isn't the script handling that?
The script tries to ping and, if that fails, restarts the service.
That's the interesting part.
Yesterday I found TS down.
I tried to start it manually, switched Kea DHCP to ISC DHCP and back, and removed
/tmp/kea4-ctrl-socket.lock
but could not make it start.
Then this morning everything was up and running normally.
??!!
-
@chudak said in "Tailscale is not online" problem:
I tried to start it manually, and switched Kea DHCP to ISC DHCP and back, removed /tmp/kea4-ctrl-socket.lock and could not make it start.
Then today in the morning - everything is up and running normally
??!!
I don't see how one could interfere with the other.
But, I'm still using ISC-DHCP for that customer.
Can't switch to KEA yet...
-
@mcury said in "Tailscale is not online" problem:
@chudak said in "Tailscale is not online" problem:
I tried to start it manually, and switched Kea DHCP to ISC DHCP and back, removed /tmp/kea4-ctrl-socket.lock and could not make it start.
Then today in the morning - everything is up and running normally
??!!
I don't see how one could interfere with the other.
But, I'm still using ISC-DHCP for that customer.
Can't switch to KEA yet...
Do you know by chance how to catch TS stop/start errors and warnings in the logs?
-
@chudak said in "Tailscale is not online" problem:
Do you know by chance how to catch TS stop/start errors and warnings in the logs?
You can search the old system logs in the /var/log directory, if I'm not mistaken. It shouldn't be hard to find: if the problem started yesterday at 15:00, look at the entries around that time.
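As a rough illustration of that search, the sketch below greps every `*.log` file in a directory for lines mentioning tailscale. LOG_DIR and the function name are assumptions made for this example, and which file actually holds the Tailscale messages on a given pfSense install is not confirmed in this thread. Rotated or compressed logs are not searched.

```shell
#!/bin/sh
# Sketch: list log lines that mention tailscale.
# LOG_DIR and find_tailscale_logs are hypothetical names.
LOG_DIR="${LOG_DIR:-/var/log}"

# Usage: find_tailscale_logs [PATTERN]
# Prints "file:line" for every match; returns non-zero if none found.
find_tailscale_logs() {
    pattern="${1:-tailscale}"
    # -i: ignore case, -H: always print the file name.
    grep -iH -- "$pattern" "$LOG_DIR"/*.log 2>/dev/null
}
```

Piping the result through a second grep for the time of the outage, for example `find_tailscale_logs | grep ' 15:0'`, narrows it down to the minutes around the failure.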
-
TS in conjunction with pfS instability is really frustrating
I’m travelling now and TS is simply down with the error:
Error executing command (/usr/local/bin/tailscale status)
Health check:
- not logged in, last login error=invalid key: API key does not exist
unexpected state: NoState
So far nothing I’ve tried (rebooting, deleting a lock file) has helped.
Thx G.. I still have OpenVPN as a backup option.
I’m surprised no one else is complaining about this…
-
@chudak did you create the key in the Tailscale console and then import it into pfSense?
Also, set the key to not expire.
-
@mcury said in "Tailscale is not online" problem:
@chudak did you create the key in the Tailscale console and then import it into pfSense?
Also, set the key to not expire.
I did set my key to not expire.
I have used the original key with pfS and did not regenerate it.
And I have seen it working normally after this error,
so I suspect it’s unrelated.
I’m hesitant to mess with the keys now, as it worked literally two days ago.
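Before touching the keys, it may help to record which kind of failure the node reports each time it goes down. Below is a minimal sketch that classifies the text printed by `tailscale status`. The function name is hypothetical, and the matched strings are taken only from the error output quoted earlier in this thread, so other versions may word them differently.

```shell
#!/bin/sh
# Sketch: classify `tailscale status` output read from stdin.
# classify_status is a hypothetical helper; the matched strings come
# from the error text quoted in this thread, not from Tailscale docs.
#
# Prints one word:
#   auth    = a login or key problem is reported
#   nostate = the daemon reports NoState without a login error
#   other   = none of the known markers were found
classify_status() {
    text=$(cat)
    case "$text" in
        *"not logged in"*|*"invalid key"*) echo auth ;;
        *"NoState"*) echo nostate ;;
        *) echo other ;;
    esac
}
```

For example, `/usr/local/bin/tailscale status 2>&1 | classify_status` could be logged from the cron script, so that over time it becomes clear whether the outages line up with the "invalid key" message or not.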
-
@chudak said in "Tailscale is not online" problem:
@mcury said in "Tailscale is not online" problem:
@chudak did you create the key in the Tailscale console and then import it into pfSense?
Also, set the key to not expire.
I did set my key to not expire.
I have used the original key with pfS and did not regenerate it.
And I have seen it working normally after this error,
so I suspect it’s unrelated.
I’m hesitant to mess with the keys now, as it worked literally two days ago.
OK, if the problem happens again, try creating a new key, importing it, and then setting it to "don't expire".