Telegraf: no connection from pfSense via FRR (can I choose the interface?)
I have a problem with Telegraf. I think I know the cause, but I don't know how to solve it.
I have several networks connected to each other via FRR, all working fine. Each has its own pfSense firewall. The ones relevant here are 172.xx.xx.1, 10.10.xx.1 and 10.20.xx.1.
I set up an InfluxDB server behind 172.xx.xx.1, it has IP 172.xx.xx.111. On the pfSense box at 172.xx.xx.1, I added the Telegraf package and set the IP - all works perfectly.
On 10.10.xx.1 and 10.20.xx.1, the same setup does not work.
From servers behind 10.10.xx.1 and 10.20.xx.1 I can reach the InfluxDB server without problems.
When I run telegraf from the command line on one of the pfSense firewalls with no connection, I get this:
2020-02-10T16:19:25Z E! [outputs.influxdb]: when writing to [http://172.xx.xx.111:8086]: Post http://172.xx.xx.111:8086/write?db=pfsense: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
2020-02-10T16:19:25Z E! Error writing to output [influxdb]: could not write any address
OK, so try curl on the address to see if it can even be reached:
/root: curl -v http://172.xx.xx.111
* Expire in 0 ms for 6 (transfer 0x803a94000)
* Trying 172.xx.xx.111...
* TCP_NODELAY set
* Expire in 200 ms for 4 (transfer 0x803a94000)
* connect to 172.xx.xx.111 port 80 failed: Operation timed out
* Failed to connect to 172.xx.xx.111 port 80: Operation timed out
* Closing connection 0
curl: (7) Failed to connect to 172.xx.xx.111 port 80: Operation timed out
If I try this:
/root: curl -v --interface ix0 http://172.xx.xx.111
* Expire in 0 ms for 6 (transfer 0x803a94000)
* Trying 172.xx.xx.111...
* TCP_NODELAY set
* Local Interface ix0 is ip 10.20.xx.1 using address family 2
* Local port: 0
* Expire in 200 ms for 4 (transfer 0x803a94000)
* Connected to 172.xx.xx.111 (172.xx.xx.111) port 80 (#0)
> GET / HTTP/1.1
> Host: 172.xx.xx.111
> User-Agent: curl/7.64.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Server: nginx/1.14.0 (Ubuntu)
etc etc...
So that works. The problem seems to be that Telegraf is going out over the wrong interface, probably because of FRR (a packet capture shows that the source IP is the VTI interface IP). Is there any way I can change the interface Telegraf communicates on?
Any suggestions? I've returned to this problem and tried to find options for Telegraf as well, but I can't seem to find a simple solution (as with the ping command, for example, where ping -S 172.xx.xx.1 solves the problem).
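For context, what curl --interface and ping -S do at the socket level is bind the local (source) address before connecting, instead of letting the kernel pick one from the routing table. Below is a minimal Python sketch of that idea. It is only an illustration of source-address binding, not something Telegraf is shown to support anywhere in this thread. The helper name connect_from and the loopback demo are made up for the example.

```python
import socket


def connect_from(source_ip, dest, timeout=5.0):
    """Open a TCP connection to dest (host, port) using source_ip as the local address.

    Hypothetical helper for illustration. Port 0 in source_address lets the
    OS pick an ephemeral local port while still pinning the source IP.
    """
    return socket.create_connection(
        dest, timeout=timeout, source_address=(source_ip, 0)
    )


if __name__ == "__main__":
    # Loopback demo so the sketch runs anywhere: a throwaway listener
    # stands in for the InfluxDB server.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(1)

    client = connect_from("127.0.0.1", server.getsockname())
    peer, peer_addr = server.accept()

    # The server sees the address the client bound to as the source.
    print("server saw source address:", peer_addr[0])

    client.close()
    peer.close()
    server.close()
```

On the firewall, the equivalent would be connect_from("10.20.xx.1", ("172.xx.xx.111", 8086)), which is what the working curl --interface ix0 call above effectively does.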
Did you try adding a route for the IP?
Thanks for your input. I have not added a static route, if that's what you are suggesting.
There is a route to this network created by FRR, it looks something like this:
172.xx.xx.0/24 link#2 U 322721536 1500 vtnet1
What route should I add so that telegraf chooses the right interface?
Sorry for the late reply. The route above does not include the 10.* addresses. Try setting up a static route for each interface under System / Routing / Static Routes.
As I understand it, static routes should not be necessary when using FRR.
What I showed wasn't quite correct, it was from the wrong firewall. This is what it looks like from 10.10.xx.1:
172.xx.xx.0/24 10.88.88.17 UG1 41097543 1500 ipsec3000
10.88.88.17 link#15 UH 15196154 1500 ipsec3000
There are a lot more routes there, but I won't post them all since it will be incredibly confusing (there is a large network of several firewalls interconnected). There are also entries for all the local interfaces, for example:
10.xx.xx.0/24 link#3 U 4056229286 1500 ix0
10.88.88.17 is the VTI interface IP on the local firewall. Routing works in general; it is only for connections originating on the firewall itself that the wrong interface is chosen. Anything originating behind the firewall takes the right route.
I have solved my problem. For reference, and for anyone who runs into the same issue:
On the pfSense gateway 172.xx.xx.1 (the one in front of the InfluxDB server, which was working), add the following to the Telegraf configuration:
[[inputs.influxdb_listener]]
  service_address = ":8086"
This can be done at the bottom of the Telegraf configuration page. This activates Telegraf as a forwarder.
On each of the other pfSense machines, I looked up the VTI interface IP (i.e. 10.88.88.xx) and set the InfluxDB server to http://10.88.88.xx:8086.
Now, the data is sent via Telegraf to the InfluxDB server, and arrives as it should.
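Put together, the two sides of this setup might look roughly like the fragment below. It is a sketch under assumptions: the listener stanza is the one quoted above, while the outputs.influxdb stanza is my guess at what the pfSense Telegraf package generates from its GUI fields on the other firewalls. The database name pfsense is taken from the error log earlier in the thread, and 10.88.88.xx stays a placeholder for the VTI address.

```toml
# On the gateway 172.xx.xx.1 (added at the bottom of the Telegraf
# configuration page): accept InfluxDB line-protocol writes on port 8086
# so they are passed on through this Telegraf's existing output.
[[inputs.influxdb_listener]]
  service_address = ":8086"

# On each of the other firewalls (assumed equivalent of the GUI settings):
# write to the VTI address instead of the InfluxDB server itself.
[[outputs.influxdb]]
  urls = ["http://10.88.88.xx:8086"]
  database = "pfsense"
```

The point of the design is that the other firewalls only ever talk to an address on the VTI link, so the source address chosen for firewall-originated traffic no longer matters.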
There would probably be other solutions with proxies, but this seems simplest.