Grafana Dashboard using Telegraf with additional plugins

VictorRobellini

I want to share the Grafana dashboard I've been building. It uses pfSense with the Telegraf package. I've been using GitHub to track all of my changes and manage the plugins.

The dashboard is built using variables, so there should be no need to adjust any of the queries for your system. Just make sure to select the proper datasource when you import the dashboard. I've also included some additional plugins to get more information. All of this is documented on my GitHub page, along with instructions to set everything up.

If you have any other data points to add I would be very interested in adding them. Ideally, just fork the repo and submit a merge request.

Setup info:
Repo URL: https://github.com/VictorRobellini/pfSense-Dashboard

See the GitHub page for version information

Screenshot

Thanks

kiokoman

nice

bigjohns97

First off, this is an awesome dashboard. As someone who recently moved from UNIFI UDMP / PiHole to pfSense, the only thing I am really missing is the pretty dashboards I used to see. I already have a system performance dashboard up, but I would really like to see some of the security options you have laid out.

Would it be possible to show the top DNS clients as well as the top allowed DNS requests? I always liked seeing this in PiHole, and it was very useful when trying to find out who was being too chatty with my DNS servers.

Awesome work, I am going to start implementing this right away!

EDIT : wow this works like a charm, great instructions!

JohnKap

Looks great, thanks for sharing. Will have a go at this myself.

VictorRobellini

@bigjohns97 I don't know if that information is even being logged. I believe the dnsbl.log only logs what has been blocked. You could maybe get that data from Unbound. It would require additional logging, but even then I don't know if you would be able to differentiate between a blocked vs allowed domain. If you have a datasource, I'll take a look but I don't know if there is one.

bigjohns97

Any idea why I would be getting the following when trying the telegraf_gateways.py option?

[inputs.exec] Error in plugin: exec: fork/exec /usr/local/bin/telegraf_gateways.py: no such file or directory for command '/usr/local/bin/telegraf_gateways.py'

Vikonder

Hi

Do you have an option on the dashboard to count the files in a specific folder?

VictorRobellini

@Vikonder Building shell based plugins is really easy. Just look at the temperature plugin.

Something like this will do it, all you need to do is wrap it in a shell script and replace the hostname with a variable like I did in the temperature script.

find /var -type d -exec sh -c 'printf "files,host=hostname,directory=%s filecount=%s\n" "$1" "$(ls -1 "$1" | wc -l | tr -d " ")"' _ {} \;
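
The same idea can also be sketched as a standalone Python script for Telegraf's exec input. This is only a rough sketch, not something from the repo: the measurement name `files`, the tag and field names, and the list of folders are assumptions for illustration, and it assumes directory names contain no spaces or commas (those would need escaping in line protocol).

```python
#!/usr/bin/env python3
import os
import socket


def count_entries(directory):
    """Return the number of entries directly inside directory (not recursive)."""
    return len(os.listdir(directory))


def format_line(host, directory, count):
    # Line protocol shape: measurement,tag=value field=value
    # The trailing "i" marks filecount as an integer field.
    return "files,host={},directory={} filecount={}i".format(host, directory, count)


if __name__ == "__main__":
    host = socket.gethostname()
    # Hypothetical list of folders to watch; adjust to taste.
    for directory in ["/var/log", "/tmp"]:
        try:
            print(format_line(host, directory, count_entries(directory)))
        except OSError:
            # Skip folders that are missing or unreadable.
            continue
```

Each printed line is one point per folder, so the script can be called from an exec input the same way as the other plugins.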

VictorRobellini

@bigjohns97 Have you tried executing it from a shell on its own?

bigjohns97

@VictorRobellini Yes, I'm actually messing with it right now. I was able to get a little further by updating the Python version referenced in the script to 3.7, which seems to be the version included in pfSense 2.4.5. With that change, I now get this when running it from the command line:

File "./telegraf_gateways.py", line 17
print "gateways,gateway_name=" + values[0] + " rtt=" + str(int(values[1])/100.0) +
^
SyntaxError: invalid syntax

VictorRobellini

@bigjohns97 If you transferred this from a Windows box, you may want to check your carriage returns.

bigjohns97

@VictorRobellini I checked that, all clear.

I'll keep looking.

EDIT : I thought you were talking about the junk characters that sometimes get added to files when going back and forth between Windows and Linux. I don't see any of that when I vi the interfaces script (I do see it in the telegraf.conf file, BTW, but only in the entries from the extra config section of the GUI).

Are you saying that should be in there or that it shouldn't?

VictorRobellini

@bigjohns97 On Linux you can run "cat -A filename". I don't think the cat in pfSense has the same capability. When in doubt, just run the tr command against the file. All it should be doing is replacing the carriage returns, so running it on a valid file will have no impact. Even running it multiple times will have no impact.
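
For anyone who would rather check this from Python, here is a minimal sketch of the same cleanup. It is not the tr command from the repo instructions, and the function names are made up for illustration. It strips every carriage return byte and only rewrites the file when something actually changed, so running it on a clean file (or running it repeatedly) has no effect.

```python
def strip_carriage_returns(data):
    """Remove every CR byte; LF-only input comes back unchanged."""
    return data.replace(b"\r", b"")


def fix_file(path):
    """Strip CRs from the file at path. Return True if the file was modified."""
    with open(path, "rb") as f:
        original = f.read()
    cleaned = strip_carriage_returns(original)
    if cleaned != original:
        with open(path, "wb") as f:
            f.write(cleaned)
    return cleaned != original
```

Calling `fix_file` on a script copied from Windows returns True the first time and False on every later run.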

bigjohns97

@VictorRobellini does that script work for you? if so, what version of pfSense are you running?

My research tells me that it will have to be rewritten due to some new way that Python converts certain data to bytes before sending, but I am not very knowledgeable about this stuff. I posted something to the GitHub link from the original author to see if he knows what might be the issue.

bigjohns97

https://gist.github.com/fastjack/a0feb792a9655da7aa3e2a7a1d9f812f

Looks like FastJack uploaded an edited version that works with pfSense 2.4.5, which comes with Python 3.7.

VictorRobellini

@bigjohns97 Awesome! I'll update the repo shortly.

VictorRobellini

@bigjohns97 Thanks again, I've updated the repo with some basic instructions. How are you showing the measurements in the dashboard? When I tried converting from microseconds to milliseconds things didn't look right.

bigjohns97

@VictorRobellini said in Grafana Dashboard using Telegraf with additional plugins:

@bigjohns97 Thanks again, I've updated the repo with some basic instructions. How are you showing the measurements in the dashboard? When I tried converting from microseconds to milliseconds things didn't look right.

I am using the default microseconds. I don't specify an IP address, so it uses my WAN default gateway, which does show in microseconds, which I think is nice.

I really don't need to see 1ms. It's more useful to see 4-8 microseconds, even though the difference is very small. It kinda lets you know when you start reaching the limits of your connection.

BTW it looks like the graph will switch back to milliseconds as soon as the value gets that high and then it goes back to micro once things calm down.

JohnKap

I'm not sure if it's something on my system, but I needed to modify the telegraf_gateways.py script.

Changed this:

    print("gateways,gateway_name="+values[0]+" rtt="+str(int(values[1])/100.0)+ \
          ",rttsd="+str(int(values[2])/100.0)+",loss="+str(int(values[3]))+"i")

to this:

    print("gateways,gateway_name="+values[0]+" rtt="+str(int(values[1])/1.0)+ \
          ",rttsd="+str(int(values[2])/1.0)+",loss="+str(int(values[3]))+"i")

The figures displayed are now correct and match what is displayed on the pfSense dashboard.

Green is my ISP connection, sits at ~10ms.
Yellow is a VPN tunnel to the other side of the world and ~400ms is about right.
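
To make the scaling issue easier to experiment with, here is a small sketch that builds the same output line with the divisor pulled out as a parameter. It is not the script from the repo or the gist: the function name `format_gateway_line` and the sample values are made up, and it assumes `values` holds the gateway name, rtt, rttsd and loss as strings, as in the print statement above.

```python
def format_gateway_line(values, divisor=1.0):
    """Build one line-protocol point from [name, rtt, rttsd, loss] strings.

    divisor=1.0 passes the raw rtt figures through unchanged;
    divisor=100.0 reproduces the earlier version of the script.
    """
    rtt = int(values[1]) / divisor
    rttsd = int(values[2]) / divisor
    loss = int(values[3])
    return "gateways,gateway_name={} rtt={},rttsd={},loss={}i".format(
        values[0], rtt, rttsd, loss
    )


# Made-up sample values, only to show the shape of the output.
print(format_gateway_line(["WAN_DHCP", "10234", "512", "0"]))
print(format_gateway_line(["WAN_DHCP", "10234", "512", "0"], divisor=100.0))
```

With the divisor left at 1.0 the raw figures reach the database untouched, and the unit chosen on the Grafana panel decides how they are displayed.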

bigjohns97

@JohnKap After actually pinging the address from a machine, which shows below 1ms, I can see you are correct. I never went and actually checked the accuracy of this.

LOL at me thinking I had 8 microseconds to my first hop