Netgate Discussion Forum

Grafana Dashboard using Telegraf with additional plugins

pfSense Packages
173 Posts 28 Posters 71.7k Views
  • B
    bigjohns97 @von Papst
    last edited by Jan 27, 2021, 7:33 PM

    @von-papst said in Grafana Dashboard using Telegraf with additional plugins:

    I'm not sure where to install these plugins and the Telegraf config: on my pfSense box, or on my Linux box where I have Telegraf and InfluxDB installed?

    Telegraf should run on your pfSense box and send its data to InfluxDB on your Linux box. You shouldn't need Telegraf on the Linux box for anything.

    • E
      erbalo @bigjohns97
      last edited by erbalo Jan 27, 2021, 8:03 PM Jan 27, 2021, 8:01 PM

      @bigjohns97 said in Grafana Dashboard using Telegraf with additional plugins:

      I was playing with the telegraf_unbound script, noticed it wasn't working, and ended up replacing the command in the script with the following:

      unbound-control -c /var/unbound/unbound.conf stats_noreset | grep total.num
      

      This outputs the cache hit stats, and I was wondering whether anyone with Grafana skills could turn them into a nice panel for this already great dashboard. I was looking for something like this:

      [image]

      I found a Reddit thread discussing how to optimize cache hits, but I couldn't figure out how to get the panel shown there to work.

      If anyone knows how to create this panel, I would greatly appreciate it!

      (I tried to link the original Reddit thread, but the forum kept flagging this post as spam.)

      I was able to reach the original Reddit poster and get a working version:

      SELECT "total_num_cachehits" FROM "unbound" WHERE ("host" = 'pfSense.localdomain') AND $timeFilter
      
      SELECT "total_num_cachemiss" FROM "unbound" WHERE ("host" = 'pfSense.localdomain') AND $timeFilter
      
      {
        "aliasColors": {
          "Hits": "#629e51",
          "Misses": "#bf1b00"
        },
        "breakPoint": "50%",
        "cacheTimeout": null,
        "combine": {
          "label": "Others",
          "threshold": 0
        },
        "datasource": "PfSense",
        "decimals": null,
        "fieldConfig": {
          "defaults": {
            "custom": {}
          },
          "overrides": []
        },
        "fontSize": "100%",
        "format": "short",
        "gridPos": {
          "h": 6,
          "w": 3,
          "x": 13,
          "y": 7
        },
        "hideTimeOverride": false,
        "id": 23763571993,
        "interval": null,
        "legend": {
          "header": "",
          "percentage": true,
          "percentageDecimals": 0,
          "show": true,
          "sortDesc": true,
          "values": false
        },
        "legendType": "On graph",
        "links": [],
        "maxDataPoints": 3,
        "nullPointMode": "connected",
        "pieType": "donut",
        "pluginVersion": "6.3.3",
        "strokeWidth": "2",
        "targets": [
          {
            "alias": "Hits",
            "groupBy": [],
            "measurement": "unbound",
            "orderByTime": "ASC",
            "policy": "default",
            "refId": "A",
            "resultFormat": "time_series",
            "select": [
              [
                {
                  "params": [
                    "total_num_cachehits"
                  ],
                  "type": "field"
                }
              ]
            ],
            "tags": [
              {
                "key": "host",
                "operator": "=",
                "value": "pfSense.localdomain"
              }
            ]
          },
          {
            "alias": "Misses",
            "groupBy": [],
            "measurement": "unbound",
            "orderByTime": "ASC",
            "policy": "default",
            "refId": "B",
            "resultFormat": "time_series",
            "select": [
              [
                {
                  "params": [
                    "total_num_cachemiss"
                  ],
                  "type": "field"
                }
              ]
            ],
            "tags": [
              {
                "key": "host",
                "operator": "=",
                "value": "pfSense.localdomain"
              }
            ]
          }
        ],
        "thresholds": [],
        "timeFrom": null,
        "timeShift": null,
        "title": "DNS Cache Hit/Miss Ratio",
        "type": "grafana-piechart-panel",
        "valueName": "current"
      }
      

      Here are my current stats (I set the minimum TTL to 3600 in unbound):
      [image]

      EDIT: I forgot to mention that you have to use this command in telegraf_unbound.sh, and make sure you uncomment the unbound input in the Telegraf config from the install instructions:

      unbound-control -c /var/unbound/unbound.conf stats_noreset | grep total.num
      

      I am interested in this graph. What can you actually analyze with a DNS cache hit graph?

      Do I only need to uncomment the unbound part of the Telegraf config on pfSense and put the unbound-control command into telegraf_unbound.sh?

      #[[inputs.unbound]]
      server = "127.0.0.1:953"
      binary = "/usr/local/bin/telegraf_unbound.sh"

      What should the server IP address be? Is that my pfSense box?

      • B
        bigjohns97 @erbalo
        last edited by bigjohns97 Jan 27, 2021, 8:16 PM Jan 27, 2021, 8:10 PM

        Correct: uncomment the part you posted, then edit the sh script to use the "unbound-control" command I posted above.

        As for what purpose this serves: it helped me get to the point where the majority of my DNS requests are cached and answered locally by pfSense, which greatly speeds up the internet experience.

        A lot of IoT devices today use very small TTLs, which means pfSense is constantly sending queries out to DNS servers on the internet. With a min TTL of 3600, I am telling unbound to cache every record for at least 3600 seconds. This sets a floor for all domains, while domains with a TTL above 3600 keep their designed TTL.

        As you can see, it gives me a very high cache hit rate. I have done this since way back when I was running Pi-hole and have never seen any issues with it, but you will hear from quite a few people who believe it is bad practice because it goes against what the domain owner intended. I have always felt that domain owners have to account for the lowest common denominator, so for me a 3600 min TTL is a nice middle ground between performance and proper name resolution.
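For anyone who wants the hit rate as a single number rather than a panel, here is a minimal sketch in POSIX sh that reads the stats_noreset output and prints the percentage of queries answered from cache. The function name `cache_hit_pct` is made up for illustration; the counters it reads are the `total.num.cachehits` and `total.num.cachemiss` lines that the grep above lets through.

```shell
#!/bin/sh
# Read `unbound-control ... stats_noreset` output on stdin and print the
# cache hit percentage. Input lines look like `total.num.cachehits=1234`.
# cache_hit_pct is a hypothetical helper name, not part of unbound or Telegraf.
cache_hit_pct() {
  awk -F= '
    $1 == "total.num.cachehits" { hits = $2 }
    $1 == "total.num.cachemiss" { miss = $2 }
    END {
      total = hits + miss
      # Avoid dividing by zero when no queries have been counted yet.
      if (total == 0) { print "n/a"; exit }
      printf "%.1f\n", 100 * hits / total
    }'
}

# On the firewall (config path taken from the command above):
# unbound-control -c /var/unbound/unbound.conf stats_noreset | cache_hit_pct
```

The calculation uses hits / (hits + misses), which is the same split the pie chart shows with its two series.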

        • E
          erbalo @bigjohns97
          last edited by Jan 27, 2021, 8:28 PM

          @bigjohns97

          Thank you. Which IP should I put here, and should I leave the port number as 953?

          server = "127.0.0.1:953"

          • E
            erbalo @erbalo
            last edited by Jan 27, 2021, 8:32 PM

            Would this be OK?

            #!/bin/sh
            /usr/local/sbin/unbound-control -c /var/unbound/unbound.conf $* | grep -vE 'thread[0-9]+'
            unbound-control -c /var/unbound/unbound.conf stats_noreset | grep total.num
            
            • B
              bigjohns97 @erbalo
              last edited by Jan 27, 2021, 8:33 PM

              @erbalo That server ip and port should be fine.

              • B
                bigjohns97 @erbalo
                last edited by bigjohns97 Jan 27, 2021, 8:39 PM Jan 27, 2021, 8:35 PM

                When I tried running the command that was originally in the script (the top line in your version), it didn't work.

                • E
                  erbalo @bigjohns97
                  last edited by Jan 27, 2021, 8:42 PM

                  So it should just be:

                  #!/bin/sh
                  unbound-control -c /var/unbound/unbound.conf stats_noreset | grep total.num
                  

                  ?

                  • B
                    bigjohns97 @erbalo
                    last edited by Jan 27, 2021, 8:46 PM

                    Correct, that is what I am running, and it didn't affect any other metrics.
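If the panel stays empty after this change, one thing worth checking is that the wrapper script prints nothing but name=value lines. As far as I understand it, the Telegraf unbound input reads the output as name=value pairs, so any other line (an error message, for example) suggests the command itself is failing. A small sketch in POSIX sh; `bad_stat_lines` is a hypothetical helper name:

```shell
#!/bin/sh
# Print every stdin line that is NOT of the form name=number.
# No output means every line looks like a stat.
# bad_stat_lines is a hypothetical helper, not part of unbound or Telegraf.
bad_stat_lines() {
  # grep exits non-zero when nothing is printed; `|| true` keeps the
  # function's exit status at zero in that (healthy) case.
  grep -vE '^[A-Za-z0-9_.]+=[0-9]+(\.[0-9]+)?$' || true
}

# On the firewall (script path taken from the config quoted earlier);
# 2>&1 folds error messages into the stream so they are flagged too:
# /usr/local/bin/telegraf_unbound.sh 2>&1 | bad_stat_lines
```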

                    • E
                      erbalo @bigjohns97
                      last edited by Jan 27, 2021, 9:13 PM

                      I'm not receiving any data in Grafana. What could be wrong?

                      • B
                        bigjohns97 @erbalo
                        last edited by bigjohns97 Jan 27, 2021, 9:31 PM Jan 27, 2021, 9:20 PM

                        Make sure the data source and measurement (table) names on your side match what I posted.

                        @erbalo copy the JSON from above again; I replaced some of my entries with variables so it is more plug and play.

                        • V
                          von Papst @bigjohns97
                          last edited by Jan 28, 2021, 2:46 PM

                          @bigjohns97 got it running. But still missing CPU, memory and system load data. What am I missing?

                          • B
                            bigjohns97 @von Papst
                            last edited by Jan 28, 2021, 2:53 PM

                            @von-papst This is just a single panel, add it to the dashboard being developed in this thread.

                              • J
                                jpcapone
                                last edited by Feb 23, 2021, 3:04 AM

                                I think I am late to the party, but I am trying to figure some things out. I am running pfSense in a VM on ESXi 6.7. I got most of the panels working, but I think the scripts aren't running: not all of the panels are populated with data, and when I run SHOW MEASUREMENTS on the DB I get only the entries listed below. I am not super familiar with FreeBSD, so I am finding it difficult to work out how to test-run the scripts so that I can fix them. Any advice would be appreciated.
                                cpu
                                disk
                                diskio
                                mem
                                net
                                pf
                                processes
                                swap
                                system

                                • B
                                  bigjohns97 @jpcapone
                                  last edited by Feb 23, 2021, 1:47 PM

                                  @jpcapone This is the best way to troubleshoot the plugins

                                  Taken from https://github.com/VictorRobellini/pfSense-Dashboard

                                  [image]

                                    • J
                                      jpcapone @bigjohns97
                                      last edited by Feb 23, 2021, 7:05 PM

                                      @bigjohns97
                                      Thanks for that. I was able to figure out the issues with the plugins. Now I am just left with what I have pasted below. Can you please advise?

                                      2021-02-23T19:01:58Z I! Loaded inputs: cpu disk diskio exec kernel logparser (2x) mem net pf processes swap system
                                      2021-02-23T19:01:58Z I! Loaded aggregators:
                                      2021-02-23T19:01:58Z I! Loaded processors:
                                      2021-02-23T19:01:58Z I! Loaded outputs: influxdb
                                      2021-02-23T19:01:58Z I! Tags enabled: host=xxxxpfSense.xxxxolutions.co
                                      2021-02-23T19:01:58Z I! [agent] Config: Interval:10s, Quiet:false, Hostname:"xxxxpfSense.xxxxolutions.co", Flush Interval:10s
                                      2021-02-23T19:01:58Z D! [agent] Initializing plugins
                                      2021-02-23T19:01:58Z W! [inputs.logparser] The logparser plugin is deprecated; please use the 'tail' input with the 'grok' data_format
                                      2021-02-23T19:01:58Z W! [inputs.logparser] The logparser plugin is deprecated; please use the 'tail' input with the 'grok' data_format
                                      2021-02-23T19:01:58Z D! [agent] Connecting outputs
                                      2021-02-23T19:01:58Z D! [agent] Attempting connection to [outputs.influxdb]
                                      2021-02-23T19:01:58Z D! [agent] Successfully connected to outputs.influxdb
                                      2021-02-23T19:01:58Z D! [agent] Starting service inputs
                                      2021-02-23T19:01:58Z E! [inputs.logparser] Error in plugin: open /var/log/pfblockerng/dnsbl.log: no such file or directory
                                      2021-02-23T19:01:58Z E! [inputs.logparser] Error in plugin: open /var/log/pfblockerng/ip_block.log: no such file or directory
                                      2021-02-23T19:02:00Z E! [inputs.logparser] Error in plugin: open /var/log/pfblockerng/dnsbl.log: no such file or directory
                                      2021-02-23T19:02:00Z E! [inputs.logparser] Error in plugin: open /var/log/pfblockerng/ip_block.log: no such file or directory

                                      • B
                                        bigjohns97 @jpcapone
                                        last edited by Feb 23, 2021, 8:33 PM

                                        Looks like you aren't using pfBlockerNG. Is that the case?

                                        Are you now getting data on the InfluxDB side and, in turn, on your dashboard?
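A quick way to confirm that is to check whether the two log files named in the errors exist on the firewall. A minimal sketch in POSIX sh; `missing_logs` is a hypothetical helper name, and the paths are the ones from the Telegraf log output above:

```shell
#!/bin/sh
# Print each path given as an argument that does not exist as a regular file.
# missing_logs is a hypothetical helper, not part of pfSense or Telegraf.
missing_logs() {
  for f in "$@"; do
    [ -f "$f" ] || printf '%s\n' "$f"
  done
}

# On the firewall:
# missing_logs /var/log/pfblockerng/dnsbl.log /var/log/pfblockerng/ip_block.log
```

If both paths are printed, the logparser errors are expected until pfBlockerNG is installed and has written its logs.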

                                        • J
                                          jpcapone @bigjohns97
                                          last edited by Feb 24, 2021, 12:41 AM

                                          @bigjohns97
                                          Yup, I am getting data, but I am still not seeing the same measurements in my DB that you see in the troubleshooting section. Also, I had to turn on pfBlockerNG, and I am still not getting any data from it in Grafana. Any suggestions?
                                          [image]
