Grafana

From Open Source Controls Wiki

Links

Reasons for Using Grafana

Grafana was developed to monitor, and raise alarms on, cloud-native systems such as servers and online services. It is available both as open-source software and as a managed service, and provides the functions required to maintain critical services.

The cloud landscape demands continuous uptime, monitoring of containers and clusters, and advanced alarm routing that escalates problems as they occur, to prevent any loss of service.

There are many similarities between cloud native services and HVAC services, and it makes complete sense to adopt systems developed in the cloud industry if they can benefit the HVAC industry, and assist in saving carbon. When the disk space on a server is running low, services may become slow, and when a pump in a heat network has faulty bearings, hot water may cool down. Apart from the names involved, the systems used to catch and highlight problems are identical.

The key difference between cloud systems and HVAC is open source. Cloud systems run on open-source software and are developed collaboratively. HVAC runs on closed systems, where every function depends on an installer creating it. It is no surprise that the standard methods used in the cloud landscape are light years ahead of anything in HVAC. They are also far more evolved, covering everything down to engineer notification and alarm escalation.

  • Freemium model, with a free forever plan alongside commercial packages
  • Quick to set up
  • Community Edition that runs in Docker on a controller, as well as online, and as a hosted service
  • Scalable
  • Secure
  • Public and private dashboards
  • Technical support
  • Removes need for further cloud infrastructure, by providing core functions of data storage, visualisation and alarming
  • Works with both time series and static data
  • Numerous plugins, including SVG graphics manipulation
  • Compatible with a wide range of data sources through plugins
  • Powerful query construction
  • Repeating objects and panels allow for dynamic dashboarding
  • Organisation and user management

Points to Consider

  • HVAC equipment (and IoT in general) often differs from server architecture (Grafana's origins) with regard to data bandwidth, with some devices connected over SIM cards or radios with very little data allowance.
  • Needs to remain compatible with the standard 5 level MQTT structure (Network_Id / Node_Id / Device_Id / Group / Key) so that dashboards based on named devices can be shared easily. The Network_Id will be the only variable that changes. http://heatweb.co.uk/w/index.php?title=Heat_Network_Protocol
  • With limited data bandwidth, installations can make use of a data concentrator, a controller that generates summary statistics and alarms from groups of devices on a local network, that can then be sent over the limited uplink bandwidth.
  • A data request function (a sync) is required, whereby all (or specific) data on a device can be requested for publishing. This allows high-resolution historic data held on a controller, which would not normally be published, to be remotely requested, published, and then accessed in the event of problems. Such a function would also apply to detailed system data and files.
  • Two-way communication is required, with a need to implement user commands. Examples include remote control over central heating, or the facility to switch equipment between on / off / auto states. Security needs to be considered with control dashboards locked to unauthorised users. Logging of control requests (commands) is required. In the existing protocol, the data groups cmd and set are used for commands and persistent settings respectively.
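
The topic structure and the command groups described above can be sketched in JavaScript. This is a minimal illustration, not part of the existing protocol code: the function name parseTopic and the returned field names are assumptions made for this example, while the five levels and the cmd and set groups come from the list above.

```javascript
// Hypothetical helper: split a 5-level topic into named parts.
// Level order follows Network_Id / Node_Id / Device_Id / Group / Key.
function parseTopic(topic) {
  const parts = topic.split("/");
  if (parts.length !== 5 || parts.some((p) => p === "")) {
    return null; // not a standard 5-level topic
  }
  const [network, node, device, group, key] = parts;
  return {
    network, node, device, group, key,
    // cmd = command, set = persistent setting (as described in the list above).
    isCommand: group === "cmd",
    isSetting: group === "set",
  };
}

// Example topics (the "mode" key is made up for illustration).
console.log(parseTopic("putneyplaza/zcccaef4ahl/boilers/dat/tF"));
console.log(parseTopic("putneyplaza/zcccaef4ahl/boilers/cmd/mode"));
```

Because only the network part differs between sites, a dashboard built on the device, group and key parts could be reused by swapping that one value. A check such as isCommand is also a natural place to hook in the logging of control requests.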

Data Routing

InfluxDB and MySQL

Numeric data is fed into InfluxDB, and all data is fed into MySQL.

Boilers4.PNG

Prometheus via InfluxDB Endpoint

The following Node-RED flow successfully feeds data into Prometheus on Grafana Cloud via the InfluxDB endpoint.

Noderedinfluxprom.png


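The following function is used to convert MQTT formatted messages into the InfluxDB line protocol format required.

// Drop non-numeric payloads. The explicit checks are needed because
// isNaN("") and isNaN(null) are both false.
if (msg.payload === "" || msg.payload === null || isNaN(msg.payload)) { return null; }

// Topic levels: Network_Id / Node_Id / Device_Id / Group / Key
var tops = msg.topic.split("/");
if (tops.length < 5) { return null; }

// Target format, e.g. 'test,bar_label=abc,source=grafana_cloud_docs metric=35.2'
var body = tops[3] + '_' + tops[4] + ",network=" + tops[0] + ",node=" + tops[1] + ",device=" + tops[2] + ",vargroup=" + tops[3] + ",varkey=" + tops[4] + " metric=" + msg.payload;

msg.payload = body;

return msg;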

This formatted message is then fed into an http request node containing credentials.
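
As a worked example of the conversion above, the sketch below wraps the same string-building in a standalone function so it can be tried outside Node-RED. The names toLineProtocol, escapeTag and escapeMeasurement are assumptions for this example. The escaping of commas, equals signs and spaces follows the InfluxDB line protocol rules and is an addition here: the Node-RED function does not escape anything, which is fine as long as topic levels contain none of those characters.

```javascript
// Line protocol: measurement names escape commas and spaces,
// tag values escape commas, equals signs and spaces.
const escapeMeasurement = (s) => String(s).replace(/([, ])/g, "\\$1");
const escapeTag = (s) => String(s).replace(/([,= ])/g, "\\$1");

// Build one line: <group>_<key>,<tags> metric=<value>
function toLineProtocol(topic, payload) {
  const value = Number(payload);
  if (payload === "" || payload === null || Number.isNaN(value)) return null;

  const tops = topic.split("/");
  if (tops.length !== 5) return null;

  const measurement = escapeMeasurement(tops[3] + "_" + tops[4]);
  const [network, node, device, vargroup, varkey] = tops.map(escapeTag);
  return `${measurement},network=${network},node=${node},device=${device}` +
    `,vargroup=${vargroup},varkey=${varkey} metric=${value}`;
}

console.log(toLineProtocol("putneyplaza/zcccaef4ahl/boilers/dat/tF", "65.2"));
// dat_tF,network=putneyplaza,node=zcccaef4ahl,device=boilers,vargroup=dat,varkey=tF metric=65.2
```

The resulting label set (network, node, device, vargroup, varkey) is what later appears on each series when querying Prometheus from Grafana.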

Noderedinfluxpromfunc.png

Prometheus Queries in Grafana

Numeric Data

Numeric data can be viewed in Grafana, pulling data from Prometheus.

Promgraph1.png

Metadata from GitHub

Standard topic metadata can be pulled from GitHub as JSON.

https://raw.githubusercontent.com/heatweb/plumbing-controller/main/json/topics/default.json

This is a variant on https://github.com/heatweb/heat-network/blob/master/devices/default_topics.json tailored for Grafana.

Promlabels2.png

Current Problems:

  • How to perform lookups on text values, so that repeating panels can automatically be given meaningful titles. e.g. /b1/settings/title and /b1/system/ipLAN can be combined with labels into a title "Boiler 1 (b1) 192.168.1.113"
  • How to filter on a lookup, for example to display in a row all devices with a /system/deviceType = "boiler" and also to then repeat for each network in a list. This is currently working using a variable lookup from MySQL.


Below is a FAILED attempt to combine data sources to add metadata to graphs, specifically to get each line labelled as "device, title" e.g. "boiler1, Flow Temperature", and to pull in units.


Promlabels1.png Promlabels1a.png

Kiosk

Using a Forms plugin and the Virtual Keyboard extension in Chrome, Grafana can provide a complete UI for altering system settings using the standard 7" touchscreen display.


Grafanasettings.png

Grafanasettings2.png

Installation

sudo docker run -d -p 3000:3000 --name=grafana --restart=always --net mqtt -v grafana-storage:/var/lib/grafana grafana/grafana-oss

See https://github.com/heatweb/plumbing-controller/blob/main/scripts/docker-setup.sh

Setting up for reverse proxy

Open a shell in the Grafana container (for example with docker exec -it -u root grafana sh) and edit the ini file. nano may need to be installed first:

apk update 
apk add nano


nano /usr/share/grafana/conf/defaults.ini

Scroll down and alter to...

# The public facing domain name used to access grafana from a browser
domain = hwwiki.ddns.net

# Redirect to correct domain if host header does not match domain
# Prevents DNS rebinding attacks
enforce_domain = false

# The full public facing url
#root_url = %(protocol)s://%(domain)s:%(http_port)s/
root_url = %(protocol)s://%(domain)s/dashboard/

# Serve Grafana from subpath specified in `root_url` setting. By default it is set to `false` for compatibility reasons.
serve_from_sub_path = true

Further down the file, anonymous access can also be enabled in the [auth.anonymous] section to allow access without a password.

Restart the container to apply the changes.

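SVG Element JavaScript

The script below is used on an SVG element. For each series it writes the latest value into matching text elements and colours matching path elements by temperature. Elements are matched on their device, vargroup and varkey attributes.

  // Convert a hex string (e.g. "ff") to an integer, ignoring non-hex characters.
  function hexdec (hexv) {
    var hexString = (hexv + '').replace(/[^a-f0-9]/gi, '');
    return parseInt(hexString, 16);
  }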
  
  function dechex (number) {
    
    if (number < 0) {
      number = 0xFFFFFFFF + number + 1
    }
    var hx =  parseInt(number, 10).toString(16);
    
    if (hx.length==1) { hx = "0" + hx; }
    return hx;
  }
  
  function color_avg(color1,color2,factor) {
  
          // extract RGB values for color1.
          
          var r1 = color1.substr(1,2);
          var g1 = color1.substr(3,2);
          var b1 = color1.substr(5,2);
          
          var r2 = color2.substr(1,2);
          var g2 = color2.substr(3,2);
          var b2 = color2.substr(5,2);
  
          // get the average RGB values.
          var r_avg = (hexdec(r1)*(1-factor)+hexdec(r2)*factor);
          var g_avg = (hexdec(g1)*(1-factor)+hexdec(g2)*factor);
          var b_avg = (hexdec(b1)*(1-factor)+hexdec(b2)*factor);
  
          // construct the result color.    
          var color_avg = '#' + dechex(r_avg) + dechex(g_avg)+ dechex(b_avg);
  
          // return it.
          return color_avg;
  }
  
  
  function tempColour(temp) {
      
      
      
      var values = parseFloat(temp);
      
      if(values===0) { return("#ffffff"); }

      var tempcolour = "#ff6c6c";
  
      var tv = Math.ceil(values / 10);
  
      var factr = 1 - (tv - (values / 10));
  
  
      //if ($tv<1) { $tempcolour = '#57007f'; }
      //elseif ($tv==1) { $tempcolour = color_avg("#57007f","#140af7",$factr); }
      //elseif ($tv==2) { $tempcolour = color_avg("#140af7","#008ef9",$factr); }
      
      if (tv<1) { tempcolour = '#ffffff'; }
      else if (tv==1) { tempcolour = color_avg("#ffffff","#ffffff",factr); }
      else if (tv==2) { tempcolour = color_avg("#ffffff","#008ef9",factr); }
      else if (tv==3) { tempcolour = color_avg("#008ef9","#14fffb",factr); }
      else if (tv==4) { tempcolour = color_avg("#14fffb","#66fe00",factr); }
      else if (tv==5) { tempcolour = color_avg("#66fe00","#f7fe02",factr); }
      else if (tv==6) { tempcolour = color_avg("#f7fe02","#fdbe10",factr); }
      else if (tv==7) { tempcolour = color_avg("#fdbe10","#f08e12",factr); }
      else if (tv==8) { tempcolour = color_avg("#f08e12","#ff4f19",factr); }
      else if (tv==9) { tempcolour = color_avg("#ff4f19","#c90504",factr); }
      else if (tv>9) { tempcolour = "#c90504";  }
  
              
      return(tempcolour);
      
  }
  


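  console.log("ctrl...",ctrl);
  // Guard the debug output: ctrl.data[0] is undefined when no series are returned.
  if (ctrl.data.length) { console.log("rows...",ctrl.data[0].rows); }
  var lastdev = "";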
  
  for (var row in ctrl.data) {
     
     
        var val = ctrl.data[row].datapoints.lastItem[0];
        console.log("val...",val);
      
        //"{__name__="dat_tF_metric", __proxy_source__="influx", device="boilers", network="putneyplaza", node="zcccaef4ahl", vargroup="dat", varkey="tF"}"
        var rinf = ctrl.data[row].alias;
        
        var dev = rinf.split("device=")[1].split('"')[1];
        lastdev = ""+dev;
        console.log("dev...",dev);

        var varkey = rinf.split("varkey=")[1].split('"')[1];
        // var varkey = ctrl.data[0].rows[row].varkey;
        
        var vargroup = rinf.split("vargroup=")[1].split('"')[1];
        // var vargroup = ctrl.data[0].rows[row].vargroup;
         
        
        //$(svgnode).find("[device='"+dev+"']").filter("[varkey='"+varkey+"']").filter("text").html(val);
        
        // Truncate numeric values to one decimal place for display.
        if (!isNaN(val)) { val = Math.trunc(10*parseFloat(val)) / 10; }
        
     
        $(svgnode).find("[varkey='"+varkey+"']")
        .filter(function( index ) {
            return !$( this ).attr( "device" ) || $( this ).attr( "device" ) == dev  || $( this ).attr( "device" ) == "#d";
        })
        .filter(function( index ) {
            return !$( this ).attr( "vargroup" ) || $( this ).attr( "vargroup" ) == vargroup;
        })
        .filter("text").html(val);
        
        
        
        var valcol = tempColour(val) ;
        console.log("colour...",valcol);
        
        $(svgnode).find("[varkey='"+varkey+"']")
        .filter(function( index ) {
            return !$( this ).attr( "device" ) || $( this ).attr( "device" ) == dev || $( this ).attr( "device" ) == "#d";
        })
        .filter(function( index ) {
            return !$( this ).attr( "vargroup" ) || $( this ).attr( "vargroup" ) == vargroup;
        })
        .filter("path").css({ fill: valcol });
        
  }
  
  $(svgnode).find("[varkey='#d']").filter("text").html(lastdev);
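
The label extraction in the loop above relies on chained split calls, which throw an error if a label is missing from the alias. As a possible alternative, the sketch below parses the whole alias string into an object with a regular expression. The function name parseLabels is an assumption for this example, and the sketch assumes label values contain no escaped double quotes.

```javascript
// Parse a Prometheus-style alias such as
// {__name__="dat_tF_metric", device="boilers", varkey="tF"}
// into a plain object of label name -> value.
function parseLabels(alias) {
  const labels = {};
  const pattern = /([A-Za-z_][A-Za-z0-9_]*)="([^"]*)"/g;
  let match;
  while ((match = pattern.exec(alias)) !== null) {
    labels[match[1]] = match[2];
  }
  return labels;
}

const alias = '{__name__="dat_tF_metric", __proxy_source__="influx", device="boilers", ' +
  'network="putneyplaza", node="zcccaef4ahl", vargroup="dat", varkey="tF"}';
const labels = parseLabels(alias);
console.log(labels.device, labels.vargroup, labels.varkey); // boilers dat tF
```

A missing label then yields undefined rather than an exception, so the loop can skip that series and carry on with the rest.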
  
 
  

Plugins

The following plugins are used.

To Do List

  • How to display a table of latest values, by group, from InfluxDB. What if the latest value is before the time range in Grafana? Can Grafana look up older (the last) values?
  • SVG linked to Prometheus. Only 4 values come through, so the latest values are not shown unless the time range is reduced to 5m.
  • How can one look up Display Name, Units, and optionally Min and Max, from a JSON source?
  • (Optional/Advanced) How can one change the colour of a dashboard item based on a separate metric, so that temperatures from devices that are turned off are always grey? Ideally a group or row could be greyed out. Note that this can be done via an SVG because it opens up scripting on elements. Is there a method of injecting script for other elements, e.g. custom HTML with JavaScript?
  • How can we set up data sources and dashboards at the same time as creating Docker instances? Are they stored in or loaded from a directory that we can populate?