A System Architecture for Monitoring the Reliability of IoT

Radu BONCEA* **, Ioan BACIVAROV**
* Romania Top Level Domain, National Institute for Research and Development in Informatics - ICI Bucharest
** University Politehnica of Bucharest, Faculty of Electronics, Telecommunications and Information Technology
* [email protected], ** [email protected]

Abstract
The Internet of Things has gained momentum in recent years, supported by new technologies and computing paradigms such as Cloud Computing and Service Oriented Architecture, and by an increasing demand from the enterprise. With hundreds of billions of devices to be connected in the near future, IoT will need new methods for addressing key challenges in security and reliability. One particular challenge we will focus on is the ability of the system to prevent itself from failing by continuously introspecting its own state and taking decisions without human intervention. We will demonstrate how this can be achieved using new time series databases and monitoring systems such as Prometheus, InfluxDB, OpenTSDB and Graphite. By logging performance and other transaction metrics, the system can use specific algorithms to predict potential issues and react. We will then show how machine-learning algorithms could be used to reveal new insights, patterns and relationships across data.

Keywords: IoT, monitoring, reliability, self-management, time series, automation, Prometheus, OpenTSDB, InfluxDB

1. INTRODUCTION

Internet of Things is a vision where every object in the world has the potential to connect to the Internet and provide its data so as to derive actionable insights on its own or through other connected objects [1]. With support from Do It Yourself communities, IoT has emerged as a key enabling technology for the 4th Industrial Revolution [2], along with Internet of Services, Cloud Computing, Machine-to-Machine, RFID, Cyber-Physical Systems, Autonomic Systems, Systems of Systems, Robotics, Software Agents, Cooperating Objects [3] and Machine Learning. Recent studies estimate that the number of IoT devices connected to the Internet will reach 38.5 billion by 2020 [4].

The classic methods of monitoring application performance rely on tools such as Nagios, Cacti or Zabbix to log application metrics, and on a lot of human engineering to interpret these metrics and make appropriate decisions. There are multiple layers the applications run on, so system administrators, network administrators and application developers all do the monitoring at regular intervals of time. Cloud computing has been the first technology to challenge this model, with its increased number of applications and services that need to be monitored, from infrastructure components (servers, routers, storage) to cloud computing services and the user experience. Cloud computing vendors like VMware or Microsoft have integrated active monitoring into centralized management and analytics solutions. In cloud computing, the computational resources are monitored at both the physical and virtual layers using agents like VMware vSphere Hypervisor, which reports metrics of the physical machine or host, and VMware View Agent, which addresses virtual machine metrics, as pictured in Fig. 1. All metric values are pulled from the agents and pushed into a time-series database that an analytics platform has access to.



Figure 1 - Monitoring computational resources in Cloud Computing

The IoT ecosystem is composed of 4 layers (Fig.2): the Edge, where the IoT devices are located; the Gateway, where sensor data is initially stored, filtered and curated; the Cloud Platform, which is the layer where data is enriched and processed by analytics tools; and the Presentation layer, where data-centric business services are offered to end users [5].


Figure 2 - IoT generic architecture

The monitoring of the applications in the Presentation layer is based on the classic model, with Nagios, Zabbix and Cacti largely used. The Cloud Platform monitoring is done using vendor-oriented solutions such as VMware vSphere or cloud operating systems such as OpenStack. Because the IoT devices are generally resource-constrained, the monitoring is done at the Gateway layer, along with the monitoring of other Gateway applications. There is also a challenge regarding the number of devices per gateway and the number of gateways in a typical IoT ecosystem, as is the case with plant and soil monitoring over a large area: there would be tens of thousands of sensors and hundreds of gateways deployed in a star-of-stars topology. Manually monitoring the performance and reliability of so many devices would be expensive and inefficient. The system must be able to monitor itself and react, either by sending alerts or by executing series of operations. In this paper we will discuss the monitoring of the devices at the Edge and of the applications deployed on gateways, using prediction models and trends based on time-series data.

2. TIME-SERIES DATABASES

There are specific non-functional requirements for IoT time-series databases that are deployable on gateways:
• labeling and tagging data points is a must due to the large variety of devices;
• labels should be indexable, so filtering by a specific tag or label should be done at database engine level;
• high resolution datapoints;
• the engine should be optimized for intensive writing, almost no updates and deletes done in bulk;
• compressed storage;
• support for service integration through an HTTP API, as the gateways would accommodate a service oriented architecture with a plethora of microservices deployed.
One observation worth noting is that there is no requirement for long-term retention of data. This is because the gateways push data further to the Cloud Platform, where the historical data is analyzed in a greater context. The data at the gateway is used only for near real-time analysis. We will analyze 4 modern monitoring systems that implement the above described requirements: Prometheus, InfluxData, OpenTSDB and Graphite.

2.1. Prometheus

Prometheus is an open-source systems monitoring and alerting toolkit, using LevelDB as a time-series database (TSDB) and featuring:
• a multi-dimensional data model with support for labels;
• a flexible query language that lets the user aggregate time series data in real time;
• no reliance on distributed storage; single server nodes are autonomous;
• time series collection happens via a pull model over HTTP;
• pushing time series is supported via an intermediary gateway;
• targets are discovered via service discovery or static configuration;
• multiple modes of graphing and dashboarding support;
• HTTP API;
• alert manager.
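As an illustration of the HTTP API, an instant query can be issued with a plain HTTP request. The line below is a minimal sketch assuming a Prometheus server listening on its default port 9090; the up metric it queries is generated automatically by Prometheus for every scraped target.
$ curl -s 'http://localhost:9090/api/v1/query?query=up'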

2.2. InfluxData
InfluxData provides a robust, open source and fully customizable time-series data management platform. It uses InfluxDB for storing metrics and IoT sensor data. It features:
• support for labels and data annotations, but unlike Prometheus, InfluxDB attaches the metadata to each event/row, thus increasing the overall overhead and the disk space required;
• high availability with InfluxDB Relay;
• an expressive SQL-like query language;
• continuous queries that automatically compute aggregate data to make frequent queries more efficient;
• a push model, where agents send the metrics to InfluxDB;
• downsampling and resolution adjustment over time;
• HTTP API.
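As a point of comparison, a data point can be pushed to InfluxDB's HTTP API using its line protocol. The example below is only a sketch: the database name iot and the measurement, tag and field names are hypothetical, and the server is assumed to listen on InfluxDB's default port 8086.
$ curl -XPOST 'http://localhost:8086/write?db=iot' --data-binary 'soil_moisture,gateway=gw01,sensor=s17 value=0.42'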

2.3. OpenTSDB
OpenTSDB is a time-series database running on top of Hadoop and HBase, designed specifically for long retention of raw data and greater scalability. It features:
• millisecond resolution;
• HTTP API;
• variable length encoding, which uses less storage space for smaller integer values;
• support for both synchronous and asynchronous writes;
• support for labels, annotations and metadata.

2.4. Graphite
Graphite is an enterprise-scale monitoring system composed of a daemon listening for time-series data, a fixed-size database similar to RRD (round-robin database) and a dashboard-like web application. It features:
• long-term retention, but at the expense of storage efficiency;
• multi-archive storage;
• average-like aggregation with functions such as average, sum, min, max, last;
• support for labels, annotations and metadata.
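Both OpenTSDB and Graphite accept pushed data points over simple protocols. The two one-liners below are sketches only: the metric names, tag values and host names are hypothetical, and the default ports are assumed (4242 for OpenTSDB's HTTP API, 2003 for Graphite's plaintext listener).
$ curl -XPOST 'http://tsdb-host:4242/api/put' -H 'Content-Type: application/json' -d '{"metric":"gw.cpu.load","timestamp":1466424000,"value":0.42,"tags":{"gateway":"gw01"}}'
$ echo "gateways.gw01.cpu.load 0.42 $(date +%s)" | nc graphite-host 2003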

InfluxDB, OpenTSDB and Graphite are passive databases, in the sense that the agents push metrics to the database's interface, while Prometheus adopts a pull model, "scraping" metrics from applications. Another major difference is that Prometheus has built-in aggregation functions and an alert manager subsystem. In this regard, Prometheus is a full monitoring and trending system that includes built-in and active scraping, storing, querying, graphing, and alerting based on time series data. If we take into consideration that the gateways are relatively light computational devices, at least when compared with cloud computing performance, we note that Prometheus has an edge over the competition:
• InfluxDB and Graphite require more storage and have limited aggregation functions, functions which otherwise would have to be implemented on the client side, consequently requiring more computational resources;
• OpenTSDB storage is implemented on top of Hadoop and HBase, requiring the complex deployment of a cluster with multiple nodes from the beginning.
Thus, we will focus on Prometheus, as it supports greater autonomy and does well in resource-constrained environments.

3. DEPLOYING PROMETHEUS ON GATEWAYS

Prometheus consists of multiple components, some of them optional:
• the Prometheus server scrapes and stores the time-series data; it supports a query language which allows for a wide range of operations, including aggregation, slicing and dicing, prediction and joins;
• the push gateway allows ephemeral and batch jobs to expose their metrics to Prometheus; since these kinds of jobs may not exist long enough to be scraped, they can instead push their metrics to a push gateway;
• a browser-based dashboard builder based on Rails/SQL;
• a large variety of special-purpose exporters; an exporter is basically an HTTP resource identified by a URL which contains metrics (key, tags and values) in a specific format;
• an alert manager which takes care of deduplicating, grouping and silencing alerts and routing them to the correct receiver integration such as email, PagerDuty or OpsGenie.
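To give a flavour of the exporter format mentioned above, the text exposition format consists of optional HELP and TYPE comment lines followed by samples with their labels and values. The metric and label names below are hypothetical, chosen only for illustration:
# HELP soil_moisture_ratio Latest soil moisture reading reported by a sensor
# TYPE soil_moisture_ratio gauge
soil_moisture_ratio{gateway="gw01",sensor="s17"} 0.42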


Figure 3 – Prometheus overall architecture. Source: www.prometheus.io

Figure 4 - Example of script for starting Prometheus

Prometheus can be compiled from sources, or precompiled binaries for common operating systems can be downloaded and installed. There is support for Docker images as well. There are two ways of telling Prometheus what targets to use (data scraping locations): either a file-based local configuration, if a high level of autonomy is desired, or solutions that support service discovery, like Kubernetes and Consul.io, for centralized system architectures. When deploying Prometheus at gateway level, we should consider the gateway itself as the first target (in IoT we associate a gateway with single-board computers like Raspberry Pi). To achieve this, we can use the Prometheus node exporter to expose thousands of different types of metrics specific to machines running a Unix-like OS. These metrics cover statistics about cpu, diskstats, conntrack, available entropy, file descriptors, network, hardware devices, virtual devices, vmstat, interrupts, network connections, etc. The node exporter can be started as a background process as below:
$ nohup node_exporter &
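To verify that the metrics are being exposed, a quick check against the exporter's default endpoint could be:
$ curl -s http://localhost:9100/metrics | head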

Then we can add the target to the Prometheus configuration in YAML format, specifying the target URL, the job's name and the scrape interval. By default the node exporter will listen on port 9100.

scrape_configs:
  - job_name: "node"
    scrape_interval: "15s"
    target_groups:
      - targets: ['localhost:9100']

The next step would be to start the Prometheus server. One way to do it is to start it as a subsystem, by placing a script similar to the one below in /etc/init.d.
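A minimal sketch of such a script (assuming the prometheus binary, configuration and data directory live under /opt/prometheus, and reusing the storage flags discussed below) could be:
#!/bin/sh
# /etc/init.d/prometheus - minimal init script sketch (assumed paths)
case "$1" in
  start)
    nohup /opt/prometheus/prometheus \
      -config.file=/opt/prometheus/prometheus.yml \
      -storage.local.path=/opt/prometheus/data \
      -storage.local.retention=720h \
      > /var/log/prometheus.log 2>&1 &
    ;;
  stop)
    pkill -f /opt/prometheus/prometheus
    ;;
  *)
    echo "Usage: $0 {start|stop}"
    ;;
esac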

There are several arguments Prometheus accepts that are very important when considering a deployment on gateways:
• storage.local.chunk-encoding-version: the type 1 encoding allows faster random access at the expense of storage (3 bytes per sample); type 2 has better compression (1.3 bytes) but causes more CPU usage and increased query latency.
• storage.local.retention: measured in hours, it allows configuring the retention time for samples. Because the gateway is used to push curated data upstream, this parameter should have small values, like 30 days or less.
• storage.local.memory-chunks: as a rule of thumb, there should be about 3 memory chunks per series.
• storage.local.series-file-shrink-ratio: a greater value minimizes rewrites, but at the cost of more disk space.

4. QUERYING

Prometheus provides a functional expression language that lets the user select and aggregate time series data in real time. The result of an expression can either be shown as a graph, viewed as tabular data in Prometheus's expression browser, or consumed by external systems via the HTTP API. There are four data types in the Prometheus expression language (PromQL):
• instant vector - a set of time series containing a single sample for each time series, all sharing the same timestamp;
• range vector - a set of time series containing a range of data points over time for each time series;
• scalar - a simple numeric floating point value;
• string - a simple string value; currently unused.
Besides arithmetic, comparison and logical operators, PromQL supports:
• vector matching: operations between vectors attempt to find a matching element in the right-hand-side vector for each entry in the left-hand-side vector;
• aggregation operators like sum, min, max, avg, stddev (standard deviation over dimensions), stdvar (standard variance over dimensions), count, bottomk (smallest k elements by sample value), topk (largest k elements by sample value), count_values (count number of elements with the same value).
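As a short aggregation example (a sketch only: node_cpu and its mode label are the CPU metric exposed by the node exporter, while the instance values depend on the scrape configuration), the following query returns the per-instance rate of non-idle CPU time over the last 5 minutes, combining the rate function listed in Table 1 with the sum aggregation operator:
sum(rate(node_cpu{mode!="idle"}[5m])) by (instance)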

An instant vector can be obtained by simply calling the metric name. For instance, the node exporter exposes a metric called process_cpu_seconds_total, which is a counter telling us the total user and system CPU time spent in seconds. The instant vector is simply process_cpu_seconds_total. A range vector works like an instant vector, except that it selects a range of samples back from the current instant. The range duration can be appended in square brackets to the end of the vector name. For instance, at a scrape interval of 15 seconds, process_cpu_seconds_total[1m] will return the 4 values recorded in the last minute. Prometheus also comes with more than 30 built-in functions that operate on vectors; the most relevant ones are listed in Table 1.

Function name and arguments - Description
abs(v instant-vector) - returns the input vector with all sample values converted to their absolute value
absent(v instant-vector) - returns an empty vector if the vector passed to it has any elements, and a 1-element vector with the value 1 if the vector passed to it has no elements
ceil(v instant-vector) - rounds the sample values of all elements in v up to the nearest integer
changes(v range-vector) - for each input time series, returns the number of times its value has changed within the provided time range, as an instant vector
clamp_max(v instant-vector, max scalar) - clamps the sample values of all elements in v to have an upper limit of max
clamp_min(v instant-vector, min scalar) - clamps the sample values of all elements in v to have a lower limit of min
count_scalar(v instant-vector) - returns the number of elements in a time series vector as a scalar
delta(v range-vector) - calculates the difference between the first and last value of each time series element in a range vector v
deriv(v range-vector) - calculates the per-second derivative of the time series in a range vector v, using simple linear regression
drop_common_labels(instant-vector) - drops all labels that have the same name and value across all series in the input vector
exp(v instant-vector) - calculates the exponential function for all elements in v
floor(v instant-vector) - rounds the sample values of all elements in v down to the nearest integer
histogram_quantile(φ float, b instant-vector) - calculates the φ-quantile (0 ≤ φ ≤ 1) from the buckets b of a histogram
holt_winters(v range-vector, sf scalar, tf scalar) - produces a smoothed value for time series based on the range in v
increase(v range-vector) - calculates the increase in the time series in the range vector
irate(v range-vector) - calculates the per-second instant rate of increase of the time series in the range vector
ln(v instant-vector) - calculates the natural logarithm for all elements in v
log2(v instant-vector) - calculates the binary logarithm for all elements in v
log10(v instant-vector) - calculates the decimal logarithm for all elements in v
predict_linear(v range-vector, t scalar) - predicts the value of time series t seconds from now, based on the range vector v, using simple linear regression
rate(v range-vector) - calculates the per-second average rate of increase of the time series in the range vector
resets(v range-vector) - returns the number of counter resets within the provided time range as an instant vector
round(v instant-vector, to_nearest=1 scalar) - rounds the sample values of all elements in v to the nearest integer
scalar(v instant-vector) - given a single-element input vector, returns the sample value of that single element as a scalar
sort(v instant-vector) - returns vector elements sorted by their sample values, in ascending order
sort_desc(v instant-vector) - returns vector elements sorted by their sample values, in descending order
sqrt(v instant-vector) - calculates the square root of all elements in v
avg|min|max|sum|count_over_time(v range-vector) - the average|minimum|maximum|sum|count value of all points in the specified interval
Table 1 - Prometheus built-in functions

To demonstrate the usage of these functions, let us consider this example: we want to predict how much disk space we will have 1 day from now on the root filesystem partition (mount point "/") on the machine identified by the label instance="serv1". The Prometheus function that does this is predict_linear, which accepts as arguments a range vector (we will take ranges of 1 minute) and a scalar for the interval in seconds. The PromQL query for our use case is:

predict_linear(node_filesystem_avail{instance="serv1",mountpoint="/"}[1m],86400)

For frequent and computationally expensive queries, Prometheus allows the results to be precomputed and saved as new time series based on explicit recording rules, like in the following example:

job:http_inprogress_requests:sum = sum(http_inprogress_requests) by (job)

Here, the recording rule is evaluated at the interval specified by the evaluation_interval field in the Prometheus configuration. During each evaluation cycle, the right-hand-side expression of the rule statement is evaluated at the current instant in time and the resulting sample vector is stored as a new set of time series with the current timestamp and a new metric name (job:http_inprogress_requests:sum).
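Recording rules (and the alerting rules discussed in the next section) are typically kept in rule files referenced from the main configuration through the rule_files setting. The snippet below is a sketch; the file name gateway.rules is hypothetical, and the file would simply contain rule statements such as the one above.
rule_files:
  - "gateway.rules"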

5. ALERTS

Alerting with Prometheus is separated into two parts. Alerting rules in Prometheus servers send alerts to an Alertmanager. The Alertmanager then manages those alerts, including silencing, inhibition, aggregation and sending out notifications via methods such as email, PagerDuty and HipChat. The Alertmanager can be started similarly to Prometheus:

$ nohup alertmanager -config.file=config.yml &

The configuration file holds information about the notification integrations (e.g. email, hipchat, slack, webhook, pagerduty, pushover), routing rules and inhibition rules. A route block defines a node in a routing tree and its children. Its optional configuration parameters are inherited from its parent node if not set. That way, when an alert enters the tree at the configured top-level route, it traverses the child nodes until it "hits" a matching node and consequently a notification is fired. An inhibition rule is a rule that mutes an alert matching a set of matchers under the condition that an alert exists that matches another set of matchers; both alerts must also have a set of equal labels.

The alerting rules are defined similarly to recording rules and are reloaded by Prometheus on receiving a SIGHUP signal. A rule has the following syntax:

ALERT <alert name>
  IF <expression>
  [ FOR <duration> ]
  [ LABELS <label set> ]
  [ ANNOTATIONS <label set> ]

The optional FOR clause causes Prometheus to wait for a certain duration between first encountering a new expression output vector element and counting an alert as firing for this element. The LABELS clause allows specifying a set of additional labels to be attached to the alert. The ANNOTATIONS clause specifies another set of labels that are not identifying for an alert instance; they are used to store longer additional information, such as alert descriptions. In our example with the prediction of the disk space available tomorrow, we want to create an alarm that would be fired (sent to the Alertmanager for dispatching) if tomorrow we will run out of free space. The rule is based on predict_linear, as shown below:

ALERT WeWillRunOutOfSpace
  IF predict_linear(node_filesystem_avail{instance="serv1",mountpoint="/"}[1m],86400) < 1
  FOR 1m
  ANNOTATIONS {
    summary = "No more free disk space tomorrow on {{ $labels.instance }}",
    description = "{{ $labels.instance }} will run out of space (current value: {{ $value }})",
  }
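For completeness, a minimal config.yml for the Alertmanager could look like the sketch below. It is an illustration under assumptions: the SMTP relay, the receiver name and the e-mail address are placeholders, not values used in this deployment.
global:
  smtp_smarthost: 'localhost:25'
  smtp_from: 'alertmanager@gateway.local'
route:
  receiver: 'gateway-admins'
  group_by: ['alertname', 'instance']
receivers:
  - name: 'gateway-admins'
    email_configs:
      - to: 'admin@example.org'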

6. CONCLUSIONS


Time-series databases facilitate predictive forecasting, which has long been a goal for reliability engineers. With a service oriented architecture, solutions such as Prometheus can be used to automate the reaction of the system to certain predictions or to new data insights. We can create alerting rules and have them routed by an alert manager to message brokers such as ActiveMQ or Redis. Or we can create our own reactive manager, with more complex rules and functions than Prometheus offers. Monitoring services that are part of the reactive applications (see Figure 5) and subscribe to the notification stream can then execute explicit instructions for specific events.


Prometheus comes with many useful functions that process the time series. However, they are limited to simple arithmetic and logic operations. For more complex use cases, Machine Learning can be used to classify time-series events based on historical data. Such a solution is TensorFlow, an open source software library for numerical computation and machine learning. Given multiple time series that have causal connections, we can use TensorFlow to train logistic regression models to identify (classify) events that impact the performance of the applications. For instance, the system can be trained to know that an increase in memory usage over time by a certain application signals a memory leak. Of course, in this case we could also use much simpler arithmetic operations based on certain thresholds, but machine learning allows more precision and makes it easier to avoid false alarms. Also, once trained, the system will be able to dynamically classify based on patterns, without human intervention.

Figure 5 - The architecture for automatic monitoring

REFERENCES
1. Balani, Naveen. Enterprise IoT: A Definitive Handbook. ISBN 1518790860.
2. Acatech - National Academy of Science and Engineering. 2016.
3. Vermesan, Ovidiu and Friess, Peter. Internet of Things: Converging Technologies for Smart Environments and Integrated Ecosystems. River Publishers. ISBN 978-87-92982-73-5.
4. Juniper Research. Internet of Things' connected devices to almost triple to over 38 billion units by 2020. [Online] http://www.juniperresearch.com/press/press-releases/iot-connected-devices-to-triple-to-38-bn-by-2020.
5. Boncea, Radu and Bacivarov, Ioan C. Security in Internet of Things: Mitigating the Top Vulnerabilities. Asigurarea Calităţii - Quality Assurance, January-March 2016, Vol. XXII, No. 85, pp. 11-17.
6. Prometheus - Monitoring system & time series database. [Online] [Cited: 06 20, 2016.] https://prometheus.io.
7. Pelkonen, Tuomas, Franklin, Scott, Cavallaro, Paul, Huang, Qi, Meza, Justin, Teller, Justin and Veeraraghavan, Kaushik. Gorilla: A Fast, Scalable, In-Memory Time Series Database. Proceedings of the VLDB Endowment, 2015, Vol. 8, pp. 1816-1827.
8. Gilchrist, Alasdair. The Technical and Business Innovators of the Industrial Internet. In: Industry 4.0. Apress, pp. 33-64.
9. Andreolini, Mauro, Pietri, Marcello, Tosi, Stefania and Lancellotti, Riccardo. A Scalable Monitor for Large Systems. Cloud Computing and Services Sciences. Springer International Publishing, 2015, pp. 100-116.


