Provisioning, operating and maintaining distributed applications requires fine-grained capacity planning, objective definition and consumption monitoring.
The daily mood
We had an interesting meeting with the team about API security, then I had my 1-1 with my Manager. He suggested that I move from tutorial applications to real-world applications when evaluating Helm chart composition tools like Helmfile, Kustomize and others, which I will probably start tomorrow.
In the meantime I am also ramping up on observability, i.e. the practice of designing systems and applications with the expectation that someone will need to watch them. As a growing provider of Platform services, our R&D organization obviously has to care about monitoring resource metrics and component logs, but that is easier said than done. We define 3 levels of observability:
- Foundation supports better operations
- Core supports better usability
- Intelligence supports better automation
This post talks about monitoring in general, then moves on to technical monitoring in Kubernetes and finally takes a look at Prometheus, a general-purpose monitoring system originally inspired by Google's Borgmon and developed at SoundCloud. It has been hosted by the Cloud Native Computing Foundation (CNCF) since 2016.
What is monitoring?
- Why monitor:
- Prevent issues, better troubleshoot and react if they occur
- Improve service performance and reliability
- Optimize capacity and reduce costs
- Types of monitoring
- White-box = with knowledge of the monitored system/component
- Black-box = without knowledge of the internals, i.e. observing only externally visible behaviour
- Hierarchy of monitoring
- Workflow monitoring and Business activity monitoring (BAM) -> Key Performance Indicators (KPI)
- Process monitoring and Application performance monitoring (APM) -> Functional supervision
- System and Service monitoring (SSM) -> Technical supervision (ex. infrastructure resources)
- Levels of maturity:
- Logging: Execution information (trace) in semi-structured text format
- Observability: Ability for a component to expose its internal state (cf. Observer Pattern in Event-Driven Architecture) in the form of timestamped measures (metrics)
- Monitoring: Collection, storage and analysis of logs and/or metrics
- Alerting: Rule based behavioural detection, ticketing, notification, call-to-action
- AI: ML prediction, prescription
In this article we will focus on System and service monitoring (SSM) with alerting.
What to monitor
The USE method is probably the most academic way of monitoring infrastructure components. This is what traditional monitoring solutions like Nagios and Ganglia focus on.
| Term | Definition |
| --- | --- |
| Resources | All physical resources (CPU, memory, disk, network) |
| Utilization | Avg. time the resource is busy (%) |
| Saturation | Degree of load the resource cannot serve (%) |
| Errors | Failed (events/sec) |
Source: Brendan D. Gregg (currently working at Netflix)
The Four Golden Signals are a white-box monitoring approach defining the minimum set of metrics that a DevOps team should collect from a system in order to maintain its reliability. Here is a comprehensive article on how to collect those metrics per application type (load balancer, web server, database, etc.).
| Signal | Definition |
| --- | --- |
| Latency | Time (sec/event) |
| Traffic | Handled (events/sec) |
| Errors | Failed (events/sec) |
| Saturation | Resource load (% of system) |
Source: Rob Ewaschuk via Google SRE Book
The RED method is a black-box monitoring approach and essentially a subset of the Four Golden Signals. It is service-consumer centric and therefore well suited to microservices architectures. In a future post, we might talk about how the Spring Boot Actuator module exposes JMX and HTTP endpoints to enable microservices monitoring and auditing.
| Metric | Definition |
| --- | --- |
| Rate | Handled (requests/sec) |
| Errors | Failed (requests/sec) |
| Duration | Time (sec/request) |
Source: Tom Wilkie (currently working at Weaveworks)
Monitoring in Kubernetes
Originally, Kubernetes nodes were monitored via a project called Heapster, which pushed metrics to a time-series database (TSDB) and is now end-of-life. In 2017, the Kubernetes SIG Instrumentation group defined 2 new APIs: the resource metrics API and the custom metrics API. Standard implementations of the corresponding APIs are listed here. Of course it is also possible to process external metrics besides these APIs.
The difference between resource metrics and custom metrics is that resource metrics are pre-existing (e.g. CPU) and pre-aggregated (e.g. averaged), whereas custom metrics need to be created, e.g. from resource objects, by the logical layer. A typical use-case for custom metrics is implementing a controller for the Kubernetes Horizontal Pod Autoscaler (HPA).
Source: Partly Cloudy, Custom metrics for horizontal scalability by Christian Dennig
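Both APIs are served through the Kubernetes API aggregation layer, so they can be queried directly with kubectl get --raw. A small sketch (jq is optional and only used for pretty-printing; the custom metric name http_requests is purely hypothetical and depends on the adapter configuration):
# Resource metrics API, served e.g. by the metrics-server
$ kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes" | jq .
# Custom metrics API, served e.g. by an adapter such as k8s-prometheus-adapter
$ kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/*/http_requests" | jq .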
Metric collection
The canonical implementation of the resource metrics API is the metrics-server, an observer which pulls metrics from the kubelet stats API, then aggregates and stores them in the cluster. The metrics-server is accessible via a REST API, which allows kubectl top, the Kubernetes Dashboard and other third-party client tools such as k9s and kube-capacity to easily display compute (CPU) and memory (MEM) utilisation at both node and resource (i.e. pod) level.
$ kubectl top node
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
prec-5520 3094m 38% 24481Mi 76%
K9s header with metrics-server activated
$ ./kube-capacity --util --sort cpu.util
NODE CPU REQUESTS CPU LIMITS CPU UTIL MEMORY REQUESTS MEMORY LIMITS MEMORY UTIL
prec-5520 5405m (67%) 18800m (235%) 2773m (34%) 15669Mi (49%) 23937Mi (74%) 24557Mi (76%)
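If kubectl top stays empty, a quick way to verify that the resource metrics API is actually registered and served is to look at the corresponding APIService object (v1beta1.metrics.k8s.io is the name registered by the metrics-server):
$ kubectl get apiservice v1beta1.metrics.k8s.io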
As stated by the metrics-server documentation, the current version is limited to 2 "advanced" use-cases:
- Horizontal Pod Autoscaler (HPA)
- Scheduler
There are plans to extend the metrics-server capabilities in the future, for example with support for time series in kubectl top and the Kubernetes Dashboard, as well as custom application metrics. In fact, Heapster is no longer developed and the metrics-server is not yet mature enough for the enterprise. Go for something else.
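As a quick illustration of the first supported use-case, the HPA can scale a deployment based on CPU utilisation coming straight from the metrics-server; a minimal sketch (the deployment name my-app is hypothetical):
$ kubectl autoscale deployment my-app --cpu-percent=70 --min=2 --max=5
$ kubectl get hpa my-app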
In addition to the metrics-server, Prometheus had already adopted the efficient practice of scraping HTTP endpoints instead of relying on agents back when Heapster was the default, and therefore gained broad adoption across the Kubernetes ecosystem.
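In practice, "scraping" just means that Prometheus periodically issues an HTTP GET against a /metrics endpoint exposed by the component itself and parses the plain-text samples in the Prometheus exposition format. A hypothetical example (pod IP, port and metric values are placeholders):
$ curl -s http://<pod-ip>:8080/metrics | head -n 3
# HELP http_requests_total Total number of HTTP requests.
# TYPE http_requests_total counter
http_requests_total{code="200",method="get"} 1027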
Custom metrics
Because of its popularity, Prometheus also had the first implementation of the Kubernetes custom metrics API, the k8s-prometheus-adapter, making it the solution of choice for most Kubernetes monitoring requirements.
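To make the custom metrics path concrete, here is a sketch of an HPA scaling on a Prometheus-derived pods metric; the deployment name my-app and the metric name http_requests_per_second are hypothetical and would have to match the adapter rules:
$ kubectl apply -f - <<'EOF'
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second
      target:
        type: AverageValue
        averageValue: "10"
EOF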
Prometheus setup
Prometheus is a popular monitoring tool for pulling, storing and accessing metrics as time-series data. It can be set up in different ways, for example locally, but of course it makes the most sense to set it up directly inside the monitored cluster. There is an official Prometheus Helm chart available, and a dedicated microk8s add-on which basically deploys the Prometheus Operator (owned by Red Hat since they acquired CoreOS in 2018) along with a bunch of example rules, dashboards and alerts which I believe are forked from the Mixin project. With this, Prometheus follows the default configuration and recommendation to watch all namespaces and applications. You can obviously change that as per the documentation.
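For reference, installing the chart via Helm is also only a couple of commands; a sketch assuming the prometheus-community chart repository:
$ helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
$ helm repo update
$ helm install prometheus prometheus-community/prometheus -n monitoring --create-namespace
On microk8s, however, the dedicated add-on does everything in one step: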
$ microk8s.enable prometheus
$ kubectl get pods -n monitoring
NAME READY STATUS RESTARTS AGE
alertmanager-main-0 2/2 Running 0 6m37s
grafana-7789c44cc7-j8t8z 1/1 Running 0 6m42s
kube-state-metrics-78c549dd89-khgcp 4/4 Running 0 6m28s
node-exporter-svzc2 2/2 Running 0 6m43s
prometheus-adapter-644b448b48-zkfvl 1/1 Running 0 6m43s
prometheus-k8s-0 3/3 Running 1 6m29s
prometheus-operator-7695b59fb8-v9qds 1/1 Running 0 6m43s
Storage
Prometheus comes with its own single-node storage by default, an on-disk time-series database (TSDB). Storage scalability and/or durability can be achieved via the numerous client libraries and third-party integration connectors to long-term storage backends such as external databases and file systems.
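A common way to hook up such a long-term backend is through Prometheus' remote read/write configuration; a minimal sketch of the relevant prometheus.yml excerpt (the endpoint URL is of course hypothetical and depends on the chosen backend):
# prometheus.yml (excerpt), the remote endpoints are hypothetical
remote_write:
  - url: "https://long-term-store.example.com/api/v1/write"
remote_read:
  - url: "https://long-term-store.example.com/api/v1/read"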
Access
Prometheus basically offers 3 points of access to monitoring data:
- Browser-friendly endpoint on the Prometheus server (a simplistic webapp for querying metrics, rules and alerts using its own query language, PromQL)
- Console templates (scripting library supporting Go templates)
- Grafana (standard datasource)
The webapp is exposed insecurely on port 9090.
sensible-browser http://$(kubectl -n monitoring describe pod prometheus-k8s-0 | grep IP: | awk '{print $2}'):9090/graph
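Besides the graph UI, the same server exposes an HTTP API that is handy for scripting; a small sketch reusing the pod IP resolved the same way as above (jq is optional, only used for pretty-printing):
$ PROM_IP=$(kubectl -n monitoring describe pod prometheus-k8s-0 | grep ^IP: | awk '{print $2}')
$ curl -s "http://$PROM_IP:9090/api/v1/query?query=up" | jq .
$ curl -s "http://$PROM_IP:9090/api/v1/query" --data-urlencode 'query=sum(rate(container_cpu_usage_seconds_total[5m])) by (namespace)' | jq .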
In microk8s, Grafana runs on port 3000. It automatically provisions a user admin/admin as well as a number of standard Prometheus dashboards.
sensible-browser http://`kubectl -n monitoring describe pod $(kubectl -n monitoring get pods | grep grafana | cut -d' ' -f1) | grep ^IP: | awk '{print $2}'`:3000
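If you prefer not to hit the pod IP directly, port-forwarding the service works just as well (assuming the service is named grafana, as deployed in the monitoring namespace):
$ kubectl -n monitoring port-forward svc/grafana 3000:3000
$ sensible-browser http://localhost:3000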
Dashboard K8s / USE Method / Node
Alerting
Prometheus comes with a notification service called Alertmanager, which routes and dispatches the alerts fired by rules described in YAML format. For example, you could define an alerting rule that sends a notification via e-mail or PagerDuty when a disk is getting 90% full.
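Since this setup relies on the Prometheus Operator, such a rule can be declared as a PrometheusRule resource; here is a minimal sketch of the disk example (the rule labels must match the ruleSelector of your Prometheus resource, and the expression relies on the node-exporter filesystem metrics):
$ kubectl apply -n monitoring -f - <<'EOF'
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: disk-alerts
  labels:
    prometheus: k8s
    role: alert-rules
spec:
  groups:
  - name: node-disk
    rules:
    - alert: NodeDiskAlmostFull
      # fire when a filesystem has had less than 10% space available for 10 minutes
      expr: (node_filesystem_avail_bytes{fstype!="tmpfs"} / node_filesystem_size_bytes{fstype!="tmpfs"}) < 0.10
      for: 10m
      labels:
        severity: warning
      annotations:
        summary: "Filesystem on {{ $labels.instance }} is more than 90% full"
EOF
The routing of the resulting alerts to e-mail, PagerDuty etc. is then configured separately in the Alertmanager configuration (routes and receivers).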
amtool is a command line utility for interacting with alertmanager. You may install it locally from a release or use the one on your cluster directly:
$ kubectl exec alertmanager-main-0 -- sh -c "/bin/amtool alert --alertmanager.url=http://localhost:9093"
Defaulting container name to alertmanager.
Use 'kubectl describe pod/alertmanager-main-0 -n monitoring' to see all of the containers in this pod.
Alertname                        Starts At                Summary
DeadMansSwitch                   2020-06-08 18:10:45 UTC
KubeClientCertificateExpiration  2020-06-08 18:11:47 UTC
KubeClientCertificateExpiration  2020-06-08 18:11:47 UTC
KubeCPUOvercommit                2020-06-08 18:16:47 UTC
KubeMemOvercommit                2020-06-08 18:16:47 UTC
TargetDown                       2020-06-08 18:21:15 UTC
KubeletDown                      2020-06-08 18:25:47 UTC
KubeControllerManagerDown        2020-06-08 18:25:47 UTC
KubeSchedulerDown                2020-06-08 18:25:47 UTC
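Beyond listing alerts, amtool can also manage silences; for example, to silence one of the alerts listed above for a while (the author and comment are arbitrary):
$ kubectl exec alertmanager-main-0 -c alertmanager -- /bin/amtool silence add alertname=KubeCPUOvercommit --author=ops --comment="known overcommit on this lab cluster" --alertmanager.url=http://localhost:9093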
Since alerting rules might have to change frequently, the Promgen web UI is potentially a good solution for managing the corresponding configurations.
Prometheus metrics architecture overview and custom pipelines
Source: Prometheus.io
Alternatives to Prometheus
The most established provider is certainly InfluxData. They maintain different open-source projects such as Telegraf for collecting and forwarding metrics, which are then made available both at rest in their time-series database InfluxDB and in motion through their streaming engine Kapacitor.
Source: InfluxData
See also
- https://www.digitalocean.com/community/tutorials/an-introduction-to-metrics-monitoring-and-alerting
- https://www.infoq.com/articles/effective-monitoring-habits/
- https://www.infoq.com/articles/prometheus-monitor-applications-at-scale/
- https://chris-vermeulen.com/how-to-monitor-your-kubernetes-cluster-with-prometheus-and-grafana/