metrics
An Ansible role which configures performance analysis services for the managed host. This (optionally) includes a list of remote systems to be monitored by the managed host.
Requirements
Performance Co-Pilot (PCP) v5+. All of the packages are available from the standard repositories on Fedora, CentOS 8, and RHEL 8. On RHEL 7 and RHEL 6, you will need to enable the Optional repository/channel on the managed host.
The role can optionally use Grafana v6+ (metrics_graph_service) and Valkey (metrics_query_service) on Fedora, CentOS 10, RHEL 10 and later, or Redis v5+ (metrics_query_service) on CentOS 8 or 9, RHEL 8 or 9.
Collection requirements
The role requires the firewall role and the selinux role from the fedora.linux_system_roles collection, if metrics_manage_firewall and metrics_manage_selinux are set to true, respectively. (Please see also the variables in the Role Variables section.)
If the metrics role is installed from the fedora.linux_system_roles collection or from the Fedora RPM package, the requirement is already satisfied.
The role requires additional collections to manage rpm-ostree systems. If you need to manage rpm-ostree systems, run the command below to install the collections.
ansible-galaxy collection install -r meta/collection-requirements.yml
Role Variables
metrics_monitored_hosts: []
List of remote hosts to be analysed by the managed host. These hosts will have metrics recorded on the managed host, so care should be taken to ensure sufficient disk space exists below /var/log for each host.
Example:
metrics_monitored_hosts: ["webserver.example.com", "database.example.com"]
metrics_webhook_endpoint: ''
Webhook endpoint (URL) to which notifications about any automatically detected performance issues are sent. By default, these events are logged to the local system log only.
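As a minimal sketch, the endpoint can be set as a play variable. The URL below is a placeholder chosen for illustration, not a real service.

```yaml
---
- name: Manage metrics with webhook notifications
  hosts: all
  vars:
    # Placeholder URL; replace with the receiver used in your environment.
    metrics_webhook_endpoint: "https://alerts.example.com/hooks/metrics"
  roles:
    - linux-system-roles.metrics
```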
metrics_retention_days: 14
Retain historical performance data for the specified number of days; after this time it will be removed (day by day).
metrics_graph_service: false
Boolean flag allowing the host to be set up with graphing services. Enabling this starts PCP and Grafana servers for visualizing PCP metrics. This option requires Grafana v6+, which is available on Fedora, CentOS 8, RHEL 8, or later versions of these platforms.
metrics_query_service: false
Boolean flag allowing the host to be set up with time series query services. Enabling this starts PCP and Valkey or Redis servers for querying any recorded PCP metrics. This option requires either Valkey or Redis v5+, which is available on Fedora, CentOS 8, RHEL 8, or later versions of these platforms (Valkey is the preferred solution on Fedora, CentOS 10, RHEL 10 and later).
metrics_into_elasticsearch: false
Boolean flag allowing metric values to be exported into Elasticsearch.
metrics_from_elasticsearch: false
Boolean flag allowing metrics from Elasticsearch to be made available.
metrics_from_postfix: false
Boolean flag allowing metrics from Postfix to be made available.
metrics_from_mssql: false
Boolean flag allowing metrics from SQL Server to be made available. Enabling this flag requires a 'trusted' connection to SQL Server.
metrics_from_bpftrace: false
Boolean flag allowing metrics from bpftrace to be made available.
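The metrics_from_* and metrics_into_* flags can be combined in a single play. The sketch below assumes Postfix and Elasticsearch are already installed and running on the managed hosts; that assumption comes from this example, not from the role's description.

```yaml
---
- name: Manage metrics with additional metric sources
  hosts: all
  vars:
    # Make Postfix and Elasticsearch metrics available through PCP.
    metrics_from_postfix: true
    metrics_from_elasticsearch: true
  roles:
    - linux-system-roles.metrics
```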
metrics_username: metrics
An account to establish authenticated access to remote metrics via the PCP pmcd daemon. For more information, see https://pcp.readthedocs.io/en/latest/QG/AuthenticatedConnections.html.
Additionally, if the bpftrace metrics are configured, this user account will be able to register bpftrace scripts.
metrics_password: metrics
Do not use a clear text metrics_password. Use Ansible Vault to encrypt the password. Mandatory authentication for executing dynamic bpftrace scripts.
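One possible way to keep the password out of the playbook is to define it in a vaulted variables file and reference it by name. In this sketch the file path vars/vault.yml and the variable vault_metrics_password are hypothetical names chosen for illustration.

```yaml
---
- name: Manage metrics with bpftrace and a vaulted password
  hosts: all
  vars_files:
    # Hypothetical file, encrypted with `ansible-vault encrypt`,
    # that defines vault_metrics_password.
    - vars/vault.yml
  vars:
    metrics_from_bpftrace: true
    metrics_username: metrics
    metrics_password: "{{ vault_metrics_password }}"
  roles:
    - linux-system-roles.metrics
```

The play would then be run with a vault option such as ansible-playbook --ask-vault-pass so that the file can be decrypted.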
metrics_provider: pcp
The metrics collector to use to provide metrics.
Currently Performance Co-Pilot is the only supported metrics provider. When using the PCP provider, these TCP ports will be used: 44321 (pmcd, live metric value sampling), 44322 (pmproxy, with metrics_query_service or metrics_graph_service), 6379 (either valkey-server or redis-server for metrics_query_service) and 3000 (grafana-server for metrics_graph_service).
metrics_manage_firewall: false
Boolean flag allowing the role to configure the firewall using the firewall role. When enabled, the role manages the pmcd port, the pmproxy port, the Grafana port and either the Valkey or Redis port, depending upon the configuration parameters. If the variable is set to false, the metrics role does not manage the firewall.
NOTE: metrics_manage_firewall is limited to adding ports. It cannot be used for removing ports. If you want to remove ports, you will need to use the firewall system role directly.
NOTE: the firewall management is not supported on RHEL 6.
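Removing a port therefore means calling the firewall system role itself. The following is a sketch only: it assumes the firewall role is installed under the name linux-system-roles.firewall (mirroring the naming used for the metrics role in this document) and uses that role's firewall variable with a port entry and state: disabled. Check the firewall role's own documentation before relying on it.

```yaml
---
- name: Close the Grafana port previously opened for metrics
  hosts: all
  vars:
    firewall:
      # 3000/tcp is the grafana-server port listed for metrics_graph_service.
      - port: 3000/tcp
        state: disabled
  roles:
    # Assumed role name; adjust to match how the firewall role is installed.
    - linux-system-roles.firewall
```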
metrics_manage_selinux: false
Boolean flag allowing the role to configure SELinux using the selinux role. When enabled, the role assigns the pmcd port, the pmproxy port, the Grafana port and either the Valkey or Redis port, depending upon the configuration parameters. If the variable is set to false, the metrics role does not manage SELinux.
Please note that the pmcd and pmproxy service ports are in the "ephemeral" range, requiring no special setup, and the Grafana port is "unregistered". The Valkey or Redis ports are gated by the valkey_port_t or redis_port_t SELinux types respectively, and may need to be further configured if you require direct access (not required if you are accessing them from metrics role tools like Grafana and PCP). Use the selinux system role to manage port access for SELinux contexts.
NOTE: metrics_manage_selinux is limited to adding policy. It cannot be used for removing policy. If you want to remove policy, you will need to use the selinux system role directly.
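As an illustration, both management flags can be enabled alongside the graph and query services, so that the ports those services use are opened in the firewall and assigned in SELinux policy. This sketch assumes the firewall and selinux roles from the Collection requirements section are installed.

```yaml
---
- name: Manage metrics, firewall ports and SELinux port policy
  hosts: all
  vars:
    metrics_graph_service: true
    metrics_query_service: true
    # Both flags only add ports or policy; they never remove them.
    metrics_manage_firewall: true
    metrics_manage_selinux: true
  roles:
    - linux-system-roles.metrics
```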
Example Playbook
Basic metric recording setup for each managed host only, with one week's worth of data retained before culling.
---
- name: Manage metrics service
  hosts: all
  vars:
    metrics_retention_days: 7
  roles:
    - linux-system-roles.metrics
Scalable metric recording, analysis and visualization setup for the managed hosts, providing a REST API server with an OpenMetrics endpoint, graphs and scalable querying.
---
- name: Manage metrics with graph and query services
  hosts: all
  vars:
    metrics_graph_service: true
    metrics_query_service: true
  roles:
    - linux-system-roles.metrics
Centralized metric recording setup for several remote hosts and scalable metric recording, analysis and visualization setup for the local host, providing a REST API server with an OpenMetrics endpoint, graphs and scalable querying.
---
- name: Manage centralized metrics gathering
  hosts: monitors
  vars:
    metrics_monitored_hosts: [app.example.com, db.example.com, nas.example.com]
    metrics_graph_service: true
    metrics_query_service: true
  roles:
    - linux-system-roles.metrics
rpm-ostree
See README-ostree.md
License
MIT