Day 2 Ops for Red Hat OpenStack Platform

Red Hat OpenStack 12 is out, so it’s time to re-test the Day 2 opstools.
Red Hat Cloudforms has been providing Day 2 operations and monitoring services for OpenStack for as long as I can remember. However, starting with Red Hat OSP 10 some new tools have been added to ease operations. With Red Hat OSP 11 we ended up with three new agents – fluentd, sensu and collectd.

I am not going to focus on installing and integrating Red Hat Cloudforms; I will leave that piece for a future blog post. There is a lot of great documentation available on the Red Hat website for OpenStack + Cloudforms integration. This is a good start:

Instead of Cloudforms, I will describe the installation and configuration of the new third-party agents and their integration with third-party dashboards.

Agents and what they do:

Fluentd – an open source data collector for logging
Integrates with: Elasticsearch, Kibana

Sensu – monitors server, service and application health
Integrates with: Uchiwa

Collectd – gathers metrics from various sources: the operating system, applications, logfiles and external devices
Integrates with: Grafana

For this effort, I have built a quick reference architecture lab:

Please note: 1 undercloud node, 3 controllers, 2 computes and 3 ceph nodes connected to standard TripleO networks.
At the top is a new node – opstools – connected only to the public network, running vanilla RHEL 7.4 and subscribed to OSP12 repositories.

Installation of the opstools server:

On a pre-installed RHEL7 node:

[root@opstools ~]# yum install git ansible

[root@opstools ~]# git clone

[root@opstools ~]# cd opstools-ansible/

[root@opstools opstools-ansible]# ssh-copy-id root@localhost


Two files need to be defined before executing the playbook – the hosts inventory file and config.yml, which defines passwords, ports, network settings, security, etc.


[root@opstools opstools-ansible]# vi inventory/hosts
opstools ansible_host=localhost ansible_user=root ansible_become=true





[root@opstools opstools-ansible]# vi config.yml
grafana_username: admin
grafana_password: changeme
  - username: 'uchiwa'
    password: 'changeme'
  - username: 'kibana'
    password: 'changeme'
data_storage: graphite


All the settings are described here:


Install all the dashboards with a single playbook:

[root@opstools opstools-ansible]# ansible-playbook playbook.yml -e @config.yml


The playbook is decent, but it’s being modified constantly in true CI/CD fashion, so it’s not unusual to hit a small bug. These are usually very easy to correct: simply run the playbook with -vvvv to better identify the failing component, fix it and re-run. Most of the issues I hit were due to a missing repository or a typo in a package name.
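When chasing one of these failures, a quick grep over a captured verbose run usually pinpoints the failing task. A minimal sketch – the log contents below are a fabricated stand-in for real ansible output:

```shell
# Capture a verbose run for inspection, e.g.:
#   ansible-playbook playbook.yml -e @config.yml -vvvv 2>&1 | tee opstools-run.log
# Fabricated log standing in for real output:
cat > opstools-run.log <<'EOF'
TASK [grafana : install packages] *******************
fatal: [opstools]: FAILED! => {"msg": "No package matching 'grafana' found"}
EOF

# Show only the failing tasks, with line numbers
grep -inE 'fatal:|FAILED' opstools-run.log
```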


A successfully deployed opstools server will produce the following message:
PLAY RECAP *****************************************************************************************************************************************************************************************************************************************************************************************
opstools                   : ok=187  changed=39   unreachable=0    failed=0   
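The recap line can also be checked mechanically, which is handy for scripted installs. A minimal sketch (the recap string is pasted here for illustration):

```shell
# Recap line from the playbook run (pasted here for illustration)
recap='opstools                   : ok=187  changed=39   unreachable=0    failed=0'

# A run is healthy only when nothing failed and nothing was unreachable
if echo "$recap" | grep -Eq 'unreachable=0 +.*failed=0$'; then
  echo "deployment OK"
else
  echo "deployment FAILED"
fi
```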


After the playbook completes, you can verify the functionality of the 3 dashboards by opening them in your web browser:
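For reference, the three dashboards live under the opstools server’s public address. The IP below is a placeholder – substitute your own; a `curl -k` works just as well as a browser for a first smoke test:

```shell
# Placeholder address for the opstools node – substitute your own
OPSTOOLS_IP=192.168.122.100

for dash in kibana uchiwa grafana; do
  echo "https://${OPSTOOLS_IP}/${dash}"
  # e.g. curl -k -o /dev/null -s -w '%{http_code}\n' "https://${OPSTOOLS_IP}/${dash}"
done
```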

Installation on OpenStack:

First, let me start with steps that are specific to OSP12. These steps will not apply to OSP11 or OSP10 (but will probably apply to OSP13 and above).
Since OSP12 introduced containerization of overcloud services, we need to ensure that we provide containers for fluentd, sensu and collectd.

When preparing container images, I have included the YAML configuration files for all 3 agents:

[osp12 specific]
(undercloud) [stack@chrisj-undercloud ~]$ openstack overcloud container image prepare \
--namespace \
--set \
--environment-file /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-ansible.yaml \
--environment-file /usr/share/openstack-tripleo-heat-templates/environments/logging-environment.yaml \
--environment-file /usr/share/openstack-tripleo-heat-templates/environments/monitoring-environment.yaml \
--environment-file /usr/share/openstack-tripleo-heat-templates/environments/collectd-environment.yaml


[osp12 specific]
(undercloud) [stack@chrisj-undercloud ~]$ openstack overcloud container image prepare \
--images-file ~/container-images.yaml \
--namespace \
--tag 12.0-20180124.1 \
--set \
--environment-file /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-ansible.yaml \
--environment-file /usr/share/openstack-tripleo-heat-templates/environments/logging-environment.yaml \
--environment-file /usr/share/openstack-tripleo-heat-templates/environments/monitoring-environment.yaml \
--environment-file /usr/share/openstack-tripleo-heat-templates/environments/collectd-environment.yaml

[osp12 specific]
(undercloud) [stack@chrisj-undercloud ~]$ openstack overcloud container image prepare --env-file ~/templates/docker-registry.yaml --namespace --tag 12.0-20180124.1 --set ceph_namespace= --environment-file /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-ansible.yaml \
--environment-file /usr/share/openstack-tripleo-heat-templates/environments/logging-environment.yaml \
--environment-file /usr/share/openstack-tripleo-heat-templates/environments/monitoring-environment.yaml \
--environment-file /usr/share/openstack-tripleo-heat-templates/environments/collectd-environment.yaml
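After the prepare step, it’s worth confirming that the three agent images actually made it into the generated image list. A hedged sketch – the file below is a fabricated stand-in, and the exact registry and image names depend on your environment:

```shell
# Fabricated stand-in for the file produced by
# `openstack overcloud container image prepare` (names are assumptions)
cat > container-images.yaml <<'EOF'
container_images:
- imagename: registry.example.com/rhosp12/openstack-fluentd:12.0-20180124.1
- imagename: registry.example.com/rhosp12/openstack-sensu-client:12.0-20180124.1
- imagename: registry.example.com/rhosp12/openstack-collectd:12.0-20180124.1
EOF

# All three opstools agents should be present
grep -cE 'fluentd|sensu|collectd' container-images.yaml
```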

Continue with the standard overcloud preparation.
The steps below are valid for all OSP versions.


Next, copy the default opstools configuration YAML files to your local templates directory:
(undercloud) [stack@chrisj-undercloud ~]$ cp /usr/share/openstack-tripleo-heat-templates/environments/logging-environment.yaml templates/
(undercloud) [stack@chrisj-undercloud ~]$ cp /usr/share/openstack-tripleo-heat-templates/environments/monitoring-environment.yaml templates/
(undercloud) [stack@chrisj-undercloud ~]$ cp /usr/share/openstack-tripleo-heat-templates/environments/collectd-environment.yaml templates/


Edit the files to include information about the opstools server and the metrics that need to be tracked.

(undercloud) [stack@chrisj-undercloud templates]$ vi logging-environment.yaml
## A Heat environment file which can be used to set up
## logging agents

  OS::TripleO::Services::FluentdClient: /usr/share/openstack-tripleo-heat-templates/docker/services/fluentd-client.yaml


## Simple configuration
   - host:
     port: 24224
#   - host:
#     port: 24224
## Example SSL configuration
## (note the use of port 24284 for ssl connections)
# LoggingServers:
#   - host:
#     port: 24284
# LoggingUsesSSL: true
# LoggingSharedKey: secret
# LoggingSSLCertificate: |
#   ...certificate data here...
#   -----END CERTIFICATE-----
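Pieced together, a minimal non-SSL version of this file looks roughly like the sketch below. The IP is a placeholder for your opstools server – substitute your own:

```yaml
resource_registry:
  OS::TripleO::Services::FluentdClient: /usr/share/openstack-tripleo-heat-templates/docker/services/fluentd-client.yaml

parameter_defaults:
  LoggingServers:
    - host: 192.168.122.100   # placeholder – your opstools server
      port: 24224
```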


(undercloud) [stack@chrisj-undercloud templates]$ vi monitoring-environment.yaml
## A Heat environment file which can be used to set up monitoring agents

  OS::TripleO::Services::SensuClient: /usr/share/openstack-tripleo-heat-templates/docker/services/sensu-client.yaml

  MonitoringRabbitPort: 5672
  MonitoringRabbitUserName: sensu
  MonitoringRabbitPassword: sensu
#  MonitoringRabbitUseSSL: false
#  MonitoringRabbitVhost: "/sensu"
#  SensuClientCustomConfig:
#    api:
#      warning: 10
#      critical: 20
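An assembled minimal version of this file might look like the sketch below. The IP is a placeholder – MonitoringRabbitHost points the sensu clients at the opstools server:

```yaml
resource_registry:
  OS::TripleO::Services::SensuClient: /usr/share/openstack-tripleo-heat-templates/docker/services/sensu-client.yaml

parameter_defaults:
  MonitoringRabbitHost: 192.168.122.100   # placeholder – your opstools server
  MonitoringRabbitPort: 5672
  MonitoringRabbitUserName: sensu
  MonitoringRabbitPassword: sensu
```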

(undercloud) [stack@chrisj-undercloud templates]$ vi collectd-environment.yaml
  OS::TripleO::Services::Collectd: /usr/share/openstack-tripleo-heat-templates/docker/services/collectd.yaml

## Collectd server configuration
#### Other config parameters, the values shown here are the defaults
#   CollectdServerPort: 25826
#   CollectdSecurityLevel: None
#### If CollectdSecurityLevel is set to Encrypt or Sign
#### the following parameters are also needed
#   CollectdUsername: user
#   CollectdPassword: password
## CollectdDefaultPlugins, These are the default plugins used by collectd
     - disk
     - interface
     - load
     - memory
     - processes
     - tcpconns
## Extra plugins can be enabled by the CollectdExtraPlugins parameter:
## All the plugins availables are:
     - disk
     - df
     - cpu
## You can use ExtraConfig (or one of the related *ExtraConfig keys)
## to configure collectd.  See the documentation for puppet-collectd at
## for details.
       - "/^[vhs]d[a-f][0-9]?$/"
       - "/"
     collectd::plugin::df::ignoreselected: false
     collectd::plugin::cpu::valuespercentage: true
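Assembled, a minimal version of this file might look like the following sketch. The IP is a placeholder – CollectdServer points the agents at the opstools node:

```yaml
resource_registry:
  OS::TripleO::Services::Collectd: /usr/share/openstack-tripleo-heat-templates/docker/services/collectd.yaml

parameter_defaults:
  CollectdServer: 192.168.122.100   # placeholder – your opstools server
  CollectdServerPort: 25826
  CollectdExtraPlugins:
    - disk
    - df
    - cpu
  ExtraConfig:
    collectd::plugin::df::ignoreselected: false
    collectd::plugin::cpu::valuespercentage: true
```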


Please note the resource_registry section with the absolute path change. Additional information on setting up these files can be found here:


Finally, make sure to include the newly created files in your deploy command. Example:
(undercloud) [stack@chrisj-undercloud ~]$ cat

source ~/stackrc
cd ~/
time openstack overcloud deploy --templates --stack chrisj \
     --ntp-server \
     -e templates/network-environment.yaml \
     -e templates/node-info.yaml \
     -e templates/docker-registry.yaml \
     -e templates/host-memory.yaml \
     -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
     -e /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-ansible.yaml \
     -e templates/ceph-custom-config.yaml \
     -e templates/logging-environment.yaml \
     -e templates/monitoring-environment.yaml \
     -e templates/collectd-environment.yaml
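Once the deploy finishes, a quick way to confirm the agents landed is to look for their containers on an overcloud node. The output below is a fabricated stand-in, and the container names are assumptions:

```shell
# On a real controller you would run something like:
#   ssh heat-admin@<controller-ip> 'sudo docker ps --format "{{.Names}}"'
# Fabricated output standing in for that command:
container_names='fluentd
sensu_client
collectd
nova_api'

# All three opstools agents should show up
echo "$container_names" | grep -cE 'fluentd|sensu|collectd'
```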


How to use the dashboards:


1. Kibana – logging
After accessing the dashboard (https://<ops-tools-ip>/kibana) for the first time, you will be greeted by the following screen:



Simply select the ‘Create’ button.


Going to the Discover tab will show you the logs from all the overcloud nodes. You can search for a specific message or filter them in any way desired.

There is also a way to visualize the data.



2. Uchiwa – monitoring

Unfortunately, the out-of-the-box OpenStack health checks no longer apply to OSP12 services. This is due to the containerization of all services and the inability to verify health with systemd. A BZ has already been raised:

and a potential fix:


In OSP11 and OSP10, however (and hopefully in future releases), you can take advantage of the healthchecks.

There are checks for most of the standard OpenStack services. The general idea is that if something goes down or has operational issues, an alert will appear in the events section (main page).

If you get an alert that doesn’t apply to your environment, you can simply silence it or even remove it.

Here is an example of RH OSP11 in a healthy state:




The one failed check is due to the openstack-cinder-api service moving to httpd in OSP11, so it can be silenced.


3. Grafana – performance

The Grafana dashboard (https://<ops-tools-ip>/grafana) requires a little more tuning before it is able to display any data.

The initial screen will ask you to create a new dashboard:



Select dashboard and start playing with it:

- select ‘graph’

- select ‘panel title’ at the top and ‘Edit’

- in general tab change title to ‘CPU Load - Controllers’

- in metrics tab under ‘data source’ select Graphite or default

- then ‘select metric’ → collectd

- ‘select metric’ and type → *controller*

- ‘select metric’ → *

- ‘select metric’ → Load

- ‘select metric’ → shortterm
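The click path above effectively builds a single Graphite target; written out it is roughly the following (hedged – the exact prefix and capitalization depend on how collectd writes into Graphite):

```shell
# Approximate Graphite target for the selections above (assumption)
echo 'collectd.*controller*.*.load.shortterm'
```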

The end result should look something like this:



It’s a nice graph for tracking CPU load.

There are tons more options that could be measured. I also found a quick 10-minute tutorial here:



This concludes the installation procedure of the opstools for Red Hat OpenStack. Happy hacking!


Comments:

I'd suggest to clone the git repo to a laptop and to run ansible from there. I would not recommend to log into any machine as root via ssh, since most machines should have root logins disabled.

Hey Matthias,

Thanks for your comment. Definitely a valid statement. I have taken some shortcuts here.

Also, I am not encrypting any of the connections for the agents, which is probably not a best practice.
