Hypercool Infrastructure - HCI RHV/Gluster + OpenStack/CEPH

Last year (2017), at the OpenStack Summit in Boston, a friend of mine (Randy Rubins) presented an interesting concept: deploying the OpenStack control plane (together with HCI compute/storage) on the hyper-converged Red Hat Virtualization platform (upstream oVirt). The idea was to capture all workloads (pets and cattle) in a single integrated infrastructure. I remember sitting in that conference room and thinking about all the pros and cons of this solution. Both RHV (oVirt) and OpenStack come with their own sets of trade-offs. Combining the two platforms makes it a "Unicorn", hence the Hyper-Cool name.

Here is a 3-minute video demonstrating the working environment:

 

 

 

Randy's presentation from last year:

Even though this was an interesting concept, I knew back then that I could not recommend it to any of my clients, because there was no way to control the power state of the virtualized controllers. If you can't control the power state of your controller nodes, you can't fence them off when they start causing major problems. Without that functionality you could also (in edge cases) end up with a split brain in your Galera cluster. It would also be very inconvenient to manually power VMs on and off every time you deploy or destroy a stack.

Thankfully, starting with RH OSP13 (Queens), this limitation has been lifted by the introduction of the staging-ovirt driver for Ironic. You can now add an RHV-based VM as a node in Ironic and use it like a regular bare-metal node.

So why would anyone want to do that?

  • integrated IaaS for all types of workloads
  • virtualized OpenStack controllers offer more flexibility in dynamic resource allocation (say you start them with 24GB of RAM and later increase that amount as you bring more demanding compute into the mix). Adding another RHV node or a cluster of nodes is also an option.
  • you can take a snapshot of the virtual controllers and back up the state of the entire control plane relatively easily (backup before a major upgrade, anyone?)
  • share resources between multiple roles. You could virtualize multiple OpenStack composable roles and have them run on the same hardware as your monolithic controllers. How about deploying CloudForms, Ansible Tower, Opstools or the undercloud to the same cluster? As long as your physical hosts have enough resources, this should be more than OK
  • maintenance of the physical hosts could be easier. Simply live migrate the VMs to another host and repair/replace the node without interrupting the service
  • extra layer of HA - Red Hat Virtualization can automatically recover VMs from a failing node
  • better efficiency in hardware utilization

 

Ok, this sounds like a no-brainer. Why would someone not want to do that?

  • extra overhead of virtualized hardware
  • extra layer of abstraction compared to a traditional deployment - additional layer = more pieces that need to be maintained
  • this is rather new and edge cases might not have been fully tested

 

Now that we know the pros and cons, let's jump into the how.

 

I. Architecture - Lab Environment

 

Hardware Used:

Role | Model | Qty | NICs | Memory | Disks
RHHI Node | X10SDV-6C+-TLN4F | 3 | 1x 1GbE public network; 1x 10GbE storage network; 1x 10GbE VLAN trunk | 64GB | 1x 128GB SSD (OS); 1x 480GB NVMe (Gluster)
OSP HCI | X10SDV-4C-TLN2F | 3 | 1x 10GbE PXE, tenant, storage mgmt; 1x 10GbE VM networks, storage, APIs | 64GB | 1x 128GB SSD (OS); 2x 750GB SSD (Ceph OSD)

 

 

II. Building RHHI Cluster

Building the RHV/Gluster (RHHI) cluster described in this blog involves a fair share of manual steps. It's all GUI driven, so folks with a VMware background will feel at home here. Alternatively, you could build an RHV image for OpenStack and deploy it with Ironic. We will cover that in a future Ironic blog.

We are going to implement RHHI 2.0. You can use 3, 6 or 9 nodes for this. You need at least 1 disk for the operating system (2 disks in RAID 1 are recommended) and at least 1 more for Gluster (multiple disks in RAID 6 are recommended). You also need to dedicate 1 NIC to the public (management) network and 1 to storage traffic.

1. Download the latest RHV appliance

At the time of writing, the latest version is "Hypervisor Image for RHV 4.2 Maintenance Batch #3 Async". You need an active RH subscription to download the software. Follow the link here -> https://access.redhat.com/products/red-hat-virtualization#getstarted , then select "Download Latest" at the top right-hand side of the page.

2. Boot all 3 nodes from the image you just downloaded

- Select install RHV 4.2 

- pick your language 

- configure networking and the hostname

- select correct drive for OS installation

- specify timezone and ensure you have access to ntp server

- select keyboard type

After the items above are complete, simply select "Begin Installation"

- create password for at least root user

- after installation completes, reboot the node

3. SSH to each RHV hypervisor and verify basic status. Set up networking and hosts.

- FQDN Hostname is set up:

[root@rhv1 ~]# hostname
rhv1.home.lab

-  your /etc/hosts contains entries for all nodes in the cluster

[root@rhv1 ~]# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
172.31.8.51 rhv1 rhv1.home.lab
172.31.8.52 rhv2 rhv2.home.lab
172.31.8.53 rhv3 rhv3.home.lab

 

Note: Even if you use a DNS server, I find it good practice to include the entries in the local /etc/hosts file as well.
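Since the same check has to be repeated on every hypervisor, it can be scripted. Below is a minimal sketch, assuming the node names used in this lab; check_hosts_entries is a hypothetical helper name of my own, not something shipped with RHV.

```shell
#!/usr/bin/env bash
# Sketch: report cluster node names that have no entry in a hosts file.
# check_hosts_entries is a hypothetical helper, not part of RHV.
check_hosts_entries() {
    local hosts_file="$1"; shift
    local missing=0 node
    for node in "$@"; do
        # Strip comments, then look for the name as a whole field
        # in any column after the address.
        if ! awk -v n="$node" '
                { sub(/#.*/, "") }
                { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
                END { exit !found }' "$hosts_file"; then
            echo "missing: $node"
            missing=1
        fi
    done
    return "$missing"
}

# Example (run on each hypervisor):
# check_hosts_entries /etc/hosts rhv1.home.lab rhv2.home.lab rhv3.home.lab
```

The function prints one line per missing name and returns non-zero if anything is missing, so it can also be used in a pre-flight script.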

- set up the interface to be used for the Gluster storage network

Note: this might not work for everyone. In my lab the Gluster storage network is routable; if you use a non-routed layer 2 network, you might run into issues later in the process.

[root@rhv1 network-scripts]# cat ifcfg-eno3
TYPE=Ethernet
BOOTPROTO=static
NAME=eno3
DEVICE=eno3
NM_CONTROLLED=no
ONBOOT=yes
[root@rhv1 network-scripts]# cat ifcfg-eno3.313 
TYPE=Ethernet
BOOTPROTO=static
NAME=eno3.313
DEVICE=eno3.313
ONBOOT=yes
IPADDR=172.31.3.51
PREFIX=24
NM_CONTROLLED=no
VLAN=yes
[root@rhv1 network-scripts]# systemctl restart network
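The VLAN interface file is identical on all three hosts except for the IP address, so it can be rendered from a small template. This is only a sketch: render_vlan_ifcfg is a hypothetical helper, it prints the file rather than writing it, and the 172.31.3.52 address for rhv2 is my assumption based on the addressing shown above.

```shell
#!/usr/bin/env bash
# Sketch: print an ifcfg file for a tagged VLAN interface.
# render_vlan_ifcfg is a hypothetical helper; it only prints to stdout.
render_vlan_ifcfg() {
    local parent="$1" vlan_id="$2" ipaddr="$3" prefix="$4"
    cat <<EOF
TYPE=Ethernet
BOOTPROTO=static
NAME=${parent}.${vlan_id}
DEVICE=${parent}.${vlan_id}
ONBOOT=yes
IPADDR=${ipaddr}
PREFIX=${prefix}
NM_CONTROLLED=no
VLAN=yes
EOF
}

# Example: storage VLAN 313 on eno3 for rhv2 (address is an assumption).
render_vlan_ifcfg eno3 313 172.31.3.52 24
```

Redirect the output to /etc/sysconfig/network-scripts/ifcfg-eno3.313 on the target host once you have reviewed it.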

- generate an ssh key on all the hosts and ensure they can ssh to each other without a password

[root@rhv1 ~]# ssh-keygen -t rsa
[root@rhv1 ~]# ssh-copy-id rhv1.home.lab

[root@rhv1 ~]# ssh-copy-id rhv2.home.lab
[root@rhv1 ~]# ssh-copy-id rhv3.home.lab

Ensure the hosts can ssh to each other on both the public and storage networks.
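One way to verify this without being prompted is ssh's BatchMode option, which makes the client fail instead of asking for a password. The sketch below is an illustration only: check_passwordless_ssh and the SSH_CMD override are hypothetical names I introduced (the override just makes it easy to dry-run the function with a stub), and the storage addresses other than 172.31.3.51 are assumed.

```shell
#!/usr/bin/env bash
# Sketch: confirm key-based root login works to every host given.
# check_passwordless_ssh and SSH_CMD are hypothetical names.
check_passwordless_ssh() {
    local host failed=0
    for host in "$@"; do
        # BatchMode=yes makes ssh fail instead of prompting for a password.
        if ${SSH_CMD:-ssh} -o BatchMode=yes -o ConnectTimeout=5 \
                "root@${host}" true 2>/dev/null; then
            echo "ok: ${host}"
        else
            echo "FAILED: ${host}"
            failed=1
        fi
    done
    return "$failed"
}

# Example (public names first, then storage addresses; .52/.53 are assumed):
# check_passwordless_ssh rhv1.home.lab rhv2.home.lab rhv3.home.lab
# check_passwordless_ssh 172.31.3.51 172.31.3.52 172.31.3.53
```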

 

- finally, register all the nodes with Red Hat or with Satellite/local repos

At a minimum, one of the nodes needs access to the rhel-7-server-rhv-4-mgmt-agent-rpms channel to complete the installation process.

4. Access one of the RHV hypervisors via a web browser to complete the storage and self-hosted engine deployment

- Access the Cockpit installer via https://<hostname_or_ip>:9090/ and log in as root with the password specified during installation

- on the left select Virtualization -> Hyperconverged -> start 

- specify the IP addresses of the 3 hosts that will form the hyperconverged cluster - make sure to use the dedicated storage network IPs

Note: This might not work for everyone. In my example the Gluster storage network is routable. For layer 2 networks, use the public network for Gluster during installation and then switch to the dedicated network after RHV-M is up.

- next specify the IP addresses or hostnames of the other 2 hypervisors that can be used for the hosted engine - make sure to use public IPs here.

- On the "Packages" step, you can leave everything blank and select next

- on the "Volumes" step, also leave everything at the defaults (optionally you could use an Arbiter - not covered in this blog)

- define your bricks to match your local disk configuration (in my case I am using a 480GB NVMe drive for Gluster bricks) 

- Next, review the configuration and tune it to your liking. It is also good practice to back it up to an external text file.

- Finally select - "Deploy"

The process took about 30 minutes in my environment and resulted in the following message:

- Select "Continue to Hosted Engine Deployment"

Define all the variables for Hosted Engine VM:

- on "Engine" step, enter your password and prepare VM

- finally in "Storage" step ensure you are pointing to your Hyper-Converged Gluster

Finish deployment.

5. Complete RHV-M configuration

- Log on to RHV-M by entering either the IP or the hostname specified during installation, then select Administration Portal and log in with the credentials you just created

- Enable node fencing by selecting 'Compute -> Hosts'. Edit one host at a time and, under Power Management, enable power management and enter the out-of-band (OOB) management details for your hardware

- Next, let's set up networking. You want to define 3 types of networks:

  • vlan-trunk that will carry all the tagged traffic
  • individual VLAN networks - these are going to be attached to VMs and work as native VLANs
  • management networks - for example define the same Gluster network that has been used during deployment or another vlan dedicated for live migration

- Let's start by creating the trunk network. Go to Network -> Networks -> New

Do not enable any VLANs:

This network is special. You don't want RHV to perform any filtering on it. After creating the network, simply select it and, under the vNIC Profiles tab, change 'Network Filter' to "No Network Filter"

Note: If you are planning to use this cluster to host the undercloud or OpenStack controller nodes, you should also ensure your PXE network has no filters applied (otherwise OpenStack inspection and provisioning won't work). 

- Enter all the remaining VM networks that you expect to use and that will act as native VLANs. Example:

Again, ensure no filtering is applied on OpenStack-PXE

- Finally, create the management networks of your choice. The screenshots below show an example of a Gluster network that is also used for VM migration:

- Attach all the created networks to the appropriate host NICs:

 

- at the end your host networking should look similar to this:

Your RHHI cluster is good to go. In the next section you will learn how to create the empty VMs that will be used for the OpenStack controllers.

 

III. Integrating HCI OpenStack with RHHI

1. Let's start by logging back into RHV Manager and creating the empty VMs that will be used for the controllers. In this example I am creating openstack-controller-0 with a new disk for the instance. Also make sure to attach all the OpenStack-related networks that you pre-created for it.

- ensure enough memory and CPU are assigned to your controllers

- in the "Boot Options", modify the boot order to start with network (PXE) boot followed by Hard Disk

- Finally select "OK" to finish VM definition

- You can take advantage of the clone functionality to create openstack-controller-1 and openstack-controller-2

2. Build a new undercloud or re-use an existing one, and continue the rest of the steps there.

This blog is not going to walk you through building a vanilla undercloud. Simply follow the official steps here -> https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/13/html/director_installation_and_usage/

Note: during undercloud installation, make sure the following parameter is present in your undercloud.conf:

[stack@undercloud ~]$ grep staging-ovirt undercloud.conf 
enabled_hardware_types = ipmi,redfish,ilo,idrac,staging-ovirt
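A plain grep for the string also matches commented-out lines, so a slightly stricter check can be handy before running the install. The following is a minimal sketch; has_hardware_type is a hypothetical helper of my own and only looks at uncommented enabled_hardware_types lines.

```shell
#!/usr/bin/env bash
# Sketch: succeed only if a hardware type is listed in enabled_hardware_types.
# has_hardware_type is a hypothetical helper, not a director command.
has_hardware_type() {
    local conf="$1" hw_type="$2"
    # Take the value of the (uncommented) option, split it on commas,
    # drop whitespace, and require an exact match on one item.
    sed -n 's/^[[:space:]]*enabled_hardware_types[[:space:]]*=[[:space:]]*//p' "$conf" \
        | tr ',' '\n' | tr -d ' \t' | grep -x -- "$hw_type" >/dev/null
}

# Example:
# has_hardware_type ~/undercloud.conf staging-ovirt && echo "staging-ovirt enabled"
```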

 

3. To control the power state of your RHV controllers, you need to install the oVirt engine SDK packages and restart the Ironic conductor:

[root@undercloud stack]# sudo yum install ovirt-engine-sdk-python python-ovirt-engine-sdk4

(undercloud) [stack@undercloud ~]$ sudo systemctl restart openstack-ironic-conductor.service

4. Define your instackenv.json to include the newly created openstack-controller-X VMs from RHHI. Example:

{
    "nodes": [
        {
          "pm_user": "admin@internal",
          "pm_type": "staging-ovirt",
          "pm_addr": "172.31.8.54",
          "cpu": "1",
          "memory": "1",
          "disk": "1",
          "arch": "x86_64",
          "name": "rhv-controller0",
          "capabilities": "profile:control,boot_option:local",
          "mac": [ "00:1a:4a:16:01:06" ],
          "pm_password": "Passw0rd",
          "pm_vm_name": "openstack-controller-0"
        },
        {
          "pm_user": "admin@internal",
          "pm_type": "staging-ovirt",
          "pm_addr": "172.31.8.54",
          "cpu": "1",
          "memory": "1",
          "disk": "1",
          "arch": "x86_64",
          "name": "rhv-controller1",
          "capabilities": "profile:control,boot_option:local",
          "mac": [ "00:1a:4a:16:01:0c" ],
          "pm_password": "Passw0rd",
          "pm_vm_name": "openstack-controller-1"
        },
        {
          "pm_user": "admin@internal",
          "pm_type": "staging-ovirt",
          "pm_addr": "172.31.8.54",
          "cpu": "1",
          "memory": "1",
          "disk": "1",
          "arch": "x86_64",
          "name": "rhv-controller2",
          "capabilities": "profile:control,boot_option:local",
          "mac": [ "00:1a:4a:16:01:12" ],
          "pm_password": "Passw0rd",
          "pm_vm_name": "openstack-controller-2"
        },
...
        {
            "arch": "x86_64",
            "mac": [
                "0c:c4:7a:c8:75:be"
            ],
            "pm_addr": "ipmicomp3.home.lab",
            "pm_password": "ADMIN",
            "pm_type": "pxe_ipmitool",
            "name": "compute2",
            "capabilities": "profile:compute,boot_option:local,boot_mode:uefi",
            "pm_user": "ADMIN"
        }

    ]
}
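The three RHV controller entries differ only in the node name, the VM name and the MAC address, so they can be generated rather than copied by hand. This is a rough sketch under my own assumptions: rhv_node_json is a hypothetical helper, and it does no JSON escaping, so it is only safe for values without quotes or backslashes.

```shell
#!/usr/bin/env bash
# Sketch: print one staging-ovirt node entry for instackenv.json.
# rhv_node_json is a hypothetical helper; values are inserted verbatim
# (no JSON escaping), so avoid quotes and backslashes in the arguments.
rhv_node_json() {
    local name="$1" vm_name="$2" mac="$3" engine="$4" user="$5" password="$6"
    cat <<EOF
{
  "pm_user": "${user}",
  "pm_type": "staging-ovirt",
  "pm_addr": "${engine}",
  "cpu": "1",
  "memory": "1",
  "disk": "1",
  "arch": "x86_64",
  "name": "${name}",
  "capabilities": "profile:control,boot_option:local",
  "mac": [ "${mac}" ],
  "pm_password": "${password}",
  "pm_vm_name": "${vm_name}"
}
EOF
}

# Example: the first controller from the listing above.
rhv_node_json rhv-controller0 openstack-controller-0 00:1a:4a:16:01:06 \
    172.31.8.54 admin@internal Passw0rd
```

The MAC address has to match the vNIC attached to the PXE network of the VM, and pm_vm_name has to match the VM name as shown in RHV Manager.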

 

5. Import instackenv.json into the undercloud Ironic and ensure the nodes have been added properly

(undercloud) [stack@undercloud ~]$ openstack overcloud node import ~/instackenv.json
Started Mistral Workflow tripleo.baremetal.v1.register_or_update. Execution ID: de77f4b0-3f0f-4b07-95d4-62408a3fda88
Waiting for messages on queue 'tripleo' with no timeout.

7 node(s) successfully moved to the "manageable" state.
Successfully registered node UUID 818a7384-8ef5-432d-a858-f942237f6afc
Successfully registered node UUID 232bc8df-18be-40ed-91ec-2f4f6b6b0c20
Successfully registered node UUID b01466c2-19bf-42c6-bc91-0c2b0f80c06f
Successfully registered node UUID 20d647cf-634a-4a4b-8663-7eaf6148a250
Successfully registered node UUID b406172e-2d3b-425e-9735-ec89a4353ba2
Successfully registered node UUID bfc3d4f1-d8a0-434d-83b3-f716ed2dbb79
Successfully registered node UUID 09d03b50-c487-47f1-b72d-bc246b4c41fa
(undercloud) [stack@undercloud ~]$ openstack baremetal node list
+--------------------------------------+-----------------+---------------+-------------+--------------------+-------------+
| UUID                                 | Name            | Instance UUID | Power State | Provisioning State | Maintenance |
+--------------------------------------+-----------------+---------------+-------------+--------------------+-------------+
| 818a7384-8ef5-432d-a858-f942237f6afc | rhv-controller0 | None          | power off   | manageable         | False       |
| 232bc8df-18be-40ed-91ec-2f4f6b6b0c20 | rhv-controller1 | None          | power off   | manageable         | False       |
| b01466c2-19bf-42c6-bc91-0c2b0f80c06f | rhv-controller2 | None          | power off   | manageable         | False       |
| 20d647cf-634a-4a4b-8663-7eaf6148a250 | rhv-inspector0  | None          | power off   | manageable         | False       |
| b406172e-2d3b-425e-9735-ec89a4353ba2 | compute0        | None          | power off   | manageable         | False       |
| bfc3d4f1-d8a0-434d-83b3-f716ed2dbb79 | compute1        | None          | power off   | manageable         | False       |
| 09d03b50-c487-47f1-b72d-bc246b4c41fa | compute2        | None          | power off   | manageable         | False       |
+--------------------------------------+-----------------+---------------+-------------+--------------------+-------------+

 

Note: the output above includes not just the monolithic RHV controllers but also an extra composable role dedicated to Ironic Inspector (rhv-inspector0).

6. Complete your openstack deployment using standard process.

https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/13/html/director_installation_and_usage/

Defining OpenStack HCI nodes is described here:

https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/13/html/hyper-converged_infrastructure_guide/

 
