
Friday, March 3, 2017

OpenStack Architecture Support: On the Cheap, part 1

OpenStack is a cloud management system that combines computing, networking, storage and ancillary services to make virtual machines easy to manage: creation, deletion, snapshotting, templating, migration, and so on. OpenStack is not a hypervisor (such as KVM, VMware ESX, Xen or Hyper-V), a storage system (such as VNX, NetApp, GlusterFS, Ceph, NFS, or good ol' local filesystems) or a virtual network product (such as Cisco Nexus or Open vSwitch). Rather, it orchestrates all of these elements into a nearly seamless cloud through its own APIs, much like VMware vSphere. It can be used to create very scalable High Availability (HA) systems, and if the business desires, it can do so with cheap off-the-shelf hardware to avoid high capital costs. In fact, the OpenStack folks themselves suggest skipping hardware RAID.

I recently earned the Certified OpenStack Administrator (COA) certification. The course I took with The Linux Foundation went further than just giving enough info to pass the exam: the instructor was keen to go over quite a few technologies that can be used alongside OpenStack. Here are the ones I want to explore further in this blog, including mastering some that I have already used:
  • Cyberduck
  • Pacemaker/Corosync
  • Galera
  • HAProxy
  • Crowbar
  • Chronyd
  • MySQL/MariaDB
  • KVM kernel optimization
  • Ceph vs GlusterFS vs Swift
  • Puppet vs Ansible
  • DNS-as-a-service
In order to get this done, I can either use a bunch of virtual machines or use metal. Since one of my goals is to try different OpenStack architectures and actually use them, metal it is.

The first goal will be messing around with Ceph to see what throughput I can get under different failure modes. I would be happy to compare it with GlusterFS on exactly the same hardware.
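Once the cluster is up, the measurement itself will probably be rados bench against a scratch pool, repeated while I yank nodes out. A sketch, with a hypothetical pool name and placement-group count:

    # Hypothetical benchmark: 60-second write, then sequential read, on a scratch pool.
    ceph osd pool create bench 64
    rados bench -p bench 60 write --no-cleanup
    rados bench -p bench 60 seq
    rados -p bench cleanup    # remove the benchmark objects afterwards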

Hardware


OpenStack, in most of the diagrams I have seen, uses three to four physical networks per stack. The number of nodes required will vary, of course:


  1. API (optional): Used to control the stack from the office network. This network is not the same LAN or VLAN as the office network; it is routed, preferably through a firewall.
  2. VM: Used for VMs to communicate with each other through virtual switching and VLANs within the virtual switches.
  3. Management: All the back-office traffic: API calls, messages through the message queues, and storage traffic.
  4. Public: Basically, the office VLAN.
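For concreteness, here is a hypothetical addressing plan for those four networks; every subnet below is a placeholder of mine, not an OpenStack default:

    # Hypothetical addressing plan for the four networks above.
    #   API:        192.168.10.0/24   routed to the office network through a firewall
    #   VM:         tenant VLANs trunked between compute nodes; no fixed host subnet
    #   Management: 10.0.0.0/24       API calls, message queues, storage traffic
    #   Public:     192.168.1.0/24    the office VLAN itself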

To begin with, I have only one host powerful enough to use as a compute node, so there is no need for a dedicated VM physical network yet. I am not using any dedicated network nodes to start with either, so the API and Public networks are one and the same. That leaves just the management network, connecting the controller and my practice storage nodes. My architecture will be as follows:



  • The OpenStack controller, compute and network node is my trusty custom Micro ATX box (openstack1), with the ATI video card I use for video games removed to make room for the second NIC used for the management network, where the box goes by controller1.
    • Intel i5-6600K 3.5GHz, quad-core CPU
    • 1x 8GB Kingston DDR4 2133MHz RAM (not nearly enough, so it's CirrOS images all the way)
    • 1x Intel 80GB SSD
    • 1x onboard Fast Ethernet NIC (API/Public network)
    • 1x RealTek RTL8169 Gigabit Ethernet NIC (Management network)
  • I am not blowing too much cash or electricity on this experiment yet. I have three Raspberry Pi 3s (calvin, hobbes, and susie) that will operate as the headless Ceph storage cluster.
    • I am aware that microSD cards, depending on the manufacturer, can be unreliable. Happily, the goal is to experiment with storage nodes going down, so microSD is just fine.
    • The Raspberry Pi has only 100Mb/s Fast Ethernet, connected to a consumer Gigabit switch (1000Mb/s).
    • The ATX box uses a 1000Mb/s NIC, so perhaps Ceph will aggregate the three 100Mb/s links into something faster? To be seen.
    • The 2 amps of current per RPi3 is easy to handle: I bought a Vantec 7-port USB smart charger, and it works a treat.
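To make the two-NIC split concrete, here is a sketch of the static configuration the management NIC might get on openstack1/controller1; the interface name (enp3s0) and the 10.0.0.0/24 subnet are placeholders of mine:

    # Hypothetical static config for the RTL8169 management NIC.
    cat > /etc/sysconfig/network-scripts/ifcfg-enp3s0 <<'EOF'
    DEVICE=enp3s0
    BOOTPROTO=none    # static; dhcpd will hand out everything else on this subnet
    ONBOOT=yes
    IPADDR=10.0.0.1
    PREFIX=24
    EOF
    ifup enp3s0

The onboard NIC keeps its lease from the home router, so the two networks never share a broadcast domain.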

Deployment


The Raspberry Pis are headless and I like CentOS 7.

Raspberry Pi images


  1. I downloaded CentOS-Userland-7-armv7hl-Minimal-1611-RaspberryPi3.img from http://mirror.centos.org/altarch/7/isos/armhfp onto my trusty MacBook. I then copied it directly to three separate SD cards:
    diskutil unmountDisk /dev/disk2   # macOS keeps mounted cards busy; unmount first
    sudo dd \
    if=/Users/gmc/Downloads/CentOS-Userland-7-armv7hl-Minimal-1611-RaspberryPi3.img \
    of=/dev/disk2 bs=8192; sync
  2. The SD cards went directly into calvin, hobbes and susie.
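One post-boot chore: the minimal image does not grow its root filesystem to fill the card. Assuming the 1611 image ships the /usr/bin/rootfs-expand helper that the CentOS armhfp images document, the fix is one command per Pi:

    # Run once on each Pi after first boot to claim the rest of the SD card.
    /usr/bin/rootfs-expand
    reboot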

Packages installed on the controller

Before even thinking of installing OpenStack (and its required services, such as MariaDB, RabbitMQ (or 0MQ), and an HTTP server), I want to be able to control calvin, hobbes and susie.
  1. Ansible. Things will go badly, so a configuration management system will be needed, not only to reduce typing on three different systems, but to make it possible to recreate everything from a vanilla CentOS image on the SD card. Eventually I will try Puppet.
  2. ISC DHCPD. It will log the MAC addresses of calvin, hobbes and susie; from that information I will create static host definitions, and DHCPD will then manage the management network (see the sketch after this list). I do not want to screw up my home network and make my family angry with me, so ISC DHCPD listens only on the management NIC.
  3. NTP. Raspberry Pis do not have a battery backup for their system clocks. Not only does OpenStack require synchronized clocks for its authentication system, but anything installed with git and pip relies on SSL certificate validation, which fails if the clock is badly wrong.
  4. Squid. I do not want management traffic on my home network, and vice versa. However, since calvin, hobbes and susie still need rpm, git and pip packages, Squid will fetch those packages on their behalf.
  5. Unbound as the DNS caching proxy. Once again, the management network has no business talking directly to my home network.
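To make the DHCPD plan concrete, here is a sketch of the static host definitions in openstack1:/etc/dhcp/dhcpd.conf. The 10.0.0.0/24 subnet and the MAC addresses are placeholders of mine; conveniently, dhcpd ignores any interface whose address has no matching subnet declaration, which is what keeps it off the home network:

    # Hypothetical management-network dhcpd.conf on openstack1.
    cat > /etc/dhcp/dhcpd.conf <<'EOF'
    subnet 10.0.0.0 netmask 255.255.255.0 {
      option routers 10.0.0.1;              # controller1
      option domain-name-servers 10.0.0.1;  # Unbound, also on controller1
      range 10.0.0.100 10.0.0.199;          # pool that logs the MACs of unknown hosts
    }
    host calvin { hardware ethernet b8:27:eb:00:00:01; fixed-address 10.0.0.11; }
    host hobbes { hardware ethernet b8:27:eb:00:00:02; fixed-address 10.0.0.12; }
    host susie  { hardware ethernet b8:27:eb:00:00:03; fixed-address 10.0.0.13; }
    EOF
    systemctl restart dhcpd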

Method

Controller1

  1. Install CentOS 7 and update the packages on openstack1/controller1.
  2. Use Clonezilla to image everything to a USB stick, which gets tucked away. Eventually, I will use the same technique on openstack1 once OpenStack is installed, since OpenStack takes time to install even with an automatic deployer such as RDO.
  3. Install ansible
  4. Configure and use ansible to install the "common," "master_user," and "ntp" roles as root.
    1. "Common" sets hostnames, the hosts file, localtime, etc.
    2. "Master_User" sets up a deployment user, its ssh public keys and sudoers.d file, and the shells, packages, configurations and colours I prefer personally.
    3. "NTP" sets up the server (openstack1) and clients (the three Raspberry PIs). In addition, the Raspberry PIs need a script to fire up ntpdate on boot.
    4. All three roles are fired through the "common.yml" playbook.
  5. "common," "master_user, " and "ntp" will also be executed on the Raspberry Pis, so they need to have the intelligence to deal with the armv7l architecture (i.e., not x86_64). For example, there is no EPEL available for armv7l as of writing, so curl would have to be used to pull python2-pip. The Raspberry Pis also need to use a proxy for curl, git, yum and pip, and the controller does not.
  6. Configure and use ansible to manage ansible itself: its configuration, group permissions, and a custom bash script that sets up role directories as per Ansible Best Practices.
  7. Configure and use ansible to deploy DHCPD
  8. Configure and use ansible to deploy Unbound
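As an illustration of the architecture handling in step 5, here is a sketch of how one task file in the "common" role might branch on the armv7l fact. The file path, the Squid address (10.0.0.1:3128, Squid's default port on my hypothetical management subnet) and the get-pip.py bootstrap are my assumptions, not the finished role:

    # Hypothetical roles/common/tasks/pip.yml: EPEL on x86_64, curl bootstrap on armv7l.
    cat > roles/common/tasks/pip.yml <<'EOF'
    - name: Install python2-pip from EPEL (x86_64 only)
      yum:
        name: python2-pip
        state: present
      when: ansible_architecture == "x86_64"

    - name: Bootstrap pip via the Squid proxy where EPEL does not exist (armv7l)
      shell: curl -x http://10.0.0.1:3128 https://bootstrap.pypa.io/get-pip.py | python
      args:
        creates: /usr/bin/pip
      when: ansible_architecture == "armv7l"
    EOF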

Calvin, Hobbes and Susie

For each Raspberry Pi:
  1. Fire it up individually. Look in /var/log/messages on openstack1 to get its MAC address.
  2. Add a host entry with that MAC address to the ansible-controlled openstack1:/etc/dhcp/dhcpd.conf file, and then add the desired hostname and address to the ansible-controlled *:/etc/hosts file. Run ansible again on openstack1 to apply the hosts and dhcpd.conf changes.
  3. SSH to the Raspberry Pi as root (password "centos") to seed openstack1's ~/.ssh/known_hosts file, then log off. This manual step should eventually be eliminated (see the sketch after this list).
  4. Apply the same "common" and "master_user" roles used on openstack1 by running the playbook against just this host:
    ansible-playbook -i production --limit ${remote_hostname} -u root -k common.yml
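For reference, the manual parts of steps 1 and 3 can be reduced to two commands on openstack1; the hostnames assume the /etc/hosts entries from step 2 are already in place:

    # Step 1: dhcpd logs a DHCPDISCOVER line containing each new Pi's MAC address.
    grep DHCPDISCOVER /var/log/messages | tail -5
    # Step 3, without the interactive login: collect all three host keys at once.
    ssh-keyscan calvin hobbes susie >> ~/.ssh/known_hosts

Note that ssh-keyscan blindly trusts whatever answers, which is acceptable on an isolated management segment.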
Once all Raspberry Pis are configured:
  1. Reboot all Raspberry Pis (using ansible, of course). Test for connectivity and login.
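That last pass is a good fit for ansible's ad hoc mode; the "rpis" group name is from my hypothetical production inventory:

    # Reboot the storage nodes, wait, then confirm SSH and Python are back.
    ansible rpis -i production -u root -m shell -a "reboot"
    sleep 120
    ansible rpis -i production -u root -m ping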

To come

  1. The configurations for DHCPD, Unbound and NTP. Ansible is a story in itself.