Qualifying your Kubernetes Cluster

Timothy St. Clair
Heptio
Aug 16, 2017

In the Kubernetes ecosystem there is a plethora of tooling to stand up a cluster, but how do you know whether your cluster is fully functional and ready to be loaded? In this post, we outline how to use Heptio Sonobuoy as part of your preflight checklist to confirm that your cluster is functional and ready to run workloads. We also outline some of the motivations that led to its design.

Background

Any Kubernetes distribution has a series of versioned components with a well-defined set of configurations. For example, a typical stack may vary across several of its components: Provider, Operating System, Container Runtime (CRI), Container Network Interface (CNI), Storage Subsystem, …

This list is by no means complete, but it provides a glimpse into the matrix of stack variations that can exist. In addition, each component has its own versioning, which may or may not be compatible with the rest of the stack.

To help ensure consistency across these variations, there is a corpus of behavioral tests, built up over time, called the “Kubernetes end-to-end test suite”, or “e2es”. This test suite focuses on the behavioral aspects of the system: it relies on the Kubernetes API and abstracts away the lower-level details. The suite is fundamental to how Sonobuoy helps to qualify clusters.

Motivation behind Sonobuoy

A common question is, “Why can’t I just use the Kubernetes end-to-end test suite and if I pass, it’s all good … right?” The motivation behind Sonobuoy isn’t just to run the e2es. It is to give cluster operators an extensible means of producing a well-versioned artifact that can be curated over time to help diagnose and inspect a cluster’s configuration when there are issues. Clusters aren’t stood up once and left to work forever; they are constantly maintained and upgraded. Each change to a cluster’s configuration calls for vigilance to ensure the cluster is still operating properly. When the cluster goes sideways, it helps to have a history that shows what has changed, and how.

Qualifying your Cluster

By default, Sonobuoy’s quickstart example runs only a single e2e test, because running a full “Conformance” suite can take 30 minutes or more, depending on the size of your cluster. To begin, follow the download instructions outlined on the main Sonobuoy page:

$ git clone git@github.com:heptio/sonobuoy.git

Before submitting your Sonobuoy run, comment out the “value” field in the examples/quickstart/10-configmaps.yaml file.

Next, follow the rest of the instructions on the main page. The end result will be a versioned tarball that contains a collection of cluster information as well as the results of the Conformance test.

To inspect the results of the e2es, simply expand the tarball and navigate to ./plugins/e2e, which contains the results of your test run. This results directory will also need to be expanded. Once that is done, you will see an e2e.log file that contains the details of the test run. Open that file and scroll to the bottom to find a summary of the test results, where “SUCCESS!” means you are ready for flight!
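The inspection steps above could be scripted roughly as follows. This is only a sketch under stated assumptions: the function names unpack_results and check_e2e_log are hypothetical helpers, not part of Sonobuoy, and the script assumes the nested results under plugins/e2e are packaged as .tar.gz files, which may not match your run. The name of the outer tarball is also left to you, since it is versioned per run.

```shell
#!/bin/sh
# Sketch only: hypothetical helpers for inspecting a Sonobuoy results tarball.
# Neither function is part of Sonobuoy itself.

# Expand the outer tarball into a directory, then expand any archives
# found under plugins/e2e.
# Assumption: nested results are files matching *.tar.gz.
unpack_results() {
    tarball="$1"
    dest="$2"
    mkdir -p "$dest" || return 1
    tar -xf "$tarball" -C "$dest" || return 1
    if [ ! -d "$dest/plugins/e2e" ]; then
        echo "no plugins/e2e directory in $tarball" >&2
        return 1
    fi
    find "$dest/plugins/e2e" -type f -name '*.tar.gz' | while read -r nested; do
        # Unpack each nested archive next to where it was found.
        tar -xf "$nested" -C "$(dirname "$nested")"
    done
}

# Print the last lines of an e2e.log and return 0 only if the summary
# near the bottom contains "SUCCESS!".
check_e2e_log() {
    log_file="$1"
    if [ ! -f "$log_file" ]; then
        echo "no such log: $log_file" >&2
        return 2
    fi
    tail -n 5 "$log_file"
    if tail -n 20 "$log_file" | grep 'SUCCESS!' >/dev/null; then
        return 0
    fi
    return 1
}

# Example usage (the tarball path is a placeholder for your own file):
#   unpack_results /path/to/your-results-tarball ./results
#   check_e2e_log "$(find ./results -name e2e.log | head -n 1)" && echo "ready for flight"
```

Searching for e2e.log with find, rather than hard-coding its path, keeps the sketch independent of how the expanded results directory is laid out.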

Having Trouble?

We’d love to help. Feel free to reach us at <support@heptio.com>.

Polyglot • Distributed Systems • Cluster Management • HPC • HTC • Scheduling