Motivation

I’ve been training up some large companies recently on working with Cassandra, and have been using Sylvain Lebresne’s excellent Cassandra Cluster Manager to quickly spin up local clusters. This is an awesome tool, but currently only works for local setups. When I’m solving DevOps problems, I really like to have my local environment as close as possible to the production environment, so I started looking for an alternative solution.

Cloning the Repo

This tutorial builds on the tests and steps that already exist in docker-bdd, so to follow along, check out an earlier version of the code, from before we added Cassandra:

$ git clone https://github.com/coshx/docker-bdd
$ cd docker-bdd
$ git reset --hard e9f807c

BDD

Behavior-Driven Development (BDD) is the art of specifying what system interactions look like as a driving force behind implementing the system itself. This is similar to Test-Driven Development (TDD), but is focused on the interaction between a system and its users, whereas TDD focuses on specific system components (unit testing) and their interactions with each other (integration testing).

For a website, BDD typically means describing what should happen in different workflows when a person clicks around and enters data into forms. For DevOps, the user is a system administrator who interacts with systems and programs by typing commands; the same specifications can also describe how different programs and system services should interact with each other.

We’ll use the Gherkin syntax for describing desired system behavior.

First Cassandra User Story

When building software, it’s easy to get stuck in an analysis-paralysis loop, where we keep researching the best way to do something until we realize it’s been weeks and we haven’t built anything. BDD is one tool to help us get past this: we can just write down in natural(-ish) language what we really want to accomplish. So let’s dive in:

Feature: Cassandra Cluster
  As a system administrator
  I want to deploy a cassandra cluster
  So I can store and query lots of data

Great, we’ve stated our main high-level goal of what we’re doing. Now let’s jump into some more specific scenarios. At a basic level, I know I’ll want to fire up a CQL shell on one of the cassandra nodes:

Scenario: Launch a CQL Shell
  Given the services are running
    And I run "cqlsh -e 'show version'" on "cassandra"
   Then I should see "CQL spec 3.4.2"
    And I should see "Cassandra 3.7"

Running Tests

Assuming you’ve cloned docker-bdd and added the previous scenario to a file features/cassandra.feature, you can run the tests with the command cucumber features/cassandra.feature. You’ll need bundler and the cucumber gem installed, so you may need to run the following (assuming you have ruby installed):

$ gem install bundler
$ bundle install
$ cucumber features/cassandra.feature
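If you’re recreating this setup from scratch rather than using the repo’s own Gemfile, a minimal Gemfile for these tests might look like the following (a hypothetical sketch; docker-bdd ships its own, which is what bundle install above actually reads):

```ruby
# Gemfile (hypothetical minimal sketch; docker-bdd includes its own)
source "https://rubygems.org"

gem "cucumber"            # runs the .feature files
gem "rspec-expectations"  # commonly paired with cucumber for assertions
```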

What’s happening / why so slow?

The first time you run cucumber, the step "Given the services are running" causes all of the docker images (specified in dockerfiles) to be built, so this can take some time. On my laptop, it took about 10 minutes.

When the tests are finished, you should see a failure, and an error like:

No such service: cassandra

Get our test passing

The error message from the previous cucumber run lets us know that we didn’t create a service named cassandra. So let’s add it to our docker-compose.yml and try again:

# add to the end of docker-compose.yml

cassandra:
  image: cassandra

$ cucumber features/cassandra.feature

...

Connection error: ('Unable to connect to any servers')

Here you might be thinking: we fired up a cassandra node, so why can’t it connect? The answer is that the cassandra container we launched as part of the cluster (specified in docker-compose.yml) isn’t the same container as the one we used to run cqlsh in the cucumber feature. This is an important distinction: both are created from the same docker image, but they are two separate containers. One is running cassandra in the background, and the other fired up just for a single step in our test (and ran cqlsh instead of cassandra).
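To make that distinction concrete, here’s a hedged sketch (not docker-bdd’s actual code) of what the "I run ... on ..." step presumably shells out to. Every invocation of docker-compose run creates a fresh one-off container from the named service’s image:

```ruby
# Hypothetical sketch: each call spawns a brand-new one-off container
# from the service's image via `docker-compose run`, runs the command,
# and exits; it never attaches to the long-running service container.
def run_on_service(command, service, runner: ->(cmd) { system(cmd) })
  runner.call("docker-compose run #{service} #{command}")
end
```

So the cluster’s cassandra container and the container that runs cqlsh for a test step only share an image, never a process.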

To resolve this, let’s define a new service, cassandra-dev and link it to our cassandra service:

# add to the end of docker-compose.yml

cassandra-dev:
  image: cassandra
  links:
    - cassandra

And then let’s change our cucumber test to indicate that we want to run cqlsh on cassandra-dev, but point it to the cassandra node:

  And I run "cqlsh cassandra -e 'show version'" on "cassandra-dev"

Run cucumber again and the tests should pass.

Multi-Node Clusters

So now we have a single Cassandra node and we can interact with it, but what if we want to scale to multiple nodes?

Let’s start by writing the simplest scenario of something we want to accomplish with multiple nodes:

Scenario: Two Nodes Running
  Given "cassandra" is running with "2" nodes
  And I create the keyspace "test_two" on "cassandra_1"
  And I list the keyspaces on "cassandra_2"
  Then I should see "test_two"
  Then I drop the keyspace "test_two"

We’ll implement these steps in a moment, but we’re not going to define new services cassandra_1 and cassandra_2. Instead, we’re going to rely on the default naming and use docker-compose scale cassandra=2 to create a larger cluster.
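The scaling step itself isn’t defined yet either. A hedged sketch of the docker-compose invocation behind it (assuming compose v1’s scale subcommand, as used above; the names scale_service and runner are illustrative, not from docker-bdd) could be:

```ruby
# Hypothetical sketch of the scaling step's core: shell out to
# `docker-compose scale`, which names containers <project>_<service>_<n>,
# giving us the cassandra_1 and cassandra_2 defaults for free.
def scale_service(service, count, runner: ->(cmd) { system(cmd) })
  runner.call("docker-compose scale #{service}=#{count}")
end
```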

# step definitions in features/step_definitions/cassandra_steps.rb
When /^I create the keyspace "([^\"]*)"(?:| on "([^\"]*)")$/ do |keyspace, node|
  strategy = "SimpleStrategy"
  replication_factor = "2" # hard coding until we need to test different ones
  node ||= "cassandra"
  run_cmd "docker-compose run cassandra-dev cqlsh #{node} -e \"create keyspace #{keyspace} with replication={'class': '#{strategy}', 'replication_factor': #{replication_factor}}\""
end

When /^I drop the keyspace "([^\"]*)"(?:| on "([^\"]*)")$/ do |keyspace, node|
  node ||= "cassandra"
  run_cmd "docker-compose run cassandra-dev cqlsh #{node} -e \"drop keyspace #{keyspace}\""
end

When /^I list the keyspaces(?:| on "([^\"]*)")$/ do |node|
  node ||= "cassandra"
  run_cmd "docker-compose run cassandra-dev cqlsh #{node} -e \"desc keyspaces\""
end
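These step definitions all lean on a run_cmd helper from docker-bdd’s support code. I haven’t reproduced it exactly, but a minimal version might capture the command’s combined output so that later "I should see" steps can assert against it:

```ruby
require "open3"

# Hypothetical minimal run_cmd: run a shell command, remember its
# combined stdout/stderr for later "I should see" assertions, and
# fail the step on a non-zero exit status.
module CommandHelpers
  attr_reader :last_output

  def run_cmd(cmd)
    @last_output, status = Open3.capture2e(cmd)
    raise "command failed (exit #{status.exitstatus}): #{cmd}" unless status.success?
    @last_output
  end
end
```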

Now when we run cucumber, the test fails, because while we’ve successfully started two cassandra nodes, they don’t know about each other.

Clean-up

The way we specified our steps, we created a keyspace on one of the cassandra nodes, but because the scenario failed before the drop step ran, it was never cleaned up. The next time we try to create it, cqlsh will fail with an AlreadyExists error. So let’s fix our scenario by making it idempotent.

Scenario: Two Nodes Running
  Given "cassandra" is running with "2" nodes
  And I drop the keyspace "test_two" on "cassandra_1" if it exists
  And I create the keyspace "test_two" on "cassandra_1"
  And I list the keyspaces on "cassandra_2"
  Then I should see "test_two"
  Then I drop the keyspace "test_two"

# updated drop step in features/step_definitions/cassandra_steps.rb
When /^I drop the keyspace "([^\"]*)"(?:| on "([^\"]*)")(?:| (if it exists))$/ do |keyspace, node, if_exists|
  node ||= "cassandra"
  if_exists = "IF EXISTS" if if_exists
  run_cmd "docker-compose run cassandra-dev cqlsh #{node} -e \"drop keyspace #{if_exists} #{keyspace}\""
end

Now we can run this scenario as many times as we want, and we’ll get the same expected failure each time.

Connecting the Cluster

While getting predictably failing tests is important, we ultimately want them to pass. So to do this, we’ll create a seed node running cassandra, and link that to the cluster. Technically this means that we’ll have one more node in our cluster than we scale it to using docker-compose scale, but that’s okay. We’d also want to use at least two seed nodes in production to avoid a single point of failure.

With a seed node, our docker-compose.yml should now contain:

cassandra-seed:
  image: cassandra

cassandra:
  image: cassandra
  links:
    - cassandra-seed
  environment:
    CASSANDRA_SEEDS: cassandra-seed

cassandra-dev:
  image: cassandra
  links:
    - cassandra

Running cucumber features/cassandra.feature should now pass.

Next Steps

In the intro we mentioned wanting the same environment in development as in production, which may mean building our own cassandra docker image with a custom cassandra.yaml. Let us know in the comments how you’re using docker and cassandra in production.