A few days ago we open sourced Coyote, a tool we created to automate the testing of our Landoop Boxes, which feature a wide range of environments for Big Data and Fast Data (e.g. Kafka).
Coyote does one simple thing: it takes a .yml file with a list of commands, runs them, and checks their exit codes and/or output. It has some other functionality too, but that is its essence. The source code is short and I don’t expect any praise for it; coyote is a tool, and a useful one.
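To give a first taste of the idea before the full example further down, here is a minimal, hypothetical configuration (the group name, commands and patterns are made up for illustration): a group of commands, each checked by exit code and, optionally, by a regular expression on its output.

- name: Minimal Example                  # a group of tests
  entries:
    - name: java available               # an individual command to run
      command: java -version             # a non-zero exit code fails the test
    - name: localhost resolves
      command: getent hosts localhost
      stdout_has: [ '127\.0\.0\.1' ]     # optional regex check on stdout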
We use it for environment and operations testing, as well as runtime meta-testing.
Environment and operations testing verifies that an environment is set up and working as expected: for example, that a specific port or piece of software is accessible, or that running a command gives certain results, such as compiling a program that requires tools and libraries to be present and set up, or running a command that needs certain environment variables. Performance testing can also be seen as a subset of environment and operations testing.
Runtime meta-testing is a shortcut to proper software run tests. If you trust your software to be robust enough, then instead of verifying the actual results (e.g. entries in a database), you can scan the output of the software for (un)expected log messages. Of course this kind of test should not be used without awareness of the underlying dangers.
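As a sketch of both styles (the groups, commands and patterns below are hypothetical, not taken from our real test suites), an environment check might assert that a tool or port is reachable, while a runtime meta-test might only scan a service’s log output for unexpected messages:

- name: Environment Checks
  entries:
    - name: hdfs is reachable
      command: hadoop fs -ls /
    - name: broker port is open
      command: nc -z -w 5 localhost 9092

- name: Runtime Meta-Tests
  entries:
    - name: service ran without errors
      command: tail -n 200 /var/log/myservice/service.log
      stdout_not_has: [ 'ERROR', 'Exception' ]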

Output

Coyote has two important outputs: (a) an HTML report with the commands run, their exit status, stdout, stderr and some statistics, and (b) its exit code, which up to 254 indicates the number of errors that occurred; 255 works like saturation arithmetic and means that 255 or more errors occurred. The HTML report is for humans, the exit code for machines. We use it with Jenkins CI where, amongst other things, we need quick visibility of failures and verbose output for debugging.
You may see some example HTML outputs here and here.

Configuration
Test configuration is set via a YML file, partially inspired by Ansible. Let’s look at a real world example before we delve into specifics. Below are two of our Box tests: a basic HDFS/Hadoop test and a Spark test. I’ve added some comments to explain what happens in the non-intuitive parts.
- name: coyote
  # name is the name of a group of tests. “coyote” is a reserved keyword
  # for global settings, such as the html title and the universal timeout
  # (can be overridden).
  title: Box CDH Tests

# Hadoop/HDFS test
- name: Hadoop Tutorial
  skip: _hadoop_tutorial_
  # When skip is set to “true”, the group will be skipped. Useful for automation.
  entries:
    - name: Clone Landoop Intro Repo
      command: git clone https://gitlab.com/landoop/landoop-intro.git
      workdir: /home/coyote
    - name: hdfs put
      command: hadoop fs -put ../README.md input.md
      workdir: /home/coyote/landoop-intro/Hadoop-101
    - name: wordcount example
      command: hadoop jar hadoop-examples.jar wordcount input.md results
      workdir: /home/coyote/landoop-intro/Hadoop-101
    - name: build code
      command: hadoop com.sun.tools.javac.Main WordCount.java
      workdir: /home/coyote/landoop-intro/Hadoop-102
    - name: create jar
      command: |
        jar cf wc.jar WordCount.class WordCount$IntSumReducer.class WordCount$TokenizerMapper.class
      workdir: /home/coyote/landoop-intro/Hadoop-102
    - name: run jar
      command: hadoop jar wc.jar WordCount input.md results-2
      workdir: /home/coyote/landoop-intro/Hadoop-102
    - command: hadoop fs -rm -r input.md results results-2
      nolog: true
      # nolog will neither log the test results nor count them as successes
      # or failures

# Spark Tests
- name: Spark Tests
  skip: _spark_tests_
  entries:
    - command: mkdir -p coyote-spark-test-app/src/main/scala
      nolog: true
    - name: Create Test Scala App
      command: tee coyote-spark-test-app/src/main/scala/CoyoteTestApp.scala
      stdin: |
        /* SimpleApp.scala */
        import org.apache.spark.SparkContext
        import org.apache.spark.SparkContext._
        import org.apache.spark.SparkConf

        object CoyoteTestApp {
          def main(args: Array[String]) {
            val logFile = "CoyoteTestApp.scala" // file should be in HDFS
            val conf = new SparkConf().setAppName("Coyote Application")
            val sc = new SparkContext(conf)
            val logData = sc.textFile(logFile, 2).cache()
            val numAs = logData.filter(line => line.contains("a")).count()
            val numBs = logData.filter(line => line.contains("b")).count()
            val lineCount = logData.count()
            println("Lines with a: %s, Lines with b: %s".format(numAs, numBs))
            println("Linecount: %s".format(lineCount))
            sc.parallelize(List(("Linecount: %s".format(lineCount))), 1)
              .saveAsTextFile("spark-results.txt")
          }
        }
    - name: Create Sbt File
      command: tee coyote-spark-test-app/main.sbt
      stdin: |
        name := "Coyote Test Project"
        version := "1.0"
        scalaVersion := "2.11.7"
        libraryDependencies += "org.apache.spark" %% "spark-core" % "1.6.2"
    - name: Put into HDFS a Test File
      command: |
        hadoop fs -put -f coyote-spark-test-app/src/main/scala/CoyoteTestApp.scala
    - name: Build SBT Package
      command: sbt package
      workdir: coyote-spark-test-app
    - name: Spark-submit Application (Local)
      command: |
        spark-submit --class "CoyoteTestApp" --master local[4] coyote-spark-test-app/target/scala-2.11/coyote-test-project_2.11-1.0.jar
      stdout_has: [ 'Linecount: 19' ]
      # stdout_has takes an array of regular expressions to check against stdout.
      # There is also stdout_not_has, stderr_has, stderr_not_has.
    - name: Verify via HDFS
      command: hadoop fs -cat spark-results.txt/part-00000
      stdout_has: [ 'Linecount: 19' ]
    - command: hadoop fs -rm -r -f spark-results.txt
      nolog: true
    - name: Spark-submit Application (CDH Cluster)
      command: |
        spark-submit --class "CoyoteTestApp" --master yarn --deploy-mode cluster coyote-spark-test-app/target/scala-2.11/coyote-test-project_2.11-1.0.jar
    - name: Verify via HDFS
      command: hadoop fs -cat spark-results.txt/part-00000
      stdout_has: [ 'Linecount: 19' ]
    - command: hadoop fs -rm -r -f spark-results.txt
      nolog: true
    - command: rm -rf coyote-spark-test-app
      nolog: true

You can see the output for these two tests here. Your first reaction may be that the configuration is too verbose. This is the Ansible-inspired part: in practice you write your tests once or a few times and run them many times. The person running the tests and/or evaluating the results may not be the one who wrote them, and even the author can forget the purpose of each command after a couple of months. At Landoop we also use the coyote configuration files as reference examples, both for newcomers and for ourselves.
Currently coyote supports the following settings for each command:
workdir: run the command in this directory.
stdin: pass this as standard input to the command.
nolog: do not count, nor log this command; we usually use it for small cleanup tasks.
env: array of environment variables to make available to the command.
timeout: if the command hasn’t completed in this time, kill it. It takes golang time duration strings, such as “30s”, “2m15s”, “1h”. There is a global default timeout of 5 minutes, which you can override globally and/or per command.
stdout_has, stdout_not_has, stderr_has, stderr_not_has: arrays of regular expressions to check against standard output and standard error.
ignore_exit_code: do not fail the test if the exit code is not zero; it may still fail from a timeout or a stdout/stderr check.
skip: if set to true, skip this test (or the group of tests if “skip” is at the group level); useful for automating test runs by using sed to select tests.
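To show a few of these settings working together, here is a small, hypothetical entry (the script, paths and patterns are made up; I am also assuming env takes an array of “NAME=value” strings, which is my reading of the setting rather than a documented guarantee):

- name: Settings Example
  entries:
    - name: run a script with a custom environment
      command: ./run-ingest.sh
      workdir: /home/coyote/ingest
      env: [ 'JAVA_HOME=/usr/lib/jvm/java', 'INGEST_MODE=dry-run' ]   # assumed NAME=value format
      timeout: 2m30s                        # golang duration string, overrides the global timeout
      stdout_has: [ 'ingest finished' ]
      stderr_not_has: [ 'ERROR' ]
    - name: best-effort cleanup
      command: rm -rf /tmp/ingest-scratch
      ignore_exit_code: true                # cleanup failure should not fail the run
      nolog: true                           # and should not show up in the report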
Coyote also supports some special strings:

%UNIQUE%: if this string is used inside a command, stdin, or env variable, it will be replaced by a unique numeric string at runtime.
%UNIQUE_[0-9A-Za-z_-]+%: if such a string (e.g. %UNIQUE_VAR1%, %UNIQUE_1-B%) is used, all its instances inside commands, stdin and env variables will be replaced by a common unique numeric string at runtime.
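For example, a sketch of how these strings can keep concurrent or repeated test runs from colliding (the topic name and kafka-topics invocation are illustrative, not from a real suite); both commands reference %UNIQUE_TOPIC%, so they receive the same generated value at runtime:

- name: Unique Strings Example
  entries:
    - name: create a disposable kafka topic
      command: kafka-topics --zookeeper localhost:2181 --create --topic coyote-%UNIQUE_TOPIC% --partitions 1 --replication-factor 1
    - name: delete the same topic
      command: kafka-topics --zookeeper localhost:2181 --delete --topic coyote-%UNIQUE_TOPIC%
      nolog: true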
Some features you may miss are:

loop constructs (such as Ansible’s “with”)
global variables / templating
abort if a test fails ―this is a design choice for now; we always run all tests

Jenkins Integration at Landoop

Most of the time it is Jenkins that runs our tests.

We’ve set it to use Coyote’s exit code not only to mark the build as failed, but also to add the number of errors to the job name, which gives us quick visibility of the current status. We also keep a history of test reports.


Enabling or disabling tests is easy: add a boolean variable for each test group, then use a single bash line per group to set its skip flag to true:

[[ ${HADOOP_TEST} == false ]] && sed -e 's/_hadoop_tutorial_/true/' -i box-configuration.yml
[[ ${SQOOP_TEST} == false ]] && sed -e 's/_sqoop_tutorial_/true/' -i box-configuration.yml
[[ ${KAFKA_TEST} == false ]] && sed -e 's/_kafka_tests_/true/' -i box-configuration.yml
[[ ${SPARK_TEST} == false ]] && sed -e 's/_spark_tests_/true/' -i box-configuration.yml

Final Words
Although code testing is a well-established practice, our research didn’t reveal many testing tools for the latter part of the devops pipeline, such as the deployment and execution phases. Coyote’s turnaround has been great so far, and this is why we wanted to share it.
Of course, as with every such effort, we are bound to re-invent the wheel in some places. The codebase is kept small, and golang makes it easy to add features as we go.
For the time being we don’t have a release plan. Just grab the latest coyote commit from master; it works:
go get github.com/Landoop/coyote

To use it:
coyote -c configuration.yml -out test.html

To make changes to the source code:
cd "$GOPATH/src/github.com/Landoop/coyote/" # edit code and/or html go generate go buildShould we make any breaking changes to the configuration format, we will explore how to communicate about it and protect Coyote’s current users.
If you are interested in examples or in extending the code, visit Coyote’s GitHub repository.
Thank you for your time,
Marios.