Behave Tests for Automated Regression Testing

Testing our software is an essential part of the Belvedere procedure before a code change is delivered to the traders. For new features, the first part of testing is to ascertain that the feature works as it should without crashing. The second aspect of testing is a bit trickier. The Quality Assurance (QA) tester’s role is to ensure that the status quo is maintained. In other words, everything that worked before the new feature was added should still work as expected.

Unit tests are very useful safeguards against releasing any "breaking" changes into production and should run quickly when code changes are made. Unit tests are a simple means for testing logic. In addition to these quick tests, we find that regression tests are useful for testing how certain components interact with each other.

Behave is a Python framework that enables testers and developers to collaborate on features. In Behave, a feature is a suite of related functionality against which we run various scenarios to assert that they all pass. A scenario is a more granular statement, written in the "Given-When-Then" Gherkin syntax, that identifies one acceptance criterion of the feature. The tester writes a "feature" file (a text file with the .feature extension) containing one or more scenarios. The snippet below is an example of a scenario:

Scenario: Making sure the calculator can add 2 numbers
 Given calculator is running
 When the calculator adds "2" and "3"
 Then the sum should display "5"

Alternatively, we can consolidate multiple number pairs into a table to avoid excessive copy/pasting:

Scenario Outline: Making sure the calculator can add 2 numbers
 Given calculator is running
 When the calculator adds "<number1>" and "<number2>"
 Then the sum should display "<expectedSum>"

 Examples:
 | number1 | number2 | expectedSum |
 | 2       | 6       | 8           |
 | 3       | 8       | 11          |
 | 9       | 14      | 23          |
 | 14      | 0       | 14          |
 | 20      | 3       | 23          |
 | 1000    | 45      | 1045        |
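
As a rough illustration of what the outline does with the table, the sketch below substitutes each row into the placeholder steps. This is not how Behave implements it internally; the expand_outline helper and the row dictionaries are hypothetical names used only for this example, and only two of the rows are reproduced.

```python
# Simplified illustration of Scenario Outline expansion; not Behave's own code.
OUTLINE_STEPS = [
    'Given calculator is running',
    'When the calculator adds "<number1>" and "<number2>"',
    'Then the sum should display "<expectedSum>"',
]

# Two rows from the table above, kept as strings just like in the feature file
EXAMPLE_ROWS = [
    {"number1": "2", "number2": "6", "expectedSum": "8"},
    {"number1": "1000", "number2": "45", "expectedSum": "1045"},
]


def expand_outline(steps, rows):
    """Return one concrete list of steps per example row (hypothetical helper)."""
    scenarios = []
    for row in rows:
        concrete = []
        for step in steps:
            # Replace every <placeholder> with the value from this row
            for name, value in row.items():
                step = step.replace("<" + name + ">", value)
            concrete.append(step)
        scenarios.append(concrete)
    return scenarios


if __name__ == "__main__":
    for scenario in expand_outline(OUTLINE_STEPS, EXAMPLE_ROWS):
        print("\n".join(scenario))
        print()
```

Each expanded scenario is then run exactly like a hand-written one.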

The table following the "Examples:" line applies the scenario outline to each set of numbers that we wish to add. Each line prefixed with "Given," "When," or "Then" has corresponding Python code that will run. The trick is that there are Python files, called step files, containing functions that are mapped to these feature lines. The matcher matches the text of a given line in the feature against the pattern in a function's decorator and then executes the corresponding function. Here is an example of the function that will be run for the "When" statement:

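from behave import register_type, when

def parse_number(arg):
    # Convert the text captured from the feature line into an int
    return int(arg)

# Make "Number" available as a type inside step patterns
register_type(Number=parse_number)

@when('the calculator adds "{number1:Number}" and "{number2:Number}"')
def AddNumbers(context, number1, number2):
    context.sum = context.calculator.Add(number1, number2)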

Notice that the "@when" pattern contains two placeholders, number1 and number2. Their values come from the feature file, so they change whenever the feature file changes. The ":Number" suffix in "{number1:Number}" names a custom type: registering parse_number with Behave's register_type under the name Number lets us parse the inputs of each step, so anything marked as Number is converted into an int. In our example we add 2 and 3, but a different scenario can supply different numbers.

Another thing to note is the context variable that the function sets. This object behaves much like a mock object: you can assign any attribute to it, and the value persists for the duration of the test. You would want to persist the ip/port of the service to which you’re connecting, as this information is static. Other information, like context.sum, is set in the "Given" or "When" functions. Because the context is passed into every step function, storing the sum there makes it available to later steps. Below is the next step in our calculator scenario:

@then('the sum should display "{expectedSum:Number}"')
def CheckSum(context, expectedSum):
 expectedSum = int(expectedSum)
 assert expectedSum == context.sum

If the assertion expectedSum == context.sum (where the sum, in this case, is 5) ends up being false, then this scenario will fail. If it passes, then the next scenario will be run. To make sure no state carries over from one scenario to the next, context.sum must no longer be 5 when the next test starts. To do so, we can take advantage of a file that Behave supports called environment.py, a Python file whose hook functions allow one to load the context with information before all features are run, or to act before and after each feature or scenario. See below for a useful example:

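def before_scenario(context, scenario):
    # Reset state so nothing leaks in from the previous scenario
    context.sum = 0

def after_scenario(context, scenario):
    pass  # intentionally empty

def after_feature(context, feature):
    pass  # intentionally empty

def after_all(context):
    pass  # intentionally empty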

Some of the functions are blank on purpose, but a useful application of these hooks is resetting any state that was created in a given scenario. Each test should be isolated, and no side effects of a previous scenario should impact the current one.
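
To see why the reset matters, here is a minimal sketch that runs outside Behave. It uses types.SimpleNamespace as a stand-in for the real context object, and run_scenario is a hypothetical helper that only mimics the order in which the hook and a step are called.

```python
from types import SimpleNamespace


def before_scenario(context, scenario):
    # Same reset as in the hook shown above
    context.sum = 0


def run_scenario(context, name, numbers):
    """Hypothetical runner: call the hook, then a 'When' step if numbers are given."""
    before_scenario(context, name)
    if numbers is not None:
        context.sum = numbers[0] + numbers[1]
    return context.sum


context = SimpleNamespace()  # stand-in for Behave's context, not the real class
first = run_scenario(context, "adds 2 and 3", (2, 3))
# The second scenario never adds anything; because of the hook it does not see a stale sum
second = run_scenario(context, "no addition", None)
```

Without the before_scenario call, the second scenario would start with the sum left behind by the first one.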

Another useful facet of the Behave framework is environment-based configuration. These tests can run in a variety of environments: dev, prod, a Jenkins build server, etc. For each of these environments, we may want to connect to the tested service at a different ip/port. We can outline these ip/port pairs in a config.ini file. The environment.py file can load the necessary configs on start-up. Here is an example of a config.ini file:

     Ip =
     Port = 10011
     Ip =
     Port = 10010
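
One way to load such a file is Python's standard configparser module. The sketch below is an assumption about how this could look, not our actual setup: the section names dev and jenkins, the 127.0.0.1 addresses, and the load_service_address helper are placeholders invented for the example.

```python
import configparser

# Hypothetical config.ini contents; section names and IPs are placeholders
EXAMPLE_INI = """
[dev]
Ip = 127.0.0.1
Port = 10011

[jenkins]
Ip = 127.0.0.1
Port = 10010
"""


def load_service_address(ini_text, environment):
    """Return (ip, port) for the named environment section (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    section = parser[environment]  # raises KeyError for an unknown environment
    # Option names are case-insensitive by default in configparser
    return section["Ip"], section.getint("Port")


ip, port = load_service_address(EXAMPLE_INI, "dev")
```

In a Behave project, a before_all hook in environment.py could call a helper like this and store the result on the context so that every step can reach the service.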

At Belvedere, we have learned that regularly dev-testing our software before handing it over to QA reduces repetition and also reduces the number of rejections from QA. As we build more applications from the ground up, we hope to devote more time to ensuring that a full regression testing suite accompanies each feature.

In a future post, we will dive into how we take advantage of this framework in our own unique way, to build fully automated regression tests for our system.