Saturday, February 8, 2014

Lightweight Integration Testing with Dapper

Motivation, Requirements, and Challenges


Motivation


Recently I worked on a project that used Dapper as the data access layer, creating repositories based on Dapper queries for each aggregate root in the domain model. Dapper (https://github.com/StackExchange/dapper-dot-net) if you aren't aware, is "a simple object mapper for .Net." Dapper was developed for and is used on StackOverflow.com. Because it's built as extensions to DbConnection it isn't
A Dapper Gentleman tied to any specific database and works with a variety of database providers. There are some micro-ORM implementations built on top of Dapper, but we used it directly. In it's base form, Dapper consists of generic query/execute methods that take SQL statements as a string, run the SQL on the database, and map the resulting output to either a collection of objects (type specified in the call) or as a count of the results returned.

Dapper is pretty simple and simple to use. That's one of the reasons that it is also pretty fast. That was the reason that we chose to use it. I say "we", but really the decision had been made prior to my involvement in the project and my involvement was limited to extending the implementation for more use cases. Because it's simple, you might be able to convince yourself of the correctness of your implementations and may not design any specific tests for your repositories.

That was the situation I found when I joined the project. There were plenty of unit tests, but they stopped at the repository layer. All of the Dapper-based repositories had interfaces and those interfaces were mocked in the unit tests. There were no tests - unit or integration - for the repositories themselves. Again, the queries were pretty simple, so code inspection was used to verify correctness along with an extensive QA process.

That is, until I added a feature that required more complex behavior, specifically a relation that had some rather complex logic that involved left joins, multiple result sets, and transactional behavior. Dapper was able to handle all that I threw at it, but I felt uncomfortable that (1) the only testing that my queries had was my own manual, exploratory testing and (2) I'd need to be careful to manually test around any code changes that I wanted to make.

Because Dapper is implemented as extension methods, it doesn't lend it self to mocking and I didn't really want to introduce another layer around Dapper that would have similar problems. Moreover, I felt that I needed to not only verify that my SQL commands were being execute but also that my SQL actually did what I expected. What I really wanted/needed was a set of lightweight integration tests that would be performed against a test database, ideally with a minimum amount of manual set up, that could be executed as if they were unit tests.

Requirements

Automated

For me, for tests to be useful, they need to be automated. I wanted these tests to be able be run along side my true unit tests.

Fast

Because I wanted the tests to run regularly, they need to be reasonably fast. The tests can't be too big or too extensive. Simple, usable tests against a local database are my ideal.

Simple

I'm not looking to replace acceptance tests or the real integration tests for the application. Keeping the tests simple will help keep them fast. Keeping them simple also means having to develop the minimum amount of code to create a new test.

Minimal Tooling

I want these to run on any developer's machine and, potentially, on a build or continuous integration server.

Independent

In the spirit of unit tests, each test should be independent. The data created for and by the test should be unavailable and should not interfere with other tests. The tests should be able to run in any order. To keep things as simple as possible, though, we'll relax this constraint for catastrophic failures. Basically, when the per-test clean up tasks runs, we want to guarantee that each test gets a fresh start.

Self-contained

Tests, or the context they are run in, will perform any data set up required for the test. There should be no manual steps or external scripts that need to be invoked before the tests can be run. Any database or data set up that needs to be done, including any scripts that need to execute, should be performed by the test code.

Safe

Because these tests are going to be used to develop code, the environment that they run in should be isolated. The tests shouldn't connect directly to other application databases, even to read data, to maximize safety.

Based on these requirements I created the tests in a unit test project. Each repository gets it's own test class. I opted to go with LocalDB as the database provider. I added the MDF file (and log) to the test project as a content item, updated when newer. Using a unit test project will allow me to tie into the standard test running and ensure that my Dapper integration tests are performed along with the rest of my unit tests. Using LocalDB will keep my tests isolated.

Challenges

It may not seem obvious at first glance but these choices along with the requirements and good software practices raise some interesting challenges.
  1. Creating the database schema. To be self-contained, our test framework will need a way to ensure that the database has the correct tables and other database objects for each test run.
  2. Data clean up. We could have each test be responsible for cleaning up after itself but in order to keep them simple I think we can do better. We want an automated method to remove data, including relational data, created by any of our tests that can run in the test clean up phase.
  3. Test isolation. Because the tests need to be independent and yet share a database, we need to ensure that they are run sequentially, including initialization and clean up tests. Different unit test runners may or may not guarantee the behavior that we need so we need a mechanism to guarantee this regardless of the test runner.
  4. Providing a connection to a repository for use during the test that doesn't rely on any manual configuration yet maintains good development practices.
In my next post I will detail the solutions to two of the challenges I've presented and get a start on the code required to implement the solution.