Monday, February 10, 2014

Solving the Challenges of Dapper Integration Testing: Part 2


This is the third in a series of posts on lightweight integration testing with Dapper. These tests are intended to enable us to do Test-Driven Development in creating our Dapper-based repositories. As a reminder our goals – see the first post for the complete explanation – are tests that are:
    A Dapper Gentleman
  • Automated
  • Fast
  • Simple
  • Minimal Tooling
  • Independent
  • Self-contained
  • Safe

In my last post I covered the two easier challenges: providing a database connection in a way that lets us easily substitute our testing database and isolating our tests. Because these are integration tests, the latter is important as the database is a common interaction point for our tests and running them in isolation keeps the test data from one test from causing failures in other tests.

In this post I’m going to cover one of the two remaining challenges: cleaning up test data after each test. We'll leave the last challenge for the next post.

Data Clean Up


One of the difficulties with integration tests, especially the isolated kind that I am interested in, is the effort of making sure that your data is set up exactly as you want pre-test. Generally, this will involve writing some code that cleans up any data from previous tests or test runs. This is complicated by relationships in the data, especially in the case where deletes do not cascade. In those cases, a delete can fail when the data in the “child” relation hasn’t been deleted first.

One way to handle this is writing database-specific delete code that understands the relationships and performs deletes in the correct sequence to prevent failures due to non-cascading deletes. To me, this is painful – and certainly not simple.

Another way to handle it would be to simply delete the database tables and related objects between each test run. As we’ll see later, this will be necessary when maintaining currency with the database, but it seems overkill – and certainly slower than we would like, deleting and re-creating the schema per test.

It would be better if we could to leverage the database structure itself to automatically create delete statements in the proper order. Rather than writing the code once-per database, we could write the code once per database provider and it would suffice for most, if not all, databases that use that provider. At least it would suffice for the kinds of databases that I’m concerned with most of the time.

This idea is not original with me and I didn’t want to reinvent the wheel if I could possibly avoid it so I did a bit of research into automated integration testing. During my research, I came across a DatabaseDeleter class by Jimmy Bogard, who blogs at LosTechies.com, that I thought was the perfect solution to the problem of automatically cleaning up data generated in an integration test. The only problem – it was written for use with nHibernate. I adapted it to work using SqlConnection and SqlCommand for Microsoft SQL Server (the version I’m using is Microsoft SQL Server 2012).

The most interesting method is the BuildTableList method (full code is available on GitHub):BuildTableList
private static IEnumerable<string> BuildTableList(ICollection<string> allTables,
                                                  ICollection<Relationship> allRelationships)
{
    var tablesToDelete = new List<string>();

    while (allTables.Any())
    {
        var leafTables = allTables.Except(allRelationships
                                  .Select(rel => rel.PrimaryKeyTable))
                                  .ToArray();

        tablesToDelete.AddRange(leafTables);

        foreach (var leafTable in leafTables)
        {
            allTables.Remove(leafTable);

            var relToRemove = allRelationships.Where(rel => rel.ForeignKeyTable == leafTable)
                                              .ToArray();

            foreach (var rel in relToRemove)
            {
                allRelationships.Remove(rel);
            }
        }
    }

    return tablesToDelete;
}

This method takes a collection of tables (all the tables in the database) and a collection of parent-child relationships between tabels. If you think of the relationships as forming a set of trees, it performs a bottom-up traversal across these trees, adding the leaf nodes discovered in each pass to an ordered list of tables to delete data from. Simultaneously, it removes those leaf nodes from the trees, by removing the relationships and removing the tables from the set of tables to consider, so that in the next pass the parents of those leaf nodes become the new leaf nodes. In this way, if we delete data from the ordered table collection in order, we are guaranteed to not violate any foreign key constraints. This gives us a safe order in which to build the delete commands to avoid failures.

This collection is used by BuildDeleteTables to generate the delete command associated with a particular database. This is constructed once per test run and subsequently reused per test clean up to remove any data that the test may have added.

BuildDeleteTables/BuildDeleteSql
private void BuildDeleteTables()
{
    if (_initialized.Contains(_database))
    {
        return;
    }

    lock (_lockObj)
    {
        if (!_initialized.Add(_database))
        {
            return;
        }

        using (var connection = new SqlConnection(_database))
        {
            connection.Open();

            var allTables = GetAllTables(connection);

            var allRelationships = GetRelationships(connection);

            _tablesToDelete = BuildTableList(allTables, allRelationships);

            _deleteSql[_database] = BuildTableSql(_tablesToDelete);
        }
    }
}

private static string BuildTableSql(IEnumerable<string> tablesToDelete)
{
    return string.Join(" ", tablesToDelete.Select(t => string.Format("delete from [{0}];", t)));
}
The DatabaseDeleter class is used in the Dispose method (following Chris Patterson’s IDisposable Done Right pattern) in our abstract TestContextBase class.

TestContextBase (Dispose implementation)
internal abstract class TestContextBase : IDisposable
{
    // Remainder of class omitted for brevity
    
    // IDisposable implementation. On dispose of the context, we clean up any data inserted by the test
    
    protected bool _disposed;

    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);
    }

    ~TestContextBase()
    {
        Dispose(false);
    }

    protected virtual void Dispose(bool disposing)
    {
        if (_disposed)
        {
            return;
        }

        if (disposing)
        {
            var deleter = new DatabaseDeleter(_connectionString);
            deleter.DeleteAllData();
        }

        _disposed = true;
    }
}
The Dispose method is invoked in our test classes in the clean up method, which we’ve seen before, but is now shown with the context-related code for a sample DapperUserRepository below. After each test, the test context is disposed, and the DeleteAllData for the related database is run, cleaning up any data that was inserted by the test.
[TestInitialize]
public void Init()
{
    Start();
    _c = new DapperUserRepositoryTestContext();
}

[TestCleanup]
public void Cleanup()
{
    if (_c != null)
    {
        _c.Dispose();
    }
    End();
}
That wraps up our data clean up task. Using our base test context and it's IDisposable implementation lets us make sure that each test starts with a clean database, as long as our clean up method runs (see the discussion in first post in the series). Next time we'll see how to handle creating the database schema so that our test database is automatically set up for each test run as our database changes.