Showing posts with label C#. Show all posts

Wednesday, August 3, 2016

Converting a Console Application to an Azure WebJob

I recently gave a talk at Iowa Code Camp on Working with WebJobs. I got a follow-up request for resources on converting an existing console application to a WebJob. The process is relatively simple and involves three steps.

Create an Azure WebApp to host your WebJob.

Let’s say you have a solution that contains a console application such as the one shown below. We have a Program class that contains the Main method. This method creates the object that performs the task and its dependencies, then invokes one or more methods on the class to perform the work. You’d probably have some logging and some error handling, but we’ll omit that for the sake of clarity.

[Image: Project]

Now, add a WebApp project to the solution.  We’ll call it WebJob.Host.  I’m creating an MVC project because I want a single action that can be used to display the version of the deployed code as a sanity check. If you have an existing site you’re deploying as an Azure WebApp, you could use that instead.  It will need to be in the same solution as the project that we will be converting to a WebJob.  Remember to update all the packages once you’ve created the project and to clean up any boilerplate that you don’t want.  I’m going to get rid of all but the HomeController and Index action.  Don’t be surprised if Visual Studio needs to restart to complete updating your packages.

[Image: AddWebJobProject]

The best way to autogenerate your version information is using the capabilities of your build server. Both TeamCity and AppVeyor support this. If you’re interested in a way to update your Assembly from your repo with version information without using a CI server, I’ve written up a way to do it using MSBuild Community Tasks.  I’m going to take the easy way out and manually keep my AssemblyInformationalVersion up-to-date for this project.
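Whichever way the version gets into the assembly, it can be read back with reflection for that sanity-check action. A minimal sketch (the VersionInfo class name is mine, not from the talk):

```csharp
using System.Reflection;

public static class VersionInfo
{
    // Returns the AssemblyInformationalVersion if present,
    // otherwise falls back to the assembly version number.
    public static string GetDisplayVersion(Assembly assembly)
    {
        var attribute = assembly.GetCustomAttribute<AssemblyInformationalVersionAttribute>();
        return attribute != null
            ? attribute.InformationalVersion
            : assembly.GetName().Version.ToString();
    }
}
```

In the MVC host, the HomeController's Index action could pass this value to the view.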

Make sure your hosting web application works before adding a WebJob to it.

Convert Your App to A WebJob

First, we need to add the appropriate WebJob Packages. At this point I’ll assume you’re not using any Azure resource or, if you are, you can figure out the additional WebJob packages you’ll need to work with those resources. If you’re using a ServiceBus queue or topic listener, it might be easier to start from scratch and create a WebJob with an appropriate listener, then merge your existing dependencies into that rather than try to convert the older service bus client code over.

For a standard WebJob that is run on a schedule, you’ll need the following package:

Microsoft.Azure.WebJobs

(note: This has several dependencies, which in turn have more dependencies. Don’t be alarmed by this.)

Again, it’s a good practice to make sure you update your dependencies after you add them to your project to make sure you have the latest versions.

This is what my packages.config file looks like after installing only the WebJobs package and its dependencies.

<?xml version="1.0" encoding="utf-8"?>
<packages>
  <package id="Microsoft.Azure.KeyVault.Core" version="1.0.0" targetFramework="net452" />
  <package id="Microsoft.Azure.WebJobs" version="1.1.2" targetFramework="net452" />
  <package id="Microsoft.Azure.WebJobs.Core" version="1.1.2" targetFramework="net452" />
  <package id="Microsoft.Data.Edm" version="5.7.0" targetFramework="net452" />
  <package id="Microsoft.Data.OData" version="5.7.0" targetFramework="net452" />
  <package id="Microsoft.Data.Services.Client" version="5.7.0" targetFramework="net452" />
  <package id="Microsoft.WindowsAzure.ConfigurationManager" version="3.2.1" targetFramework="net452" />
  <package id="Newtonsoft.Json" version="9.0.1" targetFramework="net452" />
  <package id="System.Spatial" version="5.7.0" targetFramework="net452" />
  <package id="WindowsAzure.Storage" version="7.1.2" targetFramework="net452" />
</packages>

Now that we have the appropriate packages, we can convert our Program to get the job set up and invoked as a WebJob. I suggest the following approach.  Create a class named Functions (this is the convention) and add a method – it can be static or an instance method, whichever fits your needs best.  Migrate the code from your Main method to the new method in the Functions class.  Decorate the new method with [NoAutomaticTrigger].

For reference here is the Program that I started with.

internal class Program
{
    public static void Main(string[] args)
    {
        var task = new OfflineTask(new NotificationService(), new WorkDataSource());

        task.DoTask();
    }
}

Here is the Functions class that I created to replicate its functionality. You'll notice that I'm using manual dependency injection to keep this testable. Don't let that throw you. I think it's a good idea, but I could have simply copied the contents of Main directly into the Execute method. Note also that Execute now has a TextWriter parameter named log. This is the hook into the WebJobs logging facility. For now, we just add log messages noting that we are starting and finishing. You can continue to use whatever logging you already have in place. If you choose to integrate with the WebJobs logging, you can do so by passing the log parameter to your classes/methods as needed. There are also ways to connect this log facility to an existing one, but that's beyond the scope of this article.

internal class Functions
{
    private readonly NotificationService _notificationService;
    private readonly WorkDataSource _workDataSource;

    public Functions()
        : this (new NotificationService(), new WorkDataSource())
    {
        
    }

    public Functions(NotificationService notificationService, WorkDataSource workDataSource)
    {
        _notificationService = notificationService;
        _workDataSource = workDataSource;
    }

    [NoAutomaticTrigger]
    public void Execute(TextWriter log)
    {
        log.WriteLine("starting job");

        var task = new OfflineTask(_notificationService, _workDataSource);

        task.DoTask();

        log.WriteLine("job completed");
    }
}

Now that we have our functions created, we’ll modify the Main method to set up a JobHost and use it to invoke our new Execute method. Before we do that, though, we need to add some configuration values. We need connection strings for the WebJobs dashboard and for WebJobs storage; by convention these are named AzureWebJobsDashboard and AzureWebJobsStorage. I recommend that you store these in a file that is not checked into source control and reference it from your App.config file.

Create a new storage account or use an existing one. Copy the connection strings from the Azure portal and add them to the connectionStrings section of your configuration.

<connectionStrings>
  <add name="AzureWebJobsDashboard" connectionString="your-connection-string-copied-from-Azure"/>
  <add name="AzureWebJobsStorage" connectionString="your-connection-string-copied-from-Azure"/>
</connectionStrings>
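One way to keep these secrets out of source control is the configSource attribute, which pulls the whole section from a separate file that you exclude from your repository (the file name here is just an example):

```xml
<!-- In App.config: load the section from a file that is not checked in -->
<connectionStrings configSource="ConnectionStrings.config" />
```

ConnectionStrings.config then contains the full connectionStrings section shown above.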

Now, in your Main method, remove the code you copied over to Functions.Execute and insert code to create a JobHostConfiguration, then a JobHost using that configuration. Then use reflection to get, from your Functions class, the method or methods that you want to execute when the program runs.  For each of these methods, use the Call method on the JobHost to invoke it.  The JobHost knows to add the TextWriter to the parameters when calling the method.  You can also supply a custom activator to the JobHostConfiguration to hook into your favorite dependency injection framework if you want.  There’s an example of this in my WebJobs talk demonstration code.

Here’s the updated Program code after changing it to invoke the Function as a WebJob.

internal class Program
{
    public static void Main(string[] args)
    {
        var config = new JobHostConfiguration();

        var host = new JobHost(config);

        var tasks = typeof(Functions).GetMethods()
                                     .Where(m => m.GetCustomAttributes(typeof(NoAutomaticTriggerAttribute), false).Any());
        foreach (var method in tasks)
        {
            host.Call(method);
        }
    }
}

Lastly, to get the job to run on a schedule, add a settings.json file as a Content item (I use Copy Always) to the project specifying when the job should run. This is used by the scheduler in Kudu, the underlying framework that manages WebJobs. It has a relatively simple format. The only property we are going to set is the "schedule" - that's always been enough for me. The "schedule" property is a cron-like entry ({second} {minute} {hour} {day} {month} {day of the week}) specifying (in UTC) when to run the program. The following example runs the WebJob at 5AM UTC every day.

{ "schedule" : "0 0 5 * * *" }
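For reference, the same six-field format covers recurring intervals as well; this example (mine, not from the talk) runs the job every 15 minutes:

```json
{ "schedule": "0 */15 * * * *" }
```

Similarly, "0 30 8 * * 1-5" would run at 8:30 AM UTC on weekdays only.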

Add the WebJob to the WebApp

Now that we have both the Web App and the WebJob set up, we just need to add the WebJob to the WebApp so that when we publish the WebApp, the WebJob is published as well and will run on its schedule.

Right-click on the host project (I named it WebJob.Host), then choose Add, then Add Existing Project as Azure WebJob.

[Image: AddWebJobToHost]

This brings up a configuration wizard. Choose the project you want to add and give it the name you want to see in the Azure console – there are some name restrictions; for example, you can’t use dots in the WebJob name. Set the run mode to OnDemand – our schedule is included in the settings.json file, so we don’t need to set a schedule here. Click OK to add the WebJob to the WebApp.

This will install the WebJobs publishing package. It adds a webjobs-list.json file in the Properties folder of your Web App.  This lists the jobs that get published with the Web App and their relative location in the project. It also adds a webjob-publish-settings.json file to your WebJob project, again in the Properties folder. You might see some JSON validation errors in this file. I haven’t found these to cause problems when deployed, but I generally clean up the unused properties to remove the errors.
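For reference, the generated webjob-publish-settings.json looks roughly like this (property names as produced by the tooling at the time; treat this as a sketch):

```json
{
  "$schema": "http://schemastore.org/schemas/json/webjob-publish-settings.json",
  "webJobName": "WebJob",
  "runMode": "OnDemand"
}
```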

Now, we’re essentially done. Verify your configuration values and publish your host WebApp. Check, using the Azure portal – see the WebJob pane in settings – to make sure your job was deployed and will be run on the schedule that you’ve chosen. Use the Logs to view the log messages written to the TextWriter. You can also access Kudu directly via the Tools pane on the Web App to dig deeper to test and debug your Web Job.

Code used for this post can be found on GitHub. Note that both the original console app and the converted app (now WebJob) are included in the code so you can compare the before and after conversion states. You would probably only have the single project, converted in place, in your solution.

Wednesday, October 29, 2014

Autofac Extensions: It’s the Little Things

I get tired of reading, and writing, dependency registrations in Autofac. Very descriptive, but also very repetitive. Imagine a large application with a couple of dozen small, focused services each with a registration like:
builder.RegisterType<DescriptiveService>()
       .As<IDescriptiveService>()
       .InstancePerHttpRequest();

It's not a pretty sight.

To make this easy to read and, especially with syntax highlighting, relatively easy to spot errors in, I created a small extension method as part of my dependency configuration class.

public static class DependencyConfig
{
    private static void RegisterPerRequest<TImpl, T>(this ContainerBuilder builder)
        where TImpl : class, T
    {
         builder.RegisterType<TImpl>()
                .As<T>()
                .InstancePerHttpRequest();
    }

    public static void Configure(Action<IDependencyResolver> setResolver)
    {
         var builder = new ContainerBuilder();
         builder.RegisterPerRequest<DescriptiveService,IDescriptiveService>();

         // ...
    }
}


I'll grant you that it works better in a wide window or with shorter class names but I'm happy to not be repeating so much code.

Saturday, March 1, 2014

Runtime-typed Generic Collection Extensions

Another post inspired by a Stack Overflow question (see my answer).

The specific situation: the asker knew the type being returned, but only at runtime. The asker wanted to invoke a method delegate accepting an IEnumerable<Foo> or IEnumerable<Bar> that was, presumably, passed as an argument along with the type, Foo or Bar, to a framework method that is unaware of the specific type at compile time. The problem is that the ToList() method was returning List<object> instead of the List<Foo> or List<Bar> required by the delegate the asker was attempting to invoke. This resulted in an ArgumentException from the delegate's Invoke method, as the underlying type was not convertible.
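The mismatch is easy to reproduce: when the compile-time element type is object, ToList() builds a List<object> regardless of the runtime type of the elements. A minimal illustration (Foo here is a stand-in class, not from the question):

```csharp
using System.Collections.Generic;
using System.Linq;

public class Foo { }

public static class ToListDemo
{
    public static object MakeList()
    {
        // The compile-time element type is object, so ToList() produces a
        // List<object> even though every element is actually a Foo.
        IEnumerable<object> items = new object[] { new Foo(), new Foo() };
        return items.ToList();
    }
}
```

A delegate expecting List<Foo> (or IEnumerable<Foo>) can't be invoked with that result, hence the ArgumentException.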

While I suspect that there is probably a better way to design the framework, there wasn’t enough information in the question for me to comment on that. Instead I decided to try to construct some extension methods that would produce a collection of the specific type that could be used by the delegate, using ideas from the accepted answer and an answer on a similar question by the indubitable Jon Skeet.

To provide a test case for the solution I mocked up some classes similar to those in the question. First, there is a Dog, which knows how to Bark(), and a Bird, which knows how to Sing().
public class Dog
{
    private readonly int _id;

    public Dog(int id)
    {
        _id = id;
    }

    public string Bark()
    {
        return string.Format("Woof...{0}", _id);
    }
}

public class Bird
{
    private readonly int _id;

    public Bird(int id)
    {
        _id = id;
    }

    public string Sing()
    {
        return string.Format("Squawk...{0}", _id);
    }
}
Then we have a Framework class that does a query that returns a collection of IEntry objects, each being an Entry object with a Data property that is either a Dog or a Bird.
public class Framework
{
    public IEnumerable<IEntry> QueryOfAllType(Type type)
    {
        var range = Enumerable.Range(0, 10);

        if (type.IsAssignableFrom(typeof(Bird)))
        {
            return range.Select(i => new Entry
                        {
                            Data = new Bird(i)
                        })
                        .ToList();
        }

        return range.Select(i => new Entry
                    {
                        Data = new Dog(i)
                    })
                    .ToList();
    }
}

public interface IEntry
{
    object Data { get; }
}

public class Entry : IEntry
{
    public object Data { get; set; }
} 
Lastly, we have our program which will query the Framework for IEntry objects of the proper type, select the Data property, and use the new extensions to convert the collection to a collection of the proper type to be used by a delegate to perform the appropriate action for that object. Note: the code below has been updated from the previous version to more accurately replicate the original problem - that is, that the type information can't be inferred from the delegate or the invoking method.
class Program
{
    static void Main(string[] args)
    {
        DoThatThing(typeof(Bird) , Vocalize);
        DoThatThing(typeof(Dog), Vocalize);
    }

    private static void DoThatThing(Type type, Action<IEnumerable> thingToDo)
    {
        var framework = new Framework();

        var result = framework.QueryOfAllType(type)
                              .Select(e => e.Data)
                              .ToListOfType(type);

        thingToDo.DynamicInvoke(new [] { result });

    }

    private static void Vocalize(IEnumerable animals)
    {
        foreach (var animal in animals)
        {
            if (animal is Dog)
            {
               Console.WriteLine(((Dog)animal).Bark());
            }
            else if (animal is Bird)
            {
                Console.WriteLine(((Bird)animal).Sing());
            }
        }
    }
}
Below is my solution, using reflection to cast the returned collection to the appropriate type and convert the IEnumerable to a list, retaining the behavior that an actual collection, rather than a lazy iterator, is produced.
public static class EnumerableExtensions
{
    private static readonly Type _enumerableType = typeof(Enumerable);

    public static IEnumerable CastAsType(this IEnumerable source, Type targetType)
    {
        if (source == null)
        {
            throw new ArgumentNullException("source");
        }

        var castMethod = _enumerableType.GetMethod("Cast").MakeGenericMethod(targetType);

        return (IEnumerable)castMethod.Invoke(null, new object[] { source });
    } 

    public static IList ToListOfType(this IEnumerable source, Type targetType)
    {
        var enumerable = CastAsType(source, targetType);

        var listMethod = _enumerableType.GetMethod("ToList").MakeGenericMethod(targetType);

        try
        {
            return (IList)listMethod.Invoke(null, new object[] { enumerable });
        }
        catch (TargetInvocationException e)
        {
            ExceptionDispatchInfo.Capture(e.InnerException).Throw();
            return null; // to satisfy the compiler, never reached
        }
    } 
}
This is what I do for fun on Saturdays. The extension code and tests are available on GitHub.

Saturday, February 22, 2014

Unit testing JsonResult with Nested Anonymous Types

Recently I had a need to test some controller actions that returned nested anonymous types. Since the Data property on the JsonResult is of type object, this can involve jumping through some hoops with reflection to get access to the properties in the anonymous types.  My first attempt was to simply serialize the object, then deserialize it to an anonymous type. This worked, but I found that my nested types were JObjects and I needed to cast and convert these to the actual type.
public class TestContext
{
    // remainder omitted
    
    public dynamic ToDynamicObject(object obj, object anonymousTemplate)
    {
        var serialized = JsonConvert.SerializeObject(obj);
        return JsonConvert.DeserializeAnonymousType(serialized, anonymousTemplate);
    }
}
Used as
var model = _c.ToDynamicObject(result.Data, new { success = true, data = new SubModel() });

var subModel = ((JObject)model.data).ToObject<SubModel>();
I found one solution at http://blog.m0sa.net/2011/02/mvc-unit-testing-jsonresult-actions.html which improved on this by using a wrapper class that implements DynamicObject. This got me very close to what I wanted, except that, again, the nested anonymous objects themselves had to be wrapped individually. Here's the usage with my initial implementation of an extension method based on it.
var model = result.Data.AsDynamic();

var cardModel = ((object)model.data).AsDynamic() as SubModel;
To fix this I added a bit of code to the wrapper class so that if the result of TryGetMember was an anonymous type, it created and returned a DynamicObject wrapper around it. To test whether the type was anonymous I used the code found at http://www.liensberger.it/web/blog/?p=191 (referenced from this Stack Overflow question: http://stackoverflow.com/questions/2483023/how-to-test-if-a-type-is-anonymous)

Here’s the final implementation as extensions to object.  Note: these are internal to my test project so the restriction to object doesn’t bother me too much. JsonResult.Data has that type and that was the problem I was trying to solve.
internal static class ObjectExtensions
{
    public class DynamicWrapper : DynamicObject
    {
        private readonly Type _subjectType;
        private readonly object _subject;

        public static dynamic Create(object subject)
        {
            return new DynamicWrapper(subject);
        }

        private DynamicWrapper(object subject)
        {
            _subject = subject;
            _subjectType = subject.GetType();
        }

        public override bool TryGetMember(GetMemberBinder binder, out object result)
        {
            try
            {
                var propertyInfo = _subjectType.GetProperty(binder.Name);

                var getter = propertyInfo.GetGetMethod();

                result = getter.Invoke(_subject, null);

                if (result.IsAnonymous())
                {
                    result = Create(result);
                }
                return true;
            }
            catch
            {
                result = null;
                return false;
            }
        }
    }

    public static bool IsAnonymous(this object obj)
    {
        if (obj == null)
        {
            return false;
        }

        var type = obj.GetType();

        return Attribute.IsDefined(type, typeof(CompilerGeneratedAttribute), false)
                  && type.IsGenericType && type.Name.Contains("AnonymousType")
                  && (type.Name.StartsWith("<>") || type.Name.StartsWith("VB$"))
                  && (type.Attributes & TypeAttributes.NotPublic) == TypeAttributes.NotPublic;
    }

    public static dynamic AsDynamic(this object obj)
    {
        return DynamicWrapper.Create(obj);
    }
}
And the above code sample now becomes much cleaner and I am much happier:
var model = result.Data.AsDynamic();

var subModel = model.data as SubModel;

Wednesday, February 12, 2014

Putting It All Together: TDD for Dapper Repositories

In the last several posts I've set up the infrastructure for doing Test-Driven Development for Dapper with lightweight integration tests. You might want to review those posts if you haven't already.


I'm going to keep it relatively simple for this example. Essentially my repositories are just going to implement CRUD operations for each aggregate root. There will be Create, Get and GetAll (Read), Update, and Delete methods. I’m also going to make the choice to have most exceptions bubble up. This might not be the best choice but it will keep the example simple. In fact, you might want to catch most exceptions and transform them into application-level exceptions which you’d handle at higher levels. Testing exceptional behavior is more difficult to set up in integration tests and we can more easily test exceptions with true unit tests in the classes that use our repositories.

Because these are actually integration tests, we run into a chicken-and-egg issue. Unless we write some other (duplicate) code, we can't test Get() without Create() or Create() without Get() in a self-contained test method.
My solution is pragmatic: start with Get(). The logic is relatively simple, and it is easier to convince yourself of its correctness. You could write some code using a data reader to read and verify that something is created, but you'd still have to convince yourself that that code is correct. Since we're not so much testing Dapper - it already has tests - as we are our SQL, I prefer to start with Get() and not write the extra code. As they say, "your mileage may vary"; you might want to start with something else.

Note you won't be able to create tests for Get() until you've done Create(), but once you've created your Get() implementation and convinced yourself of its correctness, you can move on to Create() and use TDD to create that code. After you can create and retrieve an object, you're ready for GetAll(), Delete(), Update(), and any specialized methods. Do go back and create tests for Get(). These are, after all, integration tests; we can write those after the fact. As we'll see, that will come in handy later when we extend our Get() implementation to get some related data from another table.

We’re going to create two separate repositories, one for Users and one for Posts. Let's start with the DapperUserRepository. All of the code referenced below can be found, though in its final form, at https://github.com/tvanfosson/dapper-integration-testing

DapperUserRepository

First, we'll solve our "chicken and egg" problem by implementing Get().
public User Get(int id)
{
    const string sql = "SELECT * FROM [Users] WHERE [Id] = @userId";

    var user = Fetch(c => c.Query<User>(sql, new { userId = id }).SingleOrDefault());

    return user;
}
Now, if you’re paying attention, you should be saying “Wait a minute – that’s too much code there. You’re cheating.” Yes, I am. I’m showing you the results after some refactoring – refactoring done in another project, where I found it helpful to develop a base repository pattern. This base implementation provides some convenient methods for performing the Dapper Query and Execute operations. Below is the base class implementation for the repositories we'll create. Since I've tested this pattern in other projects, I feel comfortable starting with it as the basis for my new repositories.
public abstract class DapperRepositoryBase
{
    protected readonly IConnectionFactory _connectionFactory;
    protected readonly string _connectionStringName;

    protected DapperRepositoryBase(IConnectionFactory connectionFactory, string connectionStringName)
    {
        _connectionFactory = connectionFactory;
        _connectionStringName = connectionStringName;
    }

    protected DbConnection OpenConnection()
    {
        var connection = _connectionFactory.Create(_connectionStringName);
        connection.Open();
        return connection;
    }

    protected T Fetch<T>(Func<DbConnection, T> func)
    {
        if (func == null)
        {
            throw new ArgumentNullException("func");
        }

        using (var connection = OpenConnection())
        {
            return func(connection);
        }
    }

    protected int Execute(Func<DbConnection, int> func)
    {
        if (func == null)
        {
            throw new ArgumentNullException("func");
        }

        using (var connection = OpenConnection())
        {
            return func(connection);
        }
    }

    // remainder of class omitted
}
If you want to start without the base class implementation, your Get() implementation might look a little bigger, maybe like the following. For the rest of the examples, we'll use my base class and I hope it will be obvious where the refactorings came from.
public User Get2(int id)
{
    const string sql = "SELECT * FROM [Users] WHERE [Id] = @userId";

    using (var connection = _connectionFactory.Create(_connectionStringName))
    {
        connection.Open();

        var user = connection.Query<User>(sql, new { userId = id }).SingleOrDefault();

        return user;
    }
}
Now that we have (as we've convinced ourselves) a working Get() method, we can now turn to developing Create. First, we stub it in and have it throw a NotImplementedException.
public class DapperUserRepository : DapperRepositoryBase, IUserRepository
{
    public DapperUserRepository(IConnectionFactory connectionFactory, string connectionStringName)
        : base(connectionFactory, connectionStringName)
    {
    }

    public void Create(User user)
    {
        throw new NotImplementedException();
    }

    // Get omitted
}
Then, we'll create our first test. Typically I'll start with the success case first. In this case, we'll test that a User can be successfully created. We'll add some helper methods to the test context that allows us to get an instance of the repository and create a user with some pre-filled properties. Note that since we're using autoincrement ids, our create code will need to fill this value in when performing the insert.
public class DapperUserRepositoryTests : TestBase
{
    private DapperUserRepositoryTestContext _c;

    [TestMethod]
    public void When_a_new_user_is_created_the_data_is_inserted()
    {
        var repository = _c.GetRepository();
        var user = _c.CreateStandardUser();

        repository.Create(user);

        var retrievedUser = repository.Get(user.Id);

        Assert.AreEqual(user.DisplayName, retrievedUser.DisplayName);
        Assert.AreEqual(user.Email, retrievedUser.Email);
        Assert.AreEqual(user.Active, retrievedUser.Active);
    }
    
    // test context omitted
}
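The omitted test context helpers are straightforward. A sketch of CreateStandardUser might look like the following; the User shape is inferred from the assertions above, and the seed parameter just keeps generated users distinct:

```csharp
// Minimal User shape inferred from the tests (not the full class from the repo).
public class User
{
    public int Id { get; set; }
    public string DisplayName { get; set; }
    public string Email { get; set; }
    public bool Active { get; set; }
    public System.DateTime CreatedDate { get; set; }
}

public class TestContextSketch
{
    // Builds a user with predictable, seed-based values for assertions.
    public User CreateStandardUser(int seed = 0)
    {
        return new User
        {
            DisplayName = "standard-user-" + seed,
            Email = "standard-user-" + seed + "@example.com",
            Active = true
        };
    }
}
```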
We run the test and it fails, as expected. Now let's provide the least code necessary, in an integration test context, to satisfy the test, that is, insert the record into the database so that when it's retrieved the stored values are matched by the retrieved values. We need to insert the DisplayName, Email, CreatedDate, and Active state.
public void Create(User user)
{
    const string sql = "INSERT INTO [Users] ([DisplayName], [Email], [CreatedDate], [Active]) OUTPUT inserted.[Id] VALUES (@displayName, @email, @createdDate, @active)";

    var id = Fetch(c => c.Query<int>(sql, new
    {
        displayName = user.DisplayName,
        email = user.Email,
        createdDate = DateTime.Now,
        active = user.Active
    }).Single());

    user.Id = id;
}
We run the test now and it passes. Next we might want to test that the created date is also set when the user is created. This is a bit tricky because we can't simply compare dates between the CLR and SQL Server; since their resolutions are different, we'd get spurious errors. One way around this is to take a snapshot of the time right before and right after the insert and make sure that the inserted date falls within that range, to an acceptable precision - for me, the nearest second is enough. We'll use another helper method to round our times to the nearest second for comparison. Thus our next test is:
[TestMethod]
public void When_a_new_user_is_created_the_current_time_is_set_as_the_created_date()
{
    var repository = _c.GetRepository();
    var user = _c.CreateStandardUser();

    var before = _c.RoundToSecond(DateTime.Now);
    repository.Create(user);
    var after = _c.RoundToSecond(DateTime.Now);

    var retrievedUser = repository.Get(user.Id);

    var created = _c.RoundToSecond(retrievedUser.CreatedDate);

    // we only care about the nearest second
    Assert.IsTrue(before <= created && created <= after);
}
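The RoundToSecond helper isn't shown in the post; a plausible implementation rounds the tick count to the nearest whole second:

```csharp
using System;

public static class TimeHelpers
{
    // Rounds a DateTime to the nearest whole second, preserving its Kind.
    public static DateTime RoundToSecond(DateTime value)
    {
        var seconds = Math.Round(
            value.Ticks / (double)TimeSpan.TicksPerSecond,
            MidpointRounding.AwayFromZero);

        return new DateTime((long)seconds * TimeSpan.TicksPerSecond, value.Kind);
    }
}
```

With both sides rounded this way, the before/created/after comparison in the test is insensitive to sub-second resolution differences.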
Running this test fails because we're not retrieving the CreatedDate. In this case I think it's easier to set the value on the object instead of storing it, so we'll implement it as follows.
public void Create(User user)
{
    var date = DateTime.Now;

    const string sql = "INSERT INTO [Users] ([DisplayName], [Email], [CreatedDate], [Active]) OUTPUT inserted.[Id] VALUES (@displayName, @email, @createdDate, @active)";

    var id = Fetch(c => c.Query<int>(sql, new
    {
        displayName = user.DisplayName,
        email = user.Email,
        createdDate = date,
        active = user.Active
    }).Single());

    user.Id = id;
    user.CreatedDate = date;
}
Note that we don't set any values on the object until the Fetch has succeeded - we don't want the object state to be updated if there is an exception. Even though the code (appears) to work that way, we'll create another test to ensure that.
[TestMethod]
public void When_an_exception_is_thrown_the_id_and_date_are_not_updated()
{
    var repository = _c.GetRepository();
    var user = _c.CreateStandardUser();
    var otherUser = _c.CreateStandardUser(1);
    otherUser.DisplayName = user.DisplayName;

    repository.Create(user);

    var expectedId = otherUser.Id = -1;
    var expectedDate = otherUser.CreatedDate = DateTime.MinValue;

    try
    {
        repository.Create(otherUser);
        Assert.Fail("Expected SqlException not thrown.");
    }
    catch (SqlException)
    {
        Assert.AreEqual(expectedId, otherUser.Id);
        Assert.AreEqual(expectedDate, otherUser.CreatedDate);
    }
}
And that test passes, as we hoped. We continue writing tests in this manner, testing small bits of functionality until we believe that our methods perform the way we think they should. While it won't be possible to develop the tests free of interdependencies without building a parallel implementation of our functions, this gives us a reasonably simple way to build up our code in a Test-Driven fashion.

Hopefully this small example, with the code included in the GitHub repository, has shown you how to get started doing TDD with Dapper. I've left the code in the repository incomplete so you can experiment with it. In particular, there are some interesting behaviors in the Post repository. You might want to experiment with getting the posts associated with a specific user, which returns a filtered collection, for example.

Tuesday, February 11, 2014

Solving the Challenges of Dapper Integration Testing: Part 3



A Dapper Gentleman

This is the fourth in a series of posts on lightweight integration testing with Dapper. These lightweight tests will enable us to do Test-Driven Development when creating our Dapper-based repositories. In my previous posts I’ve discussed the goals for our tests and covered three of the four challenges that were introduced. The first post in the series introduced our goals: tests that are automated, fast, simple, independent, self-contained, safe, and require minimal extra tooling to implement. The second and third covered the challenges so far: providing a data connection, isolating the execution of the tests, and cleaning up our test data between tests.

In this post we’ll cover the last remaining challenge: keeping the test database schema in sync with our production database. This code and the code for all previous posts are available at https://github.com/tvanfosson/dapper-integration-testing.

Synchronizing the Schema


To keep our database synchronized, and in fact, to provide a way to maintain our database schema we’ll borrow a concept used in Ruby/Rails and Entity Framework, namely database migrations. While we could do this entirely in code, I’m going to leverage the SQL Server Management Studio (SSMS) tooling and maintain the migrations as a collection of SQL scripts and automate their execution in code. This provides a good compromise, in my opinion, between the complexity of creating an entirely code-based migrations infrastructure and the burden of keeping the database schemas in sync manually.

Certainly, though, if there were one area of this solution that I could improve it would be this one. One possible alternative would be to use Entity Framework migrations with or without using Entity Framework as the ORM (object-relational mapper). Such a solution would, perhaps, be ideal if you were augmenting your Entity Framework-based application with targeted use of Dapper for specific queries requiring the speed advantage that Dapper provides. Because this series is Dapper focused, however, I’m going to use the solution described above: SQL scripts, executed automatically in our test framework. The solution I described is specific to Microsoft SQL Server and SSMS-generated scripts, but can be adapted as needed for other databases and tooling. If you’re not familiar with the Generate Scripts wizard in SSMS, you can find more information at http://technet.microsoft.com/en-us/library/bb895179.aspx.

Approach

I will be using a database-first approach. By that I mean that I will use SSMS to create the database tables, indexes, relations, etc. and then reverse-engineer the scripts that will be used to create and upgrade the production database.  These scripts may be edited by hand to ensure that existing data is preserved and migrated as needed as the schema changes. There are three types of scripts that you might want to produce. Two of these will be used in maintaining the database itself, the “up” and “down” scripts. These are used to create/upgrade and back out upgrades, respectively. We’ll make use of the “up” scripts, which I typically name with “Create” or “Add” (as in “add this functionality”) in the script name. The third type of script, the “delete” script will be used solely for our testing purposes. This could be automated in the same manner as our database clean up, but since these scripts are easy enough to generate with the wizard we’ll leave that for a future exercise.

Create the Database

For my example, I’m going to use the tried and true (and trite) blog post. Note: the model is not intended to be realistic, but rather illustrative of the concepts that we’ll use for testing.

We’ll create tables representing Users, Posts, and PostDetails. The table names follow the plural convention and will typically correspond with our models directly. All of our tables will use artificial, auto-increment integer identity columns, named Id, as their primary keys. A User will have a DisplayName (nvarchar), Email (login id) (nvarchar), a CreatedDate (datetime), and a bit column, Active, indicating whether the user is currently an active user or not.

A Post has an associated OwnerId, a foreign key to the Id in the Users table, a Title (nvarchar), Slug (nvarchar), a PostedDate (datetime), EditedDate (datetime), and a bit column indicating whether the post is Deleted or not. We’ll set the foreign key (FK) relationship up to cascade on delete. It has an additional non-unique index on the OwnerId column. I anticipate that our routes would incorporate the Post Id, so an index on the Slug column isn’t necessary.

Generate scripts dialog
Generate Create and Delete Scripts
Each Post has a related PostDetail through the PostId, a foreign key on PostDetails to Posts. The PostDetail will consist of a SequenceNumber (int) and a Text column containing the actual contents of one part of a Post. SequenceNumber will be maintained by our application and will represent separate “pages” in a post. For simplicity we’ll set the Text column up as nvarchar(MAX). It might contain HTML or Markdown. The FK relationship will be set up to cascade on delete. We will add a unique index on PostId and SequenceNumber, enabling us to use index joins to get the related details for a post quickly.

This will be our simple database. Using Tasks→Generate Scripts, we’ll create 000-Create-DapperTesting.sql and 000-Drop-DapperTesting.sql. These will be stored in a DatabaseScripts directory in our source code repository. To ensure success, make sure you configure the options under the Advanced options setting in the wizard to check for object existence when generating these scripts. You will also want to make sure that all of the necessary objects are included.

Automatically Provision the Test Database

Add the test database to the solution

Add New Items - Data - Service-based Database Dialog
Add a Service-based Database to the Unit Test Project
As I noted in previous posts, we’ll be using a LocalDB database for our test database. Once we’ve created our unit test project, we’ll add a Service-based Database to it. Right-click on the test project and choose Add→New Item. Select Data, then choose Service-based Database. I gave mine the name DapperTestingDB.mdf. Once you've completed this step, you will want to make sure that the log file is also added to the solution using Add→Existing Item, then selecting the LDF file associated with the database. Use the properties dialog to set both the MDF and LDF items as Content items. To be safe, I also set mine up to Copy Always. It's important that the MDF and LDF files be deployed at the same time since an out-of-date log file will prevent your LocalDB instance from being used. Copying both fresh for each build avoids that problem, though you could probably get by with Copy If Newer. I changed mine to Copy Always in the process of debugging the log file synchronization issue and simply haven't changed it back, on the theory that you don't change things that aren't broken.

Update the TestContextBase

Now that we have our (empty) test database in the solution, we need a way to initialize it. In fact, we want to reinitialize it each run to capture any changes that may have been made. To do this we’ll update our TestContextBase class that I introduced in my previous posts.

Because we only want the database to be initialized once per test run, not once per test, we’ll create a static constructor that invokes the database initialization code. A static constructor is run once, before “the first instance is created or any static members are referenced.”  Because our static constructor is on the base class, this means that it will be executed once for any and all test contexts that derive from it. This constructor is relatively simple. In it we find the “down” and “up” (or “delete” and “create” as they are called internally) scripts, then first execute the delete scripts, then the create scripts.

In this way, we make sure that each test run starts off with a fresh database that is synchronized to the latest schema represented in the scripts. By making sure that we check for object existence before we delete (or create) it, we can be sure that changes to the scripts don’t cause errors due to a new database object being introduced – that object won’t exist yet and so the existence check will fail and the associated delete will not be attempted.
internal abstract class TestContextBase : IDisposable
{
    private const string ScriptsDirectory = @"..\..\..\..\DatabaseScripts\";
    private static readonly string[] _createScripts = { "000-Create-DapperTesting.sql" };
    private static readonly string[] _deleteScripts = { "000-Drop-DapperTesting.sql" };
    private static readonly string _dbFile = Path.Combine(Environment.CurrentDirectory, "DapperTestingDB.mdf");
    private static readonly string _connectionString = @"Data Source=(LocalDB)\v11.0;Initial Catalog=DapperTesting;Integrated Security=True;AttachDbFilename=" + _dbFile;

    static TestContextBase()
    {
        var deleteScripts = _deleteScripts.Select(s => Path.Combine(ScriptsDirectory, s)).ToList();
        var createScripts = _createScripts.Select(s => Path.Combine(ScriptsDirectory, s)).ToList();

        Assert.IsTrue(deleteScripts.All(File.Exists));
        Assert.IsTrue(createScripts.All(File.Exists));

        ExecuteScripts(deleteScripts);
        ExecuteScripts(createScripts);
    }
    
    // remainder of class omitted
}
The implementation of ExecuteScripts for MS-SQL is interesting in that the scripts generated by SSMS cannot be run directly using SqlCommand due to the inclusion of comments, USE statements, and the SSMS-specific GO command. To overcome this we need to first filter out any commands that are unnecessary for maintaining the database and/or will cause issues if invoked as a SqlCommand. After some experimentation I settled on filtering lines that start with "USE" or "SET", or that start with "/*" and end with "*/". This is specific to the scripts created by SSMS. Further, the scripts are broken into batches of commands that can be executed together via a single SqlCommand by splitting on the GO keyword.
private static readonly Func<string, bool> _startsBatch = s => s.StartsWith("GO", StringComparison.OrdinalIgnoreCase);
private static readonly Func<string, bool> _isUse = s => s.StartsWith("USE", StringComparison.OrdinalIgnoreCase);
private static readonly Func<string, bool> _isSet = s => s.StartsWith("SET", StringComparison.OrdinalIgnoreCase);
private static readonly Func<string, bool> _isComment = s => s.StartsWith("/*") && s.EndsWith("*/");

private static void ExecuteScripts(IEnumerable<string> scripts)
{
    Func<string, bool> isValidCommand = c => !_isUse(c) && !_isSet(c) && !_isComment(c);
    using (var conn = GetConnection())
    {
        conn.Open();
        foreach (var script in scripts)
        {
            var commands = File.ReadAllLines(script).Where(isValidCommand).ToList();
            IEnumerable<IEnumerable<string>> batches = CreateBatches(commands).ToList();
            foreach (var batch in batches)
            {
                using (var query = conn.CreateCommand())
                {
                    query.CommandText = string.Join(Environment.NewLine, batch);
                    query.ExecuteNonQuery();
                }
            }
        }
    }
}

/// <summary>
/// In order to execute groups of commands that can be executed together (not all can),
/// we separate them by the GO command into distinct, executable sets.  This assumes
/// that the scripts have been generated by Tasks->Generate Scripts in SSMS but will
/// also work if each script only contains commands that can be grouped together.
/// </summary>
/// <param name="commands"></param>
/// <returns></returns>
private static IEnumerable<IEnumerable<string>> CreateBatches(IEnumerable<string> commands)
{
    var batch = new List<string>();
    foreach (var command in commands)
    {
        if (_startsBatch(command))
        {
            if (batch.Any())
            {
                yield return batch;
                batch = new List<string>();
            }
        }
        else if (!string.IsNullOrWhiteSpace(command))
        {
            batch.Add(command);
        }
    }

    if (batch.Any())
    {
        yield return batch;
    }
}
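To see the batching behavior in isolation, here is a small standalone sketch of the same GO-splitting logic applied to a fabricated script fragment. The `BatchDemo` class and the sample commands are hypothetical; the only assumption carried over from the code above is that batches are separated by lines beginning with GO, as in SSMS output.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BatchDemo
{
    // Same GO-splitting logic as CreateBatches above, extracted for demonstration.
    public static IEnumerable<List<string>> CreateBatches(IEnumerable<string> commands)
    {
        var batch = new List<string>();
        foreach (var command in commands)
        {
            if (command.StartsWith("GO", StringComparison.OrdinalIgnoreCase))
            {
                if (batch.Any())
                {
                    yield return batch;
                    batch = new List<string>();
                }
            }
            else if (!string.IsNullOrWhiteSpace(command))
            {
                batch.Add(command);
            }
        }

        if (batch.Any())
        {
            yield return batch;
        }
    }

    public static void Main()
    {
        // A fabricated fragment in the shape SSMS produces (after filtering USE/SET/comments).
        var script = new[]
        {
            "CREATE TABLE [Users] ([Id] int IDENTITY PRIMARY KEY)",
            "GO",
            "",
            "CREATE TABLE [Posts] ([Id] int IDENTITY PRIMARY KEY)",
            "GO"
        };

        var batches = CreateBatches(script).ToList();
        Console.WriteLine(batches.Count); // two executable batches, blank lines dropped
    }
}
```

Each returned batch can then be joined with newlines and executed as a single SqlCommand, exactly as ExecuteScripts does.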
Now our TestContextBase is set up to (re)create our database afresh for each test run. This adds a little overhead to our test runs, but we minimize it by doing it once per run, and it is more than offset by our ability to do TDD on our Dapper-based repositories.

There are some obvious, or apparently obvious, opportunities for improvement in the above. First, we could maintain many of the settings in a configuration file rather than in code. I chose not to do this because this is test code. It’s really only changing when I’m actively developing the code (and tests), so maintaining changeable settings in a configuration file doesn’t actually save us much since we’re modifying the code already.

Another, probably better, opportunity would be to invest in naming conventions – we already have a start on this with the names we’ve used – and standardized locations for our scripts. This would enable us to enumerate them in code at runtime rather than hard-code them in our test context base class. This is something that I would probably do, but in a simple case like this I don’t find it necessary. Clearly, we would want to refactor this as the database schema grows.
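For example, with the NNN-Create-*.sql / NNN-Drop-*.sql convention already in use, the static constructor could discover the scripts at runtime instead of hard-coding the arrays. This is just a sketch of that idea; the `ScriptDiscovery` class name is hypothetical, and the numeric-prefix ordering assumes the naming convention shown above.

```csharp
using System;
using System.IO;
using System.Linq;

public static class ScriptDiscovery
{
    // Enumerate migration scripts by naming convention (e.g. "000-Create-DapperTesting.sql"),
    // ordered by their numeric prefix so migrations run in sequence.
    public static string[] FindScripts(string directory, string kind)
    {
        return Directory.GetFiles(directory, "*-" + kind + "-*.sql")
                        .OrderBy(path => Path.GetFileName(path), StringComparer.OrdinalIgnoreCase)
                        .ToArray();
    }
}

// In the static constructor, the hard-coded arrays might then become:
// var createScripts = ScriptDiscovery.FindScripts(ScriptsDirectory, "Create");
// var deleteScripts = ScriptDiscovery.FindScripts(ScriptsDirectory, "Drop");
```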

This wraps up the set up portion of this series. In the next post, I’ll start doing the development of the repositories using TDD based on what we’ve developed so far.

Monday, February 10, 2014

Solving the Challenges of Dapper Integration Testing: Part 2


This is the third in a series of posts on lightweight integration testing with Dapper. These tests are intended to enable us to do Test-Driven Development in creating our Dapper-based repositories. As a reminder, our goals – see the first post for the complete explanation – are tests that are:
  • Automated
  • Fast
  • Simple
  • Minimal Tooling
  • Independent
  • Self-contained
  • Safe

In my last post I covered the two easier challenges: providing a database connection in a way that lets us easily substitute our testing database and isolating our tests. Because these are integration tests, the latter is important as the database is a common interaction point for our tests and running them in isolation keeps the test data from one test from causing failures in other tests.

In this post I’m going to cover one of the two remaining challenges: cleaning up test data after each test. We'll leave the last challenge for the next post.

Data Clean Up


One of the difficulties with integration tests, especially the isolated kind that I am interested in, is the effort of making sure that your data is set up exactly as you want pre-test. Generally, this will involve writing some code that cleans up any data from previous tests or test runs. This is complicated by relationships in the data, especially in the case where deletes do not cascade. In those cases, a delete can fail when the data in the “child” relation hasn’t been deleted first.

One way to handle this is writing database-specific delete code that understands the relationships and performs deletes in the correct sequence to prevent failures due to non-cascading deletes. To me, this is painful – and certainly not simple.

Another way to handle it would be to simply delete the database tables and related objects between each test run. As we’ll see later, this will be necessary when maintaining currency with the database, but it seems overkill – and certainly slower than we would like, deleting and re-creating the schema per test.

It would be better if we could to leverage the database structure itself to automatically create delete statements in the proper order. Rather than writing the code once-per database, we could write the code once per database provider and it would suffice for most, if not all, databases that use that provider. At least it would suffice for the kinds of databases that I’m concerned with most of the time.

This idea is not original with me and I didn’t want to reinvent the wheel if I could possibly avoid it so I did a bit of research into automated integration testing. During my research, I came across a DatabaseDeleter class by Jimmy Bogard, who blogs at LosTechies.com, that I thought was the perfect solution to the problem of automatically cleaning up data generated in an integration test. The only problem – it was written for use with nHibernate. I adapted it to work using SqlConnection and SqlCommand for Microsoft SQL Server (the version I’m using is Microsoft SQL Server 2012).

The most interesting method is the BuildTableList method (full code is available on GitHub):

BuildTableList
private static IEnumerable<string> BuildTableList(ICollection<string> allTables,
                                                  ICollection<Relationship> allRelationships)
{
    var tablesToDelete = new List<string>();

    while (allTables.Any())
    {
        var leafTables = allTables.Except(allRelationships
                                  .Select(rel => rel.PrimaryKeyTable))
                                  .ToArray();

        tablesToDelete.AddRange(leafTables);

        foreach (var leafTable in leafTables)
        {
            allTables.Remove(leafTable);

            var relToRemove = allRelationships.Where(rel => rel.ForeignKeyTable == leafTable)
                                              .ToArray();

            foreach (var rel in relToRemove)
            {
                allRelationships.Remove(rel);
            }
        }
    }

    return tablesToDelete;
}

This method takes a collection of tables (all the tables in the database) and a collection of parent-child relationships between tables. If you think of the relationships as forming a set of trees, it performs a bottom-up traversal across these trees, adding the leaf nodes discovered in each pass to an ordered list of tables to delete data from. Simultaneously, it removes those leaf nodes from the trees, by removing the relationships and removing the tables from the set of tables to consider, so that in the next pass the parents of those leaf nodes become the new leaf nodes. In this way, if we delete data from the tables in the order produced, we are guaranteed not to violate any foreign key constraints. This gives us a safe order in which to build the delete commands to avoid failures.
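As a sanity check, the ordering logic can be exercised standalone against the three tables from our example schema. The `Relationship` record here is a hypothetical stand-in with the same PrimaryKeyTable/ForeignKeyTable shape used by BuildTableList above; given the Users→Posts→PostDetails chain, the delete order should be PostDetails, Posts, Users.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Minimal stand-in for the Relationship type consumed by BuildTableList.
public record Relationship(string PrimaryKeyTable, string ForeignKeyTable);

public static class DeleteOrderDemo
{
    // Same bottom-up traversal as BuildTableList above, extracted for demonstration.
    public static IEnumerable<string> BuildTableList(ICollection<string> allTables,
                                                     ICollection<Relationship> allRelationships)
    {
        var tablesToDelete = new List<string>();

        while (allTables.Any())
        {
            // Leaf tables: no other table's foreign key points away from them.
            var leafTables = allTables.Except(allRelationships.Select(rel => rel.PrimaryKeyTable))
                                      .ToArray();

            tablesToDelete.AddRange(leafTables);

            foreach (var leafTable in leafTables)
            {
                allTables.Remove(leafTable);

                foreach (var rel in allRelationships.Where(rel => rel.ForeignKeyTable == leafTable)
                                                    .ToArray())
                {
                    allRelationships.Remove(rel);
                }
            }
        }

        return tablesToDelete;
    }

    public static void Main()
    {
        var tables = new List<string> { "Users", "Posts", "PostDetails" };
        var relationships = new List<Relationship>
        {
            new("Users", "Posts"),       // Posts.OwnerId -> Users.Id
            new("Posts", "PostDetails")  // PostDetails.PostId -> Posts.Id
        };

        // PostDetails is the only leaf in pass 1, Posts in pass 2, Users in pass 3.
        Console.WriteLine(string.Join(", ", BuildTableList(tables, relationships)));
    }
}
```

Note that a cyclic foreign-key relationship would leave no leaf tables and loop forever; for the kinds of schemas considered here that isn't a concern.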

This collection is used by BuildDeleteTables to generate the delete command associated with a particular database. This is constructed once per test run and subsequently reused per test clean up to remove any data that the test may have added.

BuildDeleteTables/BuildDeleteSql
private void BuildDeleteTables()
{
    if (_initialized.Contains(_database))
    {
        return;
    }

    lock (_lockObj)
    {
        if (!_initialized.Add(_database))
        {
            return;
        }

        using (var connection = new SqlConnection(_database))
        {
            connection.Open();

            var allTables = GetAllTables(connection);

            var allRelationships = GetRelationships(connection);

            _tablesToDelete = BuildTableList(allTables, allRelationships);

            _deleteSql[_database] = BuildTableSql(_tablesToDelete);
        }
    }
}

private static string BuildTableSql(IEnumerable<string> tablesToDelete)
{
    return string.Join(" ", tablesToDelete.Select(t => string.Format("delete from [{0}];", t)));
}
The DatabaseDeleter class is used in the Dispose method (following Chris Patterson’s IDisposable Done Right pattern) in our abstract TestContextBase class.

TestContextBase (Dispose implementation)
internal abstract class TestContextBase : IDisposable
{
    // Remainder of class omitted for brevity
    
    // IDisposable implementation. On dispose of the context, we clean up any data inserted by the test
    
    protected bool _disposed;

    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);
    }

    ~TestContextBase()
    {
        Dispose(false);
    }

    protected virtual void Dispose(bool disposing)
    {
        if (_disposed)
        {
            return;
        }

        if (disposing)
        {
            var deleter = new DatabaseDeleter(_connectionString);
            deleter.DeleteAllData();
        }

        _disposed = true;
    }
}
The Dispose method is invoked in our test classes in the clean up method, which we’ve seen before, but is now shown with the context-related code for a sample DapperUserRepository below. After each test, the test context is disposed, and the DeleteAllData for the related database is run, cleaning up any data that was inserted by the test.
[TestInitialize]
public void Init()
{
    Start();
    _c = new DapperUserRepositoryTestContext();
}

[TestCleanup]
public void Cleanup()
{
    if (_c != null)
    {
        _c.Dispose();
    }
    End();
}
That wraps up our data clean up task. Using our base test context and its IDisposable implementation lets us make sure that each test starts with a clean database, as long as our clean up method runs (see the discussion in the first post in the series). Next time we'll see how to handle creating the database schema so that our test database is automatically set up for each test run as our database changes.

Sunday, February 9, 2014

Solving the Challenges of Dapper Integration Testing: Part 1

In my last post I outlined the series of events that led me to want to write some lightweight integration tests for Dapper that would allow me to develop my Dapper repositories using Test-Driven Development (TDD). These tests wouldn't replace application-level integration tests, but would allow me to "unit test" my Dapper code so that I could develop it in a test-driven manner. My implementation would have to meet several requirements (see my previous post for a fuller explanation):
  • Automated
  • Fast
  • Simple
  • Minimal Tooling
  • Independent
  • Self-contained
  • Safe
Achieving these requirements could be a challenge with integration tests. I outlined several challenges that I would face including automatically setting up the database, cleaning up data between tests, isolating the operations of the tests, and programmatically providing a connection to the repository that could be mocked out. In this post I will describe how I addressed each of these challenges, including the code I created or adapted to use for my scenario.

All of the code for these examples can be found at https://github.com/tvanfosson/dapper-integration-testing. The inline code snippets may omit some details to save space but you can find the complete code in the repository.

Let's take them from simplest to most complex.  In this post we’ll cover providing a connection and isolating the execution of the tests from each other.

Providing a Connection


Our repository will function as a "repository of queries." Each method encapsulates some query or command that interacts with the database. In some cases it might be several queries that function to provide complex data or complete a complex transaction. Importantly, each method in our repository will function as an independent unit of work. In this case, rather than provide a connection to the repository, instead we want to provide a mechanism to obtain a connection that can be created and disposed within the context of a single method.

We will use dependency injection to provide an IConnectionFactory instance. In production, the connection factory will be hooked up in our dependency injection framework to provide connections to our database provider of choice. As long as that provider supports DbConnection, it will work for our purposes. Since Dapper requires this to function, that doesn't restrict our choice of provider any more than the choice of Dapper itself does. IConnectionFactory needs only one method, Create(string connectionStringName), which takes the name of a connection string and provides a DbConnection to the provider specified by the connection string. In practice, our implementations of this class will likely depend on some configuration wrapper that allows them to retrieve the named connection string from a configuration file.

We'll provide a base class for our repositories that uses the factory and makes it available to all repositories derived from it.

IConnectionFactory
public interface IConnectionFactory
{
    DbConnection Create(string connectionStringName);
}
And here is one possible implementation for a SqlServer-backed repository using a configuration wrapper (code not shown).
public class MsSqlConnectionFactory : IConnectionFactory
{
    private readonly IConfiguration _configuration;

    public MsSqlConnectionFactory(IConfiguration configuration)
    {
        _configuration = configuration;
    }

    public DbConnection Create(string connectionStringName)
    {
        return new SqlConnection(_configuration.GetConnectionString(connectionStringName));
    }
}
DapperRepositoryBase (constructor)
public abstract class DapperRepositoryBase
{
    protected readonly IConnectionFactory _connectionFactory;
    protected readonly string _connectionStringName;

    protected DapperRepositoryBase(IConnectionFactory connectionFactory, string connectionStringName)
    {
        _connectionFactory = connectionFactory;
        _connectionStringName = connectionStringName;
    }
    
    // remainder omitted
}
In our tests we will use a SqlServer LocalDB instance added to the test project as a content item. Each of our test classes will have a test context that provides a mechanism to obtain the connection factory and pass back a mock that will return a connection to our LocalDB instance. We’ll see more of this test context base class later as it provides additional common behavior to support our test classes. Notice that it’s internal – it’s not needed outside our test project – and it implements IDisposable. That will be important later, but I'm not going to give away how just yet.

Note: I’m using FakeItEasy, a dead-simple-to-use dynamic fake framework, to provide a mock connection factory. Since these are integration tests, this is the only usage we’re likely to see.  I’ve got it set up to provide a new connection instance every time it’s called since that’s the way our real implementations will work.

TestContextBase
internal abstract class TestContextBase : IDisposable
{
    private static readonly string _dbFile = Path.Combine(Environment.CurrentDirectory,
                                                          "DapperTestingDB.mdf");
    private static readonly string _connectionString =
        @"Data Source=(LocalDB)\v11.0;Initial Catalog=DapperTesting;"
        + "Integrated Security=True;AttachDbFilename="
        + _dbFile;

    protected static DbConnection GetConnection()
    {
        return new SqlConnection(_connectionString);
    }

    protected static IConnectionFactory CreateConnectionFactory(string connectionStringName)
    {
        var connectionFactory = A.Fake<IConnectionFactory>();

        A.CallTo(() => connectionFactory.Create(connectionStringName))
         .ReturnsLazily(f => GetConnection());

        return connectionFactory;
    }
    
    // Disposable implementation omitted (we'll see it later)
}

Isolating Test Execution


Because our tests are touching the same database, there’s a strong likelihood that we could get inconsistent or incorrect results if more than one test runs at the same time. A test that inserts some data in the database, then runs a query against it, may get different data than expected if another test, running at the same time, also inserts some data into the tables being queried by the first test. To prevent this, we want to make sure that our tests run serially, one after the other, instead of in parallel.

One way to do this is to use Ordered Tests. This will force the tests to run in a specific order and, if we check the Continue after failure option, this should give us what we want: a way to run independent tests in an execution-isolated way. Doing this, however, adds some manual operations to maintaining our test suite and prevents us from simply running these tests along with our other (unit) tests in the same way that we are used to. My preference is to use the TestInitialize and TestCleanup methods with some synchronization code to enforce execution isolation. To do that, I create a TestBase class that all of my test classes will derive from. This class will contain Start and End methods that can be used by our setup/teardown methods in the test class to enforce a single thread of execution through the test suite using a Monitor. Using a Monitor with a shared lock object will allow our tests to synchronize themselves through test setup/teardown.

Note: one downside to this method is that it may be possible for certain types of exceptions to fail our tests in a way that the teardown code does not get called. In this case, our tests will hang. This could be a problem if you do continuous integration and check in code that fails in that particular way. This should only happen in failure conditions, not when your tests are passing. You do make sure your tests pass before you check code into your repositories, don't you? The trade-off of being able to use my normal unit testing processes for these tests as well is worth the potential risk to me.

TestBase
public abstract class TestBase
{
    private static readonly object _lock = new object();

    protected void Start()
    {
        Monitor.Enter(_lock);
    }

    protected void End()
    {
        Monitor.Exit(_lock);
    }
}
Example set up and teardown
[TestInitialize]
public void Init()
{
    Start();
    // create our concrete test context
}

[TestCleanup]
public void Cleanup()
{
    // dispose of our concrete test context
    End();
}
That wraps it up for this post. In future posts I'll tackle data clean up, then database set up. These will require more complex solutions and some convention-based programming.

Saturday, February 8, 2014

Lightweight Integration Testing with Dapper

Motivation, Requirements, and Challenges


Motivation


Recently I worked on a project that used Dapper as the data access layer, creating repositories based on Dapper queries for each aggregate root in the domain model. Dapper (https://github.com/StackExchange/dapper-dot-net), if you aren't aware, is "a simple object mapper for .Net." Dapper was developed for and is used on StackOverflow.com. Because it's built as extensions to DbConnection it isn't tied to any specific database and works with a variety of database providers. There are some micro-ORM implementations built on top of Dapper, but we used it directly. In its base form, Dapper consists of generic query/execute methods that take SQL statements as a string, run the SQL on the database, and map the resulting output to either a collection of objects (type specified in the call) or a count of the results returned.

Dapper is pretty simple and simple to use. That's one of the reasons it is also pretty fast, and that speed was the reason we chose to use it. I say "we", but really the decision had been made prior to my involvement in the project, and my involvement was limited to extending the implementation for more use cases. Because it's simple, you might be able to convince yourself of the correctness of your implementations and may not design any specific tests for your repositories.

That was the situation I found when I joined the project. There were plenty of unit tests, but they stopped at the repository layer. All of the Dapper-based repositories had interfaces and those interfaces were mocked in the unit tests. There were no tests - unit or integration - for the repositories themselves. Again, the queries were pretty simple, so code inspection was used to verify correctness along with an extensive QA process.

That is, until I added a feature that required more complex behavior, specifically a relation that had some rather complex logic that involved left joins, multiple result sets, and transactional behavior. Dapper was able to handle all that I threw at it, but I felt uncomfortable that (1) the only testing that my queries had was my own manual, exploratory testing and (2) I'd need to be careful to manually test around any code changes that I wanted to make.

Because Dapper is implemented as extension methods, it doesn't lend itself to mocking, and I didn't really want to introduce another layer around Dapper that would have similar problems. Moreover, I felt that I needed to verify not only that my SQL commands were being executed but also that my SQL actually did what I expected. What I really wanted was a set of lightweight integration tests that would be performed against a test database, ideally with a minimum amount of manual set up, that could be executed as if they were unit tests.

Requirements

Automated

For me, tests need to be automated to be useful. I wanted these tests to be able to run alongside my true unit tests.

Fast

Because I wanted the tests to run regularly, they need to be reasonably fast. The tests can't be too big or too extensive. Simple, usable tests against a local database are my ideal.

Simple

I'm not looking to replace acceptance tests or the real integration tests for the application. Keeping the tests simple will help keep them fast. Keeping them simple also means having to develop the minimum amount of code to create a new test.

Minimal Tooling

I want these to run on any developer's machine and, potentially, on a build or continuous integration server.

Independent

In the spirit of unit tests, each test should be independent. The data created for and by a test should be unavailable to, and should not interfere with, other tests. The tests should be able to run in any order. To keep things as simple as possible, though, we'll relax this constraint for catastrophic failures. Basically, when the per-test clean up task runs, we want to guarantee that each test gets a fresh start.

Self-contained

Tests, or the context they are run in, will perform any data set up required for the test. There should be no manual steps or external scripts that need to be invoked before the tests can be run. Any database or data set up that needs to be done, including any scripts that need to execute, should be performed by the test code.

Safe

Because these tests are going to be used to develop code, the environment that they run in should be isolated. The tests shouldn't connect directly to other application databases, even to read data, to maximize safety.

Based on these requirements I created the tests in a unit test project, where each repository gets its own test class. I opted to go with LocalDB as the database provider and added the MDF file (and log) to the test project as a content item, copied to the output directory when newer. Using a unit test project allows me to tie into the standard test running and ensures that my Dapper integration tests are performed along with the rest of my unit tests. Using LocalDB keeps my tests isolated.
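A hedged sketch of what one of these test classes can look like. The `TestData.mdf` file name, the `(LocalDB)\v11.0` instance name, and the MSTest attributes are assumptions based on the setup described, not code from the project:

```csharp
using System.Data.SqlClient;
using System.IO;
using Microsoft.VisualStudio.TestTools.UnitTesting;

[TestClass]
public class PersonRepositoryTests
{
    // Attach the MDF that was copied to the test output directory
    // ("Copy if newer"), so no manual database configuration is needed.
    private static string ConnectionString
    {
        get
        {
            var mdf = Path.Combine(
                Path.GetDirectoryName(
                    typeof(PersonRepositoryTests).Assembly.Location),
                "TestData.mdf");
            return @"Data Source=(LocalDB)\v11.0;" +
                   "AttachDbFilename=" + mdf + ";" +
                   "Integrated Security=True";
        }
    }

    [TestMethod]
    public void GetByName_ReturnsMatchingRows()
    {
        using (var connection = new SqlConnection(ConnectionString))
        {
            connection.Open();
            // ... arrange test data, exercise the repository, assert ...
        }
    }
}
```

Because the connection string is derived from the test assembly's own location, the tests stay self-contained: any developer (or build agent) with LocalDB installed can run them without editing configuration.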

Challenges

It may not seem obvious at first glance but these choices along with the requirements and good software practices raise some interesting challenges.
  1. Creating the database schema. To be self-contained, our test framework will need a way to ensure that the database has the correct tables and other database objects for each test run.
  2. Data clean up. We could have each test be responsible for cleaning up after itself but in order to keep them simple I think we can do better. We want an automated method to remove data, including relational data, created by any of our tests that can run in the test clean up phase.
  3. Test isolation. Because the tests need to be independent and yet share a database, we need to ensure that they are run sequentially, including initialization and clean up tests. Different unit test runners may or may not guarantee the behavior that we need so we need a mechanism to guarantee this regardless of the test runner.
  4. Providing a connection to a repository for use during the test that doesn't rely on any manual configuration yet maintains good development practices.
In my next post I will detail the solutions to two of the challenges I've presented and get a start on the code required to implement the solution.

Friday, June 21, 2013

Getting MiniProfiler to work with WebAPI and EF5 Code First

I have, in my head, a set of technologies that are candidates to add to my regular toolset. One of those tools is MiniProfiler.  MiniProfiler has been around for a while and, truth be told, I did spend some time trying to get it to work a little over a year ago. Long enough ago that I no longer remember why I couldn't get it to work, but short enough that I do remember having problems with it. Since then I've switched jobs and have been working on a project where introducing MiniProfiler wasn't really possible, so I put the tool back on the shelf. Partly because it's always bugged me that I never got it working and partly because it looks like it might now come in handy on a project, I decided to get back on the horse, as it were, and figure it out.

Let me first say that while the online documentation is excellent and comprehensive, it does look a bit out of date. Currently the instructions are for Entity Framework 4.1 and 4.2.  As of this writing I'm using EF5, and CTPs of EF6 are available.  Fortunately, the instructions – minus the caveats for 4.2 – work pretty well.  In fact, all of the information you need to use MiniProfiler with WebAPI and EF5 Code First is contained in the documentation, but not always where you would expect.

Also, if you’re like me, you’ve gotten used to just installing NuGet packages and using them without a lot of extra configuration. That’s not the case here. It’s important to read all of the instructions for installing and configuring the MiniProfiler for use. There’s a fair amount of hand-crafted set up that you will need to make it work, not all of it listed in the Getting Started section of the documentation.

Start by adding the MiniProfilerEF NuGet package to your solution. I used version 2.1 (published on 3/4/2013). It has a dependency on the MiniProfiler package (also version 2.1, published on the same date) so you will get both when installing the Entity Framework integration.

Next up, you’ll need a web page.  Fortunately, my WebAPI project came with an MVC home controller equipped with an Index action and associated view. For my exploration I made use of this set up. In Views\Shared\_Layout.cshtml I added the following (I’ve included a bit of context). Note that I inserted the MiniProfiler scripts after jQuery and before any page-specific scripts are included.  MiniProfiler requires jQuery 1.7+ and will attempt to load it if it isn’t installed.


<html>
    @using StackExchange.Profiling  
    <head>
         ...
    </head>
    <body>
        ... content ...
  
        @RenderBody()

        @Scripts.Render("~/bundles/jquery")

        @MiniProfiler.RenderIncludes()  

        @RenderSection("scripts", required: false)
    </body>
</html>

In addition to this, I had to make a change to my web.config file, adding the handler for the MiniProfiler resources (that is the CSS and JavaScript files used to render the results, located at /mini-profiler-resources/… ).  I added the following to the default web.config in the /system.webServer/handlers section.  This bit is listed in the troubleshooting section of the documentation and it was required even though my web.config did not have runAllManagedModulesForAllRequests=false. Extension-less handling for MVC is enabled in my configuration via handlers, not by setting the value of that parameter to true and thus I needed to add the MiniProfiler handler manually.

<add name="MiniProfiler" path="mini-profiler-resources/*" verb="*" type="System.Web.Routing.UrlRoutingModule" resourceType="Unspecified" preCondition="integratedMode" />

Now that the MVC application is set up to display the MiniProfiler results, it’s time to enable it in the application. For Entity Framework, this is the easy part. Simply add the initialization to the Application_Start method in Global.asax.cs, then initiate the collection of profiling data in Application_BeginRequest and stop the profiler in Application_EndRequest. I’ll note that I found I did not have to specify the SqlServerFormatter explicitly to get SQL commands that were readable and usable with SQL Server Management Studio (when no formatter is supplied, the raw SQL command is used). That may not be the case for all providers or different versions of Entity Framework. If so, simply uncomment the line setting the formatter.

public class WebApiApplication : System.Web.HttpApplication
{
    protected void Application_Start()
    {
        ...
        
        //MiniProfiler.Settings.SqlFormatter = new StackExchange.Profiling.SqlFormatters.SqlServerFormatter();
        MiniProfilerEF.Initialize();
    }

    protected void Application_BeginRequest()
    {
        if (Request.IsLocal)
        {
            MiniProfiler.Start();
        }
    }

    protected void Application_EndRequest()
    {
        MiniProfiler.Stop();
    }
}

Note also, that the pre-condition for running the profiler is that the request be a local request. If you plan on running this in a performance test environment, you’ll need to modify the condition for starting the profiler. I will probably drive it with values from the web.config file (AppSettings). See the documentation for more configuration options and usage for more complex examples.
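One way to drive that decision from configuration is sketched below. The `MiniProfiler.Enabled` appSettings key is my own invention, not part of the MiniProfiler package, and this version of `Application_BeginRequest` would replace the `Request.IsLocal`-only check shown earlier:

```csharp
using System.Configuration;
using StackExchange.Profiling;

public class WebApiApplication : System.Web.HttpApplication
{
    // Reads <add key="MiniProfiler.Enabled" value="true" /> from
    // web.config appSettings; defaults to false if missing or malformed.
    private static bool ProfilingEnabledByConfig
    {
        get
        {
            bool enabled;
            return bool.TryParse(
                ConfigurationManager.AppSettings["MiniProfiler.Enabled"],
                out enabled) && enabled;
        }
    }

    protected void Application_BeginRequest()
    {
        // Profile when explicitly enabled (e.g. a performance test
        // environment) or, as before, for local requests.
        if (ProfilingEnabledByConfig || Request.IsLocal)
        {
            MiniProfiler.Start();
        }
    }
}
```

This keeps the production default safe (profiling off for remote requests) while letting a test environment opt in with a one-line config change rather than a redeploy.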

The final step is to write the code to query your API from /home/ and observe the results. I picked a simple WebAPI example – and implemented part of it. Once I had enough that I could query my API, I executed the project in the debugger and my profiler results showed up on the page in the upper left corner.