Re-evaluating the “*DD’s” of Software Development: Test Driven Development

EDIT 1/22/2021: Changed the title a little bit to make sure y’all weren’t getting NSFW ads when you pull this up. Mea culpa:)

A bazillion years ago (2007) I wrote a throwaway post on my old CodeBetter blog titled BDD, TDD, and the other Double D’s. At some point last summer we were having a conversation at an all hands meeting at Calavista where the subjects of Test Driven Development (TDD) and Behavior Driven Development (BDD) came up. We were looking for new content to post on the company website, so I volunteered to modernize my old blog post above to explain the two techniques, the differences between them, and how we utilize both TDD and BDD on our Calavista projects. And I said that I’d have it ready to go in a couple days. Flash forward 6-8 months, and here’s the first part of that blog post, strictly focused on TDD:)

I’ll be mildly editing this and re-publishing in a more “professional” voice for the Calavista blog page soon.

Test Driven Development (TDD) and Behavior Driven Development (BDD) as software techniques have both been around for years, but confusion still abounds in the software industry. In the case of TDD, there’s also been widespread backlash from the very beginning. In this new series of blog posts I want to dive into what both TDD and BDD are, how they’re different (and you may say they aren’t), how we use these techniques on Calavista projects, and some thoughts about making their usage more successful. Along the way, I’ll also talk about some other complementary “double D’s” in software development like Domain Driven Design (DDD) and Responsibility Driven Design.

Test Driven Development

Test Driven Development (TDD) is a development practice where developers author code by first describing the intended functionality in small automated tests, then writing the necessary code to make that test pass. TDD came out of the Extreme Programming (XP) process and movement in the late 90’s and early 00’s that sought to maximize rapid feedback mechanisms in the software development process.
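To make that test-then-code rhythm concrete, here is a minimal sketch in Python. The `slugify` function and its rules are hypothetical and exist only for illustration; the point is the order of work, where the tests are written first and the implementation is added afterward to satisfy them.

```python
import re

# Step 1 (red): describe the intended behavior in small tests first.
# Until slugify exists, calling these tests fails with a NameError.
def test_lowercases_and_joins_words_with_hyphens():
    assert slugify("Hello World") == "hello-world"


def test_drops_punctuation():
    assert slugify("TDD, BDD & friends!") == "tdd-bdd-friends"


# Step 2 (green): write just enough code to make those tests pass.
def slugify(title: str) -> str:
    # Keep runs of letters and digits, drop everything else.
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)
```

From here the cycle would repeat: add the next small test (say, for an empty title), watch it fail, then extend the code.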

As I hinted at in the introduction, the usage and effectiveness of Test Driven Development is extremely controversial. With just a bit of googling you’ll find both passionate advocates and equally passionate detractors. While I will not dispute that some folks will have had negative experiences or impressions of TDD, I still recommend using TDD. Moreover, we use TDD as a standard practice on our Calavista client engagements and I do as well in my personal open source development work.

As many folks have noted over the years, the word “Test” might be an unfortunate term because TDD at heart is a software design technique (BDD was partially a way to adjust the terminology and goals of the earlier TDD to focus more on the underlying goals by moving away from the word “Test”). I would urge you to approach TDD as a way to write better code and also as a way to continue to make your code better over time through refactoring (as I’ll discuss below).

Succeeding in software development is often a matter of having effective feedback mechanisms to let the team know what is and is not working. When used effectively, TDD’s first benefit within a team’s larger software process is a very rapid feedback cycle. Using TDD, developers continuously flow between testing and coding and get constant feedback about how their code is behaving as they work. It’s always valuable to start any task with the end in mind, and a TDD workflow makes a developer think about what successful completion of any coding task is before they implement that code.

Done well with adequately fine-grained tests, TDD can drastically reduce the amount of time developers have to spend debugging code. So yes, it can be time consuming to write all those unit tests, but spending a lot of time hunting around in a debugger trying to troubleshoot code defects is pretty time consuming as well. In my experience, I’ve been better off writing unit tests against individual bits of a complex feature first before trying to troubleshoot problems in the entire subsystem.

Secondly, TDD is not efficient or effective without the type of code modularity that is also frequently helpful for code maintainability in general. Because of that, TDD is a forcing function to make developers focus and think through the modularity of their code upfront. Code that is modular provides developers more opportunities to constantly shift between writing focused unit tests and the code necessary to make those new tests pass. Code that isn’t modular will be very evident to a developer because it causes significant friction in their TDD workflow. At a bare minimum, adopting TDD should at least spur developers to closely consider decoupling business logic, rules, and workflow from infrastructural concerns like databases or web servers that are intrinsically harder to work with in automated unit tests. More on this in a later post on Domain Driven Design.
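As a rough sketch of that kind of decoupling (in Python, with hypothetical names like `Order`, `discount_cents_for`, and `CheckoutHandler` that are assumptions for illustration only), the business rule lives in a plain function, while anything that touches a database is pushed behind a thin shell that receives its data access from the outside:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    subtotal_cents: int
    is_repeat_customer: bool


# Business rule: a pure function with no database or web server in sight.
def discount_cents_for(order: Order) -> int:
    if order.is_repeat_customer and order.subtotal_cents >= 10_000:
        return order.subtotal_cents // 10  # 10% off
    return 0


# Infrastructure stays in a thin shell. load_order is whatever function
# fetches an order (a database query in production, a stub in tests).
class CheckoutHandler:
    def __init__(self, load_order):
        self._load_order = load_order

    def total_due_cents(self, order_id: str) -> int:
        order = self._load_order(order_id)
        return order.subtotal_cents - discount_cents_for(order)


# The rule can be tested in isolation, quickly, with no setup.
def test_repeat_customer_gets_ten_percent_over_threshold():
    assert discount_cents_for(Order(20_000, True)) == 2_000


def test_new_customer_gets_no_discount():
    assert discount_cents_for(Order(20_000, False)) == 0
```

The friction mentioned above shows up when the discount rule is buried inside code that opens a database connection: every test then needs a database, and the fast feedback loop is gone.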

Lastly, when combined with the process of refactoring, TDD allows developers to incrementally evolve their code and learn as they go by creating a safety net of quickly running tests that preserve the intended functionality. This is important, because it’s just not always obvious upfront what the best way is to code a feature. Even if you really could code a feature with a perfect structure the first time through, there’s inevitably going to be some kind of requirements change or performance need that sooner or later will force you to change the structure of that “perfect” code.

Even if you do know the “perfect” way to structure the code, maybe you decide to use a simpler, but less performant way to code a feature in order to deliver that all important Minimum Viable Product (MVP) release. In the longer term, you may need to change your system’s original, simple internals to increase the performance and scalability. Having used TDD upfront, you might be able to do that optimization work with much less risk of introducing regression defects when backed up by the kind of fine-grained automated test coverage that TDD leaves behind. Moreover, the emphasis that TDD forces you to have on code modularity may also be beneficial in code optimization by allowing you to focus on discrete parts of the code.
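Here is a small, hypothetical Python sketch of that safety net. Both function names are made up for illustration: the first version is the simple one you might ship for an MVP, the second is a later optimization, and the same behavioral checks are run against each so a regression would show up immediately.

```python
# First version: obviously correct, but quadratic in the number of items.
def has_duplicates_simple(items):
    for index, left in enumerate(items):
        for right in items[index + 1:]:
            if left == right:
                return True
    return False


# Later optimization: linear time using a set (assumes hashable items).
def has_duplicates_fast(items):
    seen = set()
    for item in items:
        if item in seen:
            return True
        seen.add(item)
    return False


# The tests describe behavior only, so they survive the rewrite untouched.
def check_behavior(has_duplicates):
    assert has_duplicates([]) is False
    assert has_duplicates(["a"]) is False
    assert has_duplicates(["a", "b", "a"]) is True
    assert has_duplicates(["a", "b", "c"]) is False
```

Calling `check_behavior(has_duplicates_simple)` and then `check_behavior(has_duplicates_fast)` exercises both versions with identical expectations.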

Too much modularity, or the wrong sort of it, can of course be a complete disaster for performance, so don’t think that I’m trying to say that modularity is any kind of silver bullet.

As a design technique, TDD is mostly focused on fine grained details of the code and is complementary to other software design tools or techniques. By no means would TDD ever be the only software design technique or tool you’d use on a non-trivial software project. I’ve written a great deal about designing with and for testability over the years myself, but if you’re interested in learning more about strategies for designing testable code, I highly recommend Jim Shore’s Testing without Mocks paper for a good start.

To clear up a common misconception, TDD is a continuous workflow, meaning that developers would be constantly switching between writing a single or just a few tests and writing the “real” code. TDD does not — or at least should not — mean that you have to specify all possible tests first, then write all the code. Combined with refactoring, TDD should help developers learn about and think through the code as they’re writing code.

So now let’s talk about the problems with TDD and the barriers that keep many developers and development teams from adopting or succeeding with TDD:

  1. There can be a steep learning curve. Unit testing tools aren’t particularly hard to learn, but developers have to be very mindful about how their code is going to be structured and organized to really make TDD work.
  2. TDD requires a fair amount of discipline in your moment to moment approach, and it’s very easy to lose that under schedule pressure — and developers are pretty much always under some sort of schedule pressure.
  3. The requirement for modularity in code can be problematic for some otherwise effective developers who aren’t used to coding in a series of discrete steps.
  4. A common trap for development teams is writing the unit tests in such a way that the tests are tightly coupled to the implementation of the code. Unit testing that relies too heavily on mock objects is a common culprit behind this problem. In this all too common case, you’ll hear developers complain that the tests break too easily when they try to change the code. In that case, the tests are possibly doing more harm than good. The followup post on BDD will try to address this issue.
  5. Some development technologies or languages aren’t conducive to a TDD workflow. I purposely choose programming tools, libraries, and techniques with TDD usage in mind, but we rarely have complete control over our development environment.
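To illustrate the fourth problem in that list, here is a hedged sketch in Python using the standard library’s `unittest.mock`. The `PriceCalculator` and `FixedTaxRates` classes are hypothetical, and the hand-written fake is just one possible alternative to a mock, not a prescription. The first test pins down how the collaborator is called; the second only checks the result.

```python
from unittest.mock import Mock


class PriceCalculator:
    def __init__(self, tax_rates):
        self._tax_rates = tax_rates  # collaborator exposing rate_for(region)

    def total_cents(self, subtotal_cents, region):
        rate_percent = self._tax_rates.rate_for(region)
        return subtotal_cents + subtotal_cents * rate_percent // 100


# Brittle: asserts on the exact interaction with the collaborator.
# If the calculator later caches rates or looks them up twice, this
# test breaks even though the computed total is still correct.
def test_coupled_to_implementation():
    tax_rates = Mock()
    tax_rates.rate_for.return_value = 8
    PriceCalculator(tax_rates).total_cents(1000, "TX")
    tax_rates.rate_for.assert_called_once_with("TX")


# Sturdier: a tiny hand-written fake plus an assertion on the outcome.
class FixedTaxRates:
    def __init__(self, rates):
        self._rates = rates

    def rate_for(self, region):
        return self._rates[region]


def test_focused_on_behavior():
    calculator = PriceCalculator(FixedTaxRates({"TX": 8}))
    assert calculator.total_cents(1000, "TX") == 1080
```

Both tests pass today. The difference only shows up during refactoring, which is exactly when you need the tests to stay out of your way.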

You might ask, what about test coverage metrics? I’m personally not that concerned about test coverage numbers, don’t have any magic number you need to hit, and I think it’s very subjective anyway based on what kind of technology or code you’re writing. My main thought is that test coverage metrics are only somewhat informative: they can tell you when you may have problems, but they can never tell you that the actual test coverage is effective in any way. That being said, it’s relatively easy with the current development tooling to collect and publish test coverage metrics in your Continuous Integration builds, so there’s no reason not to track code coverage. In the end I think it’s more important for the development team to internalize the discipline to have effective test coverage on each and every push to source control than it is to have some kind of automated watchdog yelling at them. Lastly, as with all metrics, test coverage numbers are useless if the development team is knowingly gaming the test coverage numbers with worthless tests.
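A small, made-up Python example shows why a coverage number alone says nothing about effectiveness. The `apply_late_fee` function is hypothetical. The first test executes every line and both branches, so a coverage tool would report the function as fully covered, yet it would keep passing if the fee logic were completely wrong.

```python
def apply_late_fee(balance_cents, days_overdue):
    if days_overdue > 30:
        return balance_cents + 500
    return balance_cents


# Runs both branches, so coverage looks perfect, but it asserts nothing.
def test_worthless_but_fully_covering():
    apply_late_fee(1000, 45)
    apply_late_fee(1000, 5)


# Same coverage number, but this one actually pins down the behavior,
# including the boundary at exactly 30 days.
def test_meaningful():
    assert apply_late_fee(1000, 45) == 1500
    assert apply_late_fee(1000, 5) == 1000
    assert apply_late_fee(1000, 30) == 1000
```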

Does TDD have to be practiced in its pure “test first” form? Is it really any better than just writing the tests later? I wouldn’t say that you absolutely have to always do pure TDD. I frequently rough in code first, then when I have a clear idea of what I’m going to do, write the tests immediately after. The issue with a “test after” approach is that the test coverage is rarely as good as you’d get from a test-first approach, and you don’t get as many of the design benefits of TDD. Without some thought about how code is going to be tested upfront, my experience over the years is that you’ll often see much less modularity and worse code structure. For teams new to TDD I’d advise trying to work “pure” test first for a while, and then start to relax that standard later.

At the end of this, do I still believe in TDD after years of using it and years of development community backlash? I do, yes. My experience has been that code written in a TDD style is generally better structured and the codebase is more likely to be maintainable over time. I’ve also used TDD long enough to be well past the admittedly rough learning curve.

My personal approach has changed quite a bit over the years of course, with the biggest change being much more reliance on intermediate level integration tests and deemphasizing mock or stub objects, but that’s a longer conversation.

In my next post, I’ll finally talk about Behavior Driven Development, how it’s an evolution and I think a complement to TDD, and how we’ve been able to use BDD successfully at Calavista.

What would it take for you to adopt Marten?

If you’re stumbling in here without any knowledge of the Marten project (or me), Marten is an open source .Net library that developers can adopt in their project to use the rock solid Postgresql database as a pretty full featured document database and event store. If you’re unfamiliar with Marten, I think I’d say its feature set makes it similar to MongoDb (but the usage is significantly different), RavenDb, or Cosmos Db. On the event sourcing side of things, I think the only comparison in the .Net world is GetEventStore itself, but you can certainly piece together an event store by combining other OSS libraries and database engines.

The Marten community is working very hard on our forthcoming (and long delayed) V4.0 release. We’ve already made some big strides on the document database side of things, and now we’re deep into some significant event store improvements (this link looks best in VS Code w/ the Mermaid plugin active). At Calavista, we’re considering if and how we can build a development practice around Marten for existing and potential clients. I’ve obviously got a lot of skin in the game here as the original creator of Marten. Nothing would make me happier than Marten being even more successful and that I get to help Calavista clients use Marten in real life systems as part of my day job.

I’d really like to hear from other folks what it would really take for them to seriously consider adopting Marten. What is Marten lacking now that you would need, or what kind of community or company support options would be necessary for your shop to use Marten in projects? I’m happy to hear any and all feedback or suggestions from as many people as I can get to respond.

I’m happy to take comments here, or you can join the discussion for this topic on GitHub.

Existing Strengths

  • Marten is only a library, and at least for the document database features it’s very unobtrusive into your application code compared to many other persistence options
  • The Marten community is active and I hope you’d say that we’re welcoming to newcomers
  • By building on top of Postgresql, Marten comes with good cloud support from all the major cloud providers and plenty of existing monitoring options
  • Marten comes with many of the very real productivity advantages of a NoSQL solution, but has very strong transactional support from Postgresql itself
  • Marten’s event sourcing functionality comes “in the box” and there’s less work to do to fully incorporate event sourcing — including the all important “read side projection” support — into a .Net architecture than many other alternatives
  • Marten is part of the .Net Foundation
  • If you need commercial support for Marten, you can engage with Calavista Software.

Does any of that resonate with you? If you’ve used Marten before, is there anything missing from that list? And feel free to tell me you’re dubious about anything I’m claiming in the list above.

What’s already done or in flight

  • We made a lot of improvements to Marten’s Linq provider support. Not just in terms of expanding the querying scenarios we support, but also in improving the performance of the library across the board. I know this has been a source of trouble for many users in the past, and I’m excited about the improvements we’ve made in V4.
  • The event store functionality will get a lot more documentation — including sample applications — for V4
  • An important part of many event sourcing architectures is a background process to continuously build “projected” views of the raw events coming in. The current version of Marten has this capability, but it requires the user to do a lot of heavy architectural lifting to use it in any kind of clustered application. In V4, we’ll have an in the box recipe that will be used to do leader election and work distribution through an application cluster in “real server applications.” The asynchronous projection support in V4 will also support multi-tenancy (finally) and we have some ideas to greatly optimize projection rebuilds without system downtime
  • Using native Postgresql sharding for scalability, especially for the event store
  • Allowing users to specify event archival rules to keep the event store tables smaller and more performant
  • Adding more integration with .Net’s generic HostBuilder and standard logging abstractions for easier integration into .Net applications
  • Improving multi-tenancy usage based on user feedback
  • Document and event store metadata capabilities like you’d need for Marten to take part in end to end Open Telemetry tracing within your architecture.
  • More sample applications. To be honest, I’m hoping to find published reference applications built with Entity Framework Core and shift them to Marten. This might be part of an effort to show Jasper as a replacement for MediatR or NServiceBus/MassTransit as well.
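For readers new to event sourcing, the “projected view” idea mentioned in the list above can be sketched generically. This is plain Python and deliberately not Marten’s API; every name here (`AccountSummary`, `apply`, `catch_up`) is a hypothetical stand-in. It only shows the core loop a background projection process performs: fold events that have not been seen yet into a read-side view.

```python
from dataclasses import dataclass


@dataclass
class AccountSummary:
    """Read-side view derived from events; the event log stays the source of truth."""
    balance_cents: int = 0
    events_applied: int = 0


def apply(view: AccountSummary, event: dict) -> AccountSummary:
    # Fold a single event into the view.
    if event["type"] == "deposited":
        view.balance_cents += event["amount_cents"]
    elif event["type"] == "withdrawn":
        view.balance_cents -= event["amount_cents"]
    view.events_applied += 1
    return view


def catch_up(view: AccountSummary, event_log: list) -> AccountSummary:
    # Apply only the events this view has not processed yet.
    for event in event_log[view.events_applied:]:
        apply(view, event)
    return view
```

A real asynchronous projection process also has to deal with persistence, ordering, and distributing this work across a cluster, which this sketch ignores entirely.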

And again, does any of that address whatever concerns you might have about adopting Marten? Or that you’d already had in the past?

Other Ideas?

Here are some other ideas that have been kicked around for improving Marten usage, but these ideas would probably need to come through some sort of Marten commercialization or professional support.

  • Cloud hosting recipes. Hopefully through Calavista projects, I’d like to develop some pre-built guidance and quick recipes for standing up scalable and maintainable Marten/Postgresql environments on both Azure and AWS. This would include schema migrations, monitoring, dynamic scaling, and any necessary kind of database provisioning. I think this might get into Terraform/Pulumi infrastructure as well.
  • Cloud hosting models for parallelizing and distributing work with asynchronous event projections. Maybe even getting into dynamic scaling.
  • Multi-tenancy through separate databases for each client tenant. You can pull this off today yourself, but there are a lot of things to manage. Here I’m proposing more cloud hosting recipes for Marten/Postgresql that would include schema migrations and distributed work strategies for processing asynchronous event projections across the tenant databases.
  • Some kind of management user interface? I honestly don’t know what we’d do with that yet, but other folks have asked for something.
  • Event streaming Marten events through Kafka, Pulsar, AWS Kinesis, or Azure Event Hubs
  • Marten Outbox and Inbox approaches with messaging tools. I’ve already got this built and working with Jasper, but we could extend this to MassTransit or NServiceBus as well.