I’m trying to walk a line in this post between avoiding specifics about a client project, for obvious reasons, and providing enough detail to make the post worthwhile for that client. One of our client’s development managers is interested in speeding up their testing, and I’m hoping to use this post to lay out some ideas and approaches to improve the testing procedures in this system.
I’ve been part of an integration project for the past couple years that validates, routes, and processes financial transactions coming from an external partner of our client’s all the way to a very large 3rd party application hosted in our client’s environment. We’re in the middle of some significant changes in the integration to that 3rd party application that are going to trigger a round of regression testing of the entire system, and that’s where this post comes in. Testing this application has been very challenging and extremely time consuming. Any opportunity to make regression testing quicker and more effective is going to make everyone’s jobs easier.
It’s not just that the testing itself is slower than desired. Because the testing is slow and not easily repeatable, the development team can’t really do much technical improvement through refactoring as they learn more about the system behavior and how the code structure is working out over time. That’s been a definite negative for code and architectural quality.
Before I get into the details of the existing system, know that what I’m showing and discussing here is a bit of an idealized version of how I wish we had architected the system and what we’ve recommended to the client for the longer term. The real system is a bit messier and significantly harder to test than what I’m presenting here, but there’s a lesson in that for you: testability should be a first class architectural goal in many cases (and Conway’s Law is legitimately something to work around).
From a 10,000 foot level, here’s the entire system:
The workflow is:
- A couple times a day, a new flat file containing new transactions will be dropped into a file share
- The File Reader console application is executed to find this file, parse it into little transaction messages, and publish those messages to Rabbit MQ. There’s a little bit of database tracking going on for reporting and just general activity tracking.
- Rabbit MQ publishes the transaction messages to the subscribing Transaction Processor application (an ASP.NET Core application with an active subscriber for these incoming messages).
- The Transaction Processor handles each transaction message by:
  - Pulling in a helluva lot of information from the 3rd Party Application and other information from a Configuration DB related to the account number in the transaction message
  - Using the information from the previous step to validate whether the transaction can be processed normally, or has to go into a queue for manual resolution
  - For the valid transactions, using the information from the first of these steps to decide how the money in the incoming transaction will be applied (routing to sub-transactions)
  - Sending the routed sub-transactions from the previous step to the 3rd Party Application through its externally facing API.
While there are some unit tests and intermediate level integration tests today on some of the subsystems, the overall official testing effort to date has relied strictly on end to end, manual testing of the entire system. Some of the emphasis on black box, end to end testing is due to our client’s mandatory regulatory auditing requirements and that can’t completely go away. However, there’s worlds of opportunity and new willingness to explore other alternatives like white box testing techniques or new processes for testing as a complement to the formal audit-style testing, so let’s jump into some ideas for making things work more efficiently.
Some Necessary Shifts in Testing Philosophy
First off, there’s an important shift from trying to prove that the system is working perfectly with strictly black box testing to thinking about testing as a feedback mechanism to identify and remove problems in the code so that the code can be deployed to production. If you look at testing as more of a feedback cycle, you can utilize the testing pyramid idea to maximize feedback about how your system functions with more efficient testing techniques.
Secondly, I think you have to have collaboration between testers, developers, and architects to make white box testing more effective. Part of that is increasing the testability of the system architecture, and another part of that is trying to avoid duplication in effort between tests written by developers and other tests performed by the testers. Moreover, if developers are actively engaged in writing tests — and they should be in my world view — it’s very helpful to have the testers involved in the content of those developer-written tests. In other words, I think that having strict separation between testers and development can be very inefficient. I know there are folks who strongly believe that strict independence for the testers from the developers is necessary, but I think that does more harm than good.
For more information on whitebox vs. blackbox testing and improving test feedback, also see:
- Jeremy’s Only Rule of Testing (about choosing testing techniques to maximize feedback)
- Succeeding with Automated Integration Testing (for a discussion on white box vs black box testing)
- Martin Fowler on the Test Pyramid
If you’ll buy either of the two previous paragraphs, or you’re at least open-minded enough to continue, let’s see how some of this Test Pyramid thinking would play out in our big integration system.
At a high level, I would want the testing strategy to focus on:
- Some kind of Behavior Driven Development approach for all the business rules for validating and routing the transactions.
- Mid-level integration tests on all the code that acts as a gateway or service proxy to the 3rd Party Application. This would include both the code that sends commands to the 3rd Party Application and the code that queries or reads the 3rd Party Application.
- Mid-level integration tests on the File Reader that probably stub the outgoing Rabbit MQ and just measure how the File Reader parses the incoming files, what tracking information it writes, and what messages it publishes.
- A handful of fully end to end tests through the entire system to prove out all the integration points — but by and large you use finer grained tests to test out business rules and the integration with the 3rd Party Application.
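To make the File Reader bullet above a little more concrete, here’s a minimal sketch of what a mid-level test with a stubbed publisher might look like. The real system is .NET, so the Python here is only for brevity. The record layout (a 10 character account number followed by an 8 digit amount in cents) and the names `FileReader`, `StubPublisher`, and `parse_line` are all invented for this example and are not the client’s actual format or code.

```python
from dataclasses import dataclass
from typing import List, Protocol


@dataclass(frozen=True)
class TransactionMessage:
    account_number: str
    amount_cents: int


class Publisher(Protocol):
    def publish(self, message: TransactionMessage) -> None: ...


class StubPublisher:
    """In-memory stand-in for the real message broker publisher."""

    def __init__(self) -> None:
        self.published: List[TransactionMessage] = []

    def publish(self, message: TransactionMessage) -> None:
        self.published.append(message)


def parse_line(line: str) -> TransactionMessage:
    # Hypothetical fixed-width layout: columns 0-9 account, 10-17 amount in cents
    return TransactionMessage(
        account_number=line[0:10].strip(),
        amount_cents=int(line[10:18]),
    )


class FileReader:
    def __init__(self, publisher: Publisher) -> None:
        self._publisher = publisher

    def process(self, lines: List[str]) -> int:
        """Parse each non-blank line and publish it; return the count published."""
        count = 0
        for line in lines:
            if not line.strip():
                continue
            self._publisher.publish(parse_line(line))
            count += 1
        return count


def test_file_reader_publishes_one_message_per_line() -> None:
    publisher = StubPublisher()
    reader = FileReader(publisher)

    count = reader.process(["ACCT00000100012500", "", "ACCT00000200000999"])

    assert count == 2
    assert publisher.published == [
        TransactionMessage("ACCT000001", 12500),
        TransactionMessage("ACCT000002", 999),
    ]
```

The test never touches a broker, so it can run on every build. The few true end to end tests are then left to prove that the real publisher is wired up correctly.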
The Transaction Processor
Most of the meat of the bigger transaction processing project is within what I’m calling the Transaction Processor shown below in a little more detail:
There’s a couple big responsibilities here:
- Query data from the 3rd Party Application, with its heinously unusable custom XML query language, for use inside the business rules
- Look up some configuration parameters about accounts from a second Configuration DB
- Carry out validation rules against incoming transactions
- Route the incoming transactions into sub-transactions based on business rules
- Post the sub-transactions to the 3rd Party Application with its, shall we say, interesting XML API.
Channeling some Domain-Driven Design thinking here, let’s go straight into the business rules for validation and routing. The business rules required a lot of input parameters, there were a lot of permutations to build and test, and the developers new to the problem domain had plenty of misunderstandings early on about the desired behavior.
From an architectural standpoint, I think it is extremely important to completely isolate these business rules from the 3rd Party Application, the configuration databases, and even the incoming flat file format because:
- It was very difficult to set up test scenario inputs in the 3rd Party Application
- There’s a tremendous number of test cases because of the permutations on account state and transaction parameters involved, so there would be a large benefit to tests being quick to author and execute
- This logic is key to the business and has already evolved significantly since this project started. It’s imperative that this logic be safe to change over time, and that happens most effectively when it’s cheap to write new tests and quick to execute the existing test coverage.
- I probably shouldn’t say this too loudly, but I think this client should reconsider coupling their ecosystem to the 3rd Party Application
To that end, the business rules should only depend on a domain model that’s internal to the Transaction Processor. We’ll use the A-Frame Architecture idea from Jim Shore’s Testing Without Mocks paper to isolate the business rule behavior from the infrastructure. The domain model objects that implement all the business logic will have no dependency whatsoever on the external dependencies. Instead, we’ll effectively write our own mapping layer to take the data returned from the 3rd Party Application and the Configuration DB and build all the state the domain model needs, then hand that to the business logic code in the domain model.
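Here’s a rough sketch of that separation, in Python for brevity rather than the .NET of the real system. Everything in it is an assumption made for illustration: the `Account` fields, the raw field names (`AcctNo`, `PastDue`, `status`), and especially the rules themselves (closed accounts go to manual review, money is applied to past due balances first). None of it is the client’s real logic.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Account:
    """Domain model state, with no reference to any external system."""
    account_number: str
    is_open: bool
    past_due_cents: int


@dataclass(frozen=True)
class SubTransaction:
    bucket: str
    amount_cents: int


# --- Logic: pure functions over the domain model (made-up rules) ---
def needs_manual_review(account: Account, amount_cents: int) -> bool:
    return (not account.is_open) or amount_cents <= 0


def route(account: Account, amount_cents: int) -> List[SubTransaction]:
    """Apply money to the past-due balance first, then to principal."""
    to_past_due = min(amount_cents, account.past_due_cents)
    parts = [
        SubTransaction("past_due", to_past_due),
        SubTransaction("principal", amount_cents - to_past_due),
    ]
    return [part for part in parts if part.amount_cents > 0]


# --- Mapping layer: translate raw external data into the domain model ---
def build_account(third_party_row: Dict[str, str],
                  config_row: Dict[str, str]) -> Account:
    return Account(
        account_number=third_party_row["AcctNo"],
        is_open=config_row["status"] == "OPEN",
        past_due_cents=int(third_party_row["PastDue"]),
    )


# Usage: the rules run against plain in-memory objects, no external systems.
account = build_account({"AcctNo": "A1", "PastDue": "300"}, {"status": "OPEN"})
assert not needs_manual_review(account, 1000)
assert route(account, 1000) == [
    SubTransaction("past_due", 300),
    SubTransaction("principal", 700),
]
```

Because `route` and `needs_manual_review` only see `Account`, a test case is one constructor call away, and the mapping layer can be tested separately against canned responses.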
From the perspective of testing, there’s a lot of opportunity to get the business rules wrong. Rather than depend solely on design or requirements documents, I strongly recommend using Behavior Driven Development (BDD) techniques here to author executable specifications that are readable and reviewed (if not written) by the business domain experts and testers. What I largely recommend here is that the developers mostly write the test harness code, but business domain experts and more likely testers will own the content and meaning of the BDD specifications. Working in this manner, we should be able to treat the BDD specifications as the official tests for the business rule behavior even though this doesn’t run the entire process.
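To give a feel for that division of labor, here’s a deliberately tiny, hand-rolled sketch in Python (again, the real system is .NET). The table is the part a tester or domain expert would own, and the harness underneath is the part the developers would write. The `decide` function, the column names, and the rules are placeholders I made up for this example. A real BDD tool would replace the little parser.

```python
from typing import Dict, List

# Owned by testers / domain experts: readable and reviewable without code.
SPEC = """
| account_status | amount | expected |
| OPEN           | 100.00 | process  |
| CLOSED         | 100.00 | manual   |
| OPEN           | 0.00   | manual   |
"""


def decide(account_status: str, amount: float) -> str:
    """Stand-in for the real validation rules (made up for this sketch)."""
    if account_status != "OPEN" or amount <= 0:
        return "manual"
    return "process"


def parse_table(text: str) -> List[Dict[str, str]]:
    """Turn a pipe-delimited table into one dict per data row."""
    rows = [
        [cell.strip() for cell in line.strip().strip("|").split("|")]
        for line in text.strip().splitlines()
    ]
    header, data = rows[0], rows[1:]
    return [dict(zip(header, row)) for row in data]


def run_spec(text: str) -> List[str]:
    """Return a description of every failing row; empty means all passed."""
    failures = []
    for row in parse_table(text):
        actual = decide(row["account_status"], float(row["amount"]))
        if actual != row["expected"]:
            failures.append(f"{row} -> got {actual}")
    return failures


assert run_spec(SPEC) == []
```

Adding a new permutation is one more row in the table, which is the kind of cheap authoring these business rules need.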
So that handles the business rules, now on to the rest of the Transaction Processor. The “controller” code in the diagram is playing a coordination role to mediate between the business rules and the code that interacts directly with the 3rd Party Application’s external API endpoints. I’d mostly use unit tests and maybe even *gasp* interaction testing with mock objects to test out the workflow and error handling of this code.
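As a sketch of what that interaction testing could look like, here’s a Python version using the standard library’s `unittest.mock` (the real code is .NET, and the shape of `TransactionController` plus the method names `load_account`, `needs_manual_review`, `route`, `post`, and `enqueue` are my own inventions for this example).

```python
from unittest.mock import Mock, call


class TransactionController:
    """Coordinates lookups, rules, and posting; holds no business rules itself."""

    def __init__(self, gateway, rules, manual_queue):
        self._gateway = gateway
        self._rules = rules
        self._manual_queue = manual_queue

    def handle(self, message):
        account = self._gateway.load_account(message["account_number"])
        if self._rules.needs_manual_review(account, message):
            self._manual_queue.enqueue(message)
            return
        for sub_transaction in self._rules.route(account, message):
            self._gateway.post(sub_transaction)


def test_invalid_transaction_goes_to_manual_queue():
    gateway, rules, manual_queue = Mock(), Mock(), Mock()
    rules.needs_manual_review.return_value = True
    message = {"account_number": "A1", "amount_cents": 500}

    TransactionController(gateway, rules, manual_queue).handle(message)

    manual_queue.enqueue.assert_called_once_with(message)
    gateway.post.assert_not_called()


def test_valid_transaction_posts_each_sub_transaction():
    gateway, rules, manual_queue = Mock(), Mock(), Mock()
    rules.needs_manual_review.return_value = False
    rules.route.return_value = ["sub-1", "sub-2"]
    message = {"account_number": "A1", "amount_cents": 500}

    TransactionController(gateway, rules, manual_queue).handle(message)

    assert gateway.post.call_args_list == [call("sub-1"), call("sub-2")]
    manual_queue.enqueue.assert_not_called()
```

These tests say nothing about whether the rules or the gateway are correct. They only pin down the coordination, which is the controller’s one job.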
The service gateway code that interacts with the 3rd Party Application was extremely problematic in both development and testing. In retrospect, I wish we’d hammered at this code in isolation much more before even bothering trying to run end to end tests. The big issue we never pushed through (yet) was how to establish known system state in the 3rd Party Application so that we could write reliable automated tests around just the service gateway code in the Transaction Processor. I think it would be worthwhile for domain experts and/or testers to be involved in this step as well to verify the expected results are really happening in the 3rd Party Application.
Lastly, I’d opt to do some bigger tests for just the Transaction Processor where you directly enqueue the transaction messages in Rabbit MQ and test the entire Transaction Processor stack all the way down to the external dependencies. The point of these tests is to prove out the integrations and configuration. You don’t try to recreate all the business rule tests already covered by the smaller, faster unit tests.
“Some” End to End Tests
There are absolutely some issues that can only be tested through true, end to end tests. Integrations, configuration, environments, and security are examples. We’ll still write and perform some end to end tests, but we won’t try to recreate the business functionality tests already covered by the finer grained tests.
No matter what though, the tests need to be as easily repeatable as possible so there’s still going to be a level of automation to speed things along. Here’s my thoughts on what that might look like:
- The flat file format was originally used by mainframe applications, so as you can imagine, it’s not remotely user friendly to edit or read. I’d suggest using some custom code that can transform a much simpler format to the mainframe-friendly format so the testers can write new test cases more efficiently and everyone else can actually read and understand the test inputs
- The undeniable, cardinal rule of automated testing is that you have to have known inputs and expected outcomes. In this system, that means being able to set up the 3rd Party Application in a known state for each end to end testing scenario. The failure to do that (not a technical impossibility, but it’s a long story) is my single biggest regret from this project. See My Opinions on Data Setup for Functional Tests for more on what I recommend for test input data.
- Automating the testing of asynchronous workflows like this system can be very challenging. The biggest issue is making an automated test harness understand when the work is really done across multiple systems so it can proceed to the “assert” part of the standard “arrange, act, assert” test workflow. I’ve had some success with this in the past by making the test harness listen to the various application logging or some kind of visible side effect like data being written to a database to “know” when the work is complete.
- Tests do fail from time to time, so I’d actually try to have the end to end test harness able to gather up the relevant logs for all the systems active in the test. That’s even more valuable if you can somehow manage to correlate the logging activity with only the active test run.
- Finally, the big expensive end to end tests my client has to follow for official certification and auditing? Yeah, you have to do that, but my very strong recommendation and where I think they’re starting to head is to use finer-grained and more efficient testing techniques to remove problems first. Then come back and do the laboriously slow audit tests when you can justifiably expect success with few iterations.
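For the asynchronous workflow point in the list above, the core of the “know when the work is done” trick is usually just a polling helper with a timeout. Here’s a minimal sketch in Python (the real harness would be .NET). The `wait_until` name and its defaults are my own, and the `processed` list is only a stand-in for whatever visible side effect you can watch, like a tracking table row or a log entry.

```python
import threading
import time
from typing import Callable


def wait_until(condition: Callable[[], bool],
               timeout_seconds: float = 30.0,
               poll_interval_seconds: float = 0.25) -> bool:
    """Poll `condition` until it returns True or the timeout elapses."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval_seconds)


# Usage: simulate another system finishing its work a little later.
processed = []  # stands in for a tracking table or a log sink
threading.Timer(0.05, lambda: processed.append("txn-1")).start()

# "Act" has happened; block here before moving on to the "assert" step.
finished = wait_until(lambda: "txn-1" in processed,
                      timeout_seconds=2.0,
                      poll_interval_seconds=0.01)
assert finished, "timed out waiting for txn-1 to be processed"
```

Returning `False` on a timeout instead of hanging lets the harness fail the test with a useful message and go collect the logs.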
There’s a couple big points I wanted to drive home in this post:
- Embrace the test pyramid idea, and try to get over any aversion to white box testing because of its advantages for efficiency
- Treat testing as a feedback mechanism more than a certification process
- Tests of all types need to be repeatable to be effective feedback. Manual testing, and especially manual testing where it’s time consuming to set up the necessary system state first, is not very repeatable
- I think you need to embrace the Agile idea of blurring the lines between roles. Developers and architects need to be involved in the automated testing for a better chance of success. Testers may need to get their hands dirty directly in the code or at least exploit their knowledge of the coding internals in order to make the testing more efficient
- Developers, testers, and architects need to collaborate to be truly successful in testing. Waterfall style testing where all testing happens at the end is just not the way to be successful
- Try to avoid duplicating effort between developer written tests and the tester activity, which might be just yet another way of saying the testers and developers need to be collaborating as the project goes on
- Feedback cycles of all kinds are valuable for quality software
One thought on “A Small Case Study in Test Automation (and other things)”
> The failure to do that (not a technical impossibility, but it’s a long story) is my single biggest regret from this project
Not long ago I was lead for an enterprise project with a ton of integrations, including a legacy, in-house, monolithic ERP system (parts of it were almost 50 years old!) – my biggest regret was the lack of a decent, consistent, repeatable test story for integration and E2E tests.
It would have been a lot of effort to get something working “properly”, but IMO it would have been well worth the effort. Unfortunately I could never get buy-in from others.
Would love to hear more about how others tackle this!