In this series of posts, I’m going to cover my views on software testing. I’ve spent a lot of time recently refactoring some fairly large test suites, and would like to share some best practices I’ve learned along the way.
In this first post, I’d like to cover some high-level testing strategies and my philosophy towards testing. In subsequent posts, I’m going to dive deeper into testing in each part of the stack.
The Test Pyramid
No conversation about testing is complete without talking about the test pyramid:
The test pyramid should guide your approach to writing tests. It shows you where to focus your time. The most important takeaways from the diagram are:
- The higher you move up the pyramid, the slower the tests run. This translates to longer feedback cycles. Longer feedback cycles in turn mean more time spent tracking down the cause of a failure, because the failure generally surfaces further from the source of the actual problem.
- The higher you move up the pyramid, the more expensive tests become to write and maintain. As you move up the pyramid, the elements of the system you are testing and their interactions become increasingly complex.
- Having more tests in a lower level than a higher level is an aspirational goal. It’s highly desirable, but not always possible. Write the test where it actually fits. More on this below.
There are different opinions about what each type of test in the pyramid encompasses, so let me summarize my definitions. I will delve deeper into these definitions in future posts.
- Unit tests – test the smallest unit of code – usually a function, class, or data structure. The most important aspect of unit tests is that they are fast. Blazing fast. Stay tuned for part 2 of this series for a more in-depth discussion on unit testing.
- Component Tests – A component test is a kind of integration test. Its purpose is to verify the correct function of a single component in isolation. You should probably have more component tests in your test suite. Stay tuned for more on component tests in part 2 of the series.
- Integration Tests – An integration test generally involves testing the interaction between two or more components in the system.
- System Tests – test an instance of (or part of) your actual system. They will generally correspond to a test environment configuration of your system. As such, these tests can be expensive to bootstrap and slow to run. This makes them as hard to troubleshoot as your real system, or harder [1]. So be careful how much you rely on them in your test process. Browser tests are a type of system test that focuses on complete end-to-end scenarios from the perspective of an end user. They should be limited to scenarios that cover multi-step or multi-page workflows that cannot be effectively tested in other ways.
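To make the first two definitions a little more concrete, here is a minimal Python sketch. Everything in it is hypothetical and only for illustration: `apply_discount` stands in for a small piece of pure logic, and `UserStore` stands in for a component that talks to a database. The unit test touches no I/O at all, while the component test exercises one component against a real (in-memory SQLite) database.

```python
import sqlite3
import unittest


def apply_discount(price_cents, percent):
    """Pure logic with no I/O: a natural fit for a unit test."""
    return price_cents - (price_cents * percent) // 100


class UserStore:
    """Hypothetical component: a thin data-access layer over a SQL database."""

    def __init__(self, conn):
        self.conn = conn
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS users "
            "(id INTEGER PRIMARY KEY, email TEXT UNIQUE NOT NULL)"
        )

    def add(self, email):
        cur = self.conn.execute("INSERT INTO users (email) VALUES (?)", (email,))
        self.conn.commit()
        return cur.lastrowid

    def find_by_email(self, email):
        row = self.conn.execute(
            "SELECT id, email FROM users WHERE email = ?", (email,)
        ).fetchone()
        return None if row is None else {"id": row[0], "email": row[1]}


class ApplyDiscountUnitTest(unittest.TestCase):
    """Unit test: no database, no network, no filesystem."""

    def test_ten_percent_off(self):
        self.assertEqual(apply_discount(2000, 10), 1800)


class UserStoreComponentTest(unittest.TestCase):
    """Component test: one component, exercised against a real SQL engine."""

    def setUp(self):
        # An in-memory database keeps the test isolated and reasonably fast.
        self.conn = sqlite3.connect(":memory:")
        self.store = UserStore(self.conn)

    def tearDown(self):
        self.conn.close()

    def test_round_trip(self):
        user_id = self.store.add("a@example.com")
        self.assertEqual(
            self.store.find_by_email("a@example.com"),
            {"id": user_id, "email": "a@example.com"},
        )

    def test_duplicate_email_is_rejected(self):
        self.store.add("a@example.com")
        with self.assertRaises(sqlite3.IntegrityError):
            self.store.add("a@example.com")
```

Saved to a file, both classes can be run with `python -m unittest`. The duplicate-email case is the kind of behavior a unit test with a mocked database would not really verify, since the uniqueness rule lives in the schema rather than in the Python code.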
Using the Test Pyramid
The test pyramid is instructive, but I think it’s often misunderstood. The pyramid is best used as a guide for order of preference. Basically, you should seek to cover a test case starting at the lowest applicable level of the pyramid.
For example, while it is more desirable to cover a particular test case at the unit test level, it’s not always possible or applicable. If you can write a meaningful unit test, do so, but only if it actually fits the unit test definition. Otherwise you risk ending up with a slow unit test that doesn’t add much value, or worse, one that fools you into thinking a particular case is adequately tested.
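One way this can play out in practice is sketched below in Python. The names (`parse_page_size`, `list_items_handler`, the page size limits) are made up for illustration and not taken from any real project. The edge cases of the parsing rule are covered at the unit level, where they are cheap, and a single higher-level test checks only that the handler is wired to the rule.

```python
import unittest

DEFAULT_PAGE_SIZE = 20   # assumed default, for illustration only
MAX_PAGE_SIZE = 100      # assumed upper bound, for illustration only


def parse_page_size(raw):
    """Pure function: turn a raw query-string value into a safe page size."""
    if raw is None or raw == "":
        return DEFAULT_PAGE_SIZE
    try:
        size = int(raw)
    except ValueError:
        return DEFAULT_PAGE_SIZE
    # Clamp into the range [1, MAX_PAGE_SIZE].
    return max(1, min(size, MAX_PAGE_SIZE))


def list_items_handler(query, fetch_items):
    """Thin handler: parsing is delegated, so only the wiring is left to test."""
    return fetch_items(limit=parse_page_size(query.get("page_size")))


class ParsePageSizeTest(unittest.TestCase):
    """Lowest applicable level: every edge case lives here."""

    def test_edge_cases(self):
        cases = [
            (None, 20), ("", 20), ("abc", 20),
            ("0", 1), ("-5", 1), ("50", 50), ("1000", 100),
        ]
        for raw, expected in cases:
            with self.subTest(raw=raw):
                self.assertEqual(parse_page_size(raw), expected)


class HandlerWiringTest(unittest.TestCase):
    """One level up: a single test that the handler uses the parsed value."""

    def test_passes_parsed_limit_to_fetch(self):
        seen = {}

        def fake_fetch(limit):
            seen["limit"] = limit
            return []

        list_items_handler({"page_size": "1000"}, fake_fetch)
        self.assertEqual(seen["limit"], 100)
```

If the rule could not be separated from the database or the web framework, forcing it into a "unit" test would mean heavy mocking, and at that point the next level up is probably where the test actually fits.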
Another diagram which I think more accurately reflects many real-world projects is what is sometimes called the Test Diamond:
Many projects are web-based apps. They typically talk to a database, index data in a full-text search engine, have APIs, and probably some back-end infrastructure to run jobs asynchronously. In these types of projects, relative to the whole code-base, there probably isn’t going to be a lot of algorithmic code. It’s mostly putting components together with small pockets of tricky logic, and lots of configuration wiring everything together.
In this type of application, the diamond makes much more sense. Sure, you’ll have some unit tests, but your bread-and-butter tests will be integration tests. I’ll have much more to say about integration tests in part 2 of this series.
Identifying Critical Tests
So you agree, testing is important. You and your team are all jazzed up about writing tests! Over time, you will probably end up with lots of them. This can be good or bad.
More tests don’t necessarily mean better test coverage. I’ve seen non-trivial percentages of tests in a code-base be either redundant, ineffective or downright useless. Tests aren’t free. They bring a running time and maintenance burden. Make sure your tests are providing a positive overall ROI [2].
One thing that helps wrangle a massive test suite is defining a critical subcategory of your tests. Particularly for the slower categories of tests (integration and above), critical tests are great at providing faster feedback on critical functionality and systemic problems [3]. The definition of critical depends on your organization, but would commonly include the following:
- Functionality connected to SLAs
- Functionality that, if broken, would constitute a significant degradation of the service to your users. Typically these are the fundamental use cases of your system. Not all features are created equal. It’s logical to prioritize a cross-section of super important functionality so you find out early if you broke it, rather than in the later stages of your CI pipeline.
- Legal or regulatory requirements
- Things involving money
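As a rough sketch of what a critical subcategory could look like in code, here is one way to tag and select tests using only Python’s standard `unittest` module. The `critical` decorator, the `critical_suite` helper, and the example tests are all hypothetical conventions invented for this illustration, not features of `unittest` itself.

```python
import unittest


def critical(test_func):
    """Mark a test method as part of the critical subset (made-up convention)."""
    test_func.critical = True
    return test_func


class CheckoutTests(unittest.TestCase):
    @critical
    def test_charge_matches_cart_total(self):
        # Involves money, so it belongs in the critical subset.
        self.assertEqual(sum([1999, 501]), 2500)

    def test_receipt_footer_text(self):
        # Nice to have, but not critical.
        self.assertIn("Thank you", "Thank you for your order")


def iter_tests(suite):
    """Flatten nested TestSuite objects into individual test cases."""
    for item in suite:
        if isinstance(item, unittest.TestSuite):
            yield from iter_tests(item)
        else:
            yield item


def critical_suite(suite):
    """Build a new suite containing only the tests marked with @critical."""
    selected = unittest.TestSuite()
    for test in iter_tests(suite):
        # test.id() looks like "module.Class.method"; keep the method name.
        method_name = test.id().rsplit(".", 1)[-1]
        method = getattr(test, method_name, None)
        if getattr(method, "critical", False):
            selected.addTest(test)
    return selected
```

An early CI stage could then run `unittest.TextTestRunner().run(critical_suite(all_tests))`, where `all_tests` is whatever suite your loader produces, and leave the full suite for a later stage. Test frameworks with built-in tagging, such as pytest markers selected with `-m`, give you the same effect with less plumbing.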
That’s it for this introductory post. Stay tuned for the next part in the series, where I’ll be diving into unit and integration testing.
[1] For example, how good is the logging and monitoring on your test system? It typically lacks the robustness of a production system.
[2] I first heard a discussion of tests in terms of ROI in Working Effectively with Unit Tests, and it hits the nail on the head. I like thinking of tests in terms of ROI because it helps frame how you approach tests, in particular legacy tests.