It has been a while since our last edition of Why You Don’t Have Good Automation. I am hopeful today. On Monday, I did a presentation on the reasoning and purpose behind my detailed labeling of test cases. The gist of the presentation was that placing emphasis on end-to-end testing, as if it were the only way to do automated testing, would lead us to a limited return on the investment we put into writing all that sweet, sweet automation infrastructure. When I replaced the older smoke test suite, I was happy to see the dev manager and my boss go all gooey-eyed over the new one, but I am bothered by the false sense of security it gives. I am not saying that it is useless, but it has limited potential.
I am disillusioned where end-to-end testing is concerned. It is hideously difficult to build all the plumbing you need just to get to the point where you can run a single end-to-end test. I have been through a couple of iterations of this nightmare. The first time was during the Frameworks Wars, when my co-worker and I toiled mightily to automate a suite of forty or so manual tests. There was just so much to do, and every few weeks we were slammed with an avalanche of manual testing work for the application. We were also new to the company and not all that familiar with the surrounding, flaky systems that our application interacts with in each of these end-to-end scenarios. Combine that with our total lack of organized training and some very unreasonable expectations, and we were set up to fail, and fail hard.
At the time, both my co-worker and I were operating under the same limited perspective as everyone else: end-to-end testing was where it was at, and if you weren’t doing that, you were not doing Good Automation. So we coded and coded, all the while being asked, “Where is all that sweet, sweet good automation?” We actually built a fairly extensive library of page objects, which later became the basis of the current page object library, but because we didn’t have the plumbing to get all the way from A to Z, it amounted to nothing in the eyes of the non-coding people around us. We made some mistakes, such as thinking we could not put our code out there unless it was ‘done’ and leaving the results reporting to the end. We should have had a parade up and down the hallways once a week, given tech talks and otherwise tooted our horns relentlessly. Our unpolished code was better than most of the test code I had seen up to that point. If we had had a simple HTML report with green and red pass/fail indicators, I think the non-coding cadre would have believed we were actually doing something.
Our biggest mistake, however, was trying to do end-to-end testing first. If we had broken those long end-to-end test cases up into smaller, atomic test cases that could be strung together into end-to-end scenarios or run as standalone tests, we would have been able to report each week that we had automated twenty-five new test cases. It’s very easy to automate a test that verifies that a dialog opens when you click its accessor. It’s easy to automate a test that verifies that the title on the dialog is correct. It’s easy to automate a test that verifies that the dialog is dismissed when you close, cancel or submit it. Since our application had a gazillion dialogs, we could have written an entire test suite just for verifying the basic behavior of dialogs.

It’s not easy to verify that a given type of user can log in, access the appropriate accounts, create a configuration, fill out a complicated data entry form, save the configuration and then make it active. There are so many interactions with the UI along the way that the chances of some flaky thing short-circuiting the test are too high. Maybe one of the dialogs gets stuck, or a page doesn’t load. Given the sorry state of our test environments, this kind of problem arises every other test run, especially during periods of active and intense development. I won’t go into the details, but one of the systems that the application must interact with at the end of the long process of making a configuration go live is the bane of my existence. Imagine kicking off a test run with a lower bound of ninety minutes from start to finish, only to have it fail at the VERY. LAST. STEP. because that final sub-system in the pipeline chokes on a common environmental issue.
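To make the contrast concrete, here is roughly what those atomic dialog tests might look like. Everything in this sketch is hypothetical: `DialogPage` and `FakeDriver` are stand-ins for a real Selenium page object and WebDriver, and the dialog title is invented. The point is the shape of the tests, not the implementation.

```python
class FakeDriver:
    """Stand-in for a WebDriver; a real suite would use selenium.webdriver."""
    def __init__(self):
        self.dialog_open = False
        self.dialog_title = ""

    def click_accessor(self):
        self.dialog_open = True
        self.dialog_title = "Edit Configuration"

    def click_cancel(self):
        self.dialog_open = False


class DialogPage:
    """Page object wrapping one dialog; each method is one small interaction."""
    def __init__(self, driver):
        self.driver = driver

    def open(self):
        self.driver.click_accessor()

    def is_open(self):
        return self.driver.dialog_open

    def title(self):
        return self.driver.dialog_title

    def cancel(self):
        self.driver.click_cancel()


# Three atomic tests: each gets in, verifies one thing, and gets out.
def test_dialog_opens_on_click():
    page = DialogPage(FakeDriver())
    page.open()
    assert page.is_open()

def test_dialog_title_is_correct():
    page = DialogPage(FakeDriver())
    page.open()
    assert page.title() == "Edit Configuration"

def test_dialog_closes_on_cancel():
    page = DialogPage(FakeDriver())
    page.open()
    page.cancel()
    assert not page.is_open()
```

Each of these stands alone: no login, no account setup, no fifty preceding steps. Point the page object at any of a gazillion dialogs and you have a suite.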
The advantage of writing lots and lots of small tests that get into your app quick, verify a few small things and get out quick is that the chain of interactions with the UI is so much shorter. There are fewer opportunities for an unstable test environment to screw up your test results. And if a test does fail? WHO CARES. You just retry it a few times to see if it passes on a subsequent attempt. Retrying is easy because this isolated little test lacks all the dependencies that a step in a fifty-step end-to-end scenario would have. And why do you have to verify everything in one end-to-end chain? Why is it not sufficient to verify all the little steps in isolation? I’m not saying you shouldn’t have some end-to-end test automation; there is a place for it. It’s just not terribly comprehensive. I know that sounds crazy. It’s an end-to-end test. By definition, that’s comprehensive, right? Nope. Nope. Nope. It’s anything but comprehensive. It represents one very narrow, well-defined pathway through your system. If you want a comprehensive test suite, and you want it to deliver results sometime in the next decade, you need a suite composed of thousands of atomic little tests scattered across your system’s functionality.
This brings me back to the beginning: why all these labels on the test cases? First, the labels categorize the tests so that they are easy to search for. When you write a suite of atomic tests, you will end up with hundreds, if not thousands, of tests, and you need to be able to find the one you want quickly. Second, the labels make it possible to specify dynamically, at commit time, which tests a developer would like to run. The end-to-end tests, given their limited coverage, long running time and relative instability, are not helpful in this scenario; the developer wants relevant, fast feedback. We also don’t have a giant Selenium grid at our fingertips, so we have to be conservative about how we use those resources. If there are eight commits in a day, the full suite of end-to-end tests would overwhelm them easily.
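The mechanics of label-driven selection can be as simple as attaching a set of labels to each test and filtering on them. A pure-Python sketch, with invented test and label names (in practice this is what pytest markers with `-m` expressions, or TestNG groups, give you):

```python
# Hypothetical label registry: each test name maps to its set of labels.
TEST_LABELS = {
    "test_dialog_opens":      {"ui", "dialog", "fast"},
    "test_dialog_title":      {"ui", "dialog", "fast"},
    "test_login_basic":       {"ui", "login", "fast"},
    "test_full_provisioning": {"ui", "e2e", "slow"},
}

def select(required, excluded=frozenset()):
    """Return test names carrying all `required` labels and none of `excluded`."""
    return sorted(
        name for name, labels in TEST_LABELS.items()
        if required <= labels and not (excluded & labels)
    )

# At commit time: fast, relevant feedback, no end-to-end runs.
commit_suite = select({"dialog", "fast"}, excluded={"e2e"})

# Nightly: everything, including the slow end-to-end pathway.
nightly_suite = select({"ui"})
```

The commit-time selection keeps the Selenium grid free for the checks that actually inform the developer; the expensive end-to-end pathway runs on its own schedule.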
So, in short, if you are a regular worshipper at the idol of the end-to-end test automation god, you won’t have good automation. This god is a fickle liar who tucks you into bed at night with his minion, the Smoke Test Teddy Bear. Smoke Test Teddy Bear whispers sweet nothings into your ear, telling you that if the end-to-end test suite passes, all is good. Don’t worry about the thousands of other possible user interactions with your application that it _didn’t_ cover. They are inconsequential seeds of evil doubt that the faithful must banish from their thoughts. Shut the closet door on those demons!