Sunday, September 13, 2015

Unit testing microskills

In response to my Why we Test posts, George Dinwiddie had this to say:
The connection between why and how is important, but the details are not obvious. I'll pick a few values that people hope (unit) tests might offer, and give my thoughts on how to practice testing to deliver each value. (This is certainly not a complete analysis of the subject.)
prevent regressions due to future work
Most people pick up on this one right away: as long as you can get a green bar before making changes, and another green bar when you're done, your tests catch bugs before they get checked in. Great!

Speed, readability, and granularity of tests aren't as important as good coverage. They don't even have to be unit tests - any tests will do. Reliability with a clear pass/fail result is important, so that bug-induced test failures actually get recognized.

If a piece of code is a completely obvious expression of a business requirement, you still need to write a test for it, since the tests call out the intentional behavior.
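
For example (a hypothetical sketch; the pricing rule, names, and NUnit usage are my own), even a one-line rule deserves a test that names the intent:

    using NUnit.Framework;

    public static class Pricing
    {
        // Business rule: members get a 10% discount off list price.
        public static decimal MemberPrice(decimal listPrice) => listPrice * 0.90m;
    }

    public class PricingTests
    {
        [Test]
        public void Members_get_a_ten_percent_discount()
        {
            Assert.AreEqual(90m, Pricing.MemberPrice(100m));
        }
    }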

"prevent regressions" does not appear to require test-first. In fact, teams that focus on this value tend to write many of their tests afterwards. Because the code isn't written for testability, it's hard to test (duh). Either we don't bother testing it, or we bend over backwards writing horrible tests that are hard to understand, and lock down implementation details, making future refactoring harder.
a safety net during refactoring
Readability and granularity of tests aren't as important as good coverage and speed. Slow tests mean you won't run them as often, which means you won't catch mistakes as quickly, which makes refactoring more expensive. That changes the cost/value/risk equation for refactoring, so you won't refactor as often.

Test speed includes any time spent analyzing results and rerunning flaky tests, so make test results obvious and rock-solid.

Many organizations are nervous about the risk of bugs from refactoring, even though they tolerate bugs from feature work. In that context, great coverage is particularly important for the refactoring safety net.

In an effort to improve coverage, teams that focus on the refactoring safety net will often test implementation details, including breaking encapsulation and injecting mocks to access those details. In the process, they lock down those details, making refactoring more difficult. That's Irony Number One.
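
Here's a hypothetical sketch of the kind of test I mean (the domain types are invented, and Moq-style mocking is assumed):

    using Moq;
    using NUnit.Framework;

    public class Order { }
    public interface IOrderRepository { void Save(Order order); }

    public class OrderProcessor
    {
        private readonly IOrderRepository _repository;
        public OrderProcessor(IOrderRepository repository) { _repository = repository; }
        public void Process(Order order) { _repository.Save(order); }
    }

    public class OrderProcessorTests
    {
        [Test]
        public void Saves_through_the_repository_exactly_once()
        {
            var repository = new Mock<IOrderRepository>();
            new OrderProcessor(repository.Object).Process(new Order());

            // This pins down *how* Process works (one call to Save), not *what*
            // it accomplishes; change the persistence strategy and the test
            // breaks even though the observable behavior hasn't.
            repository.Verify(r => r.Save(It.IsAny<Order>()), Times.Once());
        }
    }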

Getting proper coverage for both "prevent regressions" and the "refactoring safety net" can be difficult. Applying the Three Rules of TDD is an effective way to get the coverage that you actually need. As long as you avoid testing implementation details, you'll necessarily have to decouple your code to make this happen. So you'll naturally end up with a code base that is at least moderately well-factored, even before you try to use the tests as a refactoring safety net. That's Irony Number Two.
make DRY problems visible
DRY problems become visible in TDD when you find yourself writing the same test repeatedly. My favorite example is file path case insensitivity in Windows. Consider:

    if (Path.GetExtension(fileName) == ".cs")

There's a bug here: if the file is named ".CS" then I want the software to work the same as ".cs". I can fix it locally, by switching to a case-insensitive string comparison. And I diligently write a test for it. But then tomorrow I write another file extension check in another piece of code, and I write another test. I may end up with a thousand expressions of this rule, and (if diligent) a thousand corresponding unit tests.
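
That local fix and its diligent test might look something like this (a sketch; the class names and NUnit usage are mine):

    using System;
    using System.IO;
    using NUnit.Framework;

    public class Compiler
    {
        // Local fix: a case-insensitive comparison at this one call site.
        public bool ShouldCompile(string fileName) =>
            Path.GetExtension(fileName).Equals(".cs", StringComparison.OrdinalIgnoreCase);
    }

    public class CompilerTests
    {
        [Test]
        public void Compiles_files_whose_extension_differs_only_in_case()
        {
            Assert.IsTrue(new Compiler().ShouldCompile(@"C:\src\Program.CS"));
        }
    }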

The rule I'm trying to test here is "File extensions are case-insensitive". I want to have exactly one test that describes and enforces that rule. Which means that rule must be expressed in exactly one place. That's DRY.

The correct response to "I'm testing this idea multiple times" is "extract the duplicated behavior from all the places it's used, merge it into one place, and test that one place."
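
A sketch of that extraction (the class and method names are my invention):

    using System;
    using System.IO;
    using NUnit.Framework;

    public static class FileExtensions
    {
        // The rule "file extensions are case-insensitive" lives in exactly one place.
        public static bool Is(string path, string extension) =>
            Path.GetExtension(path).Equals(extension, StringComparison.OrdinalIgnoreCase);
    }

    public class FileExtensionsTests
    {
        // ...and exactly one test describes and enforces it.
        [Test]
        public void File_extensions_are_case_insensitive()
        {
            Assert.IsTrue(FileExtensions.Is(@"C:\src\Program.CS", ".cs"));
        }
    }

Every call site then delegates to that one place (for example, if (FileExtensions.Is(fileName, ".cs"))), so changing the rule means touching one method and one test.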

Note that test execution time is irrelevant here; you don't ever have to run your tests to get this value! However, responding to this design feedback leads to code that is factored in a way such that tests are naturally very fast (Irony Number Three!).

Readability is important: you have to be able to read the test to understand what requirement it's describing, to be able to detect the duplication.

Granularity is important: tests must each describe exactly one requirement, or the duplication won't be visible.

DRY reduces bugs, as it eliminates the risk of updating only 999 of the 1000 places a rule is expressed. DRY (along with Great Names / Coupling / Cohesion) is far more effective at eliminating bugs in shipped software than tests that are intended to catch bugs. (Irony Number Four)




Saturday, September 12, 2015

Why we test, Part 8: Because we are competent professionals

#15 in my list of reasons why we (unit) test, which I learned from James Shore:
Refactoring without tests is inherently unsafe, because of the risk of introducing bugs. As a professional, I would never take such risks. Therefore, I would only refactor when I know I have good tests. In this way, TDD makes refactoring possible.
I may not be representing his idea with perfect fidelity; for that I apologize.

My comments:
  1. There is a class of programming languages* for which there exist reliable refactoring tools. With these tools I can safely refactor even without tests.
  2. The reliable tools work by following a recipe. If a human follows the same recipe carefully, they'll get the same result (see the sketch after this list). That would also work in strongly typed languages that lack good tooling.
  3. Plenty of people who make their careers as programmers ("professionals") do sloppy work, but not those who are competent.
  4. The tests have to be good. If you only write tests when it's easy, they won't give you enough protection. The only way I know to get this kind of test coverage is if you strictly follow the Three Rules of TDD.
  5. When naive** TDDers aim for 100% test coverage, they go to extreme lengths in their tests, including bad mocks*** and test cases that don't correspond to any business value. These common problems lock down implementation, which makes refactoring far more difficult, the opposite of Jim's goal.
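
To illustrate item 2, here is a sketch of the Extract Method recipe followed by hand (the example and names are mine, not Jim's):

    using System.Collections.Generic;

    public class InvoiceLine { public decimal Price; public int Quantity; }
    public class Invoice { public List<InvoiceLine> Lines = new List<InvoiceLine>(); }

    public class Billing
    {
        // Before: the subtotal loop is buried in a longer method.
        public decimal TotalWithTax(Invoice invoice)
        {
            decimal subtotal = 0m;
            foreach (var line in invoice.Lines)
                subtotal += line.Price * line.Quantity;
            return subtotal + subtotal * 0.10m;
        }

        // The recipe, applied mechanically:
        //   1. Create an empty method named for the block's intent.
        //   2. Move the block into it, passing each local it reads as a parameter
        //      and returning the value it produces.
        //   3. Replace the original block with a call to the new method.
        //   4. Compile; in a strongly typed language the compiler catches any
        //      variable you forgot to pass.

        // After (renamed here only so both versions can coexist in one sketch):
        public decimal TotalWithTaxExtracted(Invoice invoice)
        {
            decimal subtotal = Subtotal(invoice);
            return subtotal + subtotal * 0.10m;
        }

        private static decimal Subtotal(Invoice invoice)
        {
            decimal subtotal = 0m;
            foreach (var line in invoice.Lines)
                subtotal += line.Price * line.Quantity;
            return subtotal;
        }
    }
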
* It's C# and Java

** most programmers

*** mocking is fantastic for Tell, Don't Ask, and problematic without TDA.