Saturday, March 29, 2014

Testing progress reporting

I was trying to get some code under test.

Originally it had been written as one long method. It was a slow process, so another programmer had added progress reporting. He wisely created a new, decoupled progress reporting service, with an interface like:

[IProgress.cs, from Gist 9867358]
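The gist isn't embedded here; as a rough sketch (not necessarily the exact code), such an interface might look like:

    // Rough sketch of a decoupled progress-reporting interface.
    // Names here are illustrative, not necessarily those in the gist.
    public interface IProgress
    {
        // Declare how many units of work the whole operation will take.
        void SetTotalWork(int totalWork);

        // Report that some units of work have completed.
        void ReportWork(int workDone = 1);
    }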

My first step was Compose Method: applying Extract Method repeatedly until the code read roughly like English.

[Example1.cs, from Gist 9867358]
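Again, the gist isn't embedded; a sketch of the shape it might take, with made-up step names:

    // Sketch of the composed method: the work lives in extracted Do*()
    // methods, with progress reported between steps. Step names are made up.
    public void ProcessEverything(IProgress progress)
    {
        progress.SetTotalWork(4);

        DoFirstThing();
        progress.ReportWork();

        DoSecondThing();
        progress.ReportWork();

        DoThirdThing();
        progress.ReportWork();

        DoFourthThing();
        progress.ReportWork();
    }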

With the exception of the progress reporting, there's no point in writing a test for a Composed Method. The test would just repeat the code, violating DRY. Test by inspection.

But what about the progress reporting? How do I test it? Some options:

  • Don't test it.
  • Test manually.
  • Use an integration/end-to-end/system test.
  • Mock out the Do*() methods in a unit test.
  • Change the design.

Each option has pros and cons; each has a context in which it is the right choice.

TDD teaches us that difficult-to-test is a code smell; it's design feedback. All other things being equal, I'd rather choose the last option and improve the design.

There are several ways to think about the problem that could lead us to a new design:


  • The number "4" violates DRY with the (often implicit) parameters to ReportWork. How can I test that it stays correct when editing this code in the future?
  • Most difficult-to-test problems are solved by decoupling. What could I decouple here?
  • This method is responsible for both doing the work, and reporting on the work. Can I write those separately?
  • Think about the difficult-to-test aspect of the code; what if I made a new class that did only that, free of any context?

Suppose there were a class responsible for calculating and reporting the work to be done. You could pass it a list of steps, each with a cost, and let it calculate the sum. The result might look like:

[Example2.cs, from Gist 9867358]
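Since the gist isn't embedded, here's a sketch of the idea; the class name matches the one discussed below, and everything else is illustrative:

    using System;
    using System.Collections.Generic;
    using System.Linq;

    // Sketch: a list of (action, cost) steps...
    public class ActionsWithCostsList : List<(Action Action, int Cost)>
    {
        public void Add(Action action, int cost = 1) => Add((action, cost));
    }

    public static class ProgressExtensions
    {
        // ...and an extension method that sums the costs, reports the total,
        // then runs each step and reports its cost as it completes.
        public static void ExecuteWithProgress(
            this ActionsWithCostsList actions, IProgress progress)
        {
            progress.SetTotalWork(actions.Sum(step => step.Cost));

            foreach (var (action, cost) in actions)
            {
                action();
                progress.ReportWork(cost);
            }
        }
    }

Whether the list derives from List<> or wraps one is a detail; the point is that summing the costs and reporting progress now live in one small place, free of any context.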

Since lambdas are capable of capturing locals, it's easy to use the output of one step as the input of another:

[Example3.cs, from Gist 9867358]
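Again as a sketch, with made-up step names and types:

    // Sketch of usage: the lambdas capture 'widget' and 'report', so one
    // step's output feeds the next step's input. Widget, Report, and the
    // Load/Analyze/Save methods are hypothetical.
    public void ProcessEverything(IProgress progress)
    {
        Widget widget = null;
        Report report = null;

        var steps = new ActionsWithCostsList
        {
            { () => widget = LoadWidget(), 1 },
            { () => report = AnalyzeWidget(widget), 2 },
            { () => SaveReport(report), 1 },
        };

        steps.ExecuteWithProgress(progress);
    }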

Consequences

The code is far easier to test:

  • DoWork(), etc., are now more testable as separate methods, since they were extracted.
  • The ActionsWithCostsList class is trivial to unit test. (You might not bother... up to you.)
  • The ExecuteWithProgress extension is non-trivial, but still easy to unit test; a sketch follows this list.
  • The original code is now very simple, and should be tested by inspection.
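For example, a unit test for ExecuteWithProgress could use a hand-rolled fake (a sketch only; the test framework and names here are my choices, not anything from the original gist):

    using NUnit.Framework;

    // A hand-rolled fake that records what was reported to it.
    public class FakeProgress : IProgress
    {
        public int TotalWork;
        public int WorkReported;

        public void SetTotalWork(int totalWork) => TotalWork = totalWork;
        public void ReportWork(int workDone = 1) => WorkReported += workDone;
    }

    [TestFixture]
    public class ExecuteWithProgressTests
    {
        [Test]
        public void ReportsTotalWorkAndAllStepCosts()
        {
            var progress = new FakeProgress();
            var steps = new ActionsWithCostsList
            {
                { () => { }, 2 },
                { () => { }, 3 },
            };

            steps.ExecuteWithProgress(progress);

            Assert.AreEqual(5, progress.TotalWork);
            Assert.AreEqual(5, progress.WorkReported);
        }
    }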

I would like a lightweight end-to-end test to hit this code, just to make sure I haven't missed anything. Maybe that's just because I'm still new to TDD and haven't developed confidence in my TDD skills yet.

If you're not used to coding this way, it may feel surprising. Once you understand it, it may seem clever. Those are both feelings I want to avoid in my code. (Well, I do like looking clever, but it's still a code smell.) I believe that any competent C# programmer can spend a few minutes reading my tests and understand what's going on here.

The 4 Rules of Simple Design place "passes tests" above "expresses intent", so I think this design is worth keeping until we find a better solution.

Next steps

You wouldn't be reporting progress if the operation were fast. Since it's slow, you may want to free up the current thread while waiting; that means async/await. Extend this solution to make that easy.
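One possible shape (a sketch of the idea, not code from the original gists): the steps become Func<Task> instead of Action, and each one is awaited.

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Threading.Tasks;

    // Sketch of an async variant: same shape as before, but each step is
    // awaited, so the calling thread is freed while the slow work runs.
    public class AsyncActionsWithCostsList : List<(Func<Task> Action, int Cost)>
    {
        public void Add(Func<Task> action, int cost = 1) => Add((action, cost));
    }

    public static class AsyncProgressExtensions
    {
        public static async Task ExecuteWithProgressAsync(
            this AsyncActionsWithCostsList actions, IProgress progress)
        {
            progress.SetTotalWork(actions.Sum(step => step.Cost));

            foreach (var (action, cost) in actions)
            {
                await action();
                progress.ReportWork(cost);
            }
        }
    }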

If you call a method that could report fine-grained progress, it would be nice to pass it a child progress. Something like:

[IProgress2.cs, from Gist 9867358]
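The gist isn't embedded; one guess at the shape:

    // Sketch of a child-progress idea: work reported to the child counts
    // toward a fixed share of the parent's total. Names are illustrative.
    public interface IProgress
    {
        void SetTotalWork(int totalWork);
        void ReportWork(int workDone = 1);

        // The returned child gets its own total; completing all of the
        // child's work reports 'parentCost' units on this (parent) progress.
        IProgress CreateChild(int parentCost);
    }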



Sunday, March 23, 2014

Sorting out complex refactoring with the Mikado Method

My understanding of the Mikado method:

Problem: You see a mess in the code, so you decide to fix it. Partway through, you realize you have to fix some other thing first. So you start working on that, which leads to more.

Now your code is all torn apart and you're late for dinner and nothing works.

Solution: When you realize you can't finish problem A until you fix problem B, write down A and abandon your changes so far. When B leads to problem C, write down B and abandon those changes, too.

You're discovering a dependency tree. Eventually you'll get to the leaf nodes. When you fix a leaf node you don't have to fix something else at the same time. So fix it and commit. Now you've accomplished something.
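The notes might end up looking something like this (a made-up example):

    Goal: pull OrderValidator out of OrderService
        needs: OrderService must stop calling the static Database class
            needs: introduce an IDatabase interface   <- a leaf: fix it, commit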

Each commit prunes the tree. Eventually you get back to your original problem, which you can now safely fix without further difficulty.

I think this is applicable to other problems besides software, like organizing your kitchen or car maintenance.

Sunday, March 9, 2014

Conway's Game of Life in Gherkin

I am trying to learn Gherkin, and needed a problem to practice on, so I chose Game of Life. I think my current result is interesting enough to post.

[Gist 9454787]
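The gist isn't embedded here; to give the flavor, here's a scenario in the style I was aiming for (illustrative only, not necessarily one from the gist):

    Feature: Conway's Game of Life

      Scenario: A vertical blinker becomes horizontal
        Given the grid
          | . | # | . |
          | . | # | . |
          | . | # | . |
        When the next generation is computed
        Then the grid is
          | . | . | . |
          | # | # | # |
          | . | . | . |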

I'm struggling at the moment trying to figure out the right level of detail and correct organization in the tests. I'm curious what I'll think of my attempt a week from now / a year from now.


Saturday, March 8, 2014

Stages of TDD

I've often heard people say this sort of thing about TDD:

  "I really believe in the importance of TDD. It helps me catch stupid bugs right away, instead of waiting for testers or customers to report them. I especially like that it catches regressions.

  TDD lets me refactor safely. Also, writing tests first helps me think about my design from the caller's perspective.

  I still need a comprehensive suite of end-to-end tests to make sure the whole system works. I am concerned that, since I'm writing both the code and the tests, my blind spots will appear in both, allowing bugs to slip through.

  When requirements change, or we refactor a subsystem, a thousand tests break. Then we have to spend a lot of time fixing them.

  TDD is expensive, but it's worth it."

While others say something like:

  "TDD is not a testing activity; it's a design activity. 'Test' is a misnomer. With TDD I write better code, faster.

  I don't need a bug database. My whole team only has a couple of bugs per month. We do root cause analysis on every bug."

From the first perspective, the second sounds weird. It's hard to believe it's even possible. Maybe it only works in that specific context.

From the second perspective, the first sounds backwards and unenlightened. Why don't they get it?

I now believe that learning TDD takes time and moves through stages. From the early stages, it's hard to fully understand the later stages. From the later stages, you wonder why you ever wasted your time in the early stages.

The stages that I see are:
  1. TDD is about testing. I write tests to prove that my code works. Thanks to the safety afforded by those tests, I refactor more.
  2. I commit to 100% TDD. Every feature and every bug fix will be represented in tests. I immediately discover how hard this is, which leads me to mocks, dependency injection, marking methods as "public" for testing, etc.
  3. Tests give me feedback. I see "hard to test" as a code smell. I refactor to make things easier to test. Tests get easier and code gets cleaner.
  4. My tests are always easy to write, easy to read, and super fast. Code is clean. Every idea has a single canonical location. There is no duplication. Classes have great names. Cohesion is high and coupling is low.

    There are no bug farms, no part of the code I'm afraid to touch. Working in this code is inherently low risk. Since I rarely create bugs, testing for correctness is rarely fruitful. So yes, tests got me here, but not by catching bugs.

I found #4 very hard to grasp a year ago. I'm not sure if seeing this roadmap would have helped. Perhaps the right way to begin is just to focus on #1: write tests to catch bugs. Make developers responsible for proving correctness of their work. Don't count a feature as "done" until the bugs are gone. Build from there.

There may be an additional stage in the middle. If you know it, tell me and I'll fill it in.

I'm really curious if there's a stage 5. Something I'm blind to. Got any?

Various definitions of "Refactoring"

In conversations with other programmers, I have heard people use "refactoring" to mean four different things.

0. (Not refactoring). Making the minimum necessary change.

This is often expedient and may feel safest, where safety = not breaking existing functionality, and not getting yelled at.

1. Doing more than the minimum necessary change.

You could hack in your bug fix or new feature. Maybe that's hard because the code is already convoluted. Or maybe your hack would make the code convoluted. So you clean things up a bit at the same time.

2. Cleaning up code without changing existing behavior.

At least, you hope you're not changing behavior.

3. A highly disciplined process of known-safe code transformations.

Within a method body I feel confident that I can rename a local variable with Search/Replace if I first check that the new name isn't already in use.

Feathers' book (aka WELC) includes some highly detailed methods for doing this to legacy code, to get it to the point where you can start writing unit tests. By detailed, I mean tedious.

4. Using a mechanized refactoring tool.

For example: select a block of code, right-click, Extract Method.

Only #3 and #4 are safe enough that I do them without fear.

My preference is to do #4, over and over again. (Also, I commit each one separately.)



Some will say that #1 and #2 aren't "true refactoring" (Martin Fowler among them, it should be noted). I find arguing about definitions to be counterproductive and off-putting. Each of these activities has value in some context, and each is worth discussing.