Friday, March 10, 2017

Three kinds of code

I propose a refactoring "Extract Open Source Project".

We build software systems to some purpose. But when I read code, I see that some of that code directly serves that purpose while other code does not. I see three categories:


This is the stuff you and your customers care about. It's the reason your software system exists.

In an e-commerce system, that's code that says "when a customer uses a discount code, the discount is applied to the order."

If you learn about code smells, great names, and duplication, and then refactor with those in mind, you'll find that some code is explicitly the feature and some is not. That leads to:


Code that helps you write code, but has nothing to do with the problem domain you're working in.

It's often right in the middle of the rest of your code, but as you refactor to improve readability and reduce duplication, it can become visible. For example, consider this refactoring sketch:
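Here is one hypothetical version of such a sketch, in C++. The names `DescribeOrderBefore`, `DescribeOrderAfter`, and `Join` are invented for illustration. The thing to notice is that `Join` knows nothing about orders.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Before: the feature code builds a comma-separated list inline.
std::string DescribeOrderBefore(const std::vector<std::string>& items) {
    std::string text = "Order: ";
    for (std::size_t i = 0; i < items.size(); ++i) {
        if (i > 0) text += ", ";
        text += items[i];
    }
    return text;
}

// After: the domain-free part is extracted. Join has nothing to do with orders.
std::string Join(const std::vector<std::string>& parts, const std::string& separator) {
    std::string joined;
    for (std::size_t i = 0; i < parts.size(); ++i) {
        if (i > 0) joined += separator;
        joined += parts[i];
    }
    return joined;
}

// The feature code that remains says only what the feature does.
std::string DescribeOrderAfter(const std::vector<std::string>& items) {
    return "Order: " + Join(items, ", ");
}
```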

If you refactor mercilessly, you'll end up with a lot of this stuff. It's not part of the value you offer, and it would be useful to the programming community. Factor it out to be an open source project and share with the world.

Some examples of this are Boost in C++ and Rails in Ruby.

Domain Libraries

You'll also have some code that is specific to your domain, but is not the feature you're creating. This is code that lets you describe your feature. A library for building features in this domain. Maybe a DSL.

In that e-commerce system, it might be a class like "Money".
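A minimal sketch of what that could look like in C++. Everything here (`Money`, `LessPercent`, `ApplyDiscountCode`) is a hypothetical name, and storing whole cents is just one possible representation.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>
#include <utility>

// Domain library: knows about money, not about any particular feature.
class Money {
public:
    Money(std::int64_t cents, std::string currency)
        : cents_(cents), currency_(std::move(currency)) {}

    // Returns this amount reduced by a whole-number percentage,
    // truncating any fraction of a cent.
    Money LessPercent(int percent) const {
        if (percent < 0 || percent > 100) {
            throw std::invalid_argument("percent out of range");
        }
        return Money(cents_ * (100 - percent) / 100, currency_);
    }

    friend bool operator==(const Money& a, const Money& b) {
        return a.cents_ == b.cents_ && a.currency_ == b.currency_;
    }

private:
    std::int64_t cents_;
    std::string currency_;
};

// Feature code: reads like the discount rule, written in terms of the domain library.
Money ApplyDiscountCode(const Money& orderTotal, int discountPercent) {
    return orderTotal.LessPercent(discountPercent);
}
```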

While this is not the value you are offering, it is key to offering that value. You get to decide whether to release it as open source (so other people can build more systems in the same domain), or keep it under wraps (so your competition has to build their own).

Monday, February 20, 2017

Test-as-spec and assertion syntax

I like to say that tests should, first and foremost, be a human-readable spec. Let's look at what that can mean for how we write assertions.

Suppose we're writing a card game, and we want to assert that a deck of cards is sorted the way you'd find them when you first open the box. (I'm using this simple example as a proxy for the kinds of more complex problems that we see in legacy code. It's up to you to map these ideas to that context.)

An approach I see in a lot of code is to iterate over the cards to assert. Perhaps something like:
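A sketch of that style, assuming a hypothetical `Card` struct and a "new box" order of suit first, then rank:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical card type: suit 0..3, rank 1..13.
struct Card {
    int suit;
    int rank;
};

// New-box order in this sketch: by suit, then by rank within each suit.
bool ComesBefore(const Card& a, const Card& b) {
    if (a.suit != b.suit) return a.suit < b.suit;
    return a.rank < b.rank;
}

// The test body: walk adjacent pairs and assert on each one.
void TestCardsAreInNewBoxOrder(const std::vector<Card>& cards) {
    for (std::size_t i = 1; i < cards.size(); ++i) {
        // A bare bool assertion: on failure it cannot say which card was wrong.
        assert(!ComesBefore(cards[i], cards[i - 1]));
    }
}
```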

This kind of code makes it obvious that an AssertEquals would be valuable, so that on failure you can see the expected and actual values in the test results.

If this test fails, you only know about one incorrect card. If there are more, you won't know until you fix the current error and rerun the test.

A richer assertion library might offer AssertSorted. It could even take a set of 1 or more sort key selectors. The result might look like:
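I don't know of a library that offers exactly this, so here is a hypothetical `AssertSorted` written from scratch, along with the test that uses it. `Card`, `CardKey`, and the failure message format are all assumptions.

```cpp
#include <cstddef>
#include <functional>
#include <sstream>
#include <stdexcept>
#include <vector>

struct Card {
    int suit;
    int rank;
};

using CardKey = std::function<int(const Card&)>;

// Hypothetical assertion: compares neighbours key by key and
// throws one error that lists every out-of-order position.
void AssertSorted(const std::vector<Card>& cards, const std::vector<CardKey>& keys) {
    std::ostringstream failures;
    for (std::size_t i = 1; i < cards.size(); ++i) {
        for (const CardKey& key : keys) {
            const int previous = key(cards[i - 1]);
            const int current = key(cards[i]);
            if (previous < current) break;  // in order by this key
            if (previous > current) {       // out of order by this key
                failures << "card at position " << i << " is out of order\n";
                break;
            }
            // Equal on this key: let the next key decide.
        }
    }
    if (!failures.str().empty()) throw std::runtime_error(failures.str());
}

// The test itself.
void TestCardsAreSorted(const std::vector<Card>& cards) {
    AssertSorted(cards, {
        [](const Card& c) { return c.suit; },
        [](const Card& c) { return c.rank; },
    });
}
```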

(That's C++ lambda syntax, if you haven't seen it before).

Both of these approaches are "computer science" solutions - they work in the solution domain, and use the language of computer code. If I want my test to be a human readable spec, I need to use the language of the problem domain. I could take a step in that direction by extracting a method, giving:

But we're also doing TDD. In TDD, we want the tests to give us feedback about the design of the code. And this test is saying "the notion of being sorted is missing from the code under test". Taking an intuitive leap, the class that should hold that notion is a "deck of cards", which is also missing from the code under test. That leads to:
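One possible shape for that design, as a sketch. `Deck` and `IsInNewBoxOrder` are hypothetical names, and "new box order" is assumed to mean suit first, then rank.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

struct Card {
    int suit;
    int rank;
};

// The missing domain class: it owns the notion of being sorted.
class Deck {
public:
    explicit Deck(std::vector<Card> cards) : cards_(std::move(cards)) {}

    // True when the cards are ordered by suit, then by rank within each suit.
    bool IsInNewBoxOrder() const {
        return std::is_sorted(cards_.begin(), cards_.end(),
            [](const Card& a, const Card& b) {
                return a.suit != b.suit ? a.suit < b.suit : a.rank < b.rank;
            });
    }

private:
    std::vector<Card> cards_;
};

// The test now reads in the language of the problem domain.
void TestFreshDeckIsInNewBoxOrder(const Deck& deck) {
    assert(deck.IsInNewBoxOrder());
}
```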

I like the improvements to the design of the code and the way the test reads, but I am sad to lose the ability to provide a detailed report when this assertion fails. I'm not sure how I would fix that, or if it would ever actually matter.

It's interesting to me that we're back to the bool-only assertion from the first example.

Saturday, February 18, 2017


I strive to make all my tests be both microtests and acceptance tests, an idea I learned from Arlo Belshee.

When I say this to people, they are usually confused first, then doubtful when I explain. I don't think I'm ready to address the doubt, but maybe I can address the confusion today.


Microtest

Coined by GeePaw Hill (see his article for his original definition), a microtest is like a unit test, but it has all the qualities I wish all unit tests had. It's fast and focused. It answers the question "does this little piece of code do what I intend it to do?"

Because microtests talk directly to the system under test, they are written in terms of the SUT.

It's obvious that a microtest can only be used on parts of the code that are simple and decoupled and isolated. An integration test is never a microtest.

Acceptance Test

An acceptance test describes expected software behaviors from the point of view of a user or other stakeholder. It answers the question "does this system meet the requirements I expect it to meet?"

Because acceptance tests are written in conversation with that user, they are written in the language of that user. They are organized like a spec.

My ideal tests

I want tests that hold to all of the above. My ideal tests are super fast, 100% reliable, simple, isolated, written in the language of the user, and easy to read. (This requires the SUT to be decoupled, cohesive, well-named, and DRY - properties I already want.)

Every test is both an acceptance test and a microtest.

The doubt

The usual objection I hear is "while isolated unit tests tell you about each of the little pieces, you still need some kind of integration test to confirm that all the parts work when you put them together." 

Well, that's true for most programs, but it's only necessary because of how your code is organized. "parts work when you put them together" means "the desired behaviors of the program (the acceptance criteria) are emergent properties of the system". But we know how to refactor. If two parts of the system need to work together, we can put them together in the code, and then use a microtest to assert that desired behavior.

Friday, February 17, 2017

AONW2017: Amazing Distributed Teams

My new job involves teams that are distributed over a bunch of locations, mostly on the West Coast of the USA. Each team has people at multiple locations.

I went to Agile Open NorthWest 2017 with the question "How can we make distributed teams awesome?" Here's what we came up with:

There are a bunch of known good practices to help distributed teams not suck too much. Doing them won't get us to "awesome", but at least we can get up to "not sucky". So let's start by writing down these practices:
  • Communicate a lot
  • Don't let remote people be 2nd-class. Make everyone equally remote, even if some are in the same building.
    • Chat room for all communication, even within an office
  • Experienced people who don't need to learn as much are at less of a disadvantage when remote*
  • Because remote pairing is more tiring, be deliberate about taking breaks. 
    • use Pomodoro
  • Do your homework before coming to meetings, so you don't need to rely as much on awkward VTC communication
  • Show up to meetings on time
  • Don't let people get blocked on questions. If someone raises a question in chat, don't leave them hanging.
  • Have the whole team mob for 1 hour to start the day, to get alignment on the day's work
  • Synchronize time of work
  • Meet face-to-face regularly. Have a budget to bring people together.
  • Many companies save money by having people work from home. Direct some of that savings to equipment, travel, etc. 
  • Consider paying out-of-pocket to make remote more awesome, and then ask for reimbursement if it helps.
  • Remember that patience online is short, and accommodate that fact.
  • Create team agreements - they're at least as important as for teams that sit together
  • Recognize the expression of Conway's Law: limited communications affect software architecture
  • Telepresence robots can help. Make sure they are human size (short robots get treated like children)
  • Retro often, with a relentless focus on the things that make remoting difficult
  • Build the team
    • Play games together remotely (poker, Halo)
    • Friday beers in VTC
Yet-unsolved problems that generally make remote work suck:
  • There is no good remote whiteboard
  • Estimating remotely is particularly bad (plug for #NoEstimates)
  • When tools/tech stop working right when we need them, we have a bad time
  • It's hard to influence the org beyond the team / hard to influence culture
And then we looked at advantages to remote work / distributed teams - how they can be better than teams that sit together:
  • 50 people can write on a Google Doc at once, while only a couple people can write on a whiteboard at once.
  • Better ergonomics are possible. No crowding around a single screen.
  • Remote breaks are real breaks. When you step away, no one can reach you. Go outside!
  • You have access to a broader pool of talent.
  • It increases diversity, even compared to the same people sitting together
  • Enables a 24-hour development cycle
  • Can accommodate people in new ways. For example, a person with a partial hearing loss can turn up their headphone volume, instead of asking everyone to remember to speak up. 
  • Can accommodate varying communication styles
We measure how awesome a team is with two questions:
  1. Are we delivering (steadily increasing) value?
  2. Are people happy?
*I don't think this is true, but it came up in the session, so I put it here in the list.

Thursday, February 9, 2017

Safely extract a method in any C++ code

Let me set the scene:

You're reading some code. Some old, gnarly C++ code. The method you're looking at is 5000 lines long. It's clearly really important. You need to understand it, but it's such a mess.

You notice a 1000 line block of code that seems to stand apart from the rest in some way. Your intuition as a programmer says "this chunk of code is different from the rest of the method, and it might make sense as a new method."

If you could Extract Method, that would help. It would let you shrink the problem space down as you endeavor to understand the code. The parameter list would make it obvious which variables are relevant to this block, and which are not - the data flow would become more clear.

Maybe you use CLion, or ReSharper, or Visual C++, or Visual Assist, and you try its built-in Extract Method feature. But these tools are far from perfect. They often introduce a behavior change, like copying a parameter instead of passing by reference. If you're lucky you get a compile error and back up; if you're unlucky, the behavior change is subtle and you've just introduced a bug.
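To make that failure mode concrete, here is a small invented example (not the output of any particular tool). The braces mark the block to extract. One extraction copies `total`, the other passes it by reference.

```cpp
#include <vector>

// Original: the inner braces mark the block we want to extract.
int SumInline(const std::vector<int>& values) {
    int total = 0;
    for (int value : values) {
        { total += value; }
    }
    return total;
}

// A faulty extraction: `total` is copied, so the caller's variable never changes.
void AddByValue(int total, int value) { total += value; }

int SumWithFaultyExtraction(const std::vector<int>& values) {
    int total = 0;
    for (int value : values) {
        AddByValue(total, value);
    }
    return total;  // compiles cleanly, but is always 0
}

// A correct extraction: `total` is passed by reference.
void AddByReference(int& total, int value) { total += value; }

int SumWithCorrectExtraction(const std::vector<int>& values) {
    int total = 0;
    for (int value : values) {
        AddByReference(total, value);
    }
    return total;
}
```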

With that in mind, I offer this recipe for C++ Extract Method. Follow the recipe exactly, and you can be sure the code will work the same after as before, or the recipe will kick you out, saying you can't extract this method. You won't have to do a bunch of careful analysis - we'll lean on the compiler for that. (If it ever does introduce a behavior change for you, that's a bug in the recipe. Let me know and we'll fix it.)

Thanks to Llewellyn Falco and Arlo Belshee who said the right things at the right time to make this recipe pop in to existence, and my coworkers at Tableau Software for filling in a bunch of details. They deserve 99% of the credit.

The recipe.

If you get to the end of step 1, the refactoring is possible - it will produce a valid result.

This recipe only works on whole blocks (surrounded by braces) or a single for/while/if statement. Consider using Introduce Block to get braces around the code you want to extract.

You need C++11 or later. You only need it for the duration of this refactoring, for this one file. So if you can't upgrade your whole project right now, that's fine. Get a C++11 compiler running on this file, refactor, and revert the tools.

Each step is a safe micro-refactoring. You can check in at the end of each step.

The underlying principle is Tennent's Correspondence Principle.

1. Introduce a lambda

Surround the block in question with:

 [&]() {
  // original code
 }();

If you haven't seen this syntax before, check out the C++ lambda docs.

Compile the file. Possible errors:
  • not all control paths return a value. You have an early return. Back up and either Eliminate Early Return/Continue/Break or extract something different.

  • a break/continue statement may only be used within.  You have a break/continue. Back up and either Eliminate Early Return/Continue/Break or extract something different.
Check the new lambda for any return statements. If there are any returns and it's obvious that all code paths return, then add a return statement. If there are any returns and it's not obvious that all code paths return, then back up and either Eliminate Early Return/Continue/Break or try extracting something different.
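Two small illustrations of the end of step 1, using made-up functions: a block with no returns, and a block where every path returns.

```cpp
#include <string>

// A block with no return statements: wrap it and call it immediately.
void AppendGreeting(std::string& message, const std::string& name) {
    [&]() {
        message += "Hello, ";
        message += name;
    }();
}

// A block where every path obviously returns: add `return` in front of the lambda.
int ClampToZero(int value) {
    return [&]() {
        if (value < 0) {
            return 0;
        } else {
            return value;
        }
    }();
}
```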

2. Introduce Variable on the lambda

 [&]() {
  // ...
 }();

becomes:

 auto Applesauce = [&]() {
  // ...
 };
 Applesauce();

Compile to make sure you didn't typo.

3. Set the return type

Set the return type on the lambda (even if it's `void`). In Visual Studio, the tooltip over `auto` will tell you the type.

 auto Applesauce = [&]() -> SOMETYPE {
  // ...
 };

Compile to make sure you got the return type correct.

4. Capture explicitly

Replace `[&]` with `[this]` (or `[]` in a free function) and compile.

For each error about a variable that must be captured:
  • Copy the variable name
  • Paste it in to the capture list, prefixed with `&`
  • Repeat until green.
 auto Applesauce = [this, &foo]() -> void {
    // something with foo...
 };

The order of the capture list will influence the order of the parameters of the final function. If you want the parameters in a particular order, now is a good time to reorder the capture list.

5. Convert captures to parameters

For each captured local variable (except `this`)
  • Go to the definition of the variable
  • Copy the variable declaration (e.g. `Column* pCol`)
  • Paste in to the lambda parameter list
  • Make the parameter const and by-reference
  • Remove the variable from the capture list
  • Pass the variable in to the call
  • Compile.
 Foo foo = ...
 Bar* bar = ...
 auto Applesauce = [this, &foo, &bar]() -> void {
    // something with foo and bar...
 };
 Applesauce();

becomes:

 Foo foo = ...
 Bar* bar = ...
 auto Applesauce = [this](Foo& foo, Bar*& bar) -> void {
    // something with foo and bar...
 };
 Applesauce(foo, bar);
Note: even pointers must be passed by reference.

6. Try to eliminate `this` capture

  • Remove `this` from the capture list
  • Compile
  • If the compile fails, undo

7. Convert lambda to function

If `this` is captured, use 7A.
If `this` is not captured, use 7B.

7A. Convert this-bound lambda to member function

  • Cut the lambda statement and paste it outside the current function
  • Remove `= [this]`
  • Copy the signature line
  • Add `SomeClass::`
  • In the header file, add the function declaration in a private section.
  • Compile 
 auto SomeClass::Applesauce(Foo& foo, Bar*& bar) -> void {
  // ...
 }

Note: this is using new C++11 syntax for functions. You may want to convert to the old syntax, like this:

 void SomeClass::Applesauce(Foo& foo, Bar*& bar) {
  // ...
 }

7B. Convert non-this Lambda to free function

  • Cut the lambda statement and paste it above the current function.
  • Remove `= []`
  • Wrap it in an unnamed namespace
  • Compile
If the free function uses typedefs/aliases or classes nested in the original class, convert the free function to a private static function of the original class (7A).


 namespace {
  auto Applesauce(Foo& foo, Bar*& bar) -> void {
   // ...
  }
 }

Note: this is using new C++11 syntax for functions. You may want to convert to the old syntax, like this:
 namespace {
  void Applesauce(Foo& foo, Bar*& bar) {
   // ...
  }
 }
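To see the whole recipe end to end, here is a small worked example on invented code. It shows the function at three points: before the recipe, after step 4, and after step 7B. The `CountLongNames...` functions are hypothetical. `Applesauce` is the same placeholder name used in the steps above.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Before: the inner braces mark the block to extract.
int CountLongNamesOriginal(const std::vector<std::string>& names, std::size_t minLength) {
    int count = 0;
    for (const std::string& name : names) {
        {
            if (name.size() >= minLength) ++count;
        }
    }
    return count;
}

// After steps 1-4: a named lambda with a return type and an explicit capture list.
int CountLongNamesWithLambda(const std::vector<std::string>& names, std::size_t minLength) {
    int count = 0;
    for (const std::string& name : names) {
        auto Applesauce = [&count, &name, &minLength]() -> void {
            if (name.size() >= minLength) ++count;
        };
        Applesauce();
    }
    return count;
}

// After steps 5-7B: captures became by-reference parameters,
// and the lambda became a free function in an unnamed namespace.
namespace {
void Applesauce(int& count, const std::string& name, const std::size_t& minLength) {
    if (name.size() >= minLength) ++count;
}
}

int CountLongNamesExtracted(const std::vector<std::string>& names, std::size_t minLength) {
    int count = 0;
    for (const std::string& name : names) {
        Applesauce(count, name, minLength);
    }
    return count;
}
```

All three versions should return the same count for any input. That equivalence is what the recipe is meant to preserve.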

Tuesday, October 18, 2016

Pinning Tests

I wrote this on the C2 Wiki, with the hopes that other people would help improve it. But now that site is down, so I'm posting it here:

Definition: A simple-minded automated test that locks down the behavior of existing code that otherwise is not well-tested, as a safety net while refactoring.

Example: Run some code and collect logs as a baseline. Each time you make a change, run the program again and compare the logs against the baseline. As long as there is no difference, you have some confidence that things are still working.

Pinning tests can make it safer to refactor. (Pinning tests can never make refactoring completely safe, because you'll forget important cases in your pinning tests. For safety, use #3 or #4 from Various Definitions of "Refactoring". Pinning tests are a safety net, just in case.)

The most important features of pinning tests are:

  1. Give an obvious, definitive pass or fail result.
  2. Good coverage. Professional testers get really good at this; ask them to help.
  3. Faster is better, so you can run them often.

Non-requirements for pinning tests:

Robustness. Professional testers get really good at making robust tests that work on different computers, or at different screen resolutions, or across UI changes. Ask them to refrain - these tests are short lived, and the behavior of the system won't be changing (by definition of "Refactoring").

You don't need to run your pinning tests in every environment that you ship. For a GUI, it's fine to record mouse clicks and keystrokes.

Long-lived. The goal is to hold behavior constant for just long enough to refactor.

Clean code. Hacking the test together is OK. For example,

  • Use the C preprocessor to redirect troublesome API calls to write to a log instead.
  • Edit your HOSTS file to hijack accessing a network resource.
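A rough sketch of the preprocessor trick combined with a baseline comparison. `SendEmail`, `NotifyCustomer`, and the log format are all invented for this example. In real legacy code the macro would shadow whatever call is giving you trouble.

```cpp
#include <sstream>
#include <string>

// Log that stands in for the troublesome API while the pinning test runs.
std::ostringstream g_pinningLog;

// Hypothetical troublesome call. For the pinning build only, redirect it to the log.
#define SendEmail(to, body) \
    (g_pinningLog << "SendEmail(" << (to) << ", " << (body) << ")\n")

// Legacy code under test, unchanged apart from being compiled with the macro.
void NotifyCustomer(const std::string& customer, int orderNumber) {
    SendEmail(customer, "Your order " + std::to_string(orderNumber) + " has shipped");
}

// The pinning test: run the code, then compare the log against the recorded baseline.
bool MatchesBaseline(const std::string& baseline) {
    g_pinningLog.str("");
    NotifyCustomer("pat@example.com", 42);
    return g_pinningLog.str() == baseline;
}
```

It's ugly, and that's fine: the test only has to live until the refactoring is done.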

Tuesday, September 6, 2016

Proposed Refactoring: Introduce Parameter in Lambda

Given a lambda with a captured local variable,
  1. Add a new parameter to the lambda
  2. Inside the lambda, replace uses of the local with uses of the new parameter
  3. Where the lambda is called, pass in the local.

I believe this is a refactoring: I believe that this transformation has no effect on the behavior of the code. But I'm not completely certain.

This operation is not allowed if the value of the local is changed inside the lambda.
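Here is how I picture the transformation, sketched in C++ with invented names. The new parameter is passed by value, which is why the local must not be modified inside the lambda: the change would no longer reach the caller.

```cpp
// Before: the lambda captures the local `taxPercent`.
int TotalBefore(int price, int quantity) {
    int taxPercent = 8;
    auto withTax = [&taxPercent](int amount) {
        return amount + amount * taxPercent / 100;
    };
    return withTax(price * quantity);
}

// After: the captured local has become the parameter `rate`.
int TotalAfter(int price, int quantity) {
    int taxPercent = 8;
    auto withTax = [](int amount, int rate) {
        return amount + amount * rate / 100;
    };
    return withTax(price * quantity, taxPercent);
}
```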

This is almost the same operation as Introduce Parameter.

Sunday, September 4, 2016

Proposed refactoring: extract and execute lambda

Given a statement block, wrap it in a lambda assigned to an Action variable, and execute it immediately.

I believe this is a refactoring: I believe that this transformation has no effect on the behavior of the code. But I'm not completely certain.

I think a similar recipe for expressions is equally valid, using a Func<> instead of Action.

This is almost the same operation as Extract Method.

Tuesday, July 5, 2016

"pure unit test" vs. "FIRSTness"

Sometimes we categorize tests into groups like "pure unit test", "focused integration test", "end-to-end-test", etc. That's a fine approach, and useful for a lot of cases.

For example, I find that pure unit tests are extremely valuable in giving me feedback about my code design, especially coupling and duplication. I don't even have to run the tests to get that value! Other types of tests have their value, but they don't give me that feedback.

Another categorization I sometimes find useful is based on the FIRST Properties of Unit Tests. You should read that link for the full story, but I'll summarize here:

Fast (tests run quickly, so you can run them often)
Isolated (tests have a single reason to fail. One aspect of behavior = one test)
Repeatable (same result every time)
Self-verifying (tests report an unambiguous pass/fail)
Timely (each test is created just before it is needed)

It's common for programmers to have one set of tests that they run with every edit-build-test cycle on their dev machine. They might have another set they run to validate each checkin before it merges in to source control. Another that runs nightly or weekly. Another that runs before each release.

What I've noticed is that the decision about which tests fit in each of these buckets is less about "unit" vs. "integration" and more about "FIRS" (without the T). That is, if a test is fast and the results are reliable and useful, programmers will tend to run them more often. If a test is slow, or results require investigation, they will tend to run them less often.

Ideally, I'd like to see 99.9% of tests run in 1ms or less, be perfectly repeatable, with a clear pass/fail, and for each failure to make it obvious what aspect of what desired behavior is not right. You should strive for that. But today, given the tests you have, you may find value in bucketing your tests as I've described.

Thursday, June 23, 2016

How to document your build process for an open source C# project

As an Open Source contributor...

I find an interesting open source project that I want to contribute to. I fork/clone the repository to my machine. Then I have to figure out how to build it.

Is there a solution file? Or a script?

I try something and the build fails. Do I need a certain SDK or Visual Studio feature installed? Which version?

I get it to build and then I try to run the tests. 1/3rd of them fail, because they are looking for something that isn't installed on my machine.

If I'm lucky (!?) I find a document in the repository that claims to be build instructions, but it is jumbled and clearly out of date. I try to follow it, but something I need to install is no longer available, or not compatible with my version of Windows. Will a newer version of that thing work OK?

Uggh, what a mess.

As an Open Source maintainer...

I put together a cool little project in my spare time and post it online. It's simple and straightforward to build and run tests. 

Then a contributor complains that they can't build it. What information could possibly be missing? It's simple and straightforward, right? I write a small text file explaining the obvious instructions. The contributor tries to follow it but is even more confused. I don't have time for this.

Uggh, what a mess.

A solution

My solution is AppVeyor. I treat AppVeyor as the reference build environment.

Here's how:
  2. New Project, select your project.
  3. Settings -> Do what you need to get a green build + tests
  4. Settings -> Export YAML. Add it to your repo.
  5. Delete the AppVeyor project
  6. New Project again, but this time configure nothing. It will use the settings from your repo.
  7. Confirm that build + tests are still green

Now the instructions for how to build + run tests are in source control. Anyone can read them. There won't be any missing details. If a dependency changes, I won't miss updating the instructions, because AppVeyor will report my build is broken. 

No more mess.

Saturday, June 11, 2016

An example of good engineering

I often advocate for good engineering practices and the path to Zero Bugs. Talking about these things is great and all, but concrete examples are important. I recently published an open source project that I think is a good example of this kind of work.

I hope you will copy some of the ideas to use in your own projects. You can read the source here:

Great tests

Code Coverage is a dumb measure, especially in this case. There are very few branches in the code; two tests would hit 100% coverage. 

In this project, you can pick any line of code and modify it to be incorrect, and you'll get a test failure that tells you exactly what is wrong. That is much more valuable than any code coverage number.

I can't guarantee that it has 0 bugs, but I can say that every type of bug I have ever imagined or experienced in this code is covered by a test. 

The tests are organized like a spec, using namespaces/folders to organize tests the same way as if you were writing a spec. Each name indicates what aspect of the system's behavior is being covered. 

The tests are super-fast, which makes the edit-build-test cycle a happy experience. 


ReSharper settings are included in the repository. All sources have been formatted with these R# settings. This makes it easy to keep formatting / style consistent. 

If a random person on the internet decides to make a contribution, I don't have to explain the project's style - they can just let ReSharper take care of that. 

ReSharper Code Inspections are 100% clean, further helping keep the code clean and consistent.


Every Pull Request is automatically validated by AppVeyor, including build + unit tests.

C# Warn-as-error is turned on for the AppVeyor build. I believe it's important to have 0 warnings - either heed the warning if it matters, or disable the warning if it doesn't. But I don't want to slow down my edit-build-test cycle just because of a warning, so I don't set warn-as-error on the desktop. But I do set it in AppVeyor, to ensure that all changes have 0 warnings before they hit master.

AppVeyor runs ReSharper Code Inspections, again ensuring there are 0 issues before merging to master. This is especially important because not everyone has ReSharper.

The AppVeyor web site lets you edit build settings online. It's a convenient way to tune the settings. Once I had them just right, I downloaded the appveyor.yml file and added it to the repository. Then I deleted my AppVeyor project and recreated it from scratch, to ensure that no online edits were required -- everything is in the repo. If anyone wants to fork this project on GitHub and set up the same build, that will be easy.


Each AppVeyor build produces a NuGet package, which means we know that there aren't any problems in the .nuspec file or anything like that.

When a commit is merged to master, a special AppVeyor build runs to generate an "official" nuget package which is then automatically uploaded to the package repository. (The API key is encrypted). AppVeyor automatically updates the version number, and it includes a "-beta" tag so no one expects it to hold to any Semantic Versioning guarantees. 

When semver becomes important for the project, I will implement a one-touch release process to nuget with non-beta version numbers.

The project itself

The whole purpose of this project is to help you get a step closer to zero bugs. 

It embodies all I have ever learned about how to implement equality in C#. Everything I have read; every mistake I have made; every mistake I can imagine making. It makes it easy to eliminate a class of errors: "C# class with incomplete or incorrect equality implementation".

It reduces the barrier to addressing Primitive Obsession, which means fewer bugs in the rest of your system, too.

The project is small

This is quite a small project. You may think that your codebase, being far larger and more complex, would not be amenable to this kind of engineering. I admit that I haven't proven otherwise. And even if you believe it would be possible and valuable to do it on your big project, you may not see how to map these ideas from here to there. Sorry.

But in some ways, the fact that it is small is part of its success. I have found a single need and satisfied that need in a single package. You can adopt this package without taking on any other requirements - no opinionated framework here. It adheres to the Single Responsibility Principle. It does what is needed and nothing else. Any time you can make a project do that, it's a win.


The Problem

Primitive Obsession is one of the most pervasive code smells out there. You can address it by moving a primitive in to a simple class.

I call the resulting class a "value type", but don't confuse it with C#'s notion of a value type, which doesn't get its own heap allocation, and is passed-by-value to other methods, and is a source of bugs if it's mutable. I mean "a type that represents a value in some domain".

If you want to implement equality on that class, there are a lot of tricky details that are easy to get wrong, at least in C#. For example:

This will throw an exception when trying to cast the Bar to a Point. So you try to fix it:

This will throw when trying to call null.GetType(). Uggh.

You probably want to override operator==() as well.

The compiler tells you to implement operator!=() to go with it, so you copy/paste and change the method name:

Oops, you forgot to negate the check. Bug.

If the value in question is a case-insensitive identifier of some sort, it's important that the GetHashCode() is implemented correctly. Don't do this:

Maybe you want to implement IEquatable<>, too, and you better get these details right there, too.

Many programmers don't test these details at all, or they test a few but not all, and they have to repeat the same set of tests each time they introduce a new class. If you discover a new rule (ToString() should follow equality, right?) you have to update all the tests.

Prior Art

Assertion libraries typically have an equality assertion. For example, in NUnit:

    Assert.AreEqual( new Point(7,8), new Point(7,8) );

That is insufficient. It only tells you that one of the equality checks you've written is correct, and doesn't catch all the other cases listed above.

The Solution

ValueTypeAssertions addresses all the mistakes I have ever made, or seen made, or can imagine when implementing equality in C#. Grab it from NuGet, and write a unit test like this:

    ValueTypeAssertions.HasValueEquality(new NtfsPath("foo.txt"), new NtfsPath("foo.txt"));

This says "these two objects should be equal, in every way that C# recognizes equality".

  ValueTypeAssertions.HasValueInequality(new NtfsPath("foo.txt"), new NtfsPath("bar.txt"));

Which says the same thing about not being equal.

If some part of your value should be case insensitive, just add another assertion:

  ValueTypeAssertions.HasValueEquality(new NtfsPath("foo.txt"), new NtfsPath("FOO.TXT"));

If you wrap two values, assert the combinations:

  ValueTypeAssertions.HasValueInequality(new Point(1, 2), new Point(1, 8));
  ValueTypeAssertions.HasValueInequality(new Point(1, 2), new Point(0, 2));

You can find the source on GitHub.


Do you find this useful?

What change would make it more useful to you?

Is there a name for this that would be more obvious?

Wednesday, May 11, 2016

Extract Method introduces a bug in this corner case

I rely on automated Extract Method to do the right thing, regardless of test coverage. This is a key part of attacking legacy code that lacks tests. But if the Extract Method introduces a subtle bug, then I can't rely on it.

Here's the code:

As it is, the test passes. If you extract the indicated block, then the test fails. Extract Method should add a `ref`  to the parameter on the new method.

This repros with VS2013, VS2015, and ReSharper 8, 9, and 10.

Saturday, May 7, 2016

Examples of tiny test-induced design damage

Imagine you are trying to write a unit test for some code, but you're finding it difficult.

Maybe there's some complex detail in the middle of a method that is not relevant to the current test, and wouldn't it be nice to disable that bit of code just for the purpose of the test? Maybe you could add an optional boolean parameter to the method, which when set causes the detail to be skipped.

With the exception of getting legacy code under test to support you when refactoring, I see this as a bad thing, making the code worse just for the sake of testing.

Here's my list so far:
  • method marked 'internal' for testing
  • method marked 'virtual' for testing
  • method overload for testing
  • additional optional method parameter, only used for testing
  • public field that is only modified under test, to change behavior for testing
  • public field that is only read by tests
  • function replaced with mutable delegate field, only mutated for testing
Yes, TDD is about letting tests influence your design, but not in this way!

So how do you tell the difference? Here are a few ways:

  • Will this be used for both testing and in production? 
  • Do you feel the urge to add a comment saying why you did this?
  • If you removed the tests, would you keep this design?
  • Your own design sense. Do you think the design is better?
What to do about it?

Usually the desire to do this indicates that your class/function/module/whatever is doing too much.

Maybe you need to extract a class. If it's not obvious what belongs in the class, you might need to extract some methods first, to put in that new class.

A really common case is primitive obsession, like if the method deals with some string. If you move the string in to a new class, and then move that "deals with the string" code in to the class, then the class is small and easy to test and your code has improved. This is Whole Value.
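
As a small sketch of Whole Value, suppose a method pokes at a raw e-mail string to find its domain. The `EmailAddress` class and its members below are hypothetical names chosen for illustration, and the validation rule is a simplifying assumption. The point is only that the "deals with the string" code moves into a small class that is easy to test on its own.

```csharp
using System;

// Hypothetical Whole Value: wraps the raw string and owns the code that deals with it.
public sealed class EmailAddress
{
    private readonly string _value;

    public EmailAddress(string value)
    {
        if (value == null) throw new ArgumentNullException(nameof(value));
        int at = value.IndexOf('@');
        // Assumption for this sketch: exactly one '@' with text on both sides.
        if (at <= 0 || at != value.LastIndexOf('@') || at == value.Length - 1)
            throw new ArgumentException("Not an e-mail address: " + value, nameof(value));
        _value = value;
    }

    // Formerly inline string slicing in the middle of a bigger method.
    public string Domain => _value.Substring(_value.IndexOf('@') + 1);

    public override string ToString() => _value;
}
```

The original method now takes an `EmailAddress` instead of a `string`, and no test-only parameter or field is needed to reach the string logic.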

Maybe there's something at the beginning or end of the method that talks to an external system, and that is making testing difficult. You could move those lines to the caller, and the method becomes testable.

I'd like to find some concrete examples.

Friday, May 6, 2016

Mob Programming conference 2016


Mobbing time lapse – a full day in 3 minutes

Woody Zuill keynote – how they found mobbing

Some Helpful Observations for successful Mob Programming (short slide deck)


Some of the people who I was glad to see at the conference:
  • Woody Zuill. Manager of the Hunter mob that discovered mobbing, and instigator of the #NoEstimates discussion
  • Llewellyn Falco. Creator of ApprovalTests, Teaching Kids Programming, credited with “strong-style” pair programming.
  • Nancy Van Schooenderwoert. Led a team of newbies to fantastic results, and wrote about it:

There were around 50 people total, including people from Cornwall, Sweden, Denmark, and Finland.


It was held at Microsoft’s New England Research and Development Center (“NERD Center”), right next to MIT. My cardkey didn’t work on the doors, though.

The 3 days beforehand were the Agile Games Conference, in the same space.


2 keynotes:
  • Woody Zuill on how they discovered mob programming
  • Llewellyn Falco on the science of mob programming (why it works)
4 mob programming workshops

2 open space slots.

The workshops were a chance to participate in a mob under the guidance of a mobbing expert. There were workshops at the introductory, intermediate, and advanced levels of mobbing.


The conference was less about teaching/learning, and more about experience. As such, most of my take-aways don’t fit in to an email. Hopefully I can facilitate these experiences for others.

Woody explicitly does not recommend mobbing. The important things he sees, which led to the discovery of mobbing + great results:

  • Kindness, consideration, and respect
  • The people doing the work should choose how they do their work
  • Turn up the good (work on making good things happen more, rather than fixing bad things – the bad things tend to melt away)

Mob programming is a skill. Don’t expect amazing results right at the outset.

At the conference I had the opportunity to experience mobbing at various levels, and this gave me a glimpse of what expert mobbing would look like. I can now see how that way of working would produce those amazing results.

A coworker asked me to get the answer to the question “what is the ideal mob size?” The answer I found is largely about reframing the question:

If a team is not skilled at mobbing, then you won’t get great results, regardless of mob size. An expert mobbing team will be able to work well with 4 people or with 14. Get people that have each of the skills/knowledge/talents that will be needed, so they don’t get blocked.

I can now teach you to differentiate a male house sparrow from a male song sparrow, in less than a second.

Monday, April 11, 2016

Definitions of "Zero Bugs"

I am writing in response to this tweet:
A common definition of "Bug" is "Code that does not work according to spec." I see this as a deliberately narrow definition to cope with (coddle!) too many bugs. I want to come back to that, but first some definitions of Zero:
  • The normal known bug count is 0. Switch from counting bugs to counting days/months between bugs.
  • For every bug we've ever seen, we know that that class of bug will never happen again.
  • We no longer need a find-and-fix cycle before shipping a feature.
  • A mindset shift, from "bugs are inevitable" to "bugs are, uhh, evitable".
  • An ideal to aim for, which informs how we work each day.
  • A state where the rules of the game have changed, and we discard the protocols and cautions we had put in place to manage bugs.
As we approach Zero, you can change your definition of "Bug" to:
Are any of these definitions the same as "no customer will ever find a bug in this code, ever"? No, but that hardly matters. You certainly shouldn't let that be an excuse to argue that Zero Bugs is impossible instead of deciding to start down the path to #BugsZero.

Thursday, February 18, 2016

BugsZero @ Agile Open Northwest 2016

Neo: What are you trying to tell me? That I can catch all my bugs in testing?  
Morpheus: No, Neo. I'm trying to tell you that when you're ready, you won't have to. 

TLDR: You already know how to do it; no heroics required; go for low hanging fruit; start now.

Typically when I mention the idea of No Bugs to people, they respond with doubt and disbelief. They think I'm nuts, or they think I'm defining "bug" in a very narrow way, or that it could only be possible in some very specific context (no schedule pressure, a simple problem domain, greenfield development, etc.).

What is a bug?

The definition of bug I am using is very broad: anything that disappoints or surprises anyone.

The only people that use narrow definitions of bugs are the people who have lots of bugs. This is a coping technique that is unnecessary when you have no bugs.

If I wrote my code correctly, but something I depend on broke and now my site is down, is that a bug? Yes.

If the developer implemented code according to spec, but the spec was wrong, is that a bug? Of course it is.

I don't care about categorizing bugs. It's just bugs.

If you ever ask "does X count as a bug", the answer should be "yes".

When is a bug?

Are we only talking about bugs that customers see? What if it's caught during testing?

I measure "bug injection" when the change is checked in to source control. When it escapes the developer's machine. In GitHub it would be when a pull request is merged in to master. I like this definition because it lets me lean on unit tests, static analysis, lint, etc. in an automated CI system.

Arlo wishes he could measure even earlier - if it gets typed in to the editor, it counts as a bug. More on that later...

What is zero?

At the AONW session Arlo asked the room how many bugs people currently have open in their bug tracking system. Answers looked like:
  • 1700
  • 250
  • 200
  • 200
  • 100

Then he asked Brian Geihsler about a project he was on. The answer had a very different shape:

  • 3 days to 3 weeks between bugs

(They also measured # of stories delivered between bugs.)

And then he asked Chris Lucian:

  • 12-18 months between bugs

Changing the rules

Are these zero? My inner mathematician says no, but my inner project manager says yes. If you can measure days between bugs, that changes the rules:

  • You no longer need to get the most expensive people in a room to triage bugs.
  • You never need to argue about whether something is a bug.
  • You never need to choose between fixing a bug and writing a feature. 
  • You can ship whenever you want.
How is this possible?

It's not about testing. It's about addressing the causes of bugs.

Where do bugs come from?

Bugs happen when a human makes an incorrect decision. 

The human brain is really good at making decisions, and doesn't let a lack of information get in the way. Even worse, it doesn't tell you that it's making a decision based on a lack of information. It just makes the decision and feels confident about it. Worse still, you have a limited short term memory, so even if the information you need is available to you, it may not all fit, but you won't know it.

Here are some ways that code can set you up to make bad decisions:
  • A variable is named "taxReturn" when it represents a "tax refund" (code that lies)
  • A variable is named "txRfnd" when it represents a "tax refund" (abbrs. obfuscate)
  • Two variables representing the same idea are named differently (unnecessary synonym)
  • One idea is expressed in more than one place
  • A function that is very long
  • Whitespace/indentation doesn't match the parse tree (Python wins here!)
Some examples out of code:
  • A dependency broke (add automated checking that the dependency still works)
  • I wrote a feature the customer doesn't want (pair with a customer)
How to get to zero bugs?

This is my favorite take-away from the AONW session: there's no secret. You already know how to get there. 

You already know how to get a little better. Rename a variable. Automate a step in your release process. Pair program on a kata for an hour. You can probably think of a dozen small improvements that you could make right now.

Each time there's a bug, look for some way you can avoid that class of issue. Pick the low-hanging fruit. The easiest, quickest, safest change that you know you can execute and get benefit from right away. Don't be ambitious. Do pick something that has been trouble recently.

Do it again. Keep iterating. 

How long will it take?

Assume it will take about 2 years to get to Zero Bugs. 

That means you need to progress 1% towards your goal each week. I know you know how to get 1% better right now.

It's a choice.

Now that you know how to stop writing bugs, the responsibility rests on your shoulders. If you're still writing bugs 2 years from now, it's because you decided to keep writing bugs.

Start now.

Wednesday, December 23, 2015

My ideal edit/build/test/commit/deploy/etc. system

There's a ton of variation out there in how teams set up the pipeline from "edit code" to "live in production". I want to talk about my ideal, to use as a reference point in further discussion.

TL;DR: When a change is pushed to master, it is proven ready for production.

"pushed to master" is equivalent to "makes it off a development machine".

It's common in Git to make multiple commits locally before pushing them up to the official repository. I am fine with those local commits not all passing tests. It's the "push" or "merge" that matters.

I take the term "master" from popular Git usage, but that's not important - it could be "trunk" or "Main" or whatever.

"Proven" here can mean a bunch of things. Obviously, it includes passing unit tests. It also includes compilation, so I will lean on the compiler. It also includes static analysis, which I will extend to eliminate classes of bugs.

It's important that this "proving" process be super fast, so that I never hesitate to run it. If it's slow, I'll want to separate the slow and fast parts, and require only the fast parts to be run on every change. The slow parts might run every few changes, or every night, or whatever, which means I don't know that master is always ready for production. So I look for ways to make it all super fast.

Sometimes a bug will slip through, and be caught by manual testing, or production monitoring, or by a customer. When this happens, I look for some way I can improve my "proving" to eliminate this class of bugs forever. In this way, over time my confidence in "ready for production" steadily grows.

Some teams have an "in development" branch, where changes can go before master, so that they can be shared between developers even if they're not production ready. In my ideal model, I don't need that. I use vertical slicing, safe refactoring, feature flags, etc. to be able to commit my changes quickly. My branches are short-lived. If my changes pass tests, I push them to master, and I'm done.

Some teams have an "in test" branch, where they'll take a snapshot of what's in master, and then run a testing pass before going to production (with some iteration for making additional fixes). In my ideal model, I don't need that. If my changes pass tests, I push them to master, and they're ready for production.

Ideally, there's an automated system that runs these builds + tests against proposed changes and then pushes them to master if they pass. In TFS they call this "gated checkin"; some people call it "Continuous Integration". The important thing is that you know for sure that master is always green - the validation always passes.

I want to reinforce the point that this is an ideal. I don't expect you to get there tomorrow. But I do want you to agree that this is both valuable and feasible, and start working towards this ideal today. Each step you take in this direction will make things a little better. You'll get there eventually.

And don't do something irresponsible like delete all your integrated tests, or fire your QA staff. Start moving towards this ideal, but keep your old process around until you can demonstrate that it is no longer giving you value.

Sunday, December 6, 2015

Types of integration/integrated test

I've noticed that people often use these terms interchangeably.

And when I look at the kinds of tests they're talking about, I see a bunch of different things. Each of these things is worth considering separately, but we lack crisp terminology for them. (I've touched on this before.)

1. Testing class A through B

2. Testing class A, but B is incidentally along for the ride

3. I have tested classes A and B separately, but now I want to test that they work together. 

That is, that they integrate correctly.

4. My business logic is testable in isolation, but then I have an adapter for each external system; I test these adapters against the real external system. I call this a focused integration test, and it happens when I use Ports/Adapters/Simulators.

5. I have unit tested bits of my system to some degree, but I don't have confidence that it's ready to ship until I run it in a real(ish) environment with real(ish) load. 

6. I am responsible for one service; you are responsible for another; our customers only care that they work together. We deploy our services to an integration environment, and run end-to-end tests there.

Every "Extract Method" starts with minus 1 points

Eric Gunnerson once wrote about the idea that, in programming language design, every potential language feature starts with "minus 100 points":
Every feature starts out in the hole by 100 points, which means that it has to have a significant net positive effect on the overall package for it to make it into the language. Some features are okay features for a language to have, they just aren't quite good enough to make it into the language.
Once a feature makes it in to a programming language, it's in there forever. If you later realize it could have been better if done a little differently, you're stuck. Features tend to join to create combinatoric complexity, so each feature you add now means potentially big costs down the line.

When refactoring, I say "Every 'Extract Method' starts with minus 1 points".

The default negative reflects the cost of looking in two places to understand your program, where previously everything was in one place. The extracted method has to provide some additional value to justify its existence.

If the new method lets you eliminate duplication, add points.

If the new method is poorly named (worse than good / accurate / honest), subtract points. If the name more clearly expresses intent, add points.

If the calling method is now easier to follow, add points.

It's not a very high bar, but if you can't get to positive territory before merging to master, throw away the refactoring.

Sunday, September 13, 2015

Unit testing microskills

In response to my Why we Test posts, George Dinwiddie had this to say:
The connection between why and how is important, but the details are not obvious. I'll pick a few values that people hope (unit) tests might offer, and give my thoughts on how to practice testing to deliver this value. (This is certainly not a complete analysis of the subject.)
prevent regressions due to future work
Most people pick up on this one right away: as long as you can get a green bar before making changes, and another green bar when you're done, your tests catch bugs before they get checked in. Great!

Speed, readability, and granularity of tests aren't as important as good coverage. They don't even have to be unit tests - any tests will do. Reliability with a clear pass/fail result is important, so that bug-induced test failures actually get recognized.

If a piece of code is a completely obvious expression of a business requirement, you still need to write a test for it, since the tests call out the intentional behavior.

"prevent regressions" does not appear to require test-first. In fact, teams that focus on this value tend to write many of their tests afterwards. Because the code isn't written for testability, it's hard to test (duh). Either we don't bother testing it, or we bend over backwards writing horrible tests that are hard to understand, and lock down implementation details, making future refactoring harder.
a safety net during refactoring
Readability and granularity of tests aren't as important as good coverage and speed. Slow tests mean you won't run as often, which means you won't catch mistakes as quickly, which makes refactoring more expensive. That changes the cost/value/risk equation for refactoring, so you won't refactor as often.

Test speed includes any time spent analyzing results and rerunning flaky tests, so make test results obvious and rock-solid.

Many organizations are nervous about the risk of bugs from refactoring, even though they tolerate bugs from feature work. In that context, great coverage is particularly important for the refactoring safety net.

In an effort to improve coverage, teams that focus on the refactoring safety net will often test implementation details, including breaking encapsulation and injecting mocks to access those details. In the process, they lock down those details, making refactoring more difficult. That's Irony Number One.

Getting proper coverage, for both "prevent regressions" and "refactoring safety net" can be difficult. Applying the Three Rules of TDD is an effective way to get the coverage that you actually need. As long as you avoid testing implementation details, you'll necessarily have to decouple your code to make this happen. So you'll naturally end up with a code base that is at least moderately well-factored, even before you try to use the tests as a refactoring safety net. That's Irony Number Two.
make DRY problems visible
DRY problems become visible in TDD when you find yourself writing the same test repeatedly. My favorite example is file path case insensitivity in Windows. Consider:

    if (Path.GetExtension(fileName) == ".cs")

There's a bug here: if the file is named ".CS" then I want the software to work the same as ".cs". I can fix it locally, by switching to a case insensitive string comparison. And I diligently write a test for it. But then tomorrow I write another file extension check in another piece of code, and I write another test. I may end up with a thousand expressions of this rule, and (if diligent) a thousand corresponding unit tests.

The rule I'm trying to test here is "File extensions are case-insensitive". I want to have exactly one test that describes and enforces that rule. Which means that rule must be expressed in exactly one place. That's DRY.

The correct response to "I'm testing this idea multiple times" is "extract the duplicated behavior from all the places it's used, and merge them to one place, and test that one place."
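
Here is a minimal sketch of what "one place, one test" could look like for the file extension rule. The names `FileExtensions`, `HasExtension`, and the test class are hypothetical, and the test is plain C# rather than any particular test framework.

```csharp
using System;
using System.IO;

// Hypothetical helper: the one place that knows file extensions are case-insensitive.
public static class FileExtensions
{
    public static bool HasExtension(string path, string extension)
    {
        // Path.GetExtension returns the extension including the leading dot, e.g. ".CS".
        return string.Equals(Path.GetExtension(path), extension,
                             StringComparison.OrdinalIgnoreCase);
    }
}

// The single test that describes and enforces the rule.
public static class FileExtensionsTests
{
    public static void ExtensionsAreCaseInsensitive()
    {
        if (!FileExtensions.HasExtension("Program.CS", ".cs"))
            throw new Exception("File extensions should be case-insensitive");
    }
}
```

Every other call site then asks `FileExtensions.HasExtension(...)` and needs no case-sensitivity test of its own.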

Note that test execution time is irrelevant here; you don't ever have to run your tests to get this value! However, responding to this design feedback leads to code that is factored in a way such that tests are naturally very fast (Irony Number Three!).

Readability is important: you have to be able to read the test to understand what requirement it's describing, to be able to detect the duplication.

Granularity is important: tests must each describe exactly one requirement, or the duplication won't be visible.

DRY reduces bugs, as it eliminates the risk of updating only 999 of the 1000 places a rule is expressed. DRY (along with Great Names / Coupling / Cohesion) is far more effective at eliminating bugs in shipped software than tests that are intended to catch bugs. (Irony Number Four)

Saturday, September 12, 2015

Why we test, Part 8: Because we are competent professionals

#15 in my list of reasons why we (unit) test, which I learned from James Shore:
Refactoring without tests is inherently unsafe, because of the risk of introducing bugs. As a professional, I would never take such risks. Therefore, I would only refactor when I know I have good tests. In this way, TDD makes refactoring possible.
I may not be representing his idea with perfect fidelity; for that I apologize.

My comments:
  1. There is a class of programming languages* for which there exist reliable refactoring tools. With these tools I can safely refactor even without tests.
  2. The reliable tools work by following a recipe. If a human follows the same recipe carefully, they'll get the same result. That would work in strongly typed languages that lack good tooling.
  3. Plenty of people who make their careers as programmers ("professionals") do sloppy work, but not those who are competent.
  4. The tests have to be good. If you only write tests when it's easy, they won't give you enough protection. The only way I know to get this kind of test coverage is if you strictly follow the Three Rules of TDD.
  5. When naive** TDDers aim for 100% test coverage, they go to extreme lengths in their tests, including bad mocks*** and test cases that don't correspond to any business value. These common problems lock down implementation, which makes refactoring far more difficult; the opposite of Jim's goal.
* It's C# and Java

** most programmers

*** mocking is fantastic for Tell, Don't Ask, and problematic without TDA.

Sunday, August 23, 2015

My ideal backlog


There are two ways people seem to want to use a backlog:

A) To sort by priority, so the next thing we do is the most important thing to do next.

B) To make sure we don't forget anything important.

In both cases, the cost and value get worse as the list grows. Good ideas that are 1/2-way down the list will get duplicated by mistake, but with different phrasing, so the duplication is not obvious. Sorting, de-duping, and understanding the items gets more expensive, but none of that effort actually creates any business value.

I see a lot of teams with backlogs that would take a year to work through, if no new ideas came along. And of course new ideas always come along, at least if you're working on anything that matters.

Since items come in to the backlog faster than they go out, the list steadily grows, and most ideas never leave the backlog. People start to believe that the backlog is where good ideas go to die.


Keep the backlog short.

7 items seems ideal, because you can keep them all in your head long enough to understand the whole list.

When a new idea appears, compare it to the current backlog, and ask "is this item higher priority than any of the items currently on the list?" If not, then let it go. Don't worry about forgetting. Trust that if it becomes more important, it will grab your attention again, and can be added to the list at that time. More likely, you'll think of something even more awesome, and do that instead. That's a good thing: doing the more awesome things before the less awesome things.

Alternate Solution:

In many organizations, my proposal won't fly. People come to the team with requests, and would be upset if you said "It's not in our top 7, so we're letting it go."

In that case, keep two lists. The first list is the stuff you're going to do next (today/this sprint/whatever), and only has a few items on it. The second list is the bucket of possible future ideas, and can be any size. Spend as little time as possible grooming the second list.

When a new idea appears, compare it to the "To Do Next" list, and ask "is this higher priority than any of the items currently on the list?" If not, put it on the "Possible Future Ideas" list. Tell the requester that their idea is "on the backlog," and will be weighed against other items on the backlog when planning future releases. They'll understand that if you didn't do their idea, it's because something even better happened.

Sidebar: Hold prioritization very lightly.

We prioritize work by considering the estimated cost and value of that work. Both types of estimates are notoriously unreliable. You may believe you're working on the next most important thing, but you're probably wrong in some way that you can't know yet.

If you start working on an item, stay open to discovering that you should actually be doing something else. As Woody Zuill says:
This is another reason to slice work very thinly. The smaller the item, the sooner you can get to the point where you learn what you should really be doing, and the more likely it is that this current item will get completed and deliver some value before switching to your new discovery.

Sunday, July 19, 2015

GetRouteData() in ASP.NET WebApi

I've been trying to get System.Web.Http.HttpRouteCollection.GetRouteData() to work in ASP.NET WebApi recently, and had a hard time of it. In ASP.NET MVC it's really easy, but there are additional details I couldn't figure out in WebApi. There was even a detailed set of answers on StackOverflow, but when I tried them, they all failed in ways that didn't make sense to me.

And now I have seen it work, so I want to document it. Here's what I did:

  1. In VS 2013, New Project -> Web, ASP.NET Web Application
  2. Select WebAPI. Check "Add unit tests".
  3. Add the following unit test:

And here's a Git repository with the complete working solution.

(Thanks to this blog post for unblocking me.)

Thursday, May 14, 2015

The relationship between DRY and Coupling

I think that the DRY principle is a subset of* "Low Coupling".

DRY & Coupling:

If one rule is expressed in two places in your code (violating DRY), and you want to change the rule, you must edit both places. This is coupling.

byte[] OpenFile(string fileName)
{
    // Is it our file type?
    if (Path.GetExtension(fileName) == ".foo") ...
}

void AutoSaveFile(byte[] contents)
{
    var path = Path.Combine(directory, DateTime.Now.ToString("dd_MM_yyyy") + ".foo");
}

If we decide to change our file extension to the much more reasonable ".bar", then we must edit both.
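
One way to remove that coupling is to express the extension in exactly one place. This is a minimal sketch with hypothetical names (`FooFiles`, `IsOurFileType`, `AutoSavePath`). The date is passed in as a parameter, which is an assumption made to keep the example deterministic.

```csharp
using System;
using System.Globalization;
using System.IO;

public static class FooFiles
{
    // The rule "our files end in .foo" is expressed exactly once.
    public const string Extension = ".foo";

    public static bool IsOurFileType(string fileName) =>
        string.Equals(Path.GetExtension(fileName), Extension,
                      StringComparison.OrdinalIgnoreCase);

    public static string AutoSavePath(string directory, DateTime now) =>
        Path.Combine(directory,
                     now.ToString("dd_MM_yyyy", CultureInfo.InvariantCulture) + Extension);
}
```

Switching to ".bar" is now a one-line edit to `Extension`, and both the open and auto-save paths follow it.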

*possibly equivalent to

The Prime Refactoring

I used to believe that the two most important refactorings were Extract Method and Rename. The way they deliver value and the way they are used are quite different, so it's hard to compare, so I figured they had equal value.

Recently I've decided that Rename is slightly more urgent, if not more important. It is the first refactoring to learn; the first to teach; the first to apply. (Just slightly)

The problem is code that lies to you. It says it's doing one thing, but actually it's doing another. You either have to think really hard to figure that out (slow) or you misunderstand the code and write bugs.

Fix that first. It may lack cohesion, have tight coupling, and lots of duplication, but first introduce good names. Rename to make the code stop lying to you.

(Soon afterwards, start using Extract Method to give you more things to name.)

Monday, April 6, 2015

"good" names - a minbar

In code, naming things well is incredibly powerful. Names help with expressing intent, increasing cohesion, and identifying duplication.

Bad naming can do a lot of damage. Names that lie, mislead, or obfuscate will confuse a programmer, or at least make her work harder to get the job done.

I think a name is "good" when you don't have to examine what is behind the name to know what it does. It doesn't have to add additional value, it just has to avoid obfuscation. For example:

void AThenB()

If you see AThenB() in code, you'll know exactly what it does. Not a great name, but not a damaging name, either.

This is the minimum bar when naming a new entity in code. It's not a hard bar to meet. You can often do way better. But never check in any code that doesn't meet this bar.

JBrains calls it "accurate names".

Arlo Belshee calls this "tweetable names":

Wednesday, March 18, 2015

The zeroth rule of software estimating

I realized that before even the first rule of software estimating must come:
Know why you are estimating.
We take it for granted that software estimating is something we must do. For many people, this is obvious. But when we start talking about why we estimate, I see many different answers. Perhaps it is not so obvious after all.

Some of the answers I have heard:

  1. To decide which work to do next.
  2. To decide how many items to start working on in an iteration.
  3. To decide how many people to hire.
  4. To sync up long-lead work (e.g. marketing).
  5. To evaluate and reward the performance of individuals.
  6. To evaluate and reward the performance of teams.
  7. To measure the impact of changes in process, tools, technical debt, etc.
  8. As a lever to push people to work harder.
It's common to choose more than one. This can produce really wacky results.

Whatever your reasons are, it's worth understanding them deeply. Is that something you really need? Is this approach really going to give you that result? Are there other ways that are more effective? 

Thursday, February 26, 2015

The second rule of software estimation

The more error there is in your estimates, the less precise you must be.

That's based on my past experience with being wrong a lot, and seeing other people be wrong a lot. If I tell you I can write a feature in a day, and sometimes I'm right, and sometimes it takes a month, then there's no reason to differentiate between 5-hour and 6-hour features when estimating.

I suspect that powers-of-n is a good model for many teams, where n depends on some combination of team familiarity with the code, technical debt, domain complexity, etc.

A statistician could certainly give some guidance here. Something about standard deviations.

A lot of teams like to use Fibonacci numbers for their estimates, which seems weird to me. Why is this a good sequence? Why jump from 1 to 2 (a 100% increase) then to 3 (a 50% increase)? Can you really tell a 2 and a 3 apart, reliably enough to be useful?

In Fibonacci, the next number is "twice the average of the last two numbers", which is pretty close to "twice the last number". I doubt your estimates are reliable enough that the difference will matter. And powers of two are culturally familiar in software, easy to remember, and easy for programmers to add.
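
As a rough sketch of bucketing by powers of two, the hypothetical helper below rounds a raw estimate up to the next power of two, so a 5-hour guess and a 6-hour guess land in the same bucket. The name `EstimateBuckets` and the choice to round up rather than to the nearest bucket are assumptions.

```csharp
using System;

public static class EstimateBuckets
{
    // Rounds a positive estimate up to a power of two: 1, 2, 4, 8, ...
    public static int RoundUpToPowerOfTwo(int estimate)
    {
        // Upper bound keeps the doubling below from overflowing an int.
        if (estimate < 1 || estimate > (1 << 30))
            throw new ArgumentOutOfRangeException(nameof(estimate));

        int bucket = 1;
        while (bucket < estimate) bucket *= 2;
        return bucket;
    }
}
```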

See also: the first rule.

Tuesday, February 24, 2015

The first rule of software estimating

Take a list of pieces of work you might do. Stories, features, products, I don't care. Find two that are the same size. Approximately.

Do them both. Measure how long they took. Did they come out the same?

If you can't reliably recognize two items as being the same size, then nothing else in estimation will work for you. It all builds on this.

How I write "contract tests"

This comes up in conversation often enough that I want to write it down.


My code talks to an external dependency that is awkward to use in unit tests.

I can refactor most of my code to eliminate the dependency. (See DEP and Whole Value). But I still have some code that talks to the external dependency. I wrap the dependency with an adapter (see Ports-and-Adapters) of significant thickness and abstraction (see Mimic Adapter). In test, I replace the real dependency with a legitimate but simplified test double (see Simulators).


I can't be certain that my simulator has fidelity with my real system. They may behave differently, allowing my tests to pass when my system has a bug. (This is a common problem with mocks.)


Write one set of tests for the port, and run those same tests against both the real and the simulated implementations.

In C#:
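
A minimal sketch of that shape, under some assumptions: the port `IBlobStore`, the simulator `InMemoryBlobStore`, and the adapter `FileBlobStore` are hypothetical names, with the file system standing in for the awkward external dependency. It uses plain methods and a tiny runner so it has no dependencies; in NUnit or MSTest you would mark the test methods with `[Test]` or `[TestMethod]` and let the framework run both subclasses.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// The port: the narrow interface the rest of my code talks to. (Hypothetical.)
public interface IBlobStore
{
    void Write(string name, string contents);
    string Read(string name);
}

// Simulator: a legitimate but simplified in-memory implementation.
public class InMemoryBlobStore : IBlobStore
{
    private readonly Dictionary<string, string> blobs = new Dictionary<string, string>();
    public void Write(string name, string contents) => blobs[name] = contents;
    public string Read(string name) => blobs[name];
}

// Adapter over the real dependency (here, the file system).
public class FileBlobStore : IBlobStore
{
    private readonly string root;
    public FileBlobStore(string root)
    {
        this.root = root;
        Directory.CreateDirectory(root);
    }
    public void Write(string name, string contents) =>
        File.WriteAllText(Path.Combine(root, name), contents);
    public string Read(string name) =>
        File.ReadAllText(Path.Combine(root, name));
}

// One set of tests, written against the port only.
public abstract class BlobStoreContractTests
{
    protected abstract IBlobStore CreateStore();

    public void ReadReturnsWhatWasWritten()
    {
        var store = CreateStore();
        store.Write("greeting", "hello");
        Check(store.Read("greeting") == "hello", "Read should return what was written");
    }

    public void WriteReplacesExistingContents()
    {
        var store = CreateStore();
        store.Write("greeting", "hello");
        store.Write("greeting", "goodbye");
        Check(store.Read("greeting") == "goodbye", "second Write should replace the first");
    }

    private static void Check(bool condition, string message)
    {
        if (!condition) throw new Exception("FAILED: " + message);
    }
}

// The same tests, bound to each implementation.
public class InMemoryBlobStoreTests : BlobStoreContractTests
{
    protected override IBlobStore CreateStore() => new InMemoryBlobStore();
}

public class FileBlobStoreTests : BlobStoreContractTests
{
    protected override IBlobStore CreateStore() =>
        new FileBlobStore(Path.Combine(Path.GetTempPath(), Path.GetRandomFileName()));
}

public static class Program
{
    public static void Main()
    {
        RunAll(new InMemoryBlobStoreTests()); // fast: every build
        RunAll(new FileBlobStoreTests());     // slower: CI, or whatever schedule fits
        Console.WriteLine("Contract holds for both implementations");
    }

    private static void RunAll(BlobStoreContractTests tests)
    {
        tests.ReadReturnsWhatWasWritten();
        tests.WriteReplacesExistingContents();
    }
}
```

The abstract class knows only about the port, so any behavior the tests pin down is pinned down for the simulator and the real adapter alike.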

Tests on the simulator are fast enough to run with every build.

Tests on the real system may be slow; they may require awkward setup; they may cost real dollars to run. You may decide to run them only in your CI or once per sprint or whatever. Since adapters are relatively stable, that can be OK.

Tuesday, February 17, 2015

Bug metrics

Metrics are tricky. Plenty of ink has been spilled on that topic, so I'll leave it for now.

Around bugs, I know of 4 interesting metrics:
  • A: Count of active bugs
  • B: Time to fix
  • C: Fix rate
  • D: Injection rate
When I want to sound like I understand queuing theory, I call them Peak / Latency / Throughput / Load.
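
To pin down what I mean by each letter, here is a rough sketch that computes them from a list of bug records. The `Bug` type and the method names are hypothetical, and D is only approximated by the rate at which bugs are reported, which is not the same as the rate at which they are written.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical bug record; a real tracker will have its own shape.
public class Bug
{
    public DateTime Opened { get; set; }
    public DateTime? Fixed { get; set; } // null while the bug is still active
}

public static class BugMetrics
{
    // A: bugs opened on or before 'now' and not yet fixed at that moment.
    public static int ActiveCount(IEnumerable<Bug> bugs, DateTime now) =>
        bugs.Count(b => b.Opened <= now && (b.Fixed == null || b.Fixed > now));

    // B: mean days from opened to fixed, over the bugs that have been fixed.
    // (Throws InvalidOperationException if nothing has been fixed yet.)
    public static double MeanDaysToFix(IEnumerable<Bug> bugs) =>
        bugs.Where(b => b.Fixed != null)
            .Average(b => (b.Fixed.Value - b.Opened).TotalDays);

    // C: fixes per week within [start, end).
    public static double FixesPerWeek(IEnumerable<Bug> bugs, DateTime start, DateTime end) =>
        bugs.Count(b => b.Fixed >= start && b.Fixed < end) / Weeks(start, end);

    // D, approximated: bugs reported per week within [start, end).
    public static double ReportsPerWeek(IEnumerable<Bug> bugs, DateTime start, DateTime end) =>
        bugs.Count(b => b.Opened >= start && b.Opened < end) / Weeks(start, end);

    private static double Weeks(DateTime start, DateTime end) =>
        (end - start).TotalDays / 7.0;
}

public static class Program
{
    public static void Main()
    {
        // Made-up sample data: three bugs over a two-week window.
        var start = new DateTime(2015, 1, 1);
        var end = start.AddDays(14);
        var bugs = new List<Bug>
        {
            new Bug { Opened = start, Fixed = start.AddDays(2) },
            new Bug { Opened = start.AddDays(3), Fixed = start.AddDays(7) },
            new Bug { Opened = start.AddDays(10) }, // still active
        };

        Console.WriteLine("A (active): " + BugMetrics.ActiveCount(bugs, end));
        Console.WriteLine("B (mean days to fix): " + BugMetrics.MeanDaysToFix(bugs));
        Console.WriteLine("C (fixes per week): " + BugMetrics.FixesPerWeek(bugs, start, end));
        Console.WriteLine("D (reports per week): " + BugMetrics.ReportsPerWeek(bugs, start, end));
    }
}
```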

(I'm ignoring the disconnect between what we can measure and what is true. For example, bugs in the system that are impacting customers but are not currently tracked by the team. See

Customers only care about A and B.

Companies that I have worked at often give a lot of attention to A. For example, I've seen "Bug Hell", where any dev (or any team) with more than a certain number of active bugs must stop working on features until the bug count is lowered. 

In the orgs I'm familiar with, we tend to go immediately from A to C, with bad consequences. Focusing on C means devs will tend to choose narrower fixes; they'll allow tech debt to accumulate; they'll forgo testing; they'll fix cheap bugs before important bugs; they'll work when tired; they'll multitask. The inevitable bug bounce will be higher. This is all bad for customers; it's bad for business.

Getting B (latency) down is great, but it's not always directly actionable. You can prioritize bug fixes before feature work. You can strictly assign bugs back to the devs that created them, throttling the most prolific bug creators.

I see D (injection rate) as being a valuable thing to focus on (although it's difficult to measure). As you write fewer bugs, A and B will get better, which is good for customers. And C will become irrelevant.

Because A->C is such a deeply ingrained habit in our corporate culture, if you don't want that to happen, you have to actively exert effort to take things in a different direction. Every time someone says "we have N bugs", make sure they also say "remember to treat each bug as a learning experience - what can we do to make sure this kind of bug doesn't happen again?" and never say "we fixed M bugs this week."

(Thanks to Bill Hanlon for putting a lot of these ideas out there.)

Using MS Fakes safely

MS Fakes can generate something called "Stubs", which can override virtuals, and "Shims", which can override anything, including statics and members of sealed classes.

If you decide to use them, I recommend using these rules:

Only generate the fakes you care about

Use Disable, Clear, and !.

<StubGeneration Disable="true"/>

    <Add FullName="Foo.Bar!" />

Enable diagnostics:

<Fakes xmlns="" Diagnostic="true">

Treat Fakes warnings as errors

Sadly, there's no easy way to do this. Edit:

C:\Program Files (x86)\MSBuild\Microsoft\VisualStudio\v12.0\Fakes\Microsoft.QualityTools.Testing.Fakes.targets

In the BuildFakesAssemblies target, the GenerateFakes task sets the FakesMessages property, which you always want to be blank, so add:

<Error Condition="'@(FakesMessages)' != ''" Text="Error generating fakes" />

Saturday, February 14, 2015

Write your own unit test "framework"

If you haven't already done it, I recommend you try writing your own unit testing framework. Actually, do it several times, in several different ways.

The existing unit testing packages are sizable pieces of software, and I'm not recommending you spend weeks on this effort. Keep it simple. In fact, the bare minimum to get started with TDD is almost nothing:
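
For instance, a sketch along these lines is enough to start writing tests first. The `Add` function and the test name are placeholders I made up for illustration; the whole "framework" is one `Assert` method and a `Main` that calls each test by hand.

```csharp
using System;

public static class Tests
{
    // The entire "framework": fail loudly when a condition is false.
    public static void Assert(bool condition, string message)
    {
        if (!condition) throw new Exception("FAILED: " + message);
    }

    // Code under test (a placeholder).
    public static int Add(int a, int b) => a + b;

    // A test is just a method that calls Assert.
    public static void AddingTwoNumbersGivesTheirSum()
    {
        Assert(Add(2, 3) == 5, "2 + 3 should be 5");
    }

    // "Test discovery" is calling each test by hand.
    // The first failure throws, so later tests do not run.
    public static void Main()
    {
        AddingTwoNumbersGivesTheirSum();
        Console.WriteLine("All tests passed");
    }
}
```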

Sure, there is value in automatic test discovery, in rich asserts, running all tests even when one fails, reporting, etc. But you don't have to have those things to get started. (Remember this next time you are away from WiFi and have a programming idea.)

Starting from this point, experiment with different ways to write a unit test framework. Some ideas to consider:
  • What's the #1 feature you miss the most in the above example?
  • A natural way to extend asserts into your domain.
  • How easy is it to make the mistake of writing a test that never gets run?
  • If my tests are super-fast, how much overhead is there in test discovery and reporting?
  • Reporting that points directly to the site of the failure.
  • How much boilerplate does a developer have to write?
  • Test discovery: reflection ([Test]), inline functions (describe(()=>{})), or something else?
  • If you only supplied one built-in assert, would it be "Assert True", "Assert Equals", or something else? What are the implications?
  • Try both traditional asserts (AssertFoo(result...)) and fluent asserts (Assert.That(result).IsFoo(...)).
Let me know what you find.

Thursday, January 29, 2015

Exceptions are a primitive type

I hold that Exception is a primitive type, and so using one directly in your code is a common example of Primitive Obsession.

The antidote is to wrap the primitive in a Whole Value. It's a pretty straightforward transformation when your code looks like this:

- Make a new exception class, typically nested at the current scope.
- Name it based on the message text.
- Parameters to the message become parameters to the constructor and properties on the new class.
- Override the "Message" property to hold the string.Format() call.

Like this:
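
Here is a sketch of the transformation, before and after. The `Group` class, the message text, and the exception name are invented for illustration.

```csharp
using System;

public class Group
{
    private readonly string name;
    public Group(string name) { this.name = name; }

    // Before: a primitive Exception carrying a formatted string.
    public void RemoveOldStyle(string userName)
    {
        throw new Exception(
            string.Format("User '{0}' is not in group '{1}'", userName, name));
    }

    // After: throw a Whole Value instead.
    public void Remove(string userName)
    {
        throw new UserNotInGroupException(userName, name);
    }

    // Nested at the current scope and named after the message text.
    public class UserNotInGroupException : Exception
    {
        // Parameters to the message become constructor parameters and properties.
        public string UserName { get; }
        public string GroupName { get; }

        public UserNotInGroupException(string userName, string groupName)
        {
            UserName = userName;
            GroupName = groupName;
        }

        // The string.Format() call moves into the Message override.
        public override string Message =>
            string.Format("User '{0}' is not in group '{1}'", UserName, GroupName);
    }
}

public static class Program
{
    public static void Main()
    {
        try
        {
            new Group("admins").Remove("alice");
        }
        catch (Group.UserNotInGroupException e)
        {
            // Callers can use the properties instead of parsing a string.
            Console.WriteLine(e.UserName + " / " + e.GroupName);
            Console.WriteLine(e.Message);
        }
    }
}
```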

Like all good design moves, this helps testing.

Note that in the 2nd example, I'm separating "I expect this exception with these properties" from "the exception should be able to format itself like this". There's a nice separation of concerns.
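
That separation might look something like the two tests below. This is a sketch with invented names (`UserNotInGroupException`, `Group.Remove`) and a hand-rolled `Check` helper instead of a real test framework.

```csharp
using System;

public class UserNotInGroupException : Exception
{
    public string UserName { get; }
    public string GroupName { get; }

    public UserNotInGroupException(string userName, string groupName)
    {
        UserName = userName;
        GroupName = groupName;
    }

    public override string Message =>
        string.Format("User '{0}' is not in group '{1}'", UserName, GroupName);
}

public static class Group
{
    // Stand-in for real code: always reports that the user is missing.
    public static void Remove(string groupName, string userName) =>
        throw new UserNotInGroupException(userName, groupName);
}

public static class ExceptionTests
{
    // "I expect this exception with these properties."
    public static void RemovingAMissingUserThrows()
    {
        try
        {
            Group.Remove("admins", "alice");
        }
        catch (UserNotInGroupException e)
        {
            Check(e.UserName == "alice", "UserName");
            Check(e.GroupName == "admins", "GroupName");
            return;
        }
        Check(false, "expected UserNotInGroupException");
    }

    // "The exception should be able to format itself like this."
    public static void ExceptionFormatsItself()
    {
        var e = new UserNotInGroupException("alice", "admins");
        Check(e.Message == "User 'alice' is not in group 'admins'", "Message");
    }

    private static void Check(bool condition, string what)
    {
        if (!condition) throw new Exception("FAILED: " + what);
    }

    public static void Main()
    {
        RemovingAMissingUserThrows();
        ExceptionFormatsItself();
        Console.WriteLine("All tests passed");
    }
}
```

The first test never mentions the message text, so rewording the message breaks only the second test.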

Wednesday, January 14, 2015

Why I write horrible code. (And so can you!)

EDIT: I may have been too subtle.

Some readers think this is a list of excuses for writing bad code. It is not. Instead, I want to analyze the reasons I have written bad code in the past, so that I can look for ways to make future code better. I want to acknowledge my own limitations, so that I can find ways to compensate. I also believe that many programmers have similar challenges, and may be able to learn from this analysis. Furthermore, I hope that by hearing about my imperfections, you can become less afraid of sharing yours, and that can open up opportunities for your growth.

Today I overheard a friend say something like "Who would write code like this? How could they think it was a good idea?"

I've written a lot of bad code, which makes me a kind of reluctant expert on the topic. It's possible that I'm just worse than average, but I've seen some great programmers write bad code too.

Here are the reasons I can see:

  1. Expediency

This is the most common reason that programmers cite. As in "I could spend some additional time to make this code more beautiful, but we need this change right away." 

I agree that the value of our work is time-sensitive, so delivering it sooner is better. And I agree that we are not being paid for the beauty of our code, but only for the value delivered to customers.

However, encoded in that statement are certain beliefs, about the cost to make the code more beautiful, how much better the code could possibly be, the value offered by that better code, and the risks in getting there. I say "belief" because I think they could vary by programmer, project, technology, market, organization, etc. I'll try to cover these beliefs as I go.

  2. Good design is unfamiliar

While all programmers have suffered from poorly-designed code, well-designed code is all too rare. We may know what we hate about this code, but we have a hard time knowing what "great" would look like. My college professors talked about "low coupling and high cohesion", but that conversation was always in the abstract - I didn't know how to make sure my code actually had those attributes.

I've often thought I knew a great design for something, only to discover that I missed many important details. If I ever get my code to the point where I can use it, I have compromised the design so much that it's not the huge win I was hoping for. I believe most programmers have had similar experiences. This feeds back into the belief that attempting to make code beautiful won't give much return.

  3. We don't know what we need yet

When I start on a programming task, I usually have a bunch of questions I can't correctly answer yet:
    • What does my customer really need from my program?

    • Will the feature I have in mind really meet that need?

    • What is the true behavior of the externals I intend to depend on? (Do they have the capabilities I need? How do I call these APIs correctly? Do they scale? Are they reliable? Any bugs that will sting me?)

    • What is a good design for my code, based on answers to the above?

    • What future work will be difficult because of design decisions I make now?
Whatever I write, I will soon discover that I was wrong about my answers to these questions, and my design is no longer well-suited to the new answers. If I worked hard on that design, that hard work is wasted. If I work hard to revise the design, I may discover tomorrow that my new answers are wrong, too, so the revised design is also waste. This means I should take shortcuts to get my work done and in use, so I can get that feedback sooner and more cheaply.

Of course, when I finally get the feature right, customers will not be interested in paying me to go back and rewrite it for no reason.

I used to think this meant that instead of working on good designs, I should learn how to work in poorly designed code, getting great at analyzing it in the debugger and finding minimal fixes. Now I know how to refactor.

  4. We don't know how to refactor

One time you tried to clean up a mess in the code, and you broke something. Your boss yelled at you. Customers were unhappy. You had to work extra hours to fix things up. Now you're wiser, and when someone says "I want to refactor this", you say "only a little, and only if you have great tests, and only if there's plenty of time." Which means it seldom happens. So we don't get any practice refactoring.

But refactoring is key: if you don't know what good design looks like (in general or specific), then the only way to get a good design is to start with a bad one and refactor your way to good.

More generally, remember that it's up to you to invest in your own skills. Refactoring isn't inherently slow or risky, but learning refactoring and other skills takes time and temporarily reduces your performance. You can't count on your employer to cover that, but it still matters.

  5. Too-big steps

Suppose you decide to clean up that code mess, once and for all. Partway through, you get interrupted. Maybe the live site goes down and you have to fix it, and that eats up the rest of your day. And tomorrow you have to work on some important new feature. By the time you get back to the cleanup, much of your work is no longer valid.

The antidote is to work tiny and get done. Do the smallest cleanup you can, check it in, and get back to work. Don't aim for "good", just for "better". Make things a little better each day. See Two Minutes to Better Code.

  6. We don't know what we're missing

So you're a smart programmer. Fueled by caffeine and isolated by headphones, you can get your job done. The code you work in is a mess, but you're still delivering value. Sure, you wish the code was nicer, but how much difference would it really make? Is it really worth the investment?

If you're only accustomed to working in code that is a mess, you're in no position to make this judgement. I know that is hard to accept. Really well-designed code doesn't just make things better; it makes things different. Ways that just aren't visible from the old way of doing things. For example:
  • No need to track bugs in a database, because there are no bugs.
  • No need to keep a list of future work (product backlog), because you can just pivot as needed.
  • Easy to test everything with super-fast unit tests, because everything is appropriately decoupled.
  • Ship at will, because you can verify ready-to-ship in a matter of minutes.
  • Any complexity in the code indicates an opportunity to reduce essential complication, since there is no accidental complication. (See 7 minutes, 26 seconds for definitions)
If you've never seen this, it sounds impossibly far-fetched. A pipe dream. So of course you wouldn't invest the effort required to get there. (You probably believe that most of your system's complexity is essential; you're wrong again. Sorry.)

  7. We incorrectly compare short-, medium-, and long-term impact

Code mess creates a drag on development. As development gets slower, pressure increases. You take a shortcut. The mess gets worse. A vicious cycle. Exponential growth of the mess. (See Nobody Ever Gets Credit for Fixing Problems that Never Happened.)

In the (very) short term, we can deliver value sooner by taking shortcuts.

In the medium term, we will deliver features more slowly. Less value to customers = bad business.

In the long term, the cost of new features is so great that you must throw things away and rewrite, which you should never do. This isn't "pie in the sky" thinking; this is "we want to stay in business for more than 5 years".

  8. We don't ask for help

Even when my programming is going really well, as soon as another person sees my work, they'll notice a problem that I missed. Each person can offer a different kind of insight into the design. I can learn a lot from that.

So turn that dial up, from code reviews, to pair programming, to mobbing.

  9. The code is just too horrible

How fast you learn something is heavily dependent on how fast you can iterate.

If you don't know what great design looks like, and you're not already good at refactoring, and your code is really really horrible, and your build takes forever, and your tests are crap, then every step you take will go extremely slowly.

If this is your situation, you could practice your skills in side projects and code katas, or you could switch jobs. Develop those design and refactoring skills in a better environment, then come back to this legacy code when you're ready for that challenge.