Thursday, January 29, 2015

Exceptions are a primitive type

I hold that Exception is a primitive type, and so using one directly in your code is a common example of Primitive Obsession.

The antidote is to wrap the primitive in a Whole Value. It's a pretty straightforward transformation when your code looks like this:
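For example, something along these lines (the job-retry scenario and names are invented for illustration):

    using System;

    public class JobRunner
    {
        public void Run(string jobId, int maxRetries)
        {
            // ...retry logic elided; on failure, the "primitive" is thrown directly:
            throw new Exception(string.Format(
                "Giving up on job {0} after {1} retries.", jobId, maxRetries));
        }
    }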

- Make a new exception class, typically nested at the current scope.
- Name it based on the message text.
- Parameters to the message become parameters to the constructor and properties on the new class.
- Override the "Message" property to hold the string.Format() call.

Like this:
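Roughly this, continuing the invented example (a sketch, not production code):

    using System;

    public class JobRunner
    {
        // The Whole Value: a nested exception class, named for the message text.
        public class GivingUpOnJobException : Exception
        {
            public string JobId { get; private set; }
            public int RetryCount { get; private set; }

            public GivingUpOnJobException(string jobId, int retryCount)
            {
                JobId = jobId;
                RetryCount = retryCount;
            }

            // The string.Format() call moves into the Message property.
            public override string Message
            {
                get
                {
                    return string.Format(
                        "Giving up on job {0} after {1} retries.", JobId, RetryCount);
                }
            }
        }

        public void Run(string jobId, int maxRetries)
        {
            // ...retry logic elided; the throw site shrinks to one well-named line:
            throw new GivingUpOnJobException(jobId, maxRetries);
        }
    }

The throw site becomes a single well-named line, and callers can catch the specific type instead of parsing message text.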
Like all good design moves, this helps testing.
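For instance, a sketch of the tests, using NUnit and the invented names from above:

    using NUnit.Framework;

    [TestFixture]
    public class JobRunnerTests
    {
        // "I expect this exception with these properties."
        [Test]
        public void GivingUp_ReportsJobIdAndRetryCount()
        {
            var runner = new JobRunner();

            var ex = Assert.Throws<JobRunner.GivingUpOnJobException>(
                () => runner.Run("job-42", maxRetries: 3));

            Assert.AreEqual("job-42", ex.JobId);
            Assert.AreEqual(3, ex.RetryCount);
        }

        // "The exception should be able to format itself like this."
        [Test]
        public void GivingUpOnJobException_FormatsItsMessage()
        {
            var ex = new JobRunner.GivingUpOnJobException("job-42", 3);

            Assert.AreEqual("Giving up on job job-42 after 3 retries.", ex.Message);
        }
    }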

Note that in the testing example, I'm separating "I expect this exception with these properties" from "the exception should be able to format itself like this". There's a nice separation of concerns.


Wednesday, January 14, 2015

Why I write horrible code. (And so can you!)

EDIT: I may have been too subtle.

Some readers think this is a list of excuses for writing bad code. It is not. Instead, I want to analyze the reasons I have written bad code in the past, so that I can look for ways to make future code better. I want to acknowledge my own limitations, so that I can find ways to compensate. I also believe that many programmers have similar challenges, and may be able to learn from this analysis. Furthermore, I hope that by hearing about my imperfections, you can become less afraid of sharing yours, and that can open up opportunities for your growth.



Today I overheard a friend say something like "Who would write code like this? How could they think it was a good idea?"

I've written a lot of bad code, which makes me a kind of reluctant expert on the topic. It's possible that I'm just worse than average, but I've seen some great programmers write bad code too.

Here are the reasons I can see:

  1. Expediency

This is the most common reason that programmers cite. As in "I could spend some additional time to make this code more beautiful, but we need this change right away." 

I agree that the value of our work is time-sensitive, so delivering it sooner is better. And I agree that we are not being paid for the beauty of our code, but only for the value delivered to customers.

However, encoded in that statement are certain beliefs: about the cost of making the code more beautiful, how much better the code could possibly be, the value offered by that better code, and the risks in getting there. I say "beliefs" because I think they could vary by programmer, project, technology, market, organization, etc. I'll try to cover these beliefs as I go.

  2. Good design is unfamiliar

While all programmers have suffered from poorly-designed code, well-designed code is all too rare. We may know what we hate about this code, but we have a hard time knowing what "great" would look like. My college professors talked about "low coupling and high cohesion", but that conversation was always in the abstract - I didn't know how to make sure my code actually had those attributes.

I've often thought I knew a great design for something, only to discover that I missed many important details. If I ever get my code to the point where I can use it, I have compromised the design so much that it's not the huge win I was hoping for. I believe most programmers have had similar experiences. This feeds back into the belief that attempting to make code beautiful won't give much return.

  3. We don't know what we need yet

When I start on a programming task, I usually have a bunch of questions I can't correctly answer yet:
    • What does my customer really need from my program?

    • Will the feature I have in mind really meet that need?

    • What is the true behavior of the externals I intend to depend on? (Do they have the capabilities I need? How do I call these APIs correctly? Do they scale? Are they reliable? Any bugs that will sting me?)

    • What is a good design for my code, based on answers to the above?

    • What future work will be difficult because of design decisions I make now?
Whatever I write, I will soon discover that I was wrong about my answers to these questions, and my design is no longer well-suited to the new answers. If I worked hard on that design, that hard work is wasted. If I work hard to revise the design, I may discover tomorrow that my new answers are wrong, too, so the revised design is also waste. This means I should take shortcuts to get my work done and in use, so I can get that feedback sooner and more cheaply.

Of course, when I finally get the feature right, customers will not be interested in paying me to go back and rewrite it for no reason.

I used to think this meant that instead of working on good designs, I should learn how to work in poorly-designed code, getting great at analyzing it in the debugger and finding minimal fixes. Now I know how to refactor.

  4. We don't know how to refactor

One time you tried to clean up a mess in the code, and you broke something. Your boss yelled at you. Customers were unhappy. You had to work extra hours to fix things up. Now you're wiser, and when someone says "I want to refactor this", you say "only a little, and only if you have great tests, and only if there's plenty of time." Which means it seldom happens. So we don't get any practice refactoring.

But refactoring is key: if you don't know what good design looks like (in general or specific), then the only way to get a good design is to start with a bad one and refactor your way to good.

More generally, remember that it's up to you to invest in your own skills. Refactoring isn't inherently slow or risky, but learning refactoring and other skills takes time and temporarily reduces your performance. You can't count on your employer to cover that, but it still matters.

  5. Too-big steps

Suppose you decide to clean up that code mess, once and for all. Part-way through, you get interrupted. Maybe the live site goes down and you have to fix it, and that eats up the rest of your day. And tomorrow you have to work on some important new feature. By the time you get back to the cleanup, much of your work is no longer valid.

The antidote is to work tiny and get done. Do the smallest cleanup you can, check it in, and get back to work. Don't aim for "good", just for "better". Make things a little better each day. See Two Minutes to Better Code.
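For example, a hypothetical two-minute improvement: give one badly named method and its parameters real names, check in, and get back to work.

    public class Pricing
    {
        // Before: the two-minute cleanup candidate.
        public decimal Calc(decimal a, decimal b)
        {
            return a + a * b;
        }

        // After: same behavior, better names. Check it in; done for today.
        public decimal PriceWithTax(decimal basePrice, decimal taxRate)
        {
            return basePrice + basePrice * taxRate;
        }
    }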

  6. We don't know what we're missing

So you're a smart programmer. Fueled by caffeine and isolated by headphones, you can get your job done. The code you work in is a mess, but you're still delivering value. Sure, you wish the code was nicer, but how much difference would it really make? Is it really worth the investment?

If you're only accustomed to working in code that is a mess, you're in no position to make this judgement. I know that is hard to accept. Really well-designed code doesn't just make things better; it makes things different, in ways that just aren't visible from the old way of doing things. For example:
  • No need to track bugs in a database, because there are no bugs.
  • No need to keep a list of future work (product backlog), because you can just pivot as needed.
  • Easy to test everything with super-fast unit tests, because everything is appropriately decoupled.
  • Ship at will, because you can verify ready-to-ship in a matter of minutes.
  • Any complexity in the code indicates an opportunity to reduce essential complication, since there is no accidental complication. (See 7 minutes, 26 seconds for definitions)
If you've never seen this, it sounds impossibly far-fetched. A pipe dream. So of course you wouldn't invest the effort required to get there. (You probably believe that most of your system's complexity is essential; you're wrong again. Sorry.)

  7. We incorrectly compare short-, medium-, and long-term impact

Code mess creates a drag on development. As development gets slower, pressure increases. You take a shortcut. The mess gets worse. A vicious cycle. Exponential growth of the mess. (See Nobody Ever Gets Credit for Fixing Problems that Never Happened.)

In the (very) short term, we can deliver value sooner by taking shortcuts.

In the medium term, we will deliver features more slowly. Less value to customers = bad business.

In the long term, the cost of new features is so great that you must throw things away and rewrite, which you should never do. This isn't "pie in the sky" thinking; this is "we want to stay in business for more than 5 years".

  8. We don't ask for help

Even when my programming is going really well, as soon as another person sees my work, they'll notice a problem that I missed. Each person can offer a different kind of insight into the design. I can learn a lot from that.

So turn that dial up, from code reviews, to pair programming, to mobbing.

  9. The code is just too horrible

How fast you learn something is heavily dependent on how fast you can iterate.

If you don't know what great design looks like, and you're not already good at refactoring, and your code is really really horrible, and your build takes forever, and your tests are crap, then every step you take will go extremely slowly.

If this is your situation, you could practice your skills in side projects and code katas, or you could switch jobs. Develop those design and refactoring skills in a better environment, then come back to this legacy code when you're ready for that challenge.

Thursday, January 8, 2015

Saff Squeeze on recursive code with NCrunch

NCrunch makes all unit testing better, but there's something cool that happens when combining it with the Saff Squeeze, and something even cooler when the code under test is recursive.

In case you missed Kent Beck's Saff Squeeze:

The Saff Squeeze, as I call it, works by taking a failing test and progressively inlining parts of it until you can't inline further without losing sight of the defect. Here's the cycle:
    1. Inline a non-working method in the test.
    2. Place a (failing) assertion earlier in the test than the existing assertions.
    3. Prune away parts of the test that are no longer relevant.
    4. Repeat.
(I add Step 0: make a copy of the failing test.)

NCrunch helps because you can quickly see how far in the test you're getting until an assert fails. If the code under test is recursive, then:
Repeatedly inline the recursive call until NCrunch's code coverage dots show uncovered code.
Now your test does its job without any recursion, and you can continue to apply the Saff Squeeze as normal.
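Here's a minimal sketch of how that plays out, on an invented recursive function with an invented bug:

    using NUnit.Framework;

    public static class Adder
    {
        // Hypothetical code under test, with an off-by-one bug in the base case.
        public static int SumTo(int n)
        {
            if (n <= 1) return 0;       // bug: should be n <= 0
            return n + SumTo(n - 1);
        }
    }

    [TestFixture]
    public class SaffSqueezeSteps
    {
        [Test]
        public void Original_failing_test()
        {
            Assert.AreEqual(6, Adder.SumTo(3));      // fails: returns 5
        }

        [Test]
        public void After_inlining_the_recursive_call_once()
        {
            int n = 3;
            Assert.AreEqual(3, Adder.SumTo(n - 1));  // new, earlier assertion: fails first
            Assert.AreEqual(6, n + Adder.SumTo(n - 1));
        }

        [Test]
        public void After_squeezing_until_the_recursive_line_is_uncovered()
        {
            // Only the base case runs now; NCrunch shows the recursive line uncovered,
            // and the defect is pinned to the "n <= 1" comparison.
            Assert.AreEqual(1, Adder.SumTo(1));      // fails: returns 0
        }
    }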

Wednesday, January 7, 2015

Why we Test, part 7: The Dead Horse

I've seen a wide range of practices that practitioners claimed were TDD. (Arlo Belshee identifies 7.) Obviously, outcomes vary.

People claimed that TDD was or was not effective in some way based on those results. To make it worse, I see wide variation in the stated purpose of TDD. If we don't see the same purpose, then we aren't measuring effectiveness the same way. For example:

  1. ensure correctness of new code
  2. prevent regressions due to future work
  3. point me directly at my mistake
  4. be fast enough to run often
  5. a safety net during refactoring
  6. the only way to be sure my tests are comprehensive
  7. to stop me from writing code I don't need
  8. make code coupling obvious
  9. make DRY problems visible
  10. support cohesion
  11. create the context for entering a Flow state
  12. regular rewards as I make progress
  13. confidence (possibly false!) that my program will work
  14. explain to another human what my code is intended to do

(I sometimes group these into Bugs, Design Feedback, Psychological Benefits, and Specification.)

If you start by practicing TDD a certain way, and see it succeed at one of the above, you'll be tempted to argue that is the "true purpose" of TDD.

If you start with a belief about the true purpose of TDD, and select a practice that doesn't do that, you'll think TDD doesn't work. (See We Tried Baseball...)

I say all this because I hope people will shift to an "all of the above" mindset, and adjust their understanding and practice to make that happen.

Why we Test, part 6: BDD vs. TDD

I've seen BDD advocates say "BDD is just TDD done right." (e.g. here and here) They seem to be saying "It's important to write your unit tests at the appropriate level of abstraction, using language from the problem domain, phrased for a human reader. Domain experts (e.g. users, business analysts) should be able to read, and perhaps write, the tests."

More recently, I've seen "TDD is just BDD done right." (e.g. here and here) These people seem to be saying "It's important to use your unit tests to drive the design of your code. BDD is missing that important action." I think they're noticing that BDD doesn't include a Red-Green-Refactor cycle.

I think they're both right. Striving to write tests for humans provides the best guidance for refactoring, carrying the Ubiquitous Language deep into the system and improving DRY and Cohesion.

Tuesday, January 6, 2015

Why we Test, part 5: Two reasons for regression testing

In Part 1, I wrote: "I count on tests to catch mistakes before our customers do" and "Having tests means I can refactor safely".

In both cases, I want tests to catch my mistakes, but I now realize I should consider these separately.

Regression

In the first case I'm relying on the tests to confirm that I have written my code correctly, and that future changes don't break previously completed functionality. I have written plenty of bugs, and I'm looking to the tests to tell me about them. I'm definitely going to keep writing new features, fixing old bugs, and shipping software.

When we decide to change old functionality, we'll want to change the old tests, so they must be malleable to keep providing value. They should be readable and granular, so that when they fail I can decide whether to change the product or change the test.

Pinning

In the second case, my decision about whether to refactor is heavily influenced by whether I have those tests. In a legacy (i.e. tightly coupled) system without good tests, most people will just leave things as-is instead of refactoring.

If I'm just looking for a safety net while I refactor, I can use Pinning Tests. These tests don't need to be malleable, since the product behavior is not changing. If they are fast, they don't need to be granular, since I can run them really often. They do need to be very reliable. They need to cover as many cases as I can manage, but only in the sections of code I'm touching. It's OK if the tests are ugly, since I'm just going to delete them at the end of my refactoring session (when my decoupled code is now amenable to unit testing).
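For instance, a sketch of a pinning test (the formatter and its quirk are invented for illustration):

    using NUnit.Framework;

    // Hypothetical legacy code: nobody remembers why the trailing space is there.
    public class ReportFormatter
    {
        public string Format(int count)
        {
            return "Total: " + count + " items ";
        }
    }

    [TestFixture]
    public class ReportFormatterPinningTests
    {
        // Pin whatever the code does today, warts and all. This test only needs
        // to be fast and reliable; it can be ugly, and it gets deleted once the
        // refactoring session is over and real unit tests are possible.
        [Test]
        public void Format_pins_current_behavior()
        {
            Assert.AreEqual("Total: 3 items ", new ReportFormatter().Format(3));
        }
    }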

(In this context, when I say "refactor", I don't mean "a highly disciplined process of changing code without changing behavior, according to a recipe", or "using a high-fidelity automated tool that will safely change code without changing behavior". You could say I mean "tiny rewrites.")

Sunday, January 4, 2015

Simplest possible Git workflow

I'm working with a group that is getting ready to transition to Git, from a traditional centralized version control system.

Something I learned while working to revitalize endangered languages is that the first lesson should get students working in the new system, for real, as soon as possible. Applying that to Git, I want to ask, "What's the minimum to get you started using Git in a real way, without creating a mess that is difficult to clean up later?"

A friend complained to me that every time he asks a Git expert a question about Git, the response starts with, "Well, first you have to understand how Git works". I want to offer a workflow that does not require understanding how Git works.

In Scott Chacon's "Introduction to Git" video, he says if you're comfortable with a version control system that is not Git, you're going to hate Git. How can we get around that?

Can I create a microverse where Git is simple and easy to understand, and yet still comprehensive and self-consistent?

"in a real way" for us means "a team of people with a central 'official' repository, that anyone can push to", so I can't ignore remotes/pulling/pushing for now, but I can ignore pull requests. A single person working alone on a single machine can simplify even further than what I describe here.

Prerequisites

I assume that an expert is available to set things up and teach these basics. I advise the expert to avoid talking about any additional details of Git, no matter how juicy.

Whether you choose rebase or merge (linear or non-linear history in master) is up to your expert. If you want to use rebase later, you should use it now, to avoid "creating a mess that is difficult to clean up later", at the cost of expanding "minimum to get you started". Personally, I like rebase.

In our case, everyone sets up their development environments in the same way, and we're using Windows. We push these settings to every machine:

    git.exe config push.default simple
    git.exe config pull.rebase true
    git.exe config core.autocrlf true
    git.exe config core.safecrlf true
    git.exe config rebase.autosquash true
    git.exe config core.editor '"%ProgramFiles%\Windows NT\Accessories\wordpad.exe"'
    git.exe config merge.conflictstyle diff3


and we assert that git config user.name and git config user.email are set.

The expert should create the central repository, instruct everyone on cloning it, and help maintain the .gitignore.

Simplest Development Workflow

We can treat Git like an old-fashioned centralized system with a single branch. Let everyone work in master. (Branches are awesome, but understanding them is more than the newbie is ready for.)

You only need these Git commands:

> git pull
When you want to update your machine with the latest from the central repository.

> git add FILENAME
When you create a new file

> git status
> git diff
To see what changes you have pending (ignore the difference between staged and unstaged changes, but watch out for unstaged adds)

> git commit -a
> git pull
> git push
When you like your changes and want to share them with the world

> git reset --hard
> git clean -fd
When you don't like your changes

> git log
To see what has been done

The biggest risk I see here is if there's a merge conflict when you pull before pushing. Stand by to help people through that the first time.

Release Workflow

Release from master. If your team needs time to stabilize master before you can release, make everyone stop what they're doing and focus on completing the release. When you are done, add a tag, then let everyone get back to work.

What's next?

As needs arise, you can build on this model. A dev can start making multiple commits before pushing, or work in a feature branch and merge it, without anyone else needing to learn something new. So you can grow incrementally.

You'll probably want to use branches for releases pretty soon.

At some point, you'll need to have a big conversation about the underlying model of Git, and what rebasing means, etc. Put that off as long as you can, and then go deep.

I find gitk helps people visualize what is happening as things get more interesting.