Hill Climbing (Wonkish)

Refactoring is a process of editing software that doesn’t change what the software does, but instead improves it by making it more clear, more concise, or more direct. Refactoring is to code what structural editing is to prose. The debate about the utility of refactoring is just a rehash of the perpetual debate about revision. Everyone agrees that you ought to revise and that you never can revise enough. There are people like Mozart and Trollope who never revise and who do just fine, but most writers agree that’s fine for them and not a great idea for you and me.

Most of the classic refactorings run in two directions. We can Extract Method to break a long procedure into small and reusable components, and we can Inline Method to get rid of a small method that’s not pulling its weight. We can Pull Up Method to let a bunch of subclasses share common behavior, or we can Push Down Method so only the subclass that uses this behavior needs to know about it. Refactoring then becomes a matter of hill climbing, gradually improving the entire work by making a series of minor changes that facilitate larger and larger changes. When refactoring, how do we know which way is up?

Two broad approaches to the topography of code have been proposed. We move away from code smells, and we refactor toward patterns. These make sense, as far as they go. We should repair blunders in code just as we’d fix grammatical accidents or mixed metaphors, and if we see an opportunity to take some scattered code and rewrite it in a clean, ordered, and structured manner, that’s always a good idea. But these are exceptional cases. How do we refactor when there’s no grand pattern at hand and where we don’t see a straightforward way to expunging a code smell?

This is a central question of the craft of software today, and it’s seldom discussed. We talk around the question by contriving ever longer lists of code smells, but that’s not really a solution: code smells should be things we agree are usually or always bad, not things that might often be perfectly fine. We talk around the question by assuming that management won’t let us refactor even the obvious cases and so worrying about subtle situations is a waste. But not everyone has a pointy-haired boss. I’m surprised there’s so little literature on this question. I think at this point I’ve covered the books, but perhaps I’ve missed great stuff in the journal literature. Email me.

My two cents holds that, when the direction of refactoring is not immediately clear, we drive the ponies in the following directions:

  1. Extract fields, methods, and classes when there is any prospect of eventual reuse by additional methods.
  2. Do not underestimate the impact of tiny methods on cleaning up your code. Invoking small methods let you omit needless words.
  3. Extract whenever extracting reduces the length of code — including comments (not counting doc comments required by coding standards, which don’t count).
  4. Prefer polymorphism to switch(), even if the code is longer.
  5. Prefer encapsulation to visibility.
  6. Break dependencies.
  7. Simplify constructors, even if it complicates the code.
  8. Encapsulate or avoid conditional expressions. Prefer guard expressions and avoid else clauses. The gate is straight down.
  9. Prefer a dedicated type to a primitive, even if the code is longer; it’s better to pass Money than an integer even if you know the integer is money.
  10. When in doubt, prefer composition to inheritance .
  11. When in doubt, move computation to the object that owns the data or move the data to the object that does the computation (aka Law of Demeter).
  12. Prefer small objects, loosely coupled, to large objects and to tight coupling.
  13. When in doubt, prefer recursion to loops.
  14. Isolate the network, the database, files, and the user interface with facades, fakes, and humble objects.
  15. When in doubt, add code to the model if you can, to the controller if you must. Regard any work in the view as a convenience function or syntactic sugar, but don’t let that stop you.
  16. Prefer blocks (apply, each, mapcar) to loops.
  17. Prefer new technology.

If you’re new ’round here, I’m deep in the weeds on Tinderbox Six. Tinderbox is a tool for notes, a spreadsheet for ideas that helps people visualize, organize, and share complex interlinked ideas. It’s a knowledge representation system for everyone’s everyday knowledge work. If you’re interested in the design, you might enjoy my book on The Tinderbox Way. You can download a free demo of Tinderbox 5. There’s a terrific user community.