Not all refactorings are created equal. Some get used a lot more than others. This is especially true with highly-indebted code. There’s a reason for this. Sets of refactorings are commonly used together to solve classes of problems, and some problems are more common than others. If you are learning to refactor, learn your tool in sets.
The most important set to learn is the Core 6 Refactorings, plus 3 critical utility functions.
These 9 operations will allow you to do the most important part of working with indebted code: read by refactoring. Fluency in these 9 is all you need.
The Core 6
The Core 6 refactorings are:
- Rename
- Inline
- Extract Method
- Introduce Local Variable
- Introduce Parameter
- Introduce Field
These are the Core 6 because the most important thing we need to do when reading indebted code is to name it. We execute our core understanding loop: look at something, have an insight, write it down, check it in. The "write it down" step is always a transformation on names.
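As a rough sketch of what that loop can leave behind, here is a tiny before-and-after. Everything in it (the `Checkout` class, the cent amounts, every name) is invented for illustration and written in Python only for brevity. In practice the IDE's automated refactorings would make each change, not hand edits.

```python
# Before: the behavior is right, but nothing tells you what it means.
class C:
    def calc(self, o):
        return sum(i[0] * i[1] for i in o) + (0 if sum(i[0] * i[1] for i in o) >= 5000 else 499)


# After: same behavior. Each insight has been written down as a name.
class Checkout:                                      # Rename (C -> Checkout)
    def __init__(self):
        self.free_shipping_threshold_cents = 5000    # Introduce Field

    def order_total_cents(self, line_items, shipping_fee_cents=499):  # Rename + Introduce Parameter
        items_cost = self.subtotal_cents(line_items)                  # Introduce Local Variable
        qualifies_for_free_shipping = items_cost >= self.free_shipping_threshold_cents
        shipping = 0 if qualifies_for_free_shipping else shipping_fee_cents
        return items_cost + shipping

    def subtotal_cents(self, line_items):            # Extract Method
        # Each line item is a (unit_price_cents, quantity) pair.
        return sum(unit_price * quantity for unit_price, quantity in line_items)
```

Both versions return the same totals for the same input. The only thing that changed is how much of the story the names tell.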
CRUD for Names
The Core 6 are simply CRUD for the domain of names.
- Create: Introduce Local Variable, Extract Method, Introduce Parameter, Introduce Field.
- Read: (performed by the human; no refactoring needed)
- Update: Rename
- Delete: Inline
In a typical refactoring IDE, Rename and Inline operate on anything which is namable, but there is a distinct Create operation per kind of thing you want to create.
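The Update and Delete halves are easy to overlook, so here is a minimal sketch of them. The function and variable names are made up for the example. Rename replaces a name that says nothing, and Inline deletes a name that adds nothing.

```python
# Before: short names, plus a temporary that carries no meaning.
def area_before(w, h):
    tmp = w * h
    return tmp


# After:
#   Update (Rename): w -> width, h -> height
#   Delete (Inline): tmp is gone; the expression already says what it is
def area_after(width, height):
    return width * height
```

Inline is as much a naming decision as the Create operations: it is how you record that a name was not earning its keep.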
In typical OO languages, there are 4 things we can name: local variables, methods, parameters, fields. There are also classes, but those are different. I think of variables, methods, parameters, and fields as scalars. They are atomic, simple things. Classes are compositions; they compose scalars together.
Thus, the core 4 namable things define places where you can describe a single thing. Classes define places where you can describe the relationship between several things.
When reading indebted code, the first challenge is to understand each atomic thing. Thus we really need CRUD for names of scalars: the Core 6 Refactorings.
+ 3 Utilities
There are 3 other things that your computer should do for you. You should never have to do these things.
- Format your code
- Type basic language stuff / get things precisely right
- Look up what refactorings you can use and their keystrokes
The IDE has commands for these too:
- Code Cleanup / Format Document (used for structured, non-code files, such as XML or JSON)
- AutoFix (later add Generate Code and Live Templates)
- Refactor This
Stop wasting time formatting (or arguing)
The first allows you to define your coding standard as a set of options, then check that in with your source. Run one command on a file or a whole solution and all of your code will meet your standard.
If your group decides to change the standard, update the coding standard options file and re-run the format over the entire solution, then check that in as one commit. Completely safe, and you can change an entire 10k file codebase to whatever coding standard you want in about 10 minutes.
Do this and your coding standard will stop being a religious war.
At first, there will be a lot of "reformat the world" commits, as people make some change to the style and apply it. But each is safe and cheap. People who care about some aspect of the style will simply change that style. Everyone else can stay out of it.
And pretty quickly the conversation will terminate successfully. The team will have found a style that works for everyone. 100% of the code will be in that style. And the IDE will keep it that way. Religious war over. This usually happens inside of 3 days, at a total team cost of about 30 minutes per person.
From then on, whenever you edit a file, just run the Code Cleanup command on that file whenever you want / when you are done. You no longer need to format source code yourself. Ever.
Let me repeat that: if you find yourself typing indentation, newlines, space around parens, or anything else like that, you are wasting your time. Do something smarter. Let the computer handle the BS formatting tasks for you.
Stop wasting time getting things right
Computers are good at guessing what you might have meant. Better yet, your IDE will give several good guesses and let you choose which one you meant. This means you don’t have to get everything right. Just close enough.
Auto-Fix is your friend. Here are some things you can do with it.
- Never write a variable declaration. Just write the expression (the thing on the right side of the initial assignment), then auto-fix to have the IDE write the rest of the line.
- Never write a method declaration. Just call it where you mean to (perhaps from a test) and then auto-fix to create it.
- Never remember what namespace / import something is in. Just use it, then auto-fix to get the imports.
- Never worry about what namespace your class should be in. Just move the file to the right folder, then auto-fix to correct the namespace declaration.
- Never write a new class. Just instantiate it with a call to new, then auto-fix to create the variable, then auto-fix to create the class, then auto-fix to move it to its own file, then drag that file to the right project, then auto-fix to update the namespaces.
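To make the "use it first, then auto-fix" flow concrete, here is a small sketch of the end state. The names (`Customer`, `discounted_price`, the 10% discount rule) are hypothetical, and the exact stubs an IDE generates will vary by tool and language. Step 1 is the only part typed by hand as a declaration-free guess; step 2 is roughly what a create-from-usage fix might produce once the bodies are filled in.

```python
# Step 1: written first, as a *user* of names that do not exist yet.
def test_loyal_customers_get_a_discount():
    customer = Customer(years_active=6)
    assert discounted_price(customer, list_price_cents=1000) == 900


# Step 2: declarations created from the usage above, then filled in.
class Customer:
    def __init__(self, years_active):
        self.years_active = years_active


def discounted_price(customer, list_price_cents):
    # Assumed rule for this example: 10% off after 5 years.
    if customer.years_active >= 5:
        return list_price_cents * 90 // 100
    return list_price_cents


test_loyal_customers_get_a_discount()
```

The signatures fall out of how the code wanted to be called, not out of what the author guessed up front.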
A computer’s job is to get things precisely correct. Your job is to figure out what to do and convey your intent expressively. You are not a computer. Don’t do its job for it.
Learning refactorings
It takes humans time to memorize things. The computer already knows them. So use its memory to train yours.
If you want to make some change to your code, but don’t know how to do that without editing the code, then ask the IDE. Put your cursor in the thing you want to change, and choose Refactor This.
Refactor This shows you all the ways your IDE could transform the code around the insertion point. It also shows you the shortcut keys to do each transformation.
Use the menu to find the thing you want, then close it and use the keystroke to activate that thing. Train your muscle memory to use that keystroke. Pretty quickly you will have memorized all the keystrokes for the things you do often and have found many other things that you can do when you need them.
What if I don’t have an IDE?
If you use an editor instead of an IDE, then you are actively making the choice to waste time. Your editor may manipulate text better than an IDE. It may start up faster. But it can’t manipulate code as quickly as an IDE.
And you can fix the startup time by simply leaving your IDE open, or opening it before you grab your coffee / go to standup.
More important than the wasted time, using an editor forces you to do things that human brains are bad at: precision and details. When you edit code directly, you have to get every detail precisely correct. You have to remember and focus on all those details. That leaves you very little brain for focusing on intent.
Automated refactorings allow you to transform code in known-safe ways, without depending on your precision to get everything right or your detail-brain to have written unit tests for every important case. With tools, tests are unnecessary for refactoring.
Auto-fix and Cleanup Code allow you to ignore language syntax and human readability details. Express your intent to the machine, then let it handle the details.
And if someone tries to tell you that you aren’t a good programmer because you can only write your language with your language’s tools, remind them that you are a great programmer. So good that you don’t try to be a computer. You use your computer to be a computer. You know, like a programmer does.
Summary
Great programmers use their tools. Some aspects of those tools are more important than others. Attain fluency with those and you will accomplish awesome things with little effort. Start with the Core 6 Refactorings + 3 handy utility operations.
Do you want to learn to pay technical debt in a way that reduces your development cost in the short term? Learn to Read by Refactoring in the Legacy Code Workshop that I and my co-workers host for companies around the world.
For ease of remembering, the core refactorings are: Rename; Extract Method and its inverse; introduce the different kinds of variables.
Though I actually find it easier to remember as Create the 4 things (methods & 3 kinds of variables), its inverse (you can Inline any of those, not just methods), and Rename.
Thus CRUD.
I don't think I use the same kind of IDE/language as you, but I agree with nearly all of this (I don't get the part with "There are also classes, but those are different.").
Classes show up more when we get into context neutrality. The Core 6 are about getting great names – having each element clearly convey intent and tell a story.
After that we can use tests as a spec (writing down runtime stories). And in trying to do that we encounter the need for context neutrality, so we can verify code in one context (detached from anything, running in parallel) and have that verification demonstrate correctness in a different context. Classes show up all over the place as a way to define a context boundary – the amount of context that is interdependent with itself and neutral/independent with everything else.
But those are the topics of 2 more blog posts.
Have you measured the productivity gain? How much faster, would you say, being fluent at these six makes you? 2x, 5x, 10x, more?
How about more correct/making fewer mistakes?
Measured, yes; measured in a scientifically valid sense, no. Typically I see a reduction in the time to read and understand code of around 10% on simple code and upwards of 95% on complex code, as measured by the amount of time a person takes trying to figure out how a new story fits into existing code and writing the first partial implementation.
This means that big stories and high-debt regions see a larger payoff than short, simple stories in new or clean code.
Also, these payoffs take about 48 hours to start appearing (time to build fluency), and then ramp up over time (over the course of about a fortnight, until they are at the above-quoted levels). This time assumes a technical coach or trainer who knows the technique and can share work with you (pair or mob) to implement. It takes much longer to learn by reading blog posts and trying on your own (though that also works – it's how I discovered the approach).
Finally, recall that reading code typically accounts for about 60-70% of a developer's time (debugging is the next largest chunk, then specifying / writing tests, then actual code writing / editing). I got this number from an Eclipse study (they exposed developer feature usage data and their interpretation of what a developer was doing during those time periods). So a reduction in read time is the largest payoff you can get by improving developer experience.
An advantage of "don't write the declaration" is that we encounter the name first as a USER of the name, and the name will serve the USE of it. If we write the declaration first, then we are thinking as the author of the function about the functionality we're exposing.
This shift in thinking is crucial, and tends to produce better results (where better means "serving the many users of a thing over time, rather than the author at the time of writing").
I want to amplify what you say about standards and reformatting with two earlier references:
"the team adopts a single style so that they can freely work on any part of the system" (not just coding style, but key maps, tools, etc) http://agileinaflash.blogspot.com/2009/02/collect…
"Standardize to avoid waste" http://agileinaflash.blogspot.com/2009/02/coding-…
Of course, one final note is that a "good" stylesheet should conserve vertical space, since we all are programming-through-keyholes in modern IDEs, because of windows and tool bars open left, right, and below the space we code in. 😉 But even a bad style guide is better than not having any or having one-per-developer.
Hi, Arlo!
I tried my own version of just this post a while ago: http://edmundkirwan.com/general/best-refactorings…
Interesting to see how we differ.
Interesting. Very different context / direction. I'm not looking for best.
I'm looking to identify the main problem in a codebase that leads to bugs, and then give a simple process to fix that – with tools to make that process easy. And I find the main problem to be unintelligibility.
More correctly, I find the problem to be incidental complexity. And that has 2 causes: "it is hard to read the code before me, so I make a mistake" (unintelligibility) and "the code before me fools me, as local reasoning is invalidated by some non-local aspect" (context non-neutrality). Your dependencies are related to context neutrality violations (I believe dependencies are a large, proper subset of context neutrality violations).
Most codebases I see suffer both of these problems. But most errors that I see come from local complexity – unintelligibility. Thus the core 6 refactorings target that problem.
I have a different set of refactorings, the Key 11, for attaining context neutrality. They also differ from your selection, because mine are again based in an implementation process: a rigorous and short (each cycle terminates in approximately 10 sec) process for incrementally fixing problems in the small, such that problems in the large melt away and become problems in the small.
I'll write about that second part someday. And likely link to your article when I do. Thanks for the article.
I agree that refactorings are a 'higher level language' about code transformation. It takes even more power when mobbing, where 'driver as smart keyboard' makes it possible for the rest of the mob to discuss using this higher level language.
What do you think of the 'Encapsulate variable' refactoring? I found this particularly powerful in legacy code bases. It captures all accesses to a global; I use it as a first step for further refactoring.
Thanks again for your post