Someone at work today asked a colleague for an Agility index / KPI. The colleague replied with a link.
I followed the link. I was underwhelmed.
- It measured practices, not outcomes.
- It ignored engineering (the most important practices of all).
- It ignored learning (the second most important category).
- It heavily stressed planning (only slightly less important than head cheese to a well-run software organization).
- It heavily stressed management structure (mostly important in that it can really screw up an otherwise good team).
I figured that I could come up with a metric that’s at least as wrong as that one (I’ve been reading a book on wrongology recently). Here goes:
Pick a random day. Measure the following:
- How long since your last public release (in weeks)?
- How long since your last useful retrospective (in weeks)?
- How long since you learned something new at work (in days)?
- How long since you talked directly with a customer (email counts, but it has to be with an actual customer) (in days)?
- How long does it take you to complete a typical task (from when you finish picking your next task until when you know you will never have to touch it again) (in 30-minute chunks)?
- How long does it take to commit & know that there were no problems with the integration (in 5-minute chunks, from when you stop coding & start the pre-commit process)?
- How long does it take to refactor a common problem (say, split a dual-responsibility class in two or rename a DB column) (in 5-minute chunks)?
- How far away is your nearest teammate from you right now (in 5-foot increments, doubled if that person is not immediately visible to you – e.g., through a wall or behind you)?
- Do you love your job (1=love it; 5=it’s fine; 10=hate it)?
Measure durations in wall-clock time. It doesn’t matter whether you spend the time waiting (for human or computer), doing something else, or working. Round all fractions up (yes, that includes you continuous deployment people who answered 0.01 to the first question).
On several of these, I feel I’m being rather generous with the unit intervals, but it’s about right. And better to have too-wide intervals: it means that the metric is useless for distinguishing among the top 10% of teams, but allows lesser teams to see progress.
Average them. Subtract the value from 6. Yes, you might get a negative score. That’d be an accurate depiction, and probably worth fixing!
And no, my current team wouldn’t score a perfect 5 on this metric. But I’ve been on two that would.
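For anyone who wants to run the arithmetic, here is a minimal Python sketch of the scoring rule: round every answer up, average, subtract from 6. The function name `agility_score` and the sample answers are made up for illustration. Treating 1 as the lowest possible answer is an assumption, inferred from a perfect score being 5.

```python
import math


def agility_score(answers):
    """Round each answer up, average them, and subtract the average from 6."""
    if not answers:
        raise ValueError("need at least one answer")
    # Round all fractions up. The floor of 1 per answer is an assumption,
    # inferred from the best possible score being 5.
    rounded = [max(1, math.ceil(a)) for a in answers]
    return 6 - sum(rounded) / len(rounded)


# Hypothetical answers, in the order the questions are listed above.
example = [2, 3, 1, 4, 6, 2, 3, 1, 3]
print(agility_score(example))
```

A team that answers 1 to everything gets 5, and a team that averages more than 6 goes negative, as promised.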
I would drop #8 (the distance to your nearest teammate). It is a measure of an input, not an output. It can also swamp the metric if taken seriously and you have a distributed team.
It would be better to make it a qualitative scale if you are measuring it. For example:
1=right next to me, I'm pairing.
2=I can just look over or turn around and start talking to them.
3=I have to get up and walk over.
4=I have to go to another floor.
5=I have to walk to another building.
6=I have to drive across town.
7=I have to fly somewhere in my time zone.
8=I have to fly somewhere in a different time zone overlapping with my work hours.
9=I have to fly to somewhere that is in a time zone that does not overlap with my working hours.
10=I would have to fly to two or more locations, and none of the locations have overlapping working hours.
There is a good application of a radar chart here. Rather than average the metrics, define each metric such that when you approach the optimum you are high on the scale (as you did by subtracting from 6). A perfect score makes a perfect circle.
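A rough sketch of the radar idea in Python, using only the standard library: each raw answer is flipped by subtracting it from 6, then placed on its own evenly spaced axis. The name `radar_points` is hypothetical, and clamping the radius at 0 for answers above 6 is my assumption. The resulting points can be handed to any plotting tool.

```python
import math


def radar_points(raw_answers, best=6):
    """Return one (x, y) vertex per metric for a radar chart."""
    n = len(raw_answers)
    points = []
    for i, raw in enumerate(raw_answers):
        # Flip the scale so that being near the optimum means far from center.
        # Clamping at 0 is an assumption; very bad answers collapse to the middle.
        radius = max(0, best - raw)
        angle = 2 * math.pi * i / n  # evenly spaced axes
        points.append((radius * math.cos(angle), radius * math.sin(angle)))
    return points


# Hypothetical team: perfect on every one of the nine metrics.
perfect = radar_points([1] * 9)
print([round(math.hypot(x, y), 6) for x, y in perfect])
```

With all answers at 1, every vertex sits at the same distance from the center, which is the "perfect circle" (strictly, a regular polygon) described above.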