Scaling Agile – the Easy Way

I hear a lot about Scaling Agile these days. Every time I hear it I have to shake my head. The fact that people are asking how to scale Agile means that they don’t know how to do Agile. The fact that people are designing frameworks for scaling Agile means that they don’t know how to do Agile either. Agile done right (beyond the 1-star level) consists of two steps:

  1. Change the rules of the game (by changing the details of how software is created moment-by-moment).
  2. Adapt everything else to take advantage of the new reality.

This is why when you ask “how do I scale Agile?” to someone who can ship at will, they look at you with a blank stare. If you’re thinking about “scaling Agile,” then you probably haven’t really benefitted from your agile implementation. If you had, then the question wouldn’t make sense to you either. You would just scale your business in a few obvious ways, look around, and repeat.

There isn’t a special method to scale Agile. The team just does whatever it was doing before, only it stops doing whichever parts it doesn’t need anymore.

The alternative to scaling

I am cheating a bit. When I say Agile, I really mean 2-star Agile. Therefore my argument only applies to Agile software companies who are intending to keep their software for more than 3 weeks. If we are discussing an Agile team that does not build software, or one that intentionally throws away 100% of its code every 3 weeks (like, say, a game developer for mobile phones often does), then my argument doesn’t apply. 1-star may work best for those teams and they may have to scale it.

For everyone else, there is a better way: become competent at shipping software.

Not that this will be easy. Competence is a pretty high bar these days. There are a ton of companies and open source projects out there that write about one bug every 100 developer-days. Most of these bugs are found within an hour or two of being written. These teams can go as much as 1000 developer-days between bugs that actually make it even as far as testing.

That is basic competence. It used to be excellence, but now lots of people are doing it. It’s so common that there’s a formula for it and everything. We spend some effort, we run the formula, and we will get that bug rate. It takes about 2 months to see initial results and 2 years before we get to 1 bug per 100 dev-days, but we get there. And we see progress and return on investment the whole time.

The heart of Agile

In my mind, Agile is simple (but not easy). Work tiny. Prove it. Get done. Learn constantly. Work together. Tech first. This is all about focusing on the details, not the big ideas. This isn’t values and principles. This is simply doing every detail well.

It is a practical approach: we will drive the risk and cost out of every transaction (code commit cycle, design change, response to requirement change, …). As we drive out the costs, we make each cycle tinier (smaller in scope and in duration). This identifies more transaction risks and more transaction costs. We kill those and iterate.

After a while, we are operating with almost 0 risk and almost 0 transactional cost for each thing we do.

At that point, we just do the obvious thing. We have mastered shipping software. We have driven the technical risk out of our efforts. We have driven the technical overhead costs out of our efforts. We are left with the direct costs for the tech and the risk that we might be building the wrong thing. And that’s it.

In addition to not worrying about bugs, these teams typically have the following characteristics (as in, these traits are pretty common among teams that do the Adaptive Engineering practices and have talked to me. See also Embedded Agile Project by the Numbers with Newbies):

  • Code changes cost the same whether they are in line with the current design or require a design change. The additional cost of changing the design is 0.
  • There are no technical dependencies; each item can be done in any order, without there being a lump cost for whichever one goes first.
  • A change in requirements or direction incurs no overhead cost. It may invalidate completed features (no longer needed for the product), but it does not invalidate partial work or change estimates for non-started work.
  • There is little technical risk or variance in stories. If two stories are estimated to be the same size, they take the same amount of time to finish, regardless of which parts of the product they touch (like within 5%).

Building on this foundation

Once the details of writing software are right, the rest of Agile comes as an obvious case of following your nose.

Take planning as an example. It really doesn’t matter, at all, how a team does planning. (I know I’ll be stoned for this one.)

Assume you can ship software whenever you want. Assume that when you estimate how long something will take to build, there is no technical variance in that estimate. Assume there are no technical dependencies between things, so you can freely change the order of your features and it will not affect the cost of any feature. Assume there are no “platform” costs that will be assigned to the first story to do X.

These assumptions hold (nearly) true for teams that master the art of delivering software. These costs and interactions are small enough that the optimal strategy is generally to ignore them. This has been achieved many times, by many teams, in many industries, and in many sizes of companies.
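To make the "no platform costs" assumption concrete, here is a minimal sketch of a toy cost model. The story names, costs, and the `story_costs` and `order_matters` helpers are all hypothetical and exist only for illustration. When the first story to use a platform has to pay for it, each story's cost depends on where it falls in the plan. When that cost is zero, every ordering gives the same per-story costs.

```python
from itertools import permutations
from typing import Dict, Sequence, Tuple

# Hypothetical story shape: (name, direct cost, platform it needs).
Story = Tuple[str, int, str]


def story_costs(order: Sequence[Story], platform_cost: Dict[str, int]) -> Dict[str, int]:
    """Cost charged to each story when the first user of a platform pays for it."""
    built = set()
    costs = {}
    for name, direct, platform in order:
        # Only the first story to need a platform is charged the lump cost.
        extra = 0 if platform in built else platform_cost.get(platform, 0)
        built.add(platform)
        costs[name] = direct + extra
    return costs


def order_matters(stories: Sequence[Story], platform_cost: Dict[str, int]) -> bool:
    """True if any story's cost depends on where it falls in the plan."""
    seen = {
        tuple(sorted(story_costs(p, platform_cost).items()))
        for p in permutations(stories)
    }
    return len(seen) > 1


# Made-up stories and costs, for illustration only.
stories = [("login", 3, "auth"), ("sso", 3, "auth"), ("report", 5, "export")]

print(order_matters(stories, {"auth": 8, "export": 2}))  # True: whoever goes first pays for "auth"
print(order_matters(stories, {}))                        # False: reorder freely
```

With the lump cost gone, the planner can reorder features without re-estimating anything, which is the situation the rest of this section assumes.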

Given those assumptions, how do you plan? You plan however the hell you want to plan.

If you have a lot of partner teams and need to coordinate your efforts, then Waterfall is the best planning methodology out there for you. Use it.

If you need to make projections for marketing (or others) but want to reserve some of your capacity for responding to changing market needs, then iterative planning (like Scrum) is the best planning methodology out there for you. Use it.

If you want to run tons of experiments and adapt immediately to what you learn or if you can’t predict what you will do from day to day (professional services or IT are often in this situation), then a continuous-form planning methodology (like Kanban / Naked Planning) is best for you. Use it.

If you are delivering on fixed-bid contracts and want to make sure that you can get feedback from the customers at the right times and always both get paid and make them happy, then Spiral is best for you. Use it.

All of these are Agile methodologies (including Waterfall). The Agile part was in making software be a low-risk endeavor. The particular way a team chooses to do planning is just a local specialization to match the context.

The point is that the planning approach is irrelevant. It is the tail, not the dog. First we change the rules of the game—we remove all the arbitrary technical constraints on planning. Then we leverage our new circumstances.

And that brings us to scaling

So how do I recommend scaling Agile?

Well, assuming you master shipping software, then you scale the business. You won’t find answers to scaling Agile in some book. You certainly won’t find them in any framework (there are several out there. All assume a similar level of incompetence at delivering software. All are irrelevant if you lack that incompetence). About the best guide that I’ve found is “seek level 3 on the Agile Fluency path.” This works because it isn’t a set of prescriptive guidance. It is just an indicator of which problems are most worth solving.

There won’t be technical dependencies between your teams. Each team will be able to deliver value to the customer. Teams won’t cause problems when they integrate code with each other. None of the products will have bugs (well, not for more than about an hour out of every 2 weeks). Teams can work together in order to provide solutions that enhance each other. Teams can also deliver business value without coordinating with any other team. You only really have one open problem:

What does my business need?

  • How can I get that without sacrificing my ability to ship?
  • How can I maintain transparency when no one can know the entire business?
  • How can I decentralize decisions to avoid blocking on “the geniuses who make the decisions?”
  • How can I keep the right degree of alignment (not so much that I lose my ability to experiment; not so little that I fail to deliver anything good)?
  • What do my customers want out of my business anyway?
  • What does my business do that pisses my customers off?
  • How can I satisfy shareholders without allowing them to destroy my business (a business which, in general, they know basically nothing about)?
  • How can I gain information about my customers in the process of doing business with them?

These are not Agile software development questions. They are not process questions. They are business culture questions.

The answers

In the end, there are no constraints. You can choose to scale by simply delivering separate pieces of software, each on its own cadence (something like Amazon’s marketplace of services without their remixing). You can choose to do better than that when it is to your advantage.

Often the best approach is to simply take whatever you were doing to scale your previous software efforts and ask what you can delete, now that you no longer have to worry about bugs, integrations, technical dependencies, or other hogwash. Worst case, you do everything that you used to do and no more. This will deliver better results than your previous process, because you will still be set up to deal with all the technical complexity but that complexity will simply never arise. Best case, the team can drop the parts of the process that only existed to work around problems caused by its own software development.

In other words, you can follow your nose. Do the easy, obvious thing if you have no reason to do more. If you do, then do more.

Picking a fight with thought leaders

There are sets of 10 partner teams that are each fluent at the 2-star level. I challenge anyone to find me such a set that needs a complex framework to coordinate their efforts. A lot of effort to figure out what their customers really want? Sure. But not to coordinate their efforts.

In the end, anyone building a process or framework for scaling Agile doesn’t know Agile. Period. Full stop. Go learn to ship software at will without any risk of defect, and you will discover you no longer need your framework.

Oh, and if you want to know the formula to ship software without risk, just ask. Better yet, read a good XP book (I recommend James Shore & Shane Warden: The Art of Agile Development) and go to a Code Retreat. You’ll be working in that style by the end of the day. Bring it back to your daily work and you’ll see obvious results by the end of the quarter.

15 thoughts on “Scaling Agile – the Easy Way”

  1. Hmm, challenging. But I'm not convinced we can just wave away everything but technical fluency. Suppose a large organization could exist with all its teams working as you describe here. The teams are now "Agile", but the organization is not.

    It would be reasonable — expected, I'd suggest — that such an organization would still have very substantial problems and would consider those problems to be /precisely/ part of "scaling agile".

    Further, this level of technical fluency might be a sine qua non for some kind of maxed-out "Agility", but it seems almost certain that other, perhaps even much simpler changes elsewhere could improve all the measures the organization hopes to improve by becoming more "Agile". Attention to organizational coupling and cohesion can do a lot, even with only moderately fluent tech.

    This article is — I mean this in the most loving way — incomplete if we take technical fluency as being "all there is", because it waves away the incredibly complex cultural activity and learning that comes under "What does my business need". And it is quite possibly polishing the wrong things if we treat technical fluency as the basis, because improvements we actually want are often accessible without getting all the way to max fluency.

    Or so it seems to me …

    1. You are right – we can’t wave away everything but technical fluency. And no one will get to be a 3-star organization without improvements in other areas.

      That said, existing efforts to scale Agile, both the implementations and the frameworks, focus on the wrong things. They spend a ton of time, effort, and column-inches on ever-more-intricate ways to discuss things, speed decisions, and so on. They assume software development as it was yesterday and try to make the organization more effective.

      The real advances: feature flighting, lean startup, design thinking, devops, etc, are all being done by teams that first get the tech right. Once you do that, a tremendous number of problems go away. An even bigger number become obvious. And when you then go after the ones that aren’t obvious, you have a powerful engine to draw on. Shipping bug-free software when desired is something you can simply take for granted.

      Also, and here’s the kicker, I have found it takes about the same amount of time to build this technical fluency as it does to make the “simple” organizational changes that people usually go after – while trying to work around the lack of technical proficiency. It takes scrum teams 11 iterations to learn how to plan their work. It takes companies months or years to re-org into more loosely-coupled groups. And they can’t find the optimal solution, because they have lots of narrow specialists so it is very difficult to sufficiently staff a team to deliver end-to-end value while still keeping it manageably small.

      It takes about 2 months to take a team from 80s-era software practices to developer testing + refactoring to units + starting to unit test & test first + pairing at least a couple hours per day. It takes 3-6 months to make dramatic shifts in defect injection rate, cost per feature, cost per commit cycle, number of deep specialties per person, and number of people required to deliver end-to-end value. Sure, actually paying off the legacy code debt takes place over the next 1.5-2 years. But the value is realized much earlier and the company can start leveraging it.

      My argument is simply that most teams are perpetually only 3-6 months from being able to ship low-bug software when desired and delivering end-to-end business value out of each single team (making decoupling trivial). But they choose to go a different, more costly direction.

  2. Hi Arlo, thanks for posting. Would you care to list a couple of the open source projects that you're referring to, so that we can study them in more detail to see what they are doing.

    1. Actually, most of them do at least the basics (developer-written automated tests, refactoring to create units, unit testing, some test first, continuous integration). Those are pretty much required if you are going to ship anything open source. Otherwise bad pushes from loosely-connected collaborators will kill your project.

      Practices beyond the basics (pair programming, retrospectives, shared space / sit together / everyone's on IRC, etc) are also pretty common. Pypy is a fine example, but there are tons of others. Samba has some famous pairing. Mylen has extensive use of tools. Most anything that gets large enough to likely always have 2 people in IRC hits this level.

      Basically, go to github, sourceforge, or any of the others. Pick any of the top 100 projects. Look in their contribution guidelines. They will tell you how to run tests and how to submit patches to their CI system. They will tell you how code reviews work. They will tell you where their IRC channel is (the shared space you are supposed to hang out in while coding). There are a couple of exceptions, but they are pretty rare.

      1. Hey Arlo, I admire your work. Having worked at Socialtext, I have experienced the environment you describe with the life-on-irc, constant collaboration, git pushes are emailed to the entire team for continuous review, etc.

        I liked the whole article; brilliant stuff.

        Looking at SourceForge's top 100 projects, though, I don't see support for the claim that there are projects that go 1,000 developer-days between bugs that get to test. Given the amount of differences in javascript interpreters, I typically see an occasional browser compatibility bug in even the most high-performing team — after all, the code /did/ work the way the programmer expected it to in his/her environment. I have seen these sorts of numbers with batch apps and CRUD web apps where only one browser is supported.

        With a few more examples, we may have something worth publishing on #NoDefects. I am a writer for cio.com. Can we have some examples?

        1. Sorry for the confusion. The OSS world is the easy place to find some of the core practices. I recommend them to people who want to know whether TDD or refactoring are adopted much at all. I also point new learners to them.

          But they are still distributed and have more issues as a result. And the use of pairing is rare, compounding the problem.

          The numbers I was quoting for bug introduction rates are from companies.

          The 1/100 number, which is actually observed as "about one bug per team-fortnight," comes from personal anecdotes. When I talk with teams that are really doing XP (at a minimum: pair program 100% of the code, refactor to units with tools, test first, dev tests, test units instead of integrations, sit together, and action-oriented retrospectives), then I commonly hear these numbers. Any time anyone identifies themselves as doing XP, I have taken to asking about their defect injection rate. I point them to Nancy V.'s paper (Embedded Agile by the Numbers With Newbies), and they say something along the lines of "yeah, that's about right. One per 2-3 weeks gets written, often caught by the dev team itself an hour or two later." I have gotten around 20 such anecdotes in the last year.

          The 1000 dev-day number was from Hunter Technologies. They have a ~6 person dev team who went over 2 years before introducing a defect. They were an XP team prior to this period, delivering at the "about a bug per team-fortnight" rate that others see. Then they upgraded their pairing 100% of the code to mobbing 100% of the code. This bumped them ahead of even the other XP teams.

          So the data is soft. It comes from anecdotes. But it is an accurate representation of the teams that state it. I'd love to see someone gather more firm data. The challenge lies in filtering: terms like TDD mean fundamentally different practices to different people, and teams change their own definitions over time (as they learn). This can make it hard to select the teams that have at least achieved fluency at a moderate level of practice.

          1. I've had similar experiences Arlo, and worked at teams that performed at that level. The challenge I see in these conversations is we lose /context/ – what kind of apps are they building on what platforms? What is your definition of 'bug'? What does 'found in testing' mean exactly, etc.

            The simplistic version that assumes we mean the same thing doesn't take into account platform compatibility, complex guis, or /change/ of platform – where a high-performing team might produce buggy code for a few iterations while they learn a new technology.

            So I hear you, I just wish the conversation had more context. Premature generalization is something I'd really, really caution against. Maybe that's something for us to work on as a community.

          2. This losing of the context is why I state that I’ve only got a set of anecdotes; I don’t have data. I’d love for someone to gather data.

            I am really good at seeing patterns, distilling the essence, and communicating that core. I am not good at gathering specific, concrete data and presenting it. The result is posts like this, where tons of experienced XP coaches & practitioners echo “yup, that’s right on,” and people who haven’t experienced XP vary in responses from “BS” to “interesting, tell me more – but prove it.” I need someone who is better with details to go and find them and prove the case. Are you that person?

            Specifically, the pattern I’ve seen is:

            * Look at bug injection rate. Ignore when a bug is found, and ignore bug severity.
            * Define a bug as “anything that causes anyone – from dev to stakeholder to customer – to go WTF, and which has ever been either checked into source control or experienced by anyone other than the person/pair who wrote it.”

            Then:

            * Defect injection rate for co-located, sit-together, full XP teams of 6-10 people who have been doing XP for at least a year is always around 1 bug per team per fortnight.
            * For those XP teams, this seems to be the case regardless of technology, language, product, company size, number of teams involved in the product, or any other factor that I have been able to discern.
            * This is a ton lower than non-XP teams.
            * There is one aberration: Hunter Technologies. They started as an XP team with numbers like the rest, then upgraded pairing to mobbing. In the 3 years since they started mobbing (ignoring the first week of the mob while they were learning), they have shipped ~10x as many products per year as before (same scope per product) and have written exactly 1 bug – which was found by the mob an hour later, but still counts since it made it to source control.

            I’d love someone to gather specific data to support or refute this pattern.
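As a rough sanity check on how "1 bug per team per fortnight" relates to the "1 bug per 100 dev-days" figure from the article, the conversion can be sketched as below. It assumes 10 working days per fortnight, which is an assumption and not something stated above. The `dev_days_per_bug` helper is hypothetical.

```python
# Assumption: a fortnight contains two five-day work weeks.
WORKING_DAYS_PER_FORTNIGHT = 10


def dev_days_per_bug(team_size: int, bugs_per_fortnight: float = 1.0) -> float:
    """Developer-days of effort per injected bug for a team of the given size."""
    return team_size * WORKING_DAYS_PER_FORTNIGHT / bugs_per_fortnight


# Team sizes in the 6-10 range quoted for full XP teams.
for size in (6, 8, 10):
    print(size, dev_days_per_bug(size))
# 6 60.0
# 8 80.0
# 10 100.0
```

Under that assumption, a team of 6 to 10 people injecting one bug per fortnight lands between 60 and 100 developer-days per bug, which matches the "100 developer-days or so" figure only approximately.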

        2. Also, JavaScript is a very difficult environment to get to low defects. The problem isn't browser incompat – there are libraries that will help with that (especially if you only need to support latest versions). The problem is that (putting on flame-retardant suit) JS is a dynamically-typed language.

          This makes it impossible to build good tools for JS. The computer simply can't figure out the total scope of what might be going on in any part of the code, so it can't reason over what changes are safe and what changes would introduce bugs. Result: no refactoring. Certainly not with tools, and not really by hand either.

          Without refactoring, the cost and risk of change goes up. Design-impacting changes have higher risk than changes that maintain the current design. So design quality suffers. This shows up in increased cyclomatic complexity, which is correlated with defects. The worst parts of the code become those that are changed the most often, which focuses rolls of the bug generation dice on the code with the highest probability of bug. Not a recipe for success.

          I don't have data to show it, but I would expect that a high-performing XP team (which uses refactoring well) will write fewer bugs on statically-typed languages (with tool support) than with dynamically-typed languages. This will, I expect, wash out all other productivity benefits from the dynamic typing – especially once we consider the organizational / total cost of the bugs. Furthermore, I expect that those XP teams with tools will produce fewer bugs than any team with dynamic languages – it isn't that they are maladapted to work on dynamic code.

  3. Arlo, one of the big problems I've seen is that organizations want large scale efforts to provide an estimate of when these large efforts will be done. (ha!) You and I realize that the larger the effort the more crazy that estimate is. But management is serious in their desire. What do you have as advice for those folks?

    1. Actually, I agree with management. Predictions do have a lot of value. And I think they can be done in a way so as to not be misleading.

      Heck, even information-free release date predictions can be useful: that's all an iteration boundary is (we will ship a week from today. Content TBD. But you can trust that something will ship). There's no less value in such a statement made 1 year out (we will ship a new version of the XBox for Christmas in 2013. No promises what will be in it, but it will ship). Used correctly – and correctly viewed as expressing only the info they contain – these are extremely useful.

      That said, we can do better. I should probably write a blog post about this, because there are details that matter. But I've got a poor track record at writing posts that I promise, so here's the precis.

      Key idea: stop guessing and checking. Start to measure, project, and learn.

      Pre-iterative companies use guess and check very effectively. They define some set of value as a "project" in a fairly arbitrary manner, then make a wild guess about how hard it will be to invent. They double the cost and compare things to each other. There is a lot of slop in the numbers, but they can at least find surprises (surprisingly expensive or cheap to produce).

      When they go Agile, they try the same thing. They run into 2 problems:

      1. They are now seeing more details, so have a harder time ignoring details and making wild guesses. They feel guilty for guessing.
      2. They continue the same guess and check approach for making predictions about their work.

      The second is the more insidious of these. Any team that estimates their work in actual numbers is using guess and check. Their mechanism of improving their predictions is:

      1. Guess how long something will take.
      2. See how long it took.
      3. Try to change the person making the guess so that they will guess better.

      That third step is hard. People are really good at consistency, so adjusting that to match a particular set of data is really tough.

      Teams that use relative estimation (or no estimation + equal-sized stories), on the other hand, do the following:

      1. Guess how items compare with each other in complexity. Or just make them all about the same complexity. Have lots of items so that error in this step will average out.
      2. Measure how much work gets done in a fixed time.
      3. Measure all the other stuff that gets done as well as pre-planned work.
      4. Measure how much new work arrives into the set of planned work for the project (stuff added to plan for future iterations, not done during this iteration)
      5. Assume that things are constant unless measurements prove otherwise (amount of unplanned work that will arrive, total work done, and thus the amount of pre-planned work that gets done, and amount added to plan).
      6. Use linear regression to smooth out variance. Observe where the lines cross (or that they diverge). That is your prediction, based on current methods of work.
      7. If this is a distasteful result, discuss changes that we could make in the way that we do the work that would make things better (cut scope, work more effectively, reduce blocking / waiting, reduce rate of new feature arrival, reduce amount of non-planned (interrupt) work).
      8. Assume that those changes will have no impact. But keep measuring the results. When they do have impact, it will show up in the data. The lines will move, and the prediction will change.
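The measure-and-project steps above can be sketched in a few lines. This is a minimal illustration rather than a definitive implementation: the `fit_line` and `project_finish` helpers and the sample numbers are hypothetical. It assumes you record two cumulative series at the end of each iteration, work completed and total planned scope (which grows as new work arrives), and that a plain least-squares line is a good enough fit for each.

```python
from typing import Optional, Sequence, Tuple


def fit_line(ys: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of y = slope * x + intercept, with x = 0, 1, 2, ..."""
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two iterations of data")
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def project_finish(done: Sequence[float], scope: Sequence[float]) -> Optional[float]:
    """Iteration index (0 = first measurement) where the 'done' line meets the 'scope' line.

    Returns None when the lines never cross, meaning scope is growing
    at least as fast as work is being completed.
    """
    done_slope, done_intercept = fit_line(done)
    scope_slope, scope_intercept = fit_line(scope)
    if done_slope <= scope_slope:
        return None
    return (scope_intercept - done_intercept) / (done_slope - scope_slope)


# Made-up cumulative story counts at the end of each iteration.
done = [8, 16, 24, 32]
scope = [100, 102, 104, 106]
print(project_finish(done, scope))
```

With these sample numbers the lines cross a little past iteration index 15. A `None` result corresponds to the diverging case in step 6: at the current rates the work never finishes, so something about scope or the way of working has to change.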

      The result is a highly predictable software delivery mechanism. There can still be a lot of variance. The main source is technical debt. But that is an addressable problem (implement XP & watch variance decrease steadily towards 0 at about 18 months).

      I am all for predictions. I just prefer to make my predictions by making mathematical projections based off of real data, rather than by guessing and checking.

      And, BTW, you can only use the data-driven approach if you have many measurement cycles during delivery and if those measurement cycles each deliver a completed chunk of work (nothing 90% done). Which is why XP teams can make predictions that are so much more accurate and specific than traditional teams. They have more data. As long as they choose to use it.

  4. > Most teams are perpetually only 3-6 months
    > from being able to ship low-bug software when desired and delivering
    > end-to-end business value out of each single team (making decoupling trivial).
    > But they choose to go a different, more costly direction.

    This should be your lead paragraph 🙂

    1. Good point. I think I'll write a new entry where it is. Title either "Just, Please, Stop the Stupid" or "Note to Teams: Please Stop Writing Bugs." Or something like that. Use that to get into the core of successful tech-first Agile transitions.

  5. I agree with what you wrote about here – I like to call it Product Agility. This is the forgotten part of scaling Agile – everyone is talking about organisations and processes but nobody is mentioning products. From my experience as an independent change agent and agile coach I must say that lack of product agility and legacy code is the biggest obstacle in agile transformations on a bigger scale (I mean more than one-three teams). But still even in small organisations without product agility it is really difficult to achieve organisational agility.

    Methods like Scrum or Kanban will help you in transformation but they will show you that you have this problem with products. Maybe sometimes they will help you to address and manage solving this problem. But you need to realize that this problem is really important and needs to be solved.
