No-Nonsense Product Development Estimates

“When will the team be ready to ship?”

There is probably no question we engineers hate more than that one.

For as long as I have been building software, bumping up against 20 years now, I have observed engineers, engineering teams, and engineering managers squirming in their conference room chairs, frantically waving their hands, and saying just about any words they can think of — words like “dependencies” and “technical debt” and “build process” — to avoid directly answering that question.

And now that I spend more time than not with the business side of the house, guess what? They hate asking that question just as much as we hate hearing it! Why? Because either we avoid directly answering it, gingerly stepping around like bare feet over hot coals, or we give them the latest “educated guess” that we both know is entirely bullshit.

Can’t we just do away with estimates altogether? Yes, absolutely!

There are a great many startup teams that happily build software and ship to customers without ever asking their engineers to estimate. They can skip estimating altogether simply because the total time required to iterate from, “I have an idea for our next feature,” to shipping it is measured in days or even hours. It isn’t worth the cost of spending time estimating.

But most of us do not have that luxury. For better or worse, we work in large organizations, and have big bulky development processes and infrastructures. Projects take weeks or months, rather than days or hours. And we have stakeholders, customers, or investors, who demand — and are entitled to — a clear and responsible answer to the dreaded question: “When will my software be ready?”

In this article, I explain why estimation is so hard, what’s at stake, and — when you absolutely have to do it — how to do it right.

The Nature of Digital Products

The only value in a software system, frankly, is in its novelty. Each new software component that is built, anywhere by anyone, only solves a specific and unique set of problems, bounded by a specific context. Once built, by definition, that component is perfectly adapted to its domain and strictly speaking never needs to change. Assuming no changes to its inputs or environment, it should run continuously the exact same way forever. It is automation in the purest sense.

Speaking generally then, automating something with software is a one-time thing. Once a manual process has been automated, once it has been codified into software algorithms and data structures, it can be replicated infinitely. There is never any need to write that exact same software algorithm again. Sure, there is work required to modify and extend it. But software by its nature can be copied at near-zero cost forever.

Of course, things like inputs and environments change all the time. There is plenty of work for engineers to constantly upgrade and tune software to adapt to changes. But in general, software should only have to be written once. Because of that principle, software once written is a commodity. The cost of designing it, massive; and the cost of copying it onto other computers in identical form, almost nothing.

Of course, in the real world you have software systems for sale that compete with one another in that they purport to do the same thing better, faster, or cheaper than one another. But those competing systems are generalized components, abstracted from the specific case, and not perfectly fit for the unique environment described above. Thus, most of the work required to install a new component is precisely in customizing and fitting it into a specific business context. Things like integrations, configuration, and importing unique data sets become valuable and add to the cost of the software. Again, novelty.

Here is the point: Software development is a design process. It is not construction or manufacturing. It’s closer to other knowledge work activities like architecture, design, writing, law, or even art. As software engineers, we don’t do repetitive tasks (at least, not while actually adding value). Every single line of code we write is literally different from every other line we’ve ever written before. Sure, there are patterns and similarities. And a great deal of the thought leadership of our profession is in how to optimize those patterns so we can generate the most business value for the least amount of code. Our whole profession is about not repeating ourselves!

Why Estimating Digital Projects Is Hard

When we estimate how long it will take us to build a piece of software, we are estimating something that — by definition — we have never done before and will never do again. And estimating something that has never happened before is… well, difficult.

As we outlined above, software engineering is a design process, and every new software system is by definition unique. But, there are enough similarities and patterns that we should be able to estimate with some degree of accuracy.

There are three primary factors where those repeatable parts of software development manifest, and which we can leverage to estimate effectively.

Architecture Components and Complexity

Software architecture is surprisingly straightforward to explain in the abstract. Think of a software product as a system. A software system is made up of components, in a way that’s not entirely dissimilar to mechanical or electrical systems. Each component has a job to do, and a boundary around it that defines how you operate it, or how it coordinates with other components to achieve some outcome. Inside the component are more sub-components, and inside those further sub-components, all the way down to the machine code executing in the processor.

We design systems using components for two reasons. First, they give us re-usability, a point I belabored above. A component that does something well defined in one place can often be reused to do a similar thing in another place. Second, components allow us to organize the complexity of the system in ways that make it easier to manage, and complexity is a key factor in estimating software development.

I think of complexity a bit like matter and energy in physics. In fact, there is even a law of conservation of complexity. Complexity cannot be removed or hidden, only moved around from one level of abstraction to another. So components allow us to wrap up some amount of complexity, subdivide it, and deal with it abstractly in a way that simplifies it somewhat.

The best layman’s example of this is probably the API. An API is the interface to a component (a service) in a system (the internet). You don’t have to worry about how the component works internally in order to work with it and get value from that interaction.

So, in estimating the time required to develop the whole or a part of a software system, the first factor to consider is the set of components that make up that system. The number of components, and the number of points of connection between those components, can be used as a measure of the complexity of a software system. This will affect the estimate.
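As a rough sketch of this idea, you could score a feature by counting its components and their connection points. The function and feature names below are all invented for illustration; real teams would weight these counts by experience rather than treat them as a formula.

```python
# Hypothetical illustration: a crude complexity score for a feature,
# using component count plus connection count as a proxy.
# All component and feature names here are made up.

def complexity_score(components, connections):
    """components: list of component names.
    connections: list of (a, b) pairs describing links between them."""
    return len(components) + len(connections)

checkout = complexity_score(
    components=["cart-ui", "pricing-service", "payment-gateway"],
    connections=[("cart-ui", "pricing-service"),
                 ("cart-ui", "payment-gateway")],
)  # score: 5

login = complexity_score(
    components=["login-ui", "auth-service"],
    connections=[("login-ui", "auth-service")],
)  # score: 3
```

The checkout feature touches more components and more connection points than login, so all else being equal we would expect its estimate (and its uncertainty) to be larger.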

Tools & Process

If you are not an engineer, I’ll let you in on a dirty secret of software engineering. Although nearly all of the value of software (and thus the value of the labor time of software engineers) is in the novel work of creating new lines of code, we actually spend a shockingly small amount of our time writing them.

So what the hell are we doing all day, then?

All of the time software engineers spend not coding new lines of code is spent performing activities related to writing code, but that are not, in fact, adding any direct novelty value to the software itself. I’d wager most teams spend more than half their time on this stuff. In the worst cases, teams can spend upwards of 80% of their time wrestling with configuration, security, incompatibilities between third-party components and tools, and unexpected impacts from legacy systems and technical debt.

I am generalizing grossly here for the industry as a whole. It has been improving over time, but this should paint the worst-case picture for you.

There is a constant battle over the ratio of tooling and process time to actual coding time in every software engineering team. The very best teams (not coincidentally, the same teams that skip estimating altogether) have a very high ratio of code-writing time to tooling and configuration time.

So, the amount of time the team spends on their tools and other secondary and tertiary aspects of software development is inversely proportional to the amount of time they spend writing novel software code of value to the business.

Giving the team the right tools and equipment with the proper time to set them up in an upfront one-time investment will improve the overall ratio of time spent coding to time spent wrestling with tools and other peripheral matters. This will affect the estimate as well.

Third Party Components

Finally, we have the factor of third party components. When I started building websites in late 1999, it was not uncommon for an engineering team to build every piece of an application from scratch. Over the intervening two decades, an explosion of third party tools, frameworks, and services has enabled us to develop powerful applications and interfaces by leveraging those existing components.

In fact, these days it is likely that the total lines of new code generated by your team represents a relatively small portion of the total number of lines of code executing on the server or in the users’ web browser or phone.

Engineers must account for a dizzying array of software components that are included in any modern application. The benefit, of course, is the time saved in not having to build those generic features and services from scratch. The risk is that we can’t possibly know (at least not in exhaustive detail) the ins and outs of every single one of those components.

Whenever a team decides to leverage a plugin or connect to an API built by other developers, they are exchanging risk of not knowing exactly how those components work for the benefit of not having to build it themselves.

This third factor of third-party dependencies is similar to the first component factor mentioned above, but with an extra twist. Not only do we have to account for the additional components represented here, we also don’t know exactly how they work, or how their operation might affect other components in the system. Third-party component dependencies are a huge aspect of software development, and it’s not going away. This too will affect the estimate.

Process: The Human Factor

The technical aspects previously discussed about the software itself, and the tools used to build it, represent only part of the picture. We also have a number of human factors at play that can dramatically affect the time required to develop and ship any software system.

Among the biggest of these human factors is process. Most software systems are built by teams rather than individuals. And the way the team works together to complete the design, development, and assembly of a complicated software system matters a great deal.

A healthy software development process can make the difference between a project that is completed on time and under budget, and one that careens horribly off the track of predictability and affordability. Those familiar with agile software development (most of you reading this, I hope) will likely have encountered a variety of concepts and norms used for describing and estimating software development. Artifacts like user stories for describing functional requirements, and practices like “planning poker” for group estimating, have gained almost mythical stature in the software world.

But instead of dutifully parroting the usual dogmatic agile buzzwords, let’s look at the fundamental principles that make it work.

Queues and Batches

I first encountered the concept of queues in product development in Don Reinertsen’s fantastic book, “The Principles of Product Development Flow.” A queue is basically a set of objects that are waiting in line to be processed by some kind of service.

Queues play an incredibly important role in any workflow, but especially in product development. To illustrate this, let’s use an example. Let’s say you have a work process with two steps, call them Step A and Step B. These steps could be design and then coding a feature, for instance. Or coding and testing. It doesn’t matter.

If an item completes Step A but the service processing Step B is still busy, that item will have to wait in a queue. And further, if items arrive in the queue after Step A faster than Step B can process them, the queue will never get any smaller — unless you add more resources to Step B to process items faster. So if you design features faster than your developers can build them, adding more designers won’t improve overall throughput. See where I’m going with this yet?

Queuing theory (and practice) has shown that the buildup of queues in a system can have a compounding negative effect on performance. The larger a queue grows, the harder it becomes to drain. In the extreme, things can quickly grind to a halt.
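The two-step example above can be sketched as a toy simulation. All of the rates here are invented for illustration; the point is only the shape of the curve when arrivals outpace processing.

```python
# Toy simulation (rates invented for illustration): items finish Step A
# at 3 per day, but Step B can only process 2 per day. The queue between
# the two steps grows without bound.

def simulate(arrival_rate, service_rate, days):
    """Return the queue size at the end of each day."""
    queue = 0
    history = []
    for _ in range(days):
        queue += arrival_rate              # items finishing Step A join the queue
        queue -= min(queue, service_rate)  # Step B drains what it can
        history.append(queue)
    return history

print(simulate(arrival_rate=3, service_rate=2, days=5))
# the backlog grows by one item every day: [1, 2, 3, 4, 5]
```

When the rates are balanced (say, 2 and 2), the queue stays empty; the moment arrivals exceed capacity, every day of work adds permanent backlog.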

As Reinertsen points out, while there are queues in manufacturing, the work items are physical objects that become very obvious when they accumulate in the factory. In fact, they represent inventory that has to be managed, stored, and accounted for as physical material on the balance sheet. It’s relatively easy for plant management to see where queues are forming and apply more resources to the bottleneck.

But in product development, the inventory is “invisible”. Partially completed designs or code can sit in a queue between steps, but as virtual assets they can remain invisible indefinitely unless you use a system to make them obvious. Once you can see your queues it becomes possible to manage them properly.

The number of items you work on at one time is called a batch. If you work on a single item at Step A, and then immediately pass that item to Step B, you have a batch size of 1. If you work on ten items at a time at Step A, and then pass the whole batch to Step B, your batch size is 10. It turns out, the larger the batch sizes you work with, the more likely there are to be queues of waiting items between steps. And the bigger the queues, the more likely your workflow will become overwhelmed and stall.
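One way to see the cost of big batches is to ask when the first finished item actually moves on to the next step. This tiny sketch, with invented numbers, makes the point:

```python
# Sketch (numbers invented): with a batch size of 10, the first item
# completed at Step A sits idle until nine more are done before the
# batch moves to Step B. With a batch size of 1, it moves immediately.

def first_handoff_day(batch_size, items_per_day=1):
    """Day on which the first completed item reaches Step B."""
    return batch_size / items_per_day

print(first_handoff_day(batch_size=1))   # hands off after day 1
print(first_handoff_day(batch_size=10))  # waits until day 10
```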

This has profound implications for software development, especially since time to ship is critical for business reasons. Designing your process to reduce queues, usually by making your work visible and by working with smaller batch sizes, helps ensure the smooth flow of work items through your system.

This is why agile software teams use visual systems, like Kanban boards or ticketing systems, for displaying work items like user stories. It’s also why most of the beloved agile practices, from test-driven development to the sprint, are precisely about reducing batch sizes.

But awesome as all this is, what does it have to do with estimating?

Well, everything. You see, the most profound impact of controlling queues by reducing batch size is that you can better manage variability. Variability is often amplified by queue size. Keeping queues in check allows for a system that has a smoother flow. And that means better predictability. Better predictability, more reliable estimates.

How do you know if your system has a smoother flow? You should be measuring your team’s cycle time. Cycle time is the total elapsed time it takes for a work item to travel from the beginning to the end of your process. If you dutifully track cycle time on every work item, pretty soon you will be able to calculate an average cycle time for most items. This cycle time data will allow you and your team to better calibrate your estimates. If your cycle time is fairly smooth, as opposed to volatile, it means you have a stable, predictable system, which will make it easier to estimate with accuracy. If it isn’t, you will probably need to look closely at where queues are forming and work to eliminate them.
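Calibrating from cycle time data can be as simple as an average and a spread. The numbers below are invented for illustration:

```python
# Sketch: calibrating estimates from completed work items.
# The cycle times below (elapsed days, start to ship) are invented.
from statistics import mean, stdev

cycle_times_days = [3, 4, 3, 5, 4, 3, 4]

avg = mean(cycle_times_days)
spread = stdev(cycle_times_days)

print(f"average cycle time: {avg:.1f} days (stdev {spread:.1f})")
# A small stdev relative to the average suggests a stable, predictable
# system; a large one means queues are probably forming somewhere.
```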

How We Currently Do Estimates

When I start to work with teams, they are doing product development estimates in one of two ways. One is downright terrible, insofar as it’s sloppy and nearly always wrong. The other looks scientific on the surface, but can still lead to inaccurate results.

A staggering number of product teams still do estimates basically by gut feeling or guessing. The boss asks for an estimate for a list of features, and the engineers, either as individuals or as a group, will shoot back the expected calendar dates for completion or time required to complete each feature. This approach is so error-prone that it’s scarcely worth bothering with at all. If you’re going to be that careless, why estimate at all?!

Most teams that call themselves agile stumble at estimating scientifically. I definitely give them an “A” for effort. Followers of agile will be familiar with estimating in “story points” instead of time. For the uninitiated, story points are a unit of measure that is intentionally decoupled from actual calendar time. What it amounts to is essentially a level of difficulty. As engineers assign story point estimates to features, and complete some number of those story points during a sprint or iteration, they can use that historical data to improve their estimates over time.

At least, that’s the theory. In practice, I see few teams that are stable and mature enough to properly estimate in story points.

No-Nonsense Estimates

I don’t really care if you use story points or time units in your estimates. It is far more important that you estimate using a method that properly surfaces where there is uncertainty, and then iteratively work to reduce that uncertainty.

By combining our understanding that software architectures are made of components with our understanding that queues in our workflow affect the team’s throughput and predictability, we can build an estimation method that works. I learned this method from my close friend and colleague Tim James, and we have both used it successfully on projects since then.

1. Properly scope the complexity of your project.

First, you will need a list of features you need to ship. For each feature, you should list out all of the components required to complete the feature. Components should include front-end elements, back-end services, and any third party components. Pay special attention to the number of connection points between components. Don’t start estimating until you have elaborated all of the components for all of the features.

2. Get everyone aligned using historical data.

Next, you will want to calibrate the team by reviewing your historical data from your past work. Review work that was already completed, being careful to illustrate the component architecture of those features, and familiarize the team with the actual completion times it took to do that work. This will help the team to converge on a common understanding of its own capacity and speed to complete features with a similar number of components and connections.

3. Estimate in ranges, and then iterate.

Finally, you are ready to estimate. The simplest tool to use is just a spreadsheet. Ask the team to estimate each feature in the spreadsheet not with a single estimate value, but a range value from lowest to highest expected time to complete. This range is the critical part of this estimating method.

Once all of the ranges are entered for each feature, low and high, sort the list in order of the widest difference in the time ranges. The item at the top of the list should have the largest difference between its high and low estimate. This is intentional.

Now, proceeding in order from the top of the list, the team should discuss the features in more detail, particularly surfacing disagreement around why the disparity between high and low exists. Often, this can highlight new information, shortcuts, or compromises that enable you to update the estimate range. Do one pass of this through the entire list and update your range estimates. Then, re-sort the list and repeat the discussion. Do this at least 3 times, but feel free to do as many rounds as necessary until the team feels it fully and completely understands the time and complexity required to build each feature.
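The sort-and-discuss step above is easy to mock up outside a spreadsheet. A minimal sketch, with feature names and (low, high) day ranges invented for illustration:

```python
# Sketch of the range-estimate list: each feature gets a (low, high)
# estimate in days, sorted widest spread first so the team discusses
# the most uncertain item at the top. Feature names are invented.

estimates = {
    "search":        (2, 10),
    "user-profile":  (3, 5),
    "notifications": (1, 8),
}

by_uncertainty = sorted(estimates.items(),
                        key=lambda kv: kv[1][1] - kv[1][0],
                        reverse=True)

for feature, (low, high) in by_uncertainty:
    print(f"{feature}: {low}-{high} days (spread {high - low})")
# search tops the list (spread 8), then notifications (7),
# then user-profile (2)
```

After each round of discussion, the team updates the ranges, re-sorts, and repeats until the spreads stop shrinking.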

4. Track your actuals over time, and adjust estimates accordingly.

This is not a one-time process. If you are going to estimate at all, you should plan to have this discussion regularly, perhaps at the beginning of each sprint if you use sprints, or on a weekly or monthly basis if you don’t. Review your estimates from the previous session, and use the actual completion time to further calibrate the team. This is a critical step. Your team’s estimates will not improve over time if you skip the review of past estimates.
