Not Estimating At Scale
Estimation is a sensitive subject in Agile. Should we estimate at all? Should we avoid time-based units such as days? If we use relative estimation like story points, should we standardize across teams? What do we use estimates for? Are we explicit enough about the distinction between estimates and commitments? How do we prevent abuse?
I’m not going to provide an answer to these questions.
If you want a good treatment of estimation in an agile context, I suggest you read Ron Jeffries’ excellent articles in the Pragmatic Programmer’s magazine: Estimation is Evil: Overcoming the Estimation Obsession, and Estimation: The Best We Can Do.
I’m just going to describe a fairly simple and effective way to not (well, kind of not) estimate with multiple teams. It is based on a recent experiment with this exercise in a running project.
In this project we had not been estimating at all up until this point, nine months in. And when I say we did not estimate, I am of course lying. But we did not estimate individual user stories (or larger items: Epics, Features, whatever you want to call them).
We started this project by introducing Scrum and one-week sprints. None of the participants were used to working in an iterative way, and the requirements for the project were completely unclear. So even if we had wanted to estimate, there was very little to estimate! We decided to forgo estimation and simply asked the teams to split the user stories into slices small enough that a story could be finished in two to three days.
Of course, that means estimation was involved; otherwise they couldn’t know whether a story was small enough. So I wasn’t lying about the lying. But the effort was limited, and only spent on stories being groomed for the next sprint(s).
Delivery after nine months?
Fast forward nine months into this project, and we were in a somewhat different state. For one, we were no longer doing Scrum, but had moved to a Kanban approach. We had a two-step Kanban configuration. Stories were prepared through a READY board, whose process steps’ Explicit Policies reflected the Definition of Ready we had used as a Scrum team. One of those policies was the ‘takes less than 2–3 days to get Done’ rule. One of the three development teams was involved in grooming an individual story and usually (but not necessarily) picked up that story later by moving it to their Build Board.
At nine months, people traditionally get interested in the concept of delivery. Of course, our project was already delivered; production figures had been produced. But the project was supposed to ramp down around the twelve-month mark, so there was interest in finding out which of the features still on the wishlist could be delivered by then. And that required some estimates.
What to estimate
At this point, there were a number of high-level areas of interest. These had not yet been examined with or communicated to the teams, and had not been prioritized. In fact, one of the main reasons we needed the estimates was to help with prioritization. We did not want to spend a lot of time estimating these things; we should be focused on actually delivering software, not talking about things we might never work on.
We also didn’t want to give the impression that the estimates we would come up with were very accurate. A good estimate makes its uncertainty explicit. We decided to estimate the incoming feature requests in ‘T-shirt sizes’, each corresponding to a bandwidth for the number of stories expected to be necessary to implement the feature:
- Small: 1–5 Stories
- Medium: 6–10 Stories
- Large: 11–15 Stories
- Parked: not clear enough to size, or simply too large; requires splitting and clarifying
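As a sketch, the bands above boil down to a simple mapping. We did all of this on paper, not in code, so the function name and the use of `None` for ‘not clear enough to size’ are my own:

```python
def t_shirt_size(expected_stories):
    """Map an expected story count to one of the T-shirt size bands.

    `expected_stories` is None when the feature is not clear enough
    to size at all.
    """
    if expected_stories is None:
        return "Parked"   # not clear enough to size
    if expected_stories <= 5:
        return "Small"    # 1-5 stories
    if expected_stories <= 10:
        return "Medium"   # 6-10 stories
    if expected_stories <= 15:
        return "Large"    # 11-15 stories
    return "Parked"       # too large: split and clarify first
```

The point of the bands is that a feature only needs to be understood well enough to pick a bucket, not well enough to count its stories exactly.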
To make sure we wouldn’t spend a lot of time in detailed discussion, we decided to use Silent Grouping as the method of estimation. To make it work across our teams, we added a little diverge and merge to the mix.
We arranged for a number of sessions, each 90 minutes long. Each session would be dedicated to one main functional area. The Product Owner (or in this case the functional area PO helper) would introduce the functional area, explain the overall goals, and run through the features that had been identified in that area. This introduction would be half an hour, and would allow the teams to ask any questions that came up on the subject discussed.
Then we split into the four corners of our room, each attracting 8–10 people, and performed the silent grouping exercise there. This simply meant that everyone who showed up got a feature token (in this case simply the title of the feature printed on a piece of paper) and was invited to put it on a board in one of four columns (the categories described above). Then people were allowed to change the position of any paper whose placement they disagreed with. And all of this happened in complete (well, we tried…) silence.
After a few minutes, things stopped moving, and we went for the ‘merge’: on a central board, we called out each feature title and, based on its placement on the four separate boards, determined the final position for the story. We went through a few iterations, but our final set of rules seemed to work quite well:
- Stories that have the same estimate on all boards obviously go into that category on the main board
- Stories that have two different, but adjacent, estimates go into the larger category
- Stories that have three or more different estimates go into ‘Parked’
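These merge rules can be sketched as a small function. This is a hypothetical sketch, not anything we actually ran: the handling of a non-adjacent pair (e.g. Small vs. Large) and of a ‘Parked’ placement on a corner board are my own assumptions, since in practice we resolved those cases by discussion:

```python
# Ordered size bands; "Parked" is handled separately.
SIZES = ["Small", "Medium", "Large"]

def merge_estimates(placements):
    """Combine one feature's placements from the corner boards
    into a final category for the central board."""
    distinct = set(placements)
    if "Parked" in distinct or len(distinct) >= 3:
        # Three or more different estimates: too much disagreement.
        # (Treating any "Parked" placement the same way is an assumption.)
        return "Parked"
    if len(distinct) == 1:
        # Same estimate on all boards: use it directly.
        return placements[0]
    # Two different estimates: take the larger if they are adjacent.
    small, large = sorted(distinct, key=SIZES.index)
    if SIZES.index(large) - SIZES.index(small) == 1:
        return large
    # Non-adjacent pair signals real disagreement (assumption: park it).
    return "Parked"
```

The rules deliberately round up on disagreement: adjacent estimates suggest the teams roughly agree and the larger bucket is the safer bet, while wider spread means the feature isn’t understood well enough to size at all.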
We regularly found ourselves discussing whether something was parked because of uncertainty or because of size. But when we tried those as separate categories, most items turned out to be uncertain, and the added value was limited.
We had four 90-minute sessions to estimate what turned out (of course :-) to be quite a bit more than three months of work. Quite a large part of the features ended up in ‘Parked’, simply because they were not clear enough for the development teams to give even this kind of ball-park estimate. To clarify these features, ‘three amigos’ grooming sessions were set up. This brought the number of parked features down considerably and gave us a fair idea of the total amount of work. Since those sessions did not include the entire team, this did break our intent to have everyone involved in estimation, but we haven’t found a better way of doing this yet.
A second, and maybe even more important, effect was that the whole team was instantly up to date on what we had planned for the next period. A number of additional features and technical improvements were brought up during and after these sessions, as people realized they might be needed or would fit the direction the project was going.
And the estimates gave the more distant business owners the feedback they needed to weigh the priorities and decide what they could do without. For now.