
Task Estimating Using Probability

murrayray Blog, Business Transformation, Transformation

This is the first of a five-part series on using probability and other tools to plan and manage projects successfully.

Part 1 – Task estimating using Probability
Part 2 – Project Plan creation using Probability
Part 3 – The Mechanics of using Probability to build a project plan
Part 4 – An easy way to capture actuals and manage the plan
Part 5 – Post-Mortem – or using data to improve the planning process

We will start with an example. Imagine that you are a brand-new project manager, and you need an estimate from one of the developers on your project team. You have a conversation with the developer (we will assume for now that the scope of the task is defined and agreed to; scope is a topic for another day), and you walk away from that conversation with an estimate of 3 days. What do you do with that estimate? Do you:

  • Enter 3d in the Work field of your project planning tool?
  • Enter 3d into the Duration field and 1.5d into the Work field, knowing that the developer is allocated only 50% to your project?
  • Enter 3d into the Work field and 6d into the Duration field instead?
  • Pad the estimate and enter 4.5d into the Work field, having been told that the developer is overly optimistic?

Before we can answer those questions, we need to understand what an “estimate” really is.  At its heart, an estimate is a prediction of the future, specifically predicting the number of hours it will take to complete the task.

Much like a weatherperson provides both a prediction and a probability[1], our estimate of the task effort should be a prediction with a probability. If you go to three different people and ask for an estimate, you are likely to get three different responses. Some of those responses might be optimistic (e.g., a 25% chance of hitting), and some might be pessimistic (e.g., a 75% chance of hitting). As an aside: when we say “hitting,” what we really mean is that the actual effort is equal to or less than the estimate in the plan. Some people might quibble with that definition, but when I’m in the middle of managing a big effort, tasks that take less time than estimated are a good thing.

So if we have a set of possible outcomes, we can use a normal distribution curve[2] to help us (here is where the stats come into play). For that, you need to figure out the mean and the standard deviation. To make the calculation of the mean simple, I like the formula:

( (1 * Best Case/Optimistic) + (4* Most Likely) + (1* Worst Case/Pessimistic) ) / 6

Most people find it easy to work with those 3 numbers. We are used to thinking about best and worst case, and that makes it easier to come up with most likely. You can also use five numbers: Extreme Worst Case (EWC), Worst Case (WC), Most Likely (ML), Best Case (BC), and Extreme Best Case (EBC), combined as ( (1*EWC + 3*WC + 5*ML + 3*BC + 1*EBC) / 13 ). This could be useful for a large critical path task, or a task that has a large number of assumptions.
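As a rough illustration, both weighted formulas can be written as one small Python helper. The function names and the sample numbers below are hypothetical and chosen only for this sketch; the weights (1-4-1 and 1-3-5-3-1) are the ones given above.

```python
def weighted_estimate(values, weights):
    """Weighted average of estimate points (all in the same unit, e.g. hours)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def three_point(best, most_likely, worst):
    # Weights 1, 4, 1, so the divisor is 6.
    return weighted_estimate([best, most_likely, worst], [1, 4, 1])


def five_point(extreme_best, best, most_likely, worst, extreme_worst):
    # Weights 1, 3, 5, 3, 1, so the divisor is 13.
    return weighted_estimate(
        [extreme_best, best, most_likely, worst, extreme_worst],
        [1, 3, 5, 3, 1],
    )


# Hypothetical inputs, in hours.
print(three_point(10, 12, 20))         # (10 + 4*12 + 20) / 6
print(five_point(8, 10, 12, 20, 30))   # (8 + 3*10 + 5*12 + 3*20 + 30) / 13
```

Because the weights are symmetric, it does not matter whether the best-case or the worst-case values are listed first.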

So, taking our original scenario, the 3 days becomes 24 hours for the most likely case (a later post will explain why all estimates for Work should be in hours). Continuing the example, once we dig into this some more, the developer comes up with 20 hours for the best case and 35 hours for the worst case (perhaps the “X” that the developer was planning on using may not be available, or some other assumption doesn’t come true).

That results in a mean of:  ((20 + (4*24) + 35) / 6) = 25.2

And a standard deviation[3] of: 5.1

So, this tells us that 50% of the time, the actual effort should be equal to or less than 25.2 hours.
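The mean and standard deviation quoted above can be reproduced in a few lines of Python. This is a sketch under one assumption that the post does not spell out: the 1-4-1 weights are expanded into six data points (20, 24, 24, 24, 24, 35), and the sample standard deviation is taken over them. That choice does match the 5.1 figure, and `statistics.stdev` uses the same n - 1 divisor as Excel's STDEV.S.

```python
from statistics import mean, stdev

best, most_likely, worst = 20, 24, 35  # hours, from the example above

# Assumption: expand the 1-4-1 weights into six data points so that a
# sample standard deviation can be taken over them.
sample = [best] + [most_likely] * 4 + [worst]

estimate_mean = mean(sample)   # same as (20 + 4*24 + 35) / 6
estimate_sd = stdev(sample)    # sample standard deviation (n - 1 divisor)

print(f"mean = {estimate_mean:.1f} hours")               # mean = 25.2 hours
print(f"standard deviation = {estimate_sd:.1f} hours")   # standard deviation = 5.1 hours
```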

We can use the standard deviation to help us provide an estimate with a probability of greater than 50% (or, if pressed for a lower estimate, to state the probability of hitting that). Each of the bands below is 1 standard deviation wide. So, you can see that 68.2% of all estimated outcomes fall within 1 standard deviation of the mean. As well, 95.4% of all estimated outcomes fall within 2 standard deviations.

[Figure: normal distribution curve, divided into bands 1 standard deviation wide]

In our example, with 1 standard deviation the range of possible outcomes is 25.2 ± 5.1 hours. However, for our project, success is defined as the actual work being equal to or less than the estimated work. So, estimating at the mean gives us a 50% chance, and adding 1 standard deviation gives us an additional 34.1%, for a total of 84.1%.
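To check the chance of hitting any particular estimate, one option is the cumulative distribution function of the normal curve. The sketch below uses `NormalDist` from Python's standard library with the rounded mean and standard deviation from the example; the helper name `chance_of_hitting` is made up for illustration.

```python
from statistics import NormalDist

# Rounded mean and standard deviation from the example, in hours.
effort = NormalDist(mu=25.2, sigma=5.1)


def chance_of_hitting(estimate_hours):
    """Probability that actual effort is <= the estimate, under the normal model."""
    return effort.cdf(estimate_hours)


print(f"{chance_of_hitting(25.2):.1%}")        # estimate at the mean: 50.0%
print(f"{chance_of_hitting(25.2 + 5.1):.1%}")  # mean plus 1 standard deviation: 84.1%
print(f"{chance_of_hitting(25.2 - 5.1):.1%}")  # mean minus 1 standard deviation: 15.9%
```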

You can also flip that.  For instance, if you subtract 1 standard deviation you have a 15.9% degree of confidence.  Here is a simple table that you can use, along with the calculated estimate of Work (for our example) for each degree of confidence.

Degree of Confidence   Standard Deviations from Mean   Calculated Work (hours)
 1.0%                  -2.326                          13.3
 5.0%                  -1.645                          16.8
10.0%                  -1.282                          18.7
15.9%                  -1.000                          20.1
20.0%                  -0.842                          20.9
30.0%                  -0.524                          22.5
40.0%                  -0.253                          23.9
50.0%                   0.000                          25.2
60.0%                   0.253                          26.5
70.0%                   0.524                          27.9
80.0%                   0.842                          29.5
84.1%                   1.000                          30.3
90.0%                   1.282                          31.7
97.7%                   2.000                          35.4
99.9%                   3.000                          40.5
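Rows like these can also be generated for any degree of confidence rather than looked up. The following is a minimal sketch, assuming the rounded mean (25.2) and standard deviation (5.1) from the example; `work_for_confidence` is a hypothetical helper name. Computed values may differ in the last digit from a hand-built table because of rounding.

```python
from statistics import NormalDist

MEAN_HOURS = 25.2  # rounded mean from the example
SD_HOURS = 5.1     # rounded standard deviation from the example


def work_for_confidence(confidence, mean=MEAN_HOURS, sd=SD_HOURS):
    """Return (z, hours) for a degree of confidence strictly between 0 and 1."""
    z = NormalDist().inv_cdf(confidence)  # standard deviations from the mean
    return z, mean + z * sd


for confidence in (0.10, 0.25, 0.50, 0.75, 0.90):
    z, hours = work_for_confidence(confidence)
    print(f"{confidence:>6.1%}  {z:+.3f}  {hours:.1f}")
```

`inv_cdf` is the inverse of the cumulative distribution function: given a probability, it returns how many standard deviations from the mean the estimate needs to be.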

When I talk to people about this method of estimating tasks, one of the complaints I hear is that it doesn’t work with Agile. I generally ask: if you are running Agile projects, do you play planning “poker”? If so, you have a range of estimates to use. The process of playing poker is designed to drive to consensus and to surface the assumptions that each person is using when deriving an estimate. You can use the outliers as the best case and worst case and calculate the mean and standard deviation from those numbers.
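One possible way to turn a round of poker votes into these numbers is sketched below. It rests on several assumptions that go beyond the text: the votes are expressed in hours, the median vote stands in for the most likely case, and the standard deviation is a sample standard deviation over the 1-4-1 weighted points. The votes themselves are invented for illustration.

```python
from statistics import median, stdev


def estimate_from_poker(votes_hours):
    """Turn one round of planning-poker votes (in hours) into a mean and SD."""
    if len(votes_hours) < 2:
        raise ValueError("need at least two votes")
    best = min(votes_hours)            # low outlier as best case
    worst = max(votes_hours)           # high outlier as worst case
    most_likely = median(votes_hours)  # assumption: median as most likely
    sample = [best] + [most_likely] * 4 + [worst]  # 1-4-1 weighting
    weighted_mean = (best + 4 * most_likely + worst) / 6
    return weighted_mean, stdev(sample)


votes = [16, 24, 24, 24, 40]  # hypothetical votes, in hours
mean_hours, sd_hours = estimate_from_poker(votes)
print(f"mean = {mean_hours:.1f} hours, standard deviation = {sd_hours:.1f} hours")
```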

Finally, as project managers, we are used to the “why is your estimate so high?” or “why will this take so long?” types of questions. Now, instead of answering “because”, we can state that we have a 75% chance (or whatever degree of confidence you chose) that this effort will come in on time or early. I think you will find it changes the discussion dramatically.

About Murray Ray

Murray is a Senior Consultant at Statêra and has run software and professional services organizations in Telecommunications, ERP, CRM and Contact Center Analytics. He has been building software for the Internet as long as there has been an internet, and takes special pride in building software that people actually like to use.

Prior to joining Statera, Murray worked with many technology companies over a 25-year career in commercial software development and client services leadership. He has helped guide several software companies through their growth phase to acquisition.

 

[1] The Signal and the Noise: Why So Many Predictions Fail-but Some Don’t, by Nate Silver, chap 4

[2] https://en.wikipedia.org/wiki/Normal_distribution

[3] For the calculation in Excel, I am using the STDEV.S function (standard deviation of a sample).