11 August 2010

Parkinson's Uncertainty

Parkinson's Law: Work expands so as to fill the time available for its completion.

It's been shown to apply to government bureaucracies, and it surely applies to software development. One bit of lore that I've been carrying around for decades, and have seen repeatedly validated, is that work is at best a little early and at worst a lot late, with a ratio of about 4 in either direction. How do we express this in our calculations when there is no historical data to sample?

I've come up with the following algorithm for a time-to-complete distribution expressing Parkinson's Law:

Tp = .75 + .25 * X1 + 4 * X2 * X3

Where X1, X2 and X3 are independent uniform random variables on the interval [0, 1].

The R-Language code for 1,000 samples is

.75 + .25 * runif(1000) + 4 * runif(1000) * runif(1000)

The shape looks like this: [figure: histogram of the samples]
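
For readers without R at hand, here is a minimal Python sketch of the same sampling, using only the standard library. The names sample_tp and text_histogram, the seed, and the bin layout are illustrative choices, not part of the R one-liner above. It prints a rough text histogram plus the sample mean and median.

```python
import random
import statistics

def sample_tp(n, seed=None):
    """Draw n samples of Tp = .75 + .25*X1 + 4*X2*X3, X1..X3 uniform on [0, 1)."""
    rng = random.Random(seed)
    return [0.75 + 0.25 * rng.random() + 4 * rng.random() * rng.random()
            for _ in range(n)]

def text_histogram(samples, lo=0.75, hi=5.0, bins=17, width=50):
    """Return one text line per bin, with bars scaled to the fullest bin."""
    counts = [0] * bins
    step = (hi - lo) / bins
    for s in samples:
        idx = min(int((s - lo) / step), bins - 1)  # clamp to the last bin
        counts[idx] += 1
    peak = max(counts)
    return [f"{lo + i * step:4.2f} {'#' * round(width * c / peak)}"
            for i, c in enumerate(counts)]

samples = sample_tp(1000, seed=1)
print("\n".join(text_histogram(samples)))
# By linearity and independence the theoretical mean is
# .75 + .25*.5 + 4*.5*.5 = 1.875; the sample mean should land near it.
print("mean  ", round(statistics.mean(samples), 2))
print("median", round(statistics.median(samples), 2))
```

The samples are bounded below by .75 and above by 5, which is why the histogram bins span that range.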

The 4* in the formula will be disputed, in spite of all the evidence, so it's best to make it a parameter. Also, one of our objectives with skeptical planning is to reduce that parameter.

Here's Excel VBA code that returns an array usable with a range or DstCreate():

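Function rparkinson(count As Long, _
    expected As Single, skew As Single) _
    As Variant
'   Returns a count-by-1 array of numbers interpreting
'   Parkinson's law as a time-to-delivery shape:
'   expected * (0.75 + 0.25 * X1 + skew * X2 * X3)
'   Recalculate whenever the worksheet recalculates
    Application.Volatile True
    Dim ar() As Single
    Dim i As Long   ' declared so the code compiles under Option Explicit
'   Explicit bounds, so the array does not depend on Option Base
    ReDim ar(1 To count, 0 To 0)
    For i = 1 To count
'       Each Rnd() call is a fresh uniform draw on [0, 1)
        ar(i, 0) = expected * (0.75 + 0.25 _
        * Rnd() + skew * Rnd() * Rnd())
    Next i
    rparkinson = ar
End Function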
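
As a rough cross-check outside Excel, the same function can be sketched in Python. This is a translation under assumptions, not the VBA itself: the seed argument, the sample size, the expected value of 10, and the trial skew values of 2 and 1 are all hypothetical, chosen only to show how the shape tightens as the skew parameter comes down.

```python
import random
import statistics

def rparkinson(count, expected, skew, seed=None):
    """Return `count` durations: expected * (.75 + .25*X1 + skew*X2*X3)."""
    rng = random.Random(seed)
    return [expected * (0.75 + 0.25 * rng.random()
                        + skew * rng.random() * rng.random())
            for _ in range(count)]

# Compare the original skew of 4 with smaller (hypothetical) values.
for skew in (4.0, 2.0, 1.0):
    durations = sorted(rparkinson(10_000, expected=10.0, skew=skew, seed=7))
    p90 = durations[int(0.9 * len(durations))]  # empirical 90th percentile
    # Upper bound of the formula: all three uniforms at their maximum.
    bound = 10.0 * (1 + skew)
    print(f"skew={skew}: mean={statistics.mean(durations):.1f}, "
          f"90th percentile={p90:.1f}, upper bound={bound:.1f}")
```

Because the uniforms are independent, the mean works out to expected * (.875 + skew / 4), so each unit taken off the skew removes a quarter of the expected value from the average.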

As always, any improvements are welcome. If there are any mathematicians in the audience who'd like to reverse-engineer this into the usual calculus--knock yourself out.


  1. Marc, This is interesting. Is it your intention that this function be applied to a "best estimate" for a task duration; i.e., Tt = Te * Tp
    where Tt = forecasted duration of the task as a distribution
    Te = best estimate for the task duration in the units of concern
    Tp = the Parkinson distribution?

  2. I don't think there is a "best estimate" in the conventional sense. Te would be "the usual estimate" with a probability of success (without tail crunch) around 20%. The discussion with the programmer would be about the median, even odds. Te would be half that.

    Later on, when they have more experience estimating distributions, I'd give the programmer a "distribution shaper" to work with and leave parkinson behind.

    If the estimating doesn't involve the programmers, then I'd say Te is what the offending manager thinks is a good estimate.

  3. I should know better than to post in the wee hours.

    Te would be the expected value in the conventional sense: the value that becomes the commitment. From the developer point of view, if things go well, Te is easily met; if things go badly, it's missed. In other words, the work is either on time or late. Jensen's inequality rears its ugly head.
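
The exchange above can be made concrete with a small simulation. This is only a sketch: it assumes the reading Tt = Te * Tp from the first comment, and the function name simulate_tt, the seed, and the Te of 10 days are hypothetical. It reports how often the simulated duration comes in at or under Te, and where the median and mean sit relative to Te.

```python
import random
import statistics

def simulate_tt(te, n, skew=4.0, seed=None):
    """Simulate Tt = Te * Tp, where Tp = .75 + .25*X1 + skew*X2*X3."""
    rng = random.Random(seed)
    return [te * (0.75 + 0.25 * rng.random()
                  + skew * rng.random() * rng.random())
            for _ in range(n)]

te = 10.0  # hypothetical estimate, in days
tt = simulate_tt(te, 100_000, seed=11)

# Fraction of simulated outcomes that finish at or before the estimate.
on_time = sum(t <= te for t in tt) / len(tt)
print(f"chance of finishing within Te: {on_time:.0%}")
print(f"median Tt / Te: {statistics.median(tt) / te:.2f}")
print(f"mean Tt / Te:   {statistics.mean(tt) / te:.2f}")
```

Changing the skew argument shows how sensitive both the on-time chance and the median are to that one parameter.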