#LKNA13 Wednesday in Tweets

From Douglas Hubbard’s keynote:

From the morning sessions:

From the ignite talks:

From the Brickell Key Award ceremony:

From the afternoon sessions:

And the summary:

And the preview of the next year’s event:

Posted in conferences | Leave a comment

#LKNA13 Tuesday in Tweets

From Stephen Parry’s keynote:

From the morning sessions:

As a side note, here is a small catalogue of these injuries:

  • Schedule Stress
  • Torn Trust
  • Termination Tension
  • Fear of Failure
  • Lost Respect
  • Brain Hernia (really complicated code)
  • Duplication Depression
  • Fragility Frustration
  • Merge Misery
  • Crushing Complexity
  • Outage Ordeal

From the ignite talks:

From the Monte-Carlo challenge:

From the afternoon sessions:

Observations:

And last, but not least:

Posted in conferences | Leave a comment

#LKNA13 Monday in Tweets

From Bob Lewis’ keynote:

About the unique, emergent scaled Agile/Lean/whatever system of work at Spotify:

Ignite talk highlights:

The announcement of the 2014 plans:

From the afternoon sessions:

Posted in conferences | Leave a comment

Blogging and Estimates. Rather, #NoEstimates

An interesting thought: how would some of the advice on #NoEstimates apply to blogging?

I have a backlog of blog posts, that is, a number of topics on which I think I have something useful to say, but I haven’t yet found time to actually write them up, find good illustrations and publish them. Should I estimate my backlog in story points, or in word count, or in the time to write each post? Should I make a “release schedule”, with commitments on when these posts will be published? Does any of that make sense?

No, no, no, and no.

What Does Make Sense? What Can I Do?

First of all, value. I humbly submit that I have no idea what you, dear reader, will find useful in my future posts on this blog. Based on past experience, I can say some of the posts led to useful discussions, and some people commented or tweeted indicating that they found something useful. And some posts weren’t like that. I’m not going to let word count or time estimates get in the way of the fact that the value of what I’m writing here is highly uncertain and speculative.

The main thing I’ve learned about blogging is that a good blog post should be about one thing. You cannot size it by timebox or by word count. My 10 most read posts vary in size from 300 to 1300 words. The word count has virtually no correlation with the effort to write a post, much less with what a reader may take away from it.

Second, cadence. Predictable delivery of something. I’ve been trying to publish something every week lately.

Third, the explicit commitment point. Look at the options, see what a good topic could be for today, pick one topic, sit down and write a post. About one thing. Don’t estimate. If it’s the most important option out of the entire pool of options, you don’t need to estimate it.

Click Publish.

Posted in blog | Leave a comment

#NoEstimates Discussion at Agile Open Toronto

I was at Agile Open Toronto a week ago and this post is a short report and a follow-up on the “No-Estimates” session that took place there. The session was proposed and led by Chris Chapman, and 23 people, about one-fourth of all open-space attendees, showed up for it.

I believe Chris’ proposal was motivated in large part by Woody Zuill’s story of no-estimates. Also check out Woody’s story of Mob Programming, a novel collaboration technique that is an important part of his system.

The start of the discussion was uninspiring as we recycled often-cited notions such as:

  • Estimates are fragile, they are really “guesstimates.”
  • Not reliable if the timeframe is longer than one month. (Yeah, right, like all Scrum teams always meet their two-week sprint commitments.)
  • More spin of “guesstimates” such as “wild-eyed guesses.”
  • Estimates are useless, but the process of estimation is useful to the team. (So, what prevents the team from discovering that useful knowledge without an exercise aimed at producing a useless number?)
  • There is value in the team’s talking about each story. (What? They are not talking about it otherwise and on any other occasions?)
  • Correlation between the team’s cohesion and the quality of its estimates. A reasonable objection, which didn’t stand up to scrutiny.

Understanding It Deeper

The discussion eventually reached the point when some people started talking about deeper issues surrounding estimation. This is when I think the session became really valuable to the participants.

Trust. Estimation correlates with a low-trust culture. In a higher-trust organization, estimating the time to carry out this activity or that quickly loses its value. Business stakeholders can make decisions on what products and features to pursue based on their prudent assessments of risks and value, and they are trusted to do that. At the same time, technologists put their best effort behind the few most-valuable products and features. Frequent delivery, continual “doneness” help maintain trust. Trust enables all people involved to understand together that effort estimates have little correlation with delivery lead times and that “the time it takes” is never a number, but a probability distribution.

Decision-making. Effort estimation is pervasive in the software industry because it is assumed that if only we could obtain an accurate enough number, we could accurately calculate ROI, prioritize backlogs, make good promises to customers, etc. However, “just give me the number” hides the reality that what we really need is not numbers, but decisions. Decisions on what to work on and in what sequence. How many bits of information do we need to make a decision? What information about the value of a new feature, or a bug fix, or associated risks is already available? When people start pondering such questions, it often turns out that a lot of such information is already available and a precise effort estimate adds little to the decision. On the other hand, “just give me a number” often hides a weak understanding of other factors involved in the decision and a weak decision-making capability.

Sizing features. T-shirt sizing was cited during the discussion as a way to estimate features to an order of magnitude and thus provide some useful bits of information to the decision-making process. It was quickly noted that T-shirts are not a great way to communicate the difference in sizes and that the lizard-based scale may be better for it. One of the participants had an example of how, in her small software company, the development team established a steady delivery of iguana-sized features. The business stakeholders no longer need to ask the product developers to estimate each “iguana”, but instead analyze risks, experiment to understand market value, and choose which iguana to work on, connecting the decision to the economic outcomes. They may even take a risk on an occasional Komodo dragon!


We didn’t get into exposing ROI as a vanity metric and into what specific risk management techniques can be used in place of the deterministic estimates. So, to finish the post, I’d like to summarize some of the most common well-established approaches that have been put in practice by the Kanban community with great success in recent years.

  • Make the value creation process explicit. This is no trivial matter – most Kanban boards are different, most Scrum boards are the same.
  • Establish a reliable delivery cadence that is economically sensible and helps maintain trust.
  • Make the commitment point explicit. Hint: if “backlog grooming” is part of your vocabulary, you don’t have an explicit commitment point.
  • Limit work in progress, which has the effect of transferring information and risk management upstream.
  • Treat everything upstream of the commitment point as an option. Options have value, options expire, don’t commit too early unless you know why.
  • Improve not only the production capability, but also the capabilities to replenish options and select and sequence commitments.

Posted in conferences | 1 Comment

T-Shirts, Rabbits, Lizards and Sizing Software Features

I was at Agile Open Toronto last weekend, which included a no-estimates session. That session and the open-space conference itself deserve separate blog posts, but for now I want to cover just one set of concerns that relates to sizing and estimation of software projects, features, user stories, epics and other software work items.

T-shirt sizing (Small, Medium, Large, eXtra-Large) is a quick and easy method to roughly assess the size of a proposed software feature. My observation is, however, that many software people don’t appreciate the size scale that sizes S, M, L, and XL are supposed to represent. Relating it to a familiar everyday item like a T-shirt may add to the confusion. It is not obvious that the only thing the T-shirt feature sizing method has in common with wearable T-shirts is the size labels.

I decided to do a little experiment and found several road race T-shirts I collected over the years, such as this one:

[Photo: one of the road race T-shirts, size Medium]

The shirts came in three different sizes and the organizers got them from the same vendor, so the sizing had to be consistent. The Medium shirt (shown in the picture) measured 21 1/2 inches across, the Large 23, and the X-Large 24. Assuming the volume is proportional to the cube of the linear size, a Large shirt can fit a 22% bigger body than a Medium one, and an X-Large is 14% bigger than a Large.
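The arithmetic behind those percentages can be checked with a short sketch. It uses only the three widths quoted above and the cube-law assumption; the names `WIDTHS` and `volume_ratio` are made up for this illustration.

```python
# Shirt widths in inches, as measured in the post.
WIDTHS = {"M": 21.5, "L": 23.0, "XL": 24.0}


def volume_ratio(smaller: float, larger: float) -> float:
    """Ratio of volumes, assuming volume scales with the cube of the width."""
    return (larger / smaller) ** 3


l_vs_m = volume_ratio(WIDTHS["M"], WIDTHS["L"])
xl_vs_l = volume_ratio(WIDTHS["L"], WIDTHS["XL"])

print(f"L is {100 * (l_vs_m - 1):.0f}% bigger than M")    # about 22%
print(f"XL is {100 * (xl_vs_l - 1):.0f}% bigger than L")  # about 14%
```

Either way, the step from one shirt size to the next is a fraction of the smaller size, nowhere near a multiple of it.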

Here is the danger of confusion, especially if we take this talk of T-shirt sizes outside the development team to the business and forget to explain that “t-shirt sizes” is only our technical jargon, and what it actually means. We may be mistakenly communicating that the relationship between two sizes is “fits a little tight.” Now imagine what the business may ask you late in the project based on such an understanding of sizing.

The Fibonacci Code

Another t-shirt size confusion I’ve observed is mistaking the size labels for proxies of the user story sizes used in the planning poker game. The most popular sizing sequence in this game is based on the Fibonacci series, in which every number is approximately the sum of the previous two: 1, 2, 3, 5, 8, 13, 20, … Some people mistakenly assume that Small can mean 2, meaning 3 is Medium, 5 Large, etc. That again doesn’t properly communicate the scale, because it suggests that the next bigger size is less than twice as big.

While your team plays planning poker and uses numbers from a small range (e.g. 2-5 or 1-8 at the most), you’re only doing more refined estimates within the same T-shirt size. When people start raising numbers like 13, 20, or 40, then you’re in the next T-shirt size.
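A quick way to see why neighbouring planning poker numbers make poor T-shirt sizes is to look at the step-to-step ratios of the sequence quoted above. This is only a small sketch; the names `POKER` and `ratios` are invented for it.

```python
# Planning poker sequence as quoted in the post.
POKER = [1, 2, 3, 5, 8, 13, 20]

# Ratio of each number to the one before it.
ratios = [b / a for a, b in zip(POKER, POKER[1:])]

for (a, b), r in zip(zip(POKER, POKER[1:]), ratios):
    print(f"{a} -> {b}: x{r:.2f}")
```

After the first step (1 to 2), every ratio stays below 2, so mapping Small, Medium and Large to adjacent numbers understates how far apart the sizes are meant to be.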

The Power Law

What T-shirt sizes are really intended for is to communicate that the features are of different orders of magnitude. The next bigger size is really several times as big and outside the normal variation range of its smaller neighbour. The rationale for such a power-law size scale is to keep categorization mistakes to a minimum. Fitting a proposed feature into a range on the power scale allows us to establish probabilistic estimates based on the history of our delivery of features within that range.

Software teams and organizations have done this successfully over the years. To give two examples of case studies, I would point the reader to Rick Simmons’ Upstream Kanban and Henrik Kniberg’s Lean from the Trenches. What these case studies have in common is that the teams came to realize that the extra-large and large sizes don’t really work for them, because they naturally have a lot of variation, making the delivery too unpredictable and, from the business point of view, often a risk that is not worth it. Further, each T-shirt size essentially represents a class of service and a recent inquiry into the economics of classes of service came to the conclusion that having too many of them, especially if they form a hierarchy, leads to sub-optimal economic outcomes.

I don’t remember who proposed the Lizard Scale several years ago, but I think it was Jeff Patton. The scale is: gecko, iguana, Komodo dragon, Godzilla. Each animal is clearly bigger and in a different class than the previous one. This scale is essentially the same T-shirt scale, but without any confusion that may be caused by referring to T-shirts. Teams may or may not need to use this scale literally as long as they and their business stakeholders understand what the labels S, M, L and XL mean.

Posted in hands-on | 2 Comments

Scrum Commitments, Little’s Law, and Variability

I have recently had a discussion with a Scrum Master whose team was struggling quite a bit, completing exactly zero stories for two straight iterations. This problem is often framed as overcommitment – how can we make the team commit to what they can deliver and deliver what they have committed to? The Scrum Master was also a recent graduate of an Agile training program, as I could tell by the very revealing language: “teaching the team a lesson”, “honouring commitments”, and looking at the Scrum Guide first and at the situation second, trying to link the problem with “violations” of Scrum rules.

Let’s re-frame the problem first. This is not about coaching the team to “perform.” Even the concept of a “team” is not really helpful here. What we deal with here is an ecosystem: part of it is the team; other parts are its customers, partners within the organization, and various stakeholders. Some of these interests may be represented by the Product Owner – although how the Product Owner can be a leaky abstraction, along with other criticisms of this role, is a wholly different topic. The problem is not how to make the team execute on some tasks, but to understand how its ecosystem – a system – co-evolves to create value by delivering its software. Our job is to understand the system capability and improve it continually and forever.

Before we embark on that improvement journey, it would be reasonable to ask, what “laws of physics” our system may be governed by?

One such law is Little’s Law from queuing theory. If we have conservation of flow – stories enter the system at the sprint planning and exit at the demo – then the amount of work in progress divided by the completion rate must equal the cycle time.

Cycle Time = Work in Progress / Completion Rate (Throughput)

If you do one-week sprints, your minimally acceptable throughput rate is 0.2 stories per day, representing a velocity of one user story per sprint (five working days). Since the amount of work in progress is greater than or equal to 1, Little’s Law allows us to establish quickly that the best we can do as far as the cycle time is concerned in this case is 5 days. Fortunately, this cycle time fits into the one-week sprint, although it leaves no room for error. But we can say at the very least that the cycle time must be consistently no more than 5 days, and ideally 4, to account for the transaction costs (such as sprint planning and demo).

If the team multitasks (WIP several times more than necessary), its cycle time increases linearly, making it likely that it will not fit into the four-day timebox. For example, when the cycle time was three days, they met their commitments. Double the WIP, and the cycle time is now six days and they’re missing commitments. When human beings are involved, task-switching will actually decrease throughput while increasing WIP at the same time, making the cycle-time increase worse than linear.
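Here is a minimal sketch of these Little’s Law calculations. The one-week sprint numbers come from the text; the WIP of 3 and the throughput of one story per day in the second part are hypothetical values chosen only to reproduce the three-day and six-day cycle times, and the function name `cycle_time` is made up for the illustration.

```python
def cycle_time(wip: float, throughput: float) -> float:
    """Little's Law: average cycle time = work in progress / throughput."""
    if throughput <= 0:
        raise ValueError("throughput must be positive")
    return wip / throughput


# One-week sprint: one story per five working days is the minimum throughput.
SPRINT_DAYS = 5
min_throughput = 1 / SPRINT_DAYS             # 0.2 stories per day
best_case = cycle_time(1, min_throughput)    # WIP of 1 -> 5 days

# Hypothetical team: 1 story per day, WIP of 3, then WIP doubled to 6.
# Throughput is held constant here, which is the optimistic (linear) case.
before = cycle_time(3, 1.0)                  # 3 days, fits a four-day timebox
after = cycle_time(6, 1.0)                   # 6 days, does not fit

print(best_case, before, after)
```

In practice task-switching also lowers the throughput in the denominator, so the real increase would be steeper than this sketch shows.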

Another force affecting the team is variability. Cycle time is not really a number, but a probability distribution. People tend to underestimate its variance and the effects of the variance. I hear quite a few Scrum Masters and coaches talk about velocity as a “hard number” or a “meaningful metric” – they miss these metrics’ probabilistic nature.

[Figure: cycle time is not a number, but a probability distribution]

In order to meet weekly sprint commitments, you need the upper control limit of the cycle time to be no more than four days. For example, if the cycle time has a Gaussian distribution with a mean of 2 days and a standard deviation of 0.5, the control limits (three standard deviations either side of the mean) are 0.5 and 3.5. In this case, you have nearly 100% probability that the story will be delivered within the sprint. If the mean is 4 and the distribution is symmetrical, you have a 50% probability of missing your sprint commitment.
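These numbers can be reproduced with Python’s standard library. This is a sketch under the same Gaussian assumption as the example; the three-sigma control limits are inferred from the 0.5 and 3.5 figures, and the sigma of 0.5 for the second distribution is an arbitrary choice, since any symmetrical distribution centred on 4 gives the same 50%.

```python
from statistics import NormalDist

TIMEBOX_DAYS = 4  # five-day sprint minus planning and demo overhead

# Cycle time with mean 2 days and standard deviation 0.5 days.
good = NormalDist(mu=2, sigma=0.5)
lower = good.mean - 3 * good.stdev   # lower control limit: 0.5
upper = good.mean + 3 * good.stdev   # upper control limit: 3.5
p_on_time = good.cdf(TIMEBOX_DAYS)   # probability of finishing within 4 days

# Cycle time centred on the timebox itself (sigma chosen arbitrarily).
risky = NormalDist(mu=4, sigma=0.5)
p_miss = 1 - risky.cdf(TIMEBOX_DAYS)  # probability of missing: 0.5

print(f"control limits: {lower} to {upper}")
print(f"P(on time), mean 2: {p_on_time:.5f}")
print(f"P(miss), mean 4: {p_miss:.2f}")
```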

Delivering half of the sprint commitment

Even if my example of the average cycle time of 2 days with a sigma of 0.5 helped anyone start thinking probabilistically, it is actually too good to be a real-world example. It was only an illustration – real-world distributions are a lot worse!

They are almost never normal (Gaussian); lognormal, exponential and “unique” distributions are much more common, and the standard deviation is a greater fraction of the mean as well. The expansion-of-work phenomenon and the asymmetry of distributions will ensure that you almost never benefit from “less-than” outcomes, but always pay the price of “longer-than.”
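To get a feel for what the asymmetry costs, the sketch below compares a normal and a lognormal cycle-time distribution with the same mean and standard deviation. The mean of 2 days and standard deviation of 1 day are hypothetical inputs, not measurements, and both function names are invented for the illustration. The lognormal parameters are derived from the mean and standard deviation using the standard moment formulas.

```python
import math
from statistics import NormalDist


def normal_miss_probability(mean: float, stdev: float, timebox: float) -> float:
    """P(cycle time > timebox) for a normal distribution."""
    return 1 - NormalDist(mean, stdev).cdf(timebox)


def lognormal_miss_probability(mean: float, stdev: float, timebox: float) -> float:
    """P(cycle time > timebox) for a lognormal with the given mean and stdev."""
    sigma2 = math.log(1 + (stdev / mean) ** 2)   # variance of log(cycle time)
    mu = math.log(mean) - sigma2 / 2             # mean of log(cycle time)
    return 1 - NormalDist(mu, math.sqrt(sigma2)).cdf(math.log(timebox))


# Hypothetical inputs: mean 2 days, standard deviation 1 day, four-day timebox.
p_normal = normal_miss_probability(2, 1, 4)
p_lognormal = lognormal_miss_probability(2, 1, 4)

print(f"normal:    {p_normal:.3f}")
print(f"lognormal: {p_lognormal:.3f}")
```

With these inputs the long right tail of the lognormal roughly doubles the chance of missing the timebox, even though the mean and the standard deviation are identical.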

Options for Improvement

There are two basic options for improving due-date performance. One is reducing variation. Here are the main culprits that contribute to increasing it:

  • multitasking
  • story size
  • various blockers (due to multitasking, new technologies, and other sources)

Another option is to reduce the average story size, which has a doubly beneficial side effect because it tends to reduce the harmful component of variation as well. As you can see in the made-up example (average 2, sigma 0.5), committing to stories even as small as half the sprint length assumes unrealistically low variability.

As a rule of thumb, divide the sprint length by a factor of 4 to 6 and that’s how long the average duration (not the development effort) of your stories should be. Err on the side of making your stories smaller, using XP techniques like Product Sashimi and Elephant Carpaccio to make them smaller.
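The rule of thumb is simple enough to write down as a sketch. The function name and the default factors are just a restatement of the “4 to 6” range above, not a standard formula.

```python
def target_story_duration(sprint_days: float,
                          low_factor: float = 4,
                          high_factor: float = 6) -> tuple[float, float]:
    """Return the (shortest, longest) target average story duration in days."""
    return sprint_days / high_factor, sprint_days / low_factor


shortest, longest = target_story_duration(5)    # one-week sprint
print(f"one-week sprint: {shortest:.2f} to {longest:.2f} days per story")

shortest2, longest2 = target_story_duration(10)  # two-week sprint
print(f"two-week sprint: {shortest2:.2f} to {longest2:.2f} days per story")
```

For a one-week sprint this works out to stories that take roughly a day from start to finish.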

Lastly, the Scrum Master received some advice to increase the sprint length, but was reluctant to do so. I’m in agreement with him on that. I have seen teams take this step as a matter of convenience and they ended up covering up problems and avoided the deeper learning that they needed to go through. Instead, keep the week-long sprints, reduce work in progress, and use the actual cycle-time and velocity data to guide improvement.

Posted in hands-on | 8 Comments