Thank you to all of you who attended my #Agile2014 session: How to improve Flow Efficiency, Remove the Red bricks! In this and upcoming posts (part 2), I will answer some of the questions I have received after the session.
Q1: I was hoping to better understand how to improve flow efficiency when the number of resources varies on our scrum process. For example, we have more developers than testers. We typically have a bottleneck in the test step. Not sure I got my answer.
This question is not necessarily a flow efficiency question. It may be more of a balance demand to capacity question. Nevertheless, let us explore the flow efficiency side first, as this was the main focus of the session. First, a short description of flow efficiency.
Flow efficiency is a measurement of how much a flow unit (e.g. user story, feature, MVP, project) is worked on (touched) from the time a need is identified to the time it is satisfied.
Flow efficiency = Total touch time (time spent working on a flow unit) / Total lead-time (total time a flow unit spends in the system, from need identified to need satisfied)
Red bricks – Non value adding time: Waiting in a queue, waiting for a decision, waiting on dependency
Yellow bricks – Non value adding activities but currently required: Over processing (e.g. backlog maintenance), reporting and status meetings, rework (e.g. due to defects, handovers, long lead-times, lack of understanding the requirements)
Green bricks – Value adding activities: True customer need
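To make the formula concrete, here is a minimal sketch in Python. The `FlowUnit` class and the day counts are hypothetical, made up for illustration. It assumes that touch time is the sum of the yellow and green bricks (all time someone actually works on the flow unit) and that the red bricks are pure waiting.

```python
from dataclasses import dataclass


@dataclass
class FlowUnit:
    """Days one flow unit spent in each brick colour (hypothetical figures)."""
    red_days: float     # waiting: queues, decisions, dependencies
    yellow_days: float  # non value adding, but currently required, work
    green_days: float   # value adding work

    @property
    def touch_time(self) -> float:
        # Assumption: any time the unit is worked on counts as touch time.
        return self.yellow_days + self.green_days

    @property
    def lead_time(self) -> float:
        # From need identified to need satisfied.
        return self.red_days + self.yellow_days + self.green_days

    @property
    def flow_efficiency(self) -> float:
        return self.touch_time / self.lead_time


story = FlowUnit(red_days=16, yellow_days=1, green_days=3)
print(f"{story.flow_efficiency:.0%}")  # 4 touch days / 20 lead-time days -> 20%
```

With these made-up numbers the story waits four days for every day it is worked on, which is why the red bricks are the first place to look.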
Improving flow efficiency
To improve flow efficiency, focus should be on the flow units. We want a flow unit to be worked on, touched, as much as possible compared to the total lead-time for a flow unit.
First, we should focus on removing the red bricks. Most of the time, removing the red bricks will not involve any major investments. It is mostly a question of changing the policies for how we run our system. We need to change from a resource efficiency focus to a flow efficiency focus. With a resource efficiency focused system, we will often see an excess number of flow units in our system. In addition, the flow units will be moving through our system in big batches.
Second, we should focus on removing the yellow bricks in the process. Yellow bricks are non value adding, but currently required, activities for us to operate our system. Removing the yellow bricks often requires a little more effort than removing the red bricks. It often involves changes to how our system is organized and run. A resource efficiency focused system often needs more yellow bricks due to the excess number of flow units and the big batch sizes.
Flow efficiency and Scrum
In most Scrum implementations I have observed, the flow units (features and user stories) spend a lot more time in the product backlog, the deployment queue and the sprint to-do and done states than actually being worked on. Flow units may pass from the to-do to the done state quite fast and with little waiting during the sprints. However, they wait for a long time outside the sprint boundaries. This is a clear indication of a resource efficiency focused system. Most organizations using Scrum fail to recognize this. Why?
It is rare that I see a visual management system that extends beyond an individual Scrum team and an individual sprint. If we want to improve flow efficiency, we first need a greater understanding of the end-to-end system that extends beyond an individual Scrum team and an individual sprint.
My first advice for organizations that use Scrum and want to improve flow efficiency:
Common ways to improve flow efficiency
With a greater shared understanding of the bigger end-to-end system, there are some common ways that we can try to improve flow efficiency. The two most common are:
Reducing batch size
As Don Reinertsen writes in his excellent book The Principles of Product Development Flow:
Most product developers underutilize one of the most important tools to improve flow – Reducing batch size.
Reducing batch size helps us reduce the unwanted variability in our system. Reduced unwanted variability will reduce the need for buffers. Reduced buffers mean less waiting time (fewer red bricks). Less waiting time increases flow efficiency.
Reducing batch size increases feedback. Increased feedback often leads to less rework (fewer yellow bricks). It is not uncommon that increased feedback will make some planned work superfluous. Less rework and not doing superfluous work increase flow efficiency.
Reduced batch size reduces overhead. Reducing overhead can improve flow efficiency, as overhead typically adds hand-offs (yellow bricks). The more hand-offs, the more queues are often needed. More queues mean more wait time (red bricks). However, reduced overhead can also lead to lower flow efficiency. When non value adding touch time is reduced, flow efficiency can go down if total lead-time does not fall by at least the same proportion. However, this is often not a bad thing, as flow efficiency on its own is not a good measure of a system's performance. Flow efficiency is only one indicator of a system's performance.
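A small worked example may help with the counterintuitive case where removing overhead lowers flow efficiency. The numbers are hypothetical: a flow unit with 4 days of touch time and 6 days of waiting, from which we remove 1 day of yellow-brick touch time while the waiting stays the same.

```python
def flow_efficiency(touch_days: float, wait_days: float) -> float:
    """Touch time divided by lead-time, where lead-time = touch + wait."""
    return touch_days / (touch_days + wait_days)


before = flow_efficiency(touch_days=4, wait_days=6)  # lead-time 10 days
after = flow_efficiency(touch_days=3, wait_days=6)   # lead-time 9 days

print(f"before: {before:.1%}, after: {after:.1%}")  # before: 40.0%, after: 33.3%
```

The flow unit is delivered a day earlier and with less waste, yet the ratio drops, which is one reason not to judge a system on flow efficiency alone.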
How do we reduce batch size in Scrum?
The most obvious, and often the easiest, way is to have shorter Sprints. If we start work on more than one flow unit at a time in the Sprint, reducing Sprint length often reduces our batch size. If we deliver all flow units of a Sprint to the next stage in one go, reducing Sprint length will reduce batch size.
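As a rough sketch of why handing over a whole Sprint in one go hurts flow efficiency, consider the simplified model below. It assumes (hypothetically) that the flow units in a batch are worked on one after another, that each needs the same touch time, and that every unit's clock runs from the start of the batch until the joint hand-over. Real Sprints are messier, so treat the numbers as an illustration only.

```python
def batch_flow_efficiency(units_in_batch: int, touch_days_per_unit: float) -> float:
    """Flow efficiency of each unit when the whole batch is handed over at once.

    Simplifying assumptions: units are worked on one at a time, and each
    unit's lead-time runs from the start of the batch to the hand-over.
    """
    lead_time = units_in_batch * touch_days_per_unit
    return touch_days_per_unit / lead_time


for batch in (10, 5, 1):
    print(batch, f"{batch_flow_efficiency(batch, 2.0):.0%}")
```

In this model each unit's flow efficiency is simply one divided by the batch size, so halving the batch doubles the flow efficiency of every unit in it.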
Another way to reduce batch size is to reduce the size of each flow unit (user story, feature). Making the flow unit smaller often leads to increased feedback and increased learning.
My added advice is not to focus too much on slicing based on customer or business value when we begin. If we focus too much on customer or business value in the beginning, it is easy not to start at all. Most of the time, just slicing will help improve flow.
Reducing Work-In-Process (WIP)
Most organizations keep way too much work in their systems. Having lots of work in our system has many downsides and is a clear indicator of a resource efficiency focused system.
Applying WIP limits and then, in most cases, reducing WIP in our system has a number of benefits for flow and flow efficiency.
By applying WIP limits to our system, we can make the system more predictable and make the bottleneck of the system permanent. Making the system more predictable reduces unwanted variability. Reduced unwanted variability will reduce the need for buffers. Reduced buffers mean less waiting time (red bricks). Less waiting time increases flow efficiency. Making the system bottleneck permanent helps us understand the capability of our system. If we understand our system, we can avoid overloading it with flow units. If we avoid overloading our system with flow units, queues will be shorter. Shorter queues mean less waiting (red bricks). Less waiting increases flow efficiency.
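One way to put rough numbers on the link between WIP and waiting is Little's Law, which is not part of the session material but is a standard queueing result: in a stable system, average lead-time equals average WIP divided by average throughput. The figures below are hypothetical.

```python
def average_lead_time(average_wip: float, throughput_per_sprint: float) -> float:
    """Little's Law: average lead-time = average WIP / average throughput."""
    return average_wip / throughput_per_sprint


# Same throughput (6 flow units per sprint), different amounts of WIP.
print(average_lead_time(30, 6))  # 5.0 sprints
print(average_lead_time(12, 6))  # 2.0 sprints
```

If touch time per flow unit stays the same, the shorter lead-time in the second case is almost entirely removed waiting, i.e. fewer red bricks.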
For me, one of the biggest benefits of reducing WIP is a greater understanding of our system. The less WIP we have in our system, the easier it becomes to see how our system really works. By lowering the water level in the value stream, even the smallest problems in our system become visible, just as the rocks in a canal become visible.
It is important to have an explicit WIP limit for every stage in our process. If not, we will only have limited benefits. Without WIP limits on every stage in our process, no backpressure will be present in our system. With no backpressure, we reduce the information that we can gain from our system.
Reducing Work-In-Process with Scrum
Reducing WIP is really simple, but also hard. The simple part: set an explicit WIP limit for every stage in our process. That typically means we set WIP limits on:
- Upstream stages
- Product backlog
- Sprint to-do
- Downstream stages
If we have created a more detailed process description, we apply the WIP limit on all the detailed stages.
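The mechanics of an explicit WIP limit per stage, and the backpressure it creates, can be sketched in a few lines of Python. The `Stage` class, the `pull` function, the stage names and the limits are all hypothetical and only meant to illustrate the idea, not to describe any particular tool.

```python
class Stage:
    """One column on the board with an explicit WIP limit."""

    def __init__(self, name, wip_limit):
        self.name = name
        self.wip_limit = wip_limit
        self.items = []

    def has_room(self):
        return len(self.items) < self.wip_limit


def pull(item, source, target):
    """Move an item downstream only if the target stage is under its limit."""
    if item not in source.items or not target.has_room():
        return False  # backpressure: the item stays where it is
    source.items.remove(item)
    target.items.append(item)
    return True


develop = Stage("Develop", wip_limit=3)
test = Stage("Test", wip_limit=2)
develop.items = ["A", "B", "C"]

print(pull("A", develop, test))  # True
print(pull("B", develop, test))  # True
print(pull("C", develop, test))  # False: Test is full, so Develop stays blocked
```

Because Develop is itself at its limit, the blocked item also stops new work from entering Develop. That is the backpressure travelling upstream, and it is the signal we want to learn from.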
How do we decide what WIP limit to set? It really comes down to how much organizational friction we want to introduce when we start. With high organizational friction, getting buy-in can be harder, but it will help us understand our system faster. With low organizational friction, it is typically easier to get buy-in, but creating understanding can take longer.
The hard part of WIP limits is what we do when the WIP limits create backpressure and stop us from starting work on the next flow unit. How do we as an organization handle the situation when we have excess capacity at some stage in our system? Can we resist the urge to add more work, just to keep people busy? Or, do we see this as an opportunity to learn and improve our system?
Alternatively, when we do not get any backpressure, do we take the opportunity to reduce WIP again to gain more knowledge? Or, do we let inertia set in and just keep ourselves busy?
Go through this WIP adjustment cycle on a regular basis, every 2nd to 4th sprint when using Scrum.
Balance demand to capacity
Now, coming back to what may be your real question: how to balance demand to capacity.
The “Law of bottlenecks”
We cannot go faster than our bottleneck.
To understand where in our system we have the current bottleneck I recommend two things that I already have discussed in this post:
When we have identified where the current bottleneck is in our system we can go through the Five focusing steps from Theory of Constraints (Constraint ~ bottleneck) until we have the desired system performance.
Five focusing steps
- Identify the constraint
- Exploit the constraint
- Subordinate to the constraint
- Elevate the constraint
- If a constraint has been broken in the previous steps, go back to step 1
I will not explain the steps in great detail in this post. Here is a short version of how it can work with the example from the question:
- Find your bottleneck. Here global end-to-end awareness helps as well as lowering the water level. Create an end-to-end visualization of your system. Apply WIP limits and adjust until you get backpressure.
- Use your bottleneck only for bottleneck activities, e.g. testers should only do what they are truly the experts at and only that. Rework by the testers should be avoided as much as possible, e.g. they should not find defects that the upstream steps could have found/avoided.
- Make the rest of your system work to the capacity of the bottleneck. This is where a WIP limited system really helps, by throttling the capacity of your system to that of the bottleneck. If you have excess capacity in front of your bottleneck, testing, that capacity should only work to the capacity of the bottleneck, the testers. This will create slack capacity in front of the bottleneck, e.g. the developers will have slack.
- Now try to improve the capacity at the bottleneck. This typically involves adding more people and/or equipment to the bottleneck e.g. get more testers, cross train developers and testers. It can also involve making the bottleneck activities go faster e.g. test automation. This step is often expensive and time consuming to implement.
- If the bottleneck has been broken or has moved, start over from step 1.
We move on to the next step if we have not reached our desired capacity. If we have a desire to continuously improve our system towards perfection we will keep going through the Five focusing steps for a long time.
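To illustrate how the bottleneck can move when it is elevated, here is a minimal sketch using the developer and tester example from the question. The capacities (in user stories per sprint) and the function names are hypothetical.

```python
def identify(capacity):
    """Step 1: the stage with the lowest capacity is the constraint."""
    return min(capacity, key=capacity.get)


def slack(capacity):
    """Step 3: every stage works at the constraint's pace; the rest is slack."""
    pace = capacity[identify(capacity)]
    return {stage: cap - pace for stage, cap in capacity.items()}


capacity = {"Develop": 10, "Test": 6}  # stories per sprint, hypothetical
print(identify(capacity), slack(capacity))  # Test {'Develop': 4, 'Test': 0}

capacity["Test"] += 5  # Step 4: elevate, e.g. test automation or more testers
print(identify(capacity), slack(capacity))  # Develop {'Develop': 0, 'Test': 1}
```

After the elevation, development is the new constraint, so we are back at step 1 with a different bottleneck to exploit and subordinate to.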