Principles behind the Three Ways

Agile may demand a lot from companies, but before they can even get started, they first need to understand the Three Ways. Below is a description of what to look for.

First Way: Principles of Flow

The First Way requires the fast and smooth flow of work from Development to Operations, to deliver value to customers quickly.

Focus on the following targets:

Increase quality of work as well as our throughput
Boost our ability to out-experiment the competition
Fast left-to-right flow of work from Dev to Ops to the customer
Work needs to be visible
Reduce batch sizes
Reduce intervals of work
Build in quality by preventing defects from being passed to downstream work centres
Optimise for global goals
Reduce the lead time required to deploy code into the production environment

Results to expect:

Continuous build
Continuous integration
Continuous testing
Continuous deployment process
Creating environments on demand
Limiting work-in-process (WIP)
Building systems and organisations that are safe to change

We reduce the lead time needed to fulfil internal and external customer requests, thereby increasing the quality of our work while making us more agile and better able to out-experiment the competition.

Goals to concentrate on:

Decrease the amount of time required for changes to be deployed into production
Increase the reliability and quality of the services we provide

1. Make our work visible

Unlike in physical processes, in the technology value stream we cannot easily see where flow is being impeded or when work is piling up in front of constrained work centres.

Work can be passed on to downstream work centres with problems that remain completely invisible until we are late delivering what we promised to the customer.
Sometimes the problem only surfaces when the application fails in the production environment.
By making work visible, we can see where work is queued and where it is stalled.

Kanban boards are one of the best ways to make our work as visible as possible:

Work originates from the left
It is pulled from work centre to work centre (columns)
It is finished when it reaches the right (done)

The Kanban board should span the entire value stream.

Work is done only when it is delivering value to the customer.

Throughput is increased when each work centre is focused on a single task of the highest priority until it is completed

2. Limit Work in Process (WIP)

In the technology sector:

Teams must satisfy demands of many stakeholders
Daily work is dominated by the priority du jour
Requests for urgent work come in through every communication mechanism, such as ticketing systems, e-mails, phone calls, chatrooms, and management escalations

The basic idea behind WIP limits is: stop starting, start finishing

Interrupting work is easy because the consequences are invisible to almost everyone, even though the negative impact on productivity may be far greater than in manufacturing.
Because work in the technology value stream is far more cognitively complex, the effects of multitasking on process time are much worse.
Limiting WIP makes it easier to see problems that prevent the completion of work.
Instead of starting new work, a far better action would be to find out what is causing the delay and fix that problem.

Solution

Enforce WIP limits for each column or work centre.
Nothing can be worked on until it is first represented on a work card, reinforcing that all work must be made visible.
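
The two rules above can be sketched in code. This is a minimal illustration in Python, not a description of any particular Kanban tool: the names `Column`, `KanbanBoard` and `WipLimitExceeded`, as well as the example columns and limits, are assumptions made for this sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


class WipLimitExceeded(Exception):
    """Raised when a card would push a column past its WIP limit."""


@dataclass
class Column:
    name: str
    wip_limit: Optional[int] = None  # None means no limit (e.g. "Done")
    cards: List[str] = field(default_factory=list)

    def has_room(self) -> bool:
        return self.wip_limit is None or len(self.cards) < self.wip_limit


class KanbanBoard:
    """Hypothetical board: every piece of work must exist as a card."""

    def __init__(self, columns: List[Column]) -> None:
        self.columns: Dict[str, Column] = {c.name: c for c in columns}

    def add_card(self, card: str, column: str) -> None:
        target = self.columns[column]
        if not target.has_room():
            raise WipLimitExceeded(f"{column} is at its limit of {target.wip_limit}")
        target.cards.append(card)

    def pull(self, card: str, source: str, target: str) -> None:
        src, dst = self.columns[source], self.columns[target]
        if card not in src.cards:
            raise ValueError(f"{card!r} is not in {source}")
        # Check the limit before moving, so a refused pull changes nothing.
        if not dst.has_room():
            raise WipLimitExceeded(f"{target} is at its limit of {dst.wip_limit}")
        src.cards.remove(card)
        dst.cards.append(card)


board = KanbanBoard(
    [Column("To do"), Column("In progress", wip_limit=1), Column("Done")]
)
board.add_card("fix login bug", "To do")
board.add_card("add search", "To do")
board.pull("fix login bug", "To do", "In progress")
```

With a limit of one, pulling `add search` into `In progress` raises `WipLimitExceeded` until `fix login bug` has moved to `Done`: stop starting, start finishing.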

3. Reduce Batch Sizes

Prior to the lean manufacturing revolution, it was common practice to manufacture in large batch sizes (or lot sizes), especially for large operations where job setup or switching between jobs was time-consuming or costly.

When changeover costs were high, as much of the same thing as possible would be produced at a time, creating large batches in order to reduce the number of changeovers.
Large batch sizes result in sky-rocketing levels of WIP and high levels of variability.

Result: long lead times and poor quality. If a problem is found in a single area, the whole batch must be scrapped.

Key lessons from lean:

In order to shrink lead times and increase quality, we must strive to continually shrink batch sizes. The theoretical limit for a batch size is single piece flow, where each operation is performed one unit at a time.
Smaller batch sizes result in less WIP, faster lead times, faster detection of errors, and less rework
The larger the change going into production, the more difficult the production errors are to diagnose and fix, and the longer they take to remediate.
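
The link between WIP and lead time can be made concrete with Little's Law. It is a standard queueing result that the text above does not mention and is brought in here as an assumption: in a stable system, average lead time equals average WIP divided by average throughput. The sketch below also estimates how long a finished item waits when it can only be released together with the rest of its batch. The function names and the one-item-per-day default are illustrative.

```python
def average_lead_time(wip_items: float, throughput_per_day: float) -> float:
    """Little's Law: average lead time (days) = average WIP / average throughput."""
    if throughput_per_day <= 0:
        raise ValueError("throughput must be positive")
    return wip_items / throughput_per_day


def average_wait_for_batch(batch_size: int, days_per_item: float = 1.0) -> float:
    """Average days a finished item waits for the rest of its batch.

    Assumes items are completed one at a time at a fixed pace and the
    whole batch is released only when its last item is done.
    """
    if batch_size < 1:
        raise ValueError("batch size must be at least 1")
    return (batch_size - 1) * days_per_item / 2


# Same throughput, different amounts of work in process.
print(average_lead_time(wip_items=20, throughput_per_day=4))  # 5.0
print(average_lead_time(wip_items=4, throughput_per_day=4))   # 1.0

# Single-piece flow versus a batch of eleven changes.
print(average_wait_for_batch(1))   # 0.0
print(average_wait_for_batch(11))  # 5.0
```

Under these assumptions, cutting WIP from 20 items to 4 at the same throughput cuts average lead time from five days to one, and single-piece flow removes the batch wait entirely.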

4. Reduce the Number of Handoffs

In the technology value stream, whenever we have long deployment times, it is often because there are hundreds of operations required to move our code from version control into the production environment.

Moving code through the value stream requires multiple departments to work on a variety of tasks including:

Functional testing
Integration testing
Environment creation
Server administration
Storage administration
Networking
Load balancing
Information security

Each time work passes from team to team, we require all sorts of communication:

Specifying
Requesting
Signaling
Coordinating
Prioritising
Scheduling
De-conflicting
Testing
Verifying

To accomplish all of the above, we may require:

Different ticketing systems
Different project management systems
Writing technical specification documents
Communications via meetings, emails, or phone calls
Using filesystem shares
FTP servers
Wiki pages

Consequences:

Each of these steps is a potential queue where work will wait when we rely on resources that are shared between different value streams.
Lead times in environments with shared resources are often so long that constant escalation is needed to have work performed within the required timelines.
Some knowledge is lost with each handoff.
The work can completely lose the context of the problem being solved or the organisational goal being supported.

Solution:

Automate significant portions of the work
Reorganize teams so they can deliver value to the customer themselves, instead of having to depend on others.
Increase flow by reducing the amount of time that our work spends waiting in queues.

5. Continually Identify & Elevate our Constraints

To reduce lead times and increase throughput, we need to continually identify our system's constraints and improve their capacity.

"In any value stream, there is always a direction of flow, and there is always one and only one constraint; any improvement not made at that constraint is an illusion"

If we improve a work centre before the constraint, work will pile up at the bottleneck even faster.
If we improve a work centre after the bottleneck, it remains starved, waiting for work to clear the bottleneck.
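
A deliberately simplified model shows why improvements away from the constraint are an illusion. It assumes each work centre has a fixed daily capacity and that work flows through them in sequence, so the system can deliver no more than its slowest centre. The stage names and numbers are made up for illustration.

```python
from typing import Dict


def system_throughput(capacities: Dict[str, int]) -> int:
    """Items per day the whole value stream can deliver: the minimum capacity."""
    return min(capacities.values())


def find_constraint(capacities: Dict[str, int]) -> str:
    """Name of the work centre with the lowest capacity (the bottleneck)."""
    return min(capacities, key=capacities.get)


stages = {"development": 10, "testing": 4, "deployment": 8}
print(find_constraint(stages))    # testing
print(system_throughput(stages))  # 4

# Doubling capacity before the constraint changes nothing downstream.
faster_dev = {**stages, "development": 20}
print(system_throughput(faster_dev))  # 4

# Elevating the constraint itself raises throughput.
faster_testing = {**stages, "testing": 6}
print(system_throughput(faster_testing))  # 6
```

Once testing is raised above 8 items per day in this model, deployment becomes the new constraint, which is why the constraint has to be identified continually rather than once.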

Second Way: Principles of Feedback

The Second Way describes the principles that enable reciprocal fast and constant feedback from right to left at all stages of the value stream.

Goal: create an even safer and more resilient system of work

Fast feedback is especially important in complex systems
We make our work safer by creating fast, frequent, high-quality information flow throughout our value stream and our organization, which includes feedback and feed forward loops.

Result:

We detect and remediate problems while they are smaller, cheaper and easier to fix.
Avert problems before they cause catastrophes.
Create organizational learning that we integrate into future work.

Complex Systems:

Complex systems defy any single person’s ability to see the system as a whole
Complex systems have a high degree of interconnectedness of tightly coupled components.
Doing the same thing twice will not predictably or necessarily lead to the same result.
Failure is inherent and inevitable in complex systems.

Goal: We must design a safe system of work where we can perform work without fear, confident that any errors will be detected quickly, before they cause catastrophic outcomes (e.g. worker injury, product defects, or negative customer impact).

We need the following conditions to make it safer to work in complex systems:

Complex work is managed so that problems in design and operations are revealed
Problems are swarmed and solved, resulting in quick construction of new knowledge
New local knowledge is exploited globally throughout the organisation
Leaders create other leaders who continually grow these types of capabilities.

Safe Systems of Work

We must constantly test our design and operating assumptions.

Goal: increase information flow in our system from as many areas as possible, sooner, faster, cheaper and with as much clarity between cause and effect as possible.

The more assumptions we can invalidate, the faster we can find and fix problems, increasing our agility, our resilience, and our ability to learn and innovate.
We need to create feedback and feed forward loops into our system of work.
Feedback and feed forward loops are a critical part of learning organisations and systems thinking.
When feedback is delayed and infrequent, it is too slow to enable us to prevent undesirable outcomes.

Focus on the following:

Create fast feedback and fast feed forward loops wherever work is performed, at all stages of the value stream, encompassing product management, development, QA, infosec, and operations.

How?

We create automated builds, integration and test processes so that we can immediately detect when a change has been introduced that takes us out of a correctly functioning and deployable state.
Create pervasive telemetry so we can see how all our system components are operating in the production environment and detect when they are not operating as expected. Telemetry allows us to measure whether we are achieving our intended goals and is radiated to the entire value stream so we can see how our actions affect other portions of the system.
Feedback loops enable quick detection of and recovery from problems, and also inform us on how to prevent those problems from recurring.
We must constantly validate that our implementations match our intentions, which is why feedback is critical.
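
As one small illustration of detecting when a component is "not operating as expected", the sketch below flags a telemetry reading that falls far outside its recent history. The three-standard-deviation threshold, the function name `is_anomalous`, and the sample response times are assumptions for this example, not a recommendation from the text above.

```python
from statistics import mean, stdev
from typing import Sequence


def is_anomalous(history: Sequence[float], latest: float, threshold: float = 3.0) -> bool:
    """Return True if `latest` is more than `threshold` standard deviations
    away from the mean of the recent `history`."""
    if len(history) < 2:
        raise ValueError("need at least two historical readings")
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # A perfectly flat history: any change at all is worth a look.
        return latest != mu
    return abs(latest - mu) > threshold * sigma


# Hypothetical response times in milliseconds from the last five checks.
response_times_ms = [100, 102, 98, 101, 99]

print(is_anomalous(response_times_ms, 103))  # False
print(is_anomalous(response_times_ms, 150))  # True
```

A check like this only helps if it runs continuously and its result is visible to the whole value stream, so that whoever made the change sees the effect quickly.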

Swarm & Solve Problems to build new knowledge

Goal: Contain problems before they have a chance to spread, and diagnose and treat them so that they cannot recur.

Andon cord (swarming):

When the cord is pulled, the team leader is alerted and immediately works to resolve the problem. If the problem cannot be solved, the production line is stopped so that the entire organization can be mobilized to assist until a resolution is found and a countermeasure has been developed.

Benefits of swarming:

Prevents the problem from progressing downstream, where the cost and effort to repair it increase exponentially and technical debt is allowed to accumulate.
Prevents the work center from starting new work, which will likely introduce new errors into the system.
If the problem is not addressed, the work center could have the same problem in the next operation, requiring even more fixes and work.
Swarming enables learning.
It prevents the loss of critical information due to fading memories or changing circumstances. As time passes, it becomes impossible to reconstruct exactly what was going on when the problem occurred.
Swarming is part of the disciplined cycle of real-time problem recognition, diagnosis, and treatment.
Swarming lets us discover problems ever earlier in the life cycle, so that we can deflect them before a catastrophe occurs.

What we need to do:

Create the equivalent of an andon cord and a related swarming response.
Create a culture that makes it safe, and even encouraged, to pull the andon cord when something goes wrong, whether in production or earlier in the value stream.
Make it possible to quickly isolate and diagnose the problem, and prevent further complicating factors that can obscure cause and effect.
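
One possible software equivalent of an andon cord is a pipeline that halts at the first failing stage, so that nothing downstream runs until the problem has been dealt with. The sketch below is an assumption about how that could look, not a description of any specific CI tool. The stage names, `run_pipeline` and the `record` helper are hypothetical.

```python
from typing import Callable, List, Optional, Tuple

Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> Tuple[List[str], Optional[str]]:
    """Run stages in order and stop at the first failure.

    Returns the names of the stages that passed and the name of the
    failing stage (or None if every stage passed).
    """
    completed: List[str] = []
    for name, check in stages:
        if not check():
            # The cord is pulled: no downstream stage is started.
            return completed, name
        completed.append(name)
    return completed, None


calls: List[str] = []


def record(name: str, result: bool) -> Callable[[], bool]:
    """Build a fake stage that logs its own execution (illustration only)."""
    def check() -> bool:
        calls.append(name)
        return result
    return check


stages = [
    ("build", record("build", True)),
    ("unit tests", record("unit tests", False)),
    ("deploy", record("deploy", True)),
]
completed, failed = run_pipeline(stages)
print(completed, failed)  # ['build'] unit tests
print(calls)              # ['build', 'unit tests']
```

Because `deploy` never runs, the failure stays close to the change that caused it, which keeps cause and effect easy to connect while the team swarms on the problem.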

Keep pushing quality closer to the source

Problem: In complex systems, adding more inspection steps and approval processes actually increases the likelihood of future failures. The effectiveness of approval processes decreases as we push decision-making further away from where the work is performed.

Result:

Increases cycle time
Reduces the strength of the feedback between cause and effect
Reduces our ability to learn from failures

Please note: When top-down bureaucratic command-and-control systems become ineffective, it is usually because the variance between “who should do something” and “who is actually doing something” is too large, due to insufficient clarity and timeliness.

Examples of ineffective quality controls:

Requiring another team to complete tedious, error-prone, and manual tasks that could be easily automated and run as needed by the team who needs the work
Requiring approvals from busy people who are distant from the work, forcing them to make decisions without adequate knowledge of the work or the potential implications, or to merely rubber stamp their approvals
Creating large amounts of documentation that quickly become outdated
Pushing large batches of work to teams and special committees for approval and processing, and then waiting for responses

What we need to do:

We need everyone in our value stream to find and fix problems in their area of control as part of their daily work.
We need to push quality and safety responsibilities and decision-making to where the work is performed, instead of relying on approvals from distant executives.
We need to use peer reviews of our proposed changes to gain whatever assurance is needed that our changes will operate as designed.
We need to automate as many of the quality checks performed by QA or InfoSec departments as possible.
We need tests to be performed on demand so that developers can quickly test their own code and even deploy those changes directly into production.
We need to make quality everyone’s responsibility.
We need to accelerate learning.
We need to provide feedback as quickly as possible.
We need developers to share responsibility for the quality of the systems they build.

Optimise for downstream work centers

Lean defines two customers:

External customers (who pay for the product)
Internal customers (who receive and process our work immediately after us)

According to lean, our most important customer is our next step downstream. We must empathize with their problems in order to better identify the design problems that prevent fast and smooth flow.

In the technology value stream, we optimize for downstream work centers by designing for operations, where non-functional requirements such as the following are prioritized as highly as user features:

Architecture
Performance
Stability
Configurability
Security
Testability

Target goal: create quality at the source

Third Way: Principles of Continual Learning & Experimentation

These are the principles that enable constant creation of individual knowledge, which is then turned into team and organizational knowledge.

In the technology value stream our goals are the following:

Create a high-trust culture
Know that we are all lifelong learners who must take risks in our daily work.
Apply a scientific approach to both process improvement and product development.
Learn from success and failure.
Identify which ideas don’t work and reinforce those that do.
Local learnings are turned into global improvements.
Enable new techniques that can be used by the entire organization.
Reserve time for the improvement of daily work.
Introduce stress into our systems to force continual improvements.
Simulate and inject failures in our production services under controlled conditions to increase our resilience.
By creating a continual and dynamic system of learning, we enable teams to rapidly and automatically adapt to an ever-changing environment.
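
Injecting failures under controlled conditions can start very small. The sketch below wraps a function so that it fails at a configurable rate, then checks how a simple retrying caller copes. Everything here (`with_fault_injection`, `call_with_retries`, `InjectedFailure`, the `fetch_price` stand-in) is hypothetical and only meant to show the idea, not how any real fault-injection tool works.

```python
import random
from typing import Callable, TypeVar

T = TypeVar("T")


class InjectedFailure(Exception):
    """A failure introduced on purpose by the experiment."""


def with_fault_injection(
    func: Callable[[], T], failure_rate: float, rng: random.Random
) -> Callable[[], T]:
    """Wrap `func` so that it raises InjectedFailure with probability `failure_rate`."""
    def wrapper() -> T:
        if rng.random() < failure_rate:
            raise InjectedFailure("simulated outage")
        return func()
    return wrapper


def call_with_retries(func: Callable[[], T], attempts: int = 3) -> T:
    """Call `func`, retrying on injected failures up to `attempts` times."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for remaining in range(attempts - 1, -1, -1):
        try:
            return func()
        except InjectedFailure:
            if remaining == 0:
                raise
    raise AssertionError("unreachable")


def fetch_price() -> int:
    """Stand-in for a call to a production service."""
    return 42


healthy = with_fault_injection(fetch_price, failure_rate=0.0, rng=random.Random(0))
broken = with_fault_injection(fetch_price, failure_rate=1.0, rng=random.Random(0))

print(call_with_retries(healthy))  # 42
```

Calling `call_with_retries(broken)` exhausts its attempts and raises `InjectedFailure`, which is the kind of weakness such an experiment is meant to surface before a real outage does.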

Sufi Mohamed
