Scaling: Local vs Full Vertical scaling

It’s funny, isn’t it? Everybody is still talking about ‘scaling agile’. A whole industry has been created on the premise that large companies need process structures to help them manage pushing very large projects through huge sets of development teams.

Luckily, the DevOps movement (and the continuous delivery movement, I’m not sure they’re really separate) happened early into that process, so in most of these large scale processes there’s at the very least lip service to the idea that quality needs to be high, and delivery needs to be automated, even if they don’t aim for continuous.

Unfortunately, most of the work on the other side of the workflow was not included. Even though a number of initiatives have started in the last five years to bring product, marketing and UX people more into the fold to really include the customer into the process, this is still a very rare occurrence.

This isn’t all that surprising, I suppose. The longer-term planning and significant coordination overhead is familiar and comforting. And the large organisations that start using LeSS, SAFe, etc., will actually be better off than they were before. They’ll deliver more predictably, they’ll get into production quicker and they will feel more in control. Not bad, actually.

But even though their products will be in the hands of customers somewhat faster, the feedback loops from their customers will still be too slow. And the people that should be the link to the customers, the people in Marketing, Sales and Product Management, are way too far from the action. Too far to get fast feedback from the customer directly, but also too far from development to start using all the possibilities of technology to get more information on that customer and their preferences.

We scale our development organizations but ignore the capacity of its customers to make use of it

We scale our development organizations but ignore the capacity of its customers to make use of it

The solution is, predictably, to bring those roles into the fold and create a combined team where marketeers, product manager, software developers, ux specialist, testers and operations people work together to deliver on business goals. We have already seen the game changing effects that occur when software developers, testers and operations truly combine their knowledge and efforts. This next step will have more impact, simply because the knowledge that is being combined in the team is much broader, and more directly linked to the customer.

There are new issues to resolve when we do this. Coordination on a product level needs to be done in a different way. And coordination on the level of the different areas of expertise also needs to be done in a different way. Much more thought needs to be put into matters of vision and strategy, and how those have to be communicated to the teams that are formed. Into how this translates into rewarding people. But we’ve solved those kinds of issues before, and we can tackle them again.

Stop trying to scale agile. Scale your organisation.

The three failures of Continuous Delivery

Everyone seems to want to get on the Continuous Delivery train. Rightfully so, I think. For most, though, it’s not an easy ride. From my work with client and conversations with other coaches there’s a few common barriers to adoption.

In the end, the goal should be to be able to react faster to the market. And, to be honest, to finally be in actual control. But in business terms, it’s about cycle times. That’s what allows you to not just react quickly to market circumstances, but to actively probe markets and test product ideas.

So, as I mentioned, there’s a few common problems companies run into. First, just the basic technical steps to create a fully automated pipeline. Then, getting the tests sorted to a level that gives enough confidence to deploy to production whenever they’ve run. Only when those technical matters have been sorted do we get to the more interesting issues of allowing the business to make use of the possibilities offered by the newfound agility. Those have their own challenges.

Let’s have a look at the the ways these particular subject give teams trouble. In the hope that being forewarned some will be able to avoid them. I’ll go into more detail on how to avoid them in subsequent posts.

Get a pipeline

Now, if you’ve paid any attention to the literature, you know that at the core, CD is all about important things like process and a culture of quality. Which is all true, but that probably won’t help you very much. Most development organisations have spent years wrapping themselves in workarounds and buffers all painstakingly created to prevent detection of their real problems. So taking a relatively small, technical, step in setting up a delivery pipeline at least seems somewhat feasible and will by its nature start showing where some of the real problems lie.

A Delivery Pipeline

From what I’ve seen, just trying to set up that pipeline is trouble enough. That’s why I’ve put it as the first barrier to adoption of CD. It may seem easy, but there turn out to be many basic technical challenges. Most teams go through those same pains, and it’s not really surprising. There’s quite a bit of (often new) knowledge and skills involved. And teams usually have to deal with all kinds of legacy code and infrastructure, which doesn’t make it any easier.

Mostly, what companies find here is that they are missing is skills. And there are a lot of skills involved! A real DevOps approach should include operations knowledge in a team, but even then most of the skills needed to create a modern, fully automated infrastructure are something that takes most organisations a long time to develop.
It’s not that these things are beyond those teams, it’s just that they’ve not had to deal with them before. Sure, it is easy enough to package your application in a docker container and run it locally, but people are discovering it is quite a different thing to build it out further than that.

Testing

Testing is the achilles heel of many development teams. Most agile teams work hard to get and keep their code under test. Many fail. The advantage that Continuous Delivery has is that it sets explicit expectations on quality. There’s really no room to skimp on testing if every push you do should end up on production.
Like was the case for Continuous Integration, testing is what makes a Delivery Pipeline useful. It’s great if you have fully automated deployment, but if you have no way to determine if the code you’re building can be trusted, you’ll still not be in production any sooner.
There’s different ways teams fail with testing. Insufficient unit testing. Too limited protocol and service testing. A reliance on slow and brittle end-to-end testing. Skipping manual / exploratory testing, that may no longer be a gateway before going into production but is still very much necessary.

Business

Organisations that manage to get past the first two hurdles have at their disposal a tool that can bring them unimagined business advantages. But even having come this far, existing silo’s, processes and political positioning prevent organisations from profiting from their newly found technical capabilities.

Symptoms of this can be found in the ignoring, or even complete lack, of market data in deciding on new products and functionality. In continuing a practice of long term planning, without built in checks to see if the intended goals are being achieved. In basing priorities on political influence instead of business goals. And even in a reluctance to release new features to users even once they’re available behind a feature toggle in production.

These issues can be the most difficult to address and need to be picked up at the highest management levels. They are attacked with changes in goal setting, reward systems, and organisational structure.

Interlocking pieces

As with any process, these different elements cannot exist for long without the others to support them. Testing withers if it cannot be run quickly and frequently enough. A delivery pipeline has little value if you have no way to know if you can trust the code that it’s building. And a highly evolved technical team that is not clearly and directly involved with business goals and customers will easily find more fulfilling work elsewhere.

That’s why my advice is to start in this order, picking up the next challenge as soon as there’s clear progress on the previous. You start building technical skills and then use that base as a flywheel to get a change in the rest of the company going.

Don’t Refactor. Rebuild. Kinda.

I recently had the chance to speak at the wonderful Lean Agile Scotland conference. The conference had a very wide range of subjects being discussed on an amazingly high level: complexity theory, lean thinking, agile methods, and even technical practices!

I followed a great presentation by Steve Smith on how the popularity of feature branching strategies make Continuous Integration difficult to impossible. I couldn’t have asked for a better lead in for my own talk.

Which is about giving up and starting over. Kinda.

Learning environments

Why? Because, when you really get down to it, refactoring an old piece of junk, sorry, legacy code, is bloody difficult!

Sure, if you give me a few experienced XP guys, or ‘software craftsmen’, and let us at it, we’ll get it done. But I don’t usually have that luxury. And most organisations don’t.

When you have a team that is new to the agile development practices, like TDD, refactoring, clean code, etc. then learning that stuff in the context of a big ball of mud is really hard.

You see, when people start to learn about something like TDD, they do some exercises, read a book, maybe even get a training. They’ll see this kind of code:

Example code from Kent Beck's book: "Test Drive Developmen: By Example"

Example code from Kent Beck’s book: “Test Drive Development: By Example”

Then they get back to work, and are on their own again, and they’re confronted with something like this:

Code Sample from my post "Code Cleaning: A refactoring example in 50 easy steps"

Code Sample from my post “Code Cleaning: A refactoring example in 50 easy steps”

And then, when they say that TDD doesn’t work, or that agile won’t work in their ‘real world’ situation we say they didn’t try hard enough. In these circumstances it is very hard to succeed. 

So how can we deal with situations like this? As I mentioned above, an influx of experienced developers that know how to get a legacy system under control is wonderful, but not very likely. Developers that haven’t done that sort of thing before really will need time to gain the necessary skills, and that needs to be done in a more controlled, or controllable, environment. Like a new codebase, started from scratch.

Easy now, I understand your reluctance! Throwing away everything you’ve built and starting over is pretty much the reverse of the advice we normally give.

Let me explain using an example.

Contine reading

Extending the Goal in Scrum

In his post “The Goal in Scrum“, Ron Jeffries makes the case for having a proper, higher-level-than-stories, Sprint Goal. As he says:

This is better, because it allows the wisdom and knowledge of the team to be fully exercised, and because it keeps focus on “what” is needed more than on just how it is to be done.

The point is well made, and true. Many Scrum teams would be much better off when adopting this practice. If you haven’t read the article yet, please do so now. It’s short and to the point, I’ll wait right here.

I think there are further steps beyond the point that Ron describes, that a good Agile organisation should aspire to. And that help get closer to the XP idea of an on-site customer.

For an example, let’s take the same team that Ron is talking about, working on some web-shop like domain. I’ll take a point in time a little further out than Ron did. They already learned his lesson, after all. And having done that they have a nice web shop running, with a working checkout flow, and even a wish-list.

The shop has a reasonable number of visitors, and sells enough to keep everyone employed. But though new functionality is built regularly, growth in terms of revenue is very uneven and not clearly linked to the efforts of the development team. This worries the CEO. He even considers whether changes in the team (bigger/smaller?) are necessary. The PO advises a more considered approach. He goes to the team and tells them about the issue:

“It seems our work sometimes helps us make money, but other times has no effect at all!”

The team has a nice, long, retro discussion about this. They remind the PO that they sometimes have raised questions on the practical use of some of the things they were building. He reminds them that those same things sometimes turned out to work well. And sometimes not. They realise they are missing an important feedback cycle.

Step one: Sprint Goal as a Business Test

The team is a very competent XP team, and knows that the best way to develop is to pull your assumptions forward. Test first. And change direction if the results tell you it’s not working. They agree with the PO to take a similar approach to the Sprint Goal: Describe the Goal as a test. Not a Unit Test. Not an Acceptance Test. Maybe a Business Test? One of the members talks about hypotheses but is voted down because the international team knows they’ll fail pronouncing that.

So for the next sprint, the PO and the rest of the team discuss what the Goal should be. The PO tells them that it seems many people put items in their shopping cart, and even go to checkout, but then stop and never go to payment. They agree that the goal should be to find out how to improve the conversion in that part of their sales funnel.

Step two: Information Radiator for Business Goal

The first story they agree on is to create a dashboard for the team to see this particular funnel. Easily done with their existing analytics software, but the team hasn’t been looking at that until now.

Step three: Generate ideas that could influence Business Goal

Overly simplified dashboard

Overly simplified dashboard

Then they think of all the reasons why they think someone would stop at that point. Could it be that the total amount frightens them? Should that be in the short view of the shopping cart on the main page? Is it the account creation that stops people going forward? Or the selection of the payment method? One of the team thinks the absence of PayPal as an option could be the problem. They decide they don’t know. And decide to find out.

Step four: Verify ideas

The other stories they create are small changes. And as part of those stories they encode decisions. Decisions that will result in more stories. Or will result in quickly deleting the just built functionality.

One example is the amount: they make a change in the shopping cart view on the main page that shows the approximate amount. The amount will be calculated client side, not taking into account tax and such which would require much more work. And they build it so that about 20% of their users get this new version while the rest get the old one. And compare the results. They agree up-front that only when this has more than 1% effect in the conversion they will build a more capable version of the feature in the next sprint.

The team member that likes PayPal gets a go too: let’s just put a ‘Pay with PayPal’ button on there, and see how often its pressed. Again shown to only a small subset of users. And again, only if it results into an increase of 1% or higher, they will build the PayPal integration.

Step five: Build feature

Based on the results of their experiments, which were very easy and quick to build, they create further stories for the backlog. Depending on how much time they had to spent, some of those stories could even be added to the current sprint. They’re proven to be supportive of the Goal. But if that doesn’t happen, it could also be fine to plan them in the next sprint, or later. At least the business value of those stories are very well defined.

Report

The PO is exited, but also worried a little. They’ll be building partial solutions. He is used to reporting on completed features to management. He works with the Scrum Master and one of the developers on the type of data they’ll be deciding on and creating a good report on them. Then he goes to discuss those reports with management. Management likes the figures, but would like to add a few forecasts on how these figures influence the revenue figures. That is quite easy to do, using the conversion figures along with average order size, and pretty soon they have a report that everyone is happy with.

The PO now reports on which parts of the sales funnel they’ve worked on, what ideas they testes, which worked and which didn’t, and how they are influencing revenue. Because they employ small experiments, they don’t spend much on the ideas that don’t work. And the report makes very clear that the increases in revenue that occur are significantly less than the stable costs of the team, even if the difference isn’t constant.

TL:DR

Defining your Sprint Goal in measurable business terms (such as Pirate Metrics for a web shop) gives more transparency and closer integration between development teams and their stakeholders.

From Here to Continuous Delivery

Situation Normal

There’s a clear pattern for software development. A pattern of lost opportunity.

In most, if not all, places where I’m called in the base question deals with the inability to deliver. Management sees that the plans they have are simply not going to be realised.

Business opportunities are lost waiting. Waiting for the next available spot in the product roadmap. Waiting for the development team to finish ‘stabilizing’ the system. Waiting for lengthy ‘refactoring’ phase to complete. Waiting for new servers to be delivered (in only six weeks!). Waiting for a PMO organisation to complete project initiation and relative priority. Waiting for development to complete coding. Waiting for a testing phase to complete. Waiting for management to analyse long lists of known issues and risks so they can decide whether a release is possible.

Anyway, there’s waiting involved.

As with any interesting problem, this one doesn’t have a single identifiable cause. The marketing department will blame the development team for being too slow. The development team will blame the marketing team for not knowing what they want. The development manager will blame his team for being slow and writing buggy code and all his stakeholders for not being realistic. Upper management will blame the marketing and development managers for being slow and not delivering.

They are, all of them, right.

And you can’t point a finger at one root cause. Yes, development messed up and wrote crappy, unmaintainable code. Yes, the business focused too much on short term gains and put too much pressure on development to deliver early. Yes, management should have focused on mission and strategy so the rest of the company could have managed scope. Yes, all were too hasty hiring new people when the pressure was on, and too much incompetence entered the company.

I’m willing to bet you’ve heard most of those complaints, made a few of them, and been the subject of others.

How do we get out of this vicious circle?

Step one: Fix execution

Stop. You’re trying to do too many things at once. The first thing that needs to be done is to get your technical house in order. As long as you can’t deliver a working, tested, system at the drop of a hat, you’ll always be too slow.

So now, immediately, start changing your technical practices to support better quality. Deploy automatically, Test automatically, test everything, and test in all manner of ways. Change your architecture to support quick change and better practices. And do all that while still delivering value.

That sounds difficult. It is. But it’s possible. I’ve done it. Others have.

You will slow down a little, initially. But you can use architectural changes, such as a Strangler Pattern or Branch by Abstraction, to quickly start over without throwing all your existing systems away.

And focus on quality. This is hard. Management needs to be extremely explicit in this. Technical teams are used to a focus on progress and speed, and will automatically revert to those and subvert the quality of your system in any case of perceived pressure.

Focus on quality

Add all the elements that give you more control over your systems. That means fully automated deployment, include the automated testing that you need to be confident enough to have every push of a developer going to production. Then make sure that actually happens.

Introduce feature toggles, so that your decision to supply a new feature to end-users becomes exactly that: a decision. And not related to your release cycle.

Measure everything.

Measure the results of new functionality on your business. Whether it’s in use of features for an internal system, or all your Pirate Metrics for your product website.

Step two: Fix alignment

Now that you’re able to deliver, it’s time to start making use of the opportunities that gives you.

You already know, now, how to measure the effects new features have on the use of your product. For some, that is already a direct link to the money being made by the product. For the funnel on a product website or web-shop, we can calculate the revenue increase (or decrease) from a change. Other types of applications need a little more work and imagination, but we can certainly get to some measure of value.

If you can’t measure the effects of your work, you can be sure you are not doing the right things.

But that is all still a bottom-up approach. Effective for short term goals, but potentially dangerous if the metrics we use aren’t aligned with longer term business goals.

im_example

If you have your mission and vision defined for you company, and there’s a strategy that you expect will bring you there, you should now spend some time to hammering out a small set of actionable metrics that we can use to prioritize our opportunities on a day-to-day basis. The post linked to above shows a way to determine that, based on Gojko Adzic’s excellent Impact Mapping.

You pick a very few metrics. In fact, you should aim for that One Metric That Matters. This is your compass, steering the whole of the company. Be careful that you don’t have other metrics hidden that undermine this, for instance in a target/bonus system.

dashboard-metrics

This OMTM should be permanently visible for everyone. It should be continuously, and automatically measured and updated. And it should be directly coupled to the various day-to-day activities of everyone in your company.

On the level of product development, all your priorities should be determined by the impact on your OMTM.

For most companies, this level of focus and clarity of purpose is far off. It requires clear vision and leadership. And will transform your organisation.

When are you getting started?

Everybody need somebody

On occasion, I like to listen to podcasts. Some of the most interesting can be those that are from outside of the software industry. This week I was listening to Robb Wolf’s podcast, where he hosted guest David Werner. Robb talks mostly about diet, metabolism and exercise, and this episode was focused on that last one. Both Robb and David are coaches. In the sports sense of the word: they own gyms, and teach people how to exercise both for general health and to improve performance in some sports endeavor.

Listening to people who are experts in their area is always a joy. Because learning by osmosis is fun. Because listening to people talk at a higher level of experience then you can helps you find out what is really important in an area (well, sometimes…). A joy. And, remarkably, it’s also a joy to find how people in completely different lines of work have found ways of working and thinking that so resemble things in my own area of work.

So it was nice to hear David Werner talking extensively about improving in small steps. About the danger (in physical training) of taking too big a step, and having related smaller goals that won’t over-strain you current capacity. And about how often people don’t do this, and try to do pull-ups while they’re not even able to do a proper push-up, damaging their shoulders in the process. The fact that I’m still recovering from my own shoulder injury due to over-straining has only marginal influence on that.

drop down and give my twenty! (well, if you can. Otherwise 3?)

drop down and give me twenty! (well, if you can. Otherwise 3?)

David went on to describe that based on that experience, he was building his new website in the same manner. He even mentioned that there was some Japanese word that is sometimes used for that. Kai-something?

Another piece of cross-industry wisdom is their discussion on how everybody, no matter how experienced, needs a coach. Robb joining David’s training helped him find areas where he could improve his fitness that he hadn’t found himself. I guess that the more of an expert you are in an area, the more expert your coach would need to be, but having an outside view of what your doing is the very best way to get better of what you do.

Everybody needs a coach

As a coach, of consultant, or whatever you want to call it, it’s sometimes hard to get this kind of feedback. That’s why initiatives such as Yves’ Pair Coaching, of one of the Agile Coach camps are very valuable. And why we like to go to all those conferences. But you can find opportunities in your everyday work as well, just by explicitly looking for it.

Scaling Agile with Set-Based Design

I wrote a while back about set-based design, and just recently about a way to frame scaling Agile as a mostly technical consideration. In this post I want to continue with those themes, combining them in a model for scaled agile for production and research.

Scale

In the previous post, we found that we can view scale as a function of the possibilities for functional decomposition, facilitated by a strong focus on communication through code (customer tests, developer tests, simple design, etc.)

This will result in a situation where we have different teams working on different feature-areas of a product. In many cases there will be multiple teams working within one feature area, which can again be facilitated through application of well known design principles, and shared code ownership.

None of this is very new, and can be put squarely in the corner of the Feature Team way of working. It’s distinguished mainly by a strong focus on communication at the technical level, and using all the tools we have available for that this can scale quite well.

set_based_design_image_1

Innovation

The whole thing starts getting interesting when we combine this sort of set-up with the ideas from set-based thinking to allow multiple teams to provide separate implementations of a given feature that we’d like to have. One could be working on a minimum viable version of the feature, ensuring we have a version that we can get in production as quickly as possible. Another team could be working on another version, that provides many more advantages but also has more risk due to unknown technologies, necessary outside contact, etc.

set_based_design_image_2

This parallel view on distributing risk and innovation has many advantages over a more serial approach. It allows for an optimal use of a large development organization, with high priority items not just picked up first, but with multiple paths being worked on simultaneously to limit risk and optimize value delivered.

Again, though, this is only possible if the technical design of the system allows it. To effectively work like this we need loosely coupled systems, and agreed upon APIs. We need feature toggles. We need easy, automated deployment to test the different options separately.

Pushing Innovation Down

But even with all this, we still have an obvious bottleneck in communication between the business and the development teams. We are also limiting the potential contributors to innovation by the top-down structure of product owner filling a product backlog.

Even most agile projects have a fairly linear look on features and priorities. Working from a story map is a good first step in getting away from that. But to really start reaping the benefits of your organisation’s capacity for innovation, one has to take a step back and let go of some control.

The way to do that is by making very clear what the goals for the organisation are, and for larger organisations what the goals for the product/project are. Make those goals measurable, and find a way to measure frequently. Then we can get to the situation below, where teams define their own features, work on them, and verify themselves whether those features indeed support the stated goals. (see also ‘Actionable Metrics at Organisational Scale‘, and ‘On Effect Mapping and Pirate Metrics‘)

set_based_design_image_3This requires, on top of all the technical supporting practices already mentioned, that the knowledge of the business and the contact with the user/customer is embedded within the team. For larger audiences, validation of the hypothesis (that this particular, minimum viable, feature indeed serves the stated goals), will need to be A/B tested. That requires a yet more advanced infrastructural setup.

All this ties together into the type of network organisations that we’ve discussed before. And this requires a lot of technical and business discipline. No one ever said it was going to be easy.

The ‘Just Do It’ Approach To Change Management

Last Friday I gave a talk at the Dare 2013 conference in Antwerp. The talk was about the experiences I and my colleague Ciarán ÓNeíll have had in a recent project, in which we found that sometimes a very directive, Just Do It approach will actually be the best way to get people in an agile mindset.

Update: The full video of this talk as given on ‘Agile on the Beach’ is available on youtube.

This was surprising to us, to say the least, and so we’ve tried to find some theory supporting our experiences. And though theory is not the focus of this story, it helps if we set the scene by referencing two bits of theory that we think fits our experience.

Just Do It

A long time ago, in a country far away, there was this psychologist called William James, who wrote:

“If you want a quality, act as if you already have it.” – William James (1842-1910)

We often say that if you want to change your behaviour, you need to change your mind, be disciplined, etc. But this principle tells us that it works the other way around as well: if you change your behaviour this can change your thinking. Or mindset, perhaps?

For more about the ‘As If’ Principle, see the book by Richard Wiseman

Another piece of theory that is related is complexity thinking as embodied by the Cynefin framework. Cynefin talks about taking different actions when managing situations that are in different domains: simple, complicated, complex or chaos.

Cynefin Framework

The project

And in chaos, our story begins.

This particular project was a development project for a large insurance company. The project had already been active for over half a year when we joined. It was a bad case of waterfall, with unclear requirements, lots of silo’s, lots of finger pointing and no progress.

The customer got tired of this, and got in a high-powered project manager who was given far reaching mandate to get the project going. (ie. no guarantees, just get *something* done) This guy decided that he’d heard good things about this ‘Agile’ thing, and that it might be appropriate here as a risk-management tool. Which was where we came in.

And this wasn’t the usual agile transition, with its mix of proponents and reluctants, where you coach and teach, but also have to sell the process to large extend.

Here, everyone was external (to the customer), no-one wanted Agile, or had much experience with it, but the customer was demanding it! And taking full responsibility for delivery, switching the project to a time-and-material basis for the external parties.

A whole new ballgame.

Initial actions

We started out by getting everyone involved local. Up to then, people from four different vendors been in different locations, in different countries even. Roughly 60 people in all, we all worked from the office in Amsterdam. Most of these people had never met or even spoken!

We started with implementing a fairly standard Scrum process.

Step one was requiring multi-functional teams, mixing the vendors. This was tolerated. Mostly, I think, because people thought they could ignore it. Then we explained the other requirements. One week sprints, small stories (<2 / 3 days), grooming, planning, demo, retro. These things were all, in turn, declared completely impossible and certainly in our circumstances unworkable. But the customer demanded it, so they tried. And at the end of the first week, we had our first (weak) demo.

So, we started with basic Scrum. The difference was in the way this was sold to the teams. Or wasn’t.

That is not to say that we didn’t explain the reasons behind the way of working, or had discussions about its merit. It’s just that in the end, there was no option of not doing it.

And… It worked!

The big surprise to us was how well this worked. People adjusted quickly, got to work, and started delivering working software almost immediately. Every new practice we introduced, starting with testing within the sprint, met with some resistance, and within 4 to 6 weeks was considered normal.

After a while we noticed that our retrospectives changed from simply complaining about the process to open discussion about impediments and valuable input for improvements generated by our teams.

And that’s what we do all this for, right? The continuous improvement mindset? Scrum, after all, is supposed to surface the real problems.

Well. It sure did.

Automated testing

One of those problems was one which you will be familiar with. If you’ve been delivering software weekly for a while, testing manually won’t keep up. And so we got more and more quality issues.

We had been expecting this, and we had our answer ready. And since we’d had great success so far in our top-down approach, we didn’t hesitate much, and we started asking for automated testing.

Adoption

Resistance here was very high. Much more so than for other changes. Impossible! But we’d heard all those arguments before, and why would this situation be any different? We set down the rules: every story is tested, tests are automated, all this happens within the sprint.

the-princess-bride-inconceivable

And sure enough, after a couple of sprints, we started seeing automated tests in the sprint, and a hit in velocity recovered to almost the level we had had before.

See. It’s Simple! Just F-ing Do It!

Limitations

Then after another 3-4 sprints, it all fell apart.

Tests were failing frequently, were only built against the UI, had lots of technical shortcomings. And tests were built within the team, but still in isolation: a ‘test automation’ person built them, and even those were decidedly unconvinced they were doing the right thing.

In the end, it took us another 6 months to dig our way out of this hole. This took much coaching, getting extra expertise in, pairing, teaching. Only then did we arrive at the stop-the-line mindset about our tests that we needed.

Even with all of that going on, though we were actually delivering working software.

And we were doing that, much quicker than expected. After the initial delays in the project, the customer hadn’t expected to start using the system until… well, about now, I think. But instead we had a (very) minimal, but viable product in time for calculating the 2012 year-end figures. And while we were at it, since we could roll-out new environments at a whim (well… almost:-) due to our efforts in the area of Continuous Delivery, we could also do a re-calculation of the 2011 figures.

These new calculations enabled the company to free a lot of money, so business wise there’s no doubt this was the right thing to do.

But it also meant that, suddenly, we were in production, and we weren’t really prepared to deliver support for that. Well, we really weren’t prepared!

Kanban

And that brings us to one of the most invasive changes we did during the project. After about 5 months, we moved away from Scrum and switched to Kanban.

Just Do It

At that time I was the scrum master of one of the teams, the one doing all the operations work. And our changes in priority were coming very fast, with many requests for support of production. In our retros, the team were stating that they were at the same time feeling that nothing was getting done (our velocity was 0), and they felt stressed (overtime was happening). Not a good combination. This went on for a few sprints, and then we declared Kanban.

That’s not the way one usually introduces Kanban. Which is carefully, evolutionary, keeping everyone involved, not changing the process but just visualising it. You guys know how that’s supposed to be done right?

This was more along the lines: “Hey, if you can’t keep priorities stable for a week, we can’t plan. So we won’t.”

Of course, we did a little more than that. We carefully looked at the type of issues we had, and the people available to work on them. We based some initial WIP limits on that, as well as a number of classes of service. And we put in some very basic explicit policies. No interruptions, except in case of expedite items. If we start something, we finish it. No breaking of WIP limits. And no days longer than 8 hours.

Adoption

That brought a lot of rest to the team. And immediately showed better production. It also made the work being done much more transparent for the PO.

It worked well enough, that another team that was also experiencing issues with the planning horizon also opted to ‘go Kanban’. Later the rest of the teams followed, including the PO team.

Limitations

That is not to say there was no resistance to this change. The Product Owners in particular felt uncomfortable with it for quite some time. The teams also raised issues. All that generated many of those nice incremental, evolutionary changes. And still does. The mindset of changing your process to improve things has really taken root.

The most remarkable thing, though, about all that initial resistance was the direction. It was all about moving back to the familiar safety of… Scrum!

Wrap-up

I’d like to tell you more but this post is getting long enough already. I don’t have time to talk about our adventures with going from many POs to one, introducing Specification by Example, moving to feature teams, or our kanban ready board.

I do feel I need to leave you with some comforting words, though. Because parts of this story go against the normal grain of Agile values.

Directive leadership, instead of Servant Leadership? Top-Down change, instead of bottom-up support? Certainly more of a dose of Theory X than I can normally stomach!

And to see all of that work, and work quite well, is a little disconcerting. Yes, Cynefin says that decisive action is appropriate in some domains, but not quite in the same way.

And overcoming the familiar ‘That won’t work in our situation’ resistance by making people try it is certainly satisfying, but we’ve also seen that fail quite disastrously where deep skills are required. That needs guidance: Still no silver bullets.

Enlightened Despotism is a perhaps dangerous tool. But what if it is the tool that instills the habits of Agile thinking? The tool that forcibly shakes people out of their old habits? That makes the despot obsolete?

Practice can lead to mindset. The trick is in where to guide closely, and when to let go.

Conway’s Organizational Structure Heuristic

organizations which design systems … are constrained to produce designs which are copies of the communication structures of these organizations. — Melvin Conway

We often run into examples of Conway’s Law in organizations where silo-ed departments prompt architectural choices that are not supportive of good software design. The multi-functional nature of Agile teams is one way to prevent this from happening. But why not turn that around? If we know that organizational structure influences our software design, shouldn’t we let the rules of good software design inform our organizational structure?

Good software design leads to High Cohesion and Loose Coupling. What happens when we apply those principles to organizational design, and in particular to software teams? High Cohesion is about grouping together the parts that work on the same functionality, or need frequent communication. Multi-functional teams do just that, of course, so there we have an easy way of achieving effective organizational design.

Loose Coupling can be a little more involved to map. One way to look at that is that when communication between different teams is necessary, it should be along well-defined rules. That could be rules described in programming APIs when talking about development. Or rules as in well defined pull situations in a Lean process. Or simply the definition of specific tasks for which an organization has central staff available, such as pay-roll, HR, etc.

In general, though, the principles make it very simple: make sure all relevant decisions in day-to-day work can be made in the team context, with only in exceptional situations a need to find information or authorization in the rest of the organization.

Scaling Agile?

There’s a lot of discussion in the Agile community on the matter of scaling agile. Should we all adopt Dean Leffingwell’s Scaled Agile Framework? Do the Spotify tribe/squad thing? Or just roll our own? Or is Ron Jeffries’ intuition right, and do the terms scaling and agile simply not mix?

Ron’s stance seems to be that many of Agile’s principles simply don’t apply at scale. Or apply in the same way, so why act differently at scale? That might be true, but might also be a little too abstract to be of much use to most people running into questions when they start working with more than one team on a codebase.

Time and relative dimension in space

When Ron and Chet came around to our office last week, Chet mentioned that he was playing around with the analogy of coordination in time (as opposed to cross-team) when thinking about scaling. This immediately brought things into a new perspective for me, and I thought I’d share that here.

If we have a single team that will be working on a product/project for five years, how are they going to ensure that the team working on it now communicates what is important to the team that is working on it three, four or five years from now?

Now that is a question we can easily understand. We know what it takes to write software that is maintainable, changeable, self-documenting. We know how to write requirements that become executable, living documentation. We know how to write tests that run through continuous integration. We even know how to write deployment manifests that control the whole production environment to give us continuous deployment.

So why would this be any different when instead of one team working five years on the same product, we have five teams working for one year?

This break in this post is intentionally left blank to allow you to think that over.

Simple Design

scrum_tardis

Scrum really is bigger on the inside!

This way of looking at the problem simplifies the matter considerably, doesn’t it? I have found repeatedly that there are more technical problems in scaling (and agile adoption in general) than organizational ones. Of course, very often the technical problems are caused by the organizational ones, but putting them central to the question of scaling might actually help re-frame the discussions on a management level in a very positive way.

But getting back to the question: what would be the difference?

Let’s imagine a well constructed Agile project. We have an inception where the purpose of the product is clearly communicated by the customer/PO. We sketch a rough idea of architecture and features together. We make sure we understand enough of the most important features to split off a minimum viable version of it, perhaps using a story map. We start the first sprint with a walking skeleton of the product. We build up the product by starting with the minimal versions of a couple of features. We continue working on the different features later, extending them to more luxurious versions based on customer preference.

As long as the product is still fairly well contained, this would be exactly the same when we are with a few teams. We’d have come to a general agreement on design early on, and would talk when a larger change comes up. Continuous integration will take care of much of the lower level coordination, with our customer tests and unit testing providing context.

One area does become more explicit: dependencies. Where the single team would automatically handle dependencies in time by influencing prioritization, the multiple teams would need to have a commonly agreed (and preferably commonly built) interface in existence before they could be working on some features in parallel. This isn’t really different from the single-team version above, where the walking skeleton/minimal viable feature version would also happen before further work. But it would be felt as something needing some special attention, and cooperation between teams.

If we put these technical considerations central, that resolves a number of issues in scaling.  It could also allow for a much better risk / profit trade-offs by integrating this approach with a set-based approach to projects. But I’ll leave that for a future post.