Technical Excellence: Why you should TDD!

Last Thursday, January 19, I gave a short talk at ArrowsGroup’s Agile Evangelists event in Amsterdam. Michiel de Vries was the other speaker, talking about the role of Trust in Agile adoptions. On my recommendation the organisers had changed the format of the evening to include two Open Space Technology sessions, right after each 20-minute talk, with the subject of the talk as its theme. This worked very well, and we had some lively discussions going on after both talks, with the 50 or so attendees talking in four or five groups. I’m happy it worked out, as it was the first time I facilitated an Open Space.

My own talk dealt with Technical Excellence, and in particular with Test Driven Development, and why it’s a good idea to use it. My talk was heavily inspired by James Grenning’s closing keynote at the Scrum Gathering in London last year. He even allowed me to use some of his sheets, which were very important in getting the argument across. I’ll go through the main points of my talk below.

For me the big challenge was fitting the most important parts of what I wanted to say into the 20 minutes available to me. As in software development, it turns out presentation writing is about making the result short, clear, and free of duplication. The first attempt at this presentation took about 50 minutes to tell, and subsequent versions got shorter and shorter…

The complete set of slides is here. And this is how the story went:

I quickly introduced the subject by talking about Agile’s rise in popularity ever since the term was introduced ten years ago (and earlier, really, for the separate methods that already existed). About 35% of companies reported using some form of Agile early last year, in a Forrester report, and most of those companies are seeing positive effects from the adoption. And no matter what you think of the quality of investigation behind the Standish Report, by their internally consistent measure of project success the improvements over the last ten years have been distinct.

That’s not the whole story, though. Many teams have been finding that their initial success hasn’t lasted. Completely apart from any other difficulties in getting Agile accepted in an organisation, and there are plenty of others, this particular one is clearly recognisable.

Development slows down. Usually this is due to a lot of defects turning up in production, but also to an increasing fragility of the codebase, which means that any change you make can (and will…) have unforeseen side-effects. One very frequent indicator that this is happening in your team is an increase in requests for ‘refactoring’ stories from the team. By this they usually mean ‘rework’ stories, where major changes (‘clean-up’) are needed for them to work more comfortably, and productively, with the code. In this situation the term ‘Technical Debt’ also becomes a more familiar part of the vocabulary.

And that is why it is time to talk about technical excellence. And why the people who came up with this whole Agile thing were saying last year that encouraging technical excellence is the top priority! OK, they said ‘demand’, but I have less clout than those guys… They said other things too, but this was a 20-minute presentation, which is already too short for even this one point. It is at the top of the list, though.

I further emphasised the need with a quote by Jeff Sutherland:

“14 percent, are doing extreme programming practices inside the SCRUM, and there is where we see the fastest teams: using the SCRUM management practice with the XP engineering practices inside.” – Jeff Sutherland, 2011

Since Jeff was nice enough to mention XP, Extreme Programming, we can take a look at what those XP Engineering practices are.

We’re mainly talking about the inner circle in the picture, since those are the practices that are most directly related to the act of creating code. The second circle of Hell^H^H^H^H XP is more concerned with the coordination between developers, while the outer ring deals with coordination and cooperation with the environment.

From the inner circle, I think Test Driven Development is the most important one. The XP practices are designed to reinforce each other. One good way to do Pair Programming, for instance, is to use the TDD cycle to drive switching roles: first one developer writes a test, then the other writes a piece of implementation, does some refactoring, and writes the next test, after which he passes the keyboard back to the first, who starts through the cycle again. And since you are keeping each other honest, it becomes easier to stick to the discipline of working Test Driven.

TDD is, in my humble opinion, a little more central than the others, if only because it is so tightly woven with Refactoring and Simple Design. So let’s talk further about TDD!

Test Driven Development has a few different aspects to it, each of which is sometimes confused with the whole of TDD. I see these aspects:

  • Automated Tests
  • Unit Tests
  • Test-First
  • The Red, Green, Refactor process

I’ve seen plenty of situations where people claimed to do TDD, but it turned out their tests were not written before the production code, and were at a much higher level than you’d want for unit tests. So let’s look at those different aspects of TDD, and discuss why each of them is a good idea.

Why Automated Tests?

This is where some of the wonderful slides of James Grenning come in. In fact, for a much better discussion of these graphs, you should read his blog post! If you’re still with me, here’s the short version: Every Sprint you add new functionality, and that new functionality will need to be tested, which takes some fraction of the time that’s needed to implement it. The nasty thing is that while you’re adding new functionality, there is a good chance of breaking some of the existing functionality.

This means that you need to test the new functionality, and *all* functionality built in previous sprints! Testing effort grows with the age of your codebase.
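
To put some numbers on that (invented, but representative): say each sprint adds functionality that takes two days to test by hand. In sprint 1 that’s two days of testing. By sprint 10, a full regression pass covers ten sprints’ worth of functionality, so you’re looking at twenty days of testing just to keep the same level of confidence: more time than the sprint itself contains.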

Either you skip parts of your testing (decreasing quality) or you slow down. It’s good to have choices. So what happens is that pretty soon you’re stuck in the ‘untested code gap’, and thus stuck in the eternal firefighting mode that it brings.


This seems to be a great time to remind the reader that the primary goal of a sprint is to release complete, working, tested software. And if that doesn’t happen, then you will end up having to do that testing, and fixing, at a later date.

To try to illustrate this point, I’ve drawn this in a slightly suggestive form of a… waterfall!

Why Unit Tests?

So what is the case for unit tests? Let me first assure you that I will be the last to tell you that you shouldn’t have other types of tests! Integration Tests, Performance Tests, Acceptance Tests, … And certainly also do manual exploratory testing: you will never think of everything beforehand. But those tests should be added on top of a solid base of code covered by unit tests.

Combinatorial Complexity

Object Oriented code is built out of objects that can each have a variety of inner states, and out of the interactions between those objects. The number of possible states of such a system of interacting components grows huge very quickly! That means that testing such a system from the outside requires very big tests, which would not have easy access to all the details of all possible states of all objects. Unit Tests, however, are written with intimate knowledge of each object and its possible states, and can thus ensure that all possible states of the object are covered.
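
To make the combinatorics concrete (with invented numbers): ten collaborating objects with ten internal states each give 10^10 possible system states to exercise from the outside. Tested one object at a time, there are only 10 × 10 = 100 states to cover, plus the interactions at each object boundary.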

Fine-grained tests make it easier to pinpoint defects

Perhaps obvious, but if you have small tests covering a few lines of code each, then when a defect is introduced, a test will point to it with a couple of lines of accuracy. An integration or acceptance test will perhaps show the defect, but still leaves tens, hundreds or thousands of lines of code to comb through in search of it.
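
As a minimal illustration (an invented test, borrowing the StoreKeepingItem and ItemFactory classes from the Cucumber experiment later on this page), a unit test like this one covers only a handful of lines, so when it goes red you know exactly where to look:

import static org.junit.Assert.assertEquals;

import org.geenz.gildedrose.ItemFactory;
import org.geenz.gildedrose.StoreKeepingItem;
import org.junit.Test;

public class BasicItemQualityTest {

    // An invented example test; the classes come from the GildedRose
    // code in this post. It covers just the basic quality-decrease
    // rule, so a failure here points directly at the few lines
    // implementing that rule.
    @Test
    public void dailyUpdateDecreasesQualityByOne() {
        StoreKeepingItem item = ItemFactory.create("+5 Dexterity Vest", 5, 7);
        item.doDailyInventoryUpdate();
        assertEquals(6, item.getQuality());
    }
}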

Code documentation that stays in sync

Unit tests document the interfaces of objects by demonstrating how the developer means them to be used. And because this documentation is code, it cannot get out of sync without breaking the build!

Supports changing the code

Unit tests are crucial to your ability to make changes to code. Be they refactorings or added functionality, if you don’t have detailed tests, the chances of introducing a defect are huge. And without those tests, chances are you’ll be finding out about those shiny new defects in production.

Why Test-First?

Ok, so we need unit tests. But why write them before writing the production code? One of the arguments is simply that there’s a good chance you won’t write them at all once you’ve already written your production code. But let’s pretend, just for a minute, that we’re all very disciplined, and do write those tests. And that we wrote our code in such a way that we can write unit tests for it with a normal level of effort. Time for another of Mr. Grenning’s sheets!

The full story, again, can be found on his blog, but the short version is as follows. It has been an accepted fact in software development for quite a long time that the longer the period between introducing a defect and finding out about it, the more expensive it is to fix. This is easy to accept intuitively: the programmer won’t remember as well what he was thinking when the error was made; he might have written more code in the same general area; other people might have made changes in that part of the code. (How dare they!)

Now how does Test-First help alleviate this? Simple: if you write the test first, you will notice the defect the moment you introduce it! Yes, granted, this does not mean that no defects will ever reach production; there are many different types of bugs. But you’ll avoid a huge percentage of them! Some research indicates reductions of 50 to 90 percent in defects. More links to various studies on TDD can be found on George Dinwiddie’s wiki.

Red, Green, Refactor

And now we get to the last aspect of TDD: the actual process of writing/designing code. The steps are simple: write a small test that fails (red); write the simplest code that makes the test pass (green); then refactor that code to remove duplication and keep it clean and readable. And repeat!

One thing that is important about this cycle is that it is very short! It should normally last from 3 to 10 minutes, with most of the time actually spent in the refactoring step. The small size of the steps is a feature, keeping you from the temptation of writing too much production code that isn’t yet tested.
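
To make the cycle tangible, here is a sketch of a single pass (an invented micro-example, not from the talk or the slides):

import static org.junit.Assert.assertEquals;

import org.junit.Test;

public class RedGreenRefactorExample {

    // RED: this test is written first, and fails because Counter
    // doesn't exist (or doesn't increment) yet.
    @Test
    public void incrementAddsOne() {
        Counter counter = new Counter();
        counter.increment();
        assertEquals(1, counter.value());
    }

    // GREEN: the simplest thing that could possibly pass. Even a
    // hard-coded 'return 1' would be green; the next failing test
    // (increment twice, expect 2) would then force the real logic.
    static class Counter {
        private int value = 0;

        void increment() {
            value++;
        }

        int value() {
            return value;
        }
    }

    // REFACTOR: with the tests green, remove duplication and clean
    // up names before writing the next failing test.
}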

Clean code

Sticking to this process for writing code has some important benefits. Your code will be cleaner. Writing tests first means that you will automatically be writing testable code. Testable code has a lower complexity (see Keith Braithwaite’s great work in this area). Testable code will be more loosely coupled, because writing tests for tightly coupled code is bloody hard. Continuous refactoring will help keep the code tightly cohesive. Granted, this last one still depends very much on the quality of the developer, and his/her OO skills. But the process encourages it, and if your code is not moving towards cohesion, you’ll see duplication increasing (for instance in method parameters).
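
As an invented example of how test-first pushes towards loose coupling: to unit-test the class below at all, its collaborator has to come in through the constructor, so the design ends up depending on an interface almost automatically. All names here are hypothetical.

// All types here are invented for illustration. Writing the test
// first forces the collaborator in through the constructor: a
// hard-coded 'new SmtpMailer()' inside the class would make the
// unit test need a real mail server.
interface Mailer {
    void send(String to, String message);
}

class OrderService {

    private final Mailer mailer;

    OrderService(Mailer mailer) {
        this.mailer = mailer;
    }

    void confirm(String customerEmail) {
        mailer.send(customerEmail, "Your order is confirmed.");
    }
}

// In the unit test, a trivial fake stands in for the real thing:
class RecordingMailer implements Mailer {
    String lastRecipient;

    public void send(String to, String message) {
        lastRecipient = to;
    }
}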

After last week’s presentation, in one of the Open Space discussions some doubt was expressed on whether a lower cyclomatic complexity is always an indication of better code. I didn’t have/take much time to go into that, and accidentally reverted to an invocation of authority figures, but it is a very interesting subject. If you look at the link to Keith Braithwaite’s slides, and his blog, you’ll notice that all the projects he discusses have classes with higher and lower complexity. The interesting part is that the distribution differs between code that’s under test and code that isn’t. He also links this to more subjective judgements of the codebases. This is a good indication that testing indeed helps reduce complexity. Complexity is by no means the only measure, though, and yes, you can have good code that has a higher complexity. But a codebase that tends towards more lower-complexity and fewer higher-complexity classes will almost always be more readable and easier to maintain.

Simple Design

XP tells you You Ain’t Gonna Need It. Don’t write code that is not (yet) absolutely necessary for the functionality that you’re building. And certainly don’t write extra functionality! Writing the tests first keeps you focused on this: since you are writing the test with a clear objective in mind, it’s much less likely that you’ll go and add this object or that method because it ‘seems to belong there’.

Another phrase that’s popular is Don’t Repeat Yourself. The refactoring step in the TDD process has the primary purpose of removing duplication, and duplication usually means harder to maintain. These things keep your code as lean as possible, supporting the Simple Design philosophy.
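
A small illustration of the kind of duplication that refactoring step removes (a hypothetical fragment, though you can compare it to the repeated capped-increase branches in the GildedRose code further down this page):

class BackstagePass {

    private int quality;
    private int sellIn;

    // Invented example. Before refactoring, the capped increase
    // 'if (quality < 50) quality = quality + 1;' appeared in every
    // branch; extracting it removes the repetition and gives the
    // business rule a name.
    private void increaseQuality() {
        if (quality < 50) {
            quality = quality + 1;
        }
    }

    void updateQuality() {
        increaseQuality();                   // base increase
        if (sellIn < 11) increaseQuality();  // concert is getting close
        if (sellIn < 6) increaseQuality();   // concert is very close
    }
}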

Why Not?

So why isn’t everyone in the world working with this great TDD thing I’ve been telling you about? I hear many reasons. One of them is that they didn’t know about it, or were under the misconception that TDD is just Test Automation, or Unit Testing. But in the Internet age, that excuse is a bit hard to swallow.

More often, the first reaction is along the lines of “We don’t have the time”.

I’ve just spent quite a bit of time discussing how TDD will normally save you time, so it doesn’t make much sense to spend more time discussing why not having time isn’t an argument.

But there is one part of this objection that does hold: getting started with TDD will take time. As with everything else, there’s a learning curve. For the first two or three sprints, you will be slower. Longer, if you have a lot of code that wasn’t written with testing in mind. You can speed things up a bit by getting people trained, and by getting some coaches in to help everyone get used to it. But it will still take time.

Then later, it will save you time. And embarrassment.

Another objection I’ve encountered is ‘But software will always contain bugs!’.

Sure. Not as many as now, but I’m sure there will be bugs. Still, that argument makes as much sense to me as saying that we will always have deaths in traffic due to people crossing the road, even if we have traffic lights. So we might as well get rid of all traffic lights?

There will always be some people that fall ill, even if we inoculate most everyone against a disease. So should we stop the inoculations?

Nah.

Uhm… Yeah…

This is at the same time the strongest and the weakest of arguments. Yes, in many companies there are bosses who will have questions about time spent writing tests.

The thing is: you, as a developer, are the expert. You are the one who can tell your boss why not writing the tests will be slower. Why it will slow you down in the future. Why code that isn’t kept clean and well factored will bog you down in the unmaintainability circle of hell until the company decides to throw it all away and start over. And over. And over.

You are also the one whose desk that guy is going to be standing at, telling you that he’s going to have to ask you to come in on Saturday. And Sunday. And you know what? That’s going to be your own fault for not working better.

And in the end, if you’re unable to impress them with those realities, use the Law of Two Feet, and change companies. Really. You’ll feel so much better.

This last objection, legacy code, is, I must admit, the most convincing of the reasons why people are not TDDing. Legacy code really is hard to test.

There’s some good advice out there on how to get started. Michael Feathers’ book Working Effectively With Legacy Code is the best place to start. Or the article of the same name, to whet your appetite. And the new Mikado Method definitely has promise, though I haven’t tried it on any real code yet.

And the longer you wait to get started, the harder it’s going to be…

How?

I closed the presentation with two sheets on how to get to the point where you can deliver at high quality.

For a developer, it’s important to keep learning. There are a lot of great books out there, and new ones coming out all the time. But don’t just read. Practice. At Qualogy I’ve been organising Coding Dojos, where we get together and work on coding problems as an exercise. A lot of learning, but also a lot of fun to do! Of course, we’ve also all had training in this area, getting Ron Jeffries and Chet Hendrickson over to teach us Agile Development Skills.

Make sure that when you give an estimate, you include plenty of time for testing. We developers simply have a tendency to overlook this. And while you’re still new to TDD, make a little more room for it: commit to 30% fewer stories for your next three sprints, while you’re learning. And when someone tries to exert some pressure to squeeze in just that one more feature, one that you know you can only manage by lowering quality? Don’t. Say NO. Explain why you say no, say it well, and give perspective on what you can say yes to. But say no.

For a manager, the most important thing you can do is ensure that you consistently emphasize quality over schedule. For a manager this is also one of the most difficult things to do. We’ve been talking about the discipline required for developers to stick to TDD; this is where the discipline for the manager comes in. It is very easy to let quality slip, and it will happen almost immediately as soon as you start emphasizing schedule. You want it as fast as possible, but only if it’s solid (and SOLID).

You can encourage teams to set their own standards. And then encourage them to stick to them. You’ll know you’ve succeeded when, the next time you slip up and put on a little pressure, they tell you: “No, we can’t do that and stick to our agreed level of quality.” Congrats :-)

You can send your people to the trainings that they need. Or get help in to coach them (want my number? :-)

You can reserve time for those practice sessions. You will benefit from them, so there’s no reason they have to happen solely in private time.

Well, it’s nicely consistent. I went over my time when presenting this, and over my target word count when writing it down. If you stuck with me till the end, thanks! I’m still learning too, and short, clear posts are just as hard as short, clear code. Or short, clear presentations.

Scrum Gathering London 2011 – day 3

The third day of the London Scrum Gathering (day 1, day 2) was reserved for an Open Space session led by Rachel Davies and a closing keynote by James Grenning.

We started with the Open Space. I’d done Open Spaces before, but this one was definitely on a much larger scale than what I’d seen before. After the introduction, everyone who had a subject for a session wrote it down, and got in line to announce the session (and have it scheduled). With so many attendees, you can imagine that there would be many sessions. Indeed, there were so many that extra spaces had to be added to handle them all. The subject for the Open Space was to be the Scrum Alliance tag-line: “Changing the world of work”.

I initially intended to go to a session about Agile and Embedded, as the organiser had mentioned that if there weren’t enough people to talk about the embedded angle, he was OK with widening the subject to ‘difficult technical circumstances’. I haven’t done much real embedded work, but was interested in the broader subject. It turned out, though, that there were plenty of parties interested in the real deal, so the talk quickly started drifting to FPGAs and other esoterica, and I used the law of two feet to find a different talk.

My second choice for that first period was a session about getting higher management involved. This session, proposed by Joe Justice and Peter Stevens (people with overlapping subjects had merged their sessions), turned out to be very interesting and useful. The group shared experiences of successfully (and less successfully) engaging higher management (CxOs, VPs, etc.) in an Agile change process. Peter has since posted a nice summary of this session on his blog.

My own session was about applying Agile and Lean-Startup ideas to the context of setting up a consultancy and/or training business. If we’re really talking about ‘changing the world of work’, then we should start with our own work. My intention was to discuss how things like transparency, early feedback, and working iteratively and incrementally could be applied to an Agile Coach’s work. My colleague and I have been working to approach our own work in this fashion, and are starting to get the hang of this whole ‘fail early’ thing. We’ve also been changing our approach based on feedback from customers and, more importantly, not-customers. During the session we talked a little about this, but we also quickly branched off into some related subjects. Some explanation of Lean-Startup ideas was needed, as not everyone had heard of them. We didn’t get far into discussing the customer/product development ideas from that side of things, though.

We also discussed contracts, and how those can fit with an agile approach. Most coaches are working on a time-and-materials basis, it seems. We have a ‘Money for nothing and your changes for free’ type of contract (see http://jeffsutherland.com/Agile2008MoneyforNothing.pdf, sheets 29-38) going for consultancy at the moment, but it’s less of a fit there than with a software development project. Time and materials is safest for the coach, of course, but it also doesn’t reward the coach for doing the work better than the competition. How should we do this? Jeff’s ‘money back guarantee’ if you don’t double your velocity is a nice marketing gimmick, but a big risk for us lesser gods: is velocity a good measure of productivity and results? How do we measure it? How do we determine whether advice was followed?

Giving customers freebies or discounts in exchange for trying out new training material was more generally in use. This has really helped us quickly improve our workshop materials, not to mention hone our training skills…

A later session I attended was Nigel Baker’s. He had done a session on the second day called ‘ScrumBrella’, on how to scale Scrum, and was doing a second one this third afternoon. I hadn’t made it to the earlier one, but had heard enthusiastic stories about it, so I decided to go and see what it was about. Nigel didn’t disappoint, and had a dynamic and entertaining story to tell. He made the talk come alive by the way he drew the organisational structures on sheets of paper, and moved those around on the floor during his talk, often getting the audience to provide new drawings.

There is no way I can do Nigel’s presentation style justice here. There were a number of people filming him in action on their cellphones, but I haven’t seen any of those movies surface yet. For now, you’ll have to make do with his blog-post on the subject, and some slides (which he obviously didn’t use during this session). I can, however, show you what the final umbrella looked like:

All my other pictures, including some more from the ScrumBrella session, and from the Open Space closing, can be found on flickr.

Closing Keynote by James Grenning on ‘Changing the world of work through Technical Excellence’

(slides are here)

The final event of the conference was the closing keynote by James Grenning. His talk dealt with ‘Technical Excellence’, and as such was very near my heart.

He started off with a little story, for which the slides are unfortunately not in the slide deck linked above, about how people sometimes come up to him in a bar (a conference bar, I assume, or pick-up lines have really changed in the past few years) and tell him: “Yeah, that agile thing, we tried that, it didn’t work”.

He would then ask them some innocent questions (paraphrased; I don’t have those slides, nor a perfect memory):

So you were doing TDD?

No.

Ah, but you were doing ATDD?

No.

But surely you were doing unit testing?

Not really.

Pair programming?

No.

Continuous Integration?

Sometimes.

Refactoring?

No!

At least deliver working software at the end of the sprint?

No…

If you don’t look around and realise that to do Agile, you’ll actually have to improve your quality, you’re going to fail. And if you insist on ignoring the experience of so many seasoned people, then maybe you deserve to fail.

After this great intro, we were treated to a backstage account of the way the Agile Manifesto meeting at Snowbird went, and then of the subjects that came out of this year’s reunion meeting. James showed the top two things coming out of the reunion meeting:

We believe the agile community must:

  1. Demand Technical Excellence
  2. Promote individual [change] and lead organizational change

The rest of his talk was a lesson in doing those things. He first went into more detail on Test Driven Development, and how it’s the basis for improving quality. To do this, he explained why Debug-Later-Programming (DLP) in practice will always be slower than Test-First/TDD.

The Physics of Debug-Later-Programming

DLP vs TDD

Mr. Grenning went on to describe the difference between System Level Tests and Unit Tests, saying that System Level Tests suffer from having to test the combinatorial total of all contained elements, while Unit Tests can be tailored by the programmer to directly test the use of the code as it is intended, and only that use. This means that, even though System Level Tests can be useful, they can never be sufficient.
Of course, the chances are small that you’ll write sufficient and complete Unit Tests if you don’t do Test Driven Development, as Test-After always becomes Release-First. And depending on Manual Testing is a recipe for unmaintainability.

The Untested Code Gap

The keynote went on to talk about individual and organisational change, and what Developers, Scrum Masters and Managers can do to improve things. Developers should learn, and create tested and maintainable code. Scrum Masters should encourage people to be Problem Solvers, not Dogma Followers. He illustrated this with the example of Planning Poker: as he invented Planning Poker, his saying that you shouldn’t always use it is a strong message. For instance, if you want to estimate a large number of stories, other systems can work much better. Managers got the advice to Grow Great Teams, Avoid De-Motivators, and Stop Motivating Your Team!

DeMotivators

It was very nice to be there for this talk: it validated my own stance on Technical Excellence, and taught me new ways of talking to people to help them see the advantages of improving their technical practices. Oh, and it gave some support for my own discipline in strictly sticking to TDD…

Article on Oracle and Agile posted (Dutch)

My article on using Agile on projects with Oracle technology was published in this month’s Optimize magazine, and the full text (in Dutch!) is now available on the Qualogy website.

The article is mainly meant as an introduction to agile principles and practices for Oracle developers who are new to them. It focuses on explaining short feedback loops, and the importance of technical practices such as Continuous Integration and (unit) testing for success with iterative and incremental development.

A Cucumber Experiment

Having used the GildedRose recently as the subject of a coding dojo, I thought it would also make an interesting subject for some experimentation with Cucumber. Cucumber is a tool that allows you to use natural language to specify executable Acceptance Tests. The GildedRose exercise, originally created by Bobby Johnson, can be found on GitHub, in both C# (the original) and Java (my copy). This code kata is a refactoring assignment: the requirements are given together with a piece of existing code, and the programmer is expected to add some functionality. If you take one look at that existing code, though, you’ll see that you really need to clean it up before adding that extra feature:

    public static void updateQuality()
    {
        for (int i = 0; i < items.size(); i++)
        {
            if ((!"Aged Brie".equals(items.get(i).getName())) && !"Backstage passes to a TAFKAL80ETC concert".equals(items.get(i).getName()))
            {
                if (items.get(i).getQuality() > 0)
                {
                    if (!"Sulfuras, Hand of Ragnaros".equals(items.get(i).getName()))
                    {
                        items.get(i).setQuality(items.get(i).getQuality() - 1);
                    }
                }
            }
            else
            {
                if (items.get(i).getQuality() < 50)
                {
                    items.get(i).setQuality(items.get(i).getQuality() + 1);

                    if ("Backstage passes to a TAFKAL80ETC concert".equals(items.get(i).getName()))
                    {
                        if (items.get(i).getSellIn() < 11)
                        {
                            if (items.get(i).getQuality() < 50)
                            {
                                items.get(i).setQuality(items.get(i).getQuality() + 1);
                            }
                        }

                        if (items.get(i).getSellIn() < 6)
                        {
                            if (items.get(i).getQuality() < 50)
                            {
                                items.get(i).setQuality(items.get(i).getQuality() + 1);
                            }
                        }
                    }
                }
            }

            if (!"Sulfuras, Hand of Ragnaros".equals(items.get(i).getName()))
            {
                items.get(i).setSellIn(items.get(i).getSellIn() - 1);
            }

            if (items.get(i).getSellIn() < 0)
            {
                if (!"Aged Brie".equals(items.get(i).getName()))
                {
                    if (!"Backstage passes to a TAFKAL80ETC concert".equals(items.get(i).getName()))
                    {
                        if (items.get(i).getQuality() > 0)
                        {
                            if (!"Sulfuras, Hand of Ragnaros".equals(items.get(i).getName()))
                            {
                                items.get(i).setQuality(items.get(i).getQuality() - 1);
                            }
                        }
                    }
                    else
                    {
                        items.get(i).setQuality(items.get(i).getQuality() - items.get(i).getQuality());
                    }
                }
                else
                {
                    if (items.get(i).getQuality() < 50)
                    {
                        items.get(i).setQuality(items.get(i).getQuality() + 1);
                    }
                }
            }
        }
    }

Having gone through the exercise a few times, I already had a version lying around with the requirements implemented as a bunch of JUnit regression tests, and of course some real unit tests for my implementation. A good starting point to get going with Cuke, though in many real-life situations I’ve found that such regression tests are not available…

Adding Cucumber to the maven build

To prepare for the use of Cucumber, I first had to set things up so that I had all the dependencies for Cucumber, and have it run in the Maven integration-test phase. The complete Maven pom.xml is also on GitHub.

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>GildedRoseJava</groupId>
    <artifactId>GildedRoseJava</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <dependencies>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>4.8.2</version>
            <type>jar</type>
            <scope>test</scope>
            <optional>true</optional>
        </dependency>
        <dependency>
            <groupId>org.picocontainer</groupId>
            <artifactId>picocontainer</artifactId>
            <version>2.10.2</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>cuke4duke</groupId>
            <artifactId>cuke4duke</artifactId>
            <version>0.4.4</version>
            <scope>test</scope>
        </dependency>
    </dependencies>
    <repositories>
        <repository>
            <id>codehaus</id>
            <url>http://repository.codehaus.org</url>
        </repository>
        <repository>
            <id>cukes</id>
            <url>http://cukes.info/maven</url>
        </repository>
    </repositories>
    <pluginRepositories>
        <pluginRepository>
            <id>cukes</id>
            <url>http://cukes.info/maven</url>
        </pluginRepository>
    </pluginRepositories>
    <build>
        <plugins>
            <plugin>
                <groupId>cuke4duke</groupId>
                <artifactId>cuke4duke-maven-plugin</artifactId>
                <configuration>
                    <jvmArgs>
                        <jvmArg>
                            -Dcuke4duke.objectFactory=cuke4duke.internal.jvmclass.PicoFactory
                        </jvmArg>
                        <jvmArg>-Dfile.encoding=UTF-8</jvmArg>
                    </jvmArgs>
                    <!-- You may not need all of these arguments in your
          own project. We have a lot here for testing purposes... -->
                    <cucumberArgs>
                        <cucumberArg>--backtrace</cucumberArg>
                        <cucumberArg>--color</cucumberArg>
                        <cucumberArg>--verbose</cucumberArg>
                        <cucumberArg>--format</cucumberArg>
                        <cucumberArg>pretty</cucumberArg>
                        <cucumberArg>--format</cucumberArg>
                        <cucumberArg>junit</cucumberArg>
                        <cucumberArg>--out</cucumberArg>
                        <cucumberArg>${project.build.directory}/cucumber-reports</cucumberArg>
                        <cucumberArg>--require</cucumberArg>
                        <cucumberArg>${basedir}/target/test-classes</cucumberArg>
                    </cucumberArgs>
                    <gems>
                        <gem>install cuke4duke --version 0.3.2</gem>
                    </gems>
                </configuration>
                <executions>
                    <execution>
                        <id>run-features</id>
                        <phase>integration-test</phase>
                        <goals>
                            <goal>cucumber</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
</project>

Adding a Feature

To start with Cucumber, you create a feature file that contains a Feature, which is expected to follow the User Story format, and which contains all the Acceptance Tests for that story in the Gherkin format.

The feature file is placed in the {$project.root}/features directory. The Story looks as follows:

Feature: Quality changes with sell-in date Feature
         In order to keep track of quality of items in stock
         As a ShopKeeper
         I want quality to change as the sell-in date decreases

A Scenario

This doesn’t do anything until you start adding some scenarios, which can be seen as the acceptance criteria for the story/feature. For now, if you run

mvn integration-test

you’ll get confirmation that Cucumber has found your feature, and that there were no test scenarios to run:

[INFO] Code:
[INFO]
[INFO] Features:
[INFO]   * features/quality_decrease.feature
[INFO] Parsing feature files took 0m0.111s
[INFO]
[INFO] Code:
[INFO]   * /home/wouter/data/eclipse/kata/GildedRoseGitHub/target/test-classes/org/geenz/gildedrose/BackstagePassStoreKeepingItemTest.class
[INFO]   * /home/wouter/data/eclipse/kata/GildedRoseGitHub/target/test-classes/org/geenz/gildedrose/GildedRoseRegressionTest.class
[INFO]   * /home/wouter/data/eclipse/kata/GildedRoseGitHub/target/test-classes/org/geenz/gildedrose/GildedRoseTest.class
[INFO]   * /home/wouter/data/eclipse/kata/GildedRoseGitHub/target/test-classes/org/geenz/gildedrose/ImmutableItemTest.class
[INFO]   * /home/wouter/data/eclipse/kata/GildedRoseGitHub/target/test-classes/org/geenz/gildedrose/ItemFactoryTest.class
[INFO]   * /home/wouter/data/eclipse/kata/GildedRoseGitHub/target/test-classes/org/geenz/gildedrose/StoreKeepingItemTest.class
[INFO]   * /home/wouter/data/eclipse/kata/GildedRoseGitHub/target/test-classes/org/geenz/gildedrose/cucumber/BasicFeature.class
[INFO]
[INFO] Feature: Quality changes with sell-in date Feature
[INFO]   In order to keep track of quality of items in stock
[INFO]   As a ShopKeeper
[INFO]   I want quality to change as the sell-in date decreases
[INFO]
[INFO] 0 scenarios
[INFO] 0 steps
[INFO] 0m0.062s

Now, let’s add a scenario; say, one for the basic situation of quality decreasing by one when the sell-in date decreases by one:

Feature: Quality changes with sell-in date Feature
        In order to keep track of quality of items in stock
        As a ShopKeeper
        I want quality to change as the sell-in date decreases

        Scenario: Decreasing Quality of a Basic Item
                Given a Store Keeping Item with name "+5 Dexterity Vest"
                And a sellIn date of 5
                And a quality of 7
                When the Item is updated
                Then the sellIn is 4
                And the quality is 6

There are a few things to notice here. First of all, this is readable. This scenario can be read by people who have no programming experience, and understood well if they have some knowledge of the domain of the application. That means that scenarios like this one can be used to talk about the requirements/expected behaviour of the application. Second, this is a real-life example of the workings of the application. Making it a specific example, instead of a generic rule, makes the scenario easier to understand, and thus more useful as a communication device.

Glue

So what happens when we run the integration-test phase again? Will we magically see this scenario work? Well, no, there’s still some glue to provide, but we do get some help with that:

[INFO] Code:
[INFO]
[INFO] Features:
[INFO]   * features/quality_decrease.feature
[INFO] Parsing feature files took 0m0.077s
[INFO]
[INFO] Code:
[INFO]   * /home/wouter/data/eclipse/kata/GildedRoseGitHub/target/test-classes/org/geenz/gildedrose/BackstagePassStoreKeepingItemTest.class
[INFO]   * /home/wouter/data/eclipse/kata/GildedRoseGitHub/target/test-classes/org/geenz/gildedrose/GildedRoseRegressionTest.class
[INFO]   * /home/wouter/data/eclipse/kata/GildedRoseGitHub/target/test-classes/org/geenz/gildedrose/GildedRoseTest.class
[INFO]   * /home/wouter/data/eclipse/kata/GildedRoseGitHub/target/test-classes/org/geenz/gildedrose/ImmutableItemTest.class
[INFO]   * /home/wouter/data/eclipse/kata/GildedRoseGitHub/target/test-classes/org/geenz/gildedrose/ItemFactoryTest.class
[INFO]   * /home/wouter/data/eclipse/kata/GildedRoseGitHub/target/test-classes/org/geenz/gildedrose/StoreKeepingItemTest.class
[INFO]   * /home/wouter/data/eclipse/kata/GildedRoseGitHub/target/test-classes/org/geenz/gildedrose/cucumber/BasicFeature.class
[INFO]
[INFO] Feature: Quality changes with sell-in date Feature
[INFO]   In order to keep track of quality of items in stock
[INFO]   As a ShopKeeper
[INFO]   I want quality to change as the sell-in date decreases
[INFO]
[INFO]   Scenario: Decreasing Quality of a Basic Item               # features/quality_decrease.feature:6
[INFO]     Given a Store Keeping Item with name "+5 Dexterity Vest" # features/quality_decrease.feature:7
[INFO]     And a sellIn date of 5                                   # features/quality_decrease.feature:8
[INFO]     And a quality of 7                                       # features/quality_decrease.feature:9
[INFO]     When the Item is updated                                 # features/quality_decrease.feature:10
[INFO]     Then the sellIn is 4                                     # features/quality_decrease.feature:11
[INFO]     And the quality is 6                                     # features/quality_decrease.feature:12
[INFO]
[INFO] 1 scenario (1 undefined)
[INFO] 6 steps (6 undefined)
[INFO] 0m0.221s
[INFO]
[INFO] You can implement step definitions for undefined steps with these snippets:
[INFO]
[INFO] @Given ("^a Store Keeping Item with name \"([^\"]*)\"$")
[INFO] @Pending
[INFO] public void aStoreKeepingItemWithName+5DexterityVest_(String arg1) {
[INFO] }
[INFO]
[INFO] @Given ("^a sellIn date of 5$")
[INFO] @Pending
[INFO] public void aSellInDateOf5() {
[INFO] }
[INFO]
[INFO] @Given ("^a quality of 7$")
[INFO] @Pending
[INFO] public void aQualityOf7() {
[INFO] }
[INFO]
[INFO] @When ("^the Item is updated$")
[INFO] @Pending
[INFO] public void theItemIsUpdated() {
[INFO] }
[INFO]
[INFO] @Then ("^the sellIn is 4$")
[INFO] @Pending
[INFO] public void theSellInIs4() {
[INFO] }
[INFO]
[INFO] @Then ("^the quality is 6$")
[INFO] @Pending
[INFO] public void theQualityIs6() {
[INFO] }
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 9.500s
[INFO] Finished at: Fri Sep 16 14:27:29 CEST 2011
[INFO] Final Memory: 5M/106M
[INFO] ------------------------------------------------------------------------

Ok! The build is successful, the new scenario is found but apparently ‘undefined’, and a load of Java code is dumped! Things are moving along…

We copy-paste-and-clean-up the Java code into a test class, such as this:

package org.geenz.gildedrose.cucumber;

import cuke4duke.annotation.I18n.EN.Given;
import cuke4duke.annotation.I18n.EN.Then;
import cuke4duke.annotation.I18n.EN.When;
import cuke4duke.annotation.Pending;
import org.geenz.gildedrose.ItemFactory;
import org.geenz.gildedrose.StoreKeepingItem;

import static org.junit.Assert.*;

public class BasicFeature {

	 @Given ("^a Store Keeping Item with name \"([^\"]*)\"$")
	 @Pending
	 public void aStoreKeepingItemWithName+5DexterityVest_(String arg1) {
	 }

	 @Given ("^a sellIn date of 5$")
	 @Pending
	 public void aSellInDateOf5() {
	 }

	 @Given ("^a quality of 7$")
	 @Pending
	 public void aQualityOf7() {
	 }

	 @When ("^the Item is updated$")
	 @Pending
	 public void theItemIsUpdated() {
	 }

	 @Then ("^the sellIn is 4$")
	 @Pending
	 public void theSellInIs4() {
	 }

	 @Then ("^the quality is 6$")
	 @Pending
	 public void theQualityIs6() {
	 }
}

Pending

And, after noticing and correcting that a ‘+’ sign can’t be part of a Java method name, we run it again, Sam.

Oh dear:

[INFO] Feature: Quality changes with sell-in date Feature
[INFO] In order to keep track of quality of items in stock
[INFO] As a ShopKeeper
[INFO] I want quality to change as the sell-in date decreases
[INFO]
[INFO] Scenario: Decreasing Quality of a Basic Item # features/quality_decrease.feature:6
[INFO] Given a Store Keeping Item with name "+5 Dexterity Vest" # BasicFeature.aStoreKeepingItemWithNamePlus5DexterityVest_(String)
[INFO] TODO (Cucumber::Pending)
[INFO] /home/wouter/.m2/repository/.jruby/gems/cucumber-0.8.7/lib/cucumber/step_match.rb:26:in `invoke'
[INFO] /home/wouter/.m2/repository/.jruby/gems/cucumber-0.8.7/lib/cucumber/ast/step_invocation.rb:62:in `invoke'
[INFO] /home/wouter/.m2/repository/.jruby/gems/cucumber-0.8.7/lib/cucumber/ast/step_invocation.rb:41:in `accept'
[INFO] /home/wouter/.m2/repository/.jruby/gems/cucumber-0.8.7/lib/cucumber/ast/tree_walker.rb:99:in `visit_step'
[INFO] /home/wouter/.m2/repository/.jruby/gems/cucumber-0.8.7/lib/cucumber/ast/tree_walker.rb:164:in `broadcast'
[INFO] /home/wouter/.m2/repository/.jruby/gems/cucumber-0.8.7/lib/cucumber/ast/tree_walker.rb:98:in `visit_step'
[INFO] /home/wouter/.m2/repository/.jruby/gems/cucumber-0.8.7/lib/cucumber/ast/step_collection.rb:15:in `accept'
[IN
.....

Just what a Java developer likes: Ruby stack traces! Luckily, the first entry is fairly clear: TODO (Cucumber::Pending).
And indeed, now that we take another look at it, the methods we just created all have a @Pending annotation. If we remove that, the scenario ‘runs’ successfully:

[INFO] Feature: Quality changes with sell-in date Feature
[INFO]   In order to keep track of quality of items in stock
[INFO]   As a ShopKeeper
[INFO]   I want quality to change as the sell-in date decreases
[INFO]
[INFO]   Scenario: Decreasing Quality of a Basic Item               # features/quality_decrease.feature:6
[INFO]     Given a Store Keeping Item with name "+5 Dexterity Vest" # BasicFeature.aStoreKeepingItemWithNamePlus5DexterityVest_(String)
[INFO]     And a sellIn date of 5                                   # BasicFeature.aSellInDateOf5()
[INFO]     And a quality of 7                                       # BasicFeature.aQualityOf7()
[INFO]     When the Item is updated                                 # BasicFeature.theItemIsUpdated()
[INFO]     Then the sellIn is 4                                     # BasicFeature.theSellInIs4()
[INFO]     And the quality is 6                                     # BasicFeature.theQualityIs6()
[INFO]
[INFO] 1 scenario (1 passed)
[INFO] 6 steps (6 passed)
[INFO] 0m0.269s

Adding code

Of course, it doesn’t actually test anything yet! But we can take a look at the way the text of the scenario is translated into executable code. For instance:

	 @Given ("^a sellIn date of 5$")
	 public void aSellInDateOf5() {
	 }

This looks straightforward. The @Given annotation gets a regular expression passed to it, which matches a particular condition. If the method actually set a sell-in value on some object, this would be a perfectly valid step in performing some test.

So let’s see where we can create such a sellable item. This other method looks promising:

	 @Given ("^a Store Keeping Item with name \"([^\"]*)\"$")
	 public void aStoreKeepingItemWithNamePlus5DexterityVest_(String arg1) {
	 }

For the uninitiated into the esoteric realm of regular expressions, the regexp here looks a little scarier. It really isn’t all that bad, though. The backslashes (\) are there to escape the quotes ("). The brackets (()) are there to capture a specific region to be passed into the method: anything within those brackets is passed in as the first parameter of the method (our arg1 String parameter). And to specify what we want to capture, [^\"]* simply means any number of characters that are not a quote. So this captures everything within the quotes in our scenario, which happens to be the name of the item.
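
If you want to see the capture in action outside Cucumber, a quick check with plain java.util.regex (a throwaway demo class, for illustration only) shows what ends up in that arg1 parameter:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StepRegexDemo {

    public static void main(String[] args) {
        // The same expression used in the @Given annotation above.
        Pattern step = Pattern.compile("^a Store Keeping Item with name \"([^\"]*)\"$");
        Matcher match = step.matcher("a Store Keeping Item with name \"+5 Dexterity Vest\"");
        if (match.matches()) {
            // Group 1 is everything between the quotes: the item name.
            System.out.println(match.group(1)); // prints: +5 Dexterity Vest
        }
    }
}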

So let’s change that into:

    private StoreKeepingItem item = null;

    @Given ("^a Store Keeping Item with name \"([^\"]*)\"$")
    public void aStoreKeepingItemWithName(String name) {
        item = ItemFactory.create(name, 0, 0);
    }

Now our first Given step will create a new Item, with the correct name! That was easy!
Now we can fix the earlier method:

	 @Given ("^a sellIn date of 5$")
	 public void aSellInDateOf5() {
             item.setSellIn(5);
	 }

But, since I now know how to parameterise these annotations, let’s do that immediately:

    @Given("^a sellIn date of ([0-9]*)$")
    public void aSellInDateOf(int sellIn) {
        item.setSellIn(sellIn);
    }

Much better. Now let’s imagine we’ve done this for the rest as well:

package org.geenz.gildedrose.cucumber;

import cuke4duke.annotation.I18n.EN.Given;
import cuke4duke.annotation.I18n.EN.Then;
import cuke4duke.annotation.I18n.EN.When;
import org.geenz.gildedrose.ItemFactory;
import org.geenz.gildedrose.StoreKeepingItem;

import static org.junit.Assert.*;

public class BasicFeature {

    private StoreKeepingItem item = null;

    @Given ("^a Store Keeping Item with name \"([^\"]*)\"$")
    public void aStoreKeepingItemWithName(String name) {
        item = ItemFactory.create(name, 0, 0);
    }

    @Given("^a sellIn date of ([0-9]*)$")
    public void aSellInDateOf(int sellIn) {
        item.setSellIn(sellIn);
    }

    @Given("^a quality of ([0-9]*)$")
    public void aQualityOf(int quality) {
        item.setQuality(quality);
    }

    @When("^the Item is updated$")
    public void theItemIsUpdated() {
        item.doDailyInventoryUpdate();
    }

    @Then("^the sellIn is (.*)$")
    public void theSellInIs(int expectedSellIn) {
        assertEquals(expectedSellIn, item.getSellIn());
    }

    @Then("^the quality is ([0-9]*)$")
    public void theQualityIs(int expectedQuality) {
        assertEquals(expectedQuality, item.getQuality());
    }
}

The test still passes, so this seems to work! For the rest of the scenarios, see the checked-in feature file on GitHub. Take a look at that file, and compare it to the README. Which one is clearer to you? Do they both contain all the information you need?

The odd one out

Note that there is one scenario there that is not covered by the methods that we have, so once you try to run that particular scenario, things fall apart:

        Scenario: Quality and SellIn of a Sulfuras item does not change
                Given a Store Keeping Item with name "Sulfuras, Hand of Ragnaros" with a sellIn date of 5, a quality of 7
                When the Item is updated
                Then the sellIn is 5
                And the quality is 7

Since this item is supposed to be immutable, I can’t really go and set the sellIn or quality after it’s been created. So I made a separate method to pass in the sell-in and quality at initialisation time:

    /**
     * Separate given clause to set the initial state, since some items can't be changed after initial creation.
     */
    @Given ("^a Store Keeping Item with name \"([^\"]*)\" with a sellIn date of ([0-9]*), a quality of ([0-9]*)$")
    public void aStoreKeepingItemWithNameAndSellInDateAndQualityOf(String name, int sellIn, int quality) {
        item = ItemFactory.create(name, sellIn, quality);
    }
There might be a nicer way to do this, either by creating all items this way, or by changing the way the immutable items work, but this was easy enough that I didn’t look any further.

So when we run all the scenarios, we get:

[INFO] Feature: Quality changes with sell-in date Feature
[INFO]   In order to keep track of quality of items in stock
[INFO]   As a ShopKeeper
[INFO]   I want quality to change as the sell-in date decreases
[INFO]
[INFO]   Scenario: Decreasing Quality of a Basic Item               # features/basic.feature:6
[INFO]     Given a Store Keeping Item with name "+5 Dexterity Vest" # BasicFeature.aStoreKeepingItemWithName(String)
[INFO]     And a sellIn date of 5                                   # BasicFeature.aSellInDateOf(int)
[INFO]     And a quality of 7                                       # BasicFeature.aQualityOf(int)
[INFO]     When the Item is updated                                 # BasicFeature.theItemIsUpdated()
[INFO]     Then the sellIn is 4                                     # BasicFeature.theSellInIs(int)
[INFO]     And the quality is 6                                     # BasicFeature.theQualityIs(int)
[INFO]
[INFO]   Scenario: Quality decrease doubles after sell-in has passed # features/basic.feature:14
[INFO]     Given a Store Keeping Item with name "+5 Dexterity Vest"  # BasicFeature.aStoreKeepingItemWithName(String)
[INFO]     And a sellIn date of 0                                    # BasicFeature.aSellInDateOf(int)
[INFO]     And a quality of 10                                       # BasicFeature.aQualityOf(int)
[INFO]     When the Item is updated                                  # BasicFeature.theItemIsUpdated()
[INFO]     Then the sellIn is -1                                     # BasicFeature.theSellInIs(int)
[INFO]     And the quality is 8                                      # BasicFeature.theQualityIs(int)
[INFO]
[INFO]   Scenario: Quality never becomes negative                   # features/basic.feature:22
[INFO]     Given a Store Keeping Item with name "+5 Dexterity Vest" # BasicFeature.aStoreKeepingItemWithName(String)
[INFO]     And a sellIn date of 0                                   # BasicFeature.aSellInDateOf(int)
[INFO]     And a quality of 0                                       # BasicFeature.aQualityOf(int)
[INFO]     When the Item is updated                                 # BasicFeature.theItemIsUpdated()
[INFO]     Then the sellIn is -1                                    # BasicFeature.theSellInIs(int)
[INFO]     And the quality is 0                                     # BasicFeature.theQualityIs(int)
[INFO]
[INFO]   Scenario: Quality of Aged Brie increases with age  # features/basic.feature:30
[INFO]     Given a Store Keeping Item with name "Aged Brie" # BasicFeature.aStoreKeepingItemWithName(String)
[INFO]     And a sellIn date of 5                           # BasicFeature.aSellInDateOf(int)
[INFO]     And a quality of 1                               # BasicFeature.aQualityOf(int)
[INFO]     When the Item is updated                         # BasicFeature.theItemIsUpdated()
[INFO]     Then the sellIn is 4                             # BasicFeature.theSellInIs(int)
[INFO]     And the quality is 2                             # BasicFeature.theQualityIs(int)
[INFO]
[INFO]   Scenario: Quality of Aged Brie never increases past 50 # features/basic.feature:38
[INFO]     Given a Store Keeping Item with name "Aged Brie"     # BasicFeature.aStoreKeepingItemWithName(String)
[INFO]     And a sellIn date of 5                               # BasicFeature.aSellInDateOf(int)
[INFO]     And a quality of 50                                  # BasicFeature.aQualityOf(int)
[INFO]     When the Item is updated                             # BasicFeature.theItemIsUpdated()
[INFO]     Then the sellIn is 4                                 # BasicFeature.theSellInIs(int)
[INFO]     And the quality is 50                                # BasicFeature.theQualityIs(int)
[INFO]
[INFO]   Scenario: Quality of Backstage Passes increases by 1 if sell-in is greater than 10 # features/basic.feature:46
[INFO]     Given a Store Keeping Item with name "Backstage passes to a TAFKAL80ETC concert" # BasicFeature.aStoreKeepingItemWithName(String)
[INFO]     And a sellIn date of 11                                                          # BasicFeature.aSellInDateOf(int)
[INFO]     And a quality of 20                                                              # BasicFeature.aQualityOf(int)
[INFO]     When the Item is updated                                                         # BasicFeature.theItemIsUpdated()
[INFO]     Then the quality is 21                                                           # BasicFeature.theQualityIs(int)
[INFO]
[INFO]   Scenario: Quality of Backstage Passes increases by 2 if sell-in is less than 10 but more than 5 # features/basic.feature:53
[INFO]     Given a Store Keeping Item with name "Backstage passes to a TAFKAL80ETC concert"              # BasicFeature.aStoreKeepingItemWithName(String)
[INFO]     And a sellIn date of 6                                                                        # BasicFeature.aSellInDateOf(int)
[INFO]     And a quality of 20                                                                           # BasicFeature.aQualityOf(int)
[INFO]     When the Item is updated                                                                      # BasicFeature.theItemIsUpdated()
[INFO]     Then the quality is 22                                                                        # BasicFeature.theQualityIs(int)
[INFO]
[INFO]   Scenario: Quality of Backstage Passes increases by 3 if sell-in is 5 or less but more than 0 # features/basic.feature:60
[INFO]     Given a Store Keeping Item with name "Backstage passes to a TAFKAL80ETC concert"           # BasicFeature.aStoreKeepingItemWithName(String)
[INFO]     And a sellIn date of 5                                                                     # BasicFeature.aSellInDateOf(int)
[INFO]     And a quality of 20                                                                        # BasicFeature.aQualityOf(int)
[INFO]     When the Item is updated                                                                   # BasicFeature.theItemIsUpdated()
[INFO]     Then the quality is 23                                                                     # BasicFeature.theQualityIs(int)
[INFO]
[INFO]   Scenario: Quality of Backstage Passes is 0 after the concert (sell-in) passes      # features/basic.feature:67
[INFO]     Given a Store Keeping Item with name "Backstage passes to a TAFKAL80ETC concert" # BasicFeature.aStoreKeepingItemWithName(String)
[INFO]     And a sellIn date of 0                                                           # BasicFeature.aSellInDateOf(int)
[INFO]     And a quality of 20                                                              # BasicFeature.aQualityOf(int)
[INFO]     When the Item is updated                                                         # BasicFeature.theItemIsUpdated()
[INFO]     Then the quality is 0                                                            # BasicFeature.theQualityIs(int)
[INFO]
[INFO]   Scenario: Quality and SellIn of a Sulfuras item does not change                                             # features/basic.feature:74
[INFO]     Given a Store Keeping Item with name "Sulfuras, Hand of Ragnaros" with a sellIn date of 5, a quality of 7 # BasicFeature.aStoreKeepingItemWithNameAndSellInDateAndQualityOf(String,int,int)
[INFO]     When the Item is updated                                                                                  # BasicFeature.theItemIsUpdated()
[INFO]     Then the sellIn is 5                                                                                      # BasicFeature.theSellInIs(int)
[INFO]     And the quality is 7                                                                                      # BasicFeature.theQualityIs(int)
[INFO]
[INFO]   Scenario: Quality of Conjured items goes down twice as fast as a normal item before sell-in # features/basic.feature:80
[INFO]     Given a Store Keeping Item with name "Conjured Mana Cake"                                 # BasicFeature.aStoreKeepingItemWithName(String)
[INFO]     And a sellIn date of 5                                                                    # BasicFeature.aSellInDateOf(int)
[INFO]     And a quality of 20                                                                       # BasicFeature.aQualityOf(int)
[INFO]     When the Item is updated                                                                  # BasicFeature.theItemIsUpdated()
[INFO]     Then the quality is 18                                                                    # BasicFeature.theQualityIs(int)
[INFO]
[INFO]   Scenario: Quality of Conjured items goes down twice as fast as a normal item after sell-in # features/basic.feature:87
[INFO]     Given a Store Keeping Item with name "Conjured Mana Cake"                                # BasicFeature.aStoreKeepingItemWithName(String)
[INFO]     And a sellIn date of 0                                                                   # BasicFeature.aSellInDateOf(int)
[INFO]     And a quality of 20                                                                      # BasicFeature.aQualityOf(int)
[INFO]     When the Item is updated                                                                 # BasicFeature.theItemIsUpdated()
[INFO]     Then the quality is 16                                                                   # BasicFeature.theQualityIs(int)
[INFO]
[INFO] 12 scenarios (12 passed)
[INFO] 64 steps (64 passed)
[INFO] 0m1.617s
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 10.861s
[INFO] Finished at: Fri Sep 16 15:57:23 CEST 2011
[INFO] Final Memory: 5M/106M
[INFO] ------------------------------------------------------------------------

So now we have a full set of acceptance tests, covering the whole of the requirements, and with only a very limited amount of code needed. Of course, that small amount of code is largely due to my having already refactored the original into something more manageable. It would be a nice next experiment to start with these scenarios, and grow the code from there. If you do this, let me know, and send a pull request!
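For anyone wanting to try that, here is a minimal sketch of what the step definitions behind the log output above could look like. To be clear: this is an illustration, not the actual code from the repository. I'm assuming the kata's usual Item class (public name, sellIn and quality fields) and a GildedRose class with an updateQuality() method, and I'm using the current Cucumber-JVM annotation style rather than the one from 2011:

import static org.junit.Assert.assertEquals;

import io.cucumber.java.en.Given;
import io.cucumber.java.en.Then;
import io.cucumber.java.en.When;

public class BasicFeature {

    private Item item;

    @Given("a Store Keeping Item with name {string}")
    public void aStoreKeepingItemWithName(String name) {
        item = new Item(name, 0, 0);
    }

    @Given("a sellIn date of {int}")
    public void aSellInDateOf(int sellIn) {
        item.sellIn = sellIn;
    }

    @Given("a quality of {int}")
    public void aQualityOf(int quality) {
        item.quality = quality;
    }

    @Given("a Store Keeping Item with name {string} with a sellIn date of {int}, a quality of {int}")
    public void aStoreKeepingItemWithNameAndSellInDateAndQualityOf(String name, int sellIn, int quality) {
        item = new Item(name, sellIn, quality);
    }

    @When("the Item is updated")
    public void theItemIsUpdated() {
        // Run a single day's update over just this one item.
        new GildedRose(new Item[] { item }).updateQuality();
    }

    @Then("the sellIn is {int}")
    public void theSellInIs(int expectedSellIn) {
        assertEquals(expectedSellIn, item.sellIn);
    }

    @Then("the quality is {int}")
    public void theQualityIs(int expectedQuality) {
        assertEquals(expectedQuality, item.quality);
    }
}

Note how the same handful of step definitions serves every scenario, including the parameterised ones below; that is a large part of why so little code is needed.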

Parameterization

For some types of tests, it makes sense to have a single scenario into which you put different data, expecting different results. By separating the scenario from the input and output data, you can make these kinds of tests much more readable. In the case of our example, you can find the same tests in the parameterised.feature file, but it’s small enough to simply include here:

Feature: Quality changes with sell-in date
        In order to keep track of quality of items in stock
        As a ShopKeeper
        I want quality to change as the sell-in date decreases

        Scenario Outline: Changing Quality of an Item
                Given a Store Keeping Item with name "<item name>"
                And a sellIn date of <sell in>
                And a quality of <quality>
                When the Item is updated
                Then the sellIn is <expected sell in>
                And the quality is <expected quality>

		Examples:
		| item name                                 | sell in | quality | expected sell in | expected quality |
		| +5 Dexterity Vest                         | 5       | 7       | 4                | 6                |
		| +5 Dexterity Vest                         | 0       | 10      | -1               | 8                |
		| +5 Dexterity Vest                         | 0       | 0       | -1               | 0                |
		| Aged Brie                                 | 5       | 1       | 4                | 2                |
		| Aged Brie                                 | 5       | 50      | 4                | 50               |
		| Backstage passes to a TAFKAL80ETC concert | 11      | 20      | 10               | 21               |
		| Backstage passes to a TAFKAL80ETC concert | 6       | 20      | 5                | 22               |
		| Backstage passes to a TAFKAL80ETC concert | 5       | 20      | 4                | 23               |
		| Backstage passes to a TAFKAL80ETC concert | 0       | 20      | -1               | 0                |
		| Conjured Mana Cake                        | 5       | 20      | 4                | 18               |
		| Conjured Mana Cake                        | 0       | 20      | -1               | 16               |

As you can see, this is much shorter. It does skip the detailed descriptive text for each scenario, which I thought was somewhat of a loss in this particular case. For tests with larger sets of data it is probably a great feature, but for this example I thought the more verbose scenarios made the feature clearer.

If you want to know more, a great resource is the free on-line book cuke4ninja. The Cucumber wiki and its list of additional tutorials are also great sources of knowledge.

Adventures in Rework

I came across this post by Martin Fowler, on the Strangler Application pattern, and its accompanying paper. This brought back memories of some of my own adventures in rework, some fond, others not so much. In all cases, though, I think they were very valuable lessons on what to do and what not to do when reworking existing systems. There’s no reason not to share those lessons, especially as some of them were rather painful and expensive to learn. The paper linked above is a success story; mine are more the kind you tell your children about to keep them away from dysfunctional projects. This is not because these projects were done by horrible or incompetent people, or completely ineffective organisations. They weren’t. But sometimes a few bad decisions can stack up, and result in the kind of war stories developers share after a few beers.

I’ll wait here until you’re back from the fridge.

When talking about rework, note that I’m not calling these ‘legacy systems’: in a few of these cases the ‘legacy’ system wasn’t even finished yet before the rework began.

‘Rework’ is simply a name for rebuilding existing functionality. It is distinct from refactoring in that it usually does not change existing code, but replaces it. Another distinction is one of scope: refactoring is characterised by small steps of improvement, while rework is about replacing a large part of a system, or an entire system. Or in simpler terms:

Rework = bad, Refactoring = good

One problem is that very often people say refactoring when they mean rework, giving the good practice of refactoring a bad name. When a developer or architect says ‘We need to do some refactoring on the system, we think it will take 4 weeks of work for the team’, what they are talking about is rework. Not surprisingly, many managers now treat refactoring as a dirty word…

When do you do rework?

Rework is not always bad. There can be good reasons to invest in re-implementation; usually maintainability and extensibility are among them, at least on the technical side. This type of rework is similar to refactoring in that there is no functional change, which means it is work that does not deliver any direct business value. From the point of view of the customer, or the company, these kinds of changes are ‘cost’ only.

Rework can also be triggered by changes in requirements. These might be functional requirements, where new functionality can’t easily be fitted into the current system. Usually, though, they are non-functionals. Such as: we need better scalability, but the platform we’re using doesn’t support that. Or: we need to provide the same functionality, but now as a desktop application instead of a web application (or vice versa).

Rework is also sometimes triggered by policy, such as “we’re moving all our development over to…”. And then Java, Scala, Ruby, ‘The Cloud’, or whatever you’re feeling enthusiastic about at the moment. This is not a tremendously good reason, but it can be a valid one in the right context, for example: “We’re moving all our development over to Java, since the current COBOL systems are getting difficult to maintain, simply because we’re running out of COBOL programmers.”

Adventure number one

This was not the first piece of rework I was involved with, but it is a good example of the importance of always continuing to deliver value, and of keeping up trust between different parties in an organisation. No names, to protect the innocent. And even though I certainly have opinions on which choices were good and which were not, this is not about assigning blame. The whole thing is, as always, the result of the complete system in which it happens. The only way to avoid such outcomes is complete transparency, and the trust that hopefully results from it.

A project I worked on was an authorisation service for digital content distribution. It could register access rights based on single-sale or subscription periods. This service in the end was completely reworked twice, with another go in the planning. Let’s see what happened, and what we can learn from that.

The service had originally been written in PHP, but was re-created from scratch in Java. I don’t know all the specifics of why this was done, but it involved at least an expectation of better performance, and there was also a company-wide goal of moving to Java for all server-side work. The non-functional requirements and policy triggers from above.

This was a complete rebuild. Everything, including the database structures, was created from scratch. There was a big data migration, and extensive testing to ensure that customers wouldn’t suddenly find themselves with missing content, or subscriptions cut short by years, months or minutes.

Don’t try to change everything at once

A change like that is very difficult to pull off. It’s even more difficult if the original system has a large amount of very specific business logic in code to handle a myriad of special cases. Moreover, since the reasons for doing the rework were completely internally directed, the business side of the company didn’t have much reason to be involved in the project, or much understanding of the level of resources being expended on it. It did turn out, though, that many of the specific cases were unknown to the business. Quickly growing companies, and all that…

[sidebar: The Business]
I use the term ‘The Business’ regularly in this post. This is intentional. There is a valid argument, made often in the Agile community, that talking about ‘the business’ is an anti-pattern: a sign of overly separated responsibilities and silo thinking. I think that in a lot of cases this is true, though sometimes you just need a word…
In this case, there actually was a separation. There were some silos. And the use of the word is accurate.
And I couldn’t think of another term.
[/sidebar]

Anyway, the project was late. Very late. It was already about 9 months late when I first got involved with it. At that point, it was technically a sound system, and was being extensively tested. Too extensively, in a way. You see, the way the original system had so many special cases hard-coded was in direct conflict with the requirement for the new system to be consistent and data-driven. There was no way to make the new implementation data-driven and still get 100% the same results as the old one.

Now, this should not be a problem, as long as the business impact is clear, and the business side of the organisation is involved closely enough to make clear decisions, early on, about which deviations from the old system are acceptable and which are not. A large part of the delays came simply from that discussion not taking place until very late in the process.

As with all software development, rework needs the customer closely involved

In the end, we did stop trying to reach 100% compliance, and got to sensible agreements about edge cases. Most of these cases simply meant that a certain subset of customers would have a subscription end a few days or weeks later, with negligible business impact. They still caused big delays in the project delivery!

What problems to fix is a business decision

Unfortunately, though the system went live eventually, it did so with a year’s delay. It was also too late. On the sales and marketing side, a need had grown not only to register subscriptions for a certain time period, but also to be able to bill them periodically (monthly payments, for instance). Because the old system hadn’t been able to do this, neither could the new one. And because the new system had been designed to work very similarly to the old one, this was not a very straightforward piece of functionality to add.

If you take a long time to make a copy of an existing system, by the time you’re done they’ll want a completely different system

Of course, it was also not completely impossible to add, but we estimated at the time that it would take about three months of work. And that would be *after* the first release of the system, which hadn’t taken place yet. That would bring us to somewhere around October of that year, while business realities dictated that this type of new product would have the most impact if released to the public in early September.

So what happens to the trust between the development team and the customer after a late release that doesn’t give the customer any new functionality? And what if the customer, after not getting any new functionality for a full year, then has a need and hears that he’ll have to wait another six months before he can get it? He tells the development team: “You know what? I’ll get someone else to do it!”

Frustrate your customer at your peril!

So the marketing department gets involved in development project management. And they put out a tender to get offers from different parties. And they pick the cheapest option. And it’s going to be implemented by an external party. With added outsourcing to India! Such a complex project set-up that it *must* work. Meanwhile, the internal development organisation is still trying to complete the original project, and keeps out of this follow-up effort.

Falling out within the organisation means falling over of the organisation

Now this new team is working on this new service, which is going to be about authorisation and subscriptions. They talk to the business, and start designing a system based on that (this was an old-school waterfall project). Their requirements focus a lot on billing, of course, since that is the major new functionality relative to the existing situation. But they also need something to bill, and that means the system must also support subscriptions without a hard end-date, which are renewed with every new payment. The existing system doesn’t support that, which was a large part of the three-month estimate we were talking about.

Now a discussion starts. Some are inclined to add this functionality to the old system, and make the new project about billing and payment tracking. But that would mean waiting for changes in the existing system. So others push to make the new system track the subscription periods. But then we’d have two separate systems, and we’d need to check both to see if someone is allowed access to a specific product. Worse, since you’d have to be able to switch from pre-paid (existing) to scheduled payments, there would be logic overlapping the two.

Architecture is not politics. Quick! Someone tell Conway!

All valid discussions on the architecture of this change. Somehow, there was an intermediate stage where both the existing and the new system would keep track of everything, and all data would magically be kept in sync between the two, even though they had wildly different domain models for subscriptions. That would have made maintenance… difficult. So the decision was to move everything into the new system, and keep the old system only as a stable interface toward the outside world (i.e. a façade talking to the new system through web services, instead of to its own database).

So here’s a nice example where *technically* there wasn’t much need for rework. Changes were needed, but they were fairly easy to incorporate into the existing architecture. We were in a hurry, but the company wouldn’t fall over if we were late (even if we did have to delay introducing new payment methods and product types). But the eroded trust levels within the company meant the preference was to start from scratch, instead of continuing from a working state.

Trust is everything

Now, for the observant among you: yes, some discussion was had about how we had just proven that such a rework project is very complex, and last time took a lot of time to get right. But the estimates of the external party indicated that the project was feasible. One of the reasons they thought this was that they’d talked mostly to the sales side of the organisation. That is fine, but since they didn’t talk much to the development side, they really had no way of knowing about the existing product base, and its complications and special cases. Rework *should* be easier, but only if you are in a position to learn from the initial work!

If you do rework, try to do it with the involvement of the people who did the original work

It won’t come as a big surprise that this project did not deliver by early September as originally intended. In fact, it hadn’t delivered by September of the following year. In that time the original external party had been extended and/or replaced (I never quite got that clear) by a whole boatload of other outsourcing companies and consultants. The cost of the project skyrocketed. Data migration was, of course, again very painful (but this time the edge-case decisions were made much earlier!)

A whole new set of problems came from a poorly understood domain, and from having no access during development to all of the (internal) clients that were using the service in different ways. This meant that when integration testing started, a number of very low-level decisions about the domain of the new application had to be reconsidered. Some of those were changed; others resulted in work-arounds in all the different clients, since the issues were making a late project later.

Testing should be the *first* thing you start working on in any rework project. Or any project at all.

Meanwhile, my team was still working on the existing (now happily released) system: maintenance, new features, and the new version that ran against the new system’s web services. And they were getting worried, because they could see an army of consultants packing up and leaving them with the job of keeping the new system running. And when it became clear that the intention was to do a big-bang release, without any way to do a roll-back, we did intervene. One of the developers created a solution to pass all incoming requests to both the old and the new system, and do extensive logging of the results, including some automated comparisons. Very neat, as it allowed us to keep both systems in sync for a period of time, and see if we ran into any problems.

Always make a roll-back as painless as possible

This made it possible to have the new system in shadow-mode for a while, fix any remaining issues (which meant doing the data-migration another couple of times), and then do the switch painlessly by changing a config setting.
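To give an idea of the mechanism, here is a minimal sketch of such a shadow-running wrapper. All the names here are hypothetical, and the real solution was of course woven into the service’s actual request handling, but the shape was roughly this: answer from the authoritative system, replay the request against the shadow system, and log any difference.

import java.util.logging.Logger;

// Hypothetical service interface; the real one had more operations.
interface AuthorisationService {
    boolean hasAccess(String customerId, String productId);
}

// Serve from the primary system, replay against the shadow system,
// and log any differences for later analysis.
public class ShadowingAuthorisationService implements AuthorisationService {

    private static final Logger LOG = Logger.getLogger("shadow-comparison");

    private final AuthorisationService primary; // the old system, still authoritative
    private final AuthorisationService shadow;  // the new system, running along

    public ShadowingAuthorisationService(AuthorisationService primary,
                                         AuthorisationService shadow) {
        this.primary = primary;
        this.shadow = shadow;
    }

    @Override
    public boolean hasAccess(String customerId, String productId) {
        boolean primaryResult = primary.hasAccess(customerId, productId);
        try {
            boolean shadowResult = shadow.hasAccess(customerId, productId);
            if (primaryResult != shadowResult) {
                LOG.warning("Mismatch for " + customerId + "/" + productId
                        + ": primary=" + primaryResult + ", shadow=" + shadowResult);
            }
        } catch (RuntimeException e) {
            // The shadow system must never break production traffic.
            LOG.warning("Shadow call failed: " + e);
        }
        return primaryResult;
    }
}

Once the comparison logs stay quiet for long enough, rolling forward is just a matter of swapping which of the two systems is authoritative; in our case, that config setting.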

Make roll-back unnecessary by using shadow-running and roll-forward

So in the end we had a successful release. In fact, the whole project was considered by the company to be a great success. In the sense that any landing you can walk away from is a good one, this is of course true. For me, it was a valuable lesson, teaching among other things:

  • Haste makes waste (also known as limit WIP)
  • Don’t expect an external supplier to understand your domain, if you don’t really understand it yourself
  • Testing is the difference between a successful project and a failed one
  • When replacing existing code, test against live data
  • Trust is everything

I hope this description was somewhat useful, or at least entertaining in a schadenfreude kind of way, for someone. It is always preferable to learn from someone else’s mistakes, if you can… I do have other stories of rework, which I’ll make a point of sharing in the future, if anyone is interested.

Code Cleaning: How tests drive code improvements (part 1)

In my last post I discussed the refactoring of a particular piece of code. Incrementally changing the code had resulted in some clear improvements in its complexity, but the end result still left me with an unsatisfied feeling: I had not been test-driving my changes, and that was noticeable in the resulting code!

So, as promised, here we are to re-examine the code as we have it, and see what happens when we start testing it more thoroughly. In my feeble defence, I’d like to mention again why I delayed testing: I really didn’t have a good feel for the intended functionality, and because of that decided to test later, when I hoped I would have a better idea of what the code was supposed to do. That moment is now.

Again a fairly long post, so it’s hidden behind the ‘read more’ link, sorry!

Continue reading

Code Cleaning: A Refactoring Example In 50 Easy Steps

One of the things I find myself doing at work is looking at other people’s code. This is not unusual, of course, as every programmer does it all the time. Even if the ‘other people’ is himself, last week. As all you programmers know, rather often other people’s code is not very pretty. Partly this can be explained by the fact that, as every programmer knows, no one is quite as good at programming as himself… But very often, way too often, the code really is not all that good.

This can be caused by many things. Sometimes the programmers are not very experienced. Sometimes the pressure to release new features is such that programmers feel pressured into cutting quality. Sometimes the programmers found the code in that state, and simply didn’t know where to start to improve things. Some programmers may not even have read Clean Code, Refactoring, or the Pragmatic Programmer! And maybe no one ever told them they should.

Recently I was asked to look at a Java codebase, to see if it would be possible for our company to take it into a support contract, or what would be needed to get it to that state. This codebase had a number of problems: a lack of tests, lots of code duplication, and a very uneven distribution of complexity (lots of ‘struct’ classes, with the logic that should be in them spread out, and duplicated, over the rest). There was plenty wrong, and Sonar quickly showed most of it.

Sonar status

When discussing the issues with this particular codebase, I noticed that the developers already knew quite a few of the things that were wrong. They did not have a clear idea of how to get from there to a good state, though. To illustrate how one might approach this, I spent a day making an example out of one of the high-complexity classes (cyclomatic complexity of 98).

Larger examples of refactoring are fairly rare out there, so I figured I’d share this. Of course, package and class names (and some constants/variables) have been altered to protect the innocent.

I’d like to emphasise that none of this is very special. I’m not a wizard at doing this, by any standard; I don’t even code full time nowadays. That’s irrelevant: the point here is precisely that by taking a series of very simple and straightforward steps, you can improve your code tremendously. Anyone can do this! Everyone should…
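To give a flavour of how simple those steps can be, here is a generic before-and-after of a single ‘extract method’ refactoring. This is not code from the codebase in question, just a made-up illustration of the kind of step that, repeated often enough, brings a complexity of 98 down to something sane:

// A made-up example class, just enough to show the refactoring.
class Order {
    String productName;
    int quantity;
    double unitPrice;
}

// Before: one method mixing validation, calculation and formatting.
class InvoiceFormatterBefore {
    String invoiceLine(Order order) {
        if (order == null || order.quantity <= 0) {
            throw new IllegalArgumentException("invalid order");
        }
        double total = order.quantity * order.unitPrice;
        if (order.quantity > 100) {
            total = total * 0.9; // bulk discount
        }
        return order.productName + ": " + total;
    }
}

// After: each concern extracted into a small, named method.
class InvoiceFormatterAfter {
    String invoiceLine(Order order) {
        validate(order);
        return order.productName + ": " + totalPrice(order);
    }

    private void validate(Order order) {
        if (order == null || order.quantity <= 0) {
            throw new IllegalArgumentException("invalid order");
        }
    }

    private double totalPrice(Order order) {
        double total = order.quantity * order.unitPrice;
        return qualifiesForBulkDiscount(order) ? total * 0.9 : total;
    }

    private boolean qualifiesForBulkDiscount(Order order) {
        return order.quantity > 100;
    }
}

Nothing changed in behaviour; the conditions just got names, and the next reader no longer has to reverse-engineer the intent.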

I don’t usually shield off part of my posts behind a ‘read more’ link, but this post had become HUGE, and I don’t want to harm any unsuspecting RSS readers out there. Please, do read the whole thing. And let me (and my reader and colleagues) know how this can be done better!

Continue reading

Learning is key

An old article I just came across posits that learning is the thing of value in software development:

When we present this hypothetical situation to students – many of them with 20+ years experience in building software – they typically respond with anywhere between 20% to 70% of the original time.  That is, rebuilding a system that originally takes one year to build takes only 2.5 to 8.5 months to build.  That’s a huge difference!  It’s hard to identify another single factor that could affect software development that much!

The article goes on to discuss in detail how learning is enabled by agile processes. The advantages of quick feedback cycles are not just ‘fail early’, but also ‘learn early’ (and continuously).

Agile Feedback Loops

BDD intro: TDD++

While looking for ways to make using Selenium nicer, wandering a bit through WebDriver descriptions and FitNesse explanations, I ran into this nice Introduction to Behaviour Driven Development.

Dan North explains how he arrived at the ideas of BDD, starting from a desire to explain how best to do Test Driven Development in his teams, and arriving at this structured way of creating the tests that drive implementation. Very illuminating, and worth reading if you take your testing seriously.

It also made me browse on a bit, since getting decent tests running often involves all kinds of fixture nastiness to get a usable data-set in place. I found the Test Data Builder pattern promising, but I’ll have to use it to know for sure. When I do, I’ll probably use Make It Easy to, well, make it easier.
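For those who haven’t run into it, the gist of the Test Data Builder pattern is a builder that supplies sensible defaults for every field, so a test only has to state the values it actually cares about. A minimal sketch, with a made-up domain class and defaults:

// Hypothetical domain class for the example.
class Subscription {
    final String customer;
    final int months;
    final boolean autoRenew;

    Subscription(String customer, int months, boolean autoRenew) {
        this.customer = customer;
        this.months = months;
        this.autoRenew = autoRenew;
    }
}

class SubscriptionBuilder {
    // Sensible defaults: tests override only what matters to them.
    private String customer = "some customer";
    private int months = 12;
    private boolean autoRenew = false;

    SubscriptionBuilder forCustomer(String customer) {
        this.customer = customer;
        return this;
    }

    SubscriptionBuilder lastingMonths(int months) {
        this.months = months;
        return this;
    }

    SubscriptionBuilder withAutoRenew() {
        this.autoRenew = true;
        return this;
    }

    Subscription build() {
        return new Subscription(customer, months, autoRenew);
    }
}

A test that only cares about auto-renewal then reads new SubscriptionBuilder().withAutoRenew().build(), and it doesn’t break when an unrelated field is added to Subscription later.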


Reading Up: Books Every Programmer Should Read

When discussing books on software engineering with colleagues, I got the idea of listing the best books I’ve read in the past 15 years. Because it seems useful, but also because it will allow others to tell me which ones I should have read… Let’s start with some technical books. I’ve never had much taste for books on overly specific technology subjects, so there’s no ‘J2EE Internals’ or ‘Jini Programming for Dummies’ here. Not that I never read anything like that, but those were never books that really influenced how I do my job. Looking at this list, it seems I’m not very imaginative, since most of these are very well known classics. Still:

Larman’s Applying UML and Patterns (in its first edition; I haven’t read the third edition I’m linking to here) was the book that first really explained Object Orientation to me. I had had courses in OO before that, but I’m pretty sure I didn’t actually understand it before reading this book.

It was also a nice introduction to UML, which I went all meta-meta-model-overboard on for a while before calming down and focusing on code.

There are other books on UML (UML Distilled is a good one), and others on Design Patterns (see below), but if you want to learn Object Oriented Design, and get a good introduction to iterative development, UML and patterns along the way, then this is the book to get.

I think everyone (who is remotely interested in OO) has at least heard about Design Patterns, by the ‘Gang of Four’. The first 65 or so pages of the book are a discussion of what patterns are, why they are useful, and how they are structured. The rest of the book is a catalogue of patterns.

Reading the patterns gave me, at least, a constant ‘Aha-Erlebnis’, ranging from ‘So that’s why they did that!’, to ‘So that’s what the thing I was doing in that project is called!’, to (most often :-) ‘Damn! That’s what I should have used there-and-there!’.

Very much recommended for its obvious and continuing relevance to anyone writing code. I’ve noticed recently that I’ve been forgetting which pattern was which (not enough coding…), so I suppose it’s time to do a little refresher here.

Refactoring has in my mind always been the companion book to the Design Patterns book. It is structured in a similar way: a part explaining what refactoring is (‘improving the design of existing code’, as the subtitle proclaims), followed by a long list of refactorings, structured in the same way as in the Patterns book: how to apply them, and in which situations. This book also introduced the concept of ‘code smells’: those places in the code where you know something is wrong, even in the cases where you don’t know what it is that’s wrong. Again very much recommended. The various descriptions of what is wrong, why, and how to fix it are a great learning experience.

In Test Driven Development: By Example, Kent Beck, of eXtreme Programming and JUnit fame, goes into deep detail on the XP practice of TDD by showing step-by-step implementations. He also explains why TDD works so well: that it is not just a way to increase your test coverage, how it drastically improves design, and how to use it to write your own unit-test framework.

For some nice links to TDD articles and to Beck’s (and others’) screencasts, see my earlier post. If you haven’t tried TDD yourself, give it a try. The results are surprising…

I haven’t actually read Clean Code yet, but based on reviews, and on the fragments that I did read, it’s another must-read. Which is why it is on my nightstand in the ‘to read’ queue… The Refactoring book teaches you a lot about improving code, but this one gives more basic advice on keeping your code clean, readable, understandable and easy to change.

More when I’ve finished reading this one…

One other book that I’m aiming to add to my queue is Michael Feathers’ Working Effectively With Legacy Code, which builds on the Refactoring concept, but grounds it in the ugly reality of dealing with legacy systems (defined as code that is not under test). After reading the original article, and recognising the situations he’s talking about, I’m convinced this is a good one to add.

The same is true for the Pragmatic Programmer book. I did read Andy Hunt’s Pragmatic Thinking and Learning, and thoroughly enjoyed it, so I expect I’ll be reading this one in the coming months as well.

Next time, books related to Lean and Agile development, which have been my main reading material for the past year…