27 March 2014

Testing in Agile

A broad topic, and I should narrow it down, but I just wanted a quick post about something I've been getting frustrated with lately.

A developer who can write useful tests for their own code is awesome, although, as the head of Engineering in my current job once described it, it should also be Developing 101: write unit tests so you know when something breaks, write integration tests where they make sense, reduce your own headaches later, ease debugging, and have some idea of where not to look when things go wrong.

What we don't need are developers who specialise in test automation. Now, I have met some very fine testers who started their careers as developers, so I want to make it clear that I'm not talking about them. They were passionate about quality, wanted to get to the root of how to make things better, and decided that testing would give them a wider view of the situation. They put in the effort to learn what a good test is, to learn how to test, to learn techniques for finding weaknesses in the SUT quickly and efficiently, and they generally had a very good grasp of when to use their tools and when to use their brains.

But what I'm talking about are those who want to write code, and have been directed to write code to automate the higher level functional tests that are the normal domain of testers. And who spend more time thinking about the technology that they are using to do this than the tests that they are trying to automate. Or whether those tests are good tests to automate. Or whether you should break down those high level functional tests into smaller tests and automate those smaller tests to get the same information, minus the brain of the tester to interpret the results.

What we need instead in Agile are more people coming from a testing background who are technically savvy and passionate about quality: passionate about testing well and testing better, automating the boring stuff with tools they can learn and start using in a few days, pushing back on the rest of the agile team for the things they need in order to automate what makes sense to automate, manually testing what needs to be tested only once, using exploratory testing for fuzzy areas, and writing their own tools to generate their test data or environment setup. People who are able to talk with developers and understand what is being done, spot the holes between subsystems before they become bugs, talk to Product Owners to find out what they really want, and find out what the users have been saying in order to direct future design or criticise current design. We need testers who can do all the traditional test tasks well, and who are also technical and quick on the uptake. As a born tester, I don't like being labelled as "QA", but looking at what I do on a daily basis, that is the better description compared to "Automation Engineer".

01 February 2014

Just fix it already!


I was asked in a fairly recent job interview how, when faced with a developer who didn't want to fix a bug that I raised, I would convince them to fix it.

Ironically, I was asked exactly the same question in my very first interview for a testing job, when my answer was, well, incredulity that such a thing would ever happen. Oh, how much I've learnt since then.

My answer in the recent interview was that firstly, I have learnt not to be too protective of my bugs - they aren't my babies, they're my opinion, backed by evidence of actual behaviour but not necessarily of correct expected behaviour - and secondly, that over the years I've managed to amass a whole range of techniques to get all of us to agree on what the expected behaviour should be and to want the system to behave that way.

I didn't go into many details in the interview, since it was not one of the main questions, but I thought I'd share some ideas here for how I reduce the conflict that can arise.


  1. My first step, in any project, is to build a good relationship with the developers whose code I am testing, and the product manager whose system I am hoping to perfect. I'm the epitome of a team player, even if I am shy and introverted, so this bit is quite easy for me, but I use an enthusiastic approach - none of us* want to be in at weekends working on last-minute fixes to be pushed through to production immediately because the customer demands it, so testing is essentially a service to the developers so that we can spend our weekends doing stuff we enjoy.
  2. Chattering around what I think might be a bug. "There's something weird", "Can you tell me what I'm doing wrong here?", "Why why why?", "I don't know, what do you think should happen here?" and occasionally just looking pointedly pained - we're helping each other out, I'm not pointing at somebody else's booboo. (And truthfully, there is a good chance that it is my booboo anyway.)
  3. Looking deeper and deeper into the consequences of the current behaviour. Does it break the user's flow, will it confuse technical support, is there a memory leak near it or because of it, is there something fundamentally shaky about how that whole area has been implemented that will come back and bite us later? Will we, heaven forbid, have to come in at the weekend??
  4. If I should have (or even could have) found a problem earlier, confessing that I should have found it earlier. This gets us over the whole discussion about how long it has been that way and not been a problem before, and it ties into my approach that I provide a service to the team.
  5. And now into the fun ways, once the team is comfortable together. Everybody's favourite gif, the one at the top of this post: I have it pinned to the wall of my cubicle, so that I can (jokingly) point from one side to the other during an explanation of why the bug isn't a bug.
  6. "You mean, you want it to work that way?" This is great for those problems that are a result of poor design decisions, and even if these are the least likely to actually get fixed (depending on what stage of the project we are at, the legacy that we are working with, budget and so on, although I do make an effort to catch this stuff early) then they are still acknowledged as not being desirable behaviour.
  7. Looking more and more doubtful during an explanation of why it isn't a bug. And can I emphasise again, I am still open to being convinced, but if I have doubts then I just ... convey them in my facial expressions.
  8. Asking a lot of questions. If this is correct, then does that mean that that is wrong? How will the user recover from this situation? Should we put in some explanatory text? What is supposed to happen when I dongle the widget, given that the widget creation makes me pick this doodad? Can we change the default setting of the flahullah so that dongling the widget doesn't mean the automatic inclusion of the performance impacting gewgaw? (What kind of verb is dongling?) Doing this often finds easy fixes for bugs, or just convinces the developer that I have a point, or (perhaps) makes them put in a fix just so that I will stop talking.
The times that I still struggle are usually with developers who just assume that I am stupid - as you can see, I don't put myself up as a figure of authority, and there are times when my questions are met with condescending, trivial answers ("When you dongle the widget, the widget should be in a dongled state"), and while I manage to quash my immediate reaction ("Ugg want to know what dongled state") I may not get much further.

What are your strategies? 

* I am not dead against working overtime or weekends (dear future employers), but I do hate having to rush through a fix. It seldom ends happily for anyone, and in my experience can kick off long chains of fix, push, discover new problem, start all over again.

24 January 2014

The Whack-a-bug days

We're getting to the end of a three-month development project, exploring new technologies ahead of a much larger project that will kick off sometime next month, and I have been working like a dervish to keep up with five developers.

I'm feeling quite pleased, overall, with the testing I've done. I have a mix of scripted manual tests, automated functional GUI tests, and exploratory testing areas, plus some random shell and python scripts for helping me through the nasty bits. The developers on the project are all experienced, decent, and they produce quality code that is a joy to test. I'm not tripping over stupid bugs that stop my flow and prevent me from getting down into the nitty gritty details where I love to frolic.

This week I found an NPE in a single line of code, with the only possible case that would ever trigger it; an incorrect response from a REST call in such a corner case that I had to run the same script 400 times to check the fix; and my automated tests found two bugs immediately after the developer committed his changed code.

But then today came along and spoilt my whole week. There was a fairly big feature added to the product, with the terrifying comment "and then I found I had to refactor all the error handling", and all hell broke loose. I didn't get a green build all day, and left the office at 19.00 with it still red but narrowed down to a single bug, and with another problem - a jQuery parameter being intermittently set to null - raising its ugly (and I thought beaten) head all over the place. Oh, and the hour and a half I spent chasing a bug that turned out to be nothing more than a hardcoded value in a helper app as opposed to an actual bug.

To add to the fun, two of the identifiers I use for GUI elements in the automated tests changed name. And then one of them changed back to its old name. And this was apparently a really cool bug that I had found, but in reality was just me sitting in front of my monitor feeling confused.

Which got me thinking about how my job is really about being able to zoom in to the details, and then zoom out to see the big picture, on a minute by minute basis. And that while it is definitely better that I find those bugs now rather than in a month's time, working so that I do find them that quickly will inevitably result in days like today, when I am either chasing my own tail, or staring sadly at a build that has new failures for reasons that I don't want to look at at half past six on a Friday evening.

********************************************************************************

And yes, it has been a while. You're not interested in my excuses, so I won't make any. Nor will I make any vague plans to return. I'd like to keep this active as a place to talk about my work, but lots of other things have my attention too.

01 August 2008

"Oh, Q/A. Everybody tortures the guys in Q/A. It's like being hazed for a living."
JPod - Douglas Coupland

I was going to write something long and meaningful about finding this quote in JPod, but I started in my new job this week and it has all been kind of draining. So instead, just some of the notes I’ve made for this entry over the week:

• Testers hardly ever feature in fiction (books / TV / films), unless I have been reading / watching completely the wrong fiction (and I read around 100 books a year, say 75% fiction, plus watch a lot of telly)

• The only other mention of testers I remember in fiction was an episode of CSI Miami, where apparently somebody didn’t “need some mouth-breather telling me what’s wrong in my code”

• Actually, this doesn’t bother me at all. Maybe I just lack community spirit? But see also below

• The testing community has never, that I have seen, got anxious about the fact that we are portrayed in fiction as mouth-breathing torture victims.
  • Possibly this is because so many are trying to prove themselves to their colleagues that they don’t have time to care about what the wider world thinks. But I like to think it’s because we’re above it.
  • This is completely the opposite reaction to what the Science Fiction fandom community has in a similar situation. (In another one of my alternative lives, I lurk at the edges of SF fandom)
• Testing as an activity is in lots of fiction, just not software testing done by people called testers. Two examples from TV:

  • we have the Mythbusters, who basically design and execute engineering tests. The engineering they do well, the testing they do badly - I understand the restrictions they face, but I can't stop myself shouting at the telly. You can't prove something after just three tests that ignore all but a handful of factors!
  • then there is also CSI itself. The episodes I like best as a tester are the ones where they look back over their earlier work, and find new interpretations of the results that they got. I think that maybe they are also careful enough in the language they use to describe what they have found (except for David Caruso, who needs a punch in the face, but I digress).
• People who are big fans of either or both of the above shows have, without a trace of irony, told me that they find testing boring

• And as an aside, I get so irritated with people who don’t question themselves often enough to see the irony there...

And that’s all I have the energy for.

10 July 2008

Acceptance tests, part 1 of many

I have a new interview question:

"What is the difference between acceptance tests and systems tests?"

Like all good interview questions, it is one where I have my own opinions but nothing like a correct answer. In addition, it gives me information about the candidate that is not necessarily the information they think I am asking for.

Because I am generally the "good interviewer", and because I like people, the only way not to score points with me is to answer with no thought involved at all - "acceptance tests are done to accept the system", "there is no difference", "everything stops if the acceptance tests fail" - although now that I think about it, the last answer could lead to an interesting discussion, just not one that matches any reality I have worked in.

Extra points are scored by anyone who asks me a question back, especially a question about what I mean by acceptance tests: build acceptance tests, user acceptance tests, deployment acceptance tests, other kinds of acceptance tests...? This is one thing that I am trying to figure out: when we talk about acceptance tests, shouldn't we be very clear about who is accepting what? We continue the discussion with me indicating that I am talking about user acceptance tests.

I get a lot of answers talking about business requirements versus functional requirements, which is okay but a bit flat for my tastes. I have worked for years with no written business or functional requirements, and for years before that with out-of-date and incorrect requirements, and I know that I am not alone in this. So I ask the candidate: if I show them a particular test case, how will they be able to tell whether it is a system test or a user acceptance test?

Now we're into the real mess. I often get answers about user acceptance tests being at a "higher level" than system tests, but this can work both ways - what about a system test that was written before design was finalised? Won't that be as high level? And for truly high level acceptance tests, what is the difference between them and the actual business requirements that they are supposed to be testing? I'm not interested in the exam board answers, I am not in a world where they are true, and people lose points all over the place for spewing those answers back at me.

This is my opinion on the matter. There is no way to tell whether a particular test is a system test or a user acceptance test just by looking at the test case definition. The real factors that determine it are who executes the test, when, and what they are testing for.

I have written user acceptance tests for a number of projects, and the who determines how high level they are - a business owner who has followed the development project needs fewer instructions, and so a higher level test, than a business owner who has never been near the system, and will probably need instructions at the very least on what login to use.

The when and the what for are kind of linked together, and actually I don't want to say any more about them right now - we'll end up on a rant about agile development methodologies and it is late on a Thursday night during my vacation. Let's get back to them some other time.

20 June 2008

Testing your server log

Just a quick entry, because I’ve been on vacation for the past week, and my head is mainly full of swimming pools, sun, and too much rosé wine.

I’ve long been a fanatic for monitoring as many interfaces as possible when testing, and over the past year I’ve worked quite closely with our operations team and our internal users, which has just confirmed my fanaticism.

Introducing new testers to the project I’ve been working on, I give them a minimum of three locations to check for each test they execute: the GUI, the database, and the server log. What I absolutely don’t want happening is a failure in production to which we the testers can only answer “it worked on our test servers”. So the advantages to monitoring the database and the server log are actually twofold: 1) you are much more likely to catch intermittent problems with persistence, strange error messages, or buffer overflows, and 2) when something does go wrong in production, you already know what “normal” usage looks like.

I’ll talk more about database testing some other time, but here are some of the things I have tested for in the server log (with log4j):

• Errors are flagged with ERROR, and warnings are flagged with WARN

• Nothing is flagged as ERROR when it is not an error, or you know about every single exception to this (one of the third party components we use always writes to ERROR, for instance, so I pass this information on to the operations team)

• Keeping log levels at INFO, it is possible to reconstruct what a user has been doing

• Operations that execute continuously are logged as DEBUG, or can be filtered out of the log another way (and the production servers are configured to actually filter them out)

• Passwords are not logged in plain text, but user names are (this depends a lot on context: privacy laws may apply, or you may have so many users that logging system log-ins will cost more in disk space than it could possibly be worth)

• Errors either define the exact problem, or else point you to more information. “The file cannot be read” isn’t good enough if the file name isn’t printed out also with an ERROR flag. On the other hand, “SQL3096N, varchar(50), USER_ADDRESS” gives the knowledgeable debugger exactly enough information to pinpoint the error.
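
As a rough illustration, here is a minimal sketch of what automating a couple of these checks might look like - written in Python here, with the log path, the log4j line layout, and the "known noisy third party component" pattern all invented for the example:

```python
import re
import sys

# Assumed log4j layout of "date time LEVEL ..." - adjust the pattern to your own conversion pattern.
LOG_LINE = re.compile(r"^\S+\s+\S+\s+(?P<level>TRACE|DEBUG|INFO|WARN|ERROR|FATAL)\b")
KNOWN_NOISY = re.compile(r"ThirdPartyComponent")   # hypothetical component that always writes to ERROR
PLAINTEXT_PASSWORD = re.compile(r"password\s*[=:]\s*\S+", re.IGNORECASE)

def check_log(path):
    """Return (line number, reason, text) for every line a human should look at."""
    problems = []
    with open(path, encoding="utf-8", errors="replace") as log:
        for lineno, line in enumerate(log, start=1):
            match = LOG_LINE.match(line)
            if not match:
                continue  # stack traces and other continuation lines
            if PLAINTEXT_PASSWORD.search(line):
                problems.append((lineno, "possible plain-text password", line.strip()))
            if match.group("level") == "ERROR" and not KNOWN_NOISY.search(line):
                problems.append((lineno, "ERROR - check it really is one", line.strip()))
    return problems

if __name__ == "__main__":
    for lineno, reason, text in check_log(sys.argv[1]):
        print("line %d: %s: %s" % (lineno, reason, text))
```

It isn't a pass/fail test, just a way of surfacing the lines a human should look at.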

There’s more to it than this, I can spin out several scenarios to test that logging is adequate, but from the above I hope that a couple of things become clear: the end user is not necessarily the only user, and the users of the server log have radically different definitions of what is “user friendly”.

12 June 2008

I slink back to this blog, looking guilty. I hadn't forgotten about it, I hadn't abandoned it, and it isn't even that I didn't have anything to write about. I've been discovering a whole pile of interesting things over the past year, fodder for hundreds of blog entries, and I think that the very learning process I went through would have made interesting reading, but the plain truth is that I was so busy learning and working that I literally did not have any time for writing. Or for a life at all, really, although somehow I squeezed in the time to get married. (Two days before my wedding, I ran my last test of the day at 1.20am. Just to give some perspective.)


Now, I'm about to change jobs, to a company that promises me time outside of work too, and I'm going to go out on a limb and promise to blog here once a week. Even if it is only to talk about what I am doing on my holidays. I write slowly, and form my ideas slowly, testing them as I go, changing my mind, getting worried about how what I say will be interpreted, even when I am only talking about what I'm doing on my holidays, so my apologies in advance if I have over-committed myself.


So, what have I been doing? As a clue to what I will be talking about.


  • I have been dragged kicking and screaming into the Agile Development world, and though I am not passionate about it I can now admit that it works, and that testing has a place in it. In fact, given that the initial writings of the Agile gurus are so wrong-headed about testing, it's something that I can contribute to, can improve. What you won't hear about is how it has solved all my problems, how it has solved all the business's problems, and how the software we now deliver is bug free and all tests are fully automated.

  • I have had my first positive experiences with automating more than just the smoke and load tests. All those negative past experiences have stood me in good stead, and the project that I have been working on had some areas that were ideally suited for automation. The solution that I will leave behind me does what it is supposed to do, it isn't over-ambitious, but what I'm most interested in is that it meets the needs of more than just the testers, and those additional stakeholders have meant that the system will keep running. What you won't hear about is how the testers now have nothing to do because automation has solved all our problems.

  • I have been interviewing, hiring, and mentoring a lot of junior and not so junior testers. This has been a really good experience for me, I've moved from only hiring people based on their technical skills, people like me, to hiring testers who will give a different perspective, who I will argue with until we reach a compromise. A few weeks back, I had an epiphany about someone who I had done nothing but argue with, both of us always seeming to be heading in different directions. I started asking him different questions, deeper questions, until we found what exactly the source of our disagreement was, and then we together agreed a solution that would meet both his needs and my needs. It was a wonderful moment, and since then we've been working much better together. What you won't hear about is how my epiphany has solved all my problems, and will solve yours too if you just ask the same questions.

  • I have been introduced to a lot of new tools and methods, particularly for manual testing, and not a one of them will solve all my problems. More tools is good, new tools are also good, but sometimes, my old-fashioned Casio stopwatch is precisely the tool that I need for a particular test. People look at me strangely as I sit there hitting the lap button, then they borrow the stopwatch, and time how long they can hold their breath. A few days later, they are back again, borrowing the stopwatch for a test they want to run. I call that a success.

  • I've been thinking a lot about test data, creating it and using it, and been amazed at how little discussion there is about it. The industry that I work in has highly standardised data formats, and I need a lot of data, with different requirements for each test - sometimes one field should be restricted to just a few values, usually it should be random within a particular range, one field should be left out in this test, or have a very long value in another test, and in addition to these requirements, all the other fields should contain values, and if I'm running a very narrowly focused test those values should not affect the result. I usually use Perl for generating all this, sometimes Excel (although Excel has a nasty bug as far as CSV formatted data is concerned), but how come nobody talks about what they do? There's a rough sketch of what I mean just below.
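
A rough sketch of the kind of generator I mean - in Python rather than the Perl I actually use, with entirely made-up field names and ranges. The point is only that each test overrides the one field it cares about and everything else falls back to a safe default:

```python
import csv
import random

# All field names and ranges here are made up for illustration.
DEFAULTS = {
    "record_type": lambda: random.choice(["A", "B", "C"]),      # restricted to just a few values
    "amount": lambda: str(random.randint(1, 9999)),             # random within a particular range
    "reference": lambda: "REF%06d" % random.randint(0, 999999),
    "comment": lambda: "",                                      # usually irrelevant, so keep it quiet
}

def generate(rows, overrides=None, path="testdata.csv"):
    """Write `rows` records, applying per-field overrides for this particular test."""
    overrides = overrides or {}
    fields = list(DEFAULTS)
    with open(path, "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(fields)
        for _ in range(rows):
            writer.writerow([overrides.get(field, DEFAULTS[field])() for field in fields])

if __name__ == "__main__":
    # This test wants a very long value in one field; everything else stays at its safe default.
    generate(100, overrides={"comment": lambda: "x" * 5000})
```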

And then I've been doing a lot of other things, trying to see how I can do things better, make things better, not just in testing but for the product as a whole, for my colleagues. Looking for solutions that are better than putting people in a meeting room and hoping that they sort the problem out, better than fixating on a magic bean and complaining until it is bought and found to be a dud. You know how, if you've been in a really fast car or train, when you get out of it you kind of stagger as your velocity returns to normal? That's pretty much how my life has been. With normal velocity, I'm going to talk about what I've learned.

22 January 2007

The categorisation problem

I don't have any answers to this one (yet), but I've been thinking about categorising test cases a lot lately.

What normally happens when a new product is tested is that an architect creates a document with one particular view of the product, and the test cases are categorised around that view, with a couple of extra folders for things like test scenarios and smoke tests. From then on, each new release of the product has a similar structure for the test cases, and the end result is that the testers only think of the product in that structure, from that particular view.

To give an example, take a simple program like a calculator. You could categorise your test cases based on the input to the test cases, e.g. all test cases using the integer 2, or all test cases using floating point numbers. Alternatively, you could categorise your test cases based on user actions, e.g. entering data, selecting functions, requesting results, saving and recalling results. You could categorise based on components (interface, calculating engine), and probably many other ways that I'm not thinking of right now. But I expect that the most usual method is to categorise by functionality: addition, subtraction, logarithms, etc.

Which is all very well. Users probably see a calculator from that view themselves, so that is one advantage to it, and you can add categories for the things that don't quite fit, like more complicated calculations involving more than one function. Most testers will be able to look at those categories and come up with good test cases for each one, and within a short amount of time you'll find that the bugs reported are being categorised that way as well.

However, the other methods of categorisation have merit as well. You don't want to define separate test cases for saving and recalling addition results, and saving and recalling subtraction results, because the likelihood that there are any differences between the code used is low (one would hope, but shouldn't assume). On the other hand, you do want to have separate test cases for adding integers, negative numbers, floating point numbers etc, and multiplying the same.

In Mercury's TestDirector, this can be done by defining your test cases along functional lines in the Test Plan part of the application, and grouping the test cases along data lines in the Test Lab part of the application. Other test management tools have other solutions, although out of the tools that I have used, I like TestDirector's approach the best.
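
To make the idea of multiple views a little more concrete, here is a minimal, tool-agnostic sketch with made-up calculator test cases - nothing to do with how TestDirector stores anything, just an illustration that a grouping can be a query over tags rather than a fixed folder structure:

```python
from collections import defaultdict

# Made-up calculator test cases, each tagged along more than one dimension.
TEST_CASES = [
    {"id": "TC-001", "function": "addition",    "data": "integers",       "component": "engine"},
    {"id": "TC-002", "function": "addition",    "data": "floating point", "component": "engine"},
    {"id": "TC-003", "function": "subtraction", "data": "negative",       "component": "engine"},
    {"id": "TC-004", "function": "memory",      "data": "integers",       "component": "interface"},
]

def group_by(dimension):
    """Group test case ids along whichever dimension you want to view them from."""
    groups = defaultdict(list)
    for case in TEST_CASES:
        groups[case[dimension]].append(case["id"])
    return dict(groups)

print(group_by("function"))   # the usual functional view
print(group_by("data"))       # the same cases, sliced by test data instead
```

Tag the cases once, and the functional view and the data view fall out of the same list.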

A calculator is a simple program, and also a very familiar one. We are used to thinking about it from different views, like the functional and data views. When the product is solving a much less familiar problem, however, it becomes more difficult to think about it from different views. I've worked on projects where even the architects don't think of the product from more than one view, although they have more of an understanding that there are different views and what those might be.

In these more complicated projects, one particular view becomes dominant to such an extent that others become too difficult to contemplate. I'm not sure why this is so, but I can think of two good examples to demonstrate it.

1) Colours. In kindergarten, with our poster paints (and, in my case, an ungodly amount of mess) we learnt that the primary colours are red, yellow, and blue. Later, in physics, we learnt that in fact the primary colours are red, green, and blue. With an amount of experience in graphic design, for instance, people will eventually be able to translate "I want a yellower red" into RGB terms, but I for one have to experiment to find out what that means. And I still think in poster paint terms like "yellower red".

There are additional views of colours, too. CMYK is another graphic design one, wavelength is one from physics, there is monochrome vs polychrome, but with the exception of the last one I think that your average "user" will naturally gravitate towards a red, yellow and blue view.

2) Books. Most bookshops will categorise their stock along a variant of these lines: Fiction / Non-fiction, Genre / Subject, Author's surname, Author's first name or initials.* Foyles bookshop in London, however, used to categorise books primarily by publisher, and the practice was described as anything from "eccentric" to "a nightmare". "Nobody" thought of books along publisher dimensions (publishers, authors, other industry specialists, and the owners of Foyles excepted), and people found that they couldn't think along those lines. Beyond that, people hadn't even considered that thinking about books from this view was an option.

And that is the crux of the problem. In each of my examples, there are people (or, users) who think about a particular topic along different lines from the majority. As testers, we should try and cater for their worldviews as well as the majority view, even if any bugs found from that view are ultimately left unfixed, as part of our aim to report on how well the product we are testing solves the business problem. But once we have our test cases divided from one particular viewpoint, we too are hampered in trying to think of them from a different view.

I have some ideas for how tools could give more help with solving this problem, but that entry will wait for another day.

* The vagaries of categorising along genre lines are a topic that gets much debate in science fiction and fantasy circles, usually to propose that all fiction books be sorted alphabetically by author (with the subtext that of course ghetto genres like romance or crime fiction be kept out of it), or very tiresome discussions about how popular book X is science fiction but is shelved with mainstream fiction due to snobbery. What is seldom considered is that genre is just one way of describing a work of fiction, almost all of which have some subjective element to them. I would like to categorise my books along the lines of "good fantasy about dragons with no elves", for instance, and see how many dimensions I need before I have one and only one book in each category. I don't have the time, but it is fun to think about.

08 January 2007

Test Tools

(There has been news coverage recently on the number of blogs started and then abandoned - something like two million, if memory serves. Tied to the coverage, a new term was coined to describe people who start blogs and abandon them - cloggers. So, brought to you courtesy of a vague sense of guilt, a new entry.)

Very few of the tools I use for testing are specifically designed for testing. The honourable exceptions are defect reporting tools and test management software, but in both of those cases I would be (and have been) happy using a spreadsheet or some other general tool in certain circumstances.

The reason I use so few specialised testing tools is probably because most of my background has been on UNIX systems, where there are many powerful general tools that can be easily manipulated and strung together to produce tools for specific tasks. I also have a suspicion that specialised testing tools are hugely over-priced, and either solve problems that I don't need to solve or are so generalised that the effort required to configure them to do what I need to do is larger than that needed for me to write a tool myself in Perl. *

For the first time ever in my career, I am now working on a product that is installed onto a normal user PC running Windows. And I have found a wonderful, non-testing-specific tool that has found several bugs that I probably wouldn't have found without it, and helped diagnose countless others. Process Explorer, distributed free by what used to be sysinternals.com, is basically a souped up version of Task Manager, or a Windows GUI version of the top command. **

I can see what processes are running, when they started, and what started them. I can see CPU usage, RAM consumption, Virtual memory usage, and IO history for the whole system, or for each individual process. I can see what threads each process is running, and the stack trace on each of those threads.

I keep Process Explorer running on a second monitor, where I can see it out of the corner of my eye. New processes starting are coloured green, and processes terminating are coloured red (they can be changed, but I like the defaults), and this will catch my attention. I can see at a glance whether the amount of RAM has risen or fallen significantly, whether any particular process is using the CPU when it shouldn't be working, and basically where the activity is coming from.

But where it has proven most valuable is in noticing when processes haven't terminated when they should. Instead of finding these processes during a rare glance at Task Manager, when I have no idea when they started and what I was doing at the time, I see them as it happens, and can investigate the bug while my previous actions are fresh in my mind.
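
To show what I mean about catching these as they happen, here is a rough before-and-after sketch in script form. It assumes the psutil library and is nowhere near what Process Explorer gives you interactively, but it illustrates the comparison I care about:

```python
import time
import psutil  # assumed to be installed separately

def snapshot():
    """Map pid -> (name, start time) for everything currently running."""
    procs = {}
    for p in psutil.process_iter(["pid", "name", "create_time"]):
        procs[p.info["pid"]] = (p.info["name"], p.info["create_time"])
    return procs

before = snapshot()
input("Run your scenario, then press Enter...")
time.sleep(5)  # give well-behaved processes a moment to exit
after = snapshot()

# Processes that appeared during the scenario and are still hanging around afterwards.
for pid, (name, created) in after.items():
    if pid not in before:
        print("still running: pid=%d name=%s started=%s" % (pid, name, time.ctime(created)))
```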

And this feeds into something I believe about test case design and its value. I could have defined one test case (or several, or an infinity of test cases) that checked whether all processes terminated when they should have after executing various user scenarios. And I would have spent a few days working through those scenarios, driving myself half crazy with the tedium, and perhaps halfway through I would have noticed that I was running short on RAM, and I would start again to see which scenario was to blame, and a whole heap of time would have been taken up executing a test case that was better executed without being defined, while executing other test cases.

It isn't just with these kinds of tests, monitoring system behaviour, that I think leaving test cases initially undefined is a good idea, but that's a huge topic for another day.

* In fact, this is blatantly untrue. Open Source Testing has a lot of free resources listed, and although many of them don't solve my problems, some of them have proven useful and reduced my workload. If I can't overcome my prejudices, at least I can be aware of what they are.

** As implied, I have very little experience on Windows systems, and can just about navigate the file system in DOS. Although I tell this to anybody who listens, they seldom believe me. So while there may be DOS alternatives to do what I use Process Explorer to do, I didn't find information on them.

12 November 2006

Why a raven is like a writing desk

In my first post, I mentioned that I was far too busy outside of work - one of the things that I do outside of work is amateur dramatics, and last weekend I was stage managing a Shakespearean production in the old opera house in Helsinki.

I enjoy acting, and always have done, but recently I've come to appreciate working backstage instead. In part, it's because I like being in control, but also it is a completely different way of using the same skills I use in the office.

People look at me as if I'm insane when I say that, but it's true.

Observation: during rehearsals I am watching everything that goes on, from which of the actors chatter all the time, to who isn't ready for their entrance, to the magically vanishing prop (the actor is miming holding a cup right up until the moment they want to use both hands, and hey presto the imaginary cup is whisked out of existence). I watch for traffic jams, when too many people leave the stage at the same time, and run into people coming on stage.

During performances, I watch even closer. In the final performance of our Shakespearean play, a piece of something floated down onto the stage from the flies. I needed to find a way to get it off the stage as soon as possible, but I also needed to find out what it was - we had an enormous contraption suspended above the stage, and if it was falling to pieces we would be in big trouble.

There was an easy solution to the first part: we had a scene change coming up and I could pick up the something en route, which turned out to be a non-essential bit of duct tape (which, incidentally, I should never have left behind in the first place - a mistake I won't be making again). But if I hadn't been looking, that might have stayed on the stage for the rest of the evening. I could also reassure the few actors who noticed it that they weren't about to get any heavy surprises falling on their heads.

So, as suggested above, all my observation is for the purpose of anticipating problems before they occur. Loose edges of mats can be tripped over in the dark, loose cables can be dragged across the floor, unshielded lamps in the wings can flood onto the stage during blackouts, dropped props have to be taken off stage before the next scene begins, scenery needs to be put in the same place every time or the lighting will be wrong, and so on. The number of things that can go wrong backstage is huge, and I need to identify as many as possible. I also need to find solutions to the problems, and then implement the solutions, which is where the analogy to testing falls down (somewhat, if you think of the problems that I anticipate as bugs).

In a large production like our Shakespeare, I am not alone but leading a team who also look for problems and solve them. My Assistant Stage Manager came up with the idea for suspending our contraption above the stage after anticipating problems moving it on and off stage in a blackout, making no noise. One of my stage hands took on the problem of two slightly unruly actors who had to help in a scene change, and sorted them out so that they wouldn't collide with anybody or get in the way. An experienced actor impressed upon a twelve-year-old member of the cast the importance of not standing in front of the lighting rack in the wings, casting her shadow across the entire stage.

I don't want to push the analogy too far. In the office, somebody moving my tools from where I left them is irritating - backstage, my anger was incandescent when somebody (and nobody will own up to it) moved my roll of duct tape during performance. If I hadn't checked for it before I needed it, we would have had quite a problem.

One thing that I bring from stage management back to testing is that many problems can be solved with very low-tech methods. Sure, suspending a contraption over the stage to be lowered into place is pretty high-tech, but another tricky problem was solved with three bottles of water, some crumpled up newspaper and a napkin. So in testing, while to execute one test I might use packet sniffing software, a website mirroring tool and a testing interface to the test item, to execute another I use maybe a vi macro, or a three-line Perl script. The thing that matters is how well the solution solves the problem, not how technically advanced the solution is.

30 October 2006

My first bug

I found my first bug on my first day at the Giant Fruit Company. It was a crash bug.

I mention this in interviews a lot, as I think it sounds cool. At the time, it made me think that I could get into this "testing thing", but there is a better lesson to be learned from how and why I found it.

As I mentioned, I started working at the Giant Fruit Company testing the porting of existing software onto their range of laptops. My first day, after introductions, I was given the task of testing Microsoft Word on the new laptop. For this kind of system, we had two guidelines. One was for testing the things that all applications were supposed to do on the Giant Fruit Company's operating system, in other words the GUI consistency. The second was for testing the behaviour of the particular application itself. In current jargon, they were test oracles. Altogether we spent ten hours on each application, five or six hours following the guidelines, the rest of the time on exploratory testing, where we just used the application to complete a task it would normally be used for.

My mentor left for a meeting, leaving me armed with the guidelines, a prototype laptop with correct installation, and some excitement at My Very First Testing Attempt.

Twenty minutes later I had crashed the laptop. I had a moment of panic, before remembering that in my new career this was actually a Good Thing. I carefully wrote down what I had done, then rebooted and checked whether I could do it again. I could! But what if I was doing it wrong?

My mentor returned from his meeting, and I started into my preamble. I had got to this point in the first guideline, and double-clicked on the menu bar, and the result wasn't what it was supposed to be.

"Oh yeah," he started. "That test is out of date, they got rid of that functionality a couple of releases back. Just skip it."

"But," I told him, anxiously. And showed him I crashed the laptop by carrying on double-clicking on the menu bar.

For each bug, we had a set regression path to follow, up through previous laptop incarnations until we found where the bug was introduced. This one had been with the range since its very start. It remained unknown until I, as a complete newbie, stumbled on it.

From this, I got two lessons:
One: fresh eyes and ignorance are good things
Two: when bored, break things


Okay, I'm cheating a bit. I broke it because I was confused, there was a bug in my test oracle, and repeating the test until it worked was the only thing I could think of doing. But the idea of carrying on doing unexpected things was valuable, and I've found a lot of bugs on that principle alone. It's something I do when I feel bored, when dutifully verifying that x variation of test dimension y becomes too dull. There are always bugs like that to find.

Sadly, a couple of months later I learned a third lesson from that bug:
Three: large companies aren't necessarily structured so that bugs get fixed

When I left the Giant Fruit Company, that bug was still unfixed. The test team had one point of contact outside the test organisation, somebody called the Quality Lead. It became obvious that the Quality Lead was being evaluated on the number of bugs open in his project domain in the database. By labelling the bug as "Third party problem", it moved out of his domain and became somebody else's problem. And since it was a problem found with third party software, his rule of thumb was that it was a third party problem - even though outside this range of laptops it didn't occur.

Can you see where this is leading? We spent a large part of our time testing how third party applications behaved on our new laptop. Any bugs we found were deemed to be third party problems and never fixed, even when they clearly weren't. Morale was pretty low. Bug reports started getting abusive, people were fired for abusive bug reports, change was promised but never quite happened.

Most of the problem? We never even met the Quality Lead, never mind the people actually responsible for developing the new laptop. They were based on the West Coast US, we were in Ireland, we had phone conferences with other testers but that was about it. This did actually begin to change before I left, but too little too late.

Welcome - why are we here at all?

Welcome to my new blog, where the main focus will be on my professional life as a software tester.

I decided to start blogging about testing for a host of reasons, but the main one is that so many people are talking absolute rubbish about testing, and I want to set the record straight. Or, join in with more rubbish. Sure, I don't know everything, but I have a lot of experience and a lot of ideas.

Another reason is that I've found that I work on instinct a lot, yet get good results. So in addition to pointing out the rubbish and laughing at it, I hope to formulate my own approach into some sort of coherent framework that could be taught to others. It is easy to point out where something is wrong, it is much more difficult to suggest a solution.

My background, with anonymised company names: I started working in the Giant Fruit Company in Cork (Ireland) as a software tester in 1996, first on testing the porting of software to their range of laptops, and then testing localised software bundles. In early 1998, I moved to Dublin to work for the Coffee Cup Stain, my introduction to testing to spec, working on very large systems for GSM network management. In 2002, I went all-Irish and emigrated to Finland, to work for A Finnish Telecoms Company (not the one you're thinking of) (nor the other one you're thinking of), which is where I finally began to have some input into how I was testing. Finally getting frustrated with the product I was testing, in May 2006 I moved to my current employer, The Logistics People, where I am the lead tester and (I like to think) shining example to all.

In addition: I am a rare being, a senior tester who also really likes executing tests. I have trouble taking things seriously, although inside, y'know, I'm like tortured. I move at my own pace, I'm easily bored, and I am far too busy outside of work.