22 January 2007

The categorisation problem

I don't have any answers to this one (yet), but I've been thinking about categorising test cases a lot lately.

What normally happens when a new product is tested is that an architect creates a document with one particular view of the product, and the test cases are categorised around that view, with a couple of extra folders for things like test scenarios and smoke tests. From then on, each new release of the product has a similar structure for the test cases, and the end result is that the testers only think of the product in that structure, from that particular view.

To give an example, take a simple program like a calculator. You could categorise your test cases based on the input to the test cases, e.g. all test cases using the integer 2, or all test cases using floating point numbers. Alternatively, you could categorise your test cases based on user actions, e.g. entering data, selecting functions, requesting results, saving and recalling results. You could categorise based on components (interface, calculating engine), and probably many other ways that I'm not thinking of right now. But I expect that the most usual method is to categorise by functionality: addition, subtraction, logarithms, etc.

Which is all very well. Users probably see a calculator from that view themselves, so that is one advantage to it, and you can add categories that don't quite fit for more complicated calculations involving more than one function. Most testers will be able to look at those categories and come up with good test cases for each one, and within a short amount of time you'll find that the bugs reported are being categorised that way as well.

However, the other methods of categorisation have merit as well. You don't want to define separate test cases for saving and recalling addition results, and saving and recalling subtraction results, because the likelihood that there are any differences between the code used is low (one would hope, but shouldn't assume). On the other hand, you do want to have separate test cases for adding integers, negative numbers, floating point numbers etc, and multiplying the same.

In Mercury's TestDirector, this can be done by defining your test cases along functional lines in the Test Plan part of the application, and grouping the test cases along data lines in the Test Lab part of the application. Other test management tools have other solutions, although out of the tools that I have used, I like TestDirector's approach the best.
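Outside of any particular tool, the same idea can be sketched by tagging each test case along several views at once and regrouping on demand. What follows is a minimal illustration in Python, not a description of how TestDirector works: the `Case` class, the `group_by` function, the view names and the sample test cases are all made up for the calculator example.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Case:
    """A test case tagged along several views (hypothetical structure)."""
    name: str
    # Maps a view name (e.g. "function", "data") to this case's category in that view.
    tags: dict = field(default_factory=dict)


def group_by(cases, view):
    """Group test case names by their category in a single view.

    Cases that carry no tag for the view are collected under "(uncategorised)".
    """
    groups = defaultdict(list)
    for case in cases:
        groups[case.tags.get(view, "(uncategorised)")].append(case.name)
    return dict(groups)


# Illustrative calculator test cases; the tags are assumptions, not real data.
CASES = [
    Case("add two integers", {"function": "addition", "data": "integer"}),
    Case("add two floats", {"function": "addition", "data": "float"}),
    Case("multiply negative numbers", {"function": "multiplication", "data": "negative"}),
    Case("save and recall a result", {"action": "saving and recalling"}),
]

if __name__ == "__main__":
    # The same cases, seen from three different views.
    for view in ("function", "data", "action"):
        print(view, group_by(CASES, view))
```

Because no single view owns the test cases, the save-and-recall case does not need a home in the functional view; it simply shows up as uncategorised there and under its own heading in the action view.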

A calculator is a simple program, and also a very familiar one. We are used to thinking about it from different views, like the functional and data views. When the product is solving a much less familiar problem, however, it becomes more difficult to think about it from different views. I've worked on projects where even the architects don't think of the product from more than one view, although they have more of an understanding that there are different views and what those might be.

In these more complicated projects, one particular view becomes dominant to such an extent that others become too difficult to contemplate. I'm not sure why this is so, but I can think of two good examples to demonstrate it.

1) Colours. In kindergarten, with our poster paints (and, in my case, an ungodly amount of mess) we learnt that the primary colours are red, yellow, and blue. Later, in physics, we learnt that in fact the primary colours are red, green, and blue. With an amount of experience in graphic design, for instance, people will eventually be able to translate "I want a yellower red" into RGB terms, but I for one have to experiment to find out what that means. And I still think in poster paint terms like "yellower red".

There are additional views of colours, too. CMYK is another graphic design one, wavelength is one from physics, there is monochrome vs polychrome, but with the exception of the last one I think that your average "user" will naturally gravitate towards a red, yellow and blue view.

2) Books. Most bookshops will categorise their stock along a variant of these lines: Fiction / Non-fiction, Genre / Subject, Author's surname, Author's first name or initials.* Foyles bookshop in London, however, used to categorise books primarily by publisher, and the practice was described as anything from "eccentric" to "a nightmare". "Nobody" thought of books along publisher dimensions (publishers, authors, other industry specialists, and the owners of Foyles excepted), and people found that they couldn't think along those lines. Beyond that, people hadn't even considered that thinking about books from this view was an option.

And that is the crux of the problem. In each of my examples, there are people (or, users) who think about a particular topic along different lines from the majority. As testers, we should try to cater for their worldviews as well as the majority view, even if any bugs found from that view are ultimately left unfixed, as part of our aim to report on how well the product we are testing solves the business problem. But once we have our test cases divided from one particular viewpoint, we too are hampered in trying to think of them from a different view.

I have some ideas for how tools could give more help with solving this problem, but that entry will wait for another day.

* The vagaries of categorising along genre lines are a topic that gets much debate in science fiction and fantasy circles, usually in the form of proposals that all fiction books be sorted alphabetically by author (with the subtext that of course ghetto genres like romance or crime fiction be kept out of it), or very tiresome discussions about how popular book X is science fiction but is shelved with mainstream fiction due to snobbery. What is seldom considered is that genre is just one way of describing a work of fiction, and that almost all such ways have some subjective element to them. I would like to categorise my books along the lines of "good fantasy about dragons with no elves", for instance, and see how many dimensions I need before I have one and only one book in each category. I don't have the time, but it is fun to think about.

08 January 2007

Test Tools

(There has been news coverage recently on the number of blogs started and then abandoned - something like two million, if memory serves. Tied to the coverage, a new term was coined to describe people who start blogs and abandon them - cloggers. So, brought to you courtesy of a vague sense of guilt, a new entry.)

Very few of the tools I use for testing are specifically designed for testing. The honourable exceptions are defect reporting tools and test management software, but in both of those cases I would be (and have been) happy using a spreadsheet or some other general tool in certain circumstances.

The reason I use so few specialised testing tools is probably because most of my background has been on UNIX systems, where there are many powerful general tools that can be easily manipulated and strung together to produce tools for specific tasks. I also have a suspicion that specialised testing tools are hugely over-priced, and either solve problems that I don't need to solve or are so generalised that the effort required to configure them to do what I need to do is larger than that needed for me to write a tool myself in Perl. *

For the first time ever in my career, I am now working on a product that is installed onto a normal user PC running Windows. And I have found a wonderful, non-testing-specific tool that has found several bugs that I probably wouldn't have found without it, and helped diagnose countless others. Process Explorer, distributed free by what used to be sysinternals.com, is basically a souped-up version of Task Manager, or a Windows GUI version of the top command. **

I can see what processes are running, when they started, and what started them. I can see CPU usage, RAM consumption, virtual memory usage, and IO history for the whole system, or for each individual process. I can see what threads each process is running, and the stack trace on each of those threads.

I keep Process Explorer running on a second monitor, where I can see it out of the corner of my eye. New processes starting are coloured green, and processes terminating are coloured red (the colours can be changed, but I like the defaults), and this will catch my attention. I can see at a glance whether the amount of RAM in use has risen or fallen significantly, whether any particular process is using the CPU when it shouldn't be working, and basically where the activity is coming from.

But where it has proven most valuable is in noticing when processes haven't terminated when they should. Instead of finding these processes during a rare glance at Task Manager, when I have no idea when they started and what I was doing at the time, I will see it as it happens, and be able to investigate the bug while my previous actions are fresh in my mind.
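The check that Process Explorer lets me do by eye amounts to comparing two snapshots of the process list. As a rough sketch only, here it is in Python, assuming each snapshot is a plain dictionary mapping a process ID to a process name. The function names, the snapshot format and the sample process names are all hypothetical, and gathering the snapshots from a real system is deliberately left out.

```python
def diff_snapshots(before, after):
    """Compare two process snapshots (dicts of pid -> process name).

    Returns a pair (started, exited): processes present only in `after`,
    and processes present only in `before`.
    """
    started = {pid: name for pid, name in after.items() if pid not in before}
    exited = {pid: name for pid, name in before.items() if pid not in after}
    return started, exited


def still_running(snapshot, names):
    """Return the sorted pids in `snapshot` whose process name is in `names`.

    Meant to be called after a scenario has finished, with the names of
    processes that should have terminated by then.
    """
    return sorted(pid for pid, name in snapshot.items() if name in names)


# Made-up snapshots for illustration; the pids and names are assumptions.
BEFORE = {100: "explorer.exe", 200: "product.exe"}
AFTER = {100: "explorer.exe", 200: "product.exe", 300: "helper.exe"}

if __name__ == "__main__":
    started, exited = diff_snapshots(BEFORE, AFTER)
    print("started:", started)
    print("exited:", exited)
    # Suppose the scenario is over and both product processes should be gone.
    print("lingering pids:", still_running(AFTER, {"product.exe", "helper.exe"}))
```

This only shows the comparison. What made the tool valuable to me was seeing the change at the moment it happened, which a before-and-after diff cannot reproduce.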

And this feeds into something I believe about test case design and its value. I could have defined one test case (or several, or an infinity of test cases) that checked whether all processes terminated when they should have after executing various user scenarios. And I would have spent a few days working through those scenarios, driving myself half crazy with the tedium, and perhaps halfway through I would have noticed that I was running short on RAM, and I would start again to see which scenario was to blame, and a whole heap of time would have been taken up executing a test case that was better executed without being defined, while executing other test cases.

It isn't just with these kinds of tests, monitoring system behaviour, that I think leaving test cases initially undefined is a good idea, but that's a huge topic for another day.

* In fact, this is blatantly untrue. Open Source Testing has a lot of free resources listed, and although many of them don't solve my problems, some of them have proven useful and reduced my workload. If I can't overcome my prejudices, at least I can be aware of what they are.

** As implied, I have very little experience on Windows systems, and can just about navigate the file system in DOS. Although I tell this to anybody who listens, they seldom believe me. So while there may be DOS alternatives to do what I use Process Explorer to do, I didn't find information on them.