InCtrl5. This utility is designed to allow you to record changes to files and the registry made when installing software. Has a companion utility that will completely reverse an installation. This can be a quick way to reset an installation. Availability: Windows 9x thru 2000. XP support unclear. Free download, with source available. Download requires registration.
FileMon and RegMon. These monitor and display file system and registry access in real-time. These give you an inside look into what the software is doing and what dependencies may be in place. Availability: Windows 9x thru XP. Linux version of FileMon. Free download. Commercial versions are also available.
AppVerifier. This utility monitors an application for use of hard-coded paths and inappropriate references to the registry, and also runs several other checks of stack, heap, handle and lock usage. Availability: Windows XP. A subset of features (including the path and registry checks) works on Win2K (despite claims to the contrary). Free download.
InstallWatch. Monitors and records the effects of an installation. Includes utilities for replicating an installation on other machines. Availability: Windows. Free download.
Over the past two years, i have had dozens of interesting conversations with test-infected programmers. Many of these occurred at small workshops, many of which i hosted myself. I have also gotten a lot out of the XP Agile Universe conference the past two years, but i hesitate to recommend it to many of my tester friends who are put off by some of the religious trappings of XP.
I would, however, like to see these kinds of discussions become more commonplace. Testers have a lot to learn from test-driven developers and programmers have a lot to learn from committed testers. PNSQC may very well be the best place to have this discussion on a large scale. It is held in Portland, Oregon, in October. They recently posted their call for papers.
Most conferences are hosted either by academics or by companies that specialize in trade-focussed publishing, training and consulting. And both types can easily suffer from being out of touch with the day-to-day realities of software development. PNSQC is different. This conference has been hosted for twenty years by a local volunteer organization. The volunteers have regular jobs in software testing, programming and management. They have vested interests in ensuring that the program is practical and relevant. Most will be sending others from their companies and will be depending on the conference to help keep their company's staff's skills current. The conference prefers practice-based papers. They also have an excellent peer review process that new writers may find very helpful.
You don't have to take my word for it. Their proceedings are freely available online. Check them out. Here are some highlights from last year's conference.
In August, an electrical blackout that began in the Midwest spread to affect several states and provinces from Toronto to New York City. Alarm failures in Ohio kept operators from taking prompt action that could have limited the area impacted by the power outage. These failures have recently been diagnosed as bugs in the General Electric XA/21 energy management software. The operators were unaware of the alarm failure at the time.
The energy utility claims the alarm failure was "triggered by a unique combination of events and alarm conditions on the equipment it was monitoring." One commentator questions whether it is fair to blame the buggy software: "If the XA/21 system is so bad, how come we're not seeing it fail all over the place?" Uh, because it was a silent failure?
There are two ways an alarm can fail. A false alarm is triggered when there is no underlying problem. These are easy to notice when they happen. A failed alarm occurs when an underlying problem goes unreported. They are the bugbear of any testing or monitoring system. I regularly find defects in automated testing systems that could allow failed alarms. Too often developers get caught in a find and fix cycle that only addresses false alarms. Failed alarms are often the bugs you don't hear about until it is too late.
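One way to guard against failed alarms in a test or monitoring system is to feed it a known-bad input and confirm that it complains. The sketch below is only an illustration of that idea: the checker functions, the voltage readings and the limit are all invented here, and the bug in `broken_checker` is a hypothetical one chosen to show how swallowing an error turns a crash into silence.

```python
def check_voltage(readings, limit=100.0):
    """Return one alarm message for each reading above the limit."""
    return [f"reading {i} over limit: {r}"
            for i, r in enumerate(readings) if r > limit]


def broken_checker(readings, limit=100.0):
    """Hypothetical buggy checker that can never raise an alarm."""
    try:
        # Bug: a float has no .value attribute, so this raises AttributeError.
        return [f"over limit: {r}" for r in readings if r > limit.value]
    except Exception:
        # Swallowing the error hides the bug: the checker just goes quiet.
        return []


def alarm_self_test(checker):
    """Check that a checker fires on bad input and stays quiet on good input.

    Returns (fires_on_bad, quiet_on_good). A checker with fires_on_bad False
    has a failed alarm: it would also stay quiet about a real problem.
    """
    known_bad = [50.0, 250.0]    # second reading is far over the limit
    known_good = [50.0, 60.0]
    fires_on_bad = len(checker(known_bad)) > 0
    quiet_on_good = len(checker(known_good)) == 0
    return fires_on_bad, quiet_on_good


if __name__ == "__main__":
    print(alarm_self_test(check_voltage))
    print(alarm_self_test(broken_checker))
```

Note that the broken checker passes the "quiet on good input" half of the self-test. A find-and-fix cycle that only chases false alarms would never notice it.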
Reports
This article describes three types of software bugs that are difficult to find by traditional testing and inspection methods: stack overflows, race conditions, and deadlocks. Examples of each are given, along with analysis methods and design rules that can assure that these kinds of bugs are avoided.
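As an illustration of the kind of design rule that can rule out a deadlock (not taken from the article itself), here is a minimal sketch of lock ordering: every thread acquires the locks it needs in one agreed global order. The `Account` class, the `transfer` function and the choice of ordering by name are assumptions made up for this example.

```python
import threading


class Account:
    """Hypothetical shared resource guarded by its own lock."""

    def __init__(self, name, balance):
        self.name = name
        self.balance = balance
        self.lock = threading.Lock()


def transfer(src, dst, amount):
    """Move money between accounts without risking deadlock.

    Design rule: always take the locks in the same global order (here, by
    account name), no matter which account is the source.
    """
    first, second = sorted((src, dst), key=lambda acct: acct.name)
    with first.lock:
        with second.lock:
            src.balance -= amount
            dst.balance += amount


def run_transfers(rounds=1000):
    """Run opposing transfers on two threads and return the final balances."""
    a = Account("a", 1000)
    b = Account("b", 1000)

    def a_to_b():
        for _ in range(rounds):
            transfer(a, b, 1)

    def b_to_a():
        for _ in range(rounds):
            transfer(b, a, 1)

    threads = [threading.Thread(target=a_to_b),
               threading.Thread(target=b_to_a)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return a.balance, b.balance


if __name__ == "__main__":
    print(run_transfers())
```

If each thread instead locked its source account first, the two threads could each hold one lock while waiting forever for the other. Testing would rarely hit that interleaving, which is why a rule enforced by design is more dependable than trying to catch it in a test run.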
The main reason for the recent upsurge in homebrew automation is XP. The majority of the open-source testing tools available today have been built by test-infected programmers. XP is based on tight iterations and constant code refactoring. This makes automated tests a requirement. Automated unit tests do part of the work, but automated system acceptance tests are also needed. This has motivated talented test-infected developers to figure out ways to automate acceptance tests for their software products. The results are fascinating. Indeed, i think they are the most important thing happening in the world of automated testing. I've spent most of my research time over the past year or two learning from this community.
I've been doing automated system testing for many years, using home-grown tools, commercial tools and open-source tools. One of my first automation projects was on a home-grown tool. We had about four people building a tool for automating GUI testing. This was over a decade ago, just before commercial GUI test tools became available. We had some very good ideas, and we also got caught in some ratholes. Overall our progress was slow. This was during the last recession. The cold war ended when the Berlin Wall fell in 1989. My company counted a lot of defense contractors amongst its customers. Defense budgets were cut when the cold war ended, and when my company looked around for people they could afford to cut, they chose most of the automation team. We were infrastructure, which is never a good place to be when layoffs come.
My next job was with a GUI test tool vendor, and i saw a lot more success. I stayed with them for many years, first in their consulting group, helping their customers, and then for several more years working as a tool champion for a couple of their customers. The lesson i learned from this was that it was too much work to build and support a test tool internally. It was better to use a commercial tool.
But now i've changed my mind. I've interviewed dozens of XP programmers who've built their own acceptance testing frameworks. Many of these frameworks are now open-source. I've been able to put together homebrew testing systems for several of my own clients over the past year. And you can too.
Why is XP seeing success with homebrew test automation where others have failed? XP has a couple of rules that prove to be crucial to this matter. First is that programmers have to write their own automated unit tests. This has gotten programmers interested in testing. Moreover the widely-available and mostly free xUnit unit test harnesses have helped jump start their work, so they see success faster. And of course we now see more unit testing than ever.
Secondly, the XP rules say that they need to have automated acceptance tests. The desire to automate system testing is nothing new. But what is new is the sense that you have to have these tests automated up-front. So with XP, you can't just assign someone to work on automated testing in whatever time they have available and hope that the tests are automated when you come to need them. Rather, they are seen as a critical path item. If there is trouble automating them, then additional team resources need to be applied until the obstacle is overcome.
And third, with XP, programmers and testers are expected to work together on the acceptance tests. Traditional work rules often separate the programmers and testers -- indeed they sometimes insist that the two groups be separated to ensure objective testing. I think this benefit is vastly overrated, and it certainly complicates automation. Indeed much of my own consulting practice is based on finding ways to get testers and programmers to work together more effectively.
There are a couple immediate benefits that come from getting testers and programmers to work together. For one, testability negotiations become trivial. Testers routinely wish they had access to product internals of one sort or another. These can often have a major impact on the testers' ability to automate their tests. If they are working together, then they can get and use the interfaces they need. Secondly, automation benefits from having skilled programmers. Too many traditional test automation efforts are driven by testers with weak or immature programming skills.
Several years ago, i looked back on all the test automation projects that i'd worked on and actually tabulated the factors that seemed to be most strongly correlated with success. I found three. First was that it had to be treated as a full-time activity. Secondly the whole team had to be committed to test automation. And third, they had to start early. Well, it turns out that these three factors so important to test automation success are actually assured by the XP rules.
So what does automation look like on an XP project?
By and large they don't use commercial test tools. Indeed, i was surprised by how many seriously evaluated and then rejected the commercial offerings. Some even had copies that had already been purchased. Why? Two reasons.
The first has to do with the pricing of the tools. XP quite reasonably expects everyone and anyone to run automated tests. Automated tests are run on private builds before checking code in. They are run on every official build, often several per day. Test tools are seen as key tools, just like an IDE.
But they are not priced like IDEs. These days most commercial GUI test tool licenses run around $7,500 per seat. So companies that use them only buy a couple of licenses. Few companies even buy a license for each of their testers. This kind of pricing, which i've thought for years to be absurd, makes XP programmers pretty interested in finding alternatives.
The other reason is that many of these tools have crappy languages. They are idiosyncratic and weak. One language supports inheritance, except that when you override a method, the original method is still the one that is inherited. The vendor claims that this is how it is supposed to work. Another language supports dynamic arrays, but doesn't provide a method for determining how many elements a particular array has. When i spent most of my time working with such tools, i had a whole array of clever techniques i used to work around the strange flaws in these languages. XP programmers are largely unwilling to do this. "Heinous" is what one critic called them.
As a result, XP programmers have found out how to build their own test automation frameworks. And they have taken advantage of the fact that modern operating systems have a lot more testability built in from the start. And this community has shown a strong interest in sharing their work as open source.
We saw some of these tools at the recent AWTA workshop. I'll be describing more here.
The first sign of this bug was a warning message that said it had been 38 days since my last backup. Yesterday it had warned me that it'd been 8 days. So i checked my clock and saw that it thought it was March 1st. And that's how i know that the problem occurred some time in the past 24 hours.
I could run some experiments. I could set the date back to Jan 31 and wait and see if it rolls forward to the next day correctly. I have another machine also running W2K. It doesn't show this date problem. Both machines are IBM ThinkPads, but the other is a couple years older -- different hardware. It's also not hooked up to the internet -- at least not for the past week or so. It's hard to analyze a failure until you have a reproducible case. If this bug appeared in software i was testing for a client, i'd probably spend a couple hours seeing if i could find a way to reproduce it.
But this bug isn't worth that much time on my part. I see bugs every day. Even when i'm testing for a client, i often have to decide which anomalies justify further investigation. Good testers do this regularly, but i don't know how to teach this or test for it. Except perhaps to talk about what i do.
My clock is still set on Central Time, so i don't see any further indications of problems. I've reset the clock to the right date, and i'll see if it happens again. And now i have one more test technique to use when testing software that handles dates: see if it handles the turnover from Jan 31 to Feb 1 correctly.
This is actually how i think test techniques are developed: by observing bugs, speculating on their causes, and then developing heuristics for how to find similar problems systematically in the future.
In this case, i've decided that an important condition for the bug relates to the end of month rollover. It's a theory and i haven't really proven it. But it'll affect how i test dates in the future.
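The end-of-month heuristic can be turned into a small reusable check. The sketch below is one possible way to do it, assuming the code under test exposes some function that advances a date by one day. Here `next_day` is a stand-in for that function, and `buggy_next_day` is a hypothetical bug invented to mimic the symptom described above. The real cause of the clock problem is unknown.

```python
from datetime import date, timedelta


def next_day(d):
    """Stand-in for the date-advancing code under test."""
    return d + timedelta(days=1)


def buggy_next_day(d):
    """Hypothetical bug, invented for illustration: Jan 31 jumps to Mar 1."""
    if (d.month, d.day) == (1, 31):
        return date(d.year, 3, 1)
    return d + timedelta(days=1)


def month_end_rollovers(year):
    """Yield (last day of month, expected next day) for every month."""
    for month in range(1, 13):
        if month == 12:
            first_of_next = date(year + 1, 1, 1)
        else:
            first_of_next = date(year, month + 1, 1)
        yield first_of_next - timedelta(days=1), first_of_next


def check_rollovers(advance, years=(2003, 2004)):
    """Return (last_day, expected, actual) for each rollover that goes wrong.

    The default years cover one ordinary year and one leap year.
    """
    failures = []
    for year in years:
        for last_day, expected in month_end_rollovers(year):
            actual = advance(last_day)
            if actual != expected:
                failures.append((last_day, expected, actual))
    return failures


if __name__ == "__main__":
    print(check_rollovers(next_day))
    for failure in check_rollovers(buggy_next_day):
        print(failure)
```

Checking every month end, not just Jan 31, costs almost nothing and covers the case where the theory about which rollover matters turns out to be slightly off.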