The Warning Signs

A Pop Quiz on Quality

By Robert Green and David Greer

Abstract

Everyone is in favor of software quality, but not everyone is producing quality software. How can you tell if a software group has gone off the track? It could be your DP department, your computer manufacturer, or even one of your software suppliers. The ideal software group uses feedback and repeated development cycles to find out what users really need. But it also uses rigorous software engineering. Despite the constant changes, the ideal software group ensures that existing features continue to work and future changes are possible. We have organized our ideas into a non-threatening Pop Quiz, consisting of tell-tale phrases that you may recognize, phrases that warn of a troubled software project, phrases such as "that's not my job" and "it's against our policy."

Robelle Solutions Technology Inc.
Suite 201, 15399-102A Ave.
Surrey, B.C. Canada V3R 7K1
Phone: (604) 582-1700
Fax: (604) 582-1799

Permission is granted to reprint this document (but not for profit), provided that copyright notice is given.

Introduction by Robert Green

I was presenting my paper, Improving Software Quality, before a hometown crowd in Vancouver, when a local consultant rose to ask a question. "Responding to the users sounds fine," she said, "but I've seen a lot of systems where the programmers did constant changes and patches to give the users whatever they wanted. Most of the systems were a buggy, kludgy, impossible mess. How do you avoid that?"

"That's the topic for another paper," I replied, with a quick tap dance to distract the audience. Later, I admitted to myself what a good point she had made.

Robelle creates software tools. About once a month we release a new version of Qedit and Suprtool. We are constantly on the go, so how do we avoid producing a disaster?

I went to my partner David Greer for the answer. At Robelle I'm the one who leans toward wild, creative impulses. David is stronger on reliability, discipline, and long-range thinking. I was so concerned by bureaucratic software groups which didn't satisfy user needs that I was overlooking another style of shop: groups with weak software skills who react to user complaints as fire fighting.

The ideal software group uses feedback and repeated development cycles to find out what users really need. But it also uses rigorous software engineering. Despite the constant changes, the ideal software group ensures that existing features continue to work and future changes are possible. My previous paper on improving software quality was only half the story. Now David Greer and I finish it off.

Teaching Quality Without "Should" and "Ought"

To avoid having our paper degenerate into a sermon, we have organized our ideas into a non-threatening Pop Quiz -- a way to quietly rate your quality record. The quiz consists of giveaway phrases that warn of a troubled software project, remarks such as "that's not my job" and "it's against our policy."

See how many of these phrases you've heard. Score 5 penalty points for each one you hear regularly. We made the penalty 5 points so you could fudge if you want to.

Disclaimer: Despite the fact that we use Hewlett-Packard for some examples, we are not out to get HP. We mention HP because we know them and you know them, not because we think their software quality is substandard. To show it isn't personal, we've thrown in a few Robelle follies too. All shops are guilty of some of these quality slips from time to time.

Ready? Here's our first warning sign, a classic programmer excuse:

That's a feature, not a bug.

If the user then points out the spot in the documentation showing that he is right, there is still the second tell-tale excuse:

That's an error in the manual.

With those two replies, a programmer can deflect any conceivable bug report.

The longest-running problem report at Robelle is caused by a "feature" of MPE. A batch job can log onto another MPE system by doing a Remote Hello Command and creating a remote session. But, if the job runs Qedit (or any program) on the remote system, MPE tells Qedit it is in an interactive session, not a batch job. Qedit cannot tell this phoney session from a true session.

Qedit attempts to do a CRT status request, assuming that the session comes with a person and a terminal. Because no one is there running the session, the first Qedit command gets read as the CRT status! The DS/3000 Manual from 1978 is aware of this problem and warns about it, but in all these years HP has not fixed it. It's a feature, not a bug.

Reluctance to admit problems leads to a reputation as a "Black Hole". Questions go in, but nothing useful ever comes out.

When users gather, they swap "war" stories. One user was overheard at lunch, complaining about his software vendor. When he called the support line, their first question always seemed to be:

Why would you want to do that?

This statement suggests a superior attitude on the part of the programmer, an attitude that is seldom backed by reality. Customers do find unexpected and incredible ways to apply programs to solve their problems. The sign of a truly great program is that even when used in unanticipated ways, it still works.

Whether a program is useful is up to the customer to judge. Take an example from the carpet business, where despite superior technical knowledge at the factory, the customer knew best:

One of our customers in Europe came to us several years ago with his own testing spec for carpet foam backing. We were a bit put out that someone thought they could test it better than we could. We told him not to worry. Dow measures for foam stability, molecular weight distribution, particle size conformity, percent of unreacted monomer, adhesion strength -- all the vital things. We told him, "You're going to get the best there is, real quality!"
Well, three times we tried to ship him the order, and three times he sent it back. Now, that gets annoying. So we asked him, "What's the deal?" And he told us, "Your product can't pass my roll-stool test!"... What he did was take the bottom half of an office chair, put a weight on it, and spin it around on a piece of test carpet 30,000 times... If the carpet sample didn't delaminate from the foam, you passed the test and got the order... Quality is what the customer says he needs, not what our tests indicate is satisfactory. [I. Snyder of Dow, in T. Peters]

If there are no questions, everyone must be happy.

At a Management Roundtable where the users submitted only 18 questions, the moderator jokingly remarked, "This proves the customers are happy."

Wrong.

Customers who don't voice their complaints, when given a chance, have given up expecting answers. The unconscious attitude behind this warning sign is that customer complaints are an irritant to be tolerated.

Real user complaints are good, not bad. People who use and like software constantly think of more problems the software could solve. This shows up as increased "complaints". If you actually fix something for them, watch out! The complaints will escalate dramatically.

At Robelle, we have a saying: If we send out a batch of new software and no one complains, there is only one conclusion.

It means that no one tried it.

We've lost the source code.

On his first outside consulting job, Bob Green found that the client had lost his source code.

The client's billing program was taking 2 days to run on a sampling of accounts; to process all the accounts, it would have had to run continuously. The system was designed by an expensive consulting firm, then programmed by contractors. The client had only a junior programmer to make the inevitable patches and fixes. There were lots of source files, but no one knew which ones compiled into the current version of the billing program. It took most of a day to find the proper source files and get them to re-compile. After that it took only an hour to fix the program.

Programs are valuable assets that depend upon an infrastructure for their preservation. If you take shortcuts in development, you will lose control of this asset.

What is needed is simple, but requires discipline. For every program, there is a job stream that recompiles the current version. This job stream shows the source files that go into the program and how to combine them. The test environment is kept separate from production. Another standard job moves a new version from testing into production.

We're too busy to document that.

Having each programmer in a shop re-invent the wheel is inefficient. Software techniques uncovered by a software group can be a valuable asset if they are documented. There is an even better way of retaining knowledge. Use libraries of code. Reusable code leverages the investment. One programmer suffers to solve a problem so the others won't have to suffer as well.

As an example, consider double-sided printing on the LaserJet. In our shop, Dave Lo first added this feature to our Prose text formatter by hard-coding the Escape sequence into the program. Then Bob Green hard-coded it into Qedit. When David Greer wanted to add double-sided printing to Suprtool, he asked, "Why is the Escape sequence hard-coded in Prose and Qedit, when we have a library routine that could have held that knowledge?"

David added the Escape sequence to our library code and removed it from the individual programs. Now we can add double-sided printing to the rest of our programs without having to re-learn the Escape sequence.
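
As an illustration only, here is a minimal Python sketch of what such a shared routine might look like. It is not Robelle's library code: the function names are hypothetical, and we assume the printer accepts the PCL 5 simplex/duplex command (Esc & l # S).

```python
"""One shared home for printer escape sequences (hypothetical module)."""

ESC = "\x1b"

# Assumed command set: PCL 5 simplex/duplex selection, Esc & l # S.
_DUPLEX_MODES = {
    "simplex": 0,      # print on one side only
    "long-edge": 1,    # double-sided, bound on the long edge
    "short-edge": 2,   # double-sided, bound on the short edge
}


def duplex_sequence(mode: str = "long-edge") -> str:
    """Return the escape sequence that selects the given duplex mode."""
    try:
        value = _DUPLEX_MODES[mode]
    except KeyError:
        raise ValueError(f"unknown duplex mode: {mode!r}") from None
    return f"{ESC}&l{value}S"


def with_duplex(document: str, mode: str = "long-edge") -> str:
    """Prefix a document with the duplex command.

    Callers ask for a mode by name and never see the raw sequence.
    """
    return duplex_sequence(mode) + document
```

Each program calls with_duplex() instead of embedding the sequence, so a correction to the sequence is made in one place.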

The good programmer writes software that can be reused, even if he doesn't see a reuse immediately on the horizon. His experience ... assures him that someone from the next hallway, reasonably soon, will be asking, "Do you have a module that ... ." He also knows that writing reusably will force him to define clean interfaces. [D. Boundy]

When a programmer can't find the reason for a particularly bizarre and puzzling bug, there is one sure-fire excuse.

It must be a hardware problem.

In all the bugs we have investigated in over 20 years in business, only one turned out to be a true hardware problem.

Our Qedit program was repeatedly losing track of lines in a file at one customer site, but we couldn't repeat the problem on our machine. Through painful hours spent in Debug, we finally proved that the Load Instruction was failing on this one customer's computer, but only when indexing was done without indirection. And Qedit was the only program on the system that used this unusual mode of the Load Instruction.

The ugly truth is that hardware is amazingly reliable and software is not.

It would cost too much to update the paperwork.

The late Dr. Richard Feynman of Cal Tech did a famous study of the Challenger shuttle disaster. He ranged widely throughout NASA and its contractors, talking to anyone who could shed light on the quality problems.

A group of production workers had found a simple way to improve the calibration of the rocket engines, but it was never implemented.

The foreman said he wrote a memo with this suggestion to his superiors two years ago, but nothing had happened yet. When he asked why, he was told the suggestion was too expensive. "Too expensive to paint four little lines?" I said in disbelief. They all laughed, "It's not the paint; it's the paperwork. They would have to revise all the manuals."
The assembly workers had other observations and suggestions... I got the impression they were very interested in what they were doing, but they weren't being given much encouragement. Nobody was paying much attention to them. It was remarkable that their morale was as high as it was under the circumstances. [R. Feynman]

When people want to cut corners, you hear assertions like this:

No one will ever notice.

What they mean is that the users are too stupid to recognize quality when they see it.

At Ford Motor Company, they once came up with a scheme called PIP, Profit Improvement Program.

The purpose of PIPs [Profit Improvement Programs at Ford] was to bring down the costs of making a car by taking them out of an existing budget; an example might be the decision to equip a Mercury with Ford upholstery, which was cheaper. Some traditionalists were convinced that the PIPs systematically reduced quality, that it was automotive sleight of hand, and that the covert philosophy behind the program was that the customer would never know the difference. PIPs quickly became part of the vernacular, turning into a verb. "What happened to that hood ornament?" "Oh, it got pipped." [D. Halberstam]

You seldom know which features will be important to your users. Success demands attention to all details. In software, users notice the little things. They seem to be more sensitive to details than to the big picture, perhaps because they take the overall objective for granted but the details drive them crazy day after day. The one enhancement to Qedit which received the most positive feedback from the users was a tiny and simple change: allowing the entry of MPE commands without the preceding colon.

One more Go To won't hurt much.

Good programmers don't use Go To to solve their logic problems. They structure their code as easily as they breathe. They keep their programs within their intellectual grasp by using limited control structures: While, Do-Until, If-Then-Else, and Case. Keeping it simple keeps it easy to understand. The problem with Go Tos is that you can build convoluted structures with them.

Well-structured programmers limit the scope of their data structures. If a variable is only needed within a procedure, they make it a local variable, not a global variable. If a procedure needs to access a global, they either pass it in as a parameter or export it. If the programming language allows, they distinguish parameters which are input only from those that are both input and output (i.e., call by value versus call by reference). The fewer places in the code that can touch a variable, the easier it is to debug a program.
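
A small sketch of these habits in Python, with hypothetical function names. Python does not declare parameters as input-only or input-output, so the distinction here rests on convention and on the docstrings.

```python
from typing import List


def count_long_lines(lines: List[str], limit: int) -> int:
    """Input-only parameters: nothing outside this function is changed."""
    count = 0                          # local: invisible to the rest of the program
    for line in lines:
        if len(line) > limit:
            count += 1
    return count


def truncate_in_place(lines: List[str], limit: int) -> None:
    """Input-and-output parameter: the caller's list is deliberately modified."""
    for i, line in enumerate(lines):
        lines[i] = line[:limit]
```

Neither function reads or writes a global. Everything it can touch appears in its parameter list, which is where a person debugging it will look first.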

The competent programmer uses top-down design and bottom-up implementation:

The first thing he does in any programming task is to analyze the entire problem into smaller problems that can be solved separately. He begins coding by writing functions that implement the primitives and building blocks he will need. The rest of the program almost writes itself. [D. Boundy]

We'll just reuse this data item.

The disciplined programmer does not give two names to one thing nor attribute two things to one name. Names are meaningful and specific, and their length is proportional to their scope. A loop variable used only once in a two-statement loop may be called "i", but a global variable that may be used anywhere in the program will have a long name that accurately describes its usage. [Boundy's Laws of Naming]

The disciplined programmer adheres to the standards of his workgroup, even when these standards appear arbitrary. A standard as small as indentation style makes the code more readable for a person who doesn't know it. That person might be the programmer, two years later, after the code is forgotten.

A program is written once, but read many times. Why not make it easy on the next person who reads the code?

That could never fail--don't bother testing for it.

When he first began to program, one of the authors of this paper (please don't ask which) developed an unfortunate impatience with writing code to test for error conditions. In his programs, he would skip error checking after certain operating system calls, such as Fclose and Fgetinfo which he "knew" could never fail.

Then he discovered that a :File Command could cause the Fclose Intrinsic to fail, leaving the file hanging open. And Fgetinfo failed once, undetected, when he closed the file by mistake, and another time when it was accessing a remote file on another system, and a third time when the calls were being intercepted by another vendor's run-time library.

This programmer learned the hard way to test every system call and every library function for failure. If he doesn't, his program may go merrily along, processing with the wrong data. He accepts the fact that most of his programs will have more lines of code to handle failure than to handle success.
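
To show the flavor of this, here is a hedged Python sketch with hypothetical helper names. Python reports most failures as exceptions rather than status codes, but the discipline is the same: every call that can fail is checked, and the program never proceeds on a guess.

```python
import os
from typing import Optional


def file_size(path: str) -> Optional[int]:
    """Return the size in bytes, or None if the size could not be obtained."""
    try:
        return os.stat(path).st_size
    except OSError as err:
        # Report the failure instead of continuing with a made-up size.
        print(f"cannot stat {path!r}: {err}")
        return None


def copy_first_bytes(src: str, dst: str, count: int) -> bool:
    """Copy the first `count` bytes; return True only if every step worked."""
    size = file_size(src)
    if size is None:
        return False
    if size < count:
        print(f"{src!r} holds only {size} bytes, need {count}")
        return False
    try:
        with open(src, "rb") as fin, open(dst, "wb") as fout:
            data = fin.read(count)
            if len(data) != count:        # short read: the file changed under us
                return False
            if fout.write(data) != count: # short write
                return False
    except OSError as err:                # open, read, write, or close failed
        print(f"copy failed: {err}")
        return False
    return True
```

Only three lines of copy_first_bytes do the actual work. The rest handles failure, which is about the proportion the paragraph above predicts.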

Sometimes we develop a similar attitude toward manual procedures, such as installing a new version -- "I've done this so many times, I could do it in my sleep!" The implication is, "I couldn't possibly make a mistake." Unfortunately, uninspiring tasks are the most likely place to misstep, since it is difficult to keep your concentration focused. When we install new software, we have to follow a written checklist. The checklist includes every step to create a new version of the product, plus steps to verify that the installation was done properly (e.g., make a demo tape and install it on another machine).

Fascinating project -- too bad it failed.

Fascination with grandiose schemes and bigness leads to White Elephants. The symptoms are pretentious objectives, high cost, fantastic claims, neglect of other projects, fanatical denial of failure, and a sudden, total write-off. All so unnecessary. In software, big results can come from tiny investments and tiny results can come from big investments.

The original Lotus 1-2-3 (tm) and dBASE (tm) programs were two of the most successful application programs ever written. 1-2-3 was written mostly by one person in eighteen months. The macro capability, one of the things that made 1-2-3 really successful, was added by the developer at the end because he had some extra time -- it wasn't even in the informal spec he had. dBASE was written by one person over a two-year period while he also held a full-time job. [D. Thielen]

Hewlett-Packard has done two large, exotic database projects for the HP 3000 that were never released to users. Those projects consumed R&D resources that could have provided badly-needed enhancements to HP's popular but untrendy IMAGE product.

There is a popular view that technology is only technology if it is high-tech -- sort of a "Big Bang" theory of technological development where somebody suddenly thinks of a major innovation or invention. But most of the technological change that goes on in society is not the "Big Bang" type. Rather it is small, gradual, marginal things. [Michael Bradfield, Dalhousie University, Globe and Mail newspaper.]

Robelle's biggest failure was the Virtual Fortran compiler. We had no Fortran expertise and no local test sites, but we did have big plans. We were going to run big, scientific Fortran programs on a 16-bit HP 3000, instead of waiting for a new RISC computer. Although we did complete the project, it was late and never performed fast enough. We were seduced by the glamour of the project. We squandered two man-years we could have spent on more humble but more successful projects.

We tested it once by hand, isn't that enough?

You can't test for the complete absence of bugs, because you can't try every path through the code. But you can test the typical cases and the boundary cases: minimum value, maximum value, and no value. In rigorous testing environments, such as software for jet aircraft, it is standard practice to measure the percentage of program statements being executed by the tests and aim for 100% coverage.
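
A minimal sketch of boundary testing in Python. The function under test, parse_copies, and its limits of 1 and 99 are invented for the illustration.

```python
from typing import Optional

MIN_COPIES = 1
MAX_COPIES = 99


def parse_copies(text: Optional[str]) -> int:
    """Parse a copy count; a missing or blank value defaults to MIN_COPIES."""
    if text is None or text.strip() == "":
        return MIN_COPIES
    value = int(text)
    if not MIN_COPIES <= value <= MAX_COPIES:
        raise ValueError(f"copies must be {MIN_COPIES}..{MAX_COPIES}, got {value}")
    return value


def run_boundary_tests() -> int:
    """Exercise typical, minimum, maximum, just-outside, and no-value cases."""
    accepted = [("5", 5), ("1", 1), ("99", 99), ("", 1), (None, 1)]
    rejected = ["0", "100"]            # one step beyond each boundary
    for text, expected in accepted:
        assert parse_copies(text) == expected, text
    for text in rejected:
        try:
            parse_copies(text)
        except ValueError:
            continue
        raise AssertionError(f"{text!r} should have been rejected")
    return len(accepted) + len(rejected)
```

Seven cases do not prove the function correct, but they cover the places where off-by-one mistakes usually hide.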

Automated testing is the answer. The computer can do more tests than we can manually and do them more reliably. We borrowed the idea for test jobs from the Pascal validation suite, a series of tests that told whether a Pascal compiler was up to the standard. A typical job tests one command, often by modifying data two different ways and comparing the results. For example, copy a file with Suprtool and again with Fcopy. Any difference and the job aborts.
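
The copy-it-twice idea can be sketched in Python as follows. This is not one of our actual test jobs: copy_by_hand is a stand-in for the program under test, and the standard library's shutil.copyfile plays the role of the trusted second tool.

```python
import filecmp
import os
import shutil
import sys
import tempfile


def copy_by_hand(src: str, dst: str) -> None:
    """Stand-in for the program under test: copy in small chunks."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(512)
            if not chunk:
                break
            fout.write(chunk)


def copies_match() -> bool:
    """Copy one file two different ways and compare the results byte for byte."""
    with tempfile.TemporaryDirectory() as work:
        src = os.path.join(work, "input.dat")
        with open(src, "wb") as f:
            f.write(bytes(range(256)) * 10)
        ours = os.path.join(work, "ours.dat")
        reference = os.path.join(work, "reference.dat")
        copy_by_hand(src, ours)             # tool under test
        shutil.copyfile(src, reference)     # trusted second opinion
        return filecmp.cmp(ours, reference, shallow=False)


if __name__ == "__main__":
    if not copies_match():
        sys.exit("copy test FAILED: outputs differ")   # any difference aborts
    print("copy test passed")
```

The test needs no hand-written expected output. The second tool supplies it, which makes such jobs cheap to write in quantity.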

When we are revising a program, we schedule the test suite to run at night. It is amazing how often a seemingly minor change causes 10 or 20 test jobs to fail.

When working on a bug, a good practice is to add a test that reproduces the problem. When the bug is fixed, the test passes. Reproducing a bug in the test suite also provides a warning if the bug creeps back into the code. It is embarrassing how often old bugs resurface.
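
Here is a sketch of such a regression test using Python's unittest module. The function split_fields and the bug it once had are both hypothetical.

```python
import unittest
from typing import List


def split_fields(record: str, delimiter: str = ",") -> List[str]:
    """Split a record into fields, keeping empty fields at either end."""
    # A hypothetical earlier version stripped the delimiter from both ends
    # before splitting, which silently dropped empty first and last fields.
    return record.split(delimiter)


class SplitFieldsRegression(unittest.TestCase):
    def test_trailing_empty_field_is_kept(self):
        # Reproduces the hypothetical bug report: "a,b," lost its third field.
        self.assertEqual(split_fields("a,b,"), ["a", "b", ""])

    def test_typical_record_still_works(self):
        self.assertEqual(split_fields("a,b,c"), ["a", "b", "c"])


if __name__ == "__main__":
    unittest.main()
```

The first test was written to fail against the buggy version. It stays in the suite permanently, so the nightly run reports it if the old behavior ever returns.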

It's fixed, but is waiting for the next release cycle.

Installing new software into production is an error-prone process. The program has to match the source code, the help file, and the manual; everything has to be tested; and so on. The more error-proof the installation process, the more onerous and time-consuming it becomes. This discourages frequent software updates, causing release cycles to stretch out until it takes two years to get the simplest bug fix into production. Look how many years it has taken HP to add support for IEEE floating-point numbers to TurboIMAGE.

The way to quicken development cycles is to automate every step in sight.

For our products such as Suprtool, we have an automated batch script that regenerates a new version. The batch job

recompiles all the code, including supporting modules and programs,
runs the test suite against the NM and CM versions of the product,
reformats the documentation files to check for hyphenation errors, which it delivers to the programmer through electronic mail, and
generates a new help file.
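
The shape of such a job can be sketched in Python. This is an outline under assumptions, not our batch script: the step functions are empty placeholders where a real job would invoke the compiler, the test suite, and the formatter.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[], bool]]


def run_release(steps: List[Step]) -> List[str]:
    """Run each step in order, stopping at the first failure. Returns a log."""
    log = []
    for name, action in steps:
        try:
            ok = action()
        except Exception as err:           # a crashing step counts as a failure
            log.append(f"{name}: ERROR {err}")
            break
        log.append(f"{name}: {'ok' if ok else 'FAILED'}")
        if not ok:
            break
    return log


# Placeholder actions (hypothetical); each returns True on success.
def recompile() -> bool:
    return True


def run_test_suite() -> bool:
    return True


def reformat_docs() -> bool:
    return True


def generate_help() -> bool:
    return True


RELEASE_STEPS: List[Step] = [
    ("recompile", recompile),
    ("test suite", run_test_suite),
    ("reformat documentation", reformat_docs),
    ("generate help file", generate_help),
]
```

Calling run_release(RELEASE_STEPS) returns one log line per step. Because the steps live in a list, adding a new check to every future release is a one-line change.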

Now if we could figure out a way to have the computer actually fix the bugs and write the documentation, we could retire.

It can't be changed, too much code references it.

Global variables are the worst enemy of good software. This insight came to us in two waves.

First we learned not to use a literal constant, such as "80", when an identifier like "buffer-size" was allowed. Which would be easier if you want to change the buffer size?

Later we learned that this is not enough. The wider the access to any data structure, the more code to be checked when that data structure is revised.

For example, Suprtool has a fixed-size table for the Extract Command. On Classic machines, the space used by Extract limits the space left for buffering. We want to convert the Extract table into a linked list. So far we haven't found the time, because there are so many places in the code where the table is indexed as an array. If we were doing the Extract Command today, we would hide the data structure in a separate module. The rest of Suprtool would call procedures in that module to access the data structure.
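
A rough Python sketch of what hiding the structure could look like. The class name, the field layout, and the methods are all hypothetical and are not taken from Suprtool.

```python
from typing import Iterator, List, Tuple

Field = Tuple[str, int]    # (field name, byte length): hypothetical layout


class ExtractList:
    """Owns the list of extracted fields; callers never index it directly."""

    def __init__(self) -> None:
        # Private storage. It could become a linked list later
        # without any change to the code that uses this class.
        self._fields: List[Field] = []

    def add(self, name: str, length: int) -> None:
        """Append one field to the extract list."""
        if length <= 0:
            raise ValueError("field length must be positive")
        self._fields.append((name, length))

    def record_length(self) -> int:
        """Total bytes in one output record."""
        return sum(length for _, length in self._fields)

    def fields(self) -> Iterator[Field]:
        """Visit the fields in the order they were added."""
        return iter(self._fields)

    def __len__(self) -> int:
        return len(self._fields)
```

The rest of the program sees only add, record_length, fields, and len. When the storage changes, the code to be checked is this one class.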

The AIFs for MPE XL are a good example of making programs data independent. An AIF is a fast subroutine for accessing system tables. When a system tool uses AIFs, it doesn't know the location and structure of system tables. MPE XL may be improved and changed drastically, but the system tools will still run properly.

That would mean changing all the programs.

The year 2000 haunts those of us with only two digits reserved for the year instead of four. Will our invoices be dated January 1st 1900 at the turn of the century? How are we going to sort by date when "01" is less than "99"? Why did the millennium have to occur in our generation?

We wish now that we hadn't hard-coded the date format into all those programs, nor sprinkled ad-hoc code throughout our systems to edit and format dates. Couldn't we just extend the year 1999 instead of going to 2000?

With perfect hindsight, we would have used modular programming in the first place. Programs wouldn't "know" about dates--they would depend on a date module for that knowledge; a module that holds all the functions and data structures for dates and reduces them to a clean, published interface; a module that edits dates, converts them between different formats, compares dates, and does date arithmetic. To change a date format, we change the module.
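
A minimal sketch of such a module in Python, built on the standard datetime library. The function names are hypothetical, and the windowing rule (two-digit years below 50 belong to the 2000s) is an assumption chosen for the example, not a recommendation.

```python
import datetime

PIVOT = 50    # assumed windowing rule: 00-49 -> 20xx, 50-99 -> 19xx


def parse_yymmdd(text: str) -> datetime.date:
    """Convert a legacy six-digit YYMMDD string into a full date."""
    if len(text) != 6 or not text.isdigit():
        raise ValueError(f"expected YYMMDD, got {text!r}")
    yy, mm, dd = int(text[0:2]), int(text[2:4]), int(text[4:6])
    century = 2000 if yy < PIVOT else 1900
    return datetime.date(century + yy, mm, dd)    # also validates month and day


def format_iso(date: datetime.date) -> str:
    """Format with a four-digit year so that text order matches date order."""
    return date.isoformat()


def days_between(start: datetime.date, end: datetime.date) -> int:
    """Date arithmetic: whole days from start to end."""
    return (end - start).days
```

Programs that compare parsed dates rather than raw strings sort January 1st, 2000 after December 31st, 1999, even though "000101" sorts before "991231" as text. If the windowing rule needs to change, only PIVOT changes.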

There is still time to correct past follies. You can break up large modules into smaller ones, write new, generalized modules to replace the old, restrictive ones, and design modules to be used as tool kits upon which others can build.

We gave the users what they asked for.

As an experienced developer once quipped, "The firmer the specs, the more likely to be wrong."

It is natural to plan for only one release of a new program. Unfortunately, the original software design is seldom what the users really need. The harder we pressure the users to tell us what they want, the more frustrated we both get. Users can't always tell us what they need, but they certainly recognize what they don't like when they see it.

The key to writing quality software is to do it in several releases. Once the users have the first release, the programmer incorporates their feedback (i.e., complaints) into the next release. At our firm, we send new "beta test" software to our customers every month. During a year, we may have 80 test installations for a product with 800 active users -- about 10% of the customer base.

At Robelle we use a development method called "Step by Step", created by Michel Kohon. It breaks a large project into small, two-week steps. This does not mean ignoring long-term goals, but once we learn more about the users' actual needs from each step, we commonly adjust our long-term goals as the program evolves.

The final aim is the program, not the analysis. So, until the program or its results are in the hands of user, nothing is completed. [M. Kohon]

Computing Your Score

That is the last of our warning signs. Did you recognize many of them? Now calculate your score. Remember, five points per warning sign, and the lower the score, the better.

0 is too good. You must have cheated.

20 is excellent. Congratulations.

40 is respectable. Good work.

60 is worrying.

80 or more is a disaster. Update your resume.
