AskTog: Interaction Design Solutions for the Real World
 

AskTog, December, 2003

When Good Design => Bad Product

What a strange situation. You take a mediocre product and rework the design to make it better. Your design is a success, by any reasonable measure, but the resulting new release is actually worse. You redouble your efforts and matters become untenable. It doesn’t matter how brilliant and effective your designs are: the more they improve the product, the less usable the product becomes.

What could cause such a situation? Industrial sabotage? A rip in the seam of the universe? No, poor quality assurance (QA) procedures.

Dish Network, with its digital recorder/receivers, has shown evidence of this phenomenon for several years now, with no end in sight. They have recently released version 115 of their software for their model 721, and it appears to have almost as many bugs as their very first release. These bugs are not subtle. They are easily reproduced and, in a few cases, fatal. Their genesis, in several instances, appears to be a reworking of the interaction design of existing features. The new designs are a great improvement over what came before—or would be if they worked.

I am not privy to the inner workings of what passes for a quality control/quality assurance program at Dish Network. What seems evident from the type of bugs that are being released in the wild, however, is that little more is going on than asking a bunch of people to play with the software for a few hours or days to see what they might come across. This is how we did it back in the ’70s, with our old steam-powered computers. It’s not how it is done in the new millennium.

Testing for edge conditions

What a trained quality assurance professional brings to the software party is a knowledge of how to put together a test plan, one that systematically looks at the full range of every feature under all reasonably expected conditions, including edge conditions.

Version 115 is at least usable when the 721 is connected to its antenna in the usual fashion, although now there’s sometimes no sound, or the sound is decoupled from the picture by a few seconds. (That is just a minor bother if you can read lips. I'm getting pretty good at it.) One of the marketed features of the Dish Network recorder/receivers, however, is that you can disconnect them from their antennas and carry them, for example, to your weekend retreat, where you can play back all your recordings, even though you have carried no antenna with you. That was true until release 115, in which the machine locks up completely if no antenna is present.

With a test plan, such fatal bugs would be spotted instantly. What might have happened at Dish Network is, lacking an employee trained in building QA test plans—or at least lacking any management support of that employee—the boss just told a whole bunch of people to beat on the design for a while, and none of them happened to be headed for a weekend retreat. As a result, a fatal flaw in one promoted and easily tested use went unnoticed.

It can work to have a “testing bee,” where everyone pitches in for a few days of frantically beating on a new release. However, the employees each need to be armed with a script outlining the specific tests they each need to perform, including the conditions under which each should test. For example, Dish Network receivers should operate when connected to one satellite, two satellites, three satellites, or no satellites. Testing them for operation under just the normal two-satellite configuration is inadequate. Assuming someone around the office will do so without explicit instructions is naive.
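The scripts described above can be sketched in a few lines of Python. This is only an illustration of the idea, not anything Dish Network is known to use: the feature names, the tester names, and the helper functions `build_test_plan` and `assign` are all hypothetical. Only the four satellite configurations come from the example above. The sketch crosses every feature with every configuration, then deals the resulting cases out so that each one has an owner.

```python
from itertools import product

# Hypothetical feature list, invented for illustration.
FEATURES = ["live viewing", "playback of recordings", "pause and resume"]
# The four configurations named in the column: none, one, two, or three satellites.
SATELLITE_COUNTS = [0, 1, 2, 3]


def build_test_plan(features, satellite_counts):
    """Return one scripted test case per (feature, satellite count) pair."""
    return [
        {"id": i, "feature": feature, "satellites": count}
        for i, (feature, count) in enumerate(
            product(features, satellite_counts), start=1
        )
    ]


def assign(cases, testers):
    """Deal the cases out round-robin so every case has exactly one owner."""
    scripts = {tester: [] for tester in testers}
    for i, case in enumerate(cases):
        scripts[testers[i % len(testers)]].append(case)
    return scripts


plan = build_test_plan(FEATURES, SATELLITE_COUNTS)
scripts = assign(plan, ["tester A", "tester B", "tester C"])  # hypothetical names

for tester, cases in scripts.items():
    for case in cases:
        print(
            f"{tester}: test #{case['id']}, {case['feature']} "
            f"with {case['satellites']} satellite(s)"
        )
```

Because the plan is generated rather than improvised, the "playback of recordings with 0 satellites" case lands on somebody's list whether or not anyone happens to be headed for a weekend retreat.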

Of course, such testing "bees" may not be cost effective. Professional QA testers not only are proficient at their jobs, but they come relatively cheap. Having senior engineers, who are not trained for this and not "up to speed," do the testing can cost several times more. QA people also test constantly, feeding back bugs in a timely manner, squashing small bugs before they become big ones later on as they interact with bugs formed in other areas of the code.

Ignored bugs become ever bigger

115 has turned an ill-designed feature into a huge bug. The problem started out with some heavy-handed graphics. When you pause the machine, instead of getting some subtle notification, like a pause symbol watermark, you get a giant opaque rectangle that spreads across 80% of the width of the screen, blocking the lower region of the image from view. This is the same lower region that typically holds news and stock tickers, so any time you pause, for example, Headline News to read the ticker, you can’t read anything for the giant blob covering it.

The problem was then compounded by a work-around solution earlier in 2003, when they added the capability to step backward and forward frame-by-frame (sort of). As soon as you so step, the giant PAUSE rectangle disappears. Wonderful! The problem was, if you stepped back to eliminate the box, then pressed the Pause button twice, the system would lose its place in the program you were watching, and suddenly you’d be back at the start of the show. (You might suggest that users not press Pause twice, but that’s the fastest, most natural way to get out of this special mode.)

I reported this bug in May of 2003. It was properly entered into the system, but nothing was done about it. When 115 was released, in October of 2003, not only had they not fixed the bug, they had made it worse. Now, stepping forward, as well as back, did the same thing. Pressing forward is the more natural way to overcome the giant opaque block problem, so now a lot more people are complaining.

(Dish is good at not addressing bugs. One particularly irritating bug in the 721, which causes users to lose their place in shows not watched all the way through the first time, has persisted across all 115 versions. I know it’s been reported; I’ve reported it three times myself. It’s one of those little “five-minute fixes” that, in real life, probably eat up a full hour, but no one’s had a full hour in the last 18 months. In the meantime, if it’s like things were at Sun Micro when I was there, they’ve spent at least five hours continuing to catalog and discuss the bug.)

Damage from bugs persists

Albert Payson Terhune, the author who taught the world to love collies (Lad, A Dog, et al.), once wrote an article for the Saturday Evening Post (March 26, 1927 issue) about his beloved collie, Fair Ellen.

Terhune explained that Fair Ellen had been born blind, but learned to live quite happily, except for one small quirk:

If I stand beside her kennel yard and call to her to come and be put up, she does not approach me in a straight line, but along an imaginary path which has perhaps six or seven twists and turns.

This used to puzzle me, until one day I saw her run against a wheelbarrow which one of the men had left in the open patch of fairway between the house and her kennel.  That was three years ago.  Never since then does she come to that spot without making a careful detour around the imaginary barrow.

Her twisting course, along all familiar bits of ground, is due to her effort to skirt some box or rake or other obstruction which at some times she has struck against.  She has preternatural memory for such things and for the precise spot in which once they were.

Users do the same thing.

115 has taught Dish Network users to abandon the easily-accessed Pause button to resume play after looking at headline tickers. Instead, they must either look at the remote control to find the Play button or feel around for the Play button, an activity that not only takes high cognitive engagement, but lots more time.

Eventually, Dish Network will fix the bug. Users' behavior will not necessarily change with it. Once people have learned something no longer works, once they have formed a new habit, no matter how inefficient that habit is, they tend to perpetuate it. Years from now, many users will still be feeling around for the play button, long after they could have reverted to double-pressing Pause.

Another bug has actually made it across far more than 115 releases. It has been carefully copied from the earlier Dish Network 501 box, intact. This one changes the function of the Stop key from "I don't want to watch any more right now, thank you," to, "I never want to watch any more, thank you, so throw the rest of the show away without warning me," depending on whether the entire show had been recorded at the time you pressed the key. Same motivation, same action, difficult-to-predict result. At first glance, it's amazing that such a destructive design error could persist so long. Is it that no one at Dish uses their own product, or is it just that they, like we, have learned to avoid ever, ever pressing the Stop key?

The costs of poor quality assurance

Putting out products significantly poorer than their predecessors has both direct and indirect costs. The most direct cost is support. As soon as new bugs escape the factory, the phone lines are flooded. It’s bad enough when the early adopters snatch up Version 1.0 and start immediately complaining. Most of us eventually learned to wait around for Version 1.1. Dish Network customers don't have that option, though, because the company releases its software on subscribers without any notification or choice, by downloading the new version into unsuspecting receivers overnight. There is little or no ramp-up. All of a sudden, Support is getting a whole bunch of phone calls, and they don’t even know about the problems, let alone have work-arounds.

Other costs include upsetting the workforce. I’ve worked on new products that, because of rushed schedules or poor QA, were disasters. Not only is everyone thrown into a tizzy trying to put out version 1.01, productivity plummets from the combination of stress, depression, and embarrassment. (I can’t even imagine what it must feel like to be the designers who cleaned up so many parts of the 721’s interface in the 115 release, only to see their good work indirectly cripple the product due to the lack of systematic testing.)

Then, there’s the problem of millions of people telling their friends about Dish's unreliable, cantankerous receivers. The average Joe cannot differentiate between bad hardware and software. All he knows is that it used to work and now it doesn’t.

Which leads to another direct cost. When I reported an inability to watch shows off-line, Dish Network’s Level One support people immediately announced that there must be a hardware problem with my 721, and began to arrange for a replacement unit to be shipped to me. (This is not the first time this has happened.) It was only after I insisted that I was looking at a software problem and demanded to be switched to Advanced Support that I was able to get someone capable of exploring the actual problem.

I don’t know what percentage of units returned to Dish turn out not to have anything wrong with the hardware, but my guess, from my own experience over the years, is that the figure would be high, a direct result of improperly tested software being released on an unsuspecting public and an equally unsuspecting support staff.

Of course, when you have a high percentage of properly functioning hardware being returned, units with genuine, but intermittent, problems will also tend to be judged as being OK, absent the intermittent problem revealing itself at the moment of service. These will then be recycled to users, so that a good receiver with a temporary software problem gets replaced by a bad receiver with a real, but intermittent, problem. If proper tracking is not done, this cycle can repeat itself over and over before a user finally defenestrates the offending receiver.

What this means to designers

While it is not officially your problem to see that your company carries out unrelated engineering procedures properly, as user-advocate, you need to make it your problem. The best design in the world is worthless if shipped riddled with bugs.

If you work for a company that has no formal quality assurance program, begin to educate people. Work with marketing people. They can easily grasp the damage poor products can do to your company. Teach them how QA works and the ways it can save time, money, and embarrassment. They can escalate the issue to upper management, something perhaps difficult for you to do if you are working under the person who is the problem.

When you come up with improved designs, take it upon yourself to test how they work. Formulate your own test plan and try them out under a variety of conditions. If the engineering process is so badly broken that changing the old designs will almost certainly make things worse, back off on how much you want to change in each release and work with your engineers to do more informal QA testing before the release is assembled.

Final thoughts

Would I recommend a Dish Network digital recorder/receiver to my friends? Sure, with the proviso that they understand that, periodically, it will break. Why recommend it at all? Because it is cheaper and less intrusive than the other leading brand. DirecTV’s TIVO box costs an extra $5 a month and must be plugged in to your phone line to avoid having a nagging message appear every day after a multi-week grace period. Most Dish Network receivers are free of such charges and can run forever without reporting back any information about your viewing habits to headquarters or hassling you about not allowing them to peer around in your system.

On the other hand, TIVO is a PVR, and Dish Network still doesn’t offer a true PVR. Instead, they offer a video recorder that happens to use a hard disk, instead of tape. The difference? PVRs offer two ways to record shows. One is the traditional, program-a-timer method. Both companies offer that.

The other way sets PVRs apart. I call it “lying-in-wait.” If you are a Clay Aiken fan, for example, you tell your PVR you want it to record any show that has Clay Aiken on it. Six months from now, if Clay shows up in a scheduled MTV interview, he will be recorded with no further intervention on your part. You are freed from having to search the listings each and every week.
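For the curious, the "lying-in-wait" idea amounts to a standing search run against each new batch of program listings. Here is a minimal Python sketch under stated assumptions: the `Listing` fields, the `matches` and `shows_to_record` helpers, and the sample guide entries are all made up for illustration, and a simple case-insensitive substring match stands in for whatever matching a real PVR actually performs.

```python
from dataclasses import dataclass


@dataclass
class Listing:
    """One entry in a program guide. The fields are hypothetical."""
    title: str
    channel: str
    description: str


def matches(listing, keyword):
    """Case-insensitive substring match against title and description."""
    needle = keyword.lower()
    return (
        needle in listing.title.lower()
        or needle in listing.description.lower()
    )


def shows_to_record(wishlist, guide):
    """Return every guide listing that mentions any wishlist keyword."""
    return [
        listing
        for listing in guide
        if any(matches(listing, keyword) for keyword in wishlist)
    ]


# The viewer states the wish once; the guide data below is invented.
wishlist = ["Clay Aiken"]
guide = [
    Listing("Evening News", "Channel 1", "Top stories of the hour."),
    Listing("Celebrity Interview", "MTV", "Guest: Clay Aiken."),
]

for listing in shows_to_record(wishlist, guide):
    print(f"Recording {listing.title} on {listing.channel}")
```

The point of the design is that the wishlist persists while the guide changes: the same check runs every time new listings arrive, so the viewer never has to search them by hand.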

Will Dish Network ever finally offer true PVR functionality? Probably, but I’m not looking forward to it. Why? Because, based on the last one hundred and fifteen releases, it will take weeks or months for the dust to settle. In the meantime, I might not be able to record at all.



---
 
 
Copyright Bruce Tognazzini.  All Rights Reserved