A PTS Professional Paper
Web Site Design, Usability, Documentation
Three Usability Methods
Phillip T. Scarborough, PTS Technical Writing

A COMPARISON OF
THREE METHODS FOR THE
This paper compares three usability assessment methods for Web sites: usability focus groups, heuristic evaluation, and laboratory testing. It describes Web site usability issues and presents case histories of studies conducted by the authors, illustrating why each assessment method was chosen and how it was applied.

Usability Focus Groups

Usability focus groups apply a method that originated in market research to obtain rich qualitative information from target users, often when a Web site is still in the planning or early development stages. Focus groups are helpful for:

In a focus group, people with similar characteristics who don’t know one another discuss selected topics with the assistance of a moderator [1]. Focus groups place people in relaxed group situations, as opposed to the controlled situations typical of laboratory testing.

Strengths of Focus Groups

Allow discussion flexibility. The relaxed, open environment of the focus group allows new issues to emerge for discussion. The moderator can explore these unanticipated issues with participants and ask probing questions to collect detailed information.

Enable larger sample sizes. Laboratory testing is usually done with fewer than 20 participants because of the time and cost constraints of individual sessions. Not only are group (rather than individual) sessions a “multiplier” of the moderator’s time, but the groups can also grow somewhat in size, from a low of 5 or 6 people to a high of about 10 people per session, without dramatically increasing the resources needed.

Concerns About Focus Groups

Produce difficult-to-analyze data. Participants may modify or even reverse their positions after interacting with others, making focus group data more difficult to analyze. Usability specialists must take care to interpret comments in context and in order, and to avoid drawing conclusions too early.

Require trained moderators. A skilled moderator is key to obtaining the best results from a focus group. Untrained moderators typically lack the expertise to ask open-ended questions, use techniques like pauses and probes, and avoid influencing the discussion (especially if they are part of the product development or marketing team).

Behave in varied ways. Each focus group has unique behaviors. One group might be lethargic and dull, the next energetic and stimulating. Conducting multiple sessions helps balance the individual differences that may emerge from a single focus group.

Complicate scheduling. Each focus group requires that all participants come to a specified place at an appointed time. Laboratory testing offers more flexibility in scheduling, because individual participants usually have choices of test times and dates.

Methodology for Focus Groups

We use two-person teams because an effective focus group moderator must concentrate on the facilitating task: drawing out quiet group members, eliciting explanations of ambiguous or incomplete comments, and making sure everyone’s opinions are respected. As a result, the moderator’s notes about the participants may be sketchy and uneven. Also, a trained observer provides “outside objectivity” that’s hard for members of a product team to achieve. The same immersion in their development and marketing goals that creates a successful product makes it difficult for the product team to step back and see it with fresh eyes.

Each focus group typically consists of 7 to 10 participants who share key characteristics that relate to the study topics. The authors’ firm employs trained interviewers on our staff to recruit participants. Depending on the focus group topic, participant candidates may come from existing customer lists, our internal database of potential participants, or targeted newspaper ads. A detailed screening script and questionnaire, developed in conjunction with the client, ensure consistency in selecting participants.

The moderator conducts the focus groups from a script outline that lists high-level questions and issues, with supplementary “probing questions” for each. However, because of the dynamics of the group environment, focus group scripts can’t be as detailed as those for laboratory testing. The moderator must remain flexible, which in turn puts more demands for in-depth note-taking on the observer.

We supplement our detailed notes with videotapes and back-up audiotapes of the sessions. We then analyze the collected information and prepare a written report on the focus group findings.

Product developers are sometimes concerned that usability specialists who moderate focus groups won’t have the in-depth product knowledge to answer participants’ questions, especially if the moderator comes from a consulting firm. The authors normally invite developers to observe focus group sessions. If difficult questions arise, the moderator offers the participants the opportunity to talk with the developers at the end of the session. This approach also gives developers the opportunity to ask questions that occur to them while observing, without interfering with the planned agenda of the focus group.

Usability Focus Group Case History:

Two 90-minute focus group sessions were held in the company’s offices, in a conference room set up with portable video equipment. A monitor was set up in an adjacent room so that the developers could observe the focus groups without disturbing or influencing the participants. One usability specialist moderated the sessions, and another took notes at the back of the room.

The focus groups began with questions about the participants’ current Web use, which helped us to understand the participants’ opinions in the context of their experiences. For example, we asked the participants how they currently conducted Web searches and what they thought about various aspects of browser and search engine technology.

We next introduced the first user interface design for the product by projecting slides on a screen at the front of the room. We asked participants what they perceived as the advantages and disadvantages of the product, how it would help or hinder their own search processes, and how they would feel about using it on a daily basis. We repeated similar questions for each of the three user interface designs being considered.

To counterbalance the effects of which user interface the participants saw first, we reversed the order of presentation of the alternative designs for the second group. We would have preferred to conduct three focus group sessions, but project resources did not permit it. However, the behavior and opinions of the two groups were quite consistent.

At the end of each session, we asked what participants thought about specific features that the developers wanted feedback on. We also discussed which of the alternative designs the participants preferred, and why.

The focus group results showed that most participants in both groups clearly preferred the same user interface design. Typical concerns centered on the placement of buttons and icons, and on the overall ease of use of the product. However, participants did not agree on which features would be more or less useful to them.

Heuristic Evaluation

Heuristic evaluations are expert evaluations of products or systems, including information systems and documentation. They’re conducted by usability specialists, domain experts, or, preferably, “double experts” with both usability and domain experience. Evaluators use industry-accepted guidelines for usability (“heuristics”), their own experience from prior usability studies, their domain knowledge, and their ability to “put on the user’s hat” when identifying problems and recommending solutions.

Heuristic evaluation by two or more usability specialists can identify a majority of the usability problems in a Web site, with the problem-identification percentage increasing as you add evaluators [2]. More evaluators not only find more problems, but also provide a better indication of problem severity when they jointly analyze the results of their independent evaluations.

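The diminishing return from adding evaluators can be illustrated with a simple probability model. The sketch below is not taken from this paper. It assumes that each evaluator independently finds any given problem with the same probability, so that the expected share found by n evaluators is 1 - (1 - rate)^n. The function name and the default rate of 0.31 are illustrative assumptions; real detection rates vary with the site and the evaluators.

```python
def proportion_found(evaluators: int, detect_rate: float = 0.31) -> float:
    """Expected share of usability problems found by a team of evaluators.

    Assumes every evaluator finds any given problem with probability
    detect_rate, independently of the others (a simplifying assumption).
    """
    if evaluators < 0:
        raise ValueError("evaluators must be non-negative")
    if not 0.0 <= detect_rate <= 1.0:
        raise ValueError("detect_rate must be between 0 and 1")
    # A problem is missed only if every evaluator misses it.
    return 1.0 - (1.0 - detect_rate) ** evaluators


if __name__ == "__main__":
    for n in range(1, 6):
        print(f"{n} evaluator(s): {proportion_found(n):.0%} of problems expected")
```

Under these assumptions, each added evaluator raises the expected share by a smaller amount than the one before, which is one way to reason about how large an evaluation team is worth funding.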

Strengths of Heuristic Evaluation

Increases the value of lab testing. As the first phase of a two-phase usability effort, heuristic evaluation can greatly increase the value of laboratory testing. Heuristic evaluation “harvests the low-hanging fruit” by identifying obvious or clear-cut usability problems.

Unmasks hidden usability problems. Without prior heuristic evaluation, test participants may spend much of their sessions struggling with an obvious usability problem. Meanwhile, other equally important usability problems can be “masked” by the first problem and not be found during testing.

Is suitable for early use. The two-phase approach of heuristic evaluation followed by laboratory testing is consistent with current iterative software development practices. For example, heuristic evaluation can take place on an early prototype, before changes become costly, while laboratory testing can follow at the alpha stage.

Concerns About Heuristic Evaluation

Rarely emulates key audience groups. Heuristic evaluation rarely emulates all the key audience groups for a Web site. For example, user groups accessing a site devoted to financial planning might include accountants, stock brokers, insurance company professionals, insurance agents, financial planners, bankers, SEC attorneys, and more.

Depends on evaluator expertise. Heuristic evaluation is highly dependent on the skills and experience of the evaluators. Usability specialists may lack domain expertise; domain specialists are rarely trained or experienced in usability methodology. The authors prefer to concentrate on usability expertise because the developers can usually fill gaps in domain knowledge.

Can appear to be just another opinion. The developers of a new Web site often have strong design opinions. The results of a heuristic evaluation can sound like just another opinion, and why should the developers accept the usability specialists’ opinion over their own?

Methodology for Heuristic Evaluation

The findings are usability problems and concerns about the Web site, as well as notes of successful features that shouldn’t be changed. Often we can recommend specific improvements; sometimes we only suggest design directions to follow.

We generally organize our findings into four categories: user task support, UI behavior, presentation, and terminology. Although there tends to be overlap in findings among these categories, using the categories ensures that we give full attention to each aspect of a usability problem.

The evaluation team always delivers a written report of findings and recommendations. When practical, we give an oral results presentation as well, to discuss the findings with the developers.

Heuristic Evaluation Case History:

The publisher commissioned a series of usability studies of the Web site user interface. The first study was a heuristic evaluation to identify first-tier problems that could be found without collecting user data, such as inconvenient placement of screen elements, unfamiliar terminology, and cross-platform readability issues.

The software engineers developing the site were receptive to the value added by usability assessments. Many of the issues the evaluators identified had already emerged in development discussions and informal UI walkthroughs. In addition, although the prototype user interface had not yet undergone graphic redesign, the heuristic evaluation results gave the site’s graphic designers more insight into how users approached their search tasks. Meanwhile, the developers worked from the evaluators’ suggestions to create a more usable interface for the next prototype, on which we conducted laboratory testing.

Laboratory Testing

In laboratory-based usability testing, people whose characteristics (or “profiles”) match those of the Web site’s target audience perform a sequence of typical tasks using the site. The test participants, usually working one at a time, perform the same tasks under controlled conditions.

A detailed description of formal usability testing methodology is beyond the scope of this paper. Several recent books and papers discuss laboratory testing in detail [3, 4]. A previous paper by one of the authors compares laboratory testing to several other usability methods [5].

Laboratory testing of Web sites can explore questions with measurable answers, confirm or challenge the assumptions of developers, and help choose between design alternatives. Recording user behavior on Web sites is especially challenging, because users can take numerous possible paths to reach their goal and often cycle through pages repeatedly.

Strengths of Laboratory Testing

· Which of two alternative designs for a home page is more successful, and why?
· What problems do people encounter performing product registration on a Web site?
· How long does it take people to find desired information on a search site?
· What problems do people encounter when downloading software from a Web site?

Reassures managers. Corporate managers accustomed to numerical data usually find laboratory testing reassuring.

Convinces observers. If the Web site developers can watch actual test participants having problems using the site, this experience is often more convincing than the opinions of usability specialists, however similar. (A dedicated laboratory facility isn’t required; developers can observe at a remote monitor through a video-camera feed, or watch videotapes after the test sessions.)

Concerns About Laboratory Testing

Web sites can be moving targets. Especially when navigation from the home page is an issue, a changing Web site can degrade the script developed to explore the issues identified for the lab test. Web site developers have to resist modifying a particular Web page version while laboratory testing takes place; usability specialists have to be willing to adjust the script right up to the day before the test.

Lab tests require more resources. Even usability testing with tightly focused issues and 4 to 6 participants per audience group [6] requires more resources than heuristic evaluation and usually requires more resources than focus groups.

Lab tests often take longer. Because of the need to recruit participants with profiles that match the target audience for the Web site, it’s difficult to gain reliable data from a laboratory test in less than three weeks from the project start date, and many laboratory tests take considerably longer. The entire process typically takes from four to six weeks, including results reporting.

Methodology for Laboratory Testing

The vast number of user path alternatives at a Web site, especially a large informational Web site, makes usability test task scenarios trickier to scope. Rather than directing users to specific paths, our approach has been to allow users to go wherever they please to perform a task; we track where they go and their stated reasons. The greater the number of users recruited, the more we can assess which pathways are more frequently traveled and why.

The browser history list does not adequately record the order of pages visited, the links selected, or how much time users spent on each page. Server logs provide vast amounts of data that requires time-consuming analysis, and even then one does not know why a user spent a lot of time on a page.

The note-taking method used at the authors’ firm captures these types of information, which we believe are critical to understanding the scope of usability problems at a Web site [7]. We embed in the script specific prompts for note-taking about user activities. We also have a printout of the Web pages themselves on which to note where users visited and in what page order. Of course, we also videotape the test sessions, but our clients usually want the results more quickly than we can deliver if we need to watch all the videos.

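The kinds of information listed above (page order, links selected, time on each page, and stated reasons) can be pictured as a simple record per page view. The sketch below is only an illustration and is not the authors’ note-taking method, which is paper-based. The class name, field names, helper functions, and sample observations are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class PageVisit:
    """One observed page view (all field names are hypothetical)."""
    page: str          # page the participant landed on
    link_used: str     # link or control that led there
    entered_at: float  # seconds since the session started
    reason: str = ""   # participant's stated reason, if any


def dwell_times(visits, session_end):
    """Return (page, seconds_on_page) pairs in the order pages were visited."""
    ordered = sorted(visits, key=lambda v: v.entered_at)
    # A page view ends when the next one starts; the last ends with the session.
    leave_times = [v.entered_at for v in ordered[1:]] + [session_end]
    return [(v.page, leave - v.entered_at) for v, leave in zip(ordered, leave_times)]


def path_counts(sessions):
    """Count how many sessions followed each exact sequence of pages."""
    return Counter(
        tuple(v.page for v in sorted(session, key=lambda v: v.entered_at))
        for session in sessions
    )


# Made-up observations for two participants performing the same task.
p1 = [
    PageVisit("home", "(start)", 0.0),
    PageVisit("search", "Search button", 12.0, "expected a search box"),
    PageVisit("home", "Back", 40.0, "results looked wrong"),
    PageVisit("search", "Search button", 47.0),
]
p2 = [
    PageVisit("home", "(start)", 0.0),
    PageVisit("search", "Search button", 9.0),
]

print(dwell_times(p1, session_end=90.0))
print(path_counts([p1, p2]).most_common())
```

Keeping the stated reason next to each page view preserves the “why” that server logs cannot supply, and counting whole page sequences shows which pathways participants travel most often, including repeated cycles through the same pages.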

Laboratory Testing Case History:

Two usability specialists administered and observed the test sessions, using participants who met the screening criteria for people who would be likely users of the Web site. During the test sessions, participants “walked through” some two dozen Web pages to search or browse for particular types of products, view product information, and refine searches.

The usability team collected both qualitative and quantitative data, including which choices users made to find product information and how satisfied they were with the results. The test administrator also interviewed participants about the improvements they wanted in the final Web site and their preferences for planned features, such as automatic updates about selected products.

Participant experiences and comments indicated that the Web pages were generally logical and easy to use. However, certain terminology, screen elements, and page-layout choices continued to slow first-time use; participants also wanted more options at the highest level of the information hierarchy. The authors’ recommendations included alternatives that would address participants’ problems in these areas.

Conclusions

In considering which of the three methods presented in this paper to try first for evaluating Web site usability, let’s suppose an organization or company has just a small window of time in which to prove the value of usability research in the development cycle. In that case, the authors recommend starting with collection of primary user data through laboratory testing. Actual user data will convince more people, especially in engineering-driven companies, than will usability focus groups or heuristic evaluation.

If an organization is receptive to usability research or already has a usability program in place, an iterative sequence of usability focus groups, heuristic evaluation, and laboratory testing achieves the greatest value from each method:

· The usability focus groups gather user requirements and opinions to support
· The heuristic evaluation makes a pass at catching the most visible usability
· The laboratory testing validates the resulting product improvements and focuses

Would the authors ever recommend usability focus groups or heuristic evaluation alone? Yes, when resources and priorities make it necessary, because some usability evaluation is better than none at all.

For example, if a new Web site differs greatly from its predecessors or if the development team recognizes they need more information about user needs and desires, it’s especially important to conduct usability focus groups early in the design process. Another opportunity may occur later to obtain resources for usability testing.

On the other hand, if a site has already undergone iterative testing and is now receiving minor revisions, or if a site has an extremely small usability budget, then a modest heuristic evaluation project can be entirely appropriate.

Every usability project the authors have performed over the past ten years has produced many recommendations, usually both ideas for immediate implementation and others that influence long-term development. Although developers can generalize more reliably from laboratory testing data, all usability research methods can produce valuable results.

Acknowledgments

Portions of this paper appeared in a slightly different form in the SIGDOC 97 Proceedings, published by the Association for Computing Machinery.

References

2. Nielsen, J. (1993). Usability Engineering. New York, NY: Academic
3. Dumas, J.S. and Redish, J.C. (1993). A practical guide to usability testing.
4. Kantner, L. (1994). Techniques for Managing a Usability Test. IEEE
5. Rosenbaum, S. and Kantner, L. (1995). “Alternative Methods for Usability
6. Virzi, R.A. (1992). “Refining the Test Phase of Usability Evaluation: How
7. Kantner, L. (in press). “Following a Fast-Moving Target: Recording User