Back to Diane Nahl Home Page
Back to Table of Contents of the User-Centered Revolution Article

Search Behavior as Problem-Solving

There are three broad components of information seeking (p. 402): information need, search strategy, and execution of strategy. Search behavior is redefined as "a problem-solving activity in which subskills are organized and retrieved according to the individual's perception of a current information need" (p. 403). For example, conducting a search is a broad unit within which hierarchical sub-skills are embedded, one of which might be the strategy of using periodical indexes. Similarly, this sub-skill has other organized sub-skills within it, such as deciding on search terms, selecting the right index, looking up the subject in the index, most current first, etc. Each sub-skill is itself composed of other hierarchically organized skills. Search behavior is thus "goal-directed problem solving" (57) (p.403). User-centered instruction takes this into account: "Library use is an activity of one mind seeking contact with other minds. The study of the cognitive processes of library users allows librarians to develop a new focus on the inner micro-environment of information seekers" (57) (p. 407).

Indexing is the language chosen by information specialists to reflect the "aboutness" or content of documents. "User-centered indexing," according to (58) (p.572), "cannot be developed before searching behavior is understood better." A number of issues that guide the practice of indexers are related to users. One is the issue of user language, that is, the level and specificity of the index terms. Another is exhaustiveness or comprehensiveness, i.e., what to cover and what to leave out. In the older paradigm, also known as "the document-oriented approach" (58) (p.573), "indexing can be done with no knowledge or consideration of users or their needs," but in the new paradigm, "request-oriented indexing" acknowledges the centrality of users. One example cited by Fidel is a study in which a "filtering" technique is employed to check each descriptor chosen by the indexer: "Would any one of our users who is interested in the content of this document use this descriptor as part of the query formulation?" (58)(p.574). Request-oriented indexing depends on the indexer's ability to anticipate user requests. Each document is represented in the database by a list of anticipated requests, and these elements form the index language.

Automated indexing has grown in the past 30 years and its developers claim that its retrieval performance is at least as good as that of systems that use intellectual indexing (58)(p.575). Several reasons are given in favor of the user-friendliness of automated indexing. One is that search requests expressed in natural language are accepted and Boolean query formulation is not required. A second advantage cited is the provision for relevance feedback, in which user ratings of retrievals are used to improve the search. Other advantages mentioned are ranked output of retrievals in terms of relevance and the capability of automatic query expansion. Thus, on the one hand, automated indexing provides advantages in flexibility, dynamism and control. On the other hand, its terms are limited to the language of the text itself since no descriptors or keywords are employed.

An experimental study attempted to implement a user-responsive search system that allowed people to "add keywords to the indexing of an online retrieval system based on their use of the documents in the system" (59)(p. 153). A dynamic system that adapts to the needs of users is indeed a user-centered feature, though we still lack the practical knowledge for implementing such a system. In this case, the system had difficulty dealing with user language and error control. Variant spellings, errors in spelling, and terms that are too broad need more complex expert systems than are now routinely available. In an attempt to remedy this insufficiency, (7) tested ELSA, an intermediary system with "intelligent search functions" that solve some of the problems associated with user spelling and language. For example, when users are not familiar with the syntax of a system, they may type in "Search James, William," which does not work because the system expects "aut/James, William." ELSA helps with this problem by accepting variant forms for author searches and by providing lists of available concepts as well as lists of synonyms that trigger those concepts.
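The idea of accepting variant forms for author searches can be illustrated with a minimal sketch. It is not ELSA's actual logic, which the source does not describe. The function name normalize_author_query, the list of accepted lead-in words, and the handling of names typed without a comma are all assumptions made for illustration; only the target form "aut/James, William" comes from the example above.

```python
import re

# Target syntax taken from the example in the text: "aut/<Last>, <First>".
AUTHOR_PREFIX = "aut/"

# Assumed lead-in words a novice might type (hypothetical, not ELSA's list).
# A separator (whitespace, ':', '=' or '/') is required after the lead-in so
# that surnames such as "Austen" are not mistaken for the lead-in "au".
_LEAD_IN = re.compile(
    r"^\s*(?:search|find|author|aut|au)(?:\s*[:=/]\s*|\s+)",
    re.IGNORECASE,
)


def normalize_author_query(raw: str) -> str:
    """Rewrite a loosely typed author request into the 'aut/Last, First' form."""
    name = _LEAD_IN.sub("", raw, count=1).strip()
    if not name:
        raise ValueError("no author name found in query")
    if "," in name:
        # Already in "Last, First" order.
        last, first = (part.strip() for part in name.split(",", 1))
    else:
        # Assumption: a name without a comma is typed as "First Last".
        parts = name.split()
        last, first = parts[-1], " ".join(parts[:-1])
    if first:
        return f"{AUTHOR_PREFIX}{last}, {first}"
    return f"{AUTHOR_PREFIX}{last}"


if __name__ == "__main__":
    for typed in ["Search James, William", "aut/James, William", "William James"]:
        print(typed, "->", normalize_author_query(typed))
```

A real intermediary system would also need the spelling correction and synonym lists mentioned above; this sketch covers only the syntax variation.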

There has always been the recognition among indexers that indexing and searching are related, but now the powerful tools of automation are providing new user-centered capabilities. The trend is not to choose one system over another, but to make all systems available simultaneously in the service of accommodating user styles, needs, and backgrounds. This is especially desirable since there is "little agreement in the names people use and the names recommended for use by LC [Library of Congress Subject Headings], implying that retrieval systems should do more to accommodate common naming behavior" (60) (p.116).

Types of Moves in Searching

To understand the user's perspective, one researcher (61-63) has studied how users make decisions by observing their 'moves' within a search activity. Examples of moves made by searchers include any activity that allows the searcher to continue progressing:

-- using help files

-- reading the screen

-- issuing commands

-- guessing

-- re-reading the screen

-- asking questions of themselves or others

-- consulting documentation

-- trying to get to a menu

-- trying to get out of a menu

-- trying to figure out where they are

-- modifying a search string

-- displaying a record

Fidel has studied several aspects of online searching behavior, including (1) choosing databases, (2) choosing search terms, (3) conducting the search, (4) reviewing results (feedback review), (5) making new decisions in response to the evaluation of results, and (6) terminating the search and getting a printout (61-63). Her approach consists of identifying incidents where a search key was selected, and then fitting each incident into a decision tree, resulting in a catalog of criteria for decisions. For instance, during the process of choosing search terms, the first decision point is to determine whether a "single-meaning term" (which is good for free-text searching) is wanted or a "common term (broad and fuzzy meaning with too many contextual variations)" (61) (p.493). In case it is a single-meaning term, the next decision point is to map the term to a descriptor. For example, the topic "Anxiety about using computers" is matched to descriptors such as TECHNOPHOBIA (narrower) or ANXIETY (broader). This matching process involves the semantic content (the concept) and the system language (controlled vocabulary). If no exact match is found, then a partial match may work in conjunction with textwords (key terms) for an inclusive search. If no match is found, textwords can be used to probe indexing further. This case history approach demonstrates that searchers use intuitive as well as explicit "rules" for decisions. Descriptive efforts on how users make decisions in an information retrieval situation produce knowledge engineering trees that can help in the design of intermediary systems that assist the searcher through 'dialog box' inquiries about the user's purpose and scope of the search, as well as requests to evaluate sample retrievals. According to Fidel,

Understanding how searchers of all types look for information, and how they interact with existing systems, can provide guidelines for searchers' training and assistance (62) (p.501).
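The term-selection decisions described above (exact descriptor match, partial match combined with textwords, no match) can be sketched as a small routine. This is only an illustration of the branching, not Fidel's decision tree itself. The toy thesaurus, the word-overlap test used to define a "match," and the names TermDecision and choose_search_keys are hypothetical; the branch for common terms is left open because the passage does not describe it.

```python
from dataclasses import dataclass

# Hypothetical toy thesaurus: descriptor -> words it covers.
# This is not a real controlled vocabulary.
THESAURUS = {
    "TECHNOPHOBIA": {"anxiety", "computers"},
    "ANXIETY": {"anxiety"},
}


@dataclass
class TermDecision:
    descriptors: list  # controlled-vocabulary terms to search
    textwords: list    # free-text words to search
    rule: str          # which decision rule fired


def choose_search_keys(words, single_meaning, thesaurus=THESAURUS):
    """Walk the decision points for one query concept."""
    wanted = {w.lower() for w in words}
    if not single_meaning:
        # The passage does not spell out this branch, so the sketch stops here.
        return TermDecision([], [], "common term: branch not covered by this sketch")
    # Assumption: an exact match means the descriptor covers exactly these words.
    exact = sorted(d for d, covered in thesaurus.items() if covered == wanted)
    if exact:
        return TermDecision(exact, [], "exact descriptor match")
    # Assumption: a partial match means at least one word in common.
    partial = sorted(d for d, covered in thesaurus.items() if covered & wanted)
    if partial:
        return TermDecision(partial, sorted(wanted), "partial match combined with textwords")
    return TermDecision([], sorted(wanted), "no match: textwords only")


if __name__ == "__main__":
    print(choose_search_keys(["anxiety", "computers"], single_meaning=True))
    print(choose_search_keys(["anxiety", "exams"], single_meaning=True))
    print(choose_search_keys(["hypertext"], single_meaning=True))
```

Recording which rule fired for each incident mirrors the idea of building a catalog of decision criteria from observed searches.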

It is evident that decisions during searching are guided by users' perceptions. These can be identified by asking them to state the reason for a particular decision. These reasons or perceptions can often be validated by objectively observable evidence such as transaction logs that record what keys have actually been pressed in sequence. Another method of validation consists in examining frequency distributions and correlations across several searchers for a single search, or across several searches for one individual. Other variations include the number of databases, different subject areas, type of information venue (research, industry, other), and number of moves required (task complexity). Having conducted research with these variations, in her conclusions, Fidel emphasizes the finding that search decisions and strategy are heavily determined by system features, especially the structure of controlled vocabulary and the availability of effective online thesauri:

Thus, research should be carried out to discover which features of databases and their thesauri can be standardized without affecting retrieval quality. The role of intermediary expert systems will then be to bridge the necessary differences, employing switching languages and other terminological and semantic networks (62) (p.514).

Fidel analyzed the verbal and search protocols of 47 professionals in the hope of discovering "what characteristics of searching behavior constitute a searching style" and "in what way one individual searcher is different from another, all external conditions being equal" (63) (p.515). Fidel, citing Fenichel (64), estimates that research on online search behavior has declined recently because prior experiments have failed to provide conclusive results on individual differences such as experience, cognitive attributes (65), personality traits (66), and type of request (67).

A common explanation given for these partly negative results is that "individual search styles override most measured attributes of searching behavior" (63), (p.515). Fidel thinks that this attitude is premature since we don't yet understand the specifics of search styles. She analyzed data on verbal protocols of thought processes while searching and from interviews with searchers to determine reasons for their search key selections. Fidel built a two-layered model to represent the derivation of six types of moves. A move is defined as any modification in search strategy aimed at improving results. Operational moves do not change the meaning of a request, while conceptual moves do. For instance, Type 2 moves are operational moves aimed at increasing precision. Other examples of moves include the following:

Type 1

-- Intersect free-text terms to occur in a predetermined field.

-- Limit time (by date).

Type 3

-- Add synonyms and variant spellings.

-- Eliminate restrictions previously imposed.

Type 4

-- Intersect a set with a set representing another query component.

-- Select a narrower concept.

Type 6

-- Enter a broader descriptor or term.

-- Group together search terms to broaden the meaning of a set.

Of the 1,244 moves made by the 47 professionals in their 281 searches, 60% were operational and 40% were conceptual. The number of moves to increase recall was about double the number of moves to increase precision. Fidel takes this as a sign of "the difficulty in achieving satisfactory recall in the databases currently available" (63) (p.518). Operational moves are used to improve precision or to reduce a set, while conceptual moves are used to improve both precision and recall. The average number of moves per search was 5, with a range of 1 to 18. It is of interest that these professional searchers only used 25% of the available moves, indicating that "search systems should remind searchers of the complete array of moves possible in online searching" (63) (p.519). Interactivity (number of moves) was not related to purpose, concern with recall, or subject area. It thus appears to be an element of search style.

In conclusion, Fidel uncovered three search style characteristics of professionals that were not related to results or success:

(1) Level of interaction or number of moves;

(2) Preference for operational versus conceptual moves;

(3) Preference for using textwords versus descriptors.

Nahl believes that these search style measures can be influenced by the speech act content of instructions, as well as by other searcher characteristics such as success, satisfaction, self-confidence, and frustration (20). Since both the cognitive and affective domains of the user's world must be addressed, it is clear that the new user-centered perspective needs to focus on a broader spectrum of the user's involvement in the search process.

Reformulations

According to Dalrymple, "the user-centered approach derives its questions and methodology from the behavioral sciences" (68) (p.272). In her study, subjects were assigned search problems and their cognitive processes were studied by having them think aloud while searching. Of special interest were their "reformulations," a term that refers to attempts to refine the terms of a query. Reformulations are assumed to be cognitive processes that interact with searchers' long-term memory retrieval operations, and may be conceptually similar to database access and retrieval operations. "Presearch reformulations" transform the language of the information need into terminology that initiates the search. "Search reformulations" are modifications of search terms during the search. For example, the statement, "I'm looking for books about women writers in 20th century German literature" needs to be translated into the form, "Feminist writers." This presearch reformulation may be modified later during the search by adding, "Feminist writers German literature." Dalrymple found that significantly more reformulations were made when subjects searched an online catalog than when they searched a card catalog. She calls for additional investigations of reformulation to determine whether it can be taught, how it is affected by the urgency of the information need or by individual differences in search style, and whether it acts as a request for feedback from the system.

Transaction-Log Analysis

The closer study of the cognitive processes of searchers has been aided by software capabilities that permit a detailed approach to the study of online search behavior known as transaction log analysis. The authors of a "manifesto regarding the future of transaction log analysis," Sandore and colleagues (69), urge the development of unobtrusive measures for expanding the information base librarians have about users' styles and patterns of information seeking, including searching, browsing habits, circulation activities, and evaluating and digesting information. Such expanded "tracking" will allow the resolution of the fundamental question "whether it is the system or the user that requires improvement" (69)(p.106). The authors suggest that

It is possible to conceive of an IR system that was so adaptable that it could make searches successful when conducted by someone with a totally misguided mental model of the search process. If we design systems that are 'idiot proof,' are we encouraging mediocrity by not attempting to educate users with feedback from their searching mistakes? (69, p.106).

These researchers advocate helping users become proficient searchers, and to accomplish this, one needs a research program using transaction log analysis to advance the knowledge base that information professionals have available about users. Transaction log analysis "provides a glimpse of the world of user searching that is otherwise inaccessible" (69) (p.106). The sequence of cognitive activity is captured in the verbatim record of a searcher's keystrokes, representing decisions made throughout the search. Common strategies, varieties of errors, predominant search modes, and the like are revealed, providing heretofore unavailable information on how users adapt to the system.
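A minimal sketch of what such an analysis might compute is given below. The log format (one line per command, with a session identifier and the typed command separated by a tab), the sample data, and the name summarize_log are all assumptions made for illustration; real systems record transactions in their own formats.

```python
from collections import Counter, defaultdict

# Hypothetical log: "<session id><TAB><command as typed>", one entry per line.
SAMPLE_LOG = """\
s1\tfind su=anxiety
s1\tdisplay 1
s1\thelp
s2\tfind au=james
s2\tfind aut/james, william
"""


def summarize_log(log_text):
    """Return moves per session, command-verb frequencies, and mean moves per session."""
    per_session = defaultdict(list)
    for line in log_text.splitlines():
        session, _, command = line.partition("\t")
        command = command.strip()
        if not session.strip() or not command:
            continue  # skip blank or malformed lines
        per_session[session.strip()].append(command)

    moves_per_session = {s: len(cmds) for s, cmds in per_session.items()}
    # The first word of each command is treated as its verb (an assumption).
    verbs = Counter(
        cmd.split()[0].lower() for cmds in per_session.values() for cmd in cmds
    )
    if moves_per_session:
        mean_moves = sum(moves_per_session.values()) / len(moves_per_session)
    else:
        mean_moves = 0.0
    return moves_per_session, verbs, mean_moves


if __name__ == "__main__":
    moves, verbs, mean_moves = summarize_log(SAMPLE_LOG)
    print(moves)
    print(verbs.most_common())
    print(mean_moves)
```

Counts like these describe what was typed and in what order; they say nothing about why a searcher made a given move.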

User-centered design in the 1990s dovetails with the area of human-computer interaction (HCI), incorporating cognitive engineering, software engineering and usability engineering, including artificial intelligence (AI) (70). There is a new awareness and concern about the need to "overcome the gulf that exists between system designers and users" (70) (p.438). There is a new desire to deal with users' psychological blocks and frustrations by building systems that accommodate the various levels of skills and try to eliminate unnatural human-computer exchanges such as, for example, having to wait for the execution of a command without knowing how long it will take, or even whether the system is processing the command at all. Currently, more information retrieval software designers are providing transaction log utilities that can aid in studying local use of systems. Transaction logs do not capture the thoughts and feelings of searchers, and thus cannot provide insight as to why search decisions were made. To achieve a fuller description of an individual's search process, a complementary methodology is required to supplement the transaction log data.

The Affective-Cognitive Connection

The desire to assist novice users in learning to operate unfamiliar information systems has naturally focused on what knowledge needs to be imparted to allow users to become independent explorers. However, working with novices within an educational setting made it clear to several researchers (19, 56, 60, 71, 78) that dealing with cognitive issues is not sufficient. The affective domain in learning is equally important and possibly more fundamental. Without the desire and continuous motivation to learn, few novices would become independent searchers and cyberspace navigators. There is a significant dimension of resistance to information seeking that reflects feelings of technophobia, information shock, and depression commonly experienced by novices within a bewilderingly complex information world.

There has been a lack of integration of cognitive and affective functions in the user as illustrated in an experimental study on 120 sixth graders, comparing their recall scores on entries from Compton's Multimedia Encyclopedia on CD-ROM (72). The research question was whether multimedia presentations were more effective than simple text-on-screen. No significant main effects were found, and the authors conclude that "it is not sufficient to add animation to text in order automatically to enhance children's learning" (72)(p.527). Though this conclusion is qualified in some ways, it appears that the system's assessment is based on the children's performance on cognitive skills alone (recall and inference). The authors report that "reactions to the CD-ROM were almost uniformly very positive" and that students "found the animation sequences appealing," but still conclude that "its effectiveness as a means to improved learning is dependent on a number of factors such as text, text type, and media integration" (72)(p.527). It is clear that the affective aspect is given no weight in the assessment of improved learning. The tendency to use comprehension measures without integrating them with affective measures is a characteristic feature of user studies in the older paradigm.

The following is an example of an experimental study which does not integrate the affective and the cognitive domains, but gives equal weight to them in systematic contrasts and evaluation. Seventy graduate students in library and information studies were introduced to HyperLynx, a hypertext bibliographic information retrieval system (73). They were given five search queries, one of which was "to find documents that discuss information retrieval effectiveness." Search success varied with the query; some queries were more difficult than others. In the cognitive domain, knowledge of hypertext systems was assessed in interviews, and it was found that this variable had no effect on ability to search or success. In fact, the knowledge differential was small since few students had more than a slight reading knowledge of what hypertext is. In the affective domain, users were asked to compare the hypertext system with the Boolean-based systems they had experienced before. There was a generally favorable response to the hypertext system by the majority, but at the same time all subjects expressed frustration at not being able to use Boolean searching with the full text documents. It is concluded that user "preference for string searching for particular search tasks such as authors and very specific content suggests the need for a hybrid system with both string and hypertext search capabilities" (73)(p.27). In this study the cognitive and the affective domains were equally addressed, to the greater benefit of users as they struggle to operate in fluid information environments.

An explicit attempt to integrate the affective and cognitive domains of information behavior is found in a study by Metoyer-Duran (74)(p.320). She was interested in how information in a community is diffused by "information gatekeepers," or "persons who help individuals gain access to resources needed to solve problems." Information providers and librarians are information gatekeepers by profession, but many people in a community act as agents of information diffusion, including, for example, housewives who attempt to alter their family's eating habits by introducing a new dietary philosophy, for which they rely on new information. She presents a taxonomic framework with three interacting domains: (a) cognitive skills, specifying the conceptual knowledge needed for information literacy, including terminology and logical reasoning strategy; (b) technological skills, referring to the operational literacy of information systems and data structures; (c) affective skills, referring to the gatekeeper's "innovation diffusion and learning behavior" (74)(p.332), including resisting or impeding the flow of information, or absorbing information selectively and distributing it to others.

The importance of connecting affective and cognitive functions is well known in theory, though considered difficult to apply in practice. Two comments by psychologists illustrate this emphasis. The first goes back to the first modern behaviorist (i.e., after Descartes), and the second comes from a current research report sponsored by the Air Force Human Resources Lab:

Man has two faculties: will and understanding. When the understanding is governed by the will they together constitute one mind, and thus one life, for then what the man wills and does he also thinks and intends (75). (No.35)

these two parts, the will and the understanding, are most distinct from each other, and for this reason, as before said, the human brain is divided into two parts, called hemispheres. To its left hemisphere pertain the intellectual faculties, and to the right those of the will (75)(Nos. 45; 644).

All activities are changes of state and variations of form, and the latter are from the former. By state in man his love is meant, and by changes of state the affections of love; by form in man his intelligence is meant, and by variations of form his thoughts; and thoughts are from affections (76)(No. 1146)

It is simply not possible to design either cognitive or psychomotor instruction without including some affective component. The very act of establishing an instructional goal implies some value to the person, organization or society in its achievement. The motivation to learn may already exist in the student before instruction, or it may need to be generated or enhanced by the instructional program. It is precisely because the affective is so entwined with the cognitive and psychomotor learning achievements that it needs careful attention during the design and development of instruction (77)(p. 18).
