5. Term Preferences. The availability of more than one term to represent a single concept is often advantageous, but it may also be confusing. Readers, for example, may not know whether the different terms are technical synonyms for a single idea, or whether some subtle distinctions are intended. To overcome the confusions that may arise from this fact, an effort is made in some terminological systems to select one of the synonyms for a given concept as its "preferred term." Synonyms for the preferred term are then listed as "permitted" or "deprecated" terms. Unless there is already a consensus on the choice, however, attempts to standardize usage in this way are likely to fail. The Committee on Terminology of the International Standardization Organization (ISO/TC37) has evolved a procedure whereby authoritative committees of a subject field publish "standardized vocabularies" in which a preferred term for each defined concept is prescribed.
Although this approach may eventually prove feasible in the social sciences, at present more modest goals seem to be necessary. Accordingly, no effort is made in an INTERCOCTA glossary to prescribe one of the technical synonyms for a concept as its preferred term. Rather, it seems desirable to give users as much information as possible about each of the terms for a concept so that they can make their own choices. The fact that several synonymous terms for a concept are available should lead users to mention them, to specify which will be used, and to warn readers when the use of a synonym might lead to ambiguity. Such enumerations also facilitate information retrieval by the identification of various possible index terms (descriptors).
5.1 Marking Terms. Eventually we intend to label terms in various ways to help users determine which will most effectively communicate to prospective audiences. However, in this pilot edition we restrict our marking of terms to three basic features identified in each record as "UT," "ET," and "ST." An "unequivocal term" (UT) is one that has no other meaning in the given subject field. Such terms may, of course, be polysemous, i.e., they have other meanings in different fields of knowledge. However, on the assumption that users will know that a word falls within the domain of "ethnicity research," a term will be considered "unequivocal" when it has only one sense in this field.
By contrast, a term is "equivocal" (ET) if it is likely to be ambiguous when used within the field of research on ethnicity. Normally this means that two or more concepts important to the field can be designated by the same term. In such cases the index to the glossary refers users to the separate records in which each meaning of an equivocal term is defined. To help users select the appropriate record, a second term, in parentheses, is added after the index entry for an equivocal term. Sometimes, however, a term is also considered equivocal just because, as used in research on ethnicity, it is likely to be ambiguous -- a note is then included in the record to indicate why such ambiguity may be likely.
When all the available terms for a concept are equivocal (and also if the unequivocal terms are very awkward and inconvenient to use) a "suggested term" (ST) is added. A special rule governs the use of such terms: they must not be quoted and used on the authority of the INTERCOCTA glossary. However, contrary to normal usage, anyone is free to use a suggested term as though it were his or her own idea. Anyone who wants to use a concept identified in a record but finds the suggested term unsuitable can also propose any other terms that he or she prefers. After doing so, however, the user is strongly urged to submit a copy of the text or a citation that will, subsequently, justify the inclusion of whatever term was used as an addition to the glossary record -- preferably as a new "unequivocal term" (UT).
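The distinction between unequivocal and equivocal terms can be sketched in code. The following is only a minimal illustration, assuming a hypothetical mapping from record identifiers to the terms listed for each concept; the function name `mark_terms`, the record identifiers, and the sample terms are invented for this sketch and are not taken from the glossary.

```python
from collections import defaultdict

def mark_terms(records):
    """Mark each term 'UT' or 'ET' within one subject field.

    `records` maps a record id to the list of terms for that concept
    (a hypothetical layout, assumed for this sketch).
    """
    concepts_by_term = defaultdict(set)
    for record_id, terms in records.items():
        for term in terms:
            concepts_by_term[term].add(record_id)
    # A term that designates two or more concepts in the field is equivocal.
    return {
        term: "ET" if len(ids) > 1 else "UT"
        for term, ids in concepts_by_term.items()
    }

# Invented sample data, for illustration only.
sample = {
    "r1": ["ethnic marker", "ethnic quality"],
    "r2": ["assimilation"],
    "r3": ["assimilation", "absorption"],
}
marks = mark_terms(sample)
```

Here `marks["assimilation"]` is `"ET"` because the term appears in two records, while `marks["ethnic marker"]` is `"UT"`. The sketch covers only the counting rule: a term judged ambiguous on editorial grounds, or a suggested term (ST), would still have to be marked by hand.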
5.2. Lexiconizations. By contrast with suggested terms, we may notice that sometimes specialists use words in a technical sense that corresponds with a sense the same word or expression already has in general language usage. An example is the word, 'ethnocentrism,' which is defined in Webster's dictionary as "regarding one's own race or cultural group as superior to others." Such terms may be called "lexiconizations" because their relevant meanings have been lexiconized, i.e., included in a general dictionary. To call the attention of users to this fact a copy of the dictionary definition should be included in the record as a citation, and a code letter for the dictionary used may follow the term, in parentheses: e.g., 'ethnocentrism' (W). Lexiconizations can normally be used with confidence but sometimes it happens that popular connotations of a lexiconized term give it additional meanings that are not intended or may even prove confusing in a technical context of use. It seems useful, therefore, to sensitize users of an INTERCOCTA glossary to the fact, when it is true, that one of the terms they use in technical discourse carries a similar if not identical meaning in general language contexts.
5.3 Special Coding. In every field of knowledge, there are a variety of points of view, paradigms, frameworks, ideologies, etc. which affect the use and interpretation of terms. The simplest examples involve geographic differences: what the English call a "lift" is what Americans call an "elevator" -- or what British anthropologists call "cross-cultural contact" may be what Americans mean by "acculturation." In the field of ethnic studies, Soviet scholars may use a term that translates as "ethnic appurtenance" to mean what American or English writers will call "ethnic marker." One cannot say that one such usage is better than another, but it is important to know who uses which term for what concept -- and, of course, what theoretical or paradigmatic frame of reference underlies each usage.
The INTERCOCTA glossary gives users a systematic way to untangle the confusions that such differences in usage might cause. Each glossary can be accompanied by a special coding system for its subject field. Such a system might involve the use of different letters -- e.g., A: American; B: British; R: Russian -- to identify different contexts of use. Accordingly, whenever this information seems to the editors to be useful and relevant to user needs, an appropriate code, in parentheses, is added to each term. Using this model we might illustrate by an expanded hypothetical text for [1] as follows:
[1a] *** : ethnic marker (B); ethnic quality (A); ethnic appurtenance (R)
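To make the coding convention concrete, the short sketch below reads a coded term list of the kind shown in the hypothetical entry for [1] and picks out the terms tagged for one context of use. It is an illustration under stated assumptions, not a description of the project's software: the function names `parse_coded_terms` and `terms_for` are invented, and the codes are assumed to be single capital letters.

```python
import re

def parse_coded_terms(entry):
    """Split 'term (CODE); term (CODE)' into (term, code) pairs."""
    pairs = []
    for part in entry.split(";"):
        # Assumed shape: a term followed by one capital letter in parentheses.
        match = re.fullmatch(r"\s*(.+?)\s*\(([A-Z])\)\s*", part)
        if match:
            pairs.append((match.group(1), match.group(2)))
    return pairs

def terms_for(pairs, code):
    """Return the terms tagged with the given context code."""
    return [term for term, tag in pairs if tag == code]

entry = "ethnic marker (B); ethnic quality (A); ethnic appurtenance (R)"
pairs = parse_coded_terms(entry)
russian_terms = terms_for(pairs, "R")
```

With the hypothetical entry above, `russian_terms` is `["ethnic appurtenance"]`. A writer addressing several audiences could call `terms_for` once per code and combine the results.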
Having such information available will not only enable one to interpret a text more precisely but also, if one is writing for a particular audience, one will know which terms communicate more clearly to that audience. If one has several concurrent audiences in mind, one can also use synonyms pleonastically, putting parentheses around synonyms that might be understood better or more precisely by different readers. We could, for example, write "ethnic marker (appurtenance)" if we wished to reach not only a Western but also a Soviet audience, or one could explain in more detail that what we refer to as "X" is the same as what someone else calls "Y". In future editions of this work, we expect to provide such codings for the terms defined in it -- however, in this pilot edition the only labels offered are those that distinguish between equivocal (ET), unequivocal (UT) and suggested (ST) terms.
6. Selecting Terms. The treatment of "term preferences" discussed above will help users choose the terms that enable them most conveniently and unambiguously to express their intentions. The editor of an INTERCOCTA glossary, however, also has to select terms to use in definitions. To illustrate, let us suppose that in one definition we use 'ethnic marker' for the concept defined in [1], but in another we use 'ethnic quality' and in a third, 'ethnic appurtenance.' We will then run up against a serious difficulty in the design of the alphabetical index. In such an index, the three different synonyms for a single concept would necessarily occur in three different entries, and this would frustrate our intention to bring together in one place all the reference numbers for the records in which any given concept is used (entailed).
To make sure that all index references to the same concept come together in one index entry, therefore, we must "select" one of the synonymous terms for each concept as the only one to use in defining other concepts -- i.e., as its "entailed term." This synonym is what we call a "selected term." It is very important to remember that a selected term is not a preferred term. In other words, although just one term for a concept is selected for use within the glossary in defining other concepts, users should feel no pressure to choose that term in preference to others when they use the designated concept in their own research and writings. Remember that each "selected term" is just the editor's choice of a synonym to be used as the "entailed term" when writing definitions.
Although writers should feel no obligation to use the selected term, they need to know which term to look for in the index if they want to find all the records in which it is entailed. (References to entailed concepts are not given in the index entries for non-selected terms.) To help them -- and to help the editor also -- a simple convention has been adopted, namely to list the "selected term" first in line in the record where it is defined, i.e., as a "lead term." Normally this will be an unequivocal term (UT) -- hence the selected term is the first term in the first line. However, if there is no unequivocal term for a concept, our rules call for the addition of a suggested term (ST), and for its use as the selected term. If more than one ST is offered, the first in line -- i.e., the "lead term" -- will be the selected term for that concept. However, this will be a rare situation. In no case, incidentally, can an equivocal term (ET) be used as a selected term: whereas the context of textual use can disambiguate an equivocal term, its use in a definition would be ambiguous.
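The rule for choosing a selected term can be expressed as a short function. This is a sketch only, assuming that a record's terms are held as an ordered list of (term, mark) pairs; the name `select_term` and the sample data, including the suggested term "cultural merging," are hypothetical.

```python
def select_term(terms):
    """Pick the selected term from an ordered list of (term, mark) pairs.

    Rule as described in the text: the first unequivocal term (UT) if
    there is one, otherwise the first suggested term (ST); an equivocal
    term (ET) is never selected.
    """
    for wanted in ("UT", "ST"):
        for term, mark in terms:
            if mark == wanted:
                return term
    raise ValueError("no UT or ST available; a suggested term must be added")

# Invented sample records, for illustration only.
with_ut = [("ethnic marker", "UT"), ("ethnic quality", "UT")]
without_ut = [("assimilation", "ET"), ("cultural merging", "ST")]

lead_a = select_term(with_ut)
lead_b = select_term(without_ut)
```

For these samples `lead_a` is `"ethnic marker"` and `lead_b` is `"cultural merging"`. A record holding only equivocal terms raises an error, which mirrors the rule that a suggested term must be supplied in that case.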
As noted above, when an equivocal term is defined in two or more records, separate index entries are used to refer readers to each record. However, a synonym is inserted in parentheses to help users discover which of the several possible meanings of an index term is intended. The "pleonastic" term used in this way (in parentheses) should also be the selected term for the concept concerned.
7. "Framing" the Glossary. The main text of an INTERCOCTA glossary consists of a taxonomy of records for individual concepts used in a subject field. Such a text requires more explanation than is normal for alphabetical glossaries and it is, therefore, necessary to "frame" the main text with a variety of elements that both precede and follow it. The volume starts with an "introduction" -- i.e., the present text -- which explains the rationale for, and ways to use, the glossary.
There is also a "preface" which provides organizational and historical information about the circumstances leading to the launching of the INTERCOCTA process, and those guiding us to the selection of "ethnicity research" as the focus of this pilot project. Two handbooks are also being prepared that will provide: (1) an introduction to the purposes and structure of INTERCOCTA glossaries for anyone interested in starting one for their own field; and (2) a manual with concrete guidelines and examples intended to enable the editors of future glossaries to compile them in a "standardized" way so that they can easily be distributed in series with the present volume and a consolidated index, thereby constituting our planned conceptual encyclopedia for the social sciences. The remaining elements in the frame of an INTERCOCTA glossary follow this introduction.
7.1 The Parameters. In order to decide what concepts to include in a glossary and what to exclude it is necessary to start with a working definition of the subject field to be covered. An essay by Eric Casino, "The Parameters of Ethnicity Research," reproduced in Part I of this volume, defines the scope of the field and identifies various ways of looking at and studying it. It identifies some of the approaches, paradigmatic and theoretical frameworks, and ideological presuppositions of specialists writing in this field. A list of organizations and journals concerned with ethnicity helps to identify the community of scholars from whose work the glossary has been derived, and to whom it is now addressed.
Although, in a logical or philosophical sense, the scope of an INTERCOCTA glossary has to be bounded by rational criteria, in a pragmatic or sociological way the existence of a viable discourse community is central to the launching and future development of our project. The point is that our onomasiological, computer-based methodology can succeed only to the degree that scholars actually doing research in the intended subject field -- e.g., "ethnicity research" -- become actively involved in the project. First of all, the concepts and terms reported in the glossary must be those that have evolved in their work. But even a perfected glossary -- and this one is far from "perfect" -- would serve no useful purpose if it were not actively consulted and quoted. Only as writers come to depend on the glossary for cues to help them select terms, in cases of doubt, so as to enhance the clarity of their communications, and as they also rely on the glossary to give readers a clear sense of what their terms mean so that they can avoid continuously re-defining key words, will the true value of this approach be realized.
We anticipate, of course, that the active use of the glossary by researchers -- and by the editors who publish their work -- will in turn induce readers to rely on it as a desk companion to help them interpret texts. As readers become more familiar with the glossary, a process of positive reinforcement (circular causation) should become increasingly vivid: writers, knowing that their readers have access to this tool, will become increasingly committed to its use. This commitment will not, assuredly, involve any slavish pressure to restrict their choice of vocabulary since, after all, they are not only offered a choice of terms for many individual concepts, but they are even encouraged to coin and report new terms if, for any reason, they find terms already in use are either cumbersome or ambiguous. Increasingly, moreover, information specialists, as they design or revise retrieval systems, will utilize the glossary when indexing documents and selecting descriptors. This extension of the INTERCOCTA process will, of course, enable researchers to find, with less effort, documents that contribute most efficiently and precisely to their investigations.
The essay on the parameters of the glossary's subject field, therefore, seeks not only to outline the logics of the field, but also to shed light on its organization and points of view. It helps the editor select documents, concepts, and terms from the research literature; it facilitates the development of a lively network of users and participants in the further revision and augmentation of the computerized data base; and it provides important background information for anyone seeking to draw maximum benefit from the use of the glossary.
7.2 The Taxonomy. An overriding norm for any INTERCOCTA glossary, as we have emphasized above, involves the capacity of users to figure out whether or not any given concept has already been included in its data base. To meet this need a classified scheme (taxonomy) is required that places concepts in a predictable location. In the complete glossary there is so much detail that it is difficult to grasp the logic of the scheme as a system of concepts. Consequently, we need a conspectus of the whole scheme.
The first part of the conspectus is an outline that presents the main logical categories of the project. These categories fall into two broad groups: first, the core concepts that belong, somehow, to the field as a whole; and, second, the concepts used in various social science disciplines to handle research concerning the ethnic features or problems that come up in that discipline.
Within the first group of categories, basic divisions are made by formal rather than substantive criteria: in other words, distinctions are drawn between activities, properties and entities. In a library classification scheme, where the items to be collocated are documents rather than concepts, it is no doubt more appropriate to use "subjects" (each of which includes activities, properties, and entities) as the primary categories of analysis. The form categories, however, seem better adapted to our present purposes because it is easier to determine, in advance, whether any given concept relates to an entity, or to one of the properties or activities that can be attributed to such entities.
Whenever an ethnic concept has a special relation to any one of the social science disciplines, however, preference is given to its placement within a category dedicated to that area of inquiry. Within each discipline the sub-categories are again arranged according to the sequence of activities (including processes), properties, and entities.
Introductory and concluding sections are added: the former provides general concepts pertaining to the whole field and to ways of looking at it, whereas the latter is intended for concepts concerning contexts (milieu) and methods relevant to the study of ethnicity.
The preliminary outline of main categories provides an overview of the classes mentioned above. It is followed by an expanded outline that gives sub-classes and also, most importantly, it lists the terms for each concept that are defined in the glossary. Since all these terms are entered alphabetically in the index, users can easily find not only the glossary record for any given term but also the place in the outline where it is mentioned. This enables them to see at a glance what synonyms are listed and also to find out how each concept nests with closely related concepts. Readers may well find that before looking up the detailed information on any concept it will be helpful to see how its terms are placed in the taxonomy.
7.3 The Notation. An alphabetical notation scheme is used to tag the classes of concepts. Twenty-six letters provide more elements than a decimal scheme and thereby make it possible to keep the class numbers as short as possible. Broad divisions (parts) of the scheme can be tagged by "ranges" consisting of several letters. The hierarchic relations of classes to each other, then, are expressed in the following way:
Parts: the main parts of the scheme are expressed by ranges, i.e., sets of letters (e.g., A/E, F/I, J/L, etc.)
Categories: individual letters are assigned to each category (e.g., A, B, C, etc.)
Classes: two-letter combinations are used. Typically, a vowel follows the first letter -- thereby space is left for future insertions, and the combination is often pronounceable, making it easier to remember the notation when referring from index to text (e.g., ME, MI, MO, etc.)
Sub-classes: three-letter combinations (e.g., NED, NEK, NEN, etc.)
These four levels of hierarchy handle all the material given so far in this glossary. However, a fifth level, using four-letter symbols, and so on, could easily be inserted as the data base grows.
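The four levels of the notation can be recognized mechanically from the shape of a class symbol. The sketch below is a minimal illustration, assuming capital letters only and a slash inside ranges, as in the examples given; the function names `notation_level` and `parent` are invented for the illustration.

```python
import re

# Letter count -> level name, following the four levels listed above.
LEVELS = {1: "category", 2: "class", 3: "sub-class"}

def notation_level(symbol):
    """Name the hierarchic level of a class symbol such as 'A/E' or 'NED'."""
    if re.fullmatch(r"[A-Z]/[A-Z]", symbol):
        return "part"
    if re.fullmatch(r"[A-Z]+", symbol):
        # Longer symbols correspond to levels not yet used in the glossary.
        return LEVELS.get(len(symbol), f"level with {len(symbol)} letters")
    raise ValueError(f"not a class symbol: {symbol!r}")

def parent(symbol):
    """Return the enclosing symbol ('NED' -> 'NE'), or None for a category."""
    if not re.fullmatch(r"[A-Z]+", symbol):
        raise ValueError(f"not a letters-only class symbol: {symbol!r}")
    return symbol[:-1] or None
```

For example, `notation_level("ME")` gives `"class"` and `parent("NED")` gives `"NE"`. The sketch assumes a strict hierarchy and does not say which part (range) a category belongs to, since that depends on the ranges chosen for a particular glossary.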
The hierarchic principle may be compromised for the sake of simplicity, however. For example, if a concept found in one class has many sub-concepts, it can be used as the label for a new class. Thus, "ethnic community" occurs in class N for types of ethnic collectivities. This term is then picked up for use in the label for class NE, "Types of Ethnic Community," which, in turn, is subdivided by several different criteria to generate parallel sets of sub-concepts.
The arrangement of data is facilitated by a sharp distinction between the terms for concepts, and the labels for classes. In the notation scheme, numbers are assigned to each concept within a class. Consequently, while the class symbols consist of letters only, every concept has an alphanumeric notation. Hierarchic ordering of concepts within a class is indicated in the outline by indentions. Moreover, the numbering scheme assigns single odd digits to first level concepts and even digits to the second level. Third level concepts normally have two-digit numbers, in an even-odd sequence. Additional numbers, as needed, are inserted on the principle that two odd digits can be treated like one odd digit, and two even digits like an even digit. All numbers are ordered decimally, as though preceded by a decimal point. More detailed rules for the assignment of notation symbols are given in the editor's guidelines.
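The decimal ordering of concept numbers can be illustrated briefly. The sketch below covers only the rule that numbers are ordered as though preceded by a decimal point; it does not attempt the odd/even level rules, which are specified in the editor's guidelines. The function name `decimal_key` and the sample numbers are invented for the illustration.

```python
from decimal import Decimal

def decimal_key(number):
    """Sort key that reads a concept number as digits after a decimal point."""
    if not (number.isascii() and number.isdigit()):
        raise ValueError(f"not a concept number: {number!r}")
    return Decimal("0." + number)

# Invented concept numbers within one class, for illustration only.
numbers = ["3", "12", "1", "2", "14", "5"]
ordered = sorted(numbers, key=decimal_key)
```

Here `ordered` is `["1", "12", "14", "2", "3", "5"]`: the two-digit numbers 12 and 14 fall between 1 and 2, which is what allows new numbers to be inserted between existing ones without renumbering.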
Users do not need to remember these rules but they are mentioned here because, if one does use them, they will facilitate recognition of the logical relations between concepts. The notation symbols, of course, are also needed to enable users to go from the alphabetical listing of terms in the index to the points in the outline as well as in the glossary where the meanings of concepts and their logical relations are specified.
7.4 The Bibliography. A bibliography of sources used both for the essay on parameters of ethnicity research by Eric Casino, and for the concept records contained in the glossary, is given in alphabetical order by author and date, immediately following the glossary. The listing is necessarily preliminary and subject to continuous amplification. Its inclusion in the glossary's data base makes it easy to insert new titles as they are published and/or reported.
Readers are invited to suggest additions. However, they should remember that the listing is not intended to be either comprehensive or fully representative. To compile such a bibliography would command resources not available to us. A more restricted goal has been sought: namely to include works that provide examples of as many relevant concepts and terms as possible. If a concept is used and its meaning clarified in one text our purposes are as well served as if we quoted from a different source, even one that is superior in other respects.
Code symbols given after some items in the bibliography are used in the glossary's records to identify the sources from which a quotation has been excerpted. Page numbers follow the code symbol to help users find the citation in its original source and, thereby, to discover the significance and relevance of its use in a theoretical and research context.
The documents that lack code symbols have not yet been searched for material to add to the glossary. As time and resources permit, they will be processed. However, the advice of users is invited to help the editor assign priorities for the further development of the data base.
Most references are to works easily found in research libraries. However, some unpublished papers and reports are included in the listing. These may not be readily available. Consequently it is intended that they should be reproduced in microfiche for distribution to users of the glossary.
7.5 The Indexes. Two indexes follow the bibliography. The first is a guide to citations. It enables users to find the records in which citations from any work listed in the bibliography have been used. It is arranged by code symbols and page numbers. By means of this index, users can track the work of any given author who has been quoted in the preparation of glossary records.
The alphabetical index of all the defined and entailed terms given in the records presented in the glossary comes at the very end of the book. The underlined references that follow each index term guide users to the "defining" record for the concept designated by this term. Since several synonyms may be identified for any defined concept, the index will contain a separate entry for each of the synonymous terms -- all of them lead users to the same record. When a term is "multivocal" -- designating two or more concepts -- it will be followed by unequivocal synonyms in parentheses, to guide users to the appropriate record for the intended concept.
The "entailed terms" are those which occur in a definition. They are marked and numbered to enable users to trace linkages between concepts. Every entailed term, of course, is also defined in its own defining record. Consequently, in an index entry one may find, following the underlined reference to the term's definition, a set of references to the other records in which the same term is entailed. As noted above, in #6, only one of the synonymous terms for a concept is "selected" for use in all the records where it appears as an entailed term.
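The structure of the alphabetical index can be sketched as follows. This is an illustration under stated assumptions rather than a description of the project's data base: each record is assumed to be a dictionary with a list of terms (selected term first) and a list of the selected terms entailed in its definition. The record identifiers "rec-1" and "rec-2," the term "ethnic boundary," and the function name `build_index` are all hypothetical.

```python
def build_index(records):
    """Build an alphabetical index from glossary records.

    Each record is assumed to hold 'terms' (selected term first) and
    'entails' (the selected terms used in its definition).
    """
    index = {}

    def entry(term):
        # Create the entry on first use so every term has both lists.
        return index.setdefault(term, {"defined_in": [], "entailed_in": []})

    for record_id, record in records.items():
        for term in record["terms"]:
            entry(term)["defined_in"].append(record_id)
        for term in record["entails"]:
            entry(term)["entailed_in"].append(record_id)
    return dict(sorted(index.items()))

# Invented records, for illustration only.
records = {
    "rec-1": {
        "terms": ["ethnic marker", "ethnic quality", "ethnic appurtenance"],
        "entails": [],
    },
    "rec-2": {"terms": ["ethnic boundary"], "entails": ["ethnic marker"]},
}
index = build_index(records)
```

In this sample, the entry for "ethnic marker" lists "rec-1" as its defining record and "rec-2" as a record in which it is entailed. The entries for "ethnic quality" and "ethnic appurtenance" lead to the same defining record but carry no entailed references, because only the selected term is used in definitions.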
The index gives terms in both direct and inverted order. It serves some of the purposes of a standard alphabetical glossary since users can easily find any particular term and, through the references, discover what sense (or senses) it may have in the glossary's subject field. However, these functions are intentionally subordinated to those emphasized above in #1-5. By looking up the records referred to in the index users will discover synonyms for the concepts they have in mind. They can also identify hitherto unreported concepts and terms within the vocabulary of their subject field and, we hope, they will report them to the editor for addition to the data base. The index, finally, will supply clues leading to the theoretical and paradigmatic contexts in which given concepts and terms have been used, by reference to their definitions, citations, and source documents.