Comments Welcome
George W. Grace
University of Hawaii

Ethnolinguistic Notes

Series 4, Number 19

THE GENETIC HYPOTHESIS

9. DISCOVERING GENETIC RELATIONSHIPS: MORE ABOUT THE EVIDENCE

QUESTION: When we ended the last Note (Grace 2000), I was saying that we had apparently completed our discussion of how the quality of the evidence was to be evaluated, and was asking how we could proceed from there to the overall (quantitative and whatever else) evaluation.

ANSWER: Well, as you pointed out there, the "quality" that we were talking about referred only to the evidence for a single pair of lexifications in a single pair of languages considered in isolation. Other considerations of course do come into play when we go beyond either of these restrictions and take into account either more lexifications or more languages.

QUESTION: What I had most particularly in mind was an overall evaluation of the evidence from all relevant pairs of lexifications. What's the point about "more languages"?

ANSWER: Most proto languages have more than two daughters. Suppose that while considering a particular pair of comparata, we decide to look at additional languages. Now, suppose one or more of the other languages also has an item that's phonetically and semantically comparable to our original pair. My point is that that fact increases the probability of the original pair's having had a common history. Therefore, we can add the following fifth condition to the list (begun in the last Note [Grace 2000]) of conditions relevant to determining the probability that a particular pair of items derive from a common etymon:

Fifth, the more languages showing a match, the greater the probability that the pair in question has a common history.

QUESTION: Can you expand on that point?

ANSWER: I'll try. Suppose we're comparing languages by pairs and in our comparison of A with B are trying to determine the probability that a particular pair of lexifications derive from a common etymon. Now, we decide to look at a third language, C, and we find a particular C lexification that (according to our evaluation) has a greater-than-chance probability of having a common history with the A lexification in question. Likewise, in a B-C comparison we judge the same C lexification to have a similar probability of having a common history with the particular B lexification.

This evidence that each of the lexifications in question is traceable to the same etymon as this third-language lexification clearly, in itself, constitutes evidence that they share the etymon with each other.

QUESTION: Your point is that any etymon that's common to C and A and is also common to C and B is necessarily common to A and B as well.

ANSWER: Yes. It's also obvious that the strength of the case is further increased with each additional language that is found to have a match with the previous ones.
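As a rough illustration of the fifth condition (not a procedure proposed in these Notes), one might treat each additional language showing a match as a separate piece of evidence and update the odds that the original pair has a common history. The function name, the prior, the likelihood ratio of 3, and above all the assumption that the additional matches are independent of one another are hypothetical choices made only for this sketch.

```python
def posterior_probability(prior, likelihood_ratios):
    """Update a prior probability of common history with further evidence.

    Each likelihood ratio stands for
    P(match | common history) / P(match | chance).
    The pieces of evidence are ASSUMED independent; the values used
    below are invented for illustration only.
    """
    odds = prior / (1.0 - prior)
    for ratio in likelihood_ratios:
        odds *= ratio
    return odds / (1.0 + odds)


# Hypothetical numbers: a 10% prior for the A-B pair, and each further
# language with a comparable item assumed to be 3 times likelier under
# common history than under chance.
prior = 0.10
for extra_languages in range(4):
    p = posterior_probability(prior, [3.0] * extra_languages)
    print(f"{extra_languages} additional matching languages: {p:.3f}")
```

With these invented figures the probability rises with each additional language, which is all the sketch is meant to show; real matches across languages are unlikely to be fully independent.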

QUESTION: I agree that the additional languages make a strong case for some historical connection, but not necessarily that these words are cognate; they could have spread by borrowing.

ANSWER: Of course. But what we're discussing now is your first question, which concerned the quality and quantity of evidence required to prove that comparable items in different languages are due to something in the history of the languages (which could be either past unity or contact with diffusion). (And of course we've already attempted [in Grace 1998 and again in Grace 2000] to deal with the problem of distinguishing borrowing from retention.)

QUESTION: O.k. Let's go on to the next step, which should be the overall evaluation of the evidence from all relevant pairs of lexifications.

ANSWER: I'll try. Where do you want to start?

QUESTION: Well, generally I want to know what's still left to be done, but let me try to put it in terms of a more specific question. Let's suppose we've managed to evaluate the quality of the evidence provided by each pair of comparata--which is to say, to determine the probability that their similarity (and whatever else has made us regard them as comparable) isn't due to chance. What about the question of quantity? How do we go about calculating whether or not there are enough pairs of enough quality to persuade us that a historical explanation is required? Is it essentially a matter of summing up the probabilities of all the individual pairs?

ANSWER: I agree that that's what it essentially is, but I'd like to add two qualifications. The first is that it's important to remember that the probability initially computed for any particular pair of lexifications according to the criteria we've discussed will need to have been revised along the way as additional pairs were examined.

QUESTION: To take into account patterns such as regular sound correspondences that might have been discovered along the way?

ANSWER: Yes. We might say that the identification of regular sound correspondences allows us to adapt our criteria for "phonetic comparability" from the original entirely general ones to criteria specific to the languages being compared.

QUESTION: When we start talking about regular sound correspondences, we've clearly gotten pretty far away from our original focus on a "single pair of items in a single pair of languages", haven't we?

ANSWER: Well, we obviously have had to look at a lot more pairs of items in order to discover the regularities, but once they're known, they can be applied to the probability of any particular pair of lexifications having had a common history. In fact, regular sound correspondences would radically alter the calculus of probabilities in the case of an individual pair. In short, we're still talking about how the quality of the (individual items of) evidence is to be evaluated.

QUESTION: All right, that seems clear enough; now what's the second qualification?

ANSWER: It has to do with the fact, which we alluded to above, that most language families contain more than two languages. Therefore, in most cases where the question of relationship among languages arises, it concerns more than two languages. In other words, the transitivity of genetic relationship introduces a further qualification to the statement that the overall evaluation of the case for relatedness between two languages is essentially a matter of summing up the probabilities for all their pairs of comparable lexifications.

QUESTION: What do you mean?

ANSWER: Well, imagine for a moment that we've reached some decision as to how, for single pairs of languages, we can compute an index of the overall probability of the two languages' being related to each other.

QUESTION: I assume that you mean that this probability will have been calculated on the basis of the pairs of comparata we've been discussing, and that you intend "related" to include both common ancestry and a borrowing relationship.

ANSWER: Yes. To be specific, then, we're supposing that we've (1) decided how to assign an exact probability to the evidence provided by each pair of comparata, and we've (2) decided how to sum up the probabilities of all the individual pairs so as to calculate an overall probability for the relationship of the particular pair of languages.

And suppose further that we've decided on a threshold such that if the calculated probability exceeds that threshold, we'll declare the hypothesis (that the languages are related) to have passed the test, and if it falls below it, we'll declare it to have failed.

QUESTION: All right.

ANSWER: Now suppose the probability that we calculate for a relationship between two particular languages, A and B, falls below the threshold.

QUESTION: O.k.

ANSWER: But now suppose further that there's some third language, C, such that our calculated probabilities for an A-C relationship and likewise those for a B-C relationship are well above the threshold. The criteria that we've set would require us to say that there was an ancestral language that was shared by A and C, and likewise by B and C. But there's no way that C could have shared an ancestor with both A and B without A and B sharing it with each other. Thus, the results of the A-C and B-C comparisons must be interpreted as additional evidence in support of the A-B relationship (just as the A-B results might constitute evidence weighing against the A-C and B-C relationships).
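The transitivity argument above can be sketched as a grouping step: any two languages linked by a chain of above-threshold comparisons end up in the same group, even if their own direct comparison fell below the threshold. The function name, the scores, and the threshold of 0.80 are hypothetical, and the sketch assumes that a pairwise index of this kind has somehow already been computed.

```python
def related_groups(languages, pair_scores, threshold):
    """Group languages linked by chains of above-threshold comparisons.

    Relatedness is treated as transitive: if A-C and B-C both pass,
    A and B land in the same group even if A-B alone did not pass.
    """
    parent = {lang: lang for lang in languages}

    def find(lang):
        # Follow parent links to the representative of lang's group.
        while parent[lang] != lang:
            parent[lang] = parent[parent[lang]]  # path halving
            lang = parent[lang]
        return lang

    for (first, second), score in pair_scores.items():
        if score > threshold:
            parent[find(first)] = find(second)  # merge the two groups

    groups = {}
    for lang in languages:
        groups.setdefault(find(lang), []).append(lang)
    return sorted(sorted(members) for members in groups.values())


# Invented scores: A-B alone fails, but A-C and B-C both pass.
scores = {
    ("A", "B"): 0.40,
    ("A", "C"): 0.95,
    ("B", "C"): 0.92,
    ("A", "D"): 0.10,
}
print(related_groups(["A", "B", "C", "D"], scores, threshold=0.80))
# → [['A', 'B', 'C'], ['D']]
```

This sketch captures only one direction of the argument. It does not let a weak A-B result count against the A-C and B-C results, as the parenthetical remark above says it might.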

QUESTION: Of course you've talked about the evidence of "language C" before.

ANSWER: Yes, but a quite different kind of evidence. There, we were talking about evidence bearing on the sharing of a particular etymon; here, we're not concerned with whether or not any of the etyma shared by any pair of languages is shared by any other of the pairs.

QUESTION: There's still a possibility that the comparability is due to borrowing, isn't there?

ANSWER: Yes, that's true. Borrowing would need to be excluded as the explanation. However, the more languages are involved, the easier it is likely to be to distinguish a pattern of sharing that results from borrowing from one resulting from genetic relationship.

QUESTION: Why do you say that?

ANSWER: When there's a pattern of borrowing that involves a number of languages, it's most likely that there was a single major source and a fairly consistent pattern of diffusion routes, and that the motives for borrowing were very similar throughout. As a general rule, therefore, the more languages involved, the more likely that a relatively specific hypothesis as to the source, routes, and motives underlying the borrowing will emerge.

QUESTION: All right, your two qualifications have been made. Now can we get back to the question: How do we go about summing up the probabilities for all of the sets of comparata so as to reach an overall evaluation?

ANSWER: Well, any very exact computation of an overall probability for even a single pair of languages would present a very difficult problem--one to which I certainly can't claim to have a solution. However, it is probably better to wait until the next Note to enter into that discussion.

REFERENCES

Grace, George W. 1998. The genetic hypothesis: 5. Lexical borrowing. Ethnolinguistic Notes, Series 4, Number 14. Internet WWW page at http://www2.hawaii.edu/~grace/elniv14.html.

Grace, George W. 2000. The genetic hypothesis: 8. Discovering genetic relationships: the evidence. Ethnolinguistic Notes, Series 4, Number 18. Internet WWW page at http://www2.hawaii.edu/~grace/elniv18.html.



First put on the Web on 26 January 2001