The Speech-Communication Center at the University of Hawaii at Manoa
The history of literacy programs in Hawaii is vast and interesting, but one of the most interesting programs of all occurred at the University of Hawaii at Manoa a quarter of a century ago. From about the end of World War II until 1965, the University of Hawaii evaluated each entering student's "command of a generally intelligible and acceptable form of spoken English." Each new student appeared before a Review Board, a 3-person panel of faculty members, and spoke for one minute on any topic, and then answered questions for another minute. Then the faculty members rated the student on a scale from 1 to 7 on the extent of that command. Any who received a sum of ratings of 11 or lower were remanded to Speech 101 for training. At the end of that semester they again appeared before another panel and, if again their sum of ratings was 11 or lower, they received an "F" for the course and had to enroll the next semester in Speech 102 and the process was repeated. A series of four such courses, Speech 101, 102, 106 & 107, was taken by hundreds of students, and many had to leave the University without graduation after they failed Speech 107. A few in the state legislature had been forced thus to leave the University and had gone to the Mainland to earn their degrees, and so it was not unexpected that the 1965 legislature issued a mandate to the University to replace that system with a more effective one.
The University's response was to abolish forthwith the 10 to 15 sections of each of those four courses, and to seek to hire someone (Heinberg) to try something different that would result in getting that huge backlog of remanded students to appear before Review Boards and get exempted.
If he were to succeed in that venture, that would tend to imply failure by the dozens of faculty members involved in the current system of training. It is not surprising, therefore, that his first two semesters were mostly spent in justifying to various committees that what would be tried had some chance of success, while more students were remanded for training in those semesters. Eventually, Arts & Sciences Dean David Contois issued an order that the trial of Heinberg would be deferred until he had failed.
Establishment of the Speech-Communication Center
A Research & Development Speech-Communication Center (SCC) was created with three goals: (1) to determine the extent to which the ratings of the Review Boards were valid (across various occupations of raters) and reliable (extent of agreement among raters); (2) if they were both valid and reliable, to develop a paper-and-pencil type of test that would predict the Review Boards' ratings; (3) to evaluate the effectiveness of training when such tests were converted into training instruments (merely by providing the student with knowledge of correctness of his or her response to each test item before presenting the next test item).
To test the validity of ratings, the first minute of each of 49 interviews was recorded and played to four groups of raters: professors in Linguistics and English as a Second Language, professors of English, senior-level educational administrators, and corporation executives engaged in personnel employment and training. Correlations of each of these groups' ratings of the one-minute, audio-only samples with the ratings of their face-to-face complete interviews ranged from .63 to .74. That is the range of correlations that generally prevails between face-to-face and audio-only samples of one-minute duration.
A remanded but not-yet-trained student was allowed to request an appearance before a Review Board, and each semester a few dozen students elected to do so. The test-retest reliability of a sample of such students was .97 in terms of agreement of the sums of ratings on the two occasions.
The extent of agreement among the three raters on each of dozens of Review Boards was computed, and this measure of inter-rater reliability generally ranged from .80 to .96. A few unreliable raters were identified by this procedure, and were not asked to serve on subsequent Review Boards.
With the ratings of Review Boards being found to be both valid and reliable, the road to development of an effective training system could take either of two forms. One method would be to develop tests that would predict that criterion (sums of ratings) and then to convert those tests to training by providing immediate knowledge of correctness on each test item. A second method would be to develop a kind of training that led to success on the criterion, and then to convert those training instruments into tests by withholding knowledge of correctness on each test item, and those tests would thereby yield a reliable and cheaper criterion than panels of three professors.
The first effort at test development was monumental. A battery of 55 tests was designed that took two sessions of 100 minutes each to administer. Most tests in which the stimuli could be presented either in print or via a recording were devised in both versions to constitute two separate tests. Half of all subjects received the audio tests before the printed versions, and the other half took those tests in the reverse order. The battery was administered to 168 students who also appeared before Review Boards.
The extent to which students' scores on all tests could predict their sums of ratings (the squared multiple correlation coefficient) was only .355, meaning that only 35.5 percent of the variation in rating sums was predictable from the best-weighted combination of the best of those tests.
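For readers who want the statistic spelled out, the squared coefficient reported above is conventionally written R-squared. The definition below is the standard textbook one, offered as a reminder of what the 35.5 percent figure measures. The symbols are notation introduced here for illustration and are not taken from the original study: y_i is the sum of ratings for student i, \hat{y}_i is the value predicted from the best-weighted linear combination of test scores, and \bar{y} is the mean sum of ratings.

```latex
R^2 \;=\; 1 - \frac{\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}
                   {\sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2},
\qquad n = 168, \qquad R^2 = .355
```

Read this way, a value of .355 says that the prediction errors still carried about 64.5 percent of the original variation in rating sums.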
The magnitude of prediction was not a flame to set the world afire, but there was a burning ember in the ashes. All seven of the best predictors involved audio stimuli; not one involved stimuli that appeared in print. And the correlations between the two types of tests were generally quite low. This was the first breakthrough in testing which contributed to training. Clearly we were looking for kinds of oral literacy which remanded students could demonstrate, but could not demonstrate within the typical time-constraints of real communication.
The second goal, searching for real-time measures of communication proficiency, was more actively pursued when it became clear that test selection and refinement had yielded its full harvest, and the crop was lean.
Perhaps the moral of what followed is that oftentimes people don't know why they do what they do, and the cost of learning that moral was one semester wasted. Since many raters had offered comments on their rating forms, and since the most common comment was one related to pidgin characteristics, a group of 40 students was trained to remove all such markers from their speech. They were told to speak on any topic and they could be interrupted at any time with a "HM um," at which time they had to begin that same sentence again. Sometimes a sentence might be repeated as often as 20 times before the signal to continue, "um HM," was given. After such training, a recorded one-minute sample of the speech of each of the 40 was submitted to a group of 3 linguists to indicate pidgin markers in their speech. None of the 40 samples had any such indication. Yet, when those 40 persons appeared before Review Boards, all but 3 were again remanded for training! Clearly, their tendency to speak with pidgin characteristics was not why they had been remanded. Pidgin was an explanation for, rather than a cause of, low ratings!
After the end of that cul-de-sac was reached, the same system of "HM um" and "um HM" was then tried to modify the intelligibility (of every consonant and vowel) of each of a group of 97 students. This training resulted in 58 of them being exempted.
The next task was to convert that time-consuming training by each staff member into students' training of one another, with response time controlled by the training director rather than by each student. A sample of that training instrument is shown in Figures 1A and 1B. Students sit in chairs in an outer circle facing their partners in chairs in an inner circle. Students in the outer circle hold copies of the form shown in Figure 1A. Their partners in the inner circle hold copies of the form shown in Figure 1B.
When the signal to begin is given, each pair has two minutes to get through the total of 40 items. Those in the outer circle say the first word, "team," to their partners in the inner circle. Each partner in the inner circle sees on her or his form as the first word, "teen," preceded by the letter B. The responder knows that, if what is seen is the same as what is heard, the responder is to say the letter in front of that item. And if what is seen is not the same as what is heard, then the responder is to say the letter other than the letter in front of that item. In this case, hearing "team" and seeing "teen" means that the responder should say the letter other than the letter B. The responder should therefore say, "A," and the sayer, seeing "A" and hearing "A," knows that they have succeeded on that first item. The sayer indicates their success by saying the next word on the sayer's form, "sad."
Notice that an item such as the third item, "friend," which is the same on both forms, is preceded by the same letter on both forms, whereas an item such as the second item is preceded by different letters because the word is not the same on the two forms ("sad" and "shad").
When the sayer sees A and the responder says B, or the sayer sees B and the responder says A, an error is made. When an error is made, the sayer says, "top," and then says the first word in that group of four words again. An error therefore means to start at the first of those four words again.
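The rules just described for one pair can be sketched in a few lines of Python. This is only an illustrative model, not the Center's materials: the function names, the `hear` callback that stands in for what the responder perceives, and the letters and fourth word on the sample forms are all assumptions made for the example. Only the pairs "team"/"teen", "sad"/"shad" and "friend"/"friend" come from the description above.

```python
def other_letter(letter):
    """Return the letter that is not the one given (forms use only A and B)."""
    return "B" if letter == "A" else "A"


def responder_reply(heard, seen_word, seen_letter):
    """Responder's rule: same word -> say the printed letter, else the other one."""
    return seen_letter if heard == seen_word else other_letter(seen_letter)


def run_pair(sayer_form, responder_form, hear, group_size=4, max_turns=200):
    """Simulate one pair working through a form.

    Each form is a list of (letter, word) tuples in item order.
    `hear(word, turn)` is a hypothetical stand-in for the responder's
    perception; it returns the word the responder believes was said.
    Returns (completed, turns, errors).
    """
    i = turns = errors = 0
    while i < len(sayer_form) and turns < max_turns:
        expected_letter, word = sayer_form[i]
        seen_letter, seen_word = responder_form[i]
        reply = responder_reply(hear(word, turns), seen_word, seen_letter)
        turns += 1
        if reply == expected_letter:
            i += 1                    # success: sayer goes on to the next word
        else:
            errors += 1
            i -= i % group_size       # error: back to the first word of the group
    return i == len(sayer_form), turns, errors


# Sample forms; letters and the word "ship" are hypothetical.
sayer_form = [("A", "team"), ("B", "sad"), ("A", "friend"), ("B", "ship")]
responder_form = [("B", "teen"), ("A", "shad"), ("A", "friend"), ("B", "ship")]

perfect = run_pair(sayer_form, responder_form, lambda word, turn: word)

# Responder mishears "sad" as "shad" once, on the second turn only.
one_slip = run_pair(
    sayer_form,
    responder_form,
    lambda word, turn: "shad" if (word == "sad" and turn == 1) else word,
)
```

With perfect hearing the pair finishes the four items in four turns. A single mishearing on the second item sends the pair back to the first word of the group, so the same four items take six turns. That time penalty is what makes the two-minute limit demanding.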
At the end of two minutes, if every pair has not completed all of the 40 items, those in the inner circle move one chair to their right to form new partners and the process is repeated. If every pair has completed all of the 40 items, they rotate the same way, but they go on to a new page of 40 items.
It is important to note that the training forms can be easily converted into test forms, simply by deleting the A's and B's from the sayer's forms and by placing empty parentheses in front of the A's and B's on both forms. Then each sayer says A or B for each test item, and both mark the letter that the responder says in the parentheses. A training instrument thus becomes a test instrument. And that test instrument can eventually be used instead of a Review Board. With this training, which took a maximum of four hours for most remanded students, the exemption rate was a rather steady sixty percent. Clearly, forty percent of remanded students needed some kind of training other than to make them more intelligible in rapidly spoken English.
Observations of Review Boards from behind a one-way mirror seemed to indicate that those still remanded after intelligibility training were doing well in the one-minute speech but not doing well in the one-minute of answering Board members' questions. Hence, a mock Review Board was created of Center staff members, and each of dozens of students who had completed the intelligibility training but were again remanded then appeared before mock Review Boards. Again the procedure was employed of asking a question, and saying "HM um" to mean start that answer again. Upon appearing before a subsequent Review Board, all but 3 of those thus trained were exempted.
Again the search began for a training instrument that would convert that time-consuming training of each student by 3 staff members into students' training of one another to respond adequately and rapidly to questions.
The first attempt was made by cutting variously shaped turned pieces of wood into complex, different shapes, and by creating sets of matched pairs of 13 blocks. Students who had completed intelligibility training but were again remanded were then seated in pairs back-to-back. Their task was for one of them to select any three blocks and arrange them such that each block touched at least one other, and then to talk over their shoulder to their partner to enable that partner to select and arrange blocks identically. Each trainee had to earn a perfect score as both sayer and responder with each of 3 partners. This was performance in the didactic mode (I talk; you listen and arrange). Then each pair had to reach that same criterion level of performance in the interrogatory mode. That requires the one who is trying to select and arrange blocks to ask questions of her or his partner that can only be answered "yes," "no" or "I don't know." When those persons thus trained (for about 5 hours on average) appeared before Review Boards, all but about 3 percent were exempted.
Keeping track of sets of blocks became an almost insurmountable inventory problem. Hence, the search began for a two-dimensional version of that 3-dimensional task. The result is shown in Figures 2A and 2B.
In any column of figures, there are four figures and one blank. Three of the 4 figures on one form are identical with 3 of the 4 figures on the partner's form. Hence, on each form there is one figure that is not on the partner's form. The function of the blank rectangle is that, if a figure that is referred to is not on the partner's form, the correct response is the number of the row in which there is the blank rectangle.
In the interrogatory mode, the partner holding the form shown in Figure 2A for Item Number 1 asks questions of her or his partner to discover which of the four figures in Column A, if any, is the figure identified on Figure 2B for Item Number 1 as "My A-5." When the partner holding the A Form has asked enough questions to determine that his or her partner is looking at figure (A-5) that is identical to his or her A-3, he or she says, "3" or "A-3." The partner holding the B form knows that they have successfully completed that item because B is looking at the statement for Item Number 1: "My A-5 is my partner's A-(3)." Four minutes are allowed for each pair of students to complete the 18 items, 9 as interrogator and 9 as responder.
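The scoring rule for one item of this task can be sketched as follows. Because Figures 2A and 2B are not reproduced here, the figure names and their row positions are hypothetical placeholders, and `correct_row` is a name invented for this example. The only detail taken from the text is that the partner's A-5 matches the interrogator's A-3, and that an unmatched figure is answered with the row of the blank rectangle.

```python
def correct_row(own_column, partner_figure):
    """Return the row number the interrogator should say.

    `own_column` lists the interrogator's five rows in order (row 1 first),
    with None marking the blank rectangle. If the partner's figure appears
    in the column, the answer is its row; otherwise it is the blank's row.
    """
    if partner_figure in own_column:
        return own_column.index(partner_figure) + 1
    return own_column.index(None) + 1


# Column A on each form: four figures and one blank, three figures shared.
form_a_column = ["circle", None, "star", "arrow", "cross"]   # hypothetical
form_b_column = ["arrow", "circle", "wedge", None, "star"]   # hypothetical

# The partner's A-5 ("star") is the interrogator's A-3.
matched = correct_row(form_a_column, form_b_column[4])

# The partner's A-3 ("wedge") is not on Form A, so the blank's row is correct.
unmatched = correct_row(form_a_column, form_b_column[2])
```

Here `matched` is 3 and `unmatched` is 2. The blank rectangle gives every item a definite correct answer, so an interrogator cannot succeed merely by assuming that a match exists.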
The finding was that intelligibility training exempted approximately 60 percent of remanded students, and interrogation training exempted another 30 to 32 percent. This left a backlog of 8 to 10 of every 100 remanded students who could not be exempted by further training to a higher criterion level in intelligibility and interrogation. Observations from behind the one-way mirror of that 8 to 10 percent in their appearances before Review Boards led to the suspicion that the affects they tended to produce in Board members were not appropriate for that occasion.
To improve their ability to communicate affects in three dimensions (status, amity, and mood), a dyadic training instrument was created (see Figures 3A and 3B).
Students sit in two concentric circles as in the other two kinds of training. Students in the outer circle have one carrier sentence to say on each of their Say Items: "I want to talk with you," and students in the inner circle have a different carrier sentence: "I need to see you soon."
When the A Forms are held by those in the outer circle, it can be seen that, for Item 1, each student in the outer circle says to her/his partner, "I want to talk with you," trying to imply that she/he is nonchalant (defined to the group as unconcerned about the mood that is conveyed) and is the superordinate. For the dyad to be correct on that item, each partner in the inner circle should respond, "two three," the 2 to indicate superordinate and the 3 to indicate nonchalant. The sayer does not know which affect the 2 refers to, so that a response of 33 when the sayer sees 23 does not notify the sayer of which affect to modify.
The criterion for any group of 3 to 6 dyads is a sequence of four successes in a row, where a success is every pair completing 16 items in 3 minutes with each of 4 different partners.
The few students who had completed the intelligibility training and the interrogatory training and who were still remanded for additional training then completed the affects training. When they then appeared before Review Boards, all of them managed to be exempted.
An Intelligibility Test was then created by deleting the "A's" and "B's" from each of the sayer's items, and an Interrogation Test was created by deleting the number in parentheses from each of the responder's items, and an Affects Test was created by deleting the numbers in parentheses from each of the sayer's items. Then a group of 40 unremanded students who were scheduled to appear before Review Boards were tested with each of 8 different partners. The score earned by each dyad was assigned to each partner, and a student's score was the sum of his score with each of 8 partners. Then those 40 appeared before Review Boards. The ability to predict their ratings from a knowledge of their Intelligibility, Interrogation and Affect scores (i.e., the squared multiple correlation coefficient) was .86. In other words, 86 percent of the variation in their sums of ratings could be predicted from a knowledge of the variation in their three test scores. With that finding, the Review Boards--with no better reliability than that--could be and were dispensed with.
The SCC existed for one more year and exempted more than a thousand remanded students in that last year. Why was the SCC disbanded? It had to be a confluence of many factors. Professors with Ph.D.s felt it was demeaning of their expertise to monitor rather than teach, i.e., to supervise the interactions of students facing one another in pairs. And the R & D venture was complete; it was now merely a training site. And foreign students were sneaking over and paying the non-student fee of five dollars and raising their TOEFL scores enough to exempt them from ESL non-credit courses. And the legislature lost interest once the Speech 101-107 courses were abolished. And the director (Heinberg) shifted his interest to dyadic second language learning and the resulting methodology was patented in 1981. And improved education in Hawaii meant fewer students were remanded for training. But one cause of its demise could not be a lack of interest because, in the four years of its existence, the SCC hosted more than three hundred researchers who spent from one day to one month observing what went on.
Paul Heinberg is Full Professor, and Graduate Chair of Communications, Department of Communications, 2560 Campus Road, University of Hawaii at Manoa, Honolulu, Hawaii 96822.