Distributed February 17, 2003
For Immediate Release

News Service Contact: Kristen Cole

Language processing

Eye movements indicate initial attempts to process what humans hear

Even before a speaker completes a sentence, a listener attempts to interpret what he or she is hearing by searching out visual cues, according to new research at Brown University. Julie Sedivy, assistant professor of cognitive and linguistic sciences, discussed her findings Feb. 17, during the annual American Association for the Advancement of Science (AAAS) meeting in Denver.

DENVER — By mapping eye movements in fractions of a second, a Brown researcher has found humans attempt to make sense of what they are hearing through visual cues long before they have heard an entire idea. The finding offers insight into how the mind uses vision to rapidly process information.

Julie Sedivy, assistant professor of cognitive and linguistic sciences, will present her research during the annual meeting of the American Association for the Advancement of Science (AAAS) in Denver. Sedivy will participate in a panel discussion, “The Eyes Have It: Eye Movements and the Spoken Language,” Feb. 17, 2003, at 8:30 a.m.

Sedivy is interested in the process by which humans assign meaning to words and phrases. Psycholinguists know that as humans process language they make many split-second decisions about the words they are hearing. But questions remain about how humans cope with uncertainty at every stage of that moment-by-moment decision process.

In a series of studies involving approximately 150 people, participants sat either in front of a computer screen that displayed an image of objects or in front of a work surface set with objects, and received verbal instructions concerning the objects. Researchers used a headband-mounted camera to map the participants’ eye movements every thirtieth of a second.

Given a scene of a table set with a drinking glass and a pitcher, the participants heard instructions such as “pick up the tall glass.” Researchers found that participants frequently looked first at the pitcher, indicating that they began to interpret “tall” before hearing the entire noun “glass.”

“On the basis of one or two sounds, we saw the participants’ eye movements begin to shift,” said Sedivy. “As soon as they identified a word, they began to map it.”

However, when a short glass was added to the scene so that there were three objects – a pitcher, tall glass, and short glass – participants were more likely to look at the taller of the drinking glasses when they heard “tall” because size was the distinguishing factor between the two glasses.

The finding suggests that humans consult a whole domain of information, including visual cues and expectations about rational communicative behavior, in resolving the uncertainty involved in processing a sentence, according to Sedivy.

Conversational partners appear to share a set of mutual expectations, including the expectation that redundant information is typically avoided. In the example with the pitcher and two drinking glasses, “tall” would be redundant in referring to the pitcher, because there is only one pitcher, while there are two glasses, Sedivy said.

If that type of complex and subtle information were not available, the immediate moment-by-moment mapping of sounds to meaning would only serve to introduce a great deal of uncertainty to language processing, according to Sedivy. For example, if mapping to an object begins upon hearing “tall” rather than waiting until the following word “glass,” given a scene in which there are two tall objects, the chance of an initial mapping guess being correct is only fifty percent.

The automatic eye movements are not under conscious control, and they are so subtle that study participants do not notice them. Participants may feel simply that their eyes are taking in the whole scene at once when, in fact, the eyes are darting rapidly from one very specific location to another.

“This is a surprising relationship between highly intelligent processes of language understanding and low-level automatic processes such as eye movements,” Sedivy said. “As humans, we have to deal with multiple levels of information simultaneously, and those different levels of information must be incorporated into the study of linguistics.”

Sedivy conducted the research with Daniel Grodner, a postdoctoral fellow in cognitive and linguistic sciences; Anjula Joshi, research technician; and current and former undergraduate assistants including Charles Joseph, Estelle Reyes, Gitana Chunyo and Rachel Sussman. The work was funded by grants from the National Science Foundation and the National Institutes of Health.