George Street Journal October 19, 2001



Charniak and others push into new areas of speech recognition

Five cutting-edge computer science projects at Brown were among the winners of the highly competitive National Science Foundation awards for information technology research. These extraordinary projects — which, if fully realized, will allow medical students to participate in a surgery that may have taken place months ago, or quadriplegics to move a robotic arm simply by thinking about it — won the scientists nearly $5 million of the $156 million distributed. See additional articles:

  • Trio collaborates on modeling brain cell behavior
  • Van Dam project hopes to marry 3-D graphics with interactive electronic books that one day may train surgeons using virtual reality
  • Upfal project explores dynamic behavior of networks
  • Van Hentenryck in the hunt for algorithm that takes uncertainty into account

by Cynthia Ferguson

Speech recognition systems are certainly better than they were just five years ago, but their flaws continue to plague those who use and develop them. With a National Science Foundation award of $449,442, Computer Science Professor Eugene Charniak hopes to attack some of the problems inherent in current systems.

Speech recognition systems rely on two functions: sound recognition and grammatical context. Sound without context is very difficult for the computer (or the brain, for that matter) to translate. The word "variation," for example, is easily mistaken for "very Asian" if it is not heard in the context of a sentence.

"If you taped a sentence and clipped just one word out of it, most people would have trouble recognizing that word in isolation," notes Charniak.

When you add context to that word, however, it is far more recognizable. If we hear what sounds like "pig dog," we know almost instinctively that what was really said was probably "big dog," because the probability of the word "pig" preceding the word "dog" is very small.
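The "pig dog" example can be sketched in a few lines of Python. This is only an illustration: the acoustic scores and word-pair probabilities below are invented numbers, and the names acoustic_score, bigram_prob and combined_log_score are hypothetical, not part of any real recognizer.

```python
import math

# Hypothetical scores, invented for illustration only.
# The two phrases sound alike, so the acoustic scores are tied.
acoustic_score = {"pig dog": 0.5, "big dog": 0.5}

# Assumed probabilities that the second word follows the first.
bigram_prob = {("pig", "dog"): 0.0001, ("big", "dog"): 0.01}

def combined_log_score(phrase):
    """Add the log acoustic score to the log word-pair probability."""
    first, second = phrase.split()
    return math.log(acoustic_score[phrase]) + math.log(bigram_prob[(first, second)])

# With the sound evidence tied, the word-pair probability decides.
best = max(acoustic_score, key=combined_log_score)
print(best)
```

Because the sound evidence is the same for both candidates, the choice rests entirely on how likely "dog" is to follow each first word.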

It is this facet of speech recognition — the relationship of one word with another — that most interests Charniak, but he is moving it a step beyond current work in the field. For some time now, computer scientists have used "trigrams" — three-word phrases — and fed computers information about the probability of one word following the other two. The problem with this approach, Charniak maintains, is that the trigram model doesn’t know any grammar.

Without grammar, he notes, the sentence "Put the paper in the folder" could be mistaken for "Put the paper and the folder." The words "in" and "and" sound very much alike in everyday speech and the trigram model would be of little help determining the correct word.
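A minimal sketch of the trigram idea, and of why it struggles with "in" versus "and," might look like the following. The three-sentence corpus is invented for illustration, the function name trigram_prob is hypothetical, and real systems use far larger corpora and smoothing rather than raw counts.

```python
from collections import Counter

# Toy corpus, invented for illustration only.
corpus = [
    "put the paper in the folder",
    "put the book in the box",
    "the paper and the folder are here",
]

trigram_counts = Counter()   # counts of (w1, w2, w3)
context_counts = Counter()   # counts of (w1, w2) when followed by some word

for sentence in corpus:
    words = sentence.split()
    for w1, w2, w3 in zip(words, words[1:], words[2:]):
        trigram_counts[(w1, w2, w3)] += 1
        context_counts[(w1, w2)] += 1

def trigram_prob(w1, w2, w3):
    """Relative-frequency estimate of P(w3 | w1, w2); 0.0 for unseen contexts."""
    context = context_counts[(w1, w2)]
    if context == 0:
        return 0.0
    return trigram_counts[(w1, w2, w3)] / context

# After "the paper", this toy model rates "in" and "and" equally likely.
print(trigram_prob("the", "paper", "in"))
print(trigram_prob("the", "paper", "and"))
```

In this toy corpus both continuations occur once after "the paper," so the model cannot choose between them. It sees only the two preceding words, not the structure of the sentence.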

For the past few years, Charniak has been developing a program that understands the grammar — or syntax — of a sentence. He has done this with statistical parsing, assigning a high probability to structures that are likely to occur and a low probability to those that are rare or grammatically incorrect.

To do this, Charniak relies on work done at the University of Pennsylvania some years ago. A group of graduate students in linguistics were hired to parse, or diagram, some 40,000 sentences taken from the Wall Street Journal. In what was a monumental task, the linguists indicated such structures as "prepositional phrases," "noun phrases" and "verb phrases." When this information was fed to the computer, the computer actually "learned" a grammar, figuring out the rules from their statistical occurrence.
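The idea of learning rule probabilities from hand-parsed sentences can be sketched as follows. This is a plain probabilistic grammar estimated by counting, shown only as an illustration; it is not Charniak's parser, which the article does not describe in detail. The two trees are hand-made toy examples, not University of Pennsylvania data, and all function names are hypothetical.

```python
from collections import Counter

# Two hand-made parse trees as nested tuples: (label, child, child, ...).
# Leaves are plain strings. Toy examples only.
trees = [
    ("S", ("NP", "investors"), ("VP", ("V", "sold"), ("NP", "shares"))),
    ("S", ("NP", "prices"), ("VP", ("V", "fell"))),
]

rule_counts = Counter()  # how often each rule (label -> children) was used
lhs_counts = Counter()   # how often each label was expanded at all

def right_hand_side(children):
    """Child labels for subtrees, the word itself for leaves."""
    return tuple(c if isinstance(c, str) else c[0] for c in children)

def count_rules(tree):
    """Record one rule per internal node, then recurse into subtrees."""
    label, *children = tree
    rule_counts[(label, right_hand_side(children))] += 1
    lhs_counts[label] += 1
    for child in children:
        if not isinstance(child, str):
            count_rules(child)

for t in trees:
    count_rules(t)

def rule_prob(label, rhs):
    """Relative frequency of a rule among all expansions of its label."""
    if lhs_counts[label] == 0:
        return 0.0
    return rule_counts[(label, tuple(rhs))] / lhs_counts[label]

def tree_prob(tree):
    """Probability of a whole tree: the product of its rule probabilities."""
    label, *children = tree
    p = rule_prob(label, right_hand_side(children))
    for child in children:
        if not isinstance(child, str):
            p *= tree_prob(child)
    return p

print(rule_prob("VP", ("V", "NP")))
print(tree_prob(trees[1]))
```

Structures that appear often in the parsed sentences receive high probabilities, and structures that never appear receive zero, which is the sense in which a grammar is "learned" from statistical occurrence.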

In his current project, Charniak intends to refine his parsing program further. Troubling him, for example, is the fact that the University of Pennsylvania grammar does not indicate, and his program therefore can’t recognize, that when we say "New York Stock Exchange" we are referring not to a location but to an organization. And when it encounters a phrase with three nouns, such as "Monday Night Football," it doesn’t know whether we mean Monday-Night Football or Monday Night-Football. Charniak hopes to use automatic techniques that will let the computer figure out structures like these for itself.

Charniak will work with colleague Mark Johnson, professor of cognitive and linguistic sciences, on the research involved in the NSF project. With a fine-grained syntactic analysis, Charniak believes, he can vastly improve current language models, ultimately making speech recognition systems far more accessible and useful.


Photo by Glenn Turner