February 19, 2001
Q&A:
"(Speech segmentation) is especially useful for the seeing impaired and for reading e-mail over the phone."
Eileen Fitzpatrick
Professor, Linguistics
Eileen Fitzpatrick can sum up three decades as a computational linguistics specialist
with one sentence: "After he ate my cat Freddie took a nap." Whether this cat
eats lunch or becomes lunch depends on the reader.
"Pause in speech is important, but it is not a well-understood phenomenon," said Fitzpatrick, who works on prosody (the metrical structure of verse) in text-to-speech information retrieval. "Text-to-speech systems avoid very clear pauses because the computers don't know where to put them. Part of that is what gives text-to-speech an artificial sound."
In the classroom Fitzpatrick draws from the experience she gained from a 25-year career at AT&T Bell Labs, where she set up three projects that use language data as input to the decision-making process of the business.
Her ties to the corporate world also helped the Linguistics Department garner a $16,000 grant from Lucent Technologies Bell Laboratories. Fitzpatrick recently discussed text-to-speech information retrieval and how this funding allows MSU graduate students to apply their linguistics training at Lucent.
INSIGHT: What is text-to-speech information retrieval?
Fitzpatrick: The processing of language by computer in its broadest sense. It
covers speech synthesis, speech recognition and text understanding, and parts of
information retrieval, text summarization and grammar checking.
INSIGHT: Tell us about the Lucent grant.
Fitzpatrick: The project, "Lucent Technologies Speech Segmentation," employs
our graduate students trained in phonetics to segment speech data for Lucent's
text-to-speech implementations, and to correct pre-assigned phonetic labels.
The students will visit Bell Labs to receive demonstrations of how their work
is being used, and will attend colloquia and presentations.
INSIGHT: How did Lucent select Montclair State for this grant?
Fitzpatrick: Three years ago, at President [Susan A.] Cole's suggestion, our
department initiated an industry advisory board. The board includes people trained
in linguistics from IBM, AT&T, Lucent, Random House, Educational Testing Service,
the New Jersey Department of Education, Centers for Applied Linguistics, the
Army Research Lab in Washington and various speech companies. A Lucent board
member told me the company needed a lot of segmented speech to build a more
accurate text-to-speech system. Because the signal processing system we use
for speech analysis is also at Lucent, we could easily segment speech for them.
INSIGHT: What is the value of speech segmentation?
Fitzpatrick: It's especially useful for the seeing impaired and for reading
e-mail over the phone. It can also be used for sending messages to pilots or
divers, or any situation where your eyes are not available to read. Quality has
improved tremendously since I began working in the field because of the enormous
amounts of data being collated. However, we're still not to the point where
we want to be in presenting this technology. When reading free text, a sentence
can be ambiguous and convey the wrong information. However, the needs of the
visually impaired are so high they will take what current technology can offer,
even with its flaws.
INSIGHT: Tell us about your research.
Fitzpatrick: I'm working on two projects. One is at Lucent. In text-to-speech
we use a single speaker as a model. But that tells us nothing about the average
characteristics of speech. I'm looking at the average duration of sounds over
several speakers. For instance, a Spanish speaker learning English will pronounce
"organize" as "organise," because the Spanish speaker has not adjusted the duration
of the vowel before the "z." We won't know how to help Spanish-speaking people
with that accent difference until we can model English for them. I'm also working
with Steve Seegmiller [of Linguistics] collecting written English data from
non-native speakers. Our objective is to build a corpus of non-native written
English for research in second language acquisition and for computer
applications such as grammar checking.