Thursday, June 4, 2009

S2S (Speech-to-Speech Translation)

Construction of robust systems for
speech-to-speech translation to facilitate cross-lingual oral
communication has been the dream of speech and natural language
researchers for decades. It is technically extremely difficult because
of the need to integrate a set of complex technologies – Automatic
Speech Recognition (ASR), Natural Language Understanding (NLU), Machine
Translation (MT), Natural Language Generation (NLG), and Text-to-Speech
Synthesis (TTS) – that are far from mature on an individual basis, much
less when cascaded together. Blindly integrating ASR, MT and TTS
components does not provide acceptable results because typical machine
translation technologies, primarily oriented towards well-formed
written text, are not adequate to process conversational speech, which is
rife with imperfect syntax and speech recognition errors. Initial work
in this area in the 1990s, for example, by researchers at CMU and
Japan’s ATR labs, resulted in systems severely limited to a small
vocabulary or otherwise constrained in the variety of expressions
supported. Currently, the only commercially available speech translation
technology is Phraselator, a simple unidirectional translation device
that is customized for military use. It searches a fixed set of
English sentences and plays back the corresponding voice recordings in
foreign languages; it cannot handle bidirectional speech.
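
To make the fixed-phrase approach concrete, here is a minimal Python sketch of that kind of lookup. It is not Phraselator's actual implementation: the phrase table, the audio file paths, the lookup function and the 0.6 similarity cutoff are all assumptions made up for illustration. It only shows the general idea of matching a recognized English utterance against a closed set of sentences and returning a prerecorded translation.

```python
import difflib

# Hypothetical phrase table: English sentence -> path of a prerecorded
# foreign-language recording. Both the sentences and the paths are invented.
PHRASES = {
    "where is the hospital": "audio/hospital.wav",
    "do you need water": "audio/water.wav",
    "show me your identification": "audio/identification.wav",
}


def lookup(recognized, cutoff=0.6):
    """Return (matched_phrase, recording), or None if nothing is close enough.

    `recognized` stands in for the text produced by a speech recognizer.
    The cutoff is an assumed similarity threshold, not a tuned value.
    """
    query = recognized.lower().strip()
    # Pick the single closest sentence from the fixed set, if any passes the cutoff.
    matches = difflib.get_close_matches(query, list(PHRASES), n=1, cutoff=cutoff)
    if not matches:
        return None
    return matches[0], PHRASES[matches[0]]


if __name__ == "__main__":
    # A slightly misrecognized utterance still maps to a stored sentence.
    print(lookup("where is hospital"))
    # An utterance outside the fixed set cannot be handled at all.
    print(lookup("zzzz"))
```

The sketch also shows why such a device is limited: anything outside the stored sentences returns nothing, and there is no path for translating the reply back into English.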

Resources:

IBM Lab: Speech-to-Speech Translation
http://domino.watson.ibm.com/comm/research.nsf/pages/r.uit.innovation.html

http://www.google.com/goog411/
http://googlesystem.blogspot.com/2008/10/machine-translation-and-speech.html

TC-STAR
http://www.tc-star.org/

CMU-LTI
http://www.lti.cs.cmu.edu/Research/cmt-projects.html

http://domino.watson.ibm.com/comm/research.nsf/pages/r.uit.innovation.html/$FILE/speech_to_speech.mpg

Books:

Incremental speech translation

http://books.google.com.sg/books?id=QEr6dTamixQC&printsec=frontcover&dq=speech+translation&ei=QbonSuPPNY2GkQTb3KjaCg#PPA1,M1

Verbmobil: Foundations of Speech-to-Speech Translation

 By Wolfgang Wahlster
http://books.google.com.sg/books?id=RiT0aAzeudkC&printsec=frontcover

Speech-to-speech translation

http://books.google.com.sg/books?id=T0diAAAAMAAJ&q=speech+translation&dq=speech+translation&ei=QbonSuPPNY2GkQTb3KjaCg&pgis=1

Machine Translation

 By Conrad Sabourin, Laurent Bourbeau
http://books.google.com.sg/books?id=IsqLGQAACAAJ&dq=speech+translation&ei=QbonSuPPNY2GkQTb3KjaCg

KI 2006

 By Christian Freksa, Michael Kohlhase, Kerstin Schill
One of the main lessons learned from all the research during the past three decades is that the problems of natural language understanding can only be cracked by the combined muscle of deep and shallow processing approaches. This means that corpus-based and probabilistic methods must be integrated with logic-based and linguistically inspired approaches to achieve true progress on this AI-complete problem.

