Research Interests & Projects
Engin Erzin's research interests
include speech processing, multimodal signal processing, pattern
recognition and human-computer interfaces. Prof. Erzin is a member of the
Multimedia, Vision and Graphics Laboratory (MVGL), where he takes an active
part in many national and international research projects.
Speech processing, which covers the analysis, synthesis and recognition of speech signals, plays a key role in state-of-the-art digital speech communication and multimedia services. While Internet and wireless telephony are expected to remain among the most important applications for several years to come, the use of speech processing applications such as automatic speech recognition (ASR), text-to-speech synthesis (TTS), speaker identification/verification, and emotion and mood analysis from speech is expected to increase in multimedia-rich scenarios.
Multimodal signal processing
refers to combined processing of signals from multiple modalities such
as speech, still images, video, and other sources. It plays a key role
in the design of future human-computer interfaces and intelligent
systems, such as intelligent vehicles. The ultimate goal of
human-computer interface research is to develop a machine that is able
to identify humans, to analyze and understand them from biometric input
signals and to synthesize a human-like output in response, in a similar
way to human-to-human communication. The study of relations and
correlations between signals from different modalities plays an important
role in the effective use of multimodal information. Prof. Erzin's active
research in the area of multimodal signal processing includes
speech/speaker recognition, body motion analysis, speech-driven face
gesture analysis and synthesis, speaker animation, audio-driven body
animation and driver behavior modeling.
Research Topics
- Automatic Transcription of Dance, 2012 -- Hiring MSc & PhD Students!
- Speech Driven Upper Body Animation, 2011 -- Hiring MSc & PhD Students!
- Music Driven Choreography Synthesis, 2010
- Emotion Recognition from Speech, 2010
- Prosody Driven Head-Gesture Animation, 2006
Current & Recent Projects
- E. Erzin, Y. Yemez, A.M. Tekalp, "Speech Driven Upper Body Animation," funded by Turk Telekom, 2010-2012.
- E. Erzin, Y. Yemez, A.M. Tekalp, "COST ACTION 2102: Cross Modal Analysis of Verbal and Nonverbal Communication," funded by TUBITAK & COST, 2008-2010.
- E. Erzin, "Joint Processing of Throat-, Bone- and Acoustic-Microphone Recordings for Robust Speech Recognition," funded by TUBITAK, 2005-2008.
- E. Erzin, Y. Yemez, A.M. Tekalp, "NEDO Project: International Research Coordination of Driving Behavior Signal Processing based on Large Scale Real World Database," funded by Japanese Government, 2005-2007.
- E. Erzin, A. M. Tekalp, Y. Yemez, "DRIVE-SAFE: Signal Processing and Advance Information Technologies for Improving Driver/Driving Prudence and Accident Reduction," funded by DPT, 2005-2007.
- A. M. Tekalp, E. Erzin, Y. Yemez, "SIMILAR: The European taskforce creating human-machine interfaces SIMILAR to human-human communication," FP6 Network of Excellence, 2003-2007.
- E. Erzin, A. E. Çetin, K. Oflazer and H. Erdoğan, "COST 278: Spoken Language Interactions in Telecommunications," funded by TUBITAK, March 2002 - March 2006.
- A. M. Tekalp, E. Erzin and Y. Yemez, "Multi-Stage and Multi-Modal Signal Processing for Person Identification," funded by TUBITAK, March 2002 - March 2004.
