Representative Publications
A. Recent Work
1. |
Content Retrieval - Beyond
Cascading Speech Recognition with Text Retrieval”, to
appear in IEEE/ACM Transactions on Audio, Speech and Language
Processing, Vol. 23, No. 9, Sept. 2015, pp. 1389-1420.
(This is an overview paper in the journal, which began publishing overview papers in 2010. Such a paper must address an important problem in the area, must have co-authors from different groups to give a balanced view, and may be much longer than a regular paper; at most 4 such papers are published each year. This paper gives an overview of the new concepts and directions of speech information retrieval, and is co-authored by a scientist at MIT.)
2. |
Document Understanding and Organization”, IEEE
Signal Processing Magazine, Special Issue on Speech Technology in
Communication, Vol. 22, No. 5, Sept. 2005, pp. 42-60.
(This is one of 9 papers selected for the special issue on speech technology; usually there is only one such special issue every few years. This is a review paper on the new framework for easier browsing of retrieved speech information, including a prototype system developed at National Taiwan University with functionalities first seen in the world at the time of publication.) |
3. |
Information Retrieval - how far are we from the text-based information
retrieval?” (invited paper), IEEE Automatic Speech Recognition and
Understanding Workshop, Merano, Italy, Dec 2009.
(invited paper accompanying the invited speech at the workshop, giving an overview of the technology of voice-based information retrieval, comparing it with the currently very successful text-based counterpart, analyzing the differences, and presenting possible future directions)
4. |
Summarization of Spoken Document Archives by Information Extraction and
Semantic Structuring”, Interspeech Conference, Sept. 2006, Pittsburgh,
International Speech Communication Association (ISCA).
(full paper presenting the technologies and performance evaluation of “spoken document understanding and organization”; presented in the Speech Summarization session of the Interspeech Conference as 1 of 6 papers accepted globally; the presentation included a demonstration with functionalities not seen elsewhere before)
5. | “Voice
Access of Global Information for Broadband Wireless: Technologies of
Today and
Challenges of Tomorrow” (invited paper), Proceedings of the IEEE, Jan.
2001, pp. 41-57. (invited paper in a Special Issue on Broadband Wireless Communications) |
6. | “Spoken Knowledge Organization by Semantic Structuring and
Prototype Course Lecture System for Personalized Learning”,
IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 22, No. 5, May
2014, pp. 883-898. (proposing new technologies for semantically structuring and organizing recorded courses using speech technologies, so that learners can conveniently learn selected knowledge, with a course lecture system developed at National Taiwan University as an example) |
7. | “Improved Semantic Retrieval of
Spoken Content by Document/Query Expansion with Random Walk over
Similarity Graphs”, IEEE/ACM
Transactions on Audio, Speech, and Language Processing, Vol. 22, No. 1,
2014, pp. 80-94. (proposing new approaches for semantic retrieval of spoken content by random walk) |
8. |
Analysis and Organization of Spoken Documents Based on Parameters
Derived from
Latent Topics”, IEEE Transactions on Audio, Speech and Language Processing,
Vol. 19, No. 7, Sept 2011, pp. 1875-1889. (proposing two parameters useful in different tasks of semantic analysis of speech, including key term extraction, title and summary generation, and topic classification; results not limited to Chinese although tested with data in Chinese) |
9. |
Recognition and Retrieval with Relevance Feedback for Spoken Term Detection”,
IEEE Transactions on Audio, Speech and Language Processing, Vol. 20,
No. 7, Sep
2012, pp. 2095-2110. (proposing a completely new concept for speech information retrieval: integrating recognition and retrieval and optimizing them together as a whole rather than directly cascading them; not limited to Chinese although tested with data in Chinese) |
10. | “Model-based
Unsupervised Spoken Term Detection with Spoken Queries”, IEEE
Transactions on
Audio, Speech, and Language Processing, Vol. 21, No. 7, Jul 2013, pp.
1330-1342. (proposing a completely new concept for unsupervised speech information retrieval without recognition: matching based on models rather than on signals; not limited to Chinese although tested with data in Chinese) |
11. |
Spoken Term Detection Using Support Vector
Machines and Weighted Pseudo Examples”, IEEE
Transactions on Audio, Speech and Language Processing,
Vol. 21, No. 6, Jun
2013, pp. 1272-1284. (proposing new methods for spoken term detection using support vector machines) |
12. | “Interactive
Spoken Document
Retrieval with Suggested Key Terms Ranked by a Markov Decision
Process”, IEEE Transactions
on Audio, Speech and Language Processing, Vol. 20, No. 2, Feb 2012, pp.
632-645. (proposing new approaches for interactive retrieval of spoken documents) |
13. | “Performance
Analysis for Lattice-Based Speech Indexing Approaches Using Words and
Units”, IEEE Transactions on Audio, Speech and Language Processing,
Vol. 18,
No. 6, Aug 2010, pp. 1562-1574. (the first complete analysis of a whole set of speech indexing approaches, considering retrieval accuracy and computation requirements; results not limited to Chinese although tested with data in Chinese) |
14. | “Discriminating
Capabilities of Syllable-based Features and Approaches of Utilizing
Them for
Voice Retrieval of Speech Information in Mandarin Chinese”, IEEE Transactions
on Speech and Audio Processing, Vol. 10, No. 5, July 2002, pp. 303-314. (first journal paper proposing the most discriminative indexing features for Chinese spoken document retrieval) |
15. | “A Recursive Dialogue Game for Personalized
Computer-Aided Pronunciation Training”, IEEE/ACM Transactions on Audio, Speech,
and Language Processing, Vol. 23, No. 1, Jan 2015, pp. 127-141. (proposing new approaches for computer-assisted language learning using dialogue games) |
16. |
Detection and Unsupervised Discovery of Pronunciation Error Patterns for
Computer-Assisted Language Learning”, IEEE/ACM Transactions on Audio, Speech,
and Language Processing, Vol. 23, No. 3, Mar 2015, pp. 564-579. (proposing new approaches to detect and discover pronunciation error patterns for computer-assisted language learning) |
17. | “An
Framework for Recognizing Highly Imbalanced Bilingual Code-Switched
with Cross-Language Acoustic Modeling and Frame-Level Language
IEEE/ACM Transactions on Audio, Speech and Language Processing, Vol.
23, No. 7,
Jul 2015, pp. 1144-1159. (the first to propose a complete framework for recognizing Mandarin-English bilingual code-switched speech) |
18. |
Features and Models for Detecting Edit Disfluencies in Transcribing
Mandarin Speech”, IEEE Transactions on Audio, Speech and Language Processing,
Vol. 17, No. 7, Sept 2009, pp. 1263-1278. (proposing a whole set of features and models for processing the very difficult spontaneous speech in Chinese) |
19. | “Higher
Order Cepstral Moment Normalization for Improved Robust Speech
Recognition”, IEEE
Transactions on Audio, Speech and Language Processing, Vol. 17, No. 2,
Feb 2009,
pp. 205-220. (proposing a new family of higher-order cepstral moment normalization techniques for robust speech recognition in noisy environments) |
20. | “Pronunciation
Modeling with Reduced Confusion for Mandarin Chinese Using A
Framework”, IEEE Transactions on Audio, Speech and Language Processing,
No.2, Feb 2007, pp.661-675. (proposing a new framework to model pronunciation variation in spontaneous Mandarin speech) |
B. Pioneering Contributions on Chinese Spoken Language Processing
21. | “Voice
Dictation of Mandarin Chinese”, Special
Section on Signal Processing in Asia, IEEE Signal Processing Magazine,
Vol. 14,
No. 4, July 1997, pp. 63-101. (This is a review paper presenting the problems and approaches in speech recognition for Mandarin Chinese. It is one of the two feature articles of the issue, showing the pioneering leadership of the nominee in the area. The national flags of Taiwan (ROC) and China (PRC) were both shown on the cover of the issue sent to readers and libraries worldwide, with that of Taiwan (ROC) slightly larger, evidently because people in both Taiwan and China speak Chinese but the author of the paper is in Taiwan.) |
22. | “Structural
Features of Chinese Language: Why Chinese Spoken Language Processing is
and Where We Are”, Keynote Speech, International Symposium on Chinese Spoken
Language Processing, Singapore, Dec 1998, pp. 1-15. (opening keynote speech at the first International Symposium on Chinese Spoken Language Processing, reviewing past achievements in different areas and looking forward to the future, focusing primarily on approaches to handling the structural features of the Chinese language and on the network environment) |
23. | “Golden
Mandarin (I) - A
Real-time Mandarin Speech Dictation
Machine for Chinese Language with Very Large Vocabulary”, IEEE
Transactions on
Speech and Audio Processing, Vol. 1, No.2, Apr 1993, pp. 158-179. (first journal paper on Mandarin dictation system in the world) |
24. | “Complete
Recognition of Continuous Mandarin Speech for Chinese Language with
Very Large
Vocabulary Using Very Limited Training Data”, IEEE Transactions on
Speech and
Audio Processing, Vol. 5, No. 2, March 1997, pp. 195-200. (first journal paper on continuous Mandarin speech dictation system in the world) |
25. |
Efficient Natural Language Processing System Specially Designed for
the Chinese Language”,
Computational Linguistics, Vol. 17, No.
4, Dec 1991, pp. 347-374. (The first journal paper in the world presenting computer analysis of Chinese natural language sentence structures) |
26. | “Special
Speech Recognition Approaches for the Highly Confusing Mandarin
Syllables Based
on Hidden Markov Models”, Computer Speech and Language, Vol.5, No.2,
Apr 1991,
pp.181‑201. (first journal paper on Mandarin syllable recognition using Hidden Markov Models in the world) |
27. | “Markov
Modeling of Mandarin Chinese for Decoding the Phonetic Sequence into
Characters”, Computer Speech and Language, Vol.5, No.4, Oct 1991,
pp.363‑377. (first journal paper on Chinese language modeling in the world) |
28. | “The
Synthesis Rules in A Chinese Text-to-Speech System”, IEEE
Transactions on Acoustics, Speech and Signal Processing, Vol. ASSP-37,
No. 9,
Sept 1989, pp. 1309-1320. (The first journal paper in the world presenting Chinese text-to-speech synthesis) |
C. Typical Contributions in Communications
29. |
Exact Performance Analysis of the Clipped Diversity Combining Receiver for
FH/MFSK Systems Against A Band Multitone Jammer”, IEEE Transactions on
Communications, Vol. 42, No. 2/3/4, Feb/March/Apr 1994, pp. 700-710. (typical example for work in spread spectrum) |
30. | “Multi‑h
Phase Coded Modulations with Asymmetric Modulation Indices”, IEEE
Journal on
Selected Areas in Communications, Vol.SAC‑7, No.9, Dec 1989, Special
Issue on
Bandwidth and Power Efficient Coded Modulations, pp.1450‑1461. (typical example for work on modulation techniques) |
31. | “Minimum
Likelihood -
A New Concept for Symbol Synchronization”, IEEE Transactions on Communications,
Vol. COM‑35, No. 5, May 1987, pp. 545‑549. (typical example for work on synchronization) |
32. | “A
General Theory for Asynchronous Speech Encryption Techniques”, IEEE
Journal on
Selected Areas in Communications, Vol. SAC‑4, No.2, Mar. 1986, Special
Issue on
Military Communications, pp.280‑287. (typical example for work on communication security) |
D. Contribution in Engineering Education
33. | “Taiwan:
Meeting the New Challenges” in “Special Issue on Engineering Education
- A
Global View”, IEEE Communications Magazine, Vol. 30, No. 11, Nov. 1992,
pp. 18-26. (on engineering education, one of the nine papers invited globally for the Special Issue) |