I have launched a new website. Please visit it for the latest information.
Conference
- Chung-Ming Chien, Hung-yi Lee, HIERARCHICAL PROSODY MODELING FOR NON-AUTOREGRESSIVE SPEECH SYNTHESIS, SLT, 2021
- Po-Han Chi, Pei-Hung Chung, Tsung-Han Wu, Chun-Cheng Hsieh, Yen-Hao Chen, Shang-Wen Li, Hung-yi Lee, AUDIO ALBERT: A LITE BERT FOR SELF-SUPERVISED LEARNING OF AUDIO REPRESENTATION, SLT, 2021
- Tzu-hsien Huang, Jheng-hao Lin, Hung-yi Lee, HOW FAR ARE WE FROM ROBUST VOICE CONVERSION: A SURVEY, SLT, 2021
- Chien-yu Huang, Yist Y. Lin, Hung-yi Lee, Lin-shan Lee, DEFENDING YOUR VOICE: ADVERSARIAL ATTACK ON VOICE CONVERSION, SLT, 2021
- Heng-Jui Chang, Alexander H. Liu, Hung-yi Lee, Lin-shan Lee, END-TO-END WHISPERED SPEECH RECOGNITION WITH FREQUENCY-WEIGHTED APPROACHES AND PSEUDO WHISPER PRE-TRAINING, SLT, 2021
- David C. Chiang, Sung-Feng Huang, Hung-yi Lee, Pretrained Language Model Embryology: The Birth of ALBERT, EMNLP, 2020
- Chun-Hsing Lin, Siang-Ruei Wu, Hung-Yi Lee, Yun-Nung Chen, TaylorGAN: Neighbor-Augmented Policy Update for Sample-Efficient Natural Language Generation, NeurIPS, 2020
- Shu-wen Yang, Andy T. Liu, Hung-yi Lee, "Understanding Self-Attention of Self-Supervised Audio Transformers", INTERSPEECH, 2020
- Haibin Wu, Andy T. Liu, Hung-yi Lee, "Defense for Black-box Attacks on Anti-spoofing Models by Self-Supervised Learning", INTERSPEECH, 2020
- Po-chun Hsu, Hung-yi Lee, "WG-WaveNet: Real-Time High-Fidelity Speech Synthesis without GPU", INTERSPEECH, 2020
- Yi-Chen Chen, Jui-Yang Hsu, Cheng-Kuang Lee, Hung-yi Lee, "DARTS-ASR: Differentiable Architecture Search for Multilingual Speech Recognition and Adaptation", INTERSPEECH, 2020
- Da-Yi Wu, Yen-Hao Chen, Hung-Yi Lee, "VQVC+: One-Shot Voice Conversion by Vector Quantization and U-Net architecture", INTERSPEECH, 2020
- Tao Tu, Yuan-Jui Chen, Alexander H. Liu, Hung-yi Lee, "Semi-supervised Learning for Multi-speaker Text-to-speech Synthesis Using Discrete Speech Representation", INTERSPEECH, 2020
- Yung-Sung Chuang, Chi-Liang Liu, Hung-Yi Lee, Lin-shan Lee, "SpeechBERT: An Audio-and-text Jointly Learned Language Model for End-to-end Spoken Question Answering", INTERSPEECH, 2020
- Shun-Po Chuang, Tzu-Wei Sung, Alexander H. Liu, Hung-yi Lee, "Worse WER, but Better BLEU? Leveraging Word Embedding as Intermediate in Multitask End-to-End Speech Translation", ACL, 2020
- Andy T. Liu, Shu-wen Yang, Po-Han Chi, Po-chun Hsu, Hung-yi Lee, "MOCKINGJAY: UNSUPERVISED SPEECH REPRESENTATION LEARNING WITH DEEP BIDIRECTIONAL TRANSFORMER ENCODERS", ICASSP, 2020
(video)
- Chung-Yi Li, Pei-Chieh Yuan, Hung-Yi Lee, "WHAT DOES A NETWORK LAYER HEAR? ANALYZING HIDDEN REPRESENTATIONS OF END-TO-END ASR THROUGH SPEECH SYNTHESIS", ICASSP, 2020
(video)
- Gene-Ping Yang, Szu-Lin Wu, Yao-Wen Mao, Hung-yi Lee, Lin-shan Lee, "INTERRUPTED AND CASCADED PERMUTATION INVARIANT TRAINING FOR SPEECH SEPARATION", ICASSP, 2020
(video)
- Alexander H. Liu, Tzu-Wei Sung, Shun-Po Chuang, Hung-yi Lee, Lin-shan Lee, "SEQUENCE-TO-SEQUENCE AUTOMATIC SPEECH RECOGNITION WITH WORD EMBEDDING REGULARIZATION AND FUSED DECODING", ICASSP, 2020
(video)
- Shun-Po Chuang, Tzu-Wei Sung, Hung-Yi Lee, "TRAINING A CODE-SWITCHING LANGUAGE MODEL WITH MONOLINGUAL DATA", ICASSP, 2020
(video)
- Alexander H. Liu, Tao Tu, Hung-yi Lee, Lin-shan Lee, "TOWARDS UNSUPERVISED SPEECH RECOGNITION AND SYNTHESIS WITH QUANTIZED SPEECH REPRESENTATION LEARNING", ICASSP, 2020
(video)
- Da-Yi Wu, Hung-yi Lee, "ONE-SHOT VOICE CONVERSION BY VECTOR QUANTIZATION", ICASSP, 2020
(video)
- Haibin Wu, Songxiang Liu, Helen Meng, Hung-yi Lee, "Defense against adversarial attacks on spoofing countermeasures of ASV", ICASSP, 2020
(video)
- Jui-Yang Hsu, Yuan-Jui Chen, Hung-yi Lee, "META LEARNING FOR END-TO-END LOW-RESOURCE SPEECH RECOGNITION", ICASSP, 2020
(video)
- Chun-Hao Chao, Pin-Lun Hsu, Hung-Yi Lee, Yu-Chiang Frank Wang, "SELF-SUPERVISED DEEP LEARNING FOR FISHEYE IMAGE RECTIFICATION", ICASSP, 2020
- Fan-Keng Sun, Cheng-Hao Ho, Hung-Yi Lee, "LAMOL: LAnguage MOdeling for Lifelong Language Learning", ICLR, 2020
(video)
- Che-Ping Tsai, Hung-Yi Lee, "Order-free Learning Alleviating Exposure Bias in Multi-label Classification", AAAI, 2020
- Songxiang Liu, Haibin Wu, Hung-yi Lee, Helen Meng, "Adversarial attacks on spoofing countermeasures of automatic speaker verification", ASRU, 2019
- Tsung-Yuan Hsu, Chi-Liang Liu and Hung-yi Lee,
"Zero-shot Reading Comprehension by Cross-lingual Transfer Learning with Multi-lingual Language Representation Model",
EMNLP, 2019
- Hong-Ren Mao and Hung-Yi Lee,
"Polly Want a Cracker: Analyzing Performance of Parroting on Paraphrase Generation Datasets",
EMNLP, 2019
- Yaushian Wang, Hung-Yi Lee and Yun-Nung Chen,
"Tree Transformer: Integrating Tree Structures into Self-Attention",
EMNLP, 2019
- Yi-Lin Tuan, Yun-Nung Chen and Hung-yi Lee,
"DyKgChat: Benchmarking Dialogue Generation Grounding on Dynamic Knowledge Graphs",
EMNLP, 2019
- Ju-chieh Chou, Cheng-chieh Yeh, Hung-yi Lee, "One-shot Voice Conversion by Separating Speaker and Content Representations with Instance Normalization", INTERSPEECH, 2019
- Andy T. Liu, Po-chun Hsu and Hung-yi Lee, "Unsupervised End-to-End Learning of Discrete Linguistic Units for Voice Conversion", INTERSPEECH, 2019
- Feng-Guang Su, Aliyah Hsu, Yi-Lin Tuan and Hung-yi Lee, "Personalized Dialogue Response Generation Learned from Monologues", INTERSPEECH, 2019
- Yuan-Jui Chen, Tao Tu, Cheng-chieh Yeh, Hung-yi Lee, "End-to-end Text-to-speech for Low-resource Languages by Cross-Lingual Transfer Learning", INTERSPEECH, 2019
- Ching-Ting Chang, Shun-Po Chuang, Hung-Yi Lee, "Code-switching Sentence Generation by Generative Adversarial Networks and its Application to Data Augmentation", INTERSPEECH, 2019
- Kuan-yu Chen, Che-ping Tsai, Da-Rong Liu, Hung-yi Lee and Lin-shan Lee, "Completely Unsupervised Phoneme Recognition By A Generative Adversarial Network Harmonized With Iteratively Refined Hidden Markov Models", INTERSPEECH, 2019
- Chien-Feng Liao, Yu Tsao, Hung-yi Lee and Hsin-Min Wang, "Noise Adaptive Speech Enhancement using Domain Adversarial Training", INTERSPEECH, 2019
- Gene-Ping Yang, Chao-I Tuan, Hung-yi Lee and Lin-shan Lee, "Improved Speech Separation with Time-and-Frequency Cross-domain Joint Embedding and Clustering", INTERSPEECH, 2019
- Li-Wei Chen, Hung-Yi Lee, Yu Tsao, "Generative Adversarial Networks for Unpaired Voice Transformation on Impaired Speech", INTERSPEECH, 2019
- Che-Ping Tsai, Hung-Yi Lee,
"Adversarial Learning of Label Dependency: A Novel Framework for Multi-class Classification", ICASSP, 2019
- Chia-Hung Wan, Shun-Po Chuang, Hung-Yi Lee, "Towards Audio to Scene Image Synthesis using Generative Adversarial Network", ICASSP, 2019
demo
- Chia-Hsuan Lee, Yun-Nung Chen, Hung-Yi Lee,
"Mitigating the Impact of Speech Recognition Errors on Spoken Question Answering by Adversarial Domain Adaptation", ICASSP, 2019
- Tzu-Wei Sung, Jun-You Liu, Hung-yi Lee, Lin-shan Lee,
"Towards End-to-end Speech-to-text Translation with Two-pass Decoding", ICASSP, 2019
- Alexander H. Liu, Hung-yi Lee, Lin-shan Lee,
"Adversarial Training of End-to-end Speech Recognition Using a Criticizing Language Model", ICASSP, 2019
- Richard Tzong-Han Tsai, Chia-Hao Chen, Chun-Kai Wu, Yu-Cheng Hsiao, Hung-Yi Lee,
"Using Deep-Q Network to Select Candidates from N-best Speech Recognition Hypotheses for Enhancing Dialogue State Tracking", ICASSP, 2019
- Yau-Shian Wang, Hung-Yi Lee, "Learning to Encode Text as Human-Readable Summaries using Generative Adversarial Networks", EMNLP, 2018
- Da-Rong Liu, Chi-Yu Yang, Szu-Lin Wu, Hung-Yi Lee, "Improving Unsupervised Style Transfer in End-to-End Speech Synthesis with End-to-End Speech Recognition", SLT, 2018
- Chia-Hsuan Lee, Shang-Ming Wang, Huan-Cheng Chang, Hung-Yi Lee, "ODSQA: Open-domain Spoken Question Answering Dataset", SLT, 2018
dataset
- Cheng-chieh Yeh, Po-chun Hsu, Ju-chieh Chou, Hung-yi Lee, Lin-shan Lee, "Rhythm-Flexible Voice Conversion without Parallel Data Using Cycle-GAN over Phoneme Posteriorgram Sequences", SLT, 2018
- Yi-Chen Chen, Sung-Feng Huang, Chia-Hao Shen, Hung-yi Lee, Lin-shan Lee, "Phonetic-and-Semantic Embedding of Spoken Words with Applications in Spoken Content Retrieval", SLT, 2018
- Chia-Hsuan Li, Szu-Lin Wu, Chi-Liang Liu, Hung-yi Lee, "Spoken SQuAD: A Study of Mitigating the Impact of Speech Recognition Errors on Listening Comprehension", INTERSPEECH, 2018
dataset
- Pei-Hung Chung, Kuan Tung, Ching-Lun Tai, Hung-Yi Lee, "Joint Learning of Interactive Spoken Content Retrieval and Trainable User Simulator", INTERSPEECH, 2018
(The paper won the best student paper award at INTERSPEECH 2018. Over 700 papers from all over the world were considered. The list was narrowed down to 12 finalists and then to 3, from which the award winner was announced.)
- Ju-chieh Chou, Cheng-chieh Yeh, Hung-yi Lee, Lin-shan Lee, "Multi-target Voice Conversion without Parallel Data by Adversarially Learning Disentangled Audio Representations", INTERSPEECH, 2018
(one of the 12 finalists for the best student paper award)
- Da-Rong Liu, Kuan-Yu Chen, Hung-Yi Lee, Lin-shan Lee, "Completely Unsupervised Phoneme Recognition by Adversarially Learning Mapping Relationships from Audio Embeddings", INTERSPEECH, 2018
- Chia-Hao Shen, Janet Y. Sung, Hung-Yi Lee, "Language Transfer of Audio Word2Vec: Learning Audio Segment Representations without Target Language Data", ICASSP, 2018
- Chia-Wei Ao, Hung-yi Lee, "Query-by-example Spoken Term Detection using Attention-based Multi-hop Networks", ICASSP, 2018
- Hsien-Chin Lin, Chi-Yu Yang, Hung-Yi Lee, Lin-Shan Lee, "Domain Independent Key Term Extraction from Spoken Content based on Context and Term Location Information", ICASSP, 2018
- Chih-Wei Lee, Yau-Shian Wang, Tsung-Yuan Hsu, Kuan-Yu Chen, Hung-Yi Lee, Lin-Shan Lee, "Scalable Sentiment for Sequence-to-sequence Chatbot Response with Performance Analysis", ICASSP, 2018
- Yu-Hsuan Wang, Hung-Yi Lee, Lin-Shan Lee, "Segmental Audio Word2vec: Representing Utterances as Sequences of Vectors with Applications in Spoken Term Detection", ICASSP, 2018
- Yu-An Chung, Hung-Yi Lee, James Glass, "Supervised and Unsupervised Transfer Learning for Question Answering", NAACL, 2018
code
- Tzu-Chien Liu, Yu-Hsueh Wu, Hung-Yi Lee, "Query-based Attention CNN for Text Similarity Map", ICCV workshop, 2018
code
- Pin-Jung Chen, I-Hung Hsu, Yi Yao Huang, Hung-Yi Lee, "Mitigating the Impact of Speech Recognition Errors on Chatbot using Sequence-to-sequence Model", the 12th biennial IEEE workshop on Automatic Speech Recognition and Understanding (ASRU'17), Okinawa, Japan, December 2017
- Shun Po Chuang, Chia-Hung Wan, Pang-Chi Huang, Chi-Yu Yang, Hung-Yi Lee, "Seeing and Hearing Too: Audio Representation for Video Captioning", the 12th biennial IEEE workshop on Automatic Speech Recognition and Understanding (ASRU'17), Okinawa, Japan, December 2017
- Zih-Wei Lin, Tzu-Wei Sung, Hung-Yi Lee, Lin-Shan Lee, "Personalized Word Representations Carrying Personalized Semantics Learned from Social Network Posts", the 12th biennial IEEE workshop on Automatic Speech Recognition and Understanding (ASRU'17), Okinawa, Japan, December 2017
- Tzu-Ray Su, Hung-Yi Lee, "Learning Chinese Word Representations From Glyphs Of Characters", Conference on Empirical Methods in Natural Language Processing (EMNLP), Copenhagen, Denmark, Sept. 2017
- Yu-Hsuan Wang, Cheng-Tao Chung, Hung-yi Lee, "Gate Activation Signal Analysis for Gated Recurrent Neural Networks and Its Correlation with Phoneme Boundaries", the 18th Annual Conference of the International Speech Communication Association (INTERSPEECH'17), Stockholm, Sweden, August 2017
- Bo-Ru Lu, Frank Shyu, Yun-Nung Chen, Hung-Yi Lee, Lin-Shan Lee, "Order-Preserving Abstractive Summarization for Spoken Content based on Connectionist Temporal Classification", the 18th Annual Conference of the International Speech Communication Association (INTERSPEECH'17), Stockholm, Sweden, August 2017
- Wei-Jen Ko, Bo-Hsiang Tseng, Hung-yi Lee, "Recurrent Neural Network based Language Modeling with Controllable External Memory", the 42nd IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP'17), New Orleans, March 2017
- Cheng-Kuan Wei, Cheng-Tao Chung, Hung-yi Lee, Lin-Shan Lee, "Personalized Acoustic Modeling by Weakly Supervised Multi-task Deep Learning using Acoustic Tokens Discovered from Unlabeled Data", the 42nd IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP'17), New Orleans, March 2017
- Lang-Chi Yu, Hung-yi Lee, Lin-Shan Lee, “Abstractive Headline Generation for Spoken Content by Attentive Recurrent Neural Networks with ASR Error Modeling”, the 6th IEEE Workshop on Spoken Language Technology (SLT'16), San Diego, Dec. 2016
- Wei Fang, Juei-Yang Hsu, Hung-yi Lee, Lin-Shan Lee,
"Hierarchical Attention Model for Improved Machine Comprehension of Spoken Content",
the 6th IEEE Workshop on Spoken Language Technology (SLT'16), San Diego, Dec. 2016
code
- Bo-Hsiang Tseng, Sheng-syun Shen, Hung-Yi Lee, Lin-Shan Lee,
"Towards Machine Comprehension of Spoken Content: Initial TOEFL Listening Comprehension Test by Machine",
the 17th Annual Conference of the International Speech Communication Association (INTERSPEECH'16),
San Francisco, Sept. 2016 (one of the 12 finalists for the best student paper award)
- Yen-Chen Wu, Tzu-Hsiang Lin, Yang-De Chen, Hung-Yi Lee, Lin-Shan Lee, "Interactive Spoken Content Retrieval by Deep Reinforcement Learning", the 17th Annual Conference of the International Speech Communication Association (INTERSPEECH'16), San Francisco, Sept. 2016
- Yu-An Chung, Chao-Chung Wu, Chia-Hao Shen, Hung-Yi Lee, Lin-Shan Lee, "Audio Word2Vec: Unsupervised Learning of Audio Segment Representations Using Sequence-to-Sequence Autoencoder", the 17th Annual Conference of the International Speech Communication Association (INTERSPEECH'16), San Francisco, Sept. 2016
- Sheng-syun Shen, Hung-Yi Lee, "Neural Attention Models for Sequence Classification: Analysis and Application to Key Term Extraction and Dialogue Act Detection", the 17th Annual Conference of the International Speech Communication Association (INTERSPEECH'16), San Francisco, Sept. 2016
- Yi-Hsiu Liao, Hung-yi Lee, Lin-shan Lee, "Towards Structured Deep Neural Network for Automatic Speech Recognition", the 11th biennial IEEE workshop on Automatic Speech Recognition and Understanding (ASRU'15), Arizona, December 2015
- Bo-Hsiang Tseng, Hung-yi Lee, Lin-Shan Lee, "Personalizing Universal Recurrent Neural Network Language Model with User Characteristic Features by Social Network Crowdsourcing", the 11th biennial IEEE workshop on Automatic Speech Recognition and Understanding (ASRU'15), Arizona, December 2015
- Cheng-Tao Chung, Cheng-Yu Tsai, Hsiang-Hung Lu, Chia-Hsiang Liu, Hung-yi Lee, Lin-shan Lee, "An Iterative Deep Learning Framework for Unsupervised Discovery of Speech Features and Linguistic Units with Applications on Spoken Term Detection", the 11th biennial IEEE workshop on Automatic Speech Recognition and Understanding (ASRU'15), Arizona, December 2015
- Sheng-syun Shen, Hung-yi Lee, Shang-wen Li, Victor Zue and Lin-shan Lee,
"Structuring Lectures in Massive Open Online Courses (MOOCs) for Efficient Learning by Linking Similar Sections and Predicting Prerequisites", the 16th Annual Conference of the International Speech Communication Association (INTERSPEECH'15), Dresden, Germany, Sept. 2015
- Hung-tsung Lu, Yuan-ming Liou, Hung-yi Lee and Lin-shan Lee, "Semantic Retrieval of Personal Photos using a Deep Autoencoder Fusing Visual Features with Speech Annotations Represented as Word/Paragraph Vectors", the 16th Annual Conference of the International Speech Communication Association (INTERSPEECH'15), Dresden, Germany, Sept. 2015
- Ching-Feng Yeh, Yuan-ming Liou, Hung-yi Lee and Lin-shan Lee, "Personalized Speech Recognizer with Keyword-based Personalized Lexicon and Language Model using Word Vector Representations", the 16th Annual Conference of the International Speech Communication Association (INTERSPEECH'15), Dresden, Germany, Sept. 2015
- Hung-yi Lee, Yu Zhang, Ekapol Chuangsuwanich, James Glass, "Graph-based Re-ranking using Acoustic Feature Similarity between Search Results for Spoken Term Detection on Low-resource Languages", the 15th Annual Conference of the International Speech Communication Association (INTERSPEECH'14), Singapore, Sept. 2014
- Han Lu, Sheng-syun Shen, Sz-Rung Shiang, Hung-yi Lee and Lin-shan Lee, "Alignment of Spoken Utterances with Slide Content for Easier Learning with Recorded Lectures using Structured Support Vector Machine (SVM)", the 15th Annual Conference of the International Speech Communication Association (INTERSPEECH'14), Singapore, Sept. 2014
- Sz-Rung Shiang, Hung-yi Lee and Lin-shan Lee, "Spoken Question Answering Using Tree-structured Conditional Random Fields and Two-layer Random Walk", the 15th Annual Conference of the International Speech Communication Association (INTERSPEECH'14), Singapore, Sept. 2014
- Yuan-ming Liou, Yi-sheng Fu, Hung-yi Lee and Lin-shan Lee, "Semantic Retrieval of Personal Photos using Matrix Factorization and Two-layer Random Walk Fusing Sparse Speech Annotations with Visual Features", the 15th Annual Conference of the International Speech Communication Association (INTERSPEECH'14), Singapore, Sept. 2014
- Hung-yi Lee, Ting-yao Hu, How Jing, Yun-Fan Chang, Yu Tsao, Yu-Cheng Kao, Tsang-Long Pao, "Ensemble of Machine Learning and Acoustic Segment Model Techniques for Speech Emotion and Autism Spectrum Disorders Recognition", the 14th Annual Conference of the International Speech Communication Association (INTERSPEECH'13), Lyon, France, August 2013
- Hung-yi Lee, Yu-yu Chou, Yow-Bang Wang, Lin-shan Lee, "Unsupervised Domain Adaptation for Spoken Document Summarization with Structured Support Vector Machine", the 38th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP'13), Vancouver, Canada, May 2013
- Hung-yi Lee, Yun-Chiao Li, Cheng-Tao Chung, Lin-shan Lee, "Enhancing Query Expansion for Semantic Retrieval of Spoken Content with Automatically Discovered Acoustic Patterns", the 38th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP'13), Vancouver, Canada, May 2013
- Yun-Chiao Li, Hung-yi Lee, Cheng-Tao Chung, Chun-an Chan, and Lin-shan Lee, "Towards Unsupervised Semantic Retrieval of Spoken Content with Query Expansion based on Automatically Discovered Acoustic Patterns", the 10th biennial IEEE workshop on Automatic Speech Recognition and Understanding (ASRU'13), Olomouc, Czech Republic, December 2013
- Sz-Rung Shiang, Hung-yi Lee, Lin-shan Lee, "Supervised Spoken Document Summarization Based on Structured Support Vector Machine with Utterance Clusters as Hidden Variables", the 14th Annual Conference of the International Speech Communication Association (INTERSPEECH'13), Lyon, France, August 2013
- Tsung-Hsien Wen, Aaron Heidel, Hung-yi Lee, Yu Tsao, Lin-shan Lee, "Recurrent Neural Network Based Language Model Personalization by Social Network Crowdsourcing", the 14th Annual Conference of the International Speech Communication Association (INTERSPEECH'13), Lyon, France, August 2013 (one of the 12 finalists for the best student paper award)
- Ching-Feng Yeh, Hung-yi Lee and Lin-shan Lee, "Speaking Rate Normalization with Lattice-based Context-dependent Phoneme Duration Modeling for Personalized Speech Recognizers on Mobile Devices", the 14th Annual Conference of the International Speech Communication Association (INTERSPEECH'13), Lyon, France, August 2013
- Tsung-Hsien Wen, Hung-yi Lee, Pei-Hao Su, Lin-shan Lee, "Interactive Spoken Content Retrieval by Extended Query Model and Continuous State Space Markov Decision Process", the 38th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP'13), Vancouver, Canada, May 2013
- Hung-yi Lee, Tsung-Hsien Wen, Lin-shan Lee, "Improved Semantic Retrieval of Spoken Content by Language Models Enhanced with Acoustic Similarity Graph", the 4th IEEE Workshop on Spoken Language Technology (SLT'12), Miami, Florida, December 2012
- Tsung-Hsien Wen, Hung-yi Lee, Lin-shan Lee, "Personalized Language Modeling by Crowd Sourcing with Social Network Data for Voice Access of Cloud Applications", the 4th IEEE Workshop on Spoken Language Technology (SLT'12), Miami, Florida, December 2012
- Hung-yi Lee, Yu-yu Chou, Yow-Bang Wang, Lin-shan Lee, "Supervised Spoken Document Summarization Jointly Considering Utterance Importance and Redundancy by Structured Support Vector Machine", the 13th Annual Conference of the International Speech Communication Association (INTERSPEECH'12), Portland, Oregon, September 2012
- Hung-yi Lee, Po-wei Chou, Lin-shan Lee, "Open-Vocabulary Retrieval of Spoken Content with Shorter/Longer Queries Considering Word/Subword-based Acoustic Feature Similarity", the 13th Annual Conference of the International Speech Communication Association (INTERSPEECH'12), Portland, Oregon, September 2012
- Hung-yi Lee, Yun-nung Chen, Lin-shan Lee, "Utterance-level Latent Topic Transition Modeling for Spoken Documents and its Application in Automatic Summarization", the 37th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP'12), Kyoto, Japan, March 2012
- Tsung-Hsien Wen, Hung-yi Lee, Lin-shan Lee, "Interactive Spoken Content Retrieval with Different Types of Actions Optimized by a Markov Decision Process", the 13th Annual Conference of the International Speech Communication Association (INTERSPEECH'12), Portland, Oregon, September 2012 (one of the 10 finalists for the best student paper award)
- Tsung-wei Tu, Hung-yi Lee, Lin-shan Lee, "Semantic Query Expansion and Context-based Discriminative Term Modeling for Spoken Document Retrieval", the 37th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP'12), Kyoto, Japan, March 2012 (IEEE Spoken Language Processing Student Travel Grant)
- Yun-Nung Chen, Yu Huang, Hung-yi Lee, Lin-shan Lee, "Unsupervised Two-Stage Keyword Extraction from Spoken Documents by Topic Coherence and Support Vector Machine", the 37th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP'12), Kyoto, Japan, March 2012
- Ching-Feng Yeh, Aaron Heidel, Hung-yi Lee, Lin-shan Lee, "Recognition of Highly Imbalanced Code-mixed Bilingual Speech with Frame-level Language Detection based on Blurred Posteriorgram", the 37th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP'12), Kyoto, Japan, March 2012
- Hung-yi Lee, Yun-nung Chen, Lin-shan Lee, "Improved Speech Summarization and Spoken Term Detection with Graphical Analysis of Utterance Similarities", the 3rd Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC 2011), Xi'an, China, October 2011
- Hung-yi Lee, Tsung-wei Tu, Chia-ping Chen, Chao-yu Huang, Lin-shan Lee, "Improved Spoken Term Detection Using Support Vector Machines based on Lattice Context Consistency", the 36th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP'11), Prague, Czech Republic, May 2011
- Tsung-wei Tu, Hung-yi Lee, Lin-shan Lee,
"Improved Spoken Term Detection using Support Vector Machines with Acoustic and Context Features from Pseudo-relevance Feedback", the 9th biennial IEEE workshop on Automatic Speech Recognition and Understanding (ASRU'11), Hawaii, December 2011 (one of the 5 finalists for the best student paper award)
- Yun-nung Chen, Chia-ping Chen, Hung-yi Lee, Chun-an Chan, Lin-shan Lee, "Improved Spoken Term Detection with Graph-based Re-ranking in Feature Space", the 36th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP'11), Prague, Czech Republic, May 2011
- Hung-yi Lee, Chia-ping Chen, Ching-feng Yeh, Lin-shan Lee, "A Framework Integrating Different Relevance Feedback Scenarios and Approaches for Spoken Term Detection", the 3rd IEEE Workshop on Spoken Language Technology (SLT'10), Berkeley, California, December 2010
- Hung-yi Lee, Chia-ping Chen, Ching-feng Yeh, Lin-shan Lee, "Improved Spoken Term Detection by Discriminative Training of Acoustic Models based on User Relevance Feedback", the 11th Annual Conference of the International Speech Communication Association (INTERSPEECH'10), Makuhari, Japan, September 2010
- Hung-yi Lee and Lin-shan Lee, "Integrating Recognition and Retrieval with User Feedback: A New Framework for Spoken Term Detection", the 35th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP'10), Dallas, Texas, March 2010 (cited in textbook)
- Chia-ping Chen, Hung-yi Lee, Ching-feng Yeh, Lin-shan Lee, "Improved Spoken Term Detection by Feature Space Pseudo-Relevance Feedback", the 11th Annual Conference of the International Speech Communication Association (INTERSPEECH'10), Makuhari, Japan, September 2010
- Yu-Hui Chen, Chia-Chen Chou, Hung-yi Lee, Lin-shan Lee, "An Initial Attempt to Improve Spoken Term Detection by Learning Optimal Weights for Different Indexing Features", the 35th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP'10), Dallas, Texas, March 2010 (cited in textbook)
- Hung-yi Lee, Yueh-Lien Tang, Hao Tang, Lin-shan Lee, "Spoken Term Detection from Bilingual Spontaneous Speech Using Code-switched Lattice-based Structures for Words and Subword Units", the 8th biennial IEEE workshop on Automatic Speech Recognition and Understanding (ASRU'09), Merano, Italy, December 2009
- Chao-hong Meng, Hung-yi Lee, Lin-shan Lee, "Improved Lattice-based Spoken Document Retrieval by Directly Learning from the Evaluation Measures", the 34th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP'09), Taipei, Taiwan, April 2009
Journal
- Yi-Chen Chen, Sung-Feng Huang, Hung-yi Lee, Yu-Hsuan Wang, Chia-Hao Shen,
"Audio Word2vec: Sequence-to-sequence Autoencoding for Unsupervised Learning of Audio Segmentation and Representation",
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 27, no. 9, pp. 1481-1493, Sept. 2019
- Chia-Hsuan Lee, Hung-yi Lee, Szu-Lin Wu, Chi-Liang Liu, Wei Fang, Juei-Yang Hsu, Bo-Hsiang Tseng,
"Machine Comprehension of Spoken Content: TOEFL Listening Test and Spoken SQuAD",
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 27, no. 9, pp. 1469-1480, Sept. 2019
- Yi-Lin Tuan, Hung-Yi Lee, "Improving Conditional Sequence Generative Adversarial Networks by Stepwise Evaluation",
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 27, no. 4, pp. 788-798, April 2019
- Hung-Yi Lee, Pei-Hung Chung, Yen-Chen Wu, Tzu-Hsiang Lin, Tsung-Hsien Wen, "Interactive Spoken Content Retrieval by Deep Reinforcement Learning,"
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 26, no. 12, pp. 2447-2459, Dec. 2018
- Hung-yi Lee, Bo-Hsiang Tseng, Tsung-Hsien Wen, Yu Tsao, "Personalizing Recurrent Neural Network Based Language Model by Social Network,"
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 25, no. 3, pp. 519-530, March 2017
- Shun-Yao Shih, Fan-Keng Sun, Hung-yi Lee,
"Temporal Pattern Attention for Multivariate Time Series Forecasting",
accepted by the journal track of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECMLPKDD)
- Lin-shan Lee, James Glass, Hung-yi Lee, Chun-an Chan,
"Spoken Content Retrieval —Beyond Cascading Speech Recognition with Text Retrieval,"
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.23, no.9, pp.1389-1420, Sept. 2015
- Hung-yi Lee, Ching-feng Yeh, Yun-Nung Chen, Yu Huang, Sheng-Yi Kong and Lin-shan Lee,
“Spoken Knowledge Organization by Semantic Structuring and a Prototype Course Lecture System for Personalized Learning”, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.22, no.5, pp.883-898, May 2014
(Figure 9 of the article selected as journal cover)
- Hung-yi Lee, Po-wei Chou, Lin-shan Lee, "Improved open-vocabulary spoken content retrieval with word and subword lattices using acoustic feature similarity", Computer Speech & Language, Volume 28, Issue 5, pp. 1045-1065, Sept. 2014
- Hung-yi Lee, Lin-shan Lee, "Improved Semantic Retrieval of Spoken Content by Document/Query Expansion with Random Walk over Acoustic Similarity Graphs," IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.22, no.1, pp.80-94, Jan. 2014 (Figure 2 of the article selected as journal cover)
- Hung-yi Lee, Lin-shan Lee, "Enhanced Spoken Term Detection Using Support Vector Machines and Weighted Pseudo Examples," IEEE Transactions on Audio, Speech, and Language Processing, vol.21, no.6, pp.1272-1284, June 2013
- Hung-yi Lee, Chia-ping Chen, Lin-shan Lee, "Integrating Recognition and Retrieval with Relevance Feedback for Spoken Term Detection," IEEE Transactions on Audio, Speech, and Language Processing, vol.20, no.7, pp.2095-2110, Sept. 2012
- Yi-cheng Pan, Hung-yi Lee, Lin-shan Lee, "Interactive Spoken Document Retrieval With Suggested Key Terms Ranked by a Markov Decision Process", IEEE Transactions on Audio, Speech, and Language Processing, vol.20, issue.2, pp. 632-645, Feb. 2012
Preprint
- Tsung-Han Wu, Chun-Cheng Hsieh, Yen-Hao Chen, Po-Han Chi, Hung-yi Lee, "Hand-crafted Attention is All You Need? A Study of Attention on Self-supervised Audio Transformer", arXiv preprint, 2020
- Po-Han Chi, Pei-Hung Chung, Tsung-Han Wu, Chun-Cheng Hsieh, Shang-Wen Li, Hung-yi Lee, "Audio ALBERT: A Lite BERT for Self-supervised Learning of Audio Representation", arXiv preprint, 2020
- Heng-Jui Chang, Alexander H. Liu, Hung-yi Lee, Lin-shan Lee, "End-to-end Whispered Speech Recognition with Frequency-weighted Approaches and Layer-wise Transfer Learning", arXiv preprint, 2020
- Chien-yu Huang, Yist Y. Lin, Hung-yi Lee, Lin-shan Lee, "Defending Your Voice: Adversarial Attack on Voice Conversion", arXiv preprint, 2020
- Yuan-Kuei Wu, Chao-I Tuan, Hung-yi Lee, Yu Tsao, "SADDEL: Joint Speech Separation and Denoising Model based on Multitask Learning", arXiv preprint, 2020
- Chao-I Tuan, Yuan-Kuei Wu, Hung-yi Lee, Yu Tsao, "MITAS: A Compressed Time-Domain Audio Separation Network with Parameter Sharing", arXiv preprint, 2019
- Po-chun Hsu, Chun-hsuan Wang, Andy T. Liu, Hung-yi Lee, "Towards Robust Neural Vocoding for Speech Generation: A Survey", arXiv preprint, 2019
- Chia-Hsuan Lee, Hung-Yi Lee, "Cross-Lingual Transfer Learning for Question Answering", arXiv preprint, 2019
- Yi-Chen Chen, Chia-Hao Shen, Sung-Feng Huang, Hung-yi Lee, "Towards Unsupervised Automatic Speech Recognition Trained by Unaligned Speech and Text only", arXiv preprint, 2018
- Yi-Lin Tuan, Jinzhi Zhang, Yujia Li, Hung-yi Lee, "Proximal Policy Optimization and its Dynamic Version for Sequence Generation", arXiv preprint, 2018
- Da-Rong Liu, Shun-Po Chuang, Hung-yi Lee,
"Attention-based Memory Selection Recurrent Network for Language Modeling", arXiv preprint, 2016