2024-07-17 [ACM MM 2024] Efficient Training for Multilingual Visual Speech Recognition (by Minsu Kim, Jeonghun Yeo) is accepted in ACM MM 2024
2024-07-03 [ECCV 2024] MoAI: Mixture of All Intelligence for Large Language and Vision Models (by Byung-Kwan Lee) is accepted in ECCV 2024
2024-07-03 [Pattern Recognition] Text-Guided Distillation Learning to Diversify Video Embeddings (by Sangmin Lee) is accepted in Pattern Recognition
2024-07-03 [ICIP 2024] Environmental Context Understanding (by Hyunjun Kim) is accepted in ICIP 2024
2024-07-03 [ICIP 2024] A Language-Driven Approach for Cross-modal Alignment Fusion (by Taeheon Kim, Sangyun Chung, Youngjoon Yu) is accepted in ICIP 2024 workshop
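2024-06-26 [Lab openings for admitted Fall 2024 students] Openings are available for 2 government-funded M.S. students, 1 KAIST-funded M.S. student, industry-sponsored scholarship students, and others.
2024-05-19 [Recent Ph.D. graduate: postdocs] Minsu, a 2024 Ph.D. graduate, has joined META as a postdoc in AI research.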
2024-05-19 [Amazon, Google Internships] Sungjune and Se Jin will join Amazon and Google for research internships, respectively.
2024-05-16 [ACL 2024] CoLLaVO: Crayon Large Language and Vision mOdel (by Byung-Kwan Lee) is accepted in Findings of the Association for Computational Linguistics, ACL 2024
2024-05-16 [ACL 2024] Let's Go Real Talk: Spoken Dialogue Model for Face-to-Face Conversation (by Se Jin Park, Chae Won Kim) is accepted in the Proceedings of the Annual Meeting of the Association for Computational Linguistics, ACL 2024
2024-04-26 [Pattern Recognition] Robust Pedestrian Detection via Constructing Versatile Pedestrian Knowledge Bank (by Sungjune Park, Hyunjun Kim) is accepted in Pattern Recognition
2024-03-26 [IEEE TCSVT] Integrating Language-Derived Appearance Elements with Visual Cues in Pedestrian Detection (by Sungjune Park, Hyunjun Kim) is accepted in IEEE Trans. on CSVT
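2024-03-12 [Fall 2024 graduate student recruitment] We are recruiting 2 government-funded M.S. students, 1 KAIST-funded Ph.D. student, industry-sponsored scholarship students, and others. Interested students should email ymro@kaist.ac.kr.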
2024-02-27 [CVPR 2024] Causal Mode Multiplexer: A Novel Framework for Unbiased Data (by Taeheon Kim) is accepted in CVPR 2024
2024-02-27 [CVPR 2024] AV2AV: Direct Audio-Visual Speech to Audio-Visual Speech Translation (by Se Jin Park, Minsu Kim) is accepted in CVPR 2024
2024-02-27 [IEEE TMM] AKVSR: Compressing Audio Knowledge of a Pretrained Model (by Jeong Hun Yeo) is accepted in IEEE Trans. on Multimedia
2024-02-22 Recruitment for PhD and MS Students
2024-02-21 Prof. Yong Man Ro Named ICT Endowed Chair Professor at KAIST
2023-12-20 [ICASSP 2024] Towards Practical and Efficient Image-to-Speech Captioning with Vision-Language Pre-training and Multi-modal Tokens (by Minsu Kim) is accepted in ICASSP 2024
2023-12-20 [ICASSP 2024] Visual Speech Recognition for Languages with Limited Labeled Data using Automatic Labels from Whisper (by Jeong Hun Yeo and Minsu Kim) is accepted in ICASSP 2024
2023-12-20 [ICASSP 2024] Persona Extraction through Semantic Similarity for Emotional Support Conversation Generation (by Seunghee Han) is accepted in ICASSP 2024
2023-12-20 [ICASSP 2024] Text-driven Talking Face Synthesis by Reprogramming Audio-driven Models (by Jeongsoo Choi) is accepted in ICASSP 2024
2023-12-20 [ICASSP 2024] Exploring Phonetic Context-aware Lip-Sync for Talking Face Generation (by Se Jin Park) is accepted in ICASSP 2024
2023-12-10 [AAAI 2024] OSR via Visual Prompts from Common-Sense Knowledge (by Seongyeop Kim) is accepted in AAAI 2024
2023-12-04 [IEEE TDSC] Defending Video Recognition Model against Adversarial Perturbations via Defense Patterns (by Hong Joo Lee) is accepted in IEEE TDSC
2023-10-08 [EMNLP 2023] Intuitive Multilingual Audio-Visual Speech Recognition with a Single-Trained Model (by Joanna Hong) is accepted in EMNLP 2023
2023-10-08 [IEEE TNNLS] Enabling Visual Object Detection with Object Sounds via Visual Modality Recalling Memory (by Jung Uk Kim) is accepted in IEEE TNNLS
2023-07-17 [ICCV 2023] Lip Reading for Low-resource Languages by General Speech Knowledge (by Minsu Kim and Jeong Hun Yeo) is accepted in ICCV 2023
2023-07-17 [ICCV 2023] Mitigating Adversarial Vulnerability through Causal Parameter Estimation (by Byung-Kwan Lee and Junho Kim) is accepted in ICCV 2023
2023-07-17 [ICCV 2023] DiffV2S: Diffusion-based Video-to-Speech Synthesis (by Jeongsoo Choi and Joanna Hong) is accepted in ICCV 2023
IVYLAB & IVLLAB