IVYLab & IVLLab
International Journal
2024
[#159] MSCoTDet: Language-driven Multi-modal Fusion for Improved Multispectral Pedestrian Detection
Taeheon Kim, Sangyun Chung, Damin Yeom, Youngjoon Yu, Hak Gu Kim, Yong Man Ro
IEEE Transactions on Circuits and Systems for Video Technology
[#158] Prompt Tuning of Deep Neural Networks for Speaker-adaptive Visual Speech Recognition
Minsu Kim, Hyung-Il Kim, Yong Man Ro
IEEE Transactions on Pattern Analysis and Machine Intelligence
[#157] Advancing Causal Intervention in Image Captioning with Causal Prompt
Youngjoon Yu, Yeonju Kim, Yong Man Ro
IEEE Transactions on Neural Networks and Learning Systems
[#156] Textless Unit-to-Unit training for Many-to-Many Multilingual Speech-to-Speech Translation
Minsu Kim, Jeongsoo Choi, Dahun Kim, Yong Man Ro
IEEE Transactions on Audio, Speech, and Language Processing, vol. 32, pp. 3934-3946, 2024. / Demo
[#155] Text-Guided Distillation Learning to Diversify Video Embeddings for Text-Video Retrieval
Sangmin Lee, Hyung-Il Kim, Yong Man Ro
Pattern Recognition, vol. 156, no. 3, pp. 110754, 2024.
[#154] Robust Pedestrian Detection via Constructing Versatile Pedestrian Knowledge Bank
Sungjune Park*, Hyunjun Kim*, Yong Man Ro (* equal contributor)
Pattern Recognition, vol. 153, no. 4, pp. 110539, 2024.
[#153] Integrating Language-Derived Appearance Elements with Visual Cues in Pedestrian Detection
Sungjune Park*, Hyunjun Kim*, Yong Man Ro (* equal contributor)
IEEE Transactions on Circuits and Systems for Video Technology, pp. 1-1, 2024.
[#152] AKVSR: Audio Knowledge Empowered Visual Speech Recognition by Compressing Audio Knowledge of a Pretrained Model
Jeong Hun Yeo, Minsu Kim, Jeongsoo Choi, Dae Hoe Kim, and Yong Man Ro
IEEE Transactions on Multimedia, vol. 26, pp. 6462-6474, 2024.