Multimodal Sparse Representation Learning and Applications
- First registered: 2019.11.16
- Date of writing: 2019.11
- 26 pages / Adobe PDF
Bibliographic Information
ㆍPublisher: 중앙대학교 인문콘텐츠연구소
ㆍJournal: 인공지능인문학연구, Vol. 2
ㆍAuthors: Miriam Cha, Youngjune L. Gwon, H. T. Kung
English Abstract
Sparse coding has been applied successfully to single-modality scenarios. We consider a sparse coding framework for multimodal representation learning. Our framework aims to capture semantic correlation between different data types via joint sparse coding. Such joint optimization induces a unified representation that is sparse and shared across modalities. In particular, we compute joint, cross-modal, and stacked cross-modal sparse codes. We find that these representations are robust to noise and provide greater flexibility in modeling features for multimodal input. A good multimodal framework should be able to fill in a missing modality given the other and improve representational efficiency. We demonstrate the missing-modality case through image denoising and show the effectiveness of cross-modal sparse codes in uncovering the relation between clean and corrupted image pairs. Furthermore, we experiment with multi-layer sparse coding to learn highly nonlinear relationships. The effectiveness of our approach is also demonstrated in multimedia event detection and retrieval on the TRECVID dataset (audio-video), category classification on the Wikipedia dataset (image-text), and sentiment classification on the PhotoTweet dataset (image-text).
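To make the joint and cross-modal coding idea concrete, the following is a minimal sketch (not the authors' implementation) of joint sparse coding using scikit-learn: feature vectors from two modalities are stacked per sample so that a single dictionary and a single shared sparse code are learned, and the modality-specific halves of that dictionary are then reused to infer codes from one modality alone and reconstruct the other. The feature matrices X_img and X_txt, the dictionary size, and the sparsity weight are illustrative assumptions, not values from the paper.

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning, SparseCoder

# Hypothetical per-sample feature matrices for two modalities.
rng = np.random.RandomState(0)
n_samples, d_img, d_txt = 200, 64, 32
X_img = rng.randn(n_samples, d_img)   # e.g., image features
X_txt = rng.randn(n_samples, d_txt)   # e.g., text features

# Joint sparse coding: stack both modalities per sample and learn one
# dictionary, so each sparse code is shared across modalities.
X_joint = np.hstack([X_img, X_txt])   # shape (n_samples, d_img + d_txt)
dl = DictionaryLearning(n_components=128, alpha=1.0, max_iter=50,
                        transform_algorithm="lasso_lars", random_state=0)
Z = dl.fit_transform(X_joint)         # shared sparse codes
D = dl.components_                    # joint dictionary, (128, d_img + d_txt)

# Split the joint dictionary into modality-specific sub-dictionaries.
D_img, D_txt = D[:, :d_img], D[:, d_img:]

# Cross-modal use: infer codes from the image modality alone, then
# reconstruct the (missing) text modality from the same codes.
coder = SparseCoder(dictionary=D_img, transform_algorithm="lasso_lars",
                    transform_alpha=1.0)
Z_from_img = coder.transform(X_img)
X_txt_hat = Z_from_img @ D_txt        # estimate of the missing modality
```

Under this reading, the stacked cross-modal variant mentioned in the abstract would, roughly, treat the resulting codes as input features to a further sparse coding layer, allowing more nonlinear cross-modal relationships to be captured.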