Chapter 2

Deep Learning for Multimodal Data Fusion

Asako Kanezaki^⁎; Ryohei Kuga^†; Yusuke Sugano^†; Yasuyuki Matsushita^†

^⁎National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
^†Graduate School of Information Science and Technology, Osaka University, Osaka, Japan

Abstract

Recent advances in deep learning have enabled realistic image-to-image translation of multimodal data. Along with this development, auto-encoders and generative adversarial networks (GANs) have been extended to handle multimodal input and output. At the same time, multitask learning has been shown to address multiple mutually related recognition tasks efficiently and effectively. Various scene understanding tasks, such as semantic segmentation and depth prediction, ...

Multimodal Scene Understanding by Michael Ying Yang, Bodo Rosenhahn, Vittorio Murino