Multimodal scene understanding : algorithms, applications and deep learning / edited by Michael Ying Yang, Bodo Rosenhahn, Vittorio Murino.

Bibliographic Details
Online Access: Full Text (via O'Reilly/Safari)
Other Authors: Yang, Michael Ying (Editor), Rosenhahn, Bodo (Editor), Murino, Vittorio (Editor)
Format: eBook
Language: English
Published: London ; San Diego, CA : Academic Press, [2019]
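Subjects: Computational intelligence; Computer vision; Algorithms; Engineering; Artificial intelligence; Computer algorithms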

MARC

LEADER 00000cam a2200000 i 4500
001 b11379387
006 m o d
007 cr |||||||||||
008 190718s2019 enkab ob 001 0 eng d
005 20240829145919.6
015 |a GBB9C9474  |2 bnb 
016 7 |a 019475544  |2 Uk 
019 |a 1122796018  |a 1125218264  |a 1125965489  |a 1127109461  |a 1135854093  |a 1178893943  |a 1192341476 
020 |a 9780128173596  |q (electronic book) 
020 |a 0128173599  |q (electronic book) 
020 |a 9780128173589  |q (electronic book) 
020 |a 0128173580  |q (electronic book) 
029 1 |a AU@  |b 000065674606 
029 1 |a AU@  |b 000066136283 
029 1 |a AU@  |b 000066144916 
029 1 |a AU@  |b 000066233062 
029 1 |a UKMGB  |b 019475544 
029 1 |a AU@  |b 000068845859 
029 1 |a DKDLA  |b 820030-katalog:2534111 
035 |a (OCoLC)safo1109390062 
035 |a (OCoLC)1109390062  |z (OCoLC)1122796018  |z (OCoLC)1125218264  |z (OCoLC)1125965489  |z (OCoLC)1127109461  |z (OCoLC)1135854093  |z (OCoLC)1178893943  |z (OCoLC)1192341476 
037 |a safo9780128173596 
040 |a N$T  |b eng  |e rda  |e pn  |c N$T  |d EBLCP  |d N$T  |d UKMGB  |d OCLCF  |d OPELS  |d YDXIT  |d UKAHL  |d YDX  |d OCLCQ  |d OCL  |d SFB  |d OCLCQ  |d SFB  |d VT2  |d OCLCQ  |d OCLCO  |d S2H  |d OCLCO  |d OCLCQ  |d OCL  |d OCLCO  |d OCLCQ  |d SXB  |d OCLCQ  |d OCLCO 
049 |a GWRE 
050 4 |a Q342  |b .M85 2019 
245 0 0 |a Multimodal scene understanding :  |b algorithms, applications and deep learning /  |c edited by Michael Ying Yang, Bodo Rosenhahn, Vittorio Murino. 
264 1 |a London ;  |a San Diego, CA :  |b Academic Press,  |c [2019] 
300 |a 1 online resource (ix, 412 pages) :  |b illustrations (some color), maps 
336 |a text  |b txt  |2 rdacontent 
337 |a computer  |b c  |2 rdamedia 
338 |a volume  |b nc  |2 rdacarrier 
504 |a Includes bibliographical references and index. 
505 0 |a Front Cover; Multimodal Scene Understanding; Copyright; Contents; List of Contributors; 1 Introduction to Multimodal Scene Understanding; 1.1 Introduction; 1.2 Organization of the Book; References; 2 Deep Learning for Multimodal Data Fusion; 2.1 Introduction; 2.2 Related Work; 2.3 Basics of Multimodal Deep Learning: VAEs and GANs; 2.3.1 Auto-Encoder; 2.3.2 Variational Auto-Encoder (VAE); 2.3.3 Generative Adversarial Network (GAN); 2.3.4 VAE-GAN; 2.3.5 Adversarial Auto-Encoder (AAE); 2.3.6 Adversarial Variational Bayes (AVB); 2.3.7 ALI and BiGAN 
505 8 |a 2.4 Multimodal Image-to-Image Translation Networks; 2.4.1 Pix2pix and Pix2pixHD; 2.4.2 CycleGAN, DiscoGAN, and DualGAN; 2.4.3 CoGAN; 2.4.4 UNIT; 2.4.5 Triangle GAN; 2.5 Multimodal Encoder-Decoder Networks; 2.5.1 Model Architecture; 2.5.2 Multitask Training; 2.5.3 Implementation Details; 2.6 Experiments; 2.6.1 Results on NYUDv2 Dataset; 2.6.2 Results on Cityscape Dataset; 2.6.3 Auxiliary Tasks; 2.7 Conclusion; References; 3 Multimodal Semantic Segmentation: Fusion of RGB and Depth Data in Convolutional Neural Networks; 3.1 Introduction; 3.2 Overview; 3.2.1 Image Classification and the VGG Network 
505 8 |a 3.2.2 Architectures for Pixel-level Labeling; 3.2.3 Architectures for RGB and Depth Fusion; 3.2.4 Datasets and Benchmarks; 3.3 Methods; 3.3.1 Datasets and Data Splitting; 3.3.2 Preprocessing of the Stanford Dataset; 3.3.3 Preprocessing of the ISPRS Dataset; 3.3.4 One-channel Normal Label Representation; 3.3.5 Color Spaces for RGB and Depth Fusion; 3.3.6 Hyper-parameters and Training; 3.4 Results and Discussion; 3.4.1 Results and Discussion on the Stanford Dataset; 3.4.2 Results and Discussion on the ISPRS Dataset; 3.5 Conclusion; References 
505 8 |a 4 Learning Convolutional Neural Networks for Object Detection with Very Little Training Data; 4.1 Introduction; 4.2 Fundamentals; 4.2.1 Types of Learning; 4.2.2 Convolutional Neural Networks; 4.2.2.1 Artificial neuron; 4.2.2.2 Artificial neural network; 4.2.2.3 Training; 4.2.2.4 Convolutional neural networks; 4.2.3 Random Forests; 4.2.3.1 Decision tree; 4.2.3.2 Random forest; 4.3 Related Work; 4.4 Traffic Sign Detection; 4.4.1 Feature Learning; 4.4.2 Random Forest Classification; 4.4.3 RF to NN Mapping; 4.4.4 Fully Convolutional Network; 4.4.5 Bounding Box Prediction; 4.5 Localization 
505 8 |a 4.6 Clustering; 4.7 Dataset; 4.7.1 Data Capturing; 4.7.2 Filtering; 4.8 Experiments; 4.8.1 Training and Test Data; 4.8.2 Classification; 4.8.3 Object Detection; 4.8.4 Computation Time; 4.8.5 Precision of Localizations; 4.9 Conclusion; Acknowledgment; References; 5 Multimodal Fusion Architectures for Pedestrian Detection; 5.1 Introduction; 5.2 Related Work; 5.2.1 Visible Pedestrian Detection; 5.2.2 Infrared Pedestrian Detection; 5.2.3 Multimodal Pedestrian Detection; 5.3 Proposed Method; 5.3.1 Multimodal Feature Learning/Fusion; 5.3.2 Multimodal Pedestrian Detection; 5.3.2.1 Baseline DNN model 
520 |a Multimodal Scene Understanding: Algorithms, Applications and Deep Learning presents recent advances in multi-modal computing, with a focus on computer vision and photogrammetry. It provides the latest algorithms and applications that involve combining multiple sources of information and describes the role and approaches of multi-sensory data and multi-modal deep learning. The book is ideal for researchers from the fields of computer vision, remote sensing, robotics, and photogrammetry, thus helping foster interdisciplinary interaction and collaboration between these realms. Researchers collecting and analyzing multi-sensory data collections - for example, KITTI benchmark (stereo+laser) - from different platforms, such as autonomous vehicles, surveillance cameras, UAVs, planes and satellites will find this book to be very useful. 
588 0 |a Online resource; title from digital title page (viewed on October 10, 2019). 
650 0 |a Computational intelligence. 
650 0 |a Computer vision. 
650 0 |a Algorithms. 
650 0 |a Engineering. 
650 0 |a Artificial intelligence. 
650 0 |a Computer algorithms. 
650 7 |a Computer algorithms  |2 fast 
650 7 |a Computer vision  |2 fast 
650 7 |a Algorithms  |2 fast 
650 7 |a Artificial intelligence  |2 fast 
650 7 |a Computational intelligence  |2 fast 
650 7 |a Engineering  |2 fast 
700 1 |a Yang, Michael Ying,  |e editor. 
700 1 |a Rosenhahn, Bodo,  |e editor.  |1 https://id.oclc.org/worldcat/entity/E39PCjxvBfRgQyKMbbDBD89GXm 
700 1 |a Murino, Vittorio,  |e editor.  |1 https://id.oclc.org/worldcat/entity/E39PCjrdXW8yKCrqqjpKXGtCpK 
776 0 8 |i Ebook version :  |z 9780128173596 
776 0 8 |i Print version:  |t Multimodal scene understanding.  |d London ; San Diego, CA : Academic Press, [2019]  |z 0128173580  |z 9780128173589  |w (OCoLC)1089504196 
856 4 0 |u https://go.oreilly.com/UniOfColoradoBoulder/library/view/~/9780128173596/?ar  |z Full Text (via O'Reilly/Safari) 
915 |a - 
956 |a O'Reilly-Safari eBooks 
956 |b O'Reilly Online Learning: Academic/Public Library Edition 
994 |a 92  |b COD 
998 |b Subsequent record output 
999 f f |i 9ae7b67d-7bb5-555c-8af1-6893974995b7  |s fa0745e8-909c-5773-8b17-7260d16f8c2d 
952 f f |p Can circulate  |a University of Colorado Boulder  |b Online  |c Online  |d Online  |e Q342 .M85 2019  |h Library of Congress classification  |i web  |n 1