  • SPS Members: $20.00
  • IEEE Members: $30.00
  • Non-members: $40.00
  • Length: 3:06:38
Short Course 08 Oct 2023

Our experience of the world is multimodal: we see things, hear sounds, smell odors, and so on. A modality is a way in which the world can be sensed and experienced. In machine learning, a modality is a type of data that a model can process, such as audio, images, or text. Each modality has its own characteristics and properties, and each requires different processing and analysis to extract useful information. Multimodal learning is a paradigm that combines multiple modalities of data, for example audio with images or images with text, to improve a model's performance. The underlying idea is that different modalities provide complementary cues that help a model make more accurate predictions or decisions. For example, a model that processes both images and text can better understand the context of an image and make more accurate predictions about it. Given the importance of multimodal learning, we have designed this course to acquaint participants with the latest research trends and applications in the field. The short course provides a detailed, principled introduction to multimodal learning and the rationale behind it. It also discusses applications and open research problems that participants can pursue.