** Exact topics and schedule are subject to change based on interest and time. **
| 1 |
Introduction [slides]
- What is Multimodal? Definitions, dimensions of heterogeneity and cross-modal interactions.
- Historical view and multimodal research tasks.
- Core technical challenges: representation, alignment, transference, reasoning, generation, and quantification.
|
|
| 2 |
Multimodal challenges
- Why is multimodal hard? Introduction to core challenges.
- Overview of multimodal representation, alignment, reasoning, transference, generation, and quantification.
- Identifying recent solutions for practitioners.
|
|
| 3 |
Recent advances in multimodal AI
- Multimodal transformers and foundation models
- Multimodal generative models
- Multimodal agents
|
|
| 4 |
Multimodal AI for Human Sensing [slides]
- Sensor data synthesis: Video to Doppler, Video to IMU, Video to Audio, MoCap to IMU, MoCap to UWB
- Data augmentation
- Temporal data modeling
|
|
| 5 |
Ethics, interpretability and privacy
- Privacy and fairness concerns
- Handling errors and uncertainty
- Bringing humans into the loop
|
|
| 6 |
Applications
- Human activity recognition, pose estimation, gesture recognition
- Infrastructure and environmental sensing
- Wellness and fitness tracking, mobile health monitoring
|
|
| 7 |
Hardware and Sensors for Multimodal AI [slides]
- Challenges and opportunities in hardware and sensors for multimodal AI
- Importance of scalable, customizable hardware platforms
- Key applications benefiting from advancements in multimodal sensing and feedback interfaces
|
|
| 8 |
Multimodal sensing and feedback interfaces
- Advanced fabrication techniques for multimodal sensing hardware
- Addressing scalability, adaptability, and customization in hardware development
- Innovations in state-of-the-art data acquisition systems for multimodal interfaces
- Integration of diverse sensing modalities into compact, flexible form factors
|
|
| 9 |
Multisensory data fusion
- Approaches to synchronize and interpret multimodal data
- Strategies for enhancing signal quality and accuracy
- Future directions enabled by multisensory interfaces
|
|