Modeling: MultiModal AI
aka, How to AI (Almost) Anything
MAS.S60, 6.S985, 2.S971(U), 2.S793(G) • Spring 2026 • MIT
Artificial Intelligence (AI) holds great promise to enhance digital productivity, physical interactions, overall well-being, and the human experience. To realize this impact, AI systems will need to be grounded in many real-world data modalities, moving from language-only systems to ones that holistically integrate vision, audio, sensors, medical data, music, art, smell, taste, and more. This course introduces the principles of multimodal AI systems that can process many modalities at once, such as connecting language and images, music and art, or sensing and actuation. We will cover AI methods to (1) represent and fuse heterogeneous and interconnected data sources, (2) align data across different views, (3) reason over multiple steps with many modalities, (4) generate new multimodal content, (5) transfer knowledge from high-resource to low-resource data, and (6) quantify the principles of multimodal AI for safe, ethical, and human-aligned deployment.
Content will be delivered via two 1.5-hour lectures per week. Through lectures, homework assignments, readings, and a significant research component, this course will develop critical thinking skills and intuitions for applying AI to new data modalities and their combinations, knowledge of recent technical achievements in AI, and a deeper understanding of the AI research process. Students will complete hands-on intermediate assignments on applying AI to their data modalities and tasks of interest, culminating in a novel research project. Course projects will be done in teams on a research topic in AI for new data modalities and/or multimodal AI, and topics must be pre-approved by the instructors.
- Time: Tuesdays and Thursdays 2:30-4pm
- Location: MIT Media Lab E14-633
- Instructor: Paul Liang
- Email: ppliang@mit.edu
- Instructor: Dimitris Bertsimas
- Email: dbertsim@mit.edu
- Instructor: Jinhua Zhao
- Email: jinhua@mit.edu
- Instructor: Sang-Gook Kim
- Email: sangkim@mit.edu
- TA: Edgar Morfin
- Email: emorfin@mit.edu
- TA: David Dai
- Email: dvdai@mit.edu
Announcements
| Jan 27, 2026 | Welcome to Modeling: MultiModal AI, Spring 2026! |