We list here only project reports that students publicly released. Note that some of these links point to follow-up conference submissions, revised from the original project reports.

Phoebe Chua, Cathy Mengying Fang, Takehiko Ohkawa, Raja Kushalnagar, Suranga Nanayakkara, Pattie Maes. EmoSign: A Multimodal Dataset for Understanding Emotions in American Sign Language. arXiv 2025

Chenyu Zhang, Minsol Kim, Shohreh Ghorbani, Jingyao Wu, Rosalind Picard, Pattie Maes, Paul Pu Liang. When One Modality Sabotages the Others: A Diagnostic Lens on Multimodal Reasoning. NeurIPS 2025 Workshop

Shohreh Ghorbani, Chenyu Zhang, Minsol Kim, Jingyao Wu. Beyond Accuracy: A Diagnostic Protocol for Fairly Evaluating Multimodal Reasoning. NeurIPS 2025 Workshop

Haofei Yu, Zhengyang Qi, Lawrence Jang, Russ Salakhutdinov, Louis-Philippe Morency, Paul Pu Liang. MMoE: Enhancing Multimodal Models with Mixtures of Multimodal Interaction Experts. EMNLP 2024

Alex Wilf, Leena Mathur, Sheryl Mathew, Claire Ko, Youssouf Kebe, Paul Pu Liang, Louis-Philippe Morency. Social-IQ 2.0 Challenge: Benchmarking Multimodal Social Understanding. ICCV 2023 Challenge

Vedant Palit, Rohan Pandey, Aryaman Arora, Paul Pu Liang. Towards Vision-Language Mechanistic Interpretability: A Causal Tracing Tool for BLIP. ICCV 2023

Dong Won Lee, Chaitanya Ahuja, Paul Pu Liang, Sanika Natu, Louis-Philippe Morency. Multimodal Lecture Presentations Dataset: Understanding Multimodality in Educational Slides. ICCV 2023

Himanshu Thakur, Atishay Jain, Praneetha Vaddamanu, Paul Pu Liang, Louis-Philippe Morency. Language Models Get a Gender Makeover: Mitigating Gender Bias with Few-Shot Data Interventions. ACL 2023

Rohan Pandey, Rulin Shao, Paul Pu Liang, Ruslan Salakhutdinov, Louis-Philippe Morency. Cross-Modal Attention Congruence Regularization for Vision-Language Relation Alignment. ACL 2023

Seong Hyeon Park, Gyubok Lee, Manoj Bhat, Jimin Seo, Minseok Kang, Jonathan Francis, Ashwin Jadhav, Paul Pu Liang, Louis-Philippe Morency. Diverse and Admissible Trajectory Prediction through Multimodal Context Understanding. ECCV 2020

Ankit Shah, Vaibhav Vaibhav, Vasu Sharma, Mahmoud Alismail, Louis-Philippe Morency. Multimodal Behavioral Markers Exploring Suicidal Intent in Social Media Videos. ICMI 2019

Vasu Sharma, Ankita Kalra, Vaibhav, Simral Chaudhary, Labhesh Patel, Louis-Philippe Morency. Attend and Attack: Attention Guided Adversarial Attacks on Visual Question Answering Models. NeurIPS 2018 Workshop

Yash Patel, Lluis Gomez, Marçal Rusiñol, Dimosthenis Karatzas, C.V. Jawahar. Self-Supervised Visual Representations for Cross-Modal Retrieval

Hai Pham, Paul Pu Liang, Thomas Manzini, Louis-Philippe Morency, Barnabas Poczos. Found in Translation: Learning Robust Joint Representations by Cyclic Translations Between Modalities. AAAI 2019

Zhun Liu, Ying Shen, Varun Bharadhwaj Lakshminarasimhan, Paul Pu Liang, Amir Zadeh, Louis-Philippe Morency. Efficient Low-rank Multimodal Fusion with Modality-Specific Factors. ACL 2018

Devendra Singh Chaplot, Kanthashree Mysore Sathyendra, Rama Kumar Pasumarthi, Dheeraj Rajagopal, Ruslan Salakhutdinov. Gated-Attention Architectures for Task-Oriented Language Grounding. AAAI 2018

Minghai Chen, Sen Wang, Paul Pu Liang, Tadas Baltrušaitis, Amir Zadeh, Louis-Philippe Morency. Multimodal Sentiment Analysis with Word-Level Fusion and Reinforcement Learning. ICMI 2017

Junjie Hu, Desai Fan, Shuxin Yao, Jean Oh. Answer-Aware Attention on Grounded Question Answering in Images. AAAI 2017

Haohan Wang, Aaksha Meghawat, Louis-Philippe Morency, Eric P. Xing. Select-Additive Learning: Improving Generalization in Multimodal Sentiment Analysis. ICME 2017

If you previously took the MMAI course and released a public version of your project report, we would love to hear about it! Please contact the course instructor to be added to this list.