Socially-Intelligent AI

Intelligent agents that can comprehend and interact with humans in long-term multi-party social situations, fostering collaboration and growth in social relationships, while maintaining privacy, safety, and fairness for trusted interaction.

Building socially-intelligent AI is a multidisciplinary, multimodal research goal that involves creating agents that can sense, perceive, reason about, learn from, and respond to the affect, behavior, and cognition of other agents (human or artificial). This goal introduces new technical challenges for AI, including (1) ambiguity in social constructs, (2) nuanced behavioral signals, (3) multiple perspectives and experiences, and (4) agency and adaptation. The social contexts in which social-AI agents can be situated are diverse, with interactions differing across social settings, degrees of agent embodiment, and the social attributes of the humans involved. A vision paper we wrote on advances in social agents has inspired significant new directions in this field.

Our team has pioneered several key resources for socially-intelligent AI. Human Behavior Atlas is our latest unified benchmark for psychological and social behavior understanding: it spans over 100,000 videos with text, audio, and visual modalities, covers tasks on affective states, cognitive states, pathologies, and social processes, and enables unified multimodal, multitask foundation models for social intelligence. CMU-MOSEI remains one of the largest datasets for multimodal sentiment analysis and emotion recognition to date and has become the community standard for training AI to understand human multimodal language. Social-IQ is a question-answering benchmark for artificial social intelligence covering questions about social situations, human behaviors, mental states, traits, attitudes, and attributes. Finally, Social Genome curates human-annotated reasoning steps about social interactions that reference evidence from visual cues, verbal cues, vocal cues, and external knowledge, enabling fine-grained evaluation and reasoning-supervised training of socially-intelligent multimodal models.
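To make the structure of these benchmarks concrete, the sketch below shows one way a unified multimodal, multitask sample and its task-level scoring could be represented. This is a minimal illustration under our own assumptions: the class, field, and file names are hypothetical and do not correspond to the actual Human Behavior Atlas, Social-IQ, or Social Genome data formats or APIs.

```python
# Minimal sketch: a multimodal, multitask social-behavior sample and per-task scoring.
# All names below (SocialSample, evaluate, field names, file paths) are illustrative
# assumptions, not the released benchmark interfaces.
from dataclasses import dataclass, field
from typing import Dict, List
from collections import defaultdict


@dataclass
class SocialSample:
    """One annotated clip: raw modalities plus gold labels for one or more tasks."""
    video_path: str                      # visual modality
    transcript: str                      # text modality (verbal cues)
    audio_path: str                      # audio modality (vocal cues)
    labels: Dict[str, str]               # task name -> gold label, e.g. {"emotion": "joy"}
    reasoning_steps: List[str] = field(default_factory=list)  # evidence-grounded rationale


def evaluate(predictions: List[Dict[str, str]], samples: List[SocialSample]) -> Dict[str, float]:
    """Score one model across many social tasks by computing per-task accuracy."""
    correct, total = defaultdict(int), defaultdict(int)
    for pred, sample in zip(predictions, samples):
        for task, gold in sample.labels.items():
            total[task] += 1
            correct[task] += int(pred.get(task) == gold)
    return {task: correct[task] / total[task] for task in total}


if __name__ == "__main__":
    samples = [
        SocialSample(
            video_path="clip_001.mp4",
            transcript="I'm so proud of you!",
            audio_path="clip_001.wav",
            labels={"emotion": "joy", "sentiment": "positive"},
            reasoning_steps=["Speaker smiles while leaning forward (visual cue)",
                             "Warm, rising prosody (vocal cue)"],
        ),
    ]
    predictions = [{"emotion": "joy", "sentiment": "neutral"}]
    print(evaluate(predictions, samples))  # {'emotion': 1.0, 'sentiment': 0.0}
```

Keeping per-task labels in a single sample is what lets one foundation model be trained and evaluated jointly across affective, cognitive, and social-process tasks rather than one task at a time.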

A key part of socially-intelligent AI is its responsible deployment. Our group actively works to quantify and mitigate real-world societal concerns around bias, fairness, and privacy, to engage in participatory co-design with stakeholders, and to assist in developing policies around the real-world deployment of AI foundation models.
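As one concrete example of quantifying such concerns, the sketch below computes a per-group accuracy gap, a simple disparity measure often used as a first-pass fairness audit. The group names and data are hypothetical assumptions for illustration, not measurements or methods from the papers listed below.

```python
# Minimal sketch of one way to quantify bias: the accuracy gap across demographic
# groups. Groups and results here are hypothetical, for illustration only.
from collections import defaultdict
from typing import Dict, List, Tuple


def group_accuracy_gap(results: List[Tuple[str, bool]]) -> Dict[str, float]:
    """Given (group, prediction_was_correct) pairs, return per-group accuracy plus
    the max-min accuracy gap, a simple (not exhaustive) fairness indicator."""
    correct, total = defaultdict(int), defaultdict(int)
    for group, ok in results:
        total[group] += 1
        correct[group] += int(ok)
    accuracy = {group: correct[group] / total[group] for group in total}
    accuracy["gap"] = max(accuracy.values()) - min(accuracy.values())
    return accuracy


if __name__ == "__main__":
    # Hypothetical evaluation outcomes for two groups.
    results = [("group_a", True), ("group_a", True), ("group_b", True), ("group_b", False)]
    print(group_accuracy_gap(results))  # {'group_a': 1.0, 'group_b': 0.5, 'gap': 0.5}
```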

Key works:

Human Behavior Atlas: Benchmarking Unified Psychological and Social Behavior Understanding, arXiv 2025

MimeQA: Towards Socially-Intelligent Nonverbal Foundation Models, NeurIPS 2025

Mime videos isolate nonverbal communication, challenging AI to infer emotion, intent, and theory of mind from gesture and movement alone. We introduce MimeQA, a new video QA dataset to train AI with nonverbal social intelligence for natural human–AI interaction.

Social Genome: Grounded Social Reasoning Abilities of Multimodal Models, EMNLP 2025

Advancing Social Intelligence in AI Agents: Technical Challenges and Open Questions, EMNLP 2024

Towards Understanding and Mitigating Social Biases in Language Models, ICML 2021

Think Locally, Act Globally: Federated Learning with Local and Global Representations, NeurIPS 2020 Workshop on Federated Learning (Distinguished Student Paper)

Multimodal Language Analysis in the Wild: CMU-MOSEI Dataset and Interpretable Dynamic Fusion Graph, ACL 2018
