SocialNav: Training Human-Inspired Foundation Model for Socially-Aware Embodied Navigation

Ziyi Chen*, Yingnan Guo*, Zedong Chu, Minghua Luo, Yanfen Shen, Mingchao Sun, Junjun Hu, Shichao Xie, Kuan Yang, Pei Shi, Zhining Gu, Lu Liu, Honglin Han, Xiaolong Wu, Mu Xu, Yu Zhang
AMAP, Alibaba Group, China; Zhejiang University, China
*Equal Contribution Corresponding Author

SocialNav combines high-level semantic reasoning with low-level trajectory generation. It identifies socially traversable zones, generates chain-of-thought (CoT) explanations, and plans routes that respect social norms.

Abstract

Embodied navigation that adheres to social norms remains an open research challenge. We present SocialNav, a foundation model for socially-aware navigation with a hierarchical "brain-action" architecture that both understands high-level social norms and generates low-level, socially compliant trajectories. To enable these dual capabilities, we construct the SocNav Dataset, a large-scale collection of 7 million samples comprising (1) a Cognitive Activation Dataset that provides social reasoning signals such as chain-of-thought explanations and social traversability prediction, and (2) an Expert Trajectories Pyramid that aggregates diverse navigation demonstrations from internet videos, simulated environments, and real-world robots. We propose a multi-stage training pipeline that gradually injects and refines navigation intelligence: we first inject general navigation skills and an understanding of social norms into the model via imitation learning, and then refine these skills through Socially-Aware Flow Exploration GRPO (SAFE-GRPO), the first flow-based reinforcement learning framework for embodied navigation that explicitly rewards socially compliant behaviors. SocialNav achieves gains of +38% in success rate and +46% in social compliance rate over the state-of-the-art method, demonstrating strong improvements in both navigation performance and social compliance.

Socially-Aware Navigation in Real-World Environments.

SocNav Dataset and Benchmark

Overview of the SocNav Dataset and Benchmark

Hierarchical structure and data construction pipeline of the SocNav Dataset, which is composed of the Expert Trajectories Pyramid (ETP) and the Cognitive Activation Dataset (CAD). The SocNav Benchmark is a high-fidelity evaluation platform for comprehensive assessment of socially-aware navigation.

Method

SocialNav Architecture and Training Pipeline

SocialNav adopts a hierarchical "brain-action" architecture: a Brain based on a vision-language model (VLM) performs high-level semantic reasoning, and an action expert generates socially compliant trajectories. Training proceeds in three stages: Pre-training, Fine-tuning, and SAFE-GRPO.
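
As a rough illustration of the final training stage, the sketch below shows the group-relative advantage computation that GRPO-style methods generally use: several trajectories are sampled for the same observation, each is scored, and the scores are standardized within the group. This is not the SAFE-GRPO implementation. The reward function `toy_reward`, its weight `w_social`, and the `compliant_fraction` input are hypothetical names introduced only for this example, and the flow-based policy itself is omitted.

```python
import numpy as np


def group_relative_advantages(rewards, eps=1e-8):
    """Standardize rewards within one group of sampled trajectories."""
    rewards = np.asarray(rewards, dtype=np.float64)
    # Subtract the group mean and divide by the group standard deviation;
    # eps avoids division by zero when all rewards in the group are equal.
    return (rewards - rewards.mean()) / (rewards.std() + eps)


def toy_reward(reached_goal, compliant_fraction, w_social=1.0):
    """Hypothetical reward (an assumption, not taken from the paper):
    goal success plus a weighted social-compliance term in [0, 1]."""
    return float(reached_goal) + w_social * compliant_fraction


# One group: four trajectories sampled for the same observation and goal.
# Each tuple is (reached_goal, fraction of waypoints in traversable zones).
group = [(True, 1.0), (True, 0.4), (False, 0.9), (False, 0.2)]
rewards = [toy_reward(goal, frac) for goal, frac in group]
advantages = group_relative_advantages(rewards)

for (goal, frac), adv in zip(group, advantages):
    print(f"goal={goal!s:5} compliant={frac:.1f} advantage={adv:+.2f}")
```

Under these assumptions, a trajectory that reaches the goal while staying in socially traversable zones receives the largest advantage in its group, and a shortcut that reaches the goal with low compliance is ranked below it.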

Qualitative Results

Qualitative comparison on the SocNav Benchmark

Qualitative comparison in three scenes (Crossing, Park, Campus). Our method (green) consistently adheres to social norms such as staying on sidewalks, while the baseline (red) often takes socially risky shortcuts.

BibTeX

@article{chen2025socialnav,
      title={SocialNav: Training Human-Inspired Foundation Model for Socially-Aware Embodied Navigation}, 
      author={Ziyi Chen and Yingnan Guo and Zedong Chu and Minghua Luo and Yanfen Shen and Mingchao Sun and Junjun Hu and Shichao Xie and Kuan Yang and Pei Shi and Zhining Gu and Lu Liu and Honglin Han and Xiaolong Wu and Mu Xu and Yu Zhang},
      journal={arXiv preprint arXiv:2511.21135},
      year={2025}
}