Safety in Embodied AI: A Survey of Risks, Attacks, and Defenses

June 1, 2026

License: CC BY 4.0

A comprehensive survey and the first unified safety framework for embodied AI, covering 500+ key works across perception, cognition, planning, interaction, and agentic systems.

[arXiv] | [Website]

Authors

Xiao Li1,*, Xiang Zheng4,*, Yifeng Gao1, Xinyu Xia5, Yixu Wang1, Xin Wang1, Ye Sun1, Yunhan Zhao1, Ming Wen1,3, Jiayu Li1, Zixing Chen1, Xun Gong5, Yi Liu4, Yige Li6, Yutao Wu7, Cong Wang4, Jun Sun6, Yixin Cao1,2,3, Zhineng Chen1,3, Jingjing Chen1,3, Tao Gui1,2,3, Qi Zhang1,3, Zuxuan Wu1,2,3, Xipeng Qiu1,2,3, Xuanjing Huang1,3, Tiehua Zhang8, Zhipeng Wei10, Kun Wang11, Xinfeng Li11, Hanxun Huang13, Sarah Erfani13, James Bailey13, Jianping Wang4, Chaowei Xiao14, Ran He12, Bo Li9, Xingjun Ma1,2,3,†, Yu-Gang Jiang1,3,†

1Institute of Trustworthy Embodied AI, Fudan University, 2Shanghai Innovation Institute, 3Shanghai Key Laboratory of Multimodal Embodied AI, 4City University of Hong Kong, 5Jilin University, 6Singapore Management University, 7Deakin University, 8Tongji University, 9UIUC, 10UC Berkeley, 11Nanyang Technological University, 12Chinese Academy of Sciences, 13The University of Melbourne, 14Johns Hopkins University

*Equal Contribution, †Corresponding Authors

🔥 News

  • [2026/06/01] Added an interactive At a Glance dashboard to the project page: papers-per-year, a clickable taxonomy sunburst, and venue-type and top-venue charts, all rendered live from the paper list and cross-filtering it on click. Author citations across the list were also standardized.
  • [2026/05/30] Added a Paper Reader page (papers.html) for browsing abstracts and keywords, with search across titles, authors, venues, and keywords, plus taxonomy-layer filtering.
  • [2026/05/25] Featured by Chinese AI & tech media: 机器之心, 专知, 机器学习算法与自然语言处理, AI与安全, AI思想会, 非具身不智能, 诺亚星辰, and 小红书.
  • [2026/05/24] arXiv v2 released. Integrated further 2026-04 / 2026-05 arXiv papers across all 5 layers; re-verified URLs across the paper list, fixing title truncations and switching Google Scholar links to arXiv where available. The survey now indexes 500+ papers and has 38 co-authors from 13 institutions.
  • [2026/05/11] Updated with 29 new arXiv papers (2026-04 / 2026-05) across all 5 layers, including HazardArena, RedVLA, JailWAM, DTap, IPI-in-Wild, MCP function-hijacking, and skill-safety literature; renamed "Tool Use" to "Tool Use and Skill" to track the agentic-skill threat surface. At that point the survey indexed 481 papers and had 38 co-authors from 13 institutions.
  • [2026/05/09] Paper posted on arXiv.
  • [2026/04/01] Beautified paper list with layer icons and visual separators.
  • [2026/03/31] Added llms.txt and SEO meta tags for AI discoverability.
  • [2026/03/28] Added 11 missing safety papers; unified paper counts to 400+.
  • [2026/03/27] Repository and paper released!
  • [2026/03/27] Launched project website with GitHub Pages.
  • [2026/03/27] Added automated paper review GitHub Action for community contributions.
  • [2026/03/26] ISC-Bench paper on arXiv -- 400+ stars in 48 hours!
  • [2026/03/22] ISC-Bench repository released -- Internal Safety Collapse benchmark for frontier LLMs.
  • [2025/09/15] Safety at Scale survey published in Foundations and Trends in Security.
  • [2025/02/02] Safety at Scale survey on arXiv -- large model & agent safety.

Overview

Embodied AI integrates perception, cognition, planning, and interaction into agents that operate in open-world, safety-critical environments. As these systems gain autonomy and enter domains such as autonomous driving, healthcare, and robotics, ensuring their safety becomes both technically challenging and socially indispensable.

Capability-Risk Duality: each layer of the embodied pipeline represents a capability expansion that introduces corresponding new vulnerabilities.

Capability vs. risk duality in embodied AI systems. As capabilities expand outward from perception to agentic systems, the attack surface grows correspondingly -- vulnerabilities at inner layers cascade to outer layers.

Illustration of safety threats and attack surfaces across capability layers of embodied AI systems.

Overview of representative attack and defense methods across perception, cognition, planning, action & interaction, and agentic system layers. The width of the strips is proportional to the number of reviewed works.

Surveyed Papers

We review 533 papers across five capability layers of embodied AI. The taxonomy below organizes the core attack and defense literature; the remaining cited works (background and capability models, datasets, simulators, and related surveys) are collected under Other Related Works.

Layer | Subcategories | Papers
👁️ Perception | Visual, Auditory, Spatial, Motion, Cross-Modal | 195
🧠 Cognition | Instruction Understanding, World Model, Reasoning | 37
🗺️ Planning | Task, Trajectory, Multi-Agent | 81
🤖 Action and Interaction | Robot Control, Human-Agent, Multi-Agent Collaboration | 109
Agentic | Tool Use and Skill, Memory, Self-Evolving, Cascading Risks | 91

👁️ Perception (199 papers)
Visual Perception (58)
Auditory Perception (21)
Spatial Perception (61)
Motion Perception (48)
Cross-Modal Perception (11)

🧠 Cognition (38 papers)
Instruction Understanding (16)
World Model (18)
Reasoning (4)

🗺️ Planning (80 papers)
Task Planning (32)
Trajectory Planning (34)
Multi-Agent Planning (14)

🤖 Action and Interaction (112 papers)
Robot Control (97)
Human-Agent Interaction (12)
Multi-Agent Collaboration (3)

Agentic (96 papers)
Tool Use and Skill (22)
Memory (22)
Self-Evolving (17)
Cascading Risks (35)

Other Related Works (33 papers)
33 cited works beyond the five-layer attack and defense taxonomy, grouped by type.

Surveys & Reviews (13)
Benchmarks & Datasets (2)
Foundation, World, World-Action & VLA Models (10)
Other & Foundational (8)

Open Challenges

  • Multimodal Perception Fusion Fragility: Cross-modal attacks exploiting inconsistencies between visual, auditory, and spatial perception remain underexplored.
  • Planning Under Jailbreak: LLM-based planners are vulnerable to instruction manipulation that bypasses safety constraints in physical execution.
  • Human-Agent Interaction Trust: Open-ended scenarios where agents must negotiate trust with humans lack standardized safety evaluation.
  • Agentic Cascading Failures: Self-evolving agents with persistent memory and tool use can propagate inner-layer compromises to system-wide failures.
  • Benchmark Standardization: Lack of unified safety benchmarks across the full embodied AI pipeline hinders reproducible evaluation.

Contributing

Contributions are welcome and encouraged! If you find relevant papers missing from our list or spot any errors:

  • Add a paper: Open a pull request with the paper title, authors, venue, year, and a Google Scholar link.
  • Report an issue: Open an issue describing what needs to be corrected or added.
  • Suggest a category: If a paper does not fit existing subcategories, propose a new one in your PR description.

Please follow the existing format: [Paper Title](Google Scholar link). Authors. *Venue*, Year.
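
As a rough illustration of the entry format above, the following Python sketch renders one list entry from its fields. The `PaperEntry` class and `format_entry` function are hypothetical helpers written for this example; they are not part of the repository or its review GitHub Action. The trailing period is an assumption read off the format line, and the sample entry uses the arXiv URL from the Citation section in place of a Google Scholar link.

```python
from dataclasses import dataclass


@dataclass
class PaperEntry:
    """Fields requested for a new paper (hypothetical container, not repo code)."""
    title: str
    url: str      # Google Scholar link (or arXiv link where available)
    authors: str  # author names as a single string
    venue: str
    year: int


def format_entry(paper: PaperEntry) -> str:
    """Render one entry as: [Paper Title](link). Authors. *Venue*, Year."""
    return (
        f"[{paper.title}]({paper.url}). "
        f"{paper.authors}. *{paper.venue}*, {paper.year}."
    )


if __name__ == "__main__":
    # Sample entry built from this survey's own citation details.
    example = PaperEntry(
        title="Safety in Embodied AI: A Survey of Risks, Attacks, and Defenses",
        url="https://arxiv.org/abs/2605.02900",
        authors="Xiao Li, Xiang Zheng, et al.",
        venue="arXiv preprint",
        year=2026,
    )
    print(format_entry(example))
```

Keeping the fields in a small dataclass makes it easy to check that title, authors, venue, year, and link are all present before opening a pull request.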

Citation

If you find this survey useful, please cite our paper:

@article{li2026safety,
  title={Safety in Embodied AI: A Survey of Risks, Attacks, and Defenses},
  author={Li, Xiao and Zheng, Xiang and Gao, Yifeng and Xia, Xinyu and Wang, Yixu and Wang, Xin and Sun, Ye and Zhao, Yunhan and Wen, Ming and Li, Jiayu and Chen, Zixing and Gong, Xun and Liu, Yi and Li, Yige and Wu, Yutao and Wang, Cong and Sun, Jun and Cao, Yixin and Chen, Zhineng and Chen, Jingjing and Gui, Tao and Zhang, Qi and Wu, Zuxuan and Qiu, Xipeng and Huang, Xuanjing and Zhang, Tiehua and Wei, Zhipeng and Wang, Kun and Li, Xinfeng and Huang, Hanxun and Erfani, Sarah and Bailey, James and Wang, Jianping and Xiao, Chaowei and He, Ran and Li, Bo and Ma, Xingjun and Jiang, Yu-Gang},
  journal={arXiv preprint arXiv:2605.02900},
  year={2026},
  url={https://arxiv.org/abs/2605.02900}
}

From the same team:

Project | Description
ISC-Bench | Internal Safety Collapse in Frontier LLMs
Awesome-Large-Model-Safety | Safety at Scale: A Comprehensive Survey of Large Model and Agent Safety
BackdoorLLM | A Comprehensive Benchmark for Backdoor Attacks on LLMs (NeurIPS 2025)
BackdoorAgent | Backdoor Attacks on LLM-based Agent Workflows (ACL Findings 2026)
JustAsk | Curious Code Agents Reveal System Prompts in Frontier LLMs (ICML 2026)
Unlearnable-Examples | Making Personal Data Unexploitable (ICLR 2021)
XTransferBench | Super Transferable Adversarial Attacks on CLIP (ICML 2025)
