A Comprehensive Guide to Contemporary AI Safety
Safer Agentic AI
Principles and Responsible Practices
A practical guide to governing, aligning, and securing autonomous AI systems. Written for policymakers, developers, and leaders navigating the challenges of increasingly capable AI.
What Experts Are Saying
Praise from leaders in AI, ethics, and technology governance
WHAT IS AGENTIC AI?
Agentic AI systems set and pursue goals, adapt to situations, and make decisions autonomously. Unlike AI that merely recommends, these systems act directly in the world: breaking down tasks, experimenting, and adjusting to feedback. This shift from advice to action is why safety becomes critical: mistakes can compound before humans notice or intervene. Examples include self-driving cars and AI systems that manage logistics in real time.
Look Inside the Book
A comprehensive guide spanning AI fundamentals to practical governance frameworks
Table of Contents
Part 1: The Dawn of Agency
- Ch 1: AI Overview. The journey to modern AI, from rule-based systems to neural networks.
- Ch 2: The Age of Agentic AI. Understanding autonomous systems that can perceive, reason, and act.
Part 2: The Core Alignment Problem
- Ch 3: Goal Alignment of Agentic AI Systems. Ensuring AI systems reliably pursue intended objectives.
- Ch 4: Value Alignment and Embedding Ethics. Value learning, preference aggregation, and ethical frameworks.
- Ch 5: Deceptive and Power-Seeking AI. How AI systems can learn to deceive and accumulate influence.
- Ch 6: Utility Convergence. Common patterns in AI reasoning and emergent behaviours.
Part 3: The Governance and Safety Toolkit
- Ch 7: Safety Fundamentals. Core principles for building AI systems we can trust.
- Ch 8: Security and Guardrails. Fail-safe defaults, fault tolerance, and runtime hardening.
- Ch 9: Agentic Conflict. Multi-agent dynamics and coordination challenges.
- Ch 10: Transparency and Interpretability. Explainable AI, decision logs, and interpretable models.
- Ch 11: AI Governance. Policy frameworks, accountability structures, and oversight.
- Ch 12: Weighted Capability Governance. Measuring and controlling AI systems by their capabilities.
Part 4: The Human and Societal Interface
- Ch 13: Working with Agentic AI. Collaborative workflows, human-machine teaming, and trust.
- Ch 14: Psychosecurity. AI's impact on cognition, manipulation, and mental wellbeing.
- Ch 15: Who Goes There? Identity, authentication, and distinguishing humans from AI.
- Ch 16: Self-Watching Machines. AI monitoring, oversight systems, and recursive safety.
- Ch 17: Sustainability and Resource Stewardship. Environmental impact and responsible resource use.
Part 5: The Uncharted Frontier
- Ch 18: Disruptive Waves. Emerging trends and accelerating capabilities.
- Ch 19: Frontier Concerns. Advanced risks and long-term safety challenges.
- Ch 20: Superintelligence Strategy. Preparing for transformative AI capabilities.
- Ch 21: Ten Principles for a Positive Future. Actionable guidelines for responsible AI development.
From the Foreword
"This timely volume addresses one of our field's most daunting challenges: ensuring the safe and beneficial development of increasingly autonomous AI systems. The authors combine technical expertise and ethical insight in a comprehensive framework... This book is invaluable for AI researchers, developers, policymakers, business leaders and anyone invested in the responsible future of artificial intelligence."
Explore More Resources
KEY FOCUS AREAS
Our framework addresses nine critical dimensions for building safe and beneficial agentic AI systems.
Goal Alignment
Ensuring robust alignment between operational goals and human values.
Value Alignment
Identifying, codifying, and maintaining human values in AI systems.
Safe Operations
Ensuring safe operations throughout the system lifecycle.
Epistemic Hygiene
Maintaining cognitive clarity and accurate information management.
Transparency
Creating clear, interpretable rationales for AI reasoning processes.
Goal Termination
Implementing proper protocols for task completion and system sunsetting.
Security
Implementing comprehensive protection against threats and vulnerabilities.
Contextual Understanding
Maintaining accurate situational awareness and appropriate controls across operational contexts.
Responsible Governance
Establishing accountability, compliance, and oversight frameworks for responsible deployment.
Built on Rigorous Research
The book expands on a comprehensive framework developed by our 25-member Working Group of international experts in AI, ethics, law, and safety engineering.
Explore the Full Framework
About the Authors
Leaders in AI safety, ethics, and governance
OUR WORKING GROUP
Experts from diverse fields—AI, ethics, law, social sciences, and safety engineering—have contributed their time and expertise to develop this framework. We are deeply grateful for their engagement, ideas, and contributions.
Regular Contributors
- Ali Hessami
- Matthew Newman
- Sara El-Deeb
- Farhad Fassihi
- Mert Cuhadaroglu
- Scott David
- Hamid Jahankhani
- Nell Watson
- Sean Moriarty
- Isabel Caetano
- Roland Pihlakas
- Vassil Tashev
- Keeley Crockett
- Safae Essafi
- Zvikomborero Murahwi
- Lubna Dajani
- Salma Abbasi
Occasional Contributors
- Aisha Gurung
- Leonie Koessler
- Pramod Misra
- Aleksander Jevtic
- McKenna Fitzgerald
- Pranav Gade
- Alina Holcroft
- Michael O'Grady
- Rebecca Hawkins
- Md Atiqur R. Ahad
- Mrinal Karvir
- Sai Joseph
- Chantell Murphy
- Nikita Tiwari
- Tim Schreier
- Katherine Evans
- Patricia Shaw
JOIN OUR COMMUNITY
The Safer Agentic AI Community of Practice is a global network of experts building practical safety guidelines for autonomous AI systems. Subscribe for updates or join our LinkedIn group to connect with fellow practitioners.
Stay Updated
Get the latest on our research, framework updates, and book news.
Connect With Us
Join our LinkedIn group to engage with AI safety experts, share insights, and stay connected with the community.
HAVE ANY QUESTIONS OR IDEAS?
Use the form below to get in touch with us.
We welcome feedback, questions, and collaborative opportunities.