Awesome Responsible AI

May 28, 2026

A curated list of awesome academic research, books, codes of ethics, courses, data sets, databases, frameworks, institutes, maturity models, newsletters, principles, podcasts, regulations, responsible scaling policies, reports, tools and standards related to Responsible, Trustworthy, and Human-Centered AI.

Main Concepts

What is AI Governance?

AI governance is a system of rules, processes, frameworks, and tools within an organization to ensure the ethical and responsible development of AI.

What is Human-Centered AI?

Human-Centered Artificial Intelligence (HCAI) is an approach to AI development that prioritizes human users' needs, experiences, and well-being.

What is Open Source AI?

When we refer to a “system,” we are speaking both broadly about a fully functional structure and its discrete structural elements. To be considered Open Source, the requirements are the same, whether applied to a system, a model, weights and parameters, or other structural elements.

An Open Source AI is an AI system made available under terms and in a way that grant the freedoms to:

Use the system for any purpose and without having to ask for permission.
Study how the system works and inspect its components.
Modify the system for any purpose, including to change its output.
Share the system for others to use with or without modifications, for any purpose.

Source

What is Responsible AI?

Responsible AI (RAI) refers to the development, deployment, and use of artificial intelligence (AI) systems in ways that are ethical, transparent, accountable, and aligned with human values.

What is a Responsible AI framework?

Responsible AI frameworks often encompass guidelines, principles, and practices that prioritize fairness, safety, and respect for individual rights.

What is Trustworthy AI?

Trustworthy AI (TAI) refers to artificial intelligence systems designed and deployed to be transparent, robust and respectful of data privacy.

Why is Responsible, Trustworthy, and Human-Centered AI important?

AI is a transformative, double-edged technology poised to reshape industries, yet it requires careful governance to balance the benefits of automation and insight with protections against unintended social, economic, and security impacts. You can read more about the current wave here.

Content

Academic Research
Books
Code of Ethics
Courses
Data Sets
Databases
Frameworks
Institutes
Maturity Models
Newsletters
Principles
Podcasts
Regulations
Responsible Scaling Policies
Reports
Standards
Tools
Citing this repository

Academic Research

Adversarial ML

Oprea, A. et al. (2023). Adversarial machine learning: A taxonomy and terminology of attacks and mitigations. National Institute of Standards and Technology. Article

Artificial General Intelligence (AGI)

Hendrycks, D. et al. (2025). A definition of AGI. Article

Artificial Intelligence Governance (AI Governance)

Eisenberg, I. W. et al. (2025). The Unified Control Framework: Establishing a Common Foundation for Enterprise AI Governance, Risk Management and Regulatory Compliance. arXiv preprint arXiv:2503.05937. Article Visualization Credo

Bias

Schwartz, R. et al. (2022). Towards a standard for identifying and managing bias in artificial intelligence (Vol. 3, p. 00). US Department of Commerce, National Institute of Standards and Technology. Article NIST

Challenges

D'Amour, A. et al. (2022). Underspecification presents challenges for credibility in modern machine learning. Journal of Machine Learning Research, 23(226), 1-61. Article Google

Drift

Ackerman, S. et al. (2021, June). Machine learning model drift detection via weak data slices. In 2021 IEEE/ACM Third International Workshop on Deep Learning for Testing and Testing for Deep Learning (DeepTest) (pp. 1-8). IEEE. Article IBM
Ackerman, S. et al. (2020, February). FreaAI: Automated extraction of data slices to test machine learning models. In International Workshop on Engineering Dependable and Secure Machine Learning Systems (pp. 67-83). Cham: Springer International Publishing. Article IBM

Explainability/Interpretability/Mechanistic Interpretability

Dhurandhar, A. et al. (2018). Explanations based on the missing: Towards contrastive explanations with pertinent negatives. Advances in neural information processing systems, 31. Article University of Michigan IBM Research
Dhurandhar, A. et al. (2018). Improving simple models with confidence profiles. Advances in Neural Information Processing Systems, 31. Article IBM Research
Gurumoorthy, K. S. et al. (2019, November). Efficient data representation by selecting prototypes with importance weights. In 2019 IEEE International Conference on Data Mining (ICDM) (pp. 260-269). IEEE. Article Amazon Development Center IBM Research
Hind, M. et al. (2019, January). TED: Teaching AI to explain its decisions. In Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society (pp. 123-129). Article IBM Research
Lundberg, S. M. et al. (2017). A unified approach to interpreting model predictions. Advances in neural information processing systems, 30. Article, GitHub University of Washington
Luss, R. et al. (2021, August). Leveraging latent features for local explanations. In Proceedings of the 27th ACM SIGKDD conference on knowledge discovery & data mining (pp. 1139-1149). Article IBM Research University of Michigan
Ribeiro, M. T. et al. (2016, August). "Why should I trust you?" Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining (pp. 1135-1144). Article, GitHub University of Washington
Wei, D. et al. (2019, May). Generalized linear rule models. In International conference on machine learning (pp. 6687-6696). PMLR. Article IBM Research
Contrastive Explanations Method with Monotonic Attribute Functions (Luss et al., 2019)
Boolean Decision Rules via Column Generation (Light Edition) (Dash et al., 2018) IBM Research
Towards Robust Interpretability with Self-Explaining Neural Networks (Alvarez-Melis et al., 2018) MIT

An interesting curated collection of articles (updated until 2021): A Living and Curated Collection of Explainable AI Methods.

A shared effort can be found at Neuronpedia.

Ethical Data Products

Gebru, T. et al. (2021). Datasheets for datasets. Communications of the ACM, 64(12), 86-92. Article Google
Mitchell, M. et al. (2019, January). Model cards for model reporting. In Proceedings of the conference on fairness, accountability, and transparency (pp. 220-229). Article Google
Pushkarna, M. et al. (2022, June). Data cards: Purposeful and transparent dataset documentation for responsible ai. In Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency (pp. 1776-1826). Article Google
Rostamzadeh, N. et al. (2022, June). Healthsheet: development of a transparency artifact for health datasets. In Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency (pp. 1943-1961). Article Google
Saint-Jacques, G. et al. (2020). Fairness through Experimentation: Inequality in A/B testing as an approach to responsible design. arXiv preprint arXiv:2002.05819. Article LinkedIn

Evaluation (of model explanations)

Agarwal, C. et al. (2022). Openxai: Towards a transparent evaluation of model explanations. Advances in Neural Information Processing Systems, 35, 15784-15799. Article
Liesenfeld, A. et al. (2024). Rethinking Open Source Generative AI: Open-Washing and the EU AI Act. In The 2024 ACM Conference on Fairness, Accountability, and Transparency (FAccT ’24). Rio de Janeiro, Brazil: ACM. Article Benchmark

Fairness

Caton, S. et al. (2024). Fairness in machine learning: A survey. ACM Computing Surveys, 56(7), 1-38. Article
Chouldechova, A. (2017). Fair prediction with disparate impact: A study of bias in recidivism prediction instruments. Big data, 5(2), 153-163. Article
Chouldechova, A. et al. (2017). Fairer and more accurate, but for whom? arXiv preprint arXiv:1707.00046. Article
Coston, A. et al. (2020, January). Counterfactual risk assessments, evaluation, and fairness. In Proceedings of the 2020 conference on fairness, accountability, and transparency (pp. 582-593). Article
Kleinberg, J. et al. (2016). Inherent trade-offs in the fair determination of risk scores. arXiv preprint arXiv:1609.05807. Article
Saleiro, P. et al. (2024). Aequitas Flow: Streamlining Fair ML Experimentation. arXiv preprint arXiv:2405.05809. Article
Plečko, D. et al. (2024), Causal Fairness Analysis, Foundations and Trends in Machine Learning: Vol. 17, No. 3, pp 1–238. DOI: 10.1561/2200000106. Article and Materials
Saleiro, P. et al. (2018). Aequitas: A bias and fairness audit toolkit. arXiv preprint arXiv:1811.05577. Article
Vasudevan, S. et al. (2020, October). Lift: A scalable framework for measuring fairness in ml applications. In Proceedings of the 29th ACM international conference on information & knowledge management (pp. 2773-2780). Article LinkedIn

Regulation

Wasil, A. R. et al. (2024). Verification methods for international AI agreements. arXiv preprint arXiv:2408.16074. Article

Representation Engineering

Zou, A. et al. (2024) Improving Alignment and Robustness with Circuit Breakers. Article
Zou, A. et al. (2023) Representation Engineering: A Top-Down Approach to AI Transparency. Article

Risk

Slattery, P. et al. (2024). The AI risk repository: A comprehensive meta-review, database, and taxonomy of risks from artificial intelligence. arXiv preprint arXiv:2408.12622. Article

Systemic Risks

Uuk, R., et al. (2024). A Taxonomy of Systemic Risks from General-Purpose AI. arXiv preprint arXiv:2412.07780. Article

Sustainability

Lacoste, A. et al. (2019). Quantifying the carbon emissions of machine learning. arXiv preprint arXiv:1910.09700. Article
Li, P. et al. (2023) Making AI Less “Thirsty”: Uncovering and Addressing the Secret Water Footprint of AI Models. arXiv:2304.03271 Article
Parcollet, T. et al. (2021). The energy and carbon footprint of training end-to-end speech recognizers. Article
Patterson, D. et al. (2021). Carbon emissions and large neural network training. arXiv preprint arXiv:2104.10350. Article
Sculley, D. et al. (2015). Hidden technical debt in machine learning systems. Advances in neural information processing systems, 28. Article Google
Sculley, D. et al. (2014, December). Machine learning: The high interest credit card of technical debt. In SE4ML: software engineering for machine learning (NIPS 2014 Workshop) (Vol. 111, p. 112). Article Google
Strubell, E. et al. (2019). Energy and policy considerations for deep learning in NLP. arXiv preprint arXiv:1906.02243. Article
Sustainable AI: AI for sustainability and the sustainability of AI (van Wynsberghe, A. 2021). AI and Ethics, 1-6
Green Algorithms: Quantifying the carbon emissions of computation (Lannelongue, L. et al. 2020)
C.-J. Wu, R. Raghavendra, U. Gupta, B. Acun, N. Ardalani, K. Maeng, G. Chang, F. Aga, J. Huang, C. Bai, M. Gschwind, A. Gupta, M. Ott, A. Melnikov, S. Candido, D. Brooks, G. Chauhan, B. Lee, H.-H. Lee, K. Hazelwood, Sustainable AI: Environmental implications, challenges and opportunities. Proceedings of the 5th Conference on Machine Learning and Systems (MLSys) (2022) vol. 4, pp. 795–813. Article

Collections

Google Research on Responsible AI: https://research.google/pubs/?collection=responsible-ai Google
Pipeline-Aware Fairness: http://fairpipe.dssg.io

Reproducible/Non-Reproducible Research

Computational reproducibility (when the results in a paper can be replicated using the exact code and dataset provided by the authors) is becoming a significant problem not only for academics but also for practitioners who want to implement AI in their organizations and aim to reuse ideas from academia. Read more about this problem here.

Books

This section features a curated selection of books.

Open Access

Barocas, S., Hardt, M., & Narayanan, A. (2023). Fairness and machine learning: Limitations and opportunities. MIT press. Book
Barrett, M., Gerke, T. & D’Agostino McGowan, L. (2024). Causal Inference in R. Book Causal Inference R
Biecek, P., & Burzykowski, T. (2021). Explanatory model analysis: explore, explain, and examine predictive models. Chapman and Hall/CRC. Book Explainability Interpretability Transparency R
Biecek, P. (2024). Adversarial Model Analysis. Book Safety Red Teaming
Cunningham, Scott. (2021) Causal inference: The mixtape. Yale university press. Book Causal Inference
Fourrier, C. et al. (2024). LLM Evaluation Guidebook. GitHub Repository. Web LLM Evaluation
Freiesleben, T. & Molnar, C. (2024). Supervised Machine Learning for Science: How to stop worrying and love your black box. Book
Huntington-Klein, N. (2021). The effect: An introduction to research design and causality. Chapman and Hall/CRC. Book Causal Inference
Leveson, N. G. (2016). Engineering a safer world: Systems thinking applied to safety. MIT press. Book
Kamath, U. et al. (2023) Applied Causal Inference Book
Matloff, N. et al. (2024). Data Science Looks at Discrimination. Book Fairness R
Molnar, C. (2020). Interpretable Machine Learning. Lulu.com. Book Explainability Interpretability Transparency R
Tegomoh, B. (2025). The Biosecurity Handbook: Biological Security in the AI Era. Book
Tegomoh, B. (2025). The Public Health AI Handbook: Evaluating AI Tools for Public Health Practice. Book
Víquez, S. & Kuberski, W. (2025). The Little Book of ML Metrics. Book ML Evaluation

Commercial / Proprietary / Closed Access

Trust in Machine Learning (Varshney, K., 2022) Safety Privacy Drift Fairness Interpretability Explainability
Interpretable AI (Thampi, A., 2022) Explainability Fairness Interpretability
AI Fairness (Mahoney, T., Varshney, K.R., Hind, M., 2020) Report Fairness
Practical Fairness (Nielsen, A., 2021) Fairness
Hands-On Explainable AI (XAI) with Python (Rothman, D., 2020) Explainability
AI and the Law (Kilroy, K., 2021) Report Trust Law
Responsible Machine Learning (Hall, P., Gill, N., Cox, B., 2020) Report Law Compliance Safety Privacy
Privacy-Preserving Machine Learning
Human-In-The-Loop Machine Learning: Active Learning and Annotation for Human-Centered AI
Interpretable Machine Learning With Python: Learn to Build Interpretable High-Performance Models With Hands-On Real-World Examples
Responsible AI (Hall, P., Chowdhury, R., 2023) Governance Safety Drift
Marcus, G., and Davis, E. (2019). Rebooting AI: Building artificial intelligence we can trust. Vintage. Book
Marcus, G. F. (2024). Taming Silicon Valley: How We Can Ensure That AI Works for Us. MIT Press. Book
Yampolskiy, R. V. (2024). AI: Unexplainable, Unpredictable, Uncontrollable. CRC Press. Book

Code of Ethics

This section features a curated selection of codes of ethics.

ACS Code of Professional Conduct by Australian ICT (Information and Communication Technology)
Association for Computing Machinery's Code of Ethics and Professional Conduct
IEEE Global Initiative for Ethical Considerations in Artificial Intelligence (AI) and Autonomous Systems (AS)
ISO/IEC's Standards for Artificial Intelligence

Courses

This section features a curated selection of open courses focused on Responsible AI, AI Ethics, AI Safety and other related topics. The classes range from introductory courses on data ethics to specialized training in AI Safety.

Course	Organization	Description	Topic
AGI Strategy	BlueDot Impact	A course about AGI to understand the race, the risks, and how you can make a difference.	AGI Strategy
AI Alignment	BlueDot Impact	A course designed to introduce the key concepts in AI safety and alignment.	AI Alignment, AI Safety
AI Ethics	Turing College	This course is part of the DIVERSIFAIR project, an EU-backed initiative created to help professionals build ethical AI that’s fair, transparent, and accountable — not just technically accurate.	AI Ethics, Fairness
AI Ethics & Governance (AEG)	The Alan Turing Institute	This course is designed to help you understand the fundamentals of AI Ethics and Governance.	AI Ethics, AI Governance
AI Governance	AI Career Pro	A series of courses that teach the practical capabilities missing from AI governance education — not just theory, but how to actually build AI inventories, perform algorithmic assessments, design meaningful human oversight, and make the business case that secures resources.	AI Governance
AI Governance	BlueDot Impact	A course designed to examine the risks posed by advanced AI systems, standards and regulations to address them, and foreign policy approaches.	AI Governance
AI Policy Clinic	Center for AI and Digital Policy	The Center has launched a comprehensive certification program for AI Policy.	AI Governance
AI Safety, Ethics and Society	Center for AI Safety	A course that aims to provide a comprehensive introduction to how current AI systems work, why many experts are concerned that continued advances in AI may pose severe societal-scale risks, and how society can manage and mitigate these risks.	AI Safety, AI Ethics, AI Governance
AI Security and Governance	Securiti	This certification covers core concepts in generative AI, global AI laws, compliance obligations, AI risk management, and AI governance frameworks that ensure responsible innovation.	AI Security, AI Governance
CS 2881 AI Safety	Harvard University	This course introduces challenges in alignment and safety of artificial intelligence.	AI Safety
CS 294-131: Trustworthy Deep Learning	UC Berkeley	This course helps to develop a deeper understanding of deep learning and explore new research directions and applications of AI/deep learning and privacy/security.	Explainability, Privacy, Security
CIS 4230/5230 - Ethical Algorithm Design	University of Pennsylvania	This course is about the social and human problems that can arise from algorithms, AI and machine learning, and how we might design these technologies to be "better behaved" in the first place.	AI Safety, Responsible AI
CS 594 - Causal Inference and Learning	University of Illinois at Chicago	The goal of this course is to introduce students to methodologies and algorithms for causal reasoning and connect various aspects of causal inference, including methods developed within computer science, statistics, and economics.	Causal Inference
CS 7880 - Rigorous Approaches to Data Privacy	Northeastern University	This course covers the theory of differential privacy, its application, and its connections to other areas of computer science, covering roughly the state-of-the-art in the field.	Data Privacy
CS 860 - Algorithms for Private Data Analysis	University of Waterloo	This course is on algorithms for differentially private analysis of data.	Data Privacy
Data Justice (DJ)	The Alan Turing Institute	A course that explores the emerging movement of data justice, which seeks to apply a social justice-oriented approach to examining the range of social, political, and material concerns arising within our increasingly datafied society	Ethics, Data Justice
Explainable Artificial Intelligence	Harvard University	This course aims to familiarize students with the recent advances in the emerging field of eXplainable Artificial Intelligence (XAI)	Explainability, Interpretability
Future of AI	BlueDot Impact	A course to understand AI's impact and be part of the conversation about its future.	AI Fundamentals
Introduction to AI Ethics	Kaggle	A course to explore practical tools to guide the moral design of AI systems.	AI Ethics
Introduction to ML Safety	Center for AI Safety	A course that discusses how researchers can shape the process that will lead to strong AI systems and steer that process in a safer direction.	AI Safety
Introduction to Responsible Machine Learning	The George Washington University	Materials for a technical, nuts-and-bolts course about increasing transparency, fairness, robustness, and security in machine learning.	Responsible AI
LLM evaluation	Nebius Academy, Evidently	A course about LLM evaluation using Evidently.	AI Safety, LLM Evaluation
Machine Learning Explainability	Kaggle	A course to extract human-understandable insights from any model.	Explainability, Interpretability
Machine Learning in Production (17-445/17-645/17-745) / AI Engineering (11-695)	Carnegie Mellon University	A course that covers how to build, deploy, assure, and maintain software products with machine-learned models.	MLOps, Responsible AI
MATS	MATS Research	The main goal of the course is to help scholars develop as AI alignment researchers.	AI Alignment, AI Safety
Modern-Day Oracles or Bullshit Machines?	Bergstrom, C. T., & West, J. D.	A course about how data and statistical analysis — the keystones of scientific reasoning — can be abused to mislead people.	Ethics
Practical Data Ethics	Fast.ai	A course focused on topics that are both urgent and practical.	Data Ethics
Public Engagement of Data Science and AI (PED)	The Alan Turing Institute	A course designed to help you understand the practical and ethical value of public engagement with data science and AI.	Ethics
Responsible AI	All Tech is Human	This series of short courses, which can be completed in just a few hours, offers a foundational understanding of Responsible AI.	Responsible AI
Responsible Research and Innovation (RRI)	The Alan Turing Institute	This course explores what it means to take (individual and collective) responsibility for (and over) the processes and outcomes of research and innovation in data science and AI.	Responsible AI

Data Sets

This section features a curated selection of data sets.

If you are looking for public data sets for your project, this is a curated collection.

Databases

This section features a curated selection of databases focused on tracking incidents, issues, litigation, vulnerabilities and AI for Good initiatives.

(AI) Incidents Trackers

Tracker	Paper	Organization/Creator	Description	Topic
AI for Good Lab	N/A	Microsoft	An open source database of assets for social and environmental good.	AI for Good
AI Hallucination Cases	N/A	Damien Charlotin	This database tracks legal decisions in cases where generative AI produced hallucinated content – typically fake citations, but also other types of arguments.	Deepfakes, Misinformation
AI Risk Repository	The AI Risk Repository: A Comprehensive Meta-Review, Database, and Taxonomy of Risks From Artificial Intelligence	MIT	A comprehensive living database of over 1600 AI risks categorized by their cause and risk domain.	AI Risk
Political Deepfakes Incidents Database	Merging AI Incidents Research with Political Misinformation Research: Introducing the Political Deepfakes Incidents Database	Purdue University	A collection of politically-salient deepfakes, encompassing synthetically-created videos, images, and less-sophisticated 'cheapfakes.'	Deepfakes

This section is under review; the remaining entries will be added to the table with extended information.

AI Risk Database MITRE
AIAAIC
AI Harm Map Ethical AI Alliance
AI Incident Database
AI Incident Tracker MIT
AI Vulnerability Database (AVID)
George Washington University Law School's AI Litigation Database
OECD AI Incidents Monitor
Verica Open Incident Database (VOID)

Cybersecurity

Frameworks

A Framework for Ethical Decision Making Markkula Center for Applied Ethics
Data Ethics Canvas Open Data Institute
Deon Python DrivenData
Ethics & Algorithms Toolkit
Open Ethics Transparency Protocol (OETP) Open Ethics
RAI Toolkit US Department of Defense

Institutes

This section features a curated selection of institutes that conduct research on Responsible AI and related topics.

AI Safety Institutes (or equivalent)

Beijing AISI China
Canada AISI Canada
China AI Development and Safety Network China
EU AI Office Europe
Korea AISI South Korea
Singapore AISI Singapore

AI Security Institute

UK AISI United Kingdom

Japan AISI

Code	Title	Description	Status	Source
AI Safety Evaluation v1.10	A guide to red teaming techniques for AI safety	Presents basic concepts that those involved in the development and provision of AI systems can refer to when conducting AI Safety evaluations	Published	Source
AI Safety RT v1.10	Guide to Red Teaming Methodology on AI Safety	Intended to help developers and providers of AI systems to evaluate the basic considerations of red teaming methodologies for AI systems from the viewpoint of attackers	Published	Source
Data Quality Management v1.0.0	A guide about Data Quality linked to AI Safety	Intended to help developers and providers of AI systems to adopt data quality management practices	Published	Source
AI Business Guidelines v1.1.0	A guide for organizations to adopt agile AI Governance	Intended to help all the stakeholders in an organization to adopt voluntary agile AI Governance practices	Published	Source
Known Attacks and Their Impacts on AI Systems (March 2025)	Known Attacks and Their Impacts on AI Systems	An accessible overview of adversarial attacks unique to predictive and generative AI systems	Published	Source 1, Source 2

US CAISI

Code	Title	Description	Status	Source
NIST AI 800-1	Managing Misuse Risk for Dual-Use Foundation Models	Outlines voluntary best practices for identifying, measuring, and mitigating risks to public safety and national security across the AI lifecycle	Draft (second version)	Source

Responsible AI Institutes

Canadian Centre for Responsible AI Governance
IBGIA - Instituto Brasileiro de Governança em IA - Brazilian nonprofit institute for AI governance, compliance, ethics, and regulation. Brazil

Research Institutes

Ada Lovelace Institute United Kingdom
Centre pour la Sécurité de l'IA, CeSIA France
European Centre for Algorithmic Transparency
Center for Human-Compatible AI UC Berkeley United States of America
Center for Responsible AI New York University United States of America
Montreal AI Ethics Institute Canada
Munich Center for Technology in Society (IEAI) TUM School of Social Sciences and Technology Germany
National AI Centre's Responsible AI Network Australia
Open Data Institute United Kingdom
Stanford University Human-Centered Artificial Intelligence (HAI) United States of America
The Institute for Ethical AI & Machine Learning
UNESCO Chair in AI Ethics & Governance IE University Spain
University of Oxford Institute for Ethics in AI University of Oxford United Kingdom
Australian Government-funded AI Adopt Centres:
- ARM Hub AI Adopt Centre
- Australian Regional AI Network (ARAIN)
- SAAM (Safe AI Adoption Model)
- SMEC AI (Small to Medium Enterprise Centre of Artificial Intelligence)
Future of Life Institute: Focused on reducing existential risks, this institute brings together experts to ensure AI benefits humanity.
International Panel on the Information Environment: A global network of scholars and practitioners working to improve public understanding of our evolving information landscape, including the role of AI.
Center for AI Safety: This organization researches the challenges of AI safety and develops strategies to mitigate potential risks in AI development.
Distributed AI Research Institute -DAIR-: DAIR advocates for decentralized and transparent AI research, emphasizing open collaboration for safe technological progress.
International Association for Safe and Ethical AI: Dedicated to advancing safe and ethical AI practices, this association provides a platform for stakeholders to share guidelines and best practices.
Partnership on AI: Bringing together industry, academia, and civil society, this partnership promotes responsible AI development and broad benefits for all.
AI Now Institute: An interdisciplinary research center that examines the social implications of AI and advocates for greater accountability in AI systems.
Centre for the Governance of AI: Based at the University of Oxford, this centre researches policy and governance frameworks to manage the challenges of AI technologies.
Future of Humanity Institute: An interdisciplinary research center that explores global challenges and the long-term impacts of AI on society and humanity.
Machine Intelligence Research Institute -MIRI-: MIRI focuses on developing theoretical tools to ensure that advanced AI systems are aligned with human values and remain safe.

Maturity Models

This section features a curated selection of maturity models that can help organizations to adopt AI in a responsible way.

Newsletters

This section features a curated selection of newsletters that keep you informed about this domain.

AI Frontiers Center for AI Safety
AI Policy Perspectives
AI Policy Weekly
AI Safety in China
AI Safety Newsletter Center for AI Safety
AI Snake Oil
Import AI
Marcus on AI
ML Safety Newsletter
Navigating AI Risks
One Useful Thing
The AI Ethics Brief
The AI Evaluation Substack
The EU AI Act Newsletter
The Machine Learning Engineer
Turing Post

Principles

This section features a curated selection of Responsible AI principles adopted by several organizations. Note: some publication dates may not be accurate.

Company	Document	Year
Allianz	Principles for a responsible usage of AI	2022
Deutsche Telekom	Guidelines for Artificial Intelligence	2018
European Commission	Guidelines for Trustworthy AI	2019
Future of Life Institute	Asilomar AI principles	2017
Google	AI Principles	2019
IEEE	Ethically Aligned Design	2023
Logitech	Principles for Responsible AI	2024
Microsoft	AI Principles	2022
OECD	AI principles	2019
Telefonica	AI Principles	2018
The Institute for Ethical AI & Machine Learning	The Responsible Machine Learning Principles	2018

Additional:

FAIR Principles Findability Accessibility Interoperability Reuse
The CARE Principles for Indigenous Data Governance Data Governance
The First Nations Principles of OCAP Data Governance

If you want to read about how to move from AI principles to commitments, we recommend the article 'Getting from commitment to content in AI and data ethics: Justice and explainability'.

Podcasts

This section features podcasts that offer insightful commentary and explanations on Responsible AI, AI Governance, AI Safety, AI Alignment and Machine Learning Interpretability.

Podcast	Description	Creator
AI Frontiers	A space for expert dialogue about the impact of AI.	Center for AI Safety
AI Safety Fundamentals	Audio versions of the BlueDot Impact course content.	BlueDot Impact
AI Safety Newsletter	Narrations of the newsletter.	Center for AI Safety
Me, Myself and AI	Interviews with experts	MIT Sloan Management Review
Practical AI	Practical AI is a show in which technology professionals, business people, students, enthusiasts, and expert guests engage in lively discussions about Artificial Intelligence and related topics.	Chris Benson

Regulations

This section features a curated selection of regulations.

Definition

What are regulations?

Regulations are requirements established by governments.

Interesting resources

AI Regulations Tracker
Data Protection and Privacy Legislation Worldwide UNCTAD
Data Protection Laws of the World DLA Piper
Digital Policy Alert
ETO Agora CSET
GAIIN: The Global AI Initiatives Navigator OECD
GDPR Comparison
Global AI Regulation
Interactive Mapping of the AI Regulation Landscape DIVERSIFAIR
Policy Database AI Standards Hub
SEA Observatory AI Safety Asia
SCL Artificial Intelligence Contractual Clauses

Australia 🇦🇺

Additionally, we recommend this comprehensive, community-maintained index of Australian AI Security standards, policies, frameworks, and guidance.

Canada 🇨🇦

China 🇨🇳

Chinese AI Governance Documents

European Union 🇪🇺

Short Name	Code	Description	Status	Website	Legal text
Cyber Resilience Act (CRA) - horizontal cybersecurity requirements for products with digital elements	2022/0272(COD)	It introduces mandatory cybersecurity requirements for hardware and software products, throughout their whole lifecycle.	Proposal	Website	Source
Data Act	EU/2023/2854	It enables a fair distribution of the value of data by establishing clear and fair rules for accessing and using data within the European data economy.	Published	Website	Source
Data Governance Act	EU/2022/868	It supports the setup and development of Common European Data Spaces in strategic domains, involving both private and public players, in sectors such as health, environment, energy, agriculture, mobility, finance, manufacturing, public administration and skills.	Published	Website	Source
Digital Markets Act	EU/2022/1925	It establishes a set of clearly defined objective criteria to identify “gatekeepers”. Gatekeepers are large digital platforms providing so-called core platform services, such as online search engines, app stores, and messenger services. Gatekeepers will have to comply with the do’s (i.e. obligations) and don’ts (i.e. prohibitions) listed in the DMA.	Published	Website	Source
Digital Operational Resilience Act (DORA)	EU/2022/2554	It is a regulation to strengthen the digital resilience of financial entities. It ensures that banks, insurance companies, investment firms and other financial entities can withstand, respond to, and recover from ICT (Information and Communication Technology) disruptions, such as cyberattacks or system failures.	Published	Website	Source
Digital Services Act	EU/2022/2065	It regulates online intermediaries and platforms such as marketplaces, social networks, content-sharing platforms, app stores, and online travel and accommodation platforms. Its main goal is to prevent illegal and harmful activities online and the spread of disinformation. It ensures user safety, protects fundamental rights, and creates a fair and open online platform environment.	Published	Website	Source
DSM Directive	EU/2019/790	It is intended to ensure a well-functioning marketplace for copyright.	Published	Website	Source
Energy Efficiency Directive	EU/2023/1791	It establishes ‘energy efficiency first’ as a fundamental principle of EU energy policy, giving it legal standing for the first time. In practical terms, this means that energy efficiency must be considered by EU countries in all relevant policy and major investment decisions taken in the energy and non-energy sectors.	Published	Website	Source
EU AI Act	EU/2024/1689	It assigns applications of AI to three risk categories. First, applications and systems that create an unacceptable risk are banned. Second, high-risk applications are subject to specific legal requirements. Lastly, applications not explicitly banned or listed as high-risk are largely left unregulated.	Published	Website	Source
General Data Protection Regulation (GDPR)	EU/2016/679	It strengthens individuals' fundamental rights in the digital age and facilitates business by clarifying rules for companies and public bodies in the digital single market.	Published	Website	Source
NIS2 Directive	EU/2022/2555	It provides legal measures to boost the overall level of cybersecurity in the EU by ensuring preparedness, cooperation and a security culture across the Member States.	Published	Website	Source

Additionally,

India 🇮🇳

Singapore 🇸🇬

South Korea 🇰🇷

AI Basic Act

United Arab Emirates 🇦🇪

AI Principles & Ethics/Ethical AI Toolkit

United States 🇺🇸

State laws: California (CCPA and its amendment, CPRA; and SB-53, Artificial intelligence models: large developers); Virginia (VCDPA); Colorado (ColoPA, Colorado SB21-190; and SB21-169: Regulation prohibiting unfair discrimination in insurance); and New York (NYC Local Law 144: Mandatory bias audits for automated employment decision tools).
Specific and limited privacy data laws: HIPAA, FCRA, FERPA, GLBA, ECPA, COPPA, VPPA and FTC.
EU-U.S. and Swiss-U.S. Privacy Shield Frameworks - The EU-U.S. and Swiss-U.S. Privacy Shield Frameworks were designed by the U.S. Department of Commerce and the European Commission and Swiss Administration to provide companies on both sides of the Atlantic with a mechanism to comply with data protection requirements when transferring personal data from the European Union and Switzerland to the United States in support of transatlantic commerce.
REMOVING BARRIERS TO AMERICAN LEADERSHIP IN ARTIFICIAL INTELLIGENCE - Official mandate by the President of the US to position the country at the forefront of AI innovation.
Privacy Act of 1974 - The Privacy Act of 1974 establishes a code of fair information practices that governs the collection, maintenance, use and dissemination of information about individuals that is maintained in systems of records by federal agencies.
Privacy Protection Act of 1980 - The Privacy Protection Act of 1980 protects journalists from being required to turn over to law enforcement any work product and documentary materials, including sources, before it is disseminated to the public.

Spain 🇪🇸

Responsible Scaling Policies

This section features a curated selection of responsible scaling policies adopted by AI labs and organizations developing frontier models.

Definition

Responsible Scaling Policies (RSPs) specify what level of AI capabilities an AI developer is prepared to handle safely with their current protective measures, and the conditions under which it would be too dangerous to continue deploying AI systems and/or scaling up AI capabilities until protective measures improve.

RSP List

Anthropic: Responsible Scaling Policy (part of Anthropic’s Transparency Hub)
- Version 3.0 and RSP Noncompliance Reporting and Anti-Retaliation Policy (effective February 24, 2026)
- Version 2.2 (effective May 14, 2025)
- Version 2.1 (effective March 31, 2025)
- Version 2.0 (effective October 15, 2024)
- Version 1.0 (effective September 19, 2023)
OpenAI: Preparedness Framework. First Published: December, 2023. Last Updated: April, 2025
Google DeepMind: Frontier Safety Framework. Version 3. First Published: May, 2024. Last Updated: September, 2025
Magic: AGI Readiness Policy. Published: July 2, 2024
NAVER: AI Safety Framework. Published: August 7, 2024
Meta: Frontier AI Framework. Published: February 3, 2025
G42: Frontier AI Safety Framework. Published: February 6, 2025
Cohere: Secure AI Frontier Model Framework. Published: February 7, 2025
Microsoft: Frontier Governance Framework. Published: February 8, 2025
Amazon: Frontier Model Safety Framework. Published: February 10, 2025
xAI: Risk Management Framework. First Published: February, 2025. Last Updated: August 20, 2025.
Nvidia: Frontier AI Risk Assessment. Published: February 17, 2025

Reports

This section features a curated selection of reports relevant to understanding the current situation and trends related to Responsible AI, AI Ethics and AI Governance.

AI Ethics

State of AI Ethics. MAIEI Website (latest version 7, November 2025)

AI Governance

Araujo, R. 2024. Understanding the First Wave of AI Safety Institutes: Characteristics, Functions, and Challenges. Institute for AI Policy and Strategy (IAPS) Article
Buchanan, B. 2020. The AI triad and what it means for national security strategy. Center for Security and Emerging Technology. Article
Corrigan, J. et al. 2023. The Policy Playbook: Building a Systems-Oriented Approach to Technology and National Security Policy. CSET (Center for Security and Emerging Technology) Article
Curto, J. 2024. How Can Spain Remain Internationally Competitive in AI under EU Legislation? Article
CSIS. 2024. The AI Safety Institute International Network: Next Steps and Recommendations. CSIS (Center for Strategic and International Studies) Article
Gupta, R., et al. 2024. Data-Centric AI Governance: Addressing the Limitations of Model-Focused Policies. arXiv preprint arXiv:2409.17216. [Article](https://arxiv.org/pdf/2409.17216)
Hendrycks, D. et al. 2023. An overview of catastrophic AI risks. Center for AI Safety. arXiv preprint arXiv:2306.12001. Article
Janjeva, A., et al. (2023). Strengthening Resilience to AI Risk. A guide for UK policymakers. CETaS (Centre for Emerging Technology and Security) Article
Piattini, M. and Fernández C.M. 2024. Marco Confiable. Revista SIC 162 Article
Sastry, G., et al. 2024. Computing Power and the Governance of Artificial Intelligence. arXiv preprint arXiv:2402.08797. Article

AI Safety

AI Security

GENAI Security Project - Resources Library OWASP

AI Testing

OWASP AI Testing Guide

Copyright

Market Analysis

AI Safety Index: 2024, 2025 Future of Life Institute
AI World
European Open Source AI Index
Global Index for AI Safety
Impact Report. Edition: 2023 and 2024 Center for AI Safety
State of AI - from 2018 up to now -
The AI Index Report. Edition: 2017, 2018, 2019, 2021, 2022, 2023, 2024, 2025, and 2026. Stanford Institute for Human-Centered Artificial Intelligence

AI Labs

Other

Four Principles of Explainable Artificial Intelligence NIST Explainability
Psychological Foundations of Explainability and Interpretability in Artificial Intelligence NIST Explainability
Inferring Concept Drift Without Labeled Data, 2021 Drift
Interpretability, Fast Forward Labs, 2020 Interpretability
Towards a Standard for Identifying and Managing Bias in Artificial Intelligence (NIST Special Publication 1270) NIST Bias
Auditing machine learning algorithms Auditing

Ratings

https://aimodelratings.com

Standards

Definition

What are standards?

Standards are voluntary, consensus solutions. They document an agreement on how a material, product, process, or service should be specified, performed or delivered. They keep people safe and ensure things work. They create confidence and provide security for investment.

Standards can be understood as formal specifications of best practices as well. There is a growing number of standards related to AI. You can search for the latest in the Standards Database from AI Standards Hub.

There are some open standards such as RSL, focused on content licensing, that still need to gain traction in the market.

Standards

This section features a curated selection of standards related to Responsible AI.

CEN Standards

The European Committee for Standardization is one of three European Standardization Organizations (together with CENELEC and ETSI) that have been officially recognized by the European Union and by the European Free Trade Association (EFTA) as being responsible for developing and defining voluntary standards at European level.

Domain	Standard	Status	URL
Data governance and quality for AI within the European context	CEN/CLC/TR 18115:2024	Published	Source

The CEN AI work programme can be found here.

DGSI Standards

The Digital Governance Standards Institute, part of the Digital Governance Council, is an accredited standards development body. The Institute enables greater trust and confidence in Canada’s digital systems through developing technology governance standards collaboratively across a range of stakeholders.

Domain	Standard	Status	URL
Design, Implementation and Evaluation of Regulatory Sandboxes	CAN/DGSI 123	Published	Source
Ethical Design and Use of Artificial Intelligence by Small and Medium Organizations	CAN/DGSI 101	Published	Source
Machine Learning and AI Implementation in Research Institutions	CAN/DGSI 128	Published	Source

ETSI Standards

The European Telecommunications Standards Institute (ETSI) is an independent, not-for-profit, standardization organization operating in the field of information and communications.

Domain	Standard	Status	URL
Securing Artificial Intelligence (SAI); Baseline Cyber Security Requirements for AI Models and Systems	ETSI EN 304 223 V2.1.1 (2025-12)	Published	Source
Securing Artificial Intelligence (SAI); AI Threat Ontology	ETSI GR SAI 001 V1.1.1 (2022-01)	Published	Source
Securing Artificial Intelligence (SAI); Explicability and transparency of AI processing	ETSI GR SAI 007 V1.1.1 (2023-03)	Published	Source
Securing Artificial Intelligence (SAI); Artificial Intelligence Computing Platform Security Framework	ETSI GR SAI 009 V1.1.1 (2023-02)	Published	Source
Securing Artificial Intelligence (SAI); The role of hardware in security of AI	ETSI GR SAI 006 V1.1.1 (2022-03)	Published	Source
Securing Artificial Intelligence TC (SAI); Privacy aspects of AI/ML systems	ETSI TR 104 225 V1.1.1 (2024-04)	Published	Source
Securing Artificial Intelligence (SAI); Explicability and transparency of AI processing	ETSI TS 104 224 V1.1.1 (2025-03)	Published	Source
Methods for Testing & Specification (MTS); AI Testing; Guidelines for Documentation of AI-enabled Systems	ETSI TR 104 119 V1.1.1 (2025-09)	Published	Source
Securing Artificial Intelligence; Security Testing of AI	ETSI TR 104 066 V1.1.1 (2024-07)	Published	Source
Securing Artificial Intelligence (SAI); Baseline Cyber Security Requirements for AI Models and Systems	ETSI TS 104 223 V1.1.1 (2025-04)	Published	Source
Securing Artificial Intelligence (SAI); AI Threat Ontology and definitions	ETSI TS 104 050 V1.1.1 (2025-03)	Published	Source
Securing Artificial Intelligence (SAI); Traceability of AI Models	ETSI TR 104 032 V1.1.1 (2024-02)	Published	Source
Methods for Testing & Specification (MTS); Continuous Auditing Based Conformity Assessment for AI-enabled systems	ETSI TS 104 008 V1.1.1 (2026-01)	Published	Source
Securing Artificial Intelligence (SAI); Guide to Cyber Security for AI Models and Systems	ETSI TR 104 128 V1.1.1 (2025-05)	Published	Source
Securing Artificial Intelligence (SAI); Understanding and Preventing Harm from Generative AI	ETSI TR 104 159 V1.1.1 (2026-01)	Published	Source
Securing Artificial Intelligence (SAI); Proofs of Concepts Framework	ETSI GR SAI 013 V1.1.1 (2023-03)	Published	Source

IEEE Standards

The Institute of Electrical and Electronics Engineers (IEEE) is an American 501(c)(3) charitable professional organization for electrical engineering, electronics engineering, and related disciplines. Today, it is a global network of more than 486,000 engineering and STEM professionals across a variety of disciplines whose core purpose is to foster technological innovation for the benefit of humanity.

Domain	Standard	Status	URL
IEEE Guide for an Architectural Framework for Explainable Artificial Intelligence	IEEE 2894-2024	Published	Source
IEEE Recommended Practice for the Quality Management of Datasets for Medical Artificial Intelligence	IEEE 2801-2022	Published	Source
IEEE Standard for Ethical Considerations in Emulated Empathy in Autonomous and Intelligent Systems	IEEE 7014-2024	Published	Source
IEEE Standard for Robustness Testing and Evaluation of Artificial Intelligence (AI)-based Image Recognition Service	IEEE 3129-2023	Published	Source
IEEE Standard for Performance Benchmarking for Artificial Intelligence Server Systems	IEEE 2937-2022	Published	Source
IEEE Standard for Security Requirement of Privacy-Preserving Computation	IEEE 3169-2025	Published	Source

SAE Standards

SAE International is a community of more than 200,000 aerospace, commercial vehicle, and ground vehicle engineers and technical experts. SAE solves challenges, creates connections, and develops resources that allow us to drive industry forward.

Domain	Standard	Status	URL
Artificial intelligence Simulation: Best Practices	3347	WIP	Source
Assessment of Human Factors concerns for the development of safety-related AI-based Systems in Aviation	AIR8493	WIP	Source
Verification & Validation of AI/ML based Components & Systems in Ground Vehicles	J3321	WIP	Source
AI Regulations, Standards & Applications Challenges	J3329	WIP	Source
Managing the Development of Artificial Intelligence Software	CRB1	Published	Source
Artificial Intelligence in Aeronautical Systems: Statement of Concerns	AIR6988	Published	Source

UNE Standards

UNE is Spain's only Standardisation Organisation, designated by the Spanish Ministry of Economy, Industry and Competitiveness to the European Commission. It helps Spanish organizations to keep up-to-date on all aspects related to standardisation:

Discover the new regulatory developments;
Take part in developing standards;
Learn how to integrate standardisation in your R&D&i project.

Domain	Standard	Status	URL
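Data quality (Calidad del dato)	UNE 0079:2023	Published	Source
Data management (Gestión del dato)	UNE 0078:2023	Published	Source
Data governance (Gobierno del dato)	UNE 0077:2023	Published	Source
Guide for evaluating the quality of a dataset (Guía de evaluación de la Calidad de un Conjunto de Datos)	UNE 0081:2023	Published	Source
Guide for evaluating data governance, data management and data quality management (Guía de evaluación del Gobierno, Gestión y Gestión de la Calidad del Dato)	UNE 0080:2023	Published	Source
Measurement of the energy consumption, carbon footprint, water consumption and performance of Artificial Intelligence systems (Medición del consumo energético, huella de carbono, consumo del agua y rendimiento de sistemas de Inteligencia Artificial)	UNE 0086:2025	Published	Source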

Additional translations in Spanish can be found here.

ISO/IEC Standards

Domain	Standard	Status	URL
AI Concepts and Terminology	ISO/IEC 22989:2022 Information technology — Artificial intelligence — Artificial intelligence concepts and terminology	Published	https://www.iso.org/standard/74296.html
AI Controllability	ISO/IEC CD TS 8200 Information technology — Artificial intelligence — Controllability of automated artificial intelligence systems	Published	https://www.iso.org/standard/83012.html
AI Governance	ISO/IEC 38507:2022 Information technology — Governance of IT — Governance implications of the use of artificial intelligence by organizations	Published	https://www.iso.org/standard/56641.html
AI Management System	ISO/IEC 42001:2023 Information technology — Artificial intelligence — Management system	Published	https://www.iso.org/standard/81230.html
AI Impact Assessment	ISO/IEC 42005:2025 Information technology — Artificial intelligence (AI) — AI system impact assessment	Published	https://www.iso.org/standard/42005
AI Performance	ISO/IEC TS 4213:2022 Information technology — Artificial intelligence — Assessment of machine learning classification performance	Published	https://www.iso.org/standard/79799.html
AI Privacy	ISO/IEC AWI 27091 Cybersecurity and Privacy — Artificial Intelligence — Privacy protection	Under Development	https://www.iso.org/standard/56582.html
AI Quality	ISO/IEC AWI TR 42106 Information technology — Artificial intelligence — Overview of differentiated benchmarking of AI system quality characteristics	Under Development	https://www.iso.org/standard/86903.html
AI Risk Management	ISO/IEC 23894:2023 Information technology - Artificial intelligence - Guidance on risk management	Published	https://www.iso.org/standard/77304.html
AI Security	ISO/IEC DIS 27090 Cybersecurity — Artificial Intelligence — Guidance for addressing security threats and failures in artificial intelligence systems	Under Development	https://www.iso.org/standard/56581.html
AI Sustainability	ISO/IEC AWI TR 20226 Information technology — Artificial intelligence — Environmental sustainability aspects of AI systems	Published	https://www.iso.org/standard/86177.html
AI Verification and Validation	ISO/IEC AWI TS 17847 Information technology — Artificial intelligence — Verification and validation analysis of AI systems	Under Development	https://www.iso.org/standard/85072.html
AI Audit and Certification	ISO/IEC 42006:2025 Information technology — Artificial intelligence — Requirements for bodies providing audit and certification of artificial intelligence management systems	Published	https://www.iso.org/standard/42006
Biases in AI	ISO/IEC TR 24027:2021 Information technology — Artificial intelligence (AI) — Bias in AI systems and AI aided decision making	Published	https://www.iso.org/standard/77607.html
Ethical and societal concerns	ISO/IEC TR 24368:2022 Information technology — Artificial intelligence — Overview of ethical and societal concerns	Published	https://www.iso.org/standard/78507.html
Explainability	ISO/IEC AWI TS 6254 Information technology — Artificial intelligence — Objectives and approaches for explainability of ML models and AI systems	Under Development	https://www.iso.org/standard/82148.html
Biases in AI	ISO/IEC CD TS 12791 Information technology — Artificial intelligence — Treatment of unwanted bias in classification and regression machine learning tasks	Published	https://www.iso.org/standard/84110.html
Data Quality for AI/ML	ISO/IEC DIS 5259 Artificial intelligence — Data quality for analytics and machine learning (ML) (1 to 6)	Published	https://www.iso.org/standard/81088.html
Data Lifecycle	ISO/IEC FDIS 8183 Information technology — Artificial intelligence — Data life cycle framework	Published	https://www.iso.org/standard/83002.html
Transparency	ISO/IEC AWI 12792 Information technology — Artificial intelligence — Transparency taxonomy of AI systems	Under Development	https://www.iso.org/standard/84111.html
Trustworthy AI	ISO/IEC TR 24028:2020 Information technology — Artificial intelligence — Overview of trustworthiness in artificial intelligence	Published	https://www.iso.org/standard/77608.html
Synthetic Data	ISO/IEC AWI TR 42103 Information technology — Artificial intelligence — Overview of synthetic data in the context of AI systems	Under Development	https://www.iso.org/standard/86899.html
AI Safety	ISO/IEC CD TR 5469 Artificial intelligence — Functional safety and AI systems	Published	https://www.iso.org/standard/81283.html
Beneficial AI Systems	ISO/IEC AWI TR 21221 Information technology – Artificial intelligence – Beneficial AI systems	Under Development	https://www.iso.org/standard/86690.html

Learning Resources for ISO/IEC Standards

ISO 42001 Visual Library — A visual learning library for ISO/IEC 42001:2023 covering all clauses, Annex A controls and the PDCA cycle through reference cards, memory cards and deep dives. Open source, CC BY 4.0.

NIST Publications

Resource	Description	Source
AI RMF (Risk Management Framework)	The AI Risk Management Framework (AI RMF) is intended for voluntary use and to improve the ability to incorporate trustworthiness considerations into the design, development, use, and evaluation of AI products, services, and systems.	Source
AI RMF Playbook	The Playbook provides suggested actions for achieving the outcomes laid out in the AI Risk Management Framework (AI RMF) Core (Tables 1 – 4 in AI RMF 1.0). Suggestions are aligned to each sub-category within the four AI RMF functions (Govern, Map, Measure, Manage).	Source
AI RMF Glossary	This glossary seeks to promote a shared understanding and improve communication among individuals and organizations seeking to operationalize trustworthy and responsible AI through approaches such as the NIST AI Risk Management Framework (AI RMF).	Source

Other resources

Additional standards can be found using the Standards Database and AIDG Hub. We also recommend reviewing NIST Assessing Risks and Impacts of AI (ARIA) and the EU AI Act Harmonised Standards Mapping. Another interesting repository for AI Governance is the AI Governance Library.

Tools

This section features tools and libraries that help to design, implement and manage AI in a responsible way.

Tool	Language	Description	Creator	Status
balance	Python	A package for balancing biased data samples	Meta	Active
clav	R	This package provides utilities for conducting cluster (profile) analysis with an emphasis on validating the stability of the profiles both within a given data set and across data sets.	Jason Bryer	Active
smclarify	Python	Bias detection and mitigation for datasets and models	Amazon	Inactive
SolasAI	Python	A Library of Curated Disparity Testing Metrics for Use in Real-World Settings	SolasAI	Active
TRAK (Attributing Model Behaviour at Scale)	Python	A data attribution method called TRAK (Tracing with the Randomly-Projected After Kernel) to make accurate counterfactual predictions. See: Article	MIT	Inactive

This section is under review. The remaining entries will be added to the table with extended information.

AI Alignment

Circuit Breakers Python

AI Governance

Agent Governance Toolkit Microsoft
Governance Mega-Map Application The Company Ethos
Verifywise VerifyWise
Venturalitica SDK

AI Licensing

Licensing AI models adds new layers of complexity beyond what traditional software licenses manage. Models may have separate licenses for: (1) The code used to train the model, (2) The model weights after training, (3) The datasets used during training, and (4) The outputs generated when users interact with the model.

https://www.licenses.ai
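The four licensing layers described above can be pictured as a small record with one field per layer. The following is only a minimal sketch: the `ModelLicensing` class, its field names, and every license identifier in the example are hypothetical placeholders chosen for illustration, not the terms of any real model.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class ModelLicensing:
    """License terms for each layer of a released model (illustrative only)."""

    training_code: str      # (1) the code used to train the model
    model_weights: str      # (2) the model weights after training
    training_datasets: str  # (3) the datasets used during training
    generated_outputs: str  # (4) the outputs generated for users

    def distinct_terms(self) -> set:
        """Return the set of different licenses a user would need to review."""
        return set(asdict(self).values())


# Hypothetical release: every identifier below is a placeholder.
example = ModelLicensing(
    training_code="Apache-2.0",
    model_weights="custom-weights-license",
    training_datasets="CC-BY-4.0",
    generated_outputs="custom-output-terms",
)

# Print one line per layer so the separate terms are visible side by side.
for layer, license_name in asdict(example).items():
    print(f"{layer}: {license_name}")
```

Keeping the layers as separate fields makes it explicit that a permissive license on the code says nothing about the weights, the data, or the outputs.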

Audit

AIR Blackbox Python - Open-source EU AI Act compliance scanner and runtime trust layer for Python AI agents. HMAC-SHA256 tamper-evident audit chains, PII detection, and prompt injection blocking. Trust layers for LangChain, CrewAI, AutoGen, OpenAI, Google ADK, and Claude Agent SDK. (Website | PyPI)
PRML / falsify Python, JS, Go, Rust Studio 11 - Pre-Registered ML Manifest specification (CC BY 4.0). Commits an evaluation claim (metric, comparator, threshold, dataset hash, seed, producer identity) to a SHA-256 hash before the experiment runs. Tamper-evident audit trail; subcategory crosswalks for EU AI Act Article 12, NIST AI RMF, ISO/IEC 42001. Four byte-equivalent reference implementations across 20 conformance vectors. Zenodo DOI 10.5281/zenodo.20177839, in SchemaStore catalog.
glassalpha Python
Systima Comply TypeScript Systima
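Two ideas recur in the audit tools listed above: committing to a claim with a SHA-256 hash before an experiment runs, and chaining log entries with HMAC-SHA256 so that later edits become detectable. The sketch below illustrates both using only the Python standard library. It is not the implementation of any listed tool; the function names, the `GENESIS` constant, the record layout, and the canonical JSON encoding are all assumptions made for the example.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # hypothetical starting value for the first entry's "prev" field


def _canonical(obj: dict) -> str:
    """Serialize with sorted keys so the same content always gives the same bytes."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":"))


def commit_claim(claim: dict) -> str:
    """Return a SHA-256 commitment to a claim, to be published before the run."""
    return hashlib.sha256(_canonical(claim).encode("utf-8")).hexdigest()


def append_entry(chain: list, key: bytes, event: dict) -> None:
    """Append an event whose MAC covers both the event and the previous MAC."""
    prev_mac = chain[-1]["mac"] if chain else GENESIS
    message = (prev_mac + _canonical(event)).encode("utf-8")
    mac = hmac.new(key, message, hashlib.sha256).hexdigest()
    chain.append({"event": event, "prev": prev_mac, "mac": mac})


def verify_chain(chain: list, key: bytes) -> bool:
    """Recompute each MAC in order; edited or reordered entries fail the check."""
    prev_mac = GENESIS
    for entry in chain:
        message = (prev_mac + _canonical(entry["event"])).encode("utf-8")
        expected = hmac.new(key, message, hashlib.sha256).hexdigest()
        if entry["prev"] != prev_mac or not hmac.compare_digest(entry["mac"], expected):
            return False
        prev_mac = entry["mac"]
    return True
```

A caller would create an empty list, call `append_entry` for each event, and run `verify_chain` with the same key during an audit. Note that this sketch does not detect removal of the most recent entries; a real system would also need to anchor the latest MAC somewhere outside the log.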

Causal Inference

AIPW: Augmented Inverse Probability Weighting
caugi (Causal Graph Interface) R
CausalAI Python Salesforce
CausalNex Python
CausalImpact R
Causalinference Python
causaldef R
causalDT: Causal Distillation Trees R
Causal Inference 360 Python
CausalPy Python
CIMTx: Causal Inference for Multiple Treatments with a Binary Outcome R
dagitty R
DoWhy Python Microsoft
flexCausal R
ForCausality R
mediation: Causal Mediation Analysis R
MRPC R

Data Management

DataRec Python

Data Quality

Pointblank Python

Data Version Control

Drift

Alibi Detect Python
Deepchecks Python
drifter R
Evidently Python
nannyML Python
phoenix Python
PKBooks Rust

EU AI Act Compliance

AI Act Skills Skills Gemini Claude OpenAI
EuConform Python

Fairness

Aequitas' Bias & Fairness Audit Toolkit Python
AI360 Toolkit Python R IBM
dsld: Data Science Looks at Discrimination R
EDFfair: Explicitly Deweighted Features R
EquiPy Python
fairadapt R
faircause R
Fairlearn Python Microsoft
fairmetrics R
fmm-fairness-eval Python
Fairmodels R University of California
fairness R
Fairness Indicators Python Google
FairRankTune Python
Fairsight Toolkit Python
FairPAN - Fair Predictive Adversarial Network R
Intersectional Fairness (ISF) Python
OxonFair Python Oxford Internet Institute
Themis ML Python
What-If Tool Python Google

Feature Stores

Butterfree Python
Featureform Python
Feathr Python
Feast Python
Hopsworks Python

Interpretability/Explicability

Alibi Explain Python
Automated interpretability Python OpenAI
AI360 Toolkit Python R IBM
aorsf: Accelerated Oblique Random Survival Forests R
breakDown: Model Agnostic Explainers for Individual Predictions R
captum Python PyTorch
ceterisParibus: Ceteris Paribus Profiles R
DALEX: moDel Agnostic Language for Exploration and eXplanation Python R
DALEXtra: extension for DALEX Python R
Dianna Python
Diverse Counterfactual Explanations (DiCE) Python Microsoft
dtreeviz Python
ecco article Python
effector Python
effectplots R
eli5 Python
explabox Python National Police Lab AI
eXplainability Toolbox Python
ExplainaBoard Python Carnegie Mellon University
ExplainerHub in GitHub Python
e2tree R
fastshap R
fasttreeshap Python LinkedIn
FAT Forensics Python
ferret Python
flashlight R
Human Learn Python
hstats R
innvestigate Python Neural Networks
Inseq Python
interpretML Python
interactions: Comprehensive, User-Friendly Toolkit for Probing Interactions R
kernelshap: Kernel SHAP R
midr R
Learning Interpretability Tool Python Google
lime: Local Interpretable Model-Agnostic Explanations R
Network Dissection Python Neural Networks MIT
OmniXAI Python Salesforce
pre R
ReasonGraph Python
Shap Python
Shapash Python
shapper R
shapviz R
Skater Python Oracle
survex R
teller Python
TCAV (Testing with Concept Activation Vectors) Python
Transformer Debugger Python OpenAI
trulens Python Truera
trulens-eval Python Truera
pre: Prediction Rule Ensembles R
Vetiver R Python Posit
vip R
vivid R
XAI - An eXplainability toolbox for machine learning Python The Institute for Ethical Machine Learning
xplique Python
XAIoGraphs Python Telefonica
XAITK Python DARPA
Zennit Python

Interpretable Models

imodels Python
imodelsX Python
interpretML Python Microsoft R
PiML Toolbox Python
Tensorflow Lattice Python Google
Trust-free Python

Model Verification

Model Transparency Python Google Open Source Security Foundation

LLM Evaluations and Benchmarks

Measuring progress is fundamental to the advancement of any scientific field. As benchmarks play an increasingly central role, they also grow more susceptible to distortion. Read more about it in Singh, S., et al. (2025) The Leaderboard Illusion. arXiv preprint arXiv:2504.20879. In addition to the problem of distortion, we must remember that this is a nascent discipline, as stated in Weidinger, L., et al. (2025). Toward an evaluation science for generative AI systems. arXiv preprint arXiv:2503.05336. Benchmarks may appear to be neutral scoreboards; however, they embody more than that. Each one reflects a particular philosophy: the types of work valued, the definition of success, and what can be safely disregarded. Developing a truly effective benchmark is as challenging and as indispensable as developing the model itself.

New approaches are emerging to establish protocols and methodologies for conducting AI evaluations, such as PREP-Eval v1.0 (Pre-registration and REporting Protocol for AI Evaluations) or Evals-Consensus.

AbsenceBench Python
AIluminate
AI Wellbeing CAIS
AlignEval: Making Evals Easy, Fun, and Semi-Automated Motivation
AlpacaEval Python
ARC AGI 1 Python
ARC AGI 2 Python
ARES Python Stanford Future Data Systems
Artificial Analysis Omniscience Index Artificial Analysis
Autoeval Python
Azure AI Evaluation Python Microsoft
BabyReasoningBench Python
Banana-lyzer Python
BALROG Python
BIG-Bench Extra Hard Python DeepMind
BountyBench Python
BrokenMath: A Benchmark for Sycophancy in Theorem Proving with LLMs Python INSAIT SRILAB ETH Zürich
Catastrophic Cyber Capabilities Benchmark (3CB) Python
Chinese Safety Evaluations Concordia AI
CL-Bench Python
CLUE benchmark Python
CritPt Python
Cybench Python
DarkBench Python
DeepEval Python
DeepSWE Python
DELEGATE-52 Python Microsoft Paper LLMs Corrupt Your Documents When You Delegate
evals Python OpenAI
EvalScope Python
evmbench Python
FMBench Python Amazon
FlagEval Python BAAI
FBI: Finding Blindspots in LLM Evaluations with Interpretable Checklists Python
ForecastBench
ForesightSafety-Bench Python Beijing AISI
FrontierMath
Geekbench AI
GDPval Paper OpenAI
GPQA: A Graduate-Level Google-Proof Q&A Benchmark Python dataset Epoch Dashboard
Giskard Python
HAL Harness Python PLI
Harbor Python
HELM Python
Humanity's Last Exam (HLE) Scale AI Center for AI Safety
Humanity's Last Exam (HLE)-Verified
HybridRAG-Bench
Inspect Python UK AISI
- Inspect Petri Python UK AISI Meridian Labs
- Inspect Scout Python Meridian Labs
- Inspect Flow Python Meridian Labs
- Inspect Petri Dish Python Meridian Labs
- Inspect Petri Bloom Python Meridian Labs
Intercode Python
Intima Benchmark Paper HuggingFace
Jailbreakbench Python
JailBreakV-28K Python
JGLUE: Japanese General Language Understanding Evaluation Python
KLUE: Korean Language Understanding Evaluation Python
LABBench2 Python Edison
Machiavelli Python Center for AI Safety
MalayMMLU Python YTL AI Labs
Mask Benchmark Python Center for AI Safety Scale AI
Math Science Bench
MCPBench: A Benchmark for Evaluating MCP Servers Python ModelScope
MixEval Python
ML Commons Safety Benchmark for general purpose AI chat model
MLflow LLM Evaluation Python
MLGym Python Facebook Agents
MLPerf Training Benchmark Training
MMMU Apple Python
Moonshot AI Verify Foundation Python
MoReBench Python
Multi-SWE-bench: A Multilingual Benchmark for Issue Resolving Python ByteDance
NaturalBench Python
NYU CFT Bench Python
Langchain Evaluations Python
Langfuse Scores Python
LightEval HuggingFace Python
LiveBench: A Challenging, Contamination-Free LLM Benchmark Contamination free
LM Evaluation Harness Python
lmms-eval Python
OffsetBias: Leveraging Debiased Data for Tuning Evaluators Python
opik Comet Python
Petri Python
Phare LLM Benchmark Python Giskard AI
Phoenix Arize AI Python
Political Even-handedness Evaluation Python Anthropic AI
Prometheus Python
Promptfoo Python
Prophet Arena Sigma Research Lab @UChicago
PurpleLlama Python Meta
Pydantic Evals Python
ragas Python
Remote Labor Index Center for AI Safety
RewardBench: Evaluating Reward Models Python Ai2
Rouge Python
SALAD-BENCH Article Python
SciCode Python
Selene Mini Python Atla
simple evals Python OpenAI
SnitchBench Python
StrongREJECT jailbreak benchmark Python
SWE-bench Verified Python
TealTiger Python TypeScript
terminal-bench
TextQuests Python Center for AI Safety
The Berkeley Function Calling Leaderboard (BFCL) Python Berkeley
τ²-bench: Evaluating Conversational Agents in a Dual-Control Environment Python
Yet Another Applied LLM Benchmark Python
Vending Bench Andon Labs
Verdict Python
Virology Capabilities Test Center for AI Safety
vitals R Posit
VCBench Paper
VLMEvalKit Python
Weapons of Mass Destruction Proxy (WMDP) benchmark Python
Werewolf Social Bench
WindowsAgentArena Python Microsoft
XEvalAD Python

Additional benchmarks can be found here. The AI Benchmarking Hub (from Epoch) compares the latest frontier AI models against each other, and the CAIS AI Dashboard (from the Center for AI Safety) provides its latest benchmarks (text, vision, safety, and automation). You can also learn about prompt evaluations here (by Anthropic).

LLM Regulation Compliance

COMPL-AI Python ETH Zurich INSAIT LatticeFlow AI
Tunix Python Google

Performance (& Automated ML)

auditor R
automl: Deep Learning with Metaheuristic R
AutoKeras Python
Auto-Sklearn Python
DataPerf Python Google
deepchecks Python
EloML R
Featuretools Python
LOFO Importance Python
forester R
metrica: Prediction performance metrics R
MLmetrics R
model-diagnostics Python
NNI: Neural Network Intelligence Python Microsoft
performance R
rliable Python Google
roclab: ROC-Optimizing Binary Classifiers R
ROCnGO R
Silhouette R
SLmetrics R
TensorFlow Model Analysis Python Google
TPOT Python
Unleash Python
yardstick R
Yellowbrick Python
WeightWatcher (Examples) Python

(AI/Data) Poisoning

Copyright Traps for Large Language Models Python
Nightshade University of Chicago Tool
Glaze University of Chicago Tool
Fawkes University of Chicago Tool

Privacy

BackPACK Python
diffpriv R
Diffprivlib Python IBM
Discrete Gaussian for Differential Privacy Python IBM
Faker Python
FakeDataR: Privacy-Preserving Synthetic Data for 'LLM' Workflows R
GRANDpriv R
JAX-Privacy Python DeepMind
Opacus Python Facebook
Privacy Meter Python National University of Singapore
PyVacy: Privacy Algorithms for PyTorch Python
SEAL Python Microsoft
TensorFlow Privacy Python Google

Red Teaming

AutoDan Python
Rival AI Python
TextAttack Python

Reliability Evaluation (of post hoc explanation methods and LLM evaluations)

BELLS (Benchmark for the Evaluation of LLM Safeguards) Python CeSIA - Centre pour la Sécurité de l'IA
BetterBench Database
openXAI Python

Robustness

Adversarial Robustness Toolbox (ART) Python
BackdoorBench Python
Factool Python
Foolbox Python
Guardrails Python Guardrails Hub

Safety

AIxploit Python
Bandit Python
Dioptra Python NIST
Garak Python Nvidia
Model Inversion Attack ToolBox Python
NeMo Guardrails Python Nvidia
Qwen3Guard Python Alibaba
RAXE Python
Safety CLI Python
wildguard Python AllenAI

Security

Counterfit Python Microsoft
detect-secrets Python
Modelscan Python
LLM Guard Python
NB Defense Python
PyRIT Python Microsoft
Rebuff Playground Python
Resk-LLM Python
Turing Data Safe Haven Python The Alan Turing Institute

For consumers:

Synthetic Data

Curator
DataSynthesizer: Privacy-Preserving Synthetic Datasets Python Drexel University University of Washington
Gretel Synthetics Python
SmartNoise Python OpenDP
SDV Python
Snorkel Python
YData Synthetic Python

Sustainability

AI Energy Consumption Calculator
AI Energy Score
Azure Sustainability Calculator Microsoft
Carbon Tracker Website Python
CodeCarbon Website Python
Computer Progress
Eco2AI Python
Green Algorithms
Impact Framework API
ML CO2 IMPACT
The ML.ENERGY Data & Toolkit The ML.ENERGY Leaderboard

(RAI) Toolkit

Deepchecks Python
Dr. Why R Warsaw University of Technology
Mercury Python BBVA
Responsible AI Toolbox Python Microsoft
Responsible AI Widgets R Microsoft
The Data Cards Playbook Python Google
Zeno Hub Python

(AI) Watermarking

AudioSeal: Proactive Localized Watermarking Python Facebook
MarkLLM: An Open-Source Toolkit for LLM Watermarking Python
SynthID Text Python Google

Citing this repository

Contributors with more than 50 edits can be named as coauthors in the citation. All other contributors are included under "et al."
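
The naming rule above can be sketched in a few lines of Python. This is only an illustration, not a script from this repository: the function name `citation_authors`, the contributor names, and the edit counts are hypothetical. It also assumes that the lead author is always listed first, that named coauthors are ordered by edit count, and that a contributor with exactly 50 edits is not named.

```python
from typing import Mapping

COAUTHOR_THRESHOLD = 50  # edits; the threshold stated in the rule above


def citation_authors(lead: str, edits: Mapping[str, int]) -> str:
    """Build a BibTeX-style author string (hypothetical helper).

    Assumptions: `lead` always comes first, contributors with more than
    COAUTHOR_THRESHOLD edits are named in descending order of edits, and
    "et al." is appended whenever at least one contributor stays unnamed.
    """
    # Everyone except the lead author is a candidate coauthor.
    others = {name: n for name, n in edits.items() if name != lead}
    # Name only contributors strictly above the threshold.
    named = sorted(
        (name for name, n in others.items() if n > COAUTHOR_THRESHOLD),
        key=lambda name: (-others[name], name),
    )
    # BibTeX separates individual authors with the word "and".
    authors = " and ".join([lead, *named])
    has_unnamed = len(named) < len(others)
    return f"{authors} et al." if has_unnamed else authors


if __name__ == "__main__":
    # Hypothetical edit counts, for illustration only.
    counts = {"Lead Author": 900, "Contributor A": 75, "Contributor B": 3}
    print(citation_authors("Lead Author", counts))
```

With the hypothetical counts shown, Contributor A is named and Contributor B falls under "et al."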

BibTeX

@misc{arai_repo,
  author={Josep Curto et al.},
  title={Awesome Responsible Artificial Intelligence},
  year={2026},
  note={\url{https://github.com/AthenaCore/AwesomeResponsibleAI}}
}

ACM, APA, Chicago, and MLA

ACM (Association for Computing Machinery)

Curto, J., et al. 2026. Awesome Responsible Artificial Intelligence. GitHub. https://github.com/AthenaCore/AwesomeResponsibleAI.

APA (American Psychological Association) 7th Edition

Curto, J., et al. (2026). Awesome Responsible Artificial Intelligence. GitHub. https://github.com/AthenaCore/AwesomeResponsibleAI.

Chicago Manual of Style 17th Edition

Curto, J., et al. "Awesome Responsible Artificial Intelligence." GitHub. Last modified 2026. https://github.com/AthenaCore/AwesomeResponsibleAI.

MLA (Modern Language Association) 9th Edition

Curto, J., et al. "Awesome Responsible Artificial Intelligence." GitHub, 2026, https://github.com/AthenaCore/AwesomeResponsibleAI. Accessed 28 May 2026.