oss-llm-security

December 23, 2025

Curated list of open source projects focused on LLM security

Tools / projects

EasyJailbreak - An easy-to-use Python framework to generate adversarial jailbreak prompts.
fast-llm-security - The fastest && easiest LLM security and privacy guardrails for GenAI apps.
Garak - LLM vulnerability scanner. garak checks if an LLM can be made to fail in a way we don't want. garak probes for hallucination, data leakage, prompt injection, misinformation, toxicity generation, jailbreaks, and many other weaknesses. If you know nmap, it's nmap for LLMs.
HouYi - The automated prompt injection framework for LLM-integrated applications.
langkit - An open-source toolkit for monitoring Large Language Models (LLMs). Extracts signals from prompts & responses, ensuring safety & security.
llm-attacks - Universal and Transferable Attacks on Aligned Language Models
llm-guard - The Security Toolkit for LLM Interactions. LLM Guard by Protect AI is a comprehensive tool designed to fortify the security of Large Language Models (LLMs).
llm-security - Dropbox LLM Security research code and results. This repository contains scripts and related documentation that demonstrate attacks against large language models using repeated character sequences. These techniques can be used to execute prompt injection on content-constrained LLM queries.
llm-security - New ways of breaking app-integrated LLMs
modelscan - Protection against Model Serialization Attacks
Open-Prompt-Injection - Prompt injection attacks and defenses in LLM-integrated applications
plexiglass - A toolkit for detecting and protecting against vulnerabilities in Large Language Models (LLMs).
ps-fuzz - Make your GenAI Apps Safe & Secure 🚀 Test & harden your system prompt
PurpleLlama - Set of tools to assess and improve LLM security.
promptfoo - LLM red teaming and evaluation framework with modelaudit for scanning ML models for malicious code and backdoors.
promptmap - Automatically tests prompt injection attacks on ChatGPT instances.
PyRIT - The Python Risk Identification Tool for generative AI (PyRIT) is an open access automation framework to empower security professionals and machine learning engineers to proactively find risks in their generative AI systems.
rebuff - LLM Prompt Injection Detector.
TrustGate - LLM & Agent attacks detector - Generative Application Firewall (GAF)
vibraniumdome - LLM Security Platform.
vigil-llm - ⚡ Vigil ⚡ Detect prompt injections, jailbreaks, and other potentially risky Large Language Model (LLM) inputs.

By OWASP Top 10 for LLM Applications

LLM01: Prompt Injection

LLM02: Insecure Output Handling

LLM03: Training Data Poisoning

LLM04: Model Denial of Service

LLM05: Supply Chain Vulnerabilities

modelscan

LLM06: Sensitive Information Disclosure

LLM07: Insecure Plugin Design

LLM08: Excessive Agency

LLM09: Overreliance

LLM10: Model Theft

Resources

awesome-llm-security - A curation of awesome tools, documents and projects about LLM Security.
llm-security - https://llmsecurity.net/ - large language model security content - research, papers, and news