Semantic Stealth Attacks & Symbolic Prompt Red Teaming on GPT and other LLMs.
-
Updated
May 16, 2025
Semantic Stealth Attacks & Symbolic Prompt Red Teaming on GPT and other LLMs.
AATMF | An Open Source - Adversarial AI Threat Modeling Framework
VEX Protocol — The trust layer for AI agents. Adversarial verification, temporal memory, Merkle audit trails, and tamper-proof execution. Built in Rust.
Test and evaluate Large Language Models against prompt injections, jailbreaks, and adversarial attacks with a web-based interactive lab.
LLM Attack Testing Toolkit is a structured methodology and mindset framework for testing Large Language Model (LLM) applications against logic abuse, prompt injection, jailbreaks, and workflow manipulation.
Proof of concept tool to bypass document replay technology (such as GPTZero).
A research framework for simulating, detecting, and defending against backdoor loop attacks in LLM-based multi-agent systems.
Implementation of Vocabulary-Based Adversarial Fuzzing (VB-AF) to systematically probe vulnerabilities in Large Language Models (LLMs).
🛡️ Enterprise-grade AI security framework protecting LLMs from prompt injection attacks using ML-powered detection
Breaking Chain-of-Thought: A Comprehensive Taxonomy of Reasoning Vulnerabilities in Production AI Systems
Pit AI models against each other. Score them sealed. Crown a winner. All built using the GitHub Copilot CLI. ⚡
🔍 Emulate advanced phishing tactics ethically with this open-source framework for red team operations focused on social engineering sophistication.
AI Security Research: Gemini 3.0 Pro S2-Class Exfiltration & Adversarial Robustness. Hardening frontier models against autonomous mutation vectors. NIST VDP / AI Safety Institute compliant.
👻 Adversarial AI Pentester - CHAOS vs ORDER dual-agent exploitation with collective memory
Formal research on Cognitive Side-Channel Extraction (CSCE) and AI semantic leakage vulnerabilities.
A Django-based platform for testing LLMs against prompt injection, social engineering, and policy bypass attacks using red teaming methodologies.
Ethically-bounded red team framework for AI-driven social engineering simulation with consent enforcement and identity graph mapping
[Veracity] Dual-LLM hallucination defense — adversarial verification with Localization Gap detection for Arabic knowledge
Código y demos para generar exploits de kernel vulnerables y defensas en tiempo real con IA.
Adversarial verification layer for AI coding assistants. Based on IACDM — Interactive Adversarial Convergence Development Methodology. The AI proposes. Versus critiques.
Add a description, image, and links to the adversarial-ai topic page so that developers can more easily learn about it.
To associate your repository with the adversarial-ai topic, visit your repo's landing page and select "manage topics."