adversarial-testing
Here are 30 public repositories matching this topic...
Elenchus MCP Server - Adversarial verification system for code review
Updated Jan 29, 2026 - TypeScript
AI safety evaluation framework testing LLM epistemic robustness under adversarial self-history manipulation
Updated Dec 18, 2025 - Python
A marketplace of Claude Code plugins for adversarial security and architectural code review.
Updated Feb 28, 2026
Benchmark LLM jailbreak resilience across providers with standardized tests, adversarial mode, rich analytics, and a clean Web UI.
Updated Aug 12, 2025 - Python
Adversarial MCP server benchmark suite for testing tool-calling security, drift detection, and proxy defenses
Updated Dec 27, 2025 - JavaScript
URF Application Stress Test: adversarial and scalability tests for Unified Rigidity Framework applications, validating limits under load, noise, and edge cases.
Updated Feb 24, 2026 - Shell
Red team toolkit for stress-testing MCP security scanners — find detection gaps before attackers do
Updated Mar 2, 2026 - Python
Compliance-focused vulnerability probes for NVIDIA garak, targeting LLMs in regulated industries (CMMC, NIST, HIPAA, DFARS)
Updated Feb 17, 2026 - Python
LLM-powered fuzzing and adversarial testing framework for Solana programs. Generates intelligent attack scenarios, builds real transactions, and reports vulnerabilities with CWE classifications.
Updated Jan 19, 2026 - Python
Adversarial testing of LLMs on constraint satisfaction deadlocks
Updated Jan 27, 2026
Extremely hard, multi-turn, open-source-grounded coding evaluations that reliably break every current frontier model (Claude, GPT, Grok, Gemini, Llama, etc.) on numerical stability, zero-allocation, autograd, SIMD, and long-chain correctness.
Updated Jan 27, 2026
A dependency-aware Bayesian belief gate that resists correlated evidence and yields only under true independent verification.
Updated Jan 18, 2026 - Python
Generate adversarial pytest tests using an LLM. Tries to find edge cases in your Python code.
Updated Jan 22, 2026 - Python
Forensic-style adversarial audit of Google Gemini 2.5 Pro revealing hidden cross-session memory. Includes structured reports, reproducible contracts, SHA-256 checksums, and video evidence of 28-day semantic recall and affective priming. Licensed under CC-BY 4.0.
Updated Oct 7, 2025 - PowerShell
Adversarial testing and robustness evaluation for the Crucible framework
Updated Dec 29, 2025 - Elixir
Analysis of ChatGPT-5 reviewer failure: speculative reasoning disguised as certainty. Captures how evidence-only review drifted into hypotheses, later admitted as review-process failure. Includes logs, checksums, screenshots, and external video.
Updated Oct 7, 2025 - PowerShell
Adversarial plan verification for Claude Code
Updated Feb 28, 2026
🔒 Simulate adversarial behaviors to test and strengthen MCP defenses without real exploitation or risk, ensuring robust security evaluations.
Updated Mar 3, 2026 - JavaScript
Domain-expert evaluation framework for AI judgment quality in healthcare investing
Updated Mar 1, 2026 - Python