Use the same platform for sprint gates, release assurance, audit prep, AI product validation, executive risk, and continuous attack-surface monitoring.
AI security
AI product release
LLM red team, agentic tool tests, guardrails, policy reports, and regression suites.
Findings, reports, dashboards, exports, integrations, and retests all read from the same normalized record.
Pencheff favors repeatable checks, then uses AI for triage, enrichment, orchestration, and remediation where it adds signal.
Coverage
What does AI product release test?
- LLM red team, agentic tool tests, guardrails, policy reports, and regression suites.
- This page is part of Solutions under Program Workflows.
- It links back into the broader security programs without fragmented tooling experience.
- OWASP LLM Top 10 coverage for prompt injection, sensitive information disclosure, supply chain, data leakage, plugins, agency, overreliance, and model theft.
- Jailbreak strategies, roleplay, encoding, payload splitting, multilingual variants, custom datasets, and judge-backed scoring.
- Agentic tests for tool authorization, memory poisoning, context exfiltration, planner hijacking, and unsafe side effects.
- Sentry runtime guardrails, HTTP sidecars, LiteLLM plugins, MCP middleware, PII, secrets, unsafe HTML, and tool authorization checks.
- AI governance mapping to OWASP LLM, MITRE ATLAS, NIST AI RMF, EU AI Act, ISO/IEC 42001, GDPR, and SOC 2.
Execution
How does Pencheff run this?
- Register an LLM endpoint, chatbot, model gateway, MCP host, or agent workflow.
- Choose built-in categories, datasets, guardrails, custom prompts, and optional judge settings.
- Run adversarial campaigns across prompt, tool, memory, retrieval, output, and policy paths.
- Classify failures by category, strategy, severity, transcript, token cost, and guardrail recommendation.
- Turn passing and failing prompts into regression suites for releases and model upgrades.
Evidence
What evidence does this produce?
- Prompt, response, tool call, policy decision, transcript, category, strategy, judge result, and confidence.
- Recommended guardrails with exact unsafe behavior, enforcement point, and regression prompt.
- Token usage, model/provider metadata, retry behavior, and cost-oriented observability.
- Governance mappings for AI risk, safety, privacy, and compliance programs.
Controls
How is this kept safe to run?
- Tests can be run through HTTP, chat-completions, LiteLLM, MCP, or custom adapters.
- Guardrail recommendations stay tied to the scan that exposed the failure.
- Agentic testing focuses on authorization, context boundaries, and side-effect control.
- Runtime policy checks can be placed before prompts, after responses, or around tools.
Documentation
Read the full reference.
FAQ
Common questions
- What security testing should I run before releasing an AI product?
- Before releasing an AI product, you should run an LLM red team assessment covering the OWASP LLM Top 10, an agentic security test if your product uses tool-calling, a Sentry guardrail evaluation, and a supply-chain scan of your AI dependencies. Pencheff covers all four in a single platform.
- How does Pencheff help with AI product security certification?
- Pencheff produces audit-ready reports mapping AI security findings to OWASP LLM Top 10, MITRE ATLAS, and NIST AI RMF categories. These reports serve as evidence for enterprise customer security reviews, regulatory submissions, and AI governance programmes.
- What is the minimum security bar for an AI product release?
- At minimum, an AI product should demonstrate: no exploitable prompt injection vulnerabilities, no sensitive data leakage in model responses, guardrails on harmful output, and a secure supply chain for model weights and dependencies. Pencheff's AI product release profile covers all of these.
Related