Adversarial Probing

A red-team campaign against your deployed AI endpoints — multi-vector, multi-day, and authorized in writing.

Adversarial Probing is the discipline of attempting to break a live AI deployment using the same techniques real adversaries use — prompt injection, jailbreak escalation, system-prompt extraction, output handling abuse — under controlled conditions and with full authorization. We test what you've shipped, not what you've built in a lab.

// THE PROBLEM

What we're solving when you hire us for this

Most AI systems that make it to production have been tested against benchmark suites — static, well-known adversarial prompts that AI safety teams have already trained against. Passing those benchmarks tells you the model behaves well on a known test set. It does not tell you whether your specific stack, with your specific system prompt, your specific tool integrations, and your specific user inputs, can survive an attacker who has spent a week studying your application.

Adversarial Probing closes that gap. We probe your actual deployment with techniques that are not in any published benchmark — adapted to your stack's architecture, your business context, and your specific defensive posture. The findings are reproducible. The methodology is documented. The deliverables are remediation-ready.

// HOW WE RUN IT

The five phases of an Adversarial Probing engagement

Reconnaissance

We document your deployment's surface: model in use, system prompt structure, retrieval architecture, tool integrations, output channels, and user interaction patterns. Reconnaissance is read-only — no probing yet, just mapping. The output is a written threat model specific to your stack, reviewed and approved by you before any active testing begins.

Duration: 3–5 days · Output: written threat model + approval gate

Payload Development

We construct attack payloads tailored to the threat model. This is not a generic jailbreak corpus — payloads are designed against your specific architecture, your specific system prompt, and your specific business context. We disclose the payload categories in writing before testing begins; we do not disclose the specific payloads (they are how we test).

Duration: 2–3 days · Output: payload categories disclosed, specifics held

Active Testing

We execute payloads against your deployment in pre-agreed test windows, recording inputs, outputs, and behavioral changes. Testing is fully logged and timestamped. HIGH-severity findings are communicated within 24 hours of discovery, regardless of phase; CRITICAL findings within 4 hours. We do not hold findings for the final report.

Duration: 5–10 days · Output: live findings stream
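
As a rough illustration of the notification windows in this phase, the sketch below computes the latest notification time from a finding's discovery timestamp. `NOTIFY_WITHIN` and `notification_deadline` are hypothetical names, not part of any tooling described here. The text only states windows for HIGH and CRITICAL, so treating MEDIUM and LOW as having no fast-track window is an assumption.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Windows for CRITICAL and HIGH come from the Active Testing description.
# MEDIUM and LOW have no stated window; None is an assumption meaning
# "reported through the normal findings stream".
NOTIFY_WITHIN = {
    "CRITICAL": timedelta(hours=4),
    "HIGH": timedelta(hours=24),
    "MEDIUM": None,
    "LOW": None,
}


def notification_deadline(severity: str, discovered_at: datetime) -> Optional[datetime]:
    """Return the latest time the client should be notified, or None."""
    window = NOTIFY_WITHIN[severity.upper()]
    return None if window is None else discovered_at + window


found = datetime(2025, 1, 6, 9, 30, tzinfo=timezone.utc)
print(notification_deadline("critical", found))  # 2025-01-06 13:30:00+00:00
```

The deadline is measured from discovery, not from the end of a test window, which matches the "regardless of phase" wording.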

Exploitation & Impact Analysis

For each successful attack, we document what an attacker could accomplish — data exfiltration paths, privilege escalation chains, business-logic compromise scenarios. Severity ratings reflect realistic impact on your specific deployment, not abstract risk. We do not exploit beyond the proof needed to establish the finding.

Duration: 3–5 days · Output: per-finding impact analysis

Reporting & Remediation Handoff

The final deliverables are a structured findings document, a sanitized executive summary, and a remediation roadmap prioritized by impact and effort. We meet with your engineering team to walk through findings and answer questions during the remediation window. Materials are deleted 30 days after engagement close per our Rules of Engagement.

Duration: 3–5 days · Output: report + remediation walkthrough

// WHAT YOU RECEIVE

Deliverables, named and specific

Findings Document

Structured report covering every finding: reproduction steps, affected components, severity rating, exploitation evidence, and recommended remediation. Written for your security and engineering teams.

30–50 pages typical · Markdown + PDF
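
To make the per-finding structure concrete, here is a minimal sketch of one finding as a record with the fields listed above, rendered to Markdown. The `Finding` and `Severity` names, the four-level severity scale, and the example values are all hypothetical; only HIGH and CRITICAL are named in this document, and the actual report layout may differ.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    # Only HIGH and CRITICAL appear in the text; LOW and MEDIUM are assumed.
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


@dataclass
class Finding:
    title: str
    severity: Severity
    affected_components: list[str]
    reproduction_steps: list[str]
    exploitation_evidence: str
    recommended_remediation: str

    def to_markdown(self) -> str:
        """Render the finding as a Markdown section with numbered steps."""
        steps = "\n".join(
            f"{i}. {step}" for i, step in enumerate(self.reproduction_steps, 1)
        )
        return (
            f"## {self.title}\n\n"
            f"Severity: {self.severity.name}\n\n"
            f"Affected components: {', '.join(self.affected_components)}\n\n"
            f"Reproduction steps:\n{steps}\n\n"
            f"Evidence: {self.exploitation_evidence}\n\n"
            f"Remediation: {self.recommended_remediation}\n"
        )


# Illustrative values only, not a real finding.
example = Finding(
    title="System prompt disclosed through role-play request",
    severity=Severity.HIGH,
    affected_components=["chat endpoint"],
    reproduction_steps=["Open a new session", "Send the referenced payload"],
    exploitation_evidence="Transcript excerpt, timestamped",
    recommended_remediation="Filter responses that echo system prompt content",
)
print(example.to_markdown())
```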

Executive Summary

Sanitized summary of one to two pages, suitable for board-level reporting, compliance documentation, or executive briefings. Names attack classes and risk levels without exposing exploitation details.

1–2 pages · Markdown + PDF

Remediation Roadmap

Findings prioritized by impact and engineering effort, with concrete remediation steps for each. Not generic guidance — specific to your stack and tied to specific findings.

Roadmap document + tracking template
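
One way to read "prioritized by impact and engineering effort" is sketched below: highest impact first, and among equal impact, lowest effort first. This ordering rule, the 1-to-5 scales, the `RoadmapItem` and `prioritize` names, and the sample entries are assumptions for illustration, not the scoring method used in an engagement.

```python
from dataclasses import dataclass


@dataclass
class RoadmapItem:
    finding: str
    impact: int  # 1 (low) to 5 (high); scale is an assumption
    effort: int  # 1 (small) to 5 (large); scale is an assumption


def prioritize(items: list[RoadmapItem]) -> list[RoadmapItem]:
    """Sort by impact descending, then by effort ascending."""
    return sorted(items, key=lambda item: (-item.impact, item.effort))


# Illustrative entries only.
items = [
    RoadmapItem("Verbose error output", impact=2, effort=1),
    RoadmapItem("Tool call without authorization check", impact=5, effort=3),
    RoadmapItem("System prompt extraction", impact=5, effort=1),
]

for item in prioritize(items):
    print(f"{item.finding} (impact {item.impact}, effort {item.effort})")
```

With these sample values, the two impact-5 items come first and the cheaper fix leads, so quick high-impact wins surface at the top of the tracking template.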

Threat Model

Written threat model produced during reconnaissance, approved by you before testing began. Useful as ongoing reference for your engineering team beyond the engagement.

10–15 pages · Markdown

Reproduction Bundle

Each finding is reproducible. Reproduction steps, payload references (where appropriate), and environmental conditions are documented so your team can verify fixes and prevent regression.

Per-finding reproduction documentation

Remediation Walkthrough

A working session (60–90 minutes) with your engineering and security teams to walk through findings, prioritize fixes, and answer questions during the remediation window.

Live session + recording

// ENGAGEMENT SHAPE

Specific numbers, not approximations

// DURATION

3–5 weeks

Total engagement window

// TEAM SIZE

2 practitioners

Minimum, both senior

// CADENCE

Daily async updates

By 18:00 client timezone

// CRITICAL FINDING SLA

< 4 hours

Notification, not remediation

// SCOPE

Written, in SOW

No verbal expansion

// STARTING PRICE

$22,500

Single-deployment engagement

// REPORT DELIVERY

< 5 business days

After engagement close

// MATERIAL RETENTION

30 days default

Per Rules of Engagement

// WHEN THIS IS RIGHT

Honest fit criteria

// THE RIGHT FIT

—

Your AI system is in production or close to launch — testing has a real deployment to test against, not a research prototype.

—

You have engineering resources to act on findings — a report that sits in a drawer is worthless. We're a poor fit for orgs that need findings but cannot remediate.

—

You can authorize testing in writing — your stack, your endpoints, your scope. If authorization requires multiple sign-offs that take weeks, start that process before booking the engagement.

—

You can absorb findings without panic — adversarial probing finds vulnerabilities. That's the point. Orgs that treat findings as failures rather than data tend to bury reports.

// THE WRONG FIT

—

You need a 'red team in 5 days' — adversarial probing takes at least 3 weeks. Shorter engagements are theater. See Injection Vector Mapping for a faster, narrower scan.

—

You're looking for a compliance checkbox — we deliver compliance artifacts as a side effect, but if the primary goal is documentation rather than security, AI Risk Assessment is a better fit.

—

Your system isn't deployed yet — pre-launch threat modeling is valuable but it's not Adversarial Probing. We work against live systems.

—

You expect us to find nothing — every Adversarial Probing engagement finds something. If a clean report is the goal, this isn't the engagement.

// RELATED ENGAGEMENTS

Where this connects to the rest of our work

Injection Vector Mapping

A narrower, faster scan focused on prompt injection surface only. Often run as a prelude to Adversarial Probing.

Agentic Guardrails

The defensive counterpart. After we find weaknesses, this is the engagement that hardens runtime behavior.

Incident Response

If probing surfaces active exploitation, this is the engagement that triages the breach.

Adversarial Probing engagements start from $22,500. We reply within 24 hours. An NDA is signed before scoping begins.
