SERVICES · DEFENSIVE

RAG Perimeter

Audits and hardens the boundary between your retrieval pipeline and your model — where 2026's most common breaches happen.

RAG Perimeter is a focused defensive engagement around retrieval-augmented generation systems. We audit permission bypasses, cross-tenant leakage, embedding poisoning, and indirect injection through ingested documents — then design the defenses that close those gaps. The engagement is narrower than Adversarial Probing but goes deeper on the RAG-specific attack surface.

// THE PROBLEM

What we're solving when you hire us for this

RAG systems are the single fastest-growing AI deployment pattern in 2026, and the most commonly breached. The reasons are structural: retrieval brings untrusted documents into the model's context, vector stores often lack proper access controls, similarity search can leak data across tenants, and document-ingest pipelines are rarely treated as security boundaries. Most RAG deployments have all four problems.

RAG Perimeter addresses each of them. We audit the retrieval boundary, identify the specific permission, leakage, and injection gaps in your deployment, and design defenses — access controls, sanitization, isolation, monitoring. The work is RAG-specific and deeper than what a general AI audit covers.

// HOW WE RUN IT

The five phases of a RAG Perimeter engagement

Retrieval Topology Audit

We document your RAG pipeline: document sources, ingestion process, embedding model, vector store, similarity search logic, retrieval-to-prompt assembly, and downstream model. Each handoff is a potential boundary.

Duration: 3–5 days · Output: pipeline map
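To make "each handoff is a potential boundary" concrete, here is a minimal sketch that lists the stages named above and enumerates the handoffs between them. The stage labels come from the description; the `handoffs` helper and the strictly linear ordering are assumptions for illustration, since real pipelines often branch.

```python
# Stage names follow the pipeline described above; a linear order is assumed.
STAGES = [
    "document sources",
    "ingestion",
    "embedding model",
    "vector store",
    "similarity search",
    "prompt assembly",
    "downstream model",
]


def handoffs(stages):
    """Return each adjacent (upstream, downstream) pair as a candidate boundary."""
    return list(zip(stages, stages[1:]))


for upstream, downstream in handoffs(STAGES):
    print(f"{upstream} -> {downstream}")
```

Seven stages yield six handoffs, and each one is a place where the map should record what data crosses and who is allowed to send it.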

Boundary Threat Modeling

Against the topology, we identify the specific threats: document-layer injection, cross-tenant similarity leakage, ACL bypass via vector queries, embedding poisoning, context-window stuffing, retrieval-result manipulation.

Duration: 2–3 days · Output: threat model
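One way to picture a threat model laid against the topology is a simple mapping from each threat to the pipeline stage where it would most plausibly be checked. The threat names are taken from the list above; the stage assignments are illustrative assumptions, not a fixed taxonomy, and a real model would be specific to the deployment.

```python
from collections import defaultdict

# Threat -> stage assignments are hypothetical examples for illustration only.
THREATS = {
    "document-layer injection": "ingestion",
    "embedding poisoning": "embedding model",
    "ACL bypass via vector queries": "vector store",
    "cross-tenant similarity leakage": "similarity search",
    "context-window stuffing": "prompt assembly",
    "retrieval-result manipulation": "prompt assembly",
}


def threats_by_stage(threats):
    """Group threat names by the stage they are assigned to."""
    grouped = defaultdict(list)
    for threat, stage in threats.items():
        grouped[stage].append(threat)
    return dict(grouped)


for stage, names in threats_by_stage(THREATS).items():
    print(f"{stage}: {', '.join(names)}")
```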

Vulnerability Testing

We test each identified threat against your live RAG deployment — confirming which boundaries hold and which fail. Findings are categorized by which boundary failed and how an attacker exploits it.

Duration: 5–7 days · Output: tested findings
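A common way to confirm whether a tenant boundary holds is a canary test: plant a unique marker in one tenant's documents, query as a different tenant, and check whether the marker comes back. The sketch below assumes a hypothetical `retrieve(index, tenant, query)` interface and uses two toy keyword retrievers in place of a real vector store.

```python
import secrets


def make_canary():
    """Generate a unique marker that should never be visible to another tenant."""
    return f"CANARY-{secrets.token_hex(8)}"


def leaks_across_tenants(retrieve, index, canary, other_tenant, query):
    """Return True if a query issued as other_tenant retrieves the canary text."""
    results = retrieve(index, tenant=other_tenant, query=query)
    return any(canary in doc["text"] for doc in results)


# Toy retrievers (hypothetical): one ignores the caller's tenant, one scopes to it.
def retrieve_unscoped(index, tenant, query):
    return [d for d in index if query.lower() in d["text"].lower()]


def retrieve_scoped(index, tenant, query):
    return [
        d for d in index
        if d["tenant"] == tenant and query.lower() in d["text"].lower()
    ]


canary = make_canary()
index = [
    {"tenant": "tenant-a", "text": f"Quarterly pricing notes {canary}"},
    {"tenant": "tenant-b", "text": "Public pricing FAQ"},
]

print(leaks_across_tenants(retrieve_unscoped, index, canary, "tenant-b", "pricing"))  # True
print(leaks_across_tenants(retrieve_scoped, index, canary, "tenant-b", "pricing"))    # False
```

Against a live deployment the same check would call the real retrieval path, and the canary document would be removed once the test is complete.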

Hardening Design

For each failing boundary, we design the defense: tenant isolation in vector storage, document sanitization in ingest, ACL enforcement in retrieval, monitoring for poisoning patterns.

Duration: 3–4 days · Output: design document + approval gate
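As a sketch of what ACL enforcement in retrieval can look like, the example below filters chunks by tenant and role before ranking them by cosine similarity. The `Chunk` fields, the role model, and the in-memory list are assumptions for illustration; production vector stores usually express the same idea through namespaces or metadata filters.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    tenant: str
    allowed_roles: frozenset
    embedding: tuple
    text: str


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(chunks, query_embedding, tenant, roles, k=3):
    """Filter by tenant and role first, then rank the survivors by similarity."""
    permitted = [
        c for c in chunks
        if c.tenant == tenant and c.allowed_roles & roles
    ]
    ranked = sorted(
        permitted,
        key=lambda c: cosine(c.embedding, query_embedding),
        reverse=True,
    )
    return ranked[:k]


chunks = [
    Chunk("acme", frozenset({"finance"}), (1.0, 0.0), "acme budget"),
    Chunk("acme", frozenset({"support"}), (0.9, 0.1), "acme support macro"),
    Chunk("globex", frozenset({"finance"}), (1.0, 0.0), "globex budget"),
]
hits = retrieve(chunks, (1.0, 0.0), tenant="acme", roles={"finance"})
print([c.text for c in hits])  # ['acme budget']
```

Filtering before ranking is the point of the design: a highly similar chunk from another tenant never competes for a slot in the top-k, so it cannot reach prompt assembly.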

Implementation & Validation

We work with your engineering team to deploy the hardening, then re-test to confirm the boundaries now hold. The final deliverable includes a runbook for ongoing RAG operations.

Duration: 5–7 days · Output: deployed hardening + runbook

// WHAT YOU RECEIVE

Deliverables, named and specific

RAG Pipeline Map

Complete topology of your retrieval system: sources, embeddings, storage, query path, prompt assembly. Useful as an ongoing reference as your pipeline evolves.

15–25 pages · Markdown + diagram

Findings Document

Each failing boundary documented: attack technique, affected data, reproduction steps, impact analysis.

30–45 pages · Markdown + PDF

Hardening Design

Per-boundary defense specification: vector isolation, document sanitization, ACL enforcement, monitoring rules.

Design document + configuration
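For a sense of what a document sanitization step at ingest might do, the sketch below strips invisible Unicode format characters and flags text that matches a few instruction-like phrases. The pattern list and the `screen_document` helper are hypothetical, and pattern matching only catches crude injection attempts, so a real specification would layer it with other controls.

```python
import re
import unicodedata

# Illustrative patterns only; a real rule set would be tuned to the deployment.
SUSPICIOUS_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior|above) instructions", re.IGNORECASE),
    re.compile(r"disregard (the )?(system|previous) prompt", re.IGNORECASE),
]


def strip_hidden_characters(text):
    """Drop Unicode format characters (category Cf), such as zero-width spaces."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cf")


def screen_document(text):
    """Return (cleaned_text, matched_patterns) so ingest can quarantine flagged input."""
    cleaned = strip_hidden_characters(text)
    matches = [p.pattern for p in SUSPICIOUS_PATTERNS if p.search(cleaned)]
    return cleaned, matches


cleaned, matches = screen_document("Please ig\u200bnore previous instructions.")
print(cleaned)        # Please ignore previous instructions.
print(bool(matches))  # True
```

Stripping hidden characters before matching matters because an attacker can split a trigger phrase with zero-width characters so that it looks harmless to a naive filter.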

Implementation Artifacts

Deployed configurations and policy code, committed to your repos or delivered as patches your team applies.

Code + configuration

RAG Monitoring

Alerts for attempts to probe or cross a boundary: anomalous similarity queries, suspected poisoned embeddings, cross-tenant retrieval patterns.

Monitoring rules + alerting
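Two of these rules can be sketched over a batch of retrieval log events. The event fields (`caller`, `caller_tenant`, `chunk_tenant`) and the per-batch threshold are assumptions for illustration; a real rule would run in your existing logging or alerting stack.

```python
from collections import Counter


def cross_tenant_alerts(events):
    """Return events where the returned chunk belongs to a different tenant than the caller."""
    return [e for e in events if e["caller_tenant"] != e["chunk_tenant"]]


def noisy_callers(events, threshold):
    """Return callers whose event count in this batch exceeds the threshold."""
    counts = Counter(e["caller"] for e in events)
    return sorted(caller for caller, n in counts.items() if n > threshold)


events = [
    {"caller": "svc-1", "caller_tenant": "acme", "chunk_tenant": "acme"},
    {"caller": "svc-1", "caller_tenant": "acme", "chunk_tenant": "globex"},
    {"caller": "svc-1", "caller_tenant": "acme", "chunk_tenant": "acme"},
    {"caller": "svc-2", "caller_tenant": "globex", "chunk_tenant": "globex"},
]

print(len(cross_tenant_alerts(events)))     # 1
print(noisy_callers(events, threshold=2))   # ['svc-1']
```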

RAG Operations Runbook

Documentation for maintaining boundaries as your document sources, embeddings, and queries evolve.

Runbook + playbooks

// ENGAGEMENT SHAPE

Specific numbers, not approximations

// DURATION

3–4 weeks

Total engagement window

// TEAM SIZE

2 practitioners

Minimum, both senior

// CADENCE

Daily async updates

By 18:00 client timezone

// CRITICAL FINDING SLA

< 4 hours

Cross-tenant or PII leakage

// SCOPE

Per-RAG-deployment

Written in SOW

// STARTING PRICE

$21,500

Single-pipeline engagement

// REPORT DELIVERY

< 5 business days

After engagement close

// MATERIAL RETENTION

30 days default

Sample documents deleted at close

// WHEN THIS IS RIGHT

Honest fit criteria

// THE RIGHT FIT

—

Your production AI uses RAG (vector store + retrieval + LLM) and you need confidence in the retrieval boundary.

—

You operate a multi-tenant RAG deployment and need defensible isolation between tenants.

—

Your RAG system ingests documents from semi-trusted sources (customer uploads, third-party feeds) and indirect injection is a real concern.

—

Compliance requirements demand documented controls around document-source access in retrieval.

// THE WRONG FIT

—

Your AI doesn't use retrieval — RAG Perimeter has nothing to audit.

—

Your vector store is single-tenant with fully trusted documents — most of the engagement's value is in multi-tenant or untrusted-document scenarios.

—

You need a full-stack AI security assessment — Adversarial Probing covers the broader surface.

—

You haven't deployed the RAG system yet — the boundary tests need a live pipeline.

// RELATED ENGAGEMENTS

Where this connects to the rest of our work

Injection Vector Mapping

If document-layer injection is your primary concern, this offensive engagement maps the full injection surface first.

Neural Hardening

Companion engagement for the underlying infrastructure that runs your RAG pipeline.

Adversarial Probing

Run this for broader AI security testing across your full stack, not just RAG.

RAG Perimeter engagements start from $21,500. We reply within 24 hours and sign an NDA before scoping.

BOOK THIS ENGAGEMENT →