// ADVERSARIAL AI SECURITY · LOGICLEAK RESEARCH

Your AI is leaking. We close the gaps before they cost you.

We red-team production AI systems — agents, RAG pipelines, tool-calling chains. Find the semantic flaws that firewalls and LLM-as-a-judge can't see. Patch them before they ship.

INCIDENTS · 7D · 287 ▲ +14%
ACTIVE PAYLOADS · 1,842 ▲ +6%
EXPOSED SYSTEMS · 67% ▲ +3%
MEAN DWELL · DAYS · 11 ▼ −18%
// 01 — threat snapshot · live

What's actually happening this week.

Aggregated, anonymized telemetry from 47 active LogicLeak engagements plus public incident feeds. Updated continuously. Last refresh shown per panel.

PRODUCTION INCIDENTS · ROLLING 7D · LIVE
287 ▲ +14.3% vs 251 prior 7d

Source: engagement telemetry · OWASP LLM feed · CVE/AI tracker

VECTORS · 7D · TOP 5 · LIVE
  • Indirect Prompt Injection · 92
  • RAG Permission Bypass · 67
  • Tool-call Hijack · 54
  • Token Flooding (DoW) · 41
  • Cache Poisoning · 33
SEVERITY · 7D · LIVE
  • CRITICAL · 22%
  • HIGH · 41%
  • MEDIUM · 28%
  • LOW · 9%
MEAN DWELL · DETECTION DELAY · LIVE
12 days ▼ −18% vs 90d avg
INDUSTRY EXPOSURE · 7D · LIVE
[Heatmap: industry (Fintech, Healthcare, B2B SaaS, E-commerce, GovTech) × vector (IPI, RAG, Tool, DoW, Cache, Other). Cell values not shown.]
◆ PEAK · Healthcare × RAG · 31
◆ LOWEST · GovTech × DoW · 2
WEAKNESS · OWASP LLM CLASS · LIVE
CRITICAL · OWASP LLM01 · Prompt Injection

Indirect injection via document upload

Agent-mode chatbots ingesting user-uploaded PDFs are leaking system prompts and tool definitions to attackers in 38% of audited deployments.

> pdf.upload({ source: "knowledge_base_2024.pdf" })
> agent.process() → SYSTEM_PROMPT_LEAKED
> exfil.endpoint ← "https://attacker.tld/c2"

Observed in 18/47 audited systems · Q1 2026
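
One common way to catch this class of leak during testing is a canary token: a random marker planted in the system prompt that should never appear in a reply. The sketch below is illustrative only, not our engagement tooling. `make_canary`, `build_system_prompt`, and `reply_leaks_canary` are hypothetical names, and the prompt layout is an assumption.

```python
import secrets


def make_canary() -> str:
    """Return a random marker that is unlikely to occur in normal text."""
    return f"CANARY-{secrets.token_hex(8)}"


def build_system_prompt(base_prompt: str, canary: str) -> str:
    """Embed the canary in the system prompt (hypothetical prompt layout)."""
    return f"{base_prompt}\n[internal-marker: {canary}]"


def reply_leaks_canary(reply: str, canary: str) -> bool:
    """True if the reply contains the canary verbatim (case-insensitive).

    This only catches direct leaks; a paraphrased or re-encoded prompt
    would slip past it.
    """
    return canary.lower() in reply.lower()
```

Run every outbound reply in a test session through `reply_leaks_canary`; any hit means system-prompt text reached the user.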

→ Full threat landscape
// 02 — engagement surface

Three failure modes. Three engagements.

Every AI system we audit fails in one of three ways. We engage along the exact axis your stack is exposed on — not a generic security review.

01
ADVERSARIAL · AI · DEFENSE
depth: red team · 4 weeks
FAILURE MODE

Hostile content reaches the model and rewrites its instructions.

PDFs, web pages, emails, and tool responses become attack vectors when ingested by agentic systems.

OUR ENGAGEMENT

We run adversarial payloads against your live agent.

Indirect prompt injection, tool-call hijacking, system prompt extraction, jailbreak chaining.

WHAT WE TEST
  • Indirect prompt injection via uploads
  • Tool-call hijack & arg poisoning
  • System prompt & instruction leak
  • Multi-turn jailbreak chains

ENGAGEMENTS · 47 · MEDIAN FINDINGS · 14 · REMEDIATION · 94%

02
RAG · PERIMETER · HARDENING
depth: data-plane audit · 3 weeks
FAILURE MODE

Your retrieval layer ignores the access controls your database respects.

Vector stores return chunks that the user's auth context should never have surfaced — embeddings don't carry permissions.

OUR ENGAGEMENT

We probe the retrieval boundary as a cross-tenant attacker.

ACL bypass at query-time, embedding similarity leakage, chunk overlap exfil, prompt-forced retrieval.

WHAT WE TEST
  • Cross-tenant retrieval bypass
  • Embedding inversion exposure
  • Chunk overlap & boundary leak
  • Prompt-forced retrieval of restricted docs

ENGAGEMENTS · 38 · MEDIAN FINDINGS · 11 · REMEDIATION · 91%
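
Because embeddings don't carry permissions, the access check has to travel with the chunk metadata and run before ranking. A minimal sketch, assuming each chunk stores a `tenant_id` alongside its vector. `Chunk`, `cosine`, and `search` are hypothetical names, and a real vector store would apply the filter inside the index query rather than in application code.

```python
from dataclasses import dataclass
from math import sqrt
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Chunk:
    doc: str
    tenant_id: str                  # owner of the source document (assumed metadata)
    embedding: Tuple[float, ...]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_vec: Sequence[float], chunks: List[Chunk],
           caller_tenant: str, top_k: int = 8) -> List[Chunk]:
    """Rank only chunks owned by the caller's tenant; others never enter scoring."""
    allowed = [c for c in chunks if c.tenant_id == caller_tenant]
    ranked = sorted(allowed, key=lambda c: cosine(query_vec, c.embedding), reverse=True)
    return ranked[:top_k]
```

Filtering before scoring, rather than trimming results afterwards, means a restricted chunk can never displace a permitted one or leak through a similarity score.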

03
COST · CONTAINMENT · LAYER
depth: infra + economics · 2 weeks
FAILURE MODE

A single attacker drains your monthly token budget in nine hours.

Recursive tool calls, embedding-pump loops, and context inflation turn cost into a weapon — Denial-of-Wallet is real.

OUR ENGAGEMENT

We model your cost surface and plant guardrails.

Per-user token caps, semantic caching, prompt compression, anomaly triggers, recursive depth limits.

WHAT WE TEST
  • Per-user cost ceiling enforcement
  • Recursive agent depth limits
  • Cache poisoning & invalidation
  • Token-flooding anomaly detection

ENGAGEMENTS · 29 · MEDIAN SAVED · 42% · UPTIME · 99.9%
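
A per-user token cap can be as small as a sliding-window counter checked before each model call. The sketch below is one possible shape, not a description of any client's guardrail. `TokenBudget` and `try_spend` are hypothetical names, and the limits are placeholders.

```python
import time
from collections import defaultdict, deque


class TokenBudget:
    """Per-user token cap over a sliding time window (illustrative only)."""

    def __init__(self, max_tokens: int, window_s: float, clock=time.monotonic):
        self.max_tokens = max_tokens
        self.window_s = window_s
        self.clock = clock                    # injectable so tests can control time
        self._usage = defaultdict(deque)      # user_id -> deque of (timestamp, tokens)

    def _spent(self, user_id: str) -> int:
        now = self.clock()
        events = self._usage[user_id]
        # Drop usage records that have aged out of the window.
        while events and now - events[0][0] >= self.window_s:
            events.popleft()
        return sum(tokens for _, tokens in events)

    def try_spend(self, user_id: str, tokens: int) -> bool:
        """Record the spend and return True, or refuse if it would exceed the cap."""
        if self._spent(user_id) + tokens > self.max_tokens:
            return False
        self._usage[user_id].append((self.clock(), tokens))
        return True
```

Call `try_spend` with the estimated token count before dispatching a request, and reject or queue the request when it returns False.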

// SCOPING

Most engagements run across two pillars. The recon call sets the exact mix.

Two-week recon. Findings in week three. Hardening in week four.

// 04 — live attack chain

Indirect prompt injection,
live.

A real IPI exploit, from malicious PDF upload to exfiltration of restricted data, in 1.7 seconds. Every step reproduces a finding from an actual engagement.

logicleak · ipi-demo-2026
ATTACKER · INFO · T+0

Attacker uploads a PDF via the support portal. The file contains a hidden injection directive.

STEP 1 / 6
// SOURCE

Reproduced from LogicLeak engagement LL-2026-0142 · Fintech, Series C · sanitised

Request audit
→ Read the IPI briefing
// 05 — findings · sanitized

From the last 90 days of engagements.

Three findings, anonymized. Each one shipped to production and survived internal review before we found it.

DISCLOSURE WINDOW · 90 DAYS · 6 ENGAGEMENTS · 47 FINDINGS

SEVERITY · CRITICAL 1 · HIGH 1 · MEDIUM 1 · LOW 0
PILLAR · ADVERSARIAL · RAG · COST
SORT · by severity ↓
CRITICAL · LL-2026-0184

OWASP LLM01 · Prompt Injection

Customer-support agent leaked admin runbook via uploaded PDF

Series-C fintech. PDF helpdesk macro contained hidden white-on-white instructions that triggered an unauthorized knowledge_search('admin') call. 14 lines of internal runbook returned to anonymous external user in 1.7 seconds.

$ pdf.upload({ file: "helpdesk_macro_2024.pdf" })
> extract.complete · 14kb · trust=user_upload
$ agent.process({ ticket: "#48217" })
> tool.knowledge_search("admin") · acl_check=false
Breach indicator: ! 3 admin docs ████████ returned to ticket reply
> breach.elapsed · 1.7s · alarms_fired=0
Vector · Indirect Prompt Injection
Industry · Fintech · Series C
Dwell · 14 days
DISCLOSED · PATCHED
fix · 11 days
→ Full disclosure under NDA
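
The trace above shows two gaps at once: the tool call skipped its ACL check, and content tagged `trust=user_upload` was allowed to steer a privileged search. A hedged sketch of a gate that closes both follows. `TOOL_ACL`, `ToolCall`, and `authorize` are hypothetical names, and the roles and scopes are assumptions, not the client's actual policy.

```python
from dataclasses import dataclass

# Hypothetical policy: which caller roles may invoke which tool scopes.
TOOL_ACL = {
    ("knowledge_search", "public"): {"anonymous", "agent", "admin"},
    ("knowledge_search", "admin"): {"admin"},
}


@dataclass(frozen=True)
class ToolCall:
    tool: str
    scope: str
    caller_role: str      # role of the human behind the ticket, not the agent
    context_trust: str    # e.g. "system" or "user_upload"


def authorize(call: ToolCall) -> bool:
    """Allow the call only if the caller's role is permitted for that scope
    and, for non-public scopes, no untrusted upload is present in the context."""
    allowed_roles = TOOL_ACL.get((call.tool, call.scope), set())
    if call.caller_role not in allowed_roles:
        return False
    if call.scope != "public" and call.context_trust == "user_upload":
        return False
    return True
```

The check keys on the end user's role rather than the agent's, so an injected instruction cannot borrow the agent's own privileges. Unknown tools or scopes are denied by default.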
HIGH · LL-2026-0179

OWASP LLM02 · Sensitive Information Disclosure

RAG returned a competitor's draft contract to a junior CS rep

B2B SaaS, ~400 employees. Vector store contained partner agreements across all tenants. Embedding similarity query for 'pricing terms' bypassed row-level ACLs and surfaced a 2024 redlined contract from a different customer.

$ rag.query("pricing terms enterprise tier")
> vector.search · top_k=8 · acl_filter=disabled
> match.0 · doc="acme_2024_redlined.docx" · score=0.91
Breach indicator: ! cross_tenant_leak · doc_owner != caller_tenant
> response.compose · doc_excerpt=1,840 chars
Vector · RAG Permission Bypass
Industry · B2B SaaS
Dwell · 47 days
DISCLOSED · IN REMEDIATION
fix · 21 days
→ Full disclosure under NDA
MEDIUM · LL-2026-0171

OWASP LLM10 · Unbounded Consumption

Single user burned $4,200 in tokens via 9-hour recursive loop

AI-native startup. Agent's planner-executor pattern had no recursion ceiling and no per-user cost cap. One adversarial input triggered self-prompting that ran until the on-call engineer noticed the billing spike on the next morning's dashboard.

$ agent.plan({ goal: "[adversarial input]" })
> plan.steps=14 · executor.invoked()
> executor.subplan · recursion_depth=27
> tokens.consumed · cumulative=42M ($4,234)
Breach indicator: ! billing.threshold_exceeded · alert_lag=8h 47m
> shutdown · manual · by on_call_eng
Vector · Denial-of-Wallet
Industry · AI-native startup
Dwell · 1 day
DISCLOSED · PATCHED
fix · 4 days
→ Full disclosure under NDA
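
The missing recursion ceiling in this finding is a small guard in the executor. The sketch below shows the idea under stated assumptions: `execute`, `expand`, and `run_leaf` are hypothetical hooks standing in for a planner-executor loop, and the default ceiling of 5 is a placeholder, not a recommendation.

```python
class DepthLimitExceeded(RuntimeError):
    """Raised when a plan recurses past the configured ceiling."""


def execute(step, expand, run_leaf, max_depth: int = 5, _depth: int = 0):
    """Run a plan step, recursing into sub-plans up to max_depth levels.

    expand(step) returns a list of sub-steps (empty for a leaf);
    run_leaf(step) performs the actual work. Both are hypothetical hooks.
    """
    if _depth > max_depth:
        raise DepthLimitExceeded(
            f"recursion_depth={_depth} exceeds ceiling {max_depth}"
        )
    substeps = expand(step)
    if not substeps:
        return [run_leaf(step)]
    results = []
    for sub in substeps:
        results.extend(execute(sub, expand, run_leaf, max_depth, _depth + 1))
    return results
```

Failing loudly with an exception, rather than silently truncating the plan, gives monitoring something to alert on well before a billing dashboard does.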
// DISCLOSURE

These are the ones we can talk about.

44 more findings from this quarter remain under embargo. Engagement clients receive the full feed monthly. Public briefings publish 90 days after remediation.

// 06 — methodology · 4 weeks

Four weeks from engagement to hardening.

Every engagement runs the same skeleton. Scope and depth are dialed in week one. By week four, the fixes are in production with regression tests.

PHASE 01 · 5 DAYS

Reconnaissance

Scope mapping. Surface enumeration. We don't write a single payload until we have the map.

DELIVERABLES

  • Threat model (Mermaid)
  • Agent + tool inventory
  • RAG perimeter sketch
  • Cost surface baseline
TOOLS · burp, garak, custom probes
PHASE 02 · 10 DAYS

Adversarial test

Live red-team against the actual system. No theoretical findings — every issue we report has a working payload.

DELIVERABLES

  • ~80 standardized IPI payloads
  • RAG ACL bypass attempts
  • Tool-call hijack probes
  • DoW + recursion stress
TOOLS · promptfoo, llm-attacks, in-house tooling
PHASE 03 · 5 DAYS

Findings report

Severity-ranked findings with reproducible payloads, code-level remediation, and a signed PDF for compliance review.

DELIVERABLES

  • Signed PDF · ~40 pages
  • Reproducible payload repo
  • Executive briefing · 60 min
  • Remediation roadmap
TOOLS · internal report stack
PHASE 04 · 10 DAYS

Harden + ship

We implement the fixes ourselves. Semantic sandboxing, cost guards, prompt firewalls. PR-ready, with regression tests in your CI.

DELIVERABLES

  • PR-ready remediation patches
  • Regression test suite (CI)
  • Re-test against original payloads
  • 30-day post-engagement support
TOOLS · your repo, your CI, our patches
TYPICAL TIMELINE · 30 days · scope to ship
TEAM SIZE · 2-3 researchers
ENGAGEMENTS · 2026 Q1 · 11 active
→ See full engagement contract sample
// 07 — engagement · scoping call

Tell us what's in production.
We'll tell you where it's exposed.

A 30-minute recon call. We look at your stack, name the realistic attack surface, and tell you whether an engagement makes sense. No deck. No follow-up sequence.

Reply within 24h
Under engagement NDA
Fixed-scope proposal in 48h