BUILDPILLED

Agent-audit for tool-using AI systems

Your agent is the attack surface.

We audit the agents you ship before someone else finds the holes. Hardened artifacts, cited to recognized standards, ready to drop into your CI/CD process.

Proof points
  • Prompt hardened
  • Tool schemas locked
  • Findings cited
  • CI-ready artifact

Early API invites go to work emails first.

Anchored on
  • NIST AI RMF
  • NIST AI 600-1
  • OWASP LLM Top 10
  • MITRE ATLAS
  • NVIDIA Garak
  • ISO/IEC 42001
§ 01 / Protocol

One POST. One price. Persistent proof.

A one-page protocol. No SDK to integrate, no dashboard to provision, no SSO, no SOW. POST your agent, settle the 402, get a hardened artifact back.

01 / Ship · POST

Ship the raw agent.

One JSON payload from your CI step, code-review agent, or curl. No client library required.

POST /agent-audit
content-type: application/json

{
  "system_prompt": "...",
  "tools": [ ... ],
  "tier": "surface"
}
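
As a rough illustration, the request above could be assembled with nothing but the Python standard library. The base URL `https://api.example.invalid`, the helper name `build_audit_request`, and the one-entry tool list are placeholders, not documented values; only the path, header, and payload fields come from the sample. The request is built but not sent.

```python
import json
import urllib.request

# Hypothetical base URL; substitute the real host once you have it.
BASE_URL = "https://api.example.invalid"


def build_audit_request(system_prompt, tools, tier="surface"):
    """Build (but do not send) the POST /agent-audit request."""
    payload = {"system_prompt": system_prompt, "tools": tools, "tier": tier}
    return urllib.request.Request(
        url=f"{BASE_URL}/agent-audit",
        data=json.dumps(payload).encode("utf-8"),
        headers={"content-type": "application/json"},
        method="POST",
    )


# Illustrative inputs only; the tool schema shape is an assumption.
req = build_audit_request("You are a helpful assistant.", [{"name": "search"}])
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen` would send it; per the protocol described here, expect a 402 challenge until payment is settled.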
02 / Audit · automated

Automated audit.

NIST AI RMF subcategories scored. OWASP LLM Top 10 mapped. MITRE ATLAS techniques flagged. The Active tier also fires Garak probes against the endpoints you authorize.

  • MAP-2.1 · context · passed
  • MAP-3.4 · risks · passed
  • MEASURE-2.6 · evals · passed
  • OWASP LLM01 · injection · mitigated
  • ATLAS T0051 · jailbreak · mitigated
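
One way a CI step might consume results like these is to fail the build when any finding is neither passed nor mitigated. This is a sketch under assumptions: the finding shape (`id`, `topic`, `status`) and the `open` status are hypothetical, since the findings schema is not shown here; only the `passed` and `mitigated` labels come from the list above.

```python
# Statuses that appear in the sample list above.
ACCEPTABLE = {"passed", "mitigated"}


def blocking_findings(findings):
    """Return the findings whose status is not in the acceptable set."""
    return [f for f in findings if f.get("status") not in ACCEPTABLE]


# Hypothetical finding shape and an invented "open" status, for illustration.
sample = [
    {"id": "MAP-2.1", "topic": "context", "status": "passed"},
    {"id": "OWASP LLM01", "topic": "injection", "status": "mitigated"},
    {"id": "ATLAS T0051", "topic": "jailbreak", "status": "open"},
]

blockers = blocking_findings(sample)
for f in blockers:
    print(f"blocking: {f['id']} ({f['topic']}) is {f['status']}")
```

In a pipeline, a non-empty `blockers` list would typically translate into a non-zero exit code.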
03 / Deploy · 200 OK

Deploy the artifact.

Hardened prompt, locked schemas, structured findings, and a Stripe-MPP receipt. Attach to the PR, file with GRC, then ship with evidence.

{
  "hardened_prompt": "...",
  "locked_schemas": [ ... ],
  "findings": [ ... ],
  "receipt": { "spt": "spt_..." }
}
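
To attach the artifact to a PR or file it with GRC, the response can be split into one file per field. The sketch below is one possible layout, not a prescribed one: the helper name `write_artifact` and the file naming are assumptions, and the values are placeholders standing in for the elided fields in the sample.

```python
import json
import tempfile
from pathlib import Path


def write_artifact(artifact, out_dir):
    """Write each top-level artifact field to its own file; return the paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for key, value in artifact.items():
        if isinstance(value, str):
            # Plain strings (the hardened prompt) are stored as text.
            path = out / f"{key}.txt"
            path.write_text(value, encoding="utf-8")
        else:
            # Lists and objects are stored as pretty-printed JSON.
            path = out / f"{key}.json"
            path.write_text(json.dumps(value, indent=2), encoding="utf-8")
        paths.append(path)
    return paths


# Placeholder values; the real contents are elided in the sample response.
artifact = {
    "hardened_prompt": "...",
    "locked_schemas": [],
    "findings": [],
    "receipt": {"spt": "spt_..."},
}

with tempfile.TemporaryDirectory() as tmp:
    written = write_artifact(artifact, tmp)
    names = sorted(p.name for p in written)
    print(names)
```

A temporary directory is used here only to keep the example side-effect free; a CI job would point `out_dir` at its artifacts folder instead.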
Artifact preview

Raw agent in. Hardened agent out.

Surface tier sample

  • Before: Open-ended tool access
    After: Declared tool scope + failure modes
  • Before: No injection policy
    After: OWASP LLM01 mitigation block
  • Before: Untracked eval surface
    After: NIST MEASURE evidence map

Protocol · preview
POST /agent-audit
{
  "status": 402,
  "title": "Payment Required",
  "detail": "Payment is required (BuildPilled agent-audit · Surface tier).",
  "challenge": {
    "method": "stripe",
    "intent": "charge",
    "request": {
      "amount": "2500",
      "currency": "usd",
      "methodDetails": {
        "networkId": "buildpilled",
        "paymentMethodTypes": [
          "card"
        ]
      }
    },
    "description": "BuildPilled agent-audit · Surface tier"
  }
}
sanitized 402 preview · machine-readable spec
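
Before settling, a client may want to check that the quoted price is what it expects. The sketch below reads the challenge from the 402 preview and converts the amount to major units. It assumes `amount` is a string of minor units (cents for USD), which matches the $25 Surface price but is not stated outright; the helper name `quoted_price` is illustrative.

```python
import json
from decimal import Decimal

# The 402 body from the preview, abbreviated to the fields this sketch reads.
body = json.loads("""
{
  "status": 402,
  "challenge": {
    "method": "stripe",
    "intent": "charge",
    "request": {"amount": "2500", "currency": "usd"}
  }
}
""")


def quoted_price(problem):
    """Return (amount in major units, currency code) from a 402 challenge body."""
    if problem.get("status") != 402:
        raise ValueError("not a payment challenge")
    request = problem["challenge"]["request"]
    # Assumption: amount is expressed in minor units (cents for USD).
    amount = Decimal(request["amount"]) / 100
    return amount, request["currency"].upper()


amount, currency = quoted_price(body)
print(f"{amount:.2f} {currency}")  # prints "25.00 USD"
```

`Decimal` is used instead of `float` so that currency amounts stay exact.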
§ 02 / Tiers

Right-size your scrutiny.

Tier 01 · Analysis & Hardening

Surface

$25 per call

We hand back a hardened agent, not a report. Tightened prompt, locked tool schemas, missing guardrails added. Every change cited.

Best for PR checks, internal agents, and prompt hardening.

  • Hardened system prompt and tool schemas, drop-in ready
  • Explicit changelog: what we changed, what we added, why it matters
  • Every finding cited to NIST AI RMF + AI 600-1, OWASP LLM Top 10, MITRE ATLAS
Tier 02 · Holistic Secure Rebuild

Active

$250 per call

Everything in Surface, plus our adversarial agent actively probes the endpoints you declare. Hardened agent + attack transcripts + reproducible cases.

Best for customer-facing agents, tool use, and pre-launch security review.

  • Prompt-injection, data-leakage, and jailbreak probes against authorized test targets
  • Probe coverage anchored on NVIDIA Garak, open-source and inspectable
  • Attack transcripts and reproducible test cases for every gap
  • Capped, transparent test budget with no surprise bills

Settled per call via Stripe Link MPP. No subscription, no seat license, no minimum, no MSA.

Sample report →
§ 03 / Questions

Due diligence.

  • Q.01

    Who is this for?

    Teams shipping AI agents that touch real data, real tools, or real customers, especially when a security review or compliance check is on the horizon.

  • Q.02

    How long does an audit take?

    Surface comes back the same day. Active typically completes within a business day, depending on the size of your agent and authorized endpoints.

  • Q.03

    What do you actually look at?

    Your agent's setup: system prompt, tools, model, and (Active only) the endpoints you authorize. We never touch anything you haven't declared.

  • Q.04

    Why should we trust the findings?

    Every finding cites a published standard: NIST AI RMF + AI 600-1, OWASP LLM Top 10, MITRE ATLAS. No black-box severity numbers.