peyeeye.ai/Docs/Introduction

The peyeeye.ai API

Redact PII on the way into your LLM prompts and rehydrate it on the way out. One round-trip, deterministic tokens, zero data retention by default.

First call in 90 seconds. If you've got a terminal and an API key, skip to Quickstart. Everything else is reference material.
Note: Building an LLM agent or tool-using pipeline? See the agents guide for tool schemas, Claude/OpenAI/Gemini snippets, LangChain/LlamaIndex/CrewAI wrappers, streaming, and stateless multi-turn sessions.

How it fits into your stack #

peyeeye.ai is not an LLM provider. It's a thin, stateless shield that sits between your application and whatever model you're using — Claude, GPT, Gemini, a fine-tune, your own checkpoint. Two HTTP endpoints do the whole dance:

  1. POST /v1/redact — your raw prompt in, tokenized text out.
  2. You prompt the LLM with the tokenized text.
  3. POST /v1/rehydrate — the model's reply in, original values back in.

Everything peyeeye.ai does is synchronous, idempotent, and observable. There is no queue, no background worker, no magic. If your LLM call times out, ours already finished.

Guarantees #

  • Zero retention by default. Redacted text and source values are held in memory for the session's TTL (default 15m) and then discarded. Set session: "stateless" to skip server-side storage entirely.
  • Deterministic tokens within a session. Ada Lovelace is always [PERSON_1] inside one session, and never leaks across sessions.
  • At-rest encryption (AES-256-GCM) and TLS 1.3 in transit.
  • Per-org isolation. Custom detectors, policies, and API keys are scoped to your organization — cross-tenant leakage is impossible by construction.
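The determinism guarantee is easy to picture with a toy in-memory model (illustrative only; the real service uses ML detectors and secure storage, not a plain dict):

```python
class ToySession:
    """Illustrative token map: the same value always yields the same
    token within one session, and numbering restarts per session."""

    def __init__(self):
        self.tokens = {}  # real value -> token
        self.counts = {}  # entity type -> highest index assigned

    def tokenize(self, value, entity_type):
        if value not in self.tokens:
            n = self.counts.get(entity_type, 0) + 1
            self.counts[entity_type] = n
            self.tokens[value] = f"[{entity_type}_{n}]"
        return self.tokens[value]

s1 = ToySession()
assert s1.tokenize("Ada Lovelace", "PERSON") == "[PERSON_1]"
assert s1.tokenize("Ada Lovelace", "PERSON") == "[PERSON_1]"  # stable within a session

s2 = ToySession()
assert s2.tokenize("Grace Hopper", "PERSON") == "[PERSON_1]"  # fresh numbering, no leakage
```

Because numbering is per session, a token captured from one session is meaningless in any other.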

Quickstart #

One redact + one rehydrate, from zero to working code. Grab your API key from the dashboard first.

1 · Install

# Node / TypeScript
npm install peyeeye

2 · Set your key

export PEYEEYE_KEY="pk_live_51H…"

3 · Round-trip a prompt

The SDK wraps redact+rehydrate into a single shield() helper. This is the recommended pattern — you can still call the raw endpoints if you need to.

import { Peyeeye } from "peyeeye";
import Anthropic from "@anthropic-ai/sdk";

const peyeeye = new Peyeeye({ apiKey: process.env.PEYEEYE_KEY });
const claude  = new Anthropic();

const shield = await peyeeye.shield();
const safe   = await shield.redact("Hi, I'm Ada, ada@a-e.com");

const reply = await claude.messages.create({
  model: "claude-sonnet-*",
  max_tokens: 256,
  messages: [{ role: "user", content: safe }]
});

console.log(await shield.rehydrate(reply.content[0].text));
// "Hi Ada, thanks — we've emailed ada@a-e.com."
Note: The returned session handle is opaque — it's how rehydrate matches tokens back to real values. Pass it verbatim. Don't persist it longer than the redacted text lives.

Authentication #

All requests use bearer-token auth. Keys are prefixed pk_live_, scoped to one organization, and don't expire — rotate them yourself in the dashboard.

Authorization: Bearer pk_live_51H…
Content-Type:    application/json
Idempotency-Key: req_a1b2c3d4   # optional, recommended
Warning: Never ship a key to the browser. Call peyeeye from your backend and proxy the redacted text forward. Anything you put in a browser bundle leaks — peyeeye keys included.

Idempotency #

Pass an Idempotency-Key header to safely retry. We cache the full response keyed on the tuple (api_key, idempotency_key). Mismatched bodies raise idempotency_conflict.
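The caching contract can be sketched in a few lines (a local model of the documented semantics, not the server implementation):

```python
class IdempotencyCache:
    """Responses are cached on (api_key, idempotency_key); replaying the
    same key with the same body returns the cached response, while the
    same key with a different body raises idempotency_conflict."""

    def __init__(self):
        self.cache = {}  # (api_key, idem_key) -> (body, response)

    def handle(self, api_key, idem_key, body, compute):
        key = (api_key, idem_key)
        if key in self.cache:
            cached_body, cached_response = self.cache[key]
            if cached_body != body:
                raise ValueError("idempotency_conflict")
            return cached_response  # replay: no recompute
        response = compute(body)
        self.cache[key] = (body, response)
        return response
```

A retried request with the same key is therefore safe even if the first attempt's response was lost in transit.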

POST /v1/redact #

Detect PII in a block of text, replace each span with a deterministic token, and return a session handle you can later rehydrate.

POST https://api.peyeeye.ai/v1/redact · since v1.0

Body parameters

text*
string
Input to scan. UTF-8, up to 128K characters. Arrays accepted — each element is redacted in the same session.
locale
string
BCP-47 language tag. Biases detectors toward locale-specific formats (e.g. fr-FR SIRET, en-GB NHS number). Default: "auto".
policy
string | object
Name of a saved policy, or an inline policy object. Controls which entities are redacted, allow-lists, and severity. Default: "default".
session
string | "stateless"
Optional existing session ID to extend. Pass "stateless" to skip server-side storage — the response will include a rehydration_key blob you must present to /rehydrate.
entities
string[]
Restrict detection to these entity IDs. Omit to use the policy's default set.
placeholder
string
Token template. "[{TYPE}_{N}]" (default), "<{TYPE}>", or a custom format with {TYPE}, {N}, {HASH} variables.
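The template variables are plain string substitution. A local sketch (the {HASH} digest below is an assumption; the docs don't specify the real hashing scheme):

```python
import hashlib

def render_placeholder(template, entity_type, n, value):
    """Fill the documented {TYPE}, {N}, {HASH} variables.
    Assumption: {HASH} is a short stable digest of the real value;
    the actual hashing scheme is not specified in these docs."""
    digest = hashlib.sha256(value.encode()).hexdigest()[:8]
    return (template.replace("{TYPE}", entity_type)
                    .replace("{N}", str(n))
                    .replace("{HASH}", digest))

assert render_placeholder("[{TYPE}_{N}]", "EMAIL", 1, "ada@a-e.com") == "[EMAIL_1]"
assert render_placeholder("<{TYPE}>", "PERSON", 2, "Ada Lovelace") == "<PERSON>"
```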

Example

POST /v1/redact
Authorization: Bearer pk_live_…
Content-Type: application/json

{
  "text": "Hi, I'm Ada Lovelace.\nEmail: ada@analytic-engines.com\nCard: 4242 4242 4242 4242",
  "locale": "en-US",
  "policy": "default"
}

{
  "redacted": "Hi, I'm [PERSON_1].\nEmail: [EMAIL_1]\nCard: [CARD_1]",
  "session": "ses_7fA2kLw9MxPq",
  "entities": [
    { "token": "[PERSON_1]", "type": "PERSON",
      "span": [8, 20], "confidence": 0.98 },
    { "token": "[EMAIL_1]",  "type": "EMAIL",
      "span": [29, 53], "confidence": 1.00 },
    { "token": "[CARD_1]",   "type": "CARD",
      "span": [60, 79], "confidence": 0.99 }
  ],
  "latency_ms": 38,
  "expires_at": "2026-05-01T14:27:03Z"
}

POST /v1/rehydrate #

Substitute tokens in a string with the original values held in a session. Unknown tokens pass through verbatim — we don't fail the call if the LLM made one up.

POST https://api.peyeeye.ai/v1/rehydrate · since v1.0

Body parameters

text*
string
Text containing tokens to swap back.
session*
string
Session ID returned by /redact, or rehydration_key blob if you used stateless mode.
strict
boolean
When true, any unknown tokens raise unknown_token instead of passing through. Useful for catching model hallucinations. Default: false.

Response

{
  "text": "Hi Ada, thanks — we've emailed ada@analytic-engines.com.",
  "replaced": 2,
  "unknown": [],
  "latency_ms": 11
}
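The pass-through and strict behaviours can be modeled locally (a toy sketch of the documented contract, not the real endpoint):

```python
import re

def rehydrate(text, mapping, strict=False):
    """Toy rehydrate: known tokens are swapped for their real values,
    unknown [TYPE_N]-shaped tokens pass through verbatim unless
    strict=True, in which case they raise unknown_token."""
    unknown = []

    def swap(match):
        token = match.group(0)
        if token in mapping:
            return mapping[token]
        unknown.append(token)  # likely an LLM hallucination
        return token

    out = re.sub(r"\[[A-Z_]+_\d+\]", swap, text)
    if strict and unknown:
        raise ValueError(f"unknown_token: {unknown}")
    return out, unknown

out, unknown = rehydrate("Hi [PERSON_1], ref [CASE_9]", {"[PERSON_1]": "Ada"})
assert out == "Hi Ada, ref [CASE_9]"   # unknown token passed through
assert unknown == ["[CASE_9]"]
```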

More endpoints #

Everything else the dashboard uses is available over the same bearer-token API.

GET /v1/sessions/:id
stateful only
Inspect a session — locale, policy, chars processed, entity count, expires_at, and whether it's already expired.
DELETE /v1/sessions/:id
stateful only
Drop the mapping immediately instead of waiting for the TTL.
GET /v1/entities
read
List built-in detectors plus your org's custom detectors. Built-ins come with id, category, sample, locales; customs add kind, pattern, enabled.
POST /v1/entities
plan-gated
Create or upsert a custom detector. Body: id, kind: "regex" | "fewshot", pattern, examples, confidence_floor. Plan-gated: Free allows 1, Build 3, Pro 10, Scale unlimited. Over-cap returns 403 forbidden.
PATCH /v1/entities/:id
plan-gated
Update pattern, toggle enabled, or tune confidence_floor without a full replace.
DELETE /v1/entities/:id
plan-gated
Retire a custom detector.
POST /v1/entities/test
dry-run
Compile a regex and run it against a sample string. Returns matches and spans without creating a detector — safe to call repeatedly while iterating on a pattern.
GET /v1/entities/templates
read
Starter detector templates (Twilio SIDs, Stripe keys, AWS access keys, GitHub PATs, JWTs, Slack tokens, a generic customer-id shape). Copy the pattern into a POST /v1/entities call to adopt one.

Errors & retries #

All errors return a JSON body with code, message, and request_id. Transient errors (429, 5xx) are safe to retry with exponential backoff — the SDKs do this for you.

400 invalid_request
terminal
Missing required field, unknown entity ID, malformed JSON. Don't retry — fix and re-send.
400 unknown_token
terminal
Rehydrate in strict: true mode hit a token that wasn't in the session. Often means the LLM hallucinated a placeholder.
401 unauthorized
terminal
Missing, malformed, or revoked API key.
403 forbidden
terminal
Your plan doesn't include the requested capability (streaming, custom detectors, detector cap exceeded). Upgrade to proceed.
404 session_not_found
terminal
Session expired or never existed. Re-run /redact.
409 idempotency_conflict
terminal
Same idempotency key, different body. Use a fresh key.
413 payload_too_large
terminal
Input exceeds 128K characters. Split the text and redact each chunk into the same session.
429 rate_limited
retry
Burst capacity exhausted. Honor the Retry-After header. SDKs back off automatically.
5xx internal_error
retry
Transient server fault. Retry with exponential backoff — SDKs do this for you.
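A retry schedule that matches this guidance, sketched as pure logic (jitter omitted for determinism; add it in production):

```python
def backoff_delays(attempts, base=0.5, cap=8.0, retry_after=None):
    """Delays (seconds) before each retry: honor Retry-After on the
    first retry when the server sent one, otherwise exponential
    backoff capped at `cap`. Base and cap values are illustrative."""
    delays = []
    for n in range(attempts):
        if n == 0 and retry_after is not None:
            delays.append(retry_after)  # server told us exactly how long
        else:
            delays.append(min(cap, base * 2 ** n))
    return delays

assert backoff_delays(4) == [0.5, 1.0, 2.0, 4.0]
assert backoff_delays(2, retry_after=0.42) == [0.42, 1.0]
```

This is what the SDKs' `maxRetries` / `max_retries` options do for you on 429 and 5xx.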

Rate limits #

Per-key limits, measured in requests per second, with a burst bucket of roughly 2× the sustained rate (exact per-plan numbers below). Response headers report your remaining budget:

X-RateLimit-Limit:      500
X-RateLimit-Remaining:  487
Retry-After:            0.42   # seconds, only on 429
  • Free — 2 rps sustained, 5 rps burst
  • Build — 200 rps sustained, 400 rps burst
  • Pro — 1000 rps sustained, 2000 rps burst
  • Scale — 3000 rps sustained, 6000 rps burst
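The limiter behaves like a standard token bucket. A sketch under the documented assumption of a 2× burst capacity:

```python
class TokenBucket:
    """Token-bucket sketch of the documented limiter: refills at the
    sustained `rps`, with burst capacity of 2x sustained."""

    def __init__(self, rps):
        self.rps = rps
        self.capacity = 2 * rps          # burst bucket
        self.tokens = float(self.capacity)
        self.last = 0.0

    def allow(self, now):
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rps)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False                     # caller should honor Retry-After

bucket = TokenBucket(200)                         # Build plan: 200 rps sustained
assert all(bucket.allow(0.0) for _ in range(400)) # full 400-request burst
assert not bucket.allow(0.0)                      # bucket drained -> 429
assert bucket.allow(1.0)                          # refilled a second later
```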

Sessions & tokens #

A session is the bridge that lets peyeeye.ai swap tokens back to real values later. Two modes:

Stateful (default)

We hold the mapping for 15m after the last touch, then discard it. Simple, low-latency, but requires server-side storage on our end — if that's a non-starter for you, use stateless mode instead. DELETE /v1/sessions/:id to drop the mapping early.
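A toy model of the TTL behaviour (illustrative; times are plain seconds, not wall-clock, and the real store encrypts at rest):

```python
class SessionStore:
    """Toy stateful mode: a mapping survives TTL seconds past its last
    touch, then lookups fail with session_not_found."""
    TTL = 15 * 60  # default 15m

    def __init__(self):
        self.sessions = {}  # session id -> (mapping, last_touch)

    def put(self, sid, mapping, now):
        self.sessions[sid] = (mapping, now)

    def get(self, sid, now):
        entry = self.sessions.get(sid)
        if entry is None or now - entry[1] > self.TTL:
            self.sessions.pop(sid, None)      # expired: discard the mapping
            raise KeyError("session_not_found")
        self.sessions[sid] = (entry[0], now)  # every touch extends the TTL
        return entry[0]

    def delete(self, sid):
        self.sessions.pop(sid, None)          # DELETE /v1/sessions/:id
```

Note that reads extend the TTL, so an actively used session never expires mid-conversation.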

Stateless

Pass session: "stateless". The response includes an opaque rehydration_key (prefixed skey_) — an AES-256-GCM-sealed blob of the token→value mapping. Store it yourself. Send it back to /rehydrate as the session value. We never persist anything.
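The flow can be sketched with the standard library. Note the stand-in: HMAC-signed base64 only authenticates, whereas the real skey_ blob is AES-256-GCM sealed and actually encrypts the mapping:

```python
import base64, hashlib, hmac, json

SECRET = b"server-side sealing key"  # stands in for the real AES-256-GCM key

def seal(mapping):
    """Illustrative seal: pack the token->value mapping into a blob the
    client stores and later presents to /rehydrate."""
    payload = json.dumps(mapping, sort_keys=True).encode()
    tag = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return "skey_" + base64.urlsafe_b64encode(payload).decode() + "." + tag

def unseal(blob):
    """Reject anything the server didn't seal, then recover the mapping."""
    body, tag = blob.removeprefix("skey_").split(".")
    payload = base64.urlsafe_b64decode(body)
    expected = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("tampered rehydration_key")
    return json.loads(payload)

key = seal({"[PERSON_1]": "Ada Lovelace"})
assert key.startswith("skey_")
assert unseal(key) == {"[PERSON_1]": "Ada Lovelace"}
```

The server only ever needs the sealing key, never the mapping itself, which is why nothing is persisted in this mode.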

Entity catalog #

62 built-in entity types (regex + checksum validated, supplemented by ML NER). A representative subset is shown below; every ID is usable in entities: [...] or as a policy rule.

ID              Category       Sample                            Locales
PERSON          Identity       Ada Lovelace · Dr. Maya Chen      all
EMAIL           Contact        ada@example.com                   all
PHONE           Contact        +1 (415) 555-0134                 all
ADDRESS         Location       221B Baker St, London NW1 6XE     all
POSTAL_CODE     Location       94610 · SW1A 1AA · 100-0001       all
GEO_COORDS      Location       37.7749, -122.4194                all
SSN             Government ID  432-11-8890                       en-US
NIN             Government ID  AB 12 34 56 C                     en-GB
SIN             Government ID  046 454 286                       en-CA, fr-CA
PASSPORT        Government ID  M12345678                         all
DRIVER_LICENSE  Government ID  B1234567 · Y8765432               all
CARD            Financial      4242 4242 4242 4242               all
IBAN            Financial      GB82 WEST 1234 5698 7654 32       eu
ROUTING         Financial      121000358                         en-US
AMOUNT          Financial      $4,820.00 · €1.250,50             all
MRN             Health         MRN 00912774                      all
DIAGNOSIS       Health         ICD-10 · SNOMED                   en
DOB             Temporal       1984-03-12 · March 12, 1984       all
IP              Network        192.0.2.84 · 2001:db8::1          all
SECRET          Credential     sk_live_51HabcdEF… · gh_pat_abc…  all
ORG             Entity         Acme Analytics Inc.               all

Custom detectors #

Define your own detector with a regex, or drop in a handful of example strings and let peyeeye induce the pattern (LLM-backed when enabled, heuristic fallback otherwise):

{
  "id": "ORDER_ID",
  "kind": "regex",
  "pattern": "#A-\\d{6,}",
  "examples": ["#A-884217", "#A-007431"],
  "confidence_floor": 0.9
}

If pattern is omitted, peyeeye induces one from examples at create time. Test-drive patterns against sample text before you save them with POST /v1/entities/test.

Streaming #

When you're piping an LLM's token stream back to a user, naive rehydration breaks on mid-token boundaries. The streaming API buffers partial tokens until they complete, then emits cleanly. Build plan and higher.

POST https://api.peyeeye.ai/v1/redact/stream · since v1.0

Post a list of chunks; get back Server-Sent Events in three flavours — session fires once with the new session id, redacted fires per chunk, done closes the stream.

# POST /v1/redact/stream  body: { "chunks": ["Hi, I'm Ada", " — card 4242 4242 4242 4242"] }
event: session
data: {"session":"ses_7fA2kLw9MxPq"}

event: redacted
data: {"text":"Hi, I'm [PERSON_1]","entities":1}

event: redacted
data: {"text":" — card [CARD_1]","entities":1}

event: done
data: {"chars":37}

Both SDKs wrap this with partial-token buffering so you can interleave upstream LLM chunks with rehydration safely. Open a shield once, redact the user prompt, then pipe each streamed LLM chunk through rehydrateChunk:

import { Peyeeye } from "peyeeye";
import Anthropic from "@anthropic-ai/sdk";

const peyeeye = new Peyeeye({ apiKey: process.env.PEYEEYE_KEY! });
const claude  = new Anthropic();

const shield = await peyeeye.shield();
const safe   = await shield.redact(userInput);

const upstream = await claude.messages.stream({
  model: "claude-sonnet-*",
  messages: [{ role: "user", content: safe }],
});

for await (const chunk of upstream) {
  if (chunk.type !== "content_block_delta") continue;
  const out = await shield.rehydrateChunk(chunk.delta.text);  // partial-token safe
  process.stdout.write(out);
}
process.stdout.write(await shield.flush());  // emit any buffered remainder
The same flow in Python:

from peyeeye import Peyeeye
from anthropic import Anthropic
import os, sys

peyeeye = Peyeeye(api_key=os.environ["PEYEEYE_KEY"])
claude  = Anthropic()

with peyeeye.shield() as shield:
    safe = shield.redact(user_input)

    with claude.messages.stream(
        model="claude-sonnet-*",
        max_tokens=512,
        messages=[{"role": "user", "content": safe}],
    ) as upstream:
        for text in upstream.text_stream:
            sys.stdout.write(shield.rehydrate_chunk(text))  # partial-token safe
            sys.stdout.flush()

    sys.stdout.write(shield.flush())  # emit any buffered remainder

If you want the raw SSE — for example from a runtime without the SDK on it — post directly to /v1/redact/stream and consume the stream of session / redacted / done events:

import { Peyeeye } from "peyeeye";

const peyeeye = new Peyeeye({ apiKey: process.env.PEYEEYE_KEY! });

let sessionId: string | undefined;

for await (const ev of peyeeye.redactStream({
  chunks: ["Hi, I'm Ada", " — card 4242 4242 4242 4242"],
})) {
  if (ev.event === "session")  sessionId = ev.data.session;
  if (ev.event === "redacted") process.stdout.write(ev.data.text);
}
And in Python:

from peyeeye import Peyeeye

peyeeye = Peyeeye(api_key="pk_live_...")

for ev in peyeeye.redact_stream([
    "Hi, I'm Ada",
    " — card 4242 4242 4242 4242",
]):
    if ev.event == "session":
        session_id = ev.data["session"]
    elif ev.event == "redacted":
        print(ev.data["text"])
Warning: Never flush during a streaming response — only after upstream closes. Flushing mid-stream can emit a partial token to the user.

SDKs #

First-party libraries, open-source under MIT. Full parity with the HTTP API — redact, rehydrate, streaming with partial-token buffering, stateless sealed sessions, custom detectors, session management. Current stable release: v1.0.0.

ts

TypeScript / Node

peyeeye · v1.0.0

Node 18+, Bun, Deno, Cloudflare Workers, Vercel Edge. Zero runtime dependencies — uses the platform fetch. Dual ESM + CJS build with typed .d.ts / .d.cts.

py

Python

peyeeye · v1.0.0

Python 3.9+. Single runtime dependency (httpx). Fully type-hinted with py.typed. Shield context manager handles session lifecycle automatically.

TypeScript / Node

Install:

# npm, pnpm, yarn, or bun — pick your poison
npm install peyeeye
pnpm add peyeeye
bun add peyeeye

Quickstart — end-to-end redact → LLM → rehydrate:

import { Peyeeye } from "peyeeye";
import Anthropic from "@anthropic-ai/sdk";

const peyeeye = new Peyeeye({ apiKey: process.env.PEYEEYE_KEY! });
const claude  = new Anthropic();

const shield = await peyeeye.shield();
const safe   = await shield.redact("Hi, I'm Ada, ada@a-e.com");

const reply = await claude.messages.create({
  model: "claude-sonnet-*",
  max_tokens: 256,
  messages: [{ role: "user", content: safe }],
});

console.log(await shield.rehydrate(reply.content[0].text));
// "Hi Ada, thanks — we've emailed ada@a-e.com."

shield() opens a session on the first redact() call, keeps reusing it across subsequent calls, and swaps tokens back on rehydrate(). The same real value always yields the same token within a shield; tokens never leak across shields.

Client configuration:

new Peyeeye({
  apiKey: "pk_live_…",
  baseUrl: "https://api.peyeeye.ai",   // optional
  maxRetries: 3,                        // 429 + 5xx back off exponentially
  timeoutMs: 30_000,                    // per-request timeout
  defaultHeaders: { "X-App": "my-app" },
  fetch: globalThis.fetch,              // override on Cloudflare Workers
});

Low-level calls (when you don't want the shield helper):

const r = await peyeeye.redact("Card: 4242 4242 4242 4242");
// r.redacted  → "Card: [CARD_1]"
// r.session   → "ses_…"
// r.entities  → [{ token: "[CARD_1]", type: "CARD", span: [6, 25], confidence: 0.99 }]

const back = await peyeeye.rehydrate("Confirmation for [CARD_1].", r.session);
// back.text → "Confirmation for 4242 4242 4242 4242."

Full surface: README — shield, stateless sealed mode, SSE streaming, custom detectors, session management, retry / rate-limit headers, typed errors.

Python

Install:

# pip, poetry, pdm, uv — works with any installer
pip install peyeeye
poetry add peyeeye
uv pip install peyeeye

Quickstart — end-to-end redact → LLM → rehydrate:

import os
from peyeeye import Peyeeye
from anthropic import Anthropic

peyeeye = Peyeeye(api_key=os.environ["PEYEEYE_KEY"])
claude  = Anthropic()

with peyeeye.shield() as shield:
    safe  = shield.redact("Hi, I'm Ada, ada@a-e.com")
    reply = claude.messages.create(
        model="claude-sonnet-*",
        max_tokens=256,
        messages=[{"role": "user", "content": safe}],
    )
    print(shield.rehydrate(reply.content[0].text))

Inside the with block the shield pins a single session: the same real value always maps to the same token, and the session is cleaned up on exit (stateful mode).

Client configuration:

from peyeeye import Peyeeye

peyeeye = Peyeeye(
    api_key="pk_live_...",
    base_url="https://api.peyeeye.ai",   # optional
    timeout=30.0,                         # per-request timeout (seconds)
    max_retries=3,                        # 429 + 5xx back off exponentially
    default_headers={"X-App": "my-app"},
)

Low-level calls (skip the shield helper):

r = peyeeye.redact("Card: 4242 4242 4242 4242")
# r.redacted  → "Card: [CARD_1]"
# r.session   → "ses_…"
# r.entities  → [DetectedEntity(token="[CARD_1]", type="CARD", span=(6, 25), confidence=0.99)]

back = peyeeye.rehydrate("Confirmation for [CARD_1].", session=r.session)
# back.text → "Confirmation for 4242 4242 4242 4242."

Stateless sealed mode — server never persists the mapping; the sealed skey_… blob carries everything the rehydrate step needs:

with peyeeye.shield(stateless=True) as shield:
    safe = shield.redact("Ada, 4242 4242 4242 4242")
    # shield.rehydration_key → "skey_AES-GCM-sealed..."
    # Shipped to a client, used later, no server-side state.
    print(shield.rehydrate("Hi [PERSON_1], your [CARD_1] is active."))

Typed errors from the API:

from peyeeye import PeyeeyeError

try:
    peyeeye.redact(text)
except PeyeeyeError as e:
    # e.status, e.code, e.message, e.request_id
    if e.code == "rate_limited":
        retry(e.retry_after)
    elif e.code == "forbidden":
        upgrade_plan()
    else:
        raise

Full surface: README — shield, stateless sealed mode, SSE streaming via redact_stream(), custom detectors, session management, retry / rate-limit headers, typed errors.

Note: Both SDKs follow semver. Major versions track the HTTP API major version. Older majors are supported for 18 months after a new major ships.