§00|RFC · SATUS-001

seed data that looks like a real business, not a faker dump.


satus reads your live Postgres schema and writes rows that respect every foreign key, constraint, and business rule you didn’t write down. Built for the demo, the screenshot, and the QA run, not for load testing.

$ npm i -g @passkeybridge/satus
example output · satus generate --profile medical-booking
$ satus generate --profile medical-booking
  introspecting schema           14 tables · 38 FKs
  planning insert order          topological
  generating · clinics              12 rows
  generating · providers            48 rows
  generating · patients            420 rows
  generating · appointments      1,840 rows
  validating invariants          ok
  inserting (transaction)        ok
 2,320 rows · $0.04 · 7.2s
§01|Problem statement

faker writes strings. your customers read businesses.


Seed data is the silent embarrassment of every product demo. Patients with negative ages. Orders that don’t sum to their line items. “John Doe, Lorem Ipsum Corp.” in the screenshot the founder is about to post.

The fix isn’t a better random-name library. The fix is data that treats your schema as a system: a subscription marked canceled needs a canceled_at later than its created_at, a clinic in Vermont doesn’t employ 4,000 cardiologists, and an order’s total equals the sum of its line items.
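
As an illustration of what "business rules you didn’t write down" can look like in code, here is a minimal sketch of two such checks. The row shapes, field names (total_cents, line_items), and function names are hypothetical, and the checks are written as plain TypeScript rather than zod so the snippet stays self-contained; it is not satus's actual validator.

```typescript
// Hypothetical row shapes, for illustration only.
interface Subscription {
  status: "active" | "canceled";
  created_at: Date;
  canceled_at: Date | null;
}

interface Order {
  total_cents: number;
  line_items: { quantity: number; unit_price_cents: number }[];
}

// Each check returns a list of violations; an empty list means the row passes.
function checkSubscription(s: Subscription): string[] {
  if (s.status !== "canceled") return [];
  if (s.canceled_at === null) {
    return ["canceled subscription is missing canceled_at"];
  }
  if (s.canceled_at.getTime() <= s.created_at.getTime()) {
    return ["canceled_at must be after created_at"];
  }
  return [];
}

function checkOrder(o: Order): string[] {
  // Integer cents avoid floating-point drift when summing money.
  const sum = o.line_items.reduce(
    (acc, item) => acc + item.quantity * item.unit_price_cents,
    0,
  );
  return sum === o.total_cents
    ? []
    : [`total ${o.total_cents} does not equal line-item sum ${sum}`];
}
```

A generator that runs checks like these before inserting can reject or regenerate a row instead of letting the contradiction surface in a screenshot.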

faker / factory_bot              satus
random strings per column        + rows that reference real parents
ignores foreign keys             + topological insert from pg_catalog
John Doe, Acme, Lorem Ipsum      + Maren Holloway, Northwind, Burlington VT
constraints fail at runtime      + zod validation before any INSERT
one shape per table              + tone & distribution from a profile
§02|How it works

five quiet steps. no magic. no daemons.


The CLI runs on your machine and talks to your database. There is no hosted runtime, no telemetry of your row data, and no surprise infrastructure.

  1. introspect. Read tables, columns, types, foreign keys, unique constraints, checks, and enums directly from pg_catalog. No annotations. No ORM plugins.
  2. plan. Build a dependency DAG from your foreign keys and topologically sort the insert order. Parents before children, always.
  3. generate. Per table, send schema, parent-row samples, and the active profile to the LLM. Receive rows as structured JSON via tool-calling, never free-text.
  4. validate. A zod schema generated from the table catches type, length, enum, unique, and invariant violations before they ever reach the database.
  5. insert. Wrap the entire run in a single Postgres transaction. Parameterized inserts in topological order. Any failure rolls back the whole run; your database is never left half-seeded.
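
The plan step can be sketched with Kahn's algorithm, a standard way to topologically sort a DAG. This is a minimal illustration, not satus's implementation: the ForeignKey shape and the planInsertOrder name are hypothetical, and for simplicity the sketch rejects every cycle (including self-references) with an E_FK_CYCLE error rather than attempting nullable back-patching.

```typescript
// A foreign-key edge: `child` has a column that references `parent`.
interface ForeignKey {
  child: string;
  parent: string;
}

// Returns table names ordered so every parent precedes its children.
function planInsertOrder(tables: string[], fks: ForeignKey[]): string[] {
  const pending = new Map<string, number>(); // parents not yet emitted, per table
  const children = new Map<string, string[]>();
  for (const t of tables) {
    pending.set(t, 0);
    children.set(t, []);
  }

  for (const { child, parent } of fks) {
    if (!pending.has(child) || !pending.has(parent)) {
      throw new Error(`unknown table in foreign key ${child} -> ${parent}`);
    }
    pending.set(child, pending.get(child)! + 1);
    children.get(parent)!.push(child);
  }

  // Start with tables that reference nothing.
  const ready = tables.filter((t) => pending.get(t) === 0);
  const order: string[] = [];
  while (ready.length > 0) {
    const table = ready.shift()!;
    order.push(table);
    for (const c of children.get(table)!) {
      const left = pending.get(c)! - 1;
      pending.set(c, left);
      if (left === 0) ready.push(c);
    }
  }

  // Any table never emitted sits on (or behind) a cycle.
  if (order.length !== tables.length) {
    const stuck = tables.filter((t) => !order.includes(t));
    throw new Error(`E_FK_CYCLE: ${stuck.join(", ")}`);
  }
  return order;
}

// Example with the tables from the medical-booking output above.
const insertOrder = planInsertOrder(
  ["appointments", "patients", "providers", "clinics"],
  [
    { child: "providers", parent: "clinics" },
    { child: "appointments", parent: "patients" },
    { child: "appointments", parent: "providers" },
  ],
);
console.log(insertOrder);
```

Several orders can be valid for the same schema; the only property that matters is that clinics precede providers and both patients and providers precede appointments.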
§03|Guarantees

four contracts the cli enforces at run time.


Marketing copy is cheap. These are the four invariants the binary itself refuses to violate. If any is broken in a release, it is a P0 bug.

  • G-01 · Foreign-key integrity

    Every generated row references parent keys that exist in the same run. Cycles are detected up front and either broken with nullable back-patching (insert the row with a NULL foreign key, then update it once the referenced row exists) or rejected loudly with E_FK_CYCLE.

  • G-02 · Atomic insertion

    All inserts for a single generate run execute inside one Postgres transaction. A failure on row 4,811 of 4,812 rolls back the entire run; your database is never left half-seeded.

  • G-03 · Cost ceiling

    The CLI prints an estimated token cost during the planning phase and refuses to proceed past --max-cost (default $1.00) without explicit confirmation. No silent overruns on your provider bill.

  • G-04 · Row-data locality

    Your row values are sent only to the LLM provider you configure with your own API key. We have no hosted runtime, no proxy, and no telemetry that includes generated content.
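
To make the G-03 cost ceiling concrete, the gate can be pictured as a pure function over the planning estimate. This is a sketch under stated assumptions: the TablePlan shape, the per-row token estimate, the pricing input, and both function names are hypothetical; only the $1.00 default and the confirm-to-proceed behavior come from the guarantee itself.

```typescript
// Hypothetical planning output: how many rows each table needs and a rough
// token estimate per row.
interface TablePlan {
  table: string;
  rows: number;
  estTokensPerRow: number;
}

// Hypothetical pricing input: USD per 1,000 tokens for the configured provider.
function estimateCostUsd(plan: TablePlan[], usdPer1kTokens: number): number {
  const tokens = plan.reduce((acc, p) => acc + p.rows * p.estTokensPerRow, 0);
  return (tokens / 1000) * usdPer1kTokens;
}

type Decision = "proceed" | "needs-confirmation";

// Mirrors the documented behavior: runs at or under the ceiling proceed;
// anything above it requires explicit confirmation from the user.
function checkCostCeiling(
  estimateUsd: number,
  maxCostUsd: number = 1.0, // default taken from the --max-cost description
  confirmed: boolean = false,
): Decision {
  if (estimateUsd <= maxCostUsd || confirmed) return "proceed";
  return "needs-confirmation";
}
```

Keeping the decision separate from the prompt logic makes the ceiling easy to test: the function never touches the network or the database, so an overrun can only happen after the caller has seen "needs-confirmation" and passed confirmed explicitly.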

§04|Anti-features

what satus deliberately will not do.


Every line item below is a feature request we’ve already decided to decline. Stating them up front saves you an issue and us a wontfix.

not for                                  use instead
Production data anonymization            Use pgAnonymizer or Tonic.ai. We generate fresh data; we don't redact yours.
Load-testing volume (10M+ rows)          Use pgbench or a faker pipeline. LLM calls cost too much at that scale.
A graphical schema editor                Your migrations are the source of truth. We read pg_catalog, we don't replace it.
ML model training datasets               Use real, licensed data. Synthetic rows are a demo aid, not a training corpus.
Cross-database support (MySQL, Mongo)    Postgres-only on purpose. We use pg_catalog features that don't translate.
§05|Sample output

three rows, three tables, one consistent story.


Below: the same patient referenced as a foreign key on an appointment, scheduled with a provider whose specialty and working hours both check out. No detail contradicts another.

table: patients
  id                   8e2c0a13-…
  full_name            Marisol Aguirre-Velez
  dob                  1984-07-19
  state                CO
  insurance_plan_id    → plans.id (Anthem BCBS, CO)

table: providers
  id                   1f9d2b77-…
  full_name            Dr. Khalil Okonkwo, MD
  specialty            Family Medicine
  clinic_id            → clinics.id (Westside Family Health)
  working_hours        Mon–Thu 08:00–17:00 MT

table: appointments
  id                   a4c11e8f-…
  patient_id           → patients.id (Marisol Aguirre-Velez)
  provider_id          → providers.id (Dr. Khalil Okonkwo)
  starts_at            2026-06-04 14:30 America/Denver
  reason               annual wellness visit

see all three reference profiles →
§06|Continue reading

the rest of the specification.