Skaldborn's art content pipeline
There is no AI judging the AI. No vision-language model rates the output. No aesthetic-classifier auto-promotes the best of a batch. Every image the Skaldborn art pipeline ships passes my review before it reaches the manifest, and the pipeline is built so only my review can promote it.
The pipeline generates pixel art with two AI backends — a local Stable Diffusion stack and PixelLab’s hosted pixel-art API — under a governance discipline borrowed from the simulation side of the engine. This post is about why we built it that way, what’s underneath, and where the receipts are.
Want to skip straight to the technical bits? The companion post, “Set up ComfyUI for your own content pipeline,” walks the install end to end on a 4 GB consumer GPU, plus the full PixelLab API integration. This post is the architectural why; that one’s the operational how.
The thirty-second version
A recipe is a JSON file that describes what to make. It declares a small set of levers (constrained to enums) and a larger set of locked fields (everything else — checkpoint, LoRA strengths, seeds, canvas size, anchor offsets). Calling the CLI with a recipe and lever values queues one job per backend into a Postgres-backed saga. The saga drives each job through a state machine: generate, stage to a content-addressed path, wait for human approval at two gates, copy the approved file into a frozen “approved” slot, and emit a manifest entry.
Every randomness source in the pipeline lives upstream of my approval. By the time an asset is in the manifest, it’s a frozen file pinned to a content-addressed path. The simulation consumes the frozen manifest. It never sees the generation pipeline.
That is the spine. The rest of this post is what’s underneath.
The stack
Everything in the pipeline is open source or publicly purchasable, so a reader who wants to rebuild a piece of it can:
- .NET 9.0: worker host, CLI, saga state machine, all pipeline C#
- PostgreSQL: durable art_jobs queue (claim ordering via SELECT FOR UPDATE SKIP LOCKED)
- Npgsql: Postgres driver
- ComfyUI: local Stable Diffusion orchestrator; runs on a host GPU and is reachable from inside Docker via host.docker.internal:8188
- PixelLab: hosted pixel-art API; eight-direction character rotations, isometric tiles, map objects
- Docker + Compose: the worker runs as a long-lived compose service against a hermetic multi-stage Dockerfile
- OpenTelemetry (OTLP): metrics out of the worker; a Datadog Agent receives them when the opt-in env flag is set
- System.CommandLine: CLI argument parsing
- Bash: validate-recipe-coverage.sh, promote-assets.sh, process-character-animations.sh
- free-tex-packer: sprite-atlas packing (queued, not yet wired)
On the local Stable Diffusion side, the production stack is concrete:
- Checkpoint: aZovyaRPGArtistTools v4VAE, an SD 1.5 fine-tune by Zovya for illustrative RPG concept art (game art, tabletop, book covers). VAE baked in.
- Style LoRA: NorseViking_v10 by Nontime, a Norse warrior / berserker style, applied at strength_model: 0.7 / strength_clip: 0.5.
- Upscaler: 4xFoolhardyRemacri by FoolhardyVEVO, a 4× ESRGAN, run after FaceDetailer to bring concept output up to 2048×3072.
- Negative embedding: negative_hand-neg by Nerfgun3; corrects bad hand anatomy without dragging the style with it.
- Face detector: face_yolov8n.pt, Bingsu’s nano-scale YOLOv8 fine-tune (mAP50 0.660 across multiple face datasets), used inside FaceDetailer for the bounding box.
- Segmenter: sam_vit_b, Meta AI’s smallest Segment Anything ViT (the sam_vit_b_01ec64.pth checkpoint), used by FaceDetailer to mask the detected face for the inpainting pass.
All public, all linkable. Reproducing the stack is one shell script and one recipe; the companion post, “Set up ComfyUI for your own content pipeline,” walks through it end to end.
The hero artifact: the recipe
The single most useful file to understand the pipeline is a recipe. Here is one, with in-world strings genericized:
{
"recipe_id": "<age>.<kind>.<slug>.v<N>",
"kind": "<kind>.<sub-kind>",
"lever_schema": {
"type": "object",
"additionalProperties": false,
"required": ["<lever_a>", "<lever_b>"],
"properties": {
"<lever_a>": { "enum": ["val1", "val2", "val3"] },
"<lever_b>": { "enum": ["val1", "val2"] }
}
},
"locked_fields": {
"checkpoint": "<base-model>.safetensors",
"loras": [
{ "name": "<style-lora>.safetensors",
"strength_model": 0.7,
"strength_clip": 0.5 }
],
"cfg_scale": 7.5,
"sampler": "dpmpp_2m",
"scheduler": "karras",
"steps": 30,
"canvas_width": 512,
"canvas_height": 768,
"seed_production": 31337,
"composition_mode": "standalone",
"anchor_offsets": {
"feet": { "n": [0, 0], "s": [0, 0],
"e": [0, 0], "w": [0, 0] },
"hand_main": { "n": [16, -24], "s": [-16, -24],
"e": [20, -20], "w": [-20, -20] }
}
},
"prompt_templates_by_backend": {
"pixellab": {
"endpoint": "/v2/...",
"description": "..., {lever_a} ..., {lever_b} ..."
},
"comfyui": {
"workflow_ref": "content/recipes/workflows/<id>.workflow.json",
"positive_prompt_template":
"..., {lever_a} ..., {lever_b} ..."
}
},
"valid_output_kinds": ["<kind>.<sub-kind>"],
"default_palette_enforcement": true
}
Three things to notice.
Levers are the only caller-adjustable parameters, and they are constrained to enums. A recipe with three hair colors and two moods has a total output space of exactly six combinations. The CLI rejects any value that isn’t in the enum. This makes the total surface enumerable — you can pre-compute every possible output path before generating a single image — and it makes the fan-out hash deterministic.
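To make the "enumerable surface" claim concrete, here is a minimal sketch in Python (the pipeline itself is C#). The lever names and values are hypothetical stand-ins for a recipe's lever_schema, and the sketch assumes every declared lever is required, as in the example recipe above.

```python
from itertools import product

# Hypothetical lever enums, standing in for lever_schema.properties.
LEVER_ENUMS = {
    "hair": ["red", "blond", "black"],
    "mood": ["stoic", "fierce"],
}


def validate_binding(binding, enums):
    """Raise ValueError on unknown levers, missing levers, or off-enum values."""
    unknown = sorted(set(binding) - set(enums))
    missing = sorted(set(enums) - set(binding))
    if unknown or missing:
        raise ValueError(f"unknown levers {unknown}, missing levers {missing}")
    for name, value in binding.items():
        if value not in enums[name]:
            raise ValueError(f"{name}={value!r} is not in {enums[name]}")


def enumerate_bindings(enums):
    """Every lever binding the recipe can ever produce, in a stable order."""
    names = sorted(enums)
    return [dict(zip(names, combo)) for combo in product(*(enums[n] for n in names))]


ALL_BINDINGS = enumerate_bindings(LEVER_ENUMS)
print(len(ALL_BINDINGS))  # 3 hair values x 2 moods = 6
```

Because the full list exists before anything is generated, "have we produced every variant?" becomes a set comparison rather than a judgment call.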
Everything else is locked. Checkpoint, LoRA strengths, sampler, scheduler, steps, seed, canvas size, anchor offsets. The recipe author commits to a generation profile at recipe-authoring time and the profile doesn’t shift run-to-run. If you want to change a locked field, you create recipe.v2.json next to recipe.v1.json. In-place mutation of locked fields is forbidden by convention; the validator catches drift; the v1 directory tree stays addressable for audit.
prompt_templates_by_backend is the multi-backend fan-out. A character recipe declares both a ComfyUI workflow (for concept art, generated by a local Stable Diffusion stack) and a hosted pixel-art API template (for the eight-direction sprite rotations the game actually renders). At queue time, the CLI reads the map keys and inserts one row per backend into the saga’s job table. The saga worker picks each row up and routes it to the matching runner via reflection-based registry discovery. A closed enum here would mean editing the engine to add a new backend; the registry means a new backend is a new class with an attribute.
This last property — adding a content type without editing engine code — is the rule that triggered the rebuild covered in our previous post. Recipe-as-type-object is the same pattern, applied at the content layer.
The receipt: when we added the second backend kind to the pipeline, it landed as a new JSON recipe and zero edits to the runner, the saga, the registry, the queue, or the dependency-injection wiring. Recipe-as-type-object passed the same extensibility test that closed-enum dispatch had failed.
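The registry idea can be sketched in a few lines. This is a Python analogue, not the engine's code: a class decorator stands in for the C# attribute plus reflection scan, and the runner classes, the queue_rows helper, and the row shape are all hypothetical.

```python
RUNNERS = {}  # backend key -> runner class


def backend(name):
    """Class decorator standing in for the attribute: registers a runner by key."""
    def register(cls):
        if name in RUNNERS:
            raise ValueError(f"duplicate backend {name!r}")
        RUNNERS[name] = cls
        return cls
    return register


@backend("comfyui")
class ComfyUiRunner:
    def run(self, job):
        return f"comfyui handled {job['recipe_id']}"


@backend("pixellab")
class PixelLabRunner:
    def run(self, job):
        return f"pixellab handled {job['recipe_id']}"


def queue_rows(recipe, hash16):
    """One job row per key in prompt_templates_by_backend."""
    return [
        {"recipe_id": recipe["recipe_id"], "backend": key, "hash16": hash16}
        for key in recipe["prompt_templates_by_backend"]
    ]


def dispatch(job):
    """Route a row to its runner by lookup, with no switch over a closed enum."""
    return RUNNERS[job["backend"]]().run(job)
```

Adding a third backend in this sketch means adding one decorated class; queue_rows and dispatch stay untouched, which is the property the closed enum could not offer.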
The pipeline, top to bottom
Here is what happens to an asset, end to end.
graph TD
Recipe["Recipe (JSON)"]
CLI["CLI: art generate"]
Queue[("art_jobs (Postgres)")]
Concept["Concept generation<br/>(local Stable Diffusion)"]
ConceptGate{"Concept gate<br/>(human)"}
Rotation["Rotation generation<br/>(hosted pixel-art API)"]
Stage["Art-pool staging<br/>(content-addressed path)"]
ReviewGate{"Final review gate<br/>(human)"}
Promotion["Promotion<br/>(copy to approved/)"]
Projection["Manifest entry projection"]
Manifest[("Age manifest")]
Recipe --> CLI
CLI -->|"one row per backend"| Queue
Queue --> Concept
Queue --> Rotation
Concept --> ConceptGate
ConceptGate -- "approved" --> Rotation
Rotation --> Stage
Stage --> ReviewGate
ReviewGate -- "approved" --> Promotion
Promotion --> Projection
Projection --> Manifest
Walking the stages:
1. Recipe authoring. A human writes a JSON file under content/recipes/. A bash validator (validate-recipe-coverage.sh) runs on every push and rejects the push if a required field is missing, a lever isn’t an enum, or a component contract declares a required_recipe_id that doesn’t have a matching file. Human time: minutes.
2. Job submission. The CLI command art generate <recipe-id> --<lever> value validates the lever values against the recipe’s schema, computes a content-addressed hash of the lever binding, and inserts one art_jobs row per declared backend. --dry-run short-circuits the database write. Wall time: under a second.
3a. Concept generation (characters). The saga worker claims a pending row, substitutes lever values into the ComfyUI workflow template, posts it to the local ComfyUI daemon, polls until completion, and writes the resulting PNG to a content-addressed path. Wall time on a 4 GB consumer GPU: 30–90 seconds per concept (512×768 base + a face-detection inpainting pass + a 4× ESRGAN upscale).
3b. Tile / object generation (terrain, environment). Tiles and props skip the concept gate entirely. The saga worker posts directly to the hosted pixel-art API with the prompt template, polls the background-job endpoint, and writes the result. Wall time: 10–140 seconds depending on endpoint (tiles ~35 s, icons ~15 s, environment objects ~90 s).
4. Concept gate (characters only). I run art review-concepts <brief-id>. The terminal displays each concept image; I approve or reject. Approved concepts advance to rotation generation. Rejected concepts go to a terminal Rejected state and never reach the paid API. This is the most cost-load-bearing review step in the pipeline.
5. Rotation generation (characters). For each approved concept, the saga calls the hosted pixel-art API’s “create character with eight directions” endpoint, supplying the concept image as both reference and seed. Wall time: 60–120 seconds. Output is a ZIP containing eight 1×N rotation strips plus skeleton data.
6. Art-pool staging. The runner copies the bytes to art_pool/candidates/<recipe_id>/<hash16>/<backend>/000.png. The <hash16> segment is the content-addressed hash of the lever binding; multiple frames live as 000.png, 001.png, etc.
7. Final review gate. art review <job-id> shows me the staged candidate. I approve or reject.
8. Promotion. art promote <recipe-id> <hash16> --backend <name> --frame 0 copies the approved candidate to an approved/ subdirectory via copy-then-atomic-rename. Once promoted, the file is frozen on disk. Re-runs of upstream generation can land new candidates next to it; they cannot overwrite it.
9. Manifest entry projection. RecipeManifestProjector.Project() produces a ManifestEntry record carrying composition_mode, anchor_offsets, variant_sprites, and asset_path (pointing at the frozen file). The projector ships today; the actual write into content/ages/<age>/v<N>/manifest.json is the next thing queued for this surface area.
The Age manifest already carries entries for characters and tiles whose v2 generation hasn’t run yet — those entries hold placeholder asset paths. When promotion completes for a recipe, the projection updates the matching entry in place rather than appending a new one. The manifest is an authored spine; the pipeline fills in cells.
The sprite-atlas packing step lives in package.json as a stub today (pack:sprites is a placeholder). The plan is to wire free-tex-packer into the manifest-write step so promotion → atlas → client load is one motion. It isn’t shipped yet, and saying so out loud is part of the discipline of these posts.
I am the art director
The repo has assets/art-director.log and assets/art-directed/. The naming was deliberate, and it does not mean what it looks like.
There is no LLM judging the art. There is no vision model deciding which concept advances. The “art director” in this pipeline is a governance structure — two human-review gates that sit between every generation step and the shipped manifest, enforced by the saga’s state machine. I’m the human in question.
The assets/art-directed/ directory is a content-addressed cache. Each subdirectory is named by a truncated SHA-256 hash of the inputs that produced it (dc61df083f4a49c5/...) and contains stage-numbered output files: stage-1-concept.png from the local stack, stage-3-rotations.zip from the hosted API. The content addressing is what makes the pipeline survive process death: when the worker comes back up, it checks “is the output already at this path?” before re-issuing a vendor call. Two runs with identical inputs collapse into one set of files.
The assets/art-director.log is the append-only execution transcript from the v1 CLI pipeline that preceded the saga rebuild — 787 lines of timestamped per-asset outcomes, costs, and diagnostic detail from spring 2026. It exists because in the old pipeline, my eye was the only place log information landed; in the v2 saga, every state transition is in Postgres and the log is supplementary.
I exercise the directorial role at exactly two stops: the concept gate and the final review gate. Both are CLI commands that transition art_jobs rows. Rejected jobs go to a terminal state without burning further vendor credits. The recipe’s levers (hair=red, mood=stoic) produce deterministic fan-outs that I evaluate as a batch — generate ten concept images by sweeping a lever’s enum, look at them, keep two, throw away eight. I’m the art director. The pipeline is my tool.
This isn’t an ideological choice. The cost of a wrong vision-model judgment in this domain is a generation that doesn’t look like the world; the cost of a right vision-model judgment is asset throughput. Until I have a judge calibrated on my taste — not a public CLIP-style scorer trained on internet aesthetics — the throughput win isn’t worth the calibration risk. My eye is cheaper, faster, and accurate by construction in a way no off-the-shelf scorer can be for a generational life-sim’s pixel art. So I stay in the loop.
Bounding the random
The pipeline uses generative models, which are probabilistic by construction. The simulation side of Skaldborn is built on a determinism guarantee. These two things have to coexist or the architecture is a lie.
They coexist through four mechanisms.
Seed pinning. Every recipe’s locked_fields carry a production seed (31337 in the canonical example). The ComfyUI workflow template hardcodes the seed in the KSampler node. Per-concept variation comes from a child_index field on each art_jobs row: the effective seed is recipe_seed + child_index, so a ten-concept batch produces seeds 31337 through 31346. Each seed is reproducible given the same checkpoint, LoRA, prompt, and parameters.
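The seed arithmetic is small enough to show whole. A sketch in Python; the function name is hypothetical, the formula is the one stated above.

```python
def effective_seed(recipe_seed, child_index):
    """Seed for one child job: the pinned recipe seed plus the child's index."""
    if child_index < 0:
        raise ValueError("child_index must be non-negative")
    return recipe_seed + child_index


# A ten-concept batch against the canonical production seed.
seeds = [effective_seed(31337, i) for i in range(10)]
print(seeds[0], seeds[-1])  # 31337 31346
```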
Content-addressed paths. The function LeverBindingHash.Compute() produces a SHA-256 of the lever binding map (keys sorted ordinal-ascending, compact JSON, no Unicode escaping). The first 16 hex characters become the directory name: art_pool/candidates/<recipe_id>/<hash16>/<backend>/000.png. Identical inputs produce the same hash; re-running with the same levers lands in the same directory. The stager writes there, the resolver reads from there, the worker checks for existence before issuing a vendor call.
Input: {"hair":"red","mood":"stoic"}
SHA-256: a1b2c3d4e5f67890... (truncated to 16 hex chars)
Path: art_pool/candidates/<recipe>/a1b2c3d4e5f67890/<backend>/000.png
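A rough Python equivalent of the hashing step, for readers who want to try it. This is a sketch of the described canonical form (sorted keys, compact separators, no Unicode escaping), not a port of LeverBindingHash.Compute(): Python sorts keys by code point, which could differ from .NET ordinal ordering for characters outside the Basic Multilingual Plane, so byte-for-byte agreement with the C# output is an assumption, not a guarantee.

```python
import hashlib
import json


def lever_binding_hash16(binding):
    """First 16 hex chars of SHA-256 over a canonical JSON form of the binding."""
    canonical = json.dumps(
        binding,
        sort_keys=True,          # key order must not affect the hash
        separators=(",", ":"),   # compact JSON, no whitespace
        ensure_ascii=False,      # no \uXXXX escaping
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


def candidate_path(recipe_id, binding, backend, frame=0):
    """Staging path for one frame, following the layout described above."""
    hash16 = lever_binding_hash16(binding)
    return f"art_pool/candidates/{recipe_id}/{hash16}/{backend}/{frame:03d}.png"
```

Sorting the keys is what makes {"hair": "red", "mood": "stoic"} and {"mood": "stoic", "hair": "red"} land in the same directory.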
Recipe versioning. If a checkpoint changes (new model, updated LoRA), the recipe is versioned: <recipe>.v1 becomes <recipe>.v2. In-place mutation of locked_fields is forbidden by convention and called out in the architecture decision that governs the pipeline. Old recipes remain addressable; old art_jobs rows still reference them; the directory tree separates outputs of different generation profiles cleanly.
Frozen-output promotion. The promotion step copies a candidate to approved/ via copy-then-atomic-rename. Once promoted, the file is immutable from the pipeline’s perspective. Upstream regeneration cannot overwrite it; new generations land in a new candidate slot and require an explicit second promotion from me to replace the approved file. The manifest entry produced by the projector carries the asset_path to the promoted file. The simulation consumes the frozen manifest. It never sees the candidate pool.
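Copy-then-atomic-rename looks roughly like this in Python. The promote function and its replace flag are hypothetical; the flag models the "explicit second promotion" and is not a documented option of the real CLI. The sketch relies on os.replace being atomic when source and destination sit on the same filesystem.

```python
import os
import shutil
import tempfile
from pathlib import Path


def promote(candidate, approved, replace=False):
    """Copy a candidate into the approved slot without ever exposing a partial file."""
    if approved.exists() and not replace:
        # Frozen: upstream regeneration may not overwrite an approved asset.
        raise FileExistsError(f"{approved} is already promoted")
    approved.parent.mkdir(parents=True, exist_ok=True)
    # Stage the copy next to the destination so the rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=approved.parent, suffix=".tmp")
    os.close(fd)
    try:
        shutil.copyfile(candidate, tmp_name)
        os.replace(tmp_name, approved)  # atomic swap into place
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return approved
```

A reader of approved/ sees either the old file or the new one, never a half-copied PNG. The existence check and the rename are not one atomic step, so a real implementation with concurrent promoters would need a lock or a database guard around them.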
The honest gap: determinism at the hosted pixel-art API is not contractually guaranteed. The vendor’s seed-determinism story across the eight-rotation endpoint is a “probably yes” rather than a “documented yes.” The pipeline treats the hosted output as “potentially different on each call” and relies on the promotion gate as the determinism boundary. Once a sprite is in approved/, it doesn’t matter whether the same sprite would come back the same way next time.
The saga
The v2 pipeline is a nine-state saga backed by a Postgres art_jobs table. Each row tracks its job through:
stateDiagram-v2
[*] --> Pending
Pending --> ConceptGenerating
ConceptGenerating --> AwaitingConceptApproval
AwaitingConceptApproval --> RotationGenerating: approve
AwaitingConceptApproval --> Rejected: reject
RotationGenerating --> Persisting
Persisting --> AwaitingReview
AwaitingReview --> Approved: approve
AwaitingReview --> Rejected: reject
ConceptGenerating --> Failed: error
RotationGenerating --> Failed: error
Persisting --> Failed: error
Approved --> [*]
Rejected --> [*]
Failed --> [*]
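The same diagram as a transition table, sketched in Python. It mirrors only the edges drawn above (the tile and object path that skips the concept gate is not shown there, so it is not modeled here), and the transition helper is hypothetical.

```python
# Allowed next states for each of the nine states in the diagram.
ALLOWED = {
    "Pending": {"ConceptGenerating"},
    "ConceptGenerating": {"AwaitingConceptApproval", "Failed"},
    "AwaitingConceptApproval": {"RotationGenerating", "Rejected"},
    "RotationGenerating": {"Persisting", "Failed"},
    "Persisting": {"AwaitingReview", "Failed"},
    "AwaitingReview": {"Approved", "Rejected"},
    "Approved": set(),   # terminal
    "Rejected": set(),   # terminal
    "Failed": set(),     # terminal
}


def transition(current, target):
    """Return the new state, or raise if the diagram has no such edge."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current} -> {target}")
    return target


state = "Pending"
for nxt in ("ConceptGenerating", "AwaitingConceptApproval", "Rejected"):
    state = transition(state, nxt)
# state is now "Rejected": the job ended before reaching the paid rotation step.
```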
Two design choices carry the durability story.
Claim ordering uses SELECT FOR UPDATE SKIP LOCKED. Each worker claims exactly one job at a time, ordered by (state, priority DESC, created_at). Multiple workers can run concurrently without colliding. A TtlReclaimBackgroundService periodically sweeps for jobs whose processing_deadline_at has passed and re-queues them; after three reclaims a job is force-failed with error_class='stage_timeout_repeated' so a permanently stuck job can’t loop forever.
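The reclaim rule can be modeled in memory. This Python sketch is not the TtlReclaimBackgroundService: the Job shape and the reclaim_count field name are hypothetical, and it assumes "after three reclaims" means the fourth expired deadline triggers the force-fail.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

MAX_RECLAIMS = 3  # from the text: force-fail after three reclaims


@dataclass
class Job:
    id: int
    state: str
    processing_deadline_at: Optional[datetime] = None  # None means unclaimed
    reclaim_count: int = 0
    error_class: Optional[str] = None


def sweep(jobs, now):
    """Re-queue jobs whose deadline passed; force-fail the repeat offenders."""
    for job in jobs:
        deadline = job.processing_deadline_at
        if deadline is None or deadline > now:
            continue  # unclaimed, or still inside its processing window
        job.processing_deadline_at = None  # release the claim
        if job.reclaim_count >= MAX_RECLAIMS:
            job.state = "Failed"
            job.error_class = "stage_timeout_repeated"
        else:
            job.reclaim_count += 1


now = datetime(2026, 3, 1, tzinfo=timezone.utc)
stuck = Job(id=1, state="ConceptGenerating",
            processing_deadline_at=now - timedelta(minutes=5))
sweep([stuck], now)  # first reclaim: the job is released for another worker
```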
State output lands on disk before the row transitions. The saga writes generated bytes to the content-addressed path before updating the row to the next state. If the worker crashes between writing the file and updating the database, the next pickup finds the file already present and skips the vendor call. The cost ceiling is preserved across crashes; partial work survives.
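The write-before-transition ordering, sketched in Python. The helper names, the dict standing in for an art_jobs row, and the fake vendor are all hypothetical; the point is only the ordering and the existence check.

```python
import os
import tempfile
from pathlib import Path


def ensure_output(output_path, call_vendor):
    """Write the generated bytes unless they are already on disk.

    Returns True if the vendor was called, False if the file was reused.
    """
    if output_path.exists():
        return False
    data = call_vendor()
    output_path.parent.mkdir(parents=True, exist_ok=True)
    partial = output_path.with_name(output_path.name + ".part")
    partial.write_bytes(data)
    # Rename into place so a crash cannot leave a half-written file at the final path.
    os.replace(partial, output_path)
    return True


def advance(job, output_path, call_vendor, next_state):
    ensure_output(output_path, call_vendor)  # 1. bytes reach disk first
    job["state"] = next_state                # 2. only then does the row move


# Usage: a crash between step 1 and step 2 costs nothing on the retry.
calls = []

def fake_vendor():
    calls.append(1)
    return b"png-bytes"

with tempfile.TemporaryDirectory() as root:
    path = Path(root) / "candidates" / "000.png"
    job = {"state": "ConceptGenerating"}
    ensure_output(path, fake_vendor)  # worker dies before updating the row
    advance(job, path, fake_vendor, "AwaitingConceptApproval")  # next pickup
    vendor_calls = len(calls)
```

The second pickup finds the file and skips the vendor, so vendor_calls stays at one even though the stage ran twice.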
The schema is built by an embedded migration runner that ships inside the worker. Five migrations live as embedded resources in the project (V1 through V5) and apply sequentially on startup. V1 creates the table with twenty-something columns. V2 drops an over-eager category CHECK constraint that was rejecting valid recipes. V3 adds cost-enforcement columns. V4 adds reclaim tracking. V5 adds the child_index column the seed-variation fix needed (more on that in a war story below). Six indexes — claim order, brief id, trace id, category-state, TTL reclaim, created-at — keep the common queries cheap.
A daily cost ceiling (ART_WORKER_COST_CEILING_DAILY_USD, default $25) is checked before each vendor call. If the day’s spend exceeds the ceiling, the worker stops claiming new jobs until UTC midnight. This is the failsafe behind any pipeline that touches a paid API: the per-call cost estimates are unreliable, the daily cap isn’t.
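A minimal in-memory sketch of the ceiling check, in Python. The class is hypothetical and keeps the running total in a field; the real worker presumably derives spend from its Postgres rows. Only the environment variable name and the $25 default come from the text.

```python
import os
from datetime import datetime, timezone
from decimal import Decimal


class DailyCostCeiling:
    """Tracks spend per UTC day and gates vendor calls against a hard cap."""

    def __init__(self, ceiling_usd):
        self.ceiling_usd = ceiling_usd
        self._day = None
        self._spent = Decimal("0")

    def _roll_over(self, now):
        day = now.astimezone(timezone.utc).date()
        if day != self._day:  # first call, or UTC midnight has passed
            self._day, self._spent = day, Decimal("0")

    def record(self, now, cost_usd):
        self._roll_over(now)
        self._spent += cost_usd

    def may_call_vendor(self, now):
        """False once the day's spend exceeds the ceiling."""
        self._roll_over(now)
        return self._spent <= self.ceiling_usd


ceiling = DailyCostCeiling(
    Decimal(os.environ.get("ART_WORKER_COST_CEILING_DAILY_USD", "25"))
)
```

Decimal keeps the arithmetic exact at fractions of a cent, which matters when individual calls cost around a cent each.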
What we validate (and what we don’t)
The repo’s scripts/ directory has roughly twenty-five fail-close validators that run on every push. Most of them protect the simulation. The art side has one mechanical validator. We’re going to be honest about why.
validate-recipe-coverage.sh (Tier 1, wired into make validate-governance) checks that every component contract that declares required_recipe_ids has matching files in content/recipes/. It shape-validates each recipe: all eight required top-level fields present, lever_schema.additionalProperties === false, every lever property declares an enum keyword. The validator runs three controlled-failure scenarios against temp fixtures as a self-test (--self-test) so the validator itself stays falsifiable. If a recipe file is missing or a lever isn’t an enum, the push fails with a structured error.
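The shape checks are easy to picture in code. This is a Python sketch of the three rules named above, not a translation of the bash script. The required-field list is a parameter because this post does not enumerate all eight fields, and the non-empty-enum rule is an added assumption.

```python
def shape_errors(recipe, required_fields):
    """Return a list of structural problems; an empty list means the shape is valid."""
    errors = [f"missing field: {name}" for name in required_fields if name not in recipe]
    schema = recipe.get("lever_schema", {})
    if schema.get("additionalProperties") is not False:
        errors.append("lever_schema.additionalProperties must be false")
    for lever, prop in schema.get("properties", {}).items():
        enum = prop.get("enum")
        if not isinstance(enum, list) or not enum:
            errors.append(f"lever {lever!r} must declare a non-empty enum")
    return errors


# Usage with a deliberately broken recipe (hypothetical content).
bad = {
    "recipe_id": "demo.v1",
    "lever_schema": {
        "additionalProperties": True,
        "properties": {"hair": {"type": "string"}},
    },
}
problems = shape_errors(bad, ["recipe_id", "kind", "lever_schema", "locked_fields"])
```

Returning every problem at once, rather than stopping at the first, is a choice made for this sketch; it keeps a failed push to one round trip.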
Beyond that bash script, the pipeline relies on:
- A C# LeverBindingValidator at job-submission time (rejects unknown levers, missing required levers, values outside the enum set).
- The art generate --dry-run pre-flight (validates and prints what would queue, without writing).
- A circuit breaker on the hosted API runner (opens at ten consecutive failures; subsequent calls return immediately).
- The two review gates I described above.
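Of the items above, the circuit breaker is the one that is easiest to get subtly wrong, so here is a minimal Python sketch. The class and exception names are hypothetical. The text only says the breaker opens at ten consecutive failures, so this version has no half-open or timed-reset behavior; the real runner may well have one.

```python
class CircuitOpenError(RuntimeError):
    """Raised instead of calling the vendor while the breaker is open."""


class CircuitBreaker:
    def __init__(self, threshold=10):
        self.threshold = threshold
        self.consecutive_failures = 0

    @property
    def is_open(self):
        return self.consecutive_failures >= self.threshold

    def call(self, fn):
        if self.is_open:
            # Fail fast: the vendor is not contacted at all.
            raise CircuitOpenError("hosted API breaker is open")
        try:
            result = fn()
        except Exception:
            self.consecutive_failures += 1
            raise
        self.consecutive_failures = 0  # any success resets the streak
        return result
```

Counting consecutive failures rather than total failures means one flaky call in an otherwise healthy batch never trips the breaker.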
Why not more mechanical gates? Because art quality is fundamentally a judgment call. You cannot write a bash script that determines whether a character portrait looks good. The pipeline compensates by making me the gate at two explicit review steps and ensuring no generated asset reaches the manifest without my “yes.” The mechanical validators catch structural errors — missing files, malformed levers, schema drift. Aesthetic errors are mine to catch. Pretending otherwise would be dishonest, and dishonesty here would mean shipping bad art under a fig leaf of automation.
A few additional validators (validate-tile-cohesion.ts, validate-composition.ts, validate-rendering.ts, validate-motion.ts) are scoped for tile variety, transition coverage, color-profile adherence, and motion presence. Tier 2 is in progress; Tiers 3–5 are not started. They will land before the manifest write does.
War stories
Three things broke instructively. They were teachers.
The “ten identical concepts” surprise. The first time we tested the concept-gate batch flow, art queue --concepts 10 produced ten identical PNGs. All ten child jobs shared the recipe’s pinned seed; the ComfyUI workflow hardcoded that seed in the sampler; the pipeline was faithfully and deterministically producing ten copies of the same image. Determinism was working too well. The fix, in commit 29f0028, added a child_index column to art_jobs (migration V5) and made the effective seed recipe_seed + child_index. Child 0 gets seed 31337, child 1 gets 31338, and so on. Each concept is reproducible (same child_index always produces the same image) but distinct from its siblings. Lesson: determinism and variety are not opposites, but they don’t reconcile by accident. You need an explicit variance axis, named, in the data.
The rotation_urls null surprise. Early character generation called the hosted API’s “create character with directions” endpoint, polled the resulting background job, fetched the character object, and then crashed trying to download rotation images. The API response had rotation_urls present in the schema but null in the actual response body. Every character generation failed identically across five consecutive batch runs in March; the log records each one. The fix was to read the full polling envelope rather than the immediate response: the character_id and rotation_urls materialize in a last_response field on the completed background job, not in the 202 Accepted from the original POST. The vendor’s OpenAPI schema typed last_response as opaque object, so the first implementation reasonably assumed the completion payload had the same shape as the synchronous one. Lesson: when an async API hands you a background job, the completion payload is not the same shape as the submission response. Poll the completion endpoint, dump the raw JSON, and then build the deserializer.
Saga CAS vs. TTL sweep. The saga used compare-and-swap on updated_at to prevent double-processing: UPDATE art_jobs SET state = ... WHERE id = ... AND updated_at = @expected. In overnight runs with the face-detection inpainting step enabled (longer generation times), the worker would claim a job, run the generation for sixty-plus seconds, then attempt the CAS update — which failed because the TTL reclaim sweep had touched the row’s updated_at in the meantime. The job was stuck: claimed but never advanced, never reclaimed (the sweep only touches un-claimed jobs). Commit 64c8e3e5 added a continuation loop: if the CAS fails, the worker re-reads the row, verifies it still owns the claim, and retries the transition. The same commit bumped polling ceilings for overnight workloads. Lesson: if your saga uses optimistic concurrency, make sure your background sweeps don’t silently invalidate the CAS token of in-flight work.
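The continuation loop can be sketched against an in-memory row. This is a Python model, not the code from the commit: the Row shape, the claimed_by field, the integer stand-in for updated_at, and the attempt limit are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Row:
    state: str
    claimed_by: Optional[str]
    updated_at: int  # monotonically increasing stand-in for the timestamp


def cas_transition(row, expected_updated_at, new_state):
    """UPDATE ... WHERE updated_at = @expected, modeled in memory."""
    if row.updated_at != expected_updated_at:
        return False  # someone touched the row since we read it
    row.state = new_state
    row.updated_at += 1
    return True


def transition_with_retry(row, worker_id, expected, new_state, max_attempts=3):
    """Retry a lost CAS as long as this worker still owns the claim."""
    for _ in range(max_attempts):
        if cas_transition(row, expected, new_state):
            return True
        if row.claimed_by != worker_id:
            return False  # the claim moved to another worker; do not fight it
        expected = row.updated_at  # re-read the token and try again
    return False
```

The ownership check is what keeps the retry safe: a worker whose job really was reclaimed backs off instead of overwriting another worker's progress.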
What it’s like to add an asset
A new character portrait, end to end, looks like this:
# Validate the recipe and lever values without queueing
art generate <recipe-id> --hair red --mood stoic --dry-run
# Queue the job
art generate <recipe-id> --hair red --mood stoic
# (~60 seconds: local concept generation)
art list-candidates <recipe-id>
art review-concepts <brief-id> # me: approve/reject
# (~90 seconds: hosted API rotation generation)
art review <job-id> # me: approve/reject
# Promote to frozen
art promote <recipe-id> <hash16> --backend <name> --frame 0
About five commands, three to five minutes wall time, dominated by generation time. If the recipe doesn’t already exist, add roughly fifteen minutes of authoring (write the JSON, choose lever values, draft the prompt template). The validator catches structural mistakes before the push reaches CI.
There’s no hot-reload. Each iteration is a full generate-review-promote cycle, bottlenecked by the 30–140 seconds the model needs. The single change that would noticeably tighten the loop is wiring promotion directly into the manifest write and the atlas pack — collapsing today’s “promote → run a copy script → rebuild the client” chain into one art ship command. The substrate exists (the projector is shipped); the wiring is the next thing on the queue.
Receipts
A grab-bag of concrete numbers, current as of this post:
- Manifest entries today: 157 across characters, portraits, buildings, terrain tiles and tilesets, environment objects, icons, and UI elements. All currently in a Generated state under the v1 path; none have been promoted through the v2 pipeline yet (we’re mid-cutover).
- Raw sprite files in the client’s asset tree: ~2,360 files, ~9.6 MB total. Most of that is character animation frames.
- v2 outputs in assets/art-directed/: 8 content-hash directories so far; ~38 MB total (concept PNGs are ~6 MB each at 2048×3072 post-upscale; rotation ZIPs are ~150 KB).
- Concept generation time on a 4 GB consumer GPU under WSL2: 30–90 s per concept. The VRAM-safety flags on the ComfyUI launcher are load-bearing, --cpu-vae in particular, because the VAE decode spike will crash the entire WSL session on a 4 GB card without it. (Bare-metal Linux merely OOM-kills the process. WSL takes the whole subsystem down.)
- Hosted-API generation time: tiles 35–105 s, portraits 25–135 s, icons 11–22 s, UI elements 15–50 s.
- Hosted-API cost: $0.008–$0.010 per image. Daily ceiling: $25.
- Recipe catalog: 2 canonical recipes shipped, 1 ComfyUI workflow template. Recipe files are 1.7–3.6 KB each.
- Saga schema: 5 embedded SQL migrations, 20-plus columns per row, 6 indexes.
What we’d tell ourselves in February
A list, since this kind of advice tends to land best as a list.
- Put the operator review gate before the expensive step, not after. Local concept generation costs nothing; the hosted rotation API costs money and time. Rejecting a bad concept before it reaches the paid API saved roughly 60% of vendor spend in the first batch runs. Design your pipeline so cheap evaluation precedes expensive generation.
- Version your recipes; don’t mutate them. When locked_fields change (new checkpoint, new LoRA strength, different canvas), create recipe.v2.json next to recipe.v1.json. Old versions remain addressable for audit. In-place mutation makes “what produced this asset?” unanswerable after the fact. Immutable recipe files are a cheap form of experiment tracking.
- Content-address your outputs by their inputs. The SHA-256 hash of the lever binding means re-running a generation with the same parameters is a no-op (the file already exists). This saves you from re-generating after worker crashes, and it makes the pipeline inherently idempotent. If you build any batch pipeline that talks to expensive APIs, compute the content hash before the API call and check the filesystem first.
- Constrain your variance axes to enums, not free-form strings. Free-form parameters feel flexible until you try to answer “have we generated all the variants we need?” and discover you can’t define “all.” Enums make the total output space finite, the content hash deterministic, and the validation mechanical. The expressivity you give up was never load-bearing; the enumerability you gain is.
- Make your queue durable independently of your interactive session. Our v1 pipeline ran in-process inside a CLI invoked from a terminal session. A lost tmux pane destroyed the batch state. The v2 pipeline stores every job in Postgres: process death loses at most one in-flight generation, and the TTL reclaim sweep picks it up automatically. If your pipeline takes more than five minutes end-to-end, it should not live inside the process that initiated it.
- Separate “generate” from “ship.” Distinct generate, review, promote, and manifest-write steps mean you can generate fifty variants, review them over coffee, promote the three best, and only those three enter the game. Pipelines that auto-ship every generation force you to be conservative with parameters; pipelines with an explicit promotion step let you be exploratory with generation and selective with shipping.
- Budget your API costs with a daily ceiling, not just a per-call estimate. A hard daily cap ($25 in our case) prevents a runaway batch from spending more than you planned. Per-call estimates are unreliable: our v1 log shows $0.000 cost for images the vendor actually charged for. The ceiling is the failsafe.
- Don’t ask the AI to judge the AI unless your judge is calibrated on your world. A vision-model gate sounds like throughput. The failure mode is a generation that looks like something, just not yours, and that failure ships silently because the judge said yes. Until you can train and validate a judge against your own taste, the operator gate is cheaper than the cost of one shipped wrong-looking asset.
What’s next
The next post in the launch arc takes the manifest itself as its subject. The recipe’s composition_mode, anchor_offsets, and variant_sprites carry an entire visual-composition vocabulary; the renderer reads them as data, the projector emits them, the Age manifest binds them. That surface area is its own post.
Further out: the audio pipeline. A separate service following the same governance pattern — external producer, build-time snapshot, manifest-bound, no runtime authority — with a completely different stack.
Adjacent to both: validate-recipe-coverage.sh is one of roughly twenty-five fail-close validators wired into our pre-push hook. The broader story — how we turn architectural rules into CI gates — is its own post.
Companion to this one: Set up ComfyUI for your own content pipeline — a step-by-step walkthrough of the local Stable Diffusion install end to end, the PixelLab API integration, and the launch flags that keep VAE decode from OOM-ing on a 4 GB consumer GPU.
If you want to follow along, subscribe via the form at the bottom of any page — one short email when the next post lands. If you want to argue, write to devlog@skaldborn.com.
Everything else is the boring engineering of making it true.