Vision Rate-Limit Runbook for Photo Sidecars

When processing a large batch of photos with pnpm vision, it is normal to hit a rate limit before you run out of credit.

Example:

The credit balance can still be positive (for example $3.16), but requests fail with HTTP 429 if the organization's tokens-per-minute (TPM) limit is exhausted.

That means the immediate issue is throughput per minute, not account balance.
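
To make the throughput limit concrete, here is a back-of-envelope sketch of how many requests fit into one minute under a TPM limit. The function name and both numbers are illustrative assumptions, not values taken from any account or from scripts/vision.ts; substitute your own limit and your observed tokens per request.

```typescript
// Illustrative only: the inputs used below are placeholder numbers.
function maxRequestsPerMinute(tpmLimit: number, tokensPerRequest: number): number {
  if (tpmLimit <= 0 || tokensPerRequest <= 0) {
    throw new RangeError("tpmLimit and tokensPerRequest must be positive");
  }
  // Whole requests only: a partially funded request would still be rejected.
  return Math.floor(tpmLimit / tokensPerRequest);
}

// Hypothetical limit of 30,000 TPM and roughly 1,200 tokens per photo.
console.log(maxRequestsPerMinute(30_000, 1_200)); // 25
```

Once the batch sends more requests per minute than this estimate, 429 responses are expected no matter how much credit remains.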

Quick TODO checklist

Start with a conservative run: pnpm vision -- --concurrency=1 --retries=10 --backoff-ms=2000
If the run is stable, increase slowly to: --concurrency=2
If you still need higher throughput, raise your OpenAI rate-limit tier
Optional next optimization: downscale images before upload to reduce payload and token pressure
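
As a rough guide to what --retries=10 --backoff-ms=2000 can mean in wall-clock time, the sketch below lists the waits under one assumed policy: the delay doubles on every retry, with no jitter and no upper cap. That policy is an assumption made for illustration; check scripts/vision.ts for the actual schedule.

```typescript
// Assumption: delay = backoffMs * 2^attempt, no jitter, no cap.
function backoffSchedule(retries: number, backoffMs: number): number[] {
  return Array.from({ length: retries }, (_, attempt) => backoffMs * 2 ** attempt);
}

const delays = backoffSchedule(10, 2000);
const totalMs = delays.reduce((sum, delay) => sum + delay, 0);

console.log(delays.slice(0, 4)); // [ 2000, 4000, 8000, 16000 ]
console.log(totalMs); // 2046000
```

Under that assumed policy, a single photo that exhausts all ten retries would wait about 34 minutes in total, which is one reason to watch the retry count in the logs.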

Recommended run sequence

Run with concurrency=1 first.
Watch logs for 429 retries and total runtime.
Increase to concurrency=2 only after a full successful run.
Keep retries enabled for long batches.

Why this works

scripts/vision.ts now has:

bounded concurrency for Vision calls
automatic retry with exponential backoff for 429

So the process no longer fails immediately when the TPM limit is briefly saturated.
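
The two mechanisms can be sketched as follows. This is a minimal illustration, not the code in scripts/vision.ts: the helper names (withRetry, mapWithConcurrency), the error shape (a thrown object with a numeric status field), and the doubling delay are all assumptions.

```typescript
// Hypothetical error shape: any thrown value carrying a numeric `status`.
type HttpError = { status?: number };

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Retry `fn` on HTTP 429 only, doubling the wait after each failed attempt.
// Any other error, or a 429 after the last retry, is rethrown to the caller.
async function withRetry<T>(
  fn: () => Promise<T>,
  retries: number,
  backoffMs: number,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      const status = (err as HttpError | null)?.status;
      if (status !== 429 || attempt >= retries) throw err;
      await sleep(backoffMs * 2 ** attempt);
    }
  }
}

// Run `worker` over `items` with at most `concurrency` calls in flight.
// Results are returned in the same order as `items`.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  concurrency: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let next = 0;
  // Each lane pulls the next unclaimed index until the list is exhausted.
  const lane = async (): Promise<void> => {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index] as T);
    }
  };
  const lanes = Math.max(1, Math.min(concurrency, items.length));
  await Promise.all(Array.from({ length: lanes }, lane));
  return results;
}
```

A batch would then combine the two, for example mapWithConcurrency(paths, 1, (p) => withRetry(() => describePhoto(p), 10, 2000)), where describePhoto stands in for whatever function performs the Vision call. Retrying only on 429 keeps genuine failures, such as an invalid image, from being retried pointlessly.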

Next improvement

If batches continue to be slow, implement automatic image downscaling before the API call.
Smaller images reduce payload size and token usage per request, which usually improves both cost efficiency and throughput for large libraries.
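
A first step toward that improvement is deciding the target dimensions. The sketch below computes them only; the actual pixel resize would be done by an image library of your choice. The helper name fitWithin and the 1024-pixel limit are assumptions for illustration, not settings used by scripts/vision.ts.

```typescript
type Size = { width: number; height: number };

// Scale `size` so its longest side is at most `maxSide`, preserving aspect ratio.
// Images already within the limit are returned unchanged (never upscaled).
function fitWithin(size: Size, maxSide: number): Size {
  const longest = Math.max(size.width, size.height);
  if (longest <= maxSide) return { ...size };
  const scale = maxSide / longest;
  return {
    width: Math.max(1, Math.round(size.width * scale)),
    height: Math.max(1, Math.round(size.height * scale)),
  };
}

console.log(fitWithin({ width: 4000, height: 3000 }, 1024)); // { width: 1024, height: 768 }
```

Never upscaling matters here: enlarging a small photo would add payload without adding detail.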