Vision Rate-Limit Runbook for Photo Sidecars

When you process a large batch of photos with pnpm vision, you will usually hit a rate limit well before you run out of credit.

Example:

  • The credit balance can still be positive (for example, $3.16),
  • but requests still fail with HTTP 429 if the organization's tokens-per-minute (TPM) limit is exhausted.

That means the immediate issue is throughput per minute, not account balance.
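
To make the throughput constraint concrete, here is a rough back-of-the-envelope sketch. The TPM limit, the tokens-per-request figure, and the batch size are all hypothetical placeholders, not measured values; substitute the numbers from your own account and logs.

```typescript
// Rough estimate of how many Vision requests fit under a TPM limit.
// All inputs used below are hypothetical placeholders, not real limits.
interface ThroughputEstimate {
  maxRequestsPerMinute: number;
  minutesForBatch: number;
}

function estimateThroughput(
  tpmLimit: number,         // organization tokens-per-minute limit
  tokensPerRequest: number, // estimated tokens consumed by one photo request
  batchSize: number,        // number of photos in the batch
): ThroughputEstimate {
  if (tpmLimit <= 0 || tokensPerRequest <= 0 || batchSize < 0) {
    throw new RangeError("limits must be positive and batchSize non-negative");
  }
  const maxRequestsPerMinute = Math.floor(tpmLimit / tokensPerRequest);
  const minutesForBatch =
    maxRequestsPerMinute > 0 ? Math.ceil(batchSize / maxRequestsPerMinute) : Infinity;
  return { maxRequestsPerMinute, minutesForBatch };
}

// Example with made-up numbers: 30,000 TPM, 1,500 tokens per photo, 400 photos.
const estimate = estimateThroughput(30_000, 1_500, 400);
console.log(estimate); // { maxRequestsPerMinute: 20, minutesForBatch: 20 }
```

The point of the estimate is that no amount of credit shortens this runtime; only a higher limit or fewer tokens per request does.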

Quick TODO checklist

  • Start with a conservative run: pnpm vision -- --concurrency=1 --retries=10 --backoff-ms=2000
  • If the run is stable, increase slowly to: --concurrency=2
  • If you still need higher throughput, move to a higher OpenAI usage tier, which raises your rate limits
  • Optional next optimization: downscale images before upload to reduce payload and token pressure

Suggested order:

  1. Run with concurrency=1 first.
  2. Watch logs for 429 retries and total runtime.
  3. Increase to concurrency=2 only after a full successful run.
  4. Keep retries enabled for long batches.
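
To get a feel for what --retries=10 --backoff-ms=2000 can mean for total runtime, the sketch below lists the wait before each retry. It assumes the delay simply doubles after every failed attempt, with no cap and no jitter; the actual schedule in scripts/vision.ts may differ, so treat the numbers as an upper-bound illustration only.

```typescript
// Assumption: delay = baseMs * 2^attempt, uncapped and without jitter.
// This is an illustration, not a description of scripts/vision.ts.
function backoffSchedule(retries: number, baseMs: number): number[] {
  return Array.from({ length: retries }, (_, attempt) => baseMs * 2 ** attempt);
}

const schedule = backoffSchedule(10, 2000);
const totalMs = schedule.reduce((sum, ms) => sum + ms, 0);

console.log(schedule.slice(0, 4)); // [ 2000, 4000, 8000, 16000 ]
console.log(totalMs);              // 2046000 ms, roughly 34 minutes
```

Under that assumption, a single photo that is rate-limited on every attempt could wait about 34 minutes before giving up, which is worth keeping in mind when you read the total runtime of a long batch.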

Why this works

scripts/vision.ts now has:

  • bounded concurrency for Vision calls
  • automatic retry with exponential backoff for 429 responses

So the process no longer fails immediately when the TPM limit is saturated for a short window; it waits and tries again.
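
The two mechanisms can be sketched as follows. This is a minimal illustration of the general technique, not the code in scripts/vision.ts: the names withRetry, mapWithConcurrency, and isRateLimit are hypothetical, and the sketch assumes a rate-limit failure surfaces as a thrown object with status === 429.

```typescript
type Sleep = (ms: number) => Promise<void>;
const realSleep: Sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Assumption: a rate-limited call throws an object carrying status === 429.
function isRateLimit(err: unknown): boolean {
  return typeof err === "object" && err !== null &&
    (err as { status?: unknown }).status === 429;
}

// Retry a task on 429 only, doubling the wait after each failed attempt.
async function withRetry<T>(
  task: () => Promise<T>,
  retries: number,
  backoffMs: number,
  sleep: Sleep = realSleep, // injectable so tests do not really wait
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await task();
    } catch (err) {
      // Other errors, or an exhausted retry budget, are rethrown unchanged.
      if (!isRateLimit(err) || attempt >= retries) throw err;
      await sleep(backoffMs * 2 ** attempt);
    }
  }
}

// Run worker over items with at most `concurrency` calls in flight.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  concurrency: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  // Each lane pulls the next unclaimed index until the list is exhausted.
  async function runLane(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index]);
    }
  }
  const laneCount = Math.max(1, Math.min(concurrency, items.length));
  await Promise.all(Array.from({ length: laneCount }, runLane));
  return results;
}

// Usage sketch (describePhoto is a hypothetical stand-in for the Vision call):
// const captions = await mapWithConcurrency(paths, 1, (path) =>
//   withRetry(() => describePhoto(path), 10, 2000));
```

Combining the two keeps the number of simultaneous requests fixed while each lane absorbs short 429 windows on its own, which is why a low concurrency with generous retries tends to finish even when the limit is tight.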

Next improvement

If batches continue to be slow, implement automatic image downscaling before the API call.
Smaller images usually improve both cost-efficiency and throughput for large libraries.
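
As a starting point, the target dimensions can be computed with a small pure function. This is only a sketch: fitWithin and the maxSide value of 1536 are hypothetical choices for illustration, and the actual resizing would still need an image library of your choice.

```typescript
interface Size {
  width: number;
  height: number;
}

// Scale a size down so its longest side is at most maxSide,
// preserving aspect ratio. Images already small enough are left alone.
function fitWithin(size: Size, maxSide: number): Size {
  const longest = Math.max(size.width, size.height);
  if (longest <= maxSide) return { ...size }; // never upscale
  const scale = maxSide / longest;
  return {
    width: Math.max(1, Math.round(size.width * scale)),
    height: Math.max(1, Math.round(size.height * scale)),
  };
}

// Example: a 6000x4000 photo with a hypothetical 1536 px cap.
console.log(fitWithin({ width: 6000, height: 4000 }, 1536)); // { width: 1536, height: 1024 }
```

Computing the size separately from the resize keeps the cap easy to tune while you compare output quality against token usage.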
