Vision Rate-Limit Runbook for Photo Sidecars
When processing a larger batch of photos with pnpm vision, it is normal to hit rate limits before you hit a credit limit.
Example:
- Credit balance can still be positive (for example, $3.16)
- but requests still fail with 429 if the organization's tokens-per-minute (TPM) limit is exhausted
That means the immediate issue is throughput per minute, not account balance.
Quick TODO checklist
- Start with a conservative run:
  pnpm vision -- --concurrency=1 --retries=10 --backoff-ms=2000
- If the run is stable, increase slowly to:
  --concurrency=2
- If you still need higher throughput, raise your OpenAI rate-limit tier
- Optional next optimization: downscale images before upload to reduce payload and token pressure
Recommended run sequence
- Run with concurrency=1 first.
- Watch logs for 429 retries and total runtime.
- Increase to concurrency=2 only after a full successful run.
- Keep retries enabled for long batches.
Why this works
scripts/vision.ts now has:
- bounded concurrency for Vision calls
- automatic retry with exponential backoff for 429
So the process no longer fails immediately when TPM is saturated for a short window.
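The two mechanisms can be sketched roughly as follows. This is a minimal illustration and not the actual contents of scripts/vision.ts: the names withRetry and mapWithConcurrency are hypothetical, the error is assumed to expose an HTTP-like status field, and the doubling delay is one common form of exponential backoff that the real script may or may not use.

```typescript
// Assumed error shape: anything carrying an HTTP-like `status` field.
type StatusError = { status?: number };

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Retry `fn` only on 429, doubling the wait after each failed attempt.
// Any other error, or a 429 after `retries` retries, is rethrown.
async function withRetry<T>(
  fn: () => Promise<T>,
  retries: number,
  backoffMs: number,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      const status = (err as StatusError | null)?.status;
      if (status !== 429 || attempt >= retries) throw err;
      await sleep(backoffMs * 2 ** attempt);
    }
  }
}

// Run `worker` over `items` with at most `concurrency` calls in flight.
// Results are returned in the same order as `items`.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  concurrency: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let next = 0;
  // Each lane pulls the next unclaimed index until the list is drained.
  const runLane = async (): Promise<void> => {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index] as T);
    }
  };
  const lanes = Math.max(1, Math.min(concurrency, items.length));
  await Promise.all(Array.from({ length: lanes }, () => runLane()));
  return results;
}
```

Under these assumptions, --backoff-ms=2000 with plain doubling would wait 2 s, 4 s, 8 s, and so on between attempts. Whether scripts/vision.ts caps the delay or adds jitter is not stated in this runbook.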
Next improvement
If batches continue to be slow, implement automatic image downscaling before the API call.
That usually improves both cost-efficiency and throughput for large libraries.
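As a starting point, the size calculation for such a downscale step might look like the sketch below. The function name fitWithin and the 1024-pixel limit in the usage note are assumptions for illustration only. The resize itself would be done by an image library, which is not shown here.

```typescript
interface Size {
  width: number;
  height: number;
}

// Fit `size` inside a square of `maxSide` pixels, preserving aspect ratio.
// Never upscales: images already within the limit are returned unchanged.
function fitWithin(size: Size, maxSide: number): Size {
  const longest = Math.max(size.width, size.height);
  if (longest <= maxSide) return { ...size };
  const scale = maxSide / longest;
  return {
    width: Math.max(1, Math.round(size.width * scale)),
    height: Math.max(1, Math.round(size.height * scale)),
  };
}
```

For example, fitWithin({ width: 4032, height: 3024 }, 1024) yields a 1024 by 768 target, while an 800 by 600 image is left at its original size.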