Migrating the Vision Script from OpenAI to Claude
The script that generates photo sidecar files — scripts/vision.ts — was originally written against the OpenAI API. This post covers what changed when migrating to Claude.
What the script does
scripts/vision.ts processes new JPG files in the photo albums directory. For each image without a .json sidecar, it:
- Extracts EXIF metadata with exiftool (camera, lens, aperture, ISO, focal length, shutter speed, GPS)
- Sends the image to an AI vision API to generate alt text, title suggestions, and tags
- Merges both into a JSON file written next to the image
The resulting sidecar drives the photo stream on this site — alt text for accessibility, titles for the detail page, EXIF for the metadata panel.
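A sidecar might look something like this — the field names and values here are illustrative, not the script's exact schema:

```json
{
  "title": "Neon reflections after rain",
  "description": "A wet city street at night, with neon signs reflected in puddles.",
  "tags": ["night", "street", "neon"],
  "exif": {
    "aperture": "f/1.8",
    "iso": 400,
    "focalLength": "35 mm",
    "shutterSpeed": "1/60"
  }
}
```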
What changed
Dependency
```sh
# before
pnpm add openai

# after
pnpm add @anthropic-ai/sdk
```
Environment variable
```sh
# before
OPENAI_API_KEY=sk-...

# after
ANTHROPIC_API_KEY=sk-ant-...
```
Client
```ts
// before
import OpenAI from "openai";
const openai = new OpenAI();

// after
import Anthropic from "@anthropic-ai/sdk";
const anthropic = new Anthropic({ maxRetries: 0 });
```
maxRetries: 0 disables the SDK’s built-in retry behaviour. The script manages its own retry loop with configurable backoff, so double-retrying would be redundant.
Structured output
OpenAI used a json_schema response format to constrain the model output:
```ts
const completion = await openai.chat.completions.create({
  model: "gpt-5.1-chat-latest",
  max_completion_tokens: 2048,
  response_format: {
    type: "json_schema",
    json_schema: {
      name: "visionResponse",
      strict: true,
      schema: { ... },
    },
  },
  messages: [{ role: "user", content: [...] }],
});

const result = JSON.parse(completion.choices[0].message.content);
```
Claude uses tool use with tool_choice: { type: "tool" } to force a call to a specific tool. The model's output then conforms to that tool's input schema and arrives as an already-parsed object, so there is no JSON parsing step:
```ts
const response = await anthropic.messages.create({
  model: "claude-opus-4-6",
  max_tokens: 2048,
  tools: [{
    name: "vision_response",
    description: "Return the vision analysis of the image.",
    input_schema: {
      type: "object",
      additionalProperties: false,
      properties: {
        title_ideas: { type: "array", items: { type: "string" } },
        description: { type: "string" },
        tags: { type: "array", items: { type: "string" } },
      },
      required: ["title_ideas", "description", "tags"],
    },
  }],
  tool_choice: { type: "tool", name: "vision_response" },
  messages: [{
    role: "user",
    content: [
      {
        type: "image",
        source: { type: "base64", media_type: "image/jpeg", data: encodedImage },
      },
      { type: "text", text: prompt },
    ],
  }],
});

const toolUseBlock = response.content.find((b) => b.type === "tool_use");
const result = toolUseBlock.input; // already a typed object, no JSON.parse needed
```
The image content block format also differs: OpenAI uses { type: "image_url", image_url: { url: "data:image/jpeg;base64,..." } }, while Anthropic uses a dedicated source block with type: "base64" and a separate media_type field.
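As a sketch of that difference, both shapes can be built from the same raw buffer. The helper name toImageBlocks is made up for illustration; only the two block shapes come from the script above:

```ts
// Build both provider-specific image content blocks from one raw image buffer.
function toImageBlocks(imageBuffer: Buffer) {
  const encodedImage = imageBuffer.toString("base64");
  return {
    // OpenAI: the media type is embedded in a data URL string.
    openai: {
      type: "image_url",
      image_url: { url: `data:image/jpeg;base64,${encodedImage}` },
    },
    // Anthropic: the media type is a separate structured field.
    anthropic: {
      type: "image",
      source: { type: "base64", media_type: "image/jpeg", data: encodedImage },
    },
  };
}
```

The practical upshot is that the Anthropic shape keeps the media type as data rather than as a substring to splice into a URL.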
Rate limit handling
The script catches rate limit errors and retries with exponential backoff. The detection and retry-after extraction now use the SDK’s typed exception class:
```ts
// before — checking a raw status property
function isRateLimitError(error: unknown): boolean {
  return (error as { status?: number }).status === 429;
}

function extractRetryAfterMs(error: unknown): number | null {
  // parsed "Please try again in Xs" from the error message
}

// after — using Anthropic's typed exception
function isRateLimitError(error: unknown): boolean {
  return error instanceof Anthropic.RateLimitError;
}

function extractRetryAfterMs(error: unknown): number | null {
  if (!(error instanceof Anthropic.RateLimitError)) return null;
  const retryAfter = error.headers?.["retry-after"];
  if (retryAfter) {
    const seconds = Number.parseFloat(retryAfter);
    if (Number.isFinite(seconds) && seconds > 0) return Math.ceil(seconds * 1000);
  }
  return null;
}
```
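For context, the surrounding retry loop looks roughly like this. It's a simplified sketch, not the script's exact code — withRetries, maxAttempts, and baseDelayMs are illustrative names:

```ts
// Retry a call while the error is retryable, honouring the server's
// retry-after hint when present, else falling back to exponential backoff.
async function withRetries<T>(
  fn: () => Promise<T>,
  isRetryable: (error: unknown) => boolean,
  retryAfterMs: (error: unknown) => number | null,
  maxAttempts = 5,
  baseDelayMs = 1000,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (error) {
      if (attempt >= maxAttempts || !isRetryable(error)) throw error;
      // Prefer the server-provided delay; otherwise wait 1s, 2s, 4s, ...
      const delay = retryAfterMs(error) ?? baseDelayMs * 2 ** (attempt - 1);
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}
```

The two helpers above slot straight into the isRetryable and retryAfterMs parameters.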
Everything else — EXIF extraction, concurrency control, batching, file writing, CLI flags — stayed exactly the same.
Why claude-opus-4-6
claude-opus-4-6 is Anthropic’s most capable model and handles dense visual scenes, low-light photography, and culturally specific subjects well. For a batch script that runs offline before a deploy, quality matters more than latency.