Migrating the Vision Script from OpenAI to Claude
The script that generates photo sidecar files — scripts/vision.ts — was originally written against the OpenAI API. This post covers what changed when migrating to Claude.
What the script does
scripts/vision.ts processes new JPG files in the photo albums directory. For each image without a .json sidecar, it:
- Extracts EXIF metadata with exiftool (camera, lens, aperture, ISO, focal length, shutter speed, GPS)
- Sends the image to an AI vision API to generate alt text, title suggestions, and tags
- Merges both into a JSON file written next to the image
The resulting sidecar drives the photo stream on this site — alt text for accessibility, titles for the detail page, EXIF for the metadata panel.
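A sidecar might look something like this — the field names and values here are illustrative, not the script's exact schema:

```json
{
  "title": "Neon reflections after rain",
  "description": "A wet city street at night, with neon signs reflected in puddles.",
  "tags": ["night", "street", "neon"],
  "exif": {
    "aperture": "f/1.8",
    "iso": 400,
    "focalLength": "35 mm",
    "shutterSpeed": "1/60"
  }
}
```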
What changed
Dependency
```sh
# before
pnpm add openai

# after
pnpm add @anthropic-ai/sdk
```
Environment variable
```sh
# before
OPENAI_API_KEY=sk-...

# after
ANTHROPIC_API_KEY=sk-ant-...
```
Client
```ts
// before
import OpenAI from "openai";
const openai = new OpenAI();

// after
import Anthropic from "@anthropic-ai/sdk";
const anthropic = new Anthropic({ maxRetries: 0 });
```
maxRetries: 0 disables the SDK’s built-in retry behaviour. The script manages its own retry loop with configurable backoff, so double-retrying would be redundant.
Structured output
OpenAI used a json_schema response format to constrain the model output:
```ts
const completion = await openai.chat.completions.create({
  model: "gpt-5.1-chat-latest",
  max_completion_tokens: 2048,
  response_format: {
    type: "json_schema",
    json_schema: {
      name: "visionResponse",
      strict: true,
      schema: { ... },
    },
  },
  messages: [{ role: "user", content: [...] }],
});

const result = JSON.parse(completion.choices[0].message.content);
```
Claude uses tool use with tool_choice: { type: "tool" } to force a call to a specific tool. The model's output then conforms to that tool's input schema and arrives as an already-parsed object, so there is no JSON parsing step:
```ts
const response = await anthropic.messages.create({
  model: "claude-opus-4-6",
  max_tokens: 2048,
  tools: [{
    name: "vision_response",
    description: "Return the vision analysis of the image.",
    input_schema: {
      type: "object",
      additionalProperties: false,
      properties: {
        title_ideas: { type: "array", items: { type: "string" } },
        description: { type: "string" },
        tags: { type: "array", items: { type: "string" } },
      },
      required: ["title_ideas", "description", "tags"],
    },
  }],
  tool_choice: { type: "tool", name: "vision_response" },
  messages: [{
    role: "user",
    content: [
      {
        type: "image",
        source: { type: "base64", media_type: "image/jpeg", data: encodedImage },
      },
      { type: "text", text: prompt },
    ],
  }],
});

const toolUseBlock = response.content.find((b) => b.type === "tool_use");
const result = toolUseBlock.input; // already a typed object, no JSON.parse needed
```
The image content block format also differs: OpenAI uses { type: "image_url", image_url: { url: "data:image/jpeg;base64,..." } }, while Anthropic uses a dedicated source block with type: "base64" and a separate media_type field.
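As a sketch of that difference, both shapes can be built from the same raw buffer. The helper name toImageBlocks is made up for illustration; only the two block shapes come from the script above:

```ts
// Build both provider-specific image content blocks from one raw image buffer.
function toImageBlocks(imageBuffer: Buffer) {
  const encodedImage = imageBuffer.toString("base64");
  return {
    // OpenAI: the media type is embedded in a data URL string.
    openai: {
      type: "image_url",
      image_url: { url: `data:image/jpeg;base64,${encodedImage}` },
    },
    // Anthropic: the media type is a separate structured field.
    anthropic: {
      type: "image",
      source: { type: "base64", media_type: "image/jpeg", data: encodedImage },
    },
  };
}
```

The practical upshot is that the Anthropic shape keeps the media type as data rather than as a substring to splice into a URL.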
Rate limit handling
The script catches rate limit errors and retries with exponential backoff. The detection and retry-after extraction now use the SDK’s typed exception class:
```ts
// before — checking a raw status property
function isRateLimitError(error: unknown): boolean {
  return (error as { status?: number }).status === 429;
}

function extractRetryAfterMs(error: unknown): number | null {
  // parsed "Please try again in Xs" from the error message
}

// after — using Anthropic's typed exception
function isRateLimitError(error: unknown): boolean {
  return error instanceof Anthropic.RateLimitError;
}

function extractRetryAfterMs(error: unknown): number | null {
  if (!(error instanceof Anthropic.RateLimitError)) return null;
  const retryAfter = error.headers?.["retry-after"];
  if (retryAfter) {
    const seconds = Number.parseFloat(retryAfter);
    if (Number.isFinite(seconds) && seconds > 0) return Math.ceil(seconds * 1000);
  }
  return null;
}
```
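For context, the surrounding retry loop looks roughly like this. It's a simplified sketch, not the script's exact code — withRetries, maxAttempts, and baseDelayMs are illustrative names:

```ts
// Retry a call while the error is retryable, honouring the server's
// retry-after hint when present, else falling back to exponential backoff.
async function withRetries<T>(
  fn: () => Promise<T>,
  isRetryable: (error: unknown) => boolean,
  retryAfterMs: (error: unknown) => number | null,
  maxAttempts = 5,
  baseDelayMs = 1000,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (error) {
      if (attempt >= maxAttempts || !isRetryable(error)) throw error;
      // Prefer the server-provided delay; otherwise wait 1s, 2s, 4s, ...
      const delay = retryAfterMs(error) ?? baseDelayMs * 2 ** (attempt - 1);
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}
```

The two helpers above slot straight into the isRetryable and retryAfterMs parameters.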
Everything else — EXIF extraction, concurrency control, batching, file writing, CLI flags — stayed exactly the same.
Why claude-opus-4-6
claude-opus-4-6 is Anthropic’s most capable model and handles dense visual scenes, low-light photography, and culturally specific subjects well. For a batch script that runs offline before a deploy, quality matters more than latency.