Obfuscating Contact Data on a Static Site

Category: Development

Tags: privacy, astro


German law (§ 5 DDG, which replaced § 5 TMG in May 2024) requires a publicly accessible imprint with a real name, postal address, and a working email address. That’s essentially a legal mandate to publish exactly the data spam harvesters are looking for. This post walks through the two-layer approach I settled on to protect that data on a static site without making the page unusable.

The setup

  • Site: Astro 6 static build — no server-side rendering, no API for on-demand obfuscation.
  • Legal baseline: the imprint page must show a real name, address, and email.
  • Threat model: automated harvesters (static scrapers, regex crawlers, JS-capable bots). Not a determined human.

The problem

A static site has no server-side rendering to help. The HTML is delivered as-is, and any email address in the source is one regex away. The most common markup — <a href="mailto:hey@example.com"> — is a harvester’s dream: no JavaScript required, no interaction needed.

First layer: Base64 + JavaScript

Problem: plain text in the HTML source is readable without executing anything.

Implementation: move the real data out of the source entirely. Leave the element empty and carry the content as Base64 in a data-obf attribute (the payload here decodes to rentlA nairdA, already reversed for the second layer below):

<span data-obf="cmVudGxBIG5haXJkQQ=="></span>

A small script decodes and injects at runtime:

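// Decode each Base64 payload and inject it as plain text.
// textContent (not innerHTML) means nothing in the payload is parsed as markup.
document.querySelectorAll('[data-obf]').forEach(el => {
  el.textContent = atob(el.dataset.obf);
});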

For links, data-obf-href carries the full mailto: URL — so the mailto: prefix is never in the HTML source either:

<a data-obf-href="bWFpbHRvOmhleUBleGFtcGxlLmNvbQ=="></a>
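
The script above only fills text nodes; the post does not show how data-obf-href gets applied. Here is a minimal sketch of one way to do it, under two assumptions of mine: the attribute may hold UTF-8 (for example an umlaut in a street name), and the href is only filled in on first hover or focus so the plain mailto: URL is not sitting in the DOM at load. decodeObf and armLinks are illustrative names, not part of the original setup.

```javascript
// Hypothetical helper: decode Base64 that may contain UTF-8.
// atob() alone returns one character per byte, so multi-byte characters
// need a TextDecoder pass afterwards.
function decodeObf(b64) {
  const bytes = Uint8Array.from(atob(b64), ch => ch.charCodeAt(0));
  return new TextDecoder('utf-8').decode(bytes);
}

// Assumption: the href is assigned lazily, on first hover or keyboard focus,
// rather than at page load.
function armLinks(root) {
  root.querySelectorAll('a[data-obf-href]').forEach(link => {
    const reveal = () => {
      link.href = decodeObf(link.dataset.obfHref);
    };
    link.addEventListener('pointerenter', reveal, { once: true });
    link.addEventListener('focus', reveal, { once: true });
  });
}

// Only touch the DOM when one exists, so the helper can also run in Node.
if (typeof document !== 'undefined') {
  armLinks(document);
}
```

Whether lazy assignment is worth it depends on the threat model: it keeps the address out of a DOM snapshot taken right after load, but a bot that simulates pointer events would still get it.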

Solution: static HTML scrapers see only Base64. The decoded text exists only in the DOM after JS runs.
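
To make the claim concrete, here is a small sketch that runs a deliberately naive harvester pattern over both kinds of markup. The regex and the harvest function are my own illustration; real crawlers use different and often broader patterns.

```javascript
// A deliberately naive email pattern, for illustration only.
const EMAIL_RE = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

// Return every email-looking substring found in the raw HTML source.
function harvest(html) {
  return html.match(EMAIL_RE) ?? [];
}

const plainMarkup = '<a href="mailto:hey@example.com">Mail</a>';
const obfuscatedMarkup =
  '<a data-obf-href="bWFpbHRvOmhleUBleGFtcGxlLmNvbQ=="></a>';

console.log(harvest(plainMarkup));      // [ 'hey@example.com' ]
console.log(harvest(obfuscatedMarkup)); // []
```

A harvester that also tries to Base64-decode attribute values would get through, which is part of why a second layer is useful.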

What it doesn’t protect: copy-paste. Once the script runs and the real text lands in the DOM, selecting and copying hands out the real address.

Second layer: CSS RTL reversal

Problem: used on its own, the first layer is defeated by a JS-capable scraper or by a simple copy-paste.

Implementation: store the text reversed in the DOM and flip it back visually with CSS, a common trick among privacy-focused blogs:

.r {
  direction: rtl;
  unicode-bidi: bidi-override;
}

With unicode-bidi: bidi-override, the browser renders characters strictly right-to-left. So the DOM string rentlA nairdA displays as Adrian Altner. When a visitor copies the text, they get rentlA nairdA — not the real name.

Combining both

The two techniques compose cleanly. data-obf stores a Base64-encoded reversed string. JavaScript decodes it and writes the reversed text into the DOM. CSS renders it visually correct.

<span class="r" data-obf="cmVudGxBIG5haXJkQQ=="></span>

The flow:

  1. Static HTML — empty element, opaque Base64 attribute.
  2. After JS — rentlA nairdA in the DOM.
  3. After CSS — displays as Adrian Altner.
  4. On copy-paste — rentlA nairdA lands in the clipboard.

A static scraper sees nothing. A JS-capable scraper sees reversed text. A human copying gets the reversed string.
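
The four stages can be traced for a single attribute value with a short sketch. The stages function is hypothetical, and it models the CSS step as a plain character reversal, which only holds for text without bracket characters.

```javascript
// Hypothetical walk-through of the four stages for one data-obf value.
function stages(dataObf) {
  // Stage 2: what the script writes into the element.
  const dom = atob(dataObf);
  // Stage 3: what rtl + bidi-override shows (simple reversal, no brackets).
  const displayed = [...dom].reverse().join('');
  // Stage 4: copying yields the DOM string, not the displayed one.
  return { source: dataObf, dom, displayed, clipboard: dom };
}

console.log(stages('cmVudGxBIG5haXJkQQ=='));
```

For the value used in this post, dom and clipboard come out as rentlA nairdA and displayed as Adrian Altner.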

Generating the reversed strings

Problem: plain text reverses cleanly, but the email address is displayed with {at} and {dot} placeholders in place of the @ and the dot, and those brace placeholders run into another browser behavior: bidi mirroring.

Bracket characters such as {, }, ( and ) are Unicode bidi-mirrored pairs. In an RTL context the browser swaps each one for its counterpart: { is drawn as } and vice versa. So to display {at}, the stored string must be {ta}. Only the letters are reversed; the braces stay as they are, because for them the character reversal and the mirroring cancel out:

  • Stored: {ta} → reversed char order: }at{ → bidi-mirrored: {at} (correct)
  • Stored: }ta{ (the naive full reversal) → reversed char order: {at} → bidi-mirrored: }at{ (wrong)

Same applies to {dot}:

  • Stored: {tod} → reversed: }dot{ → mirrored: {dot}

So to display hey{at}adrian-altner{dot}com, the stored string is:

moc{tod}rentla-nairda{ta}yeh
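
A rough model of what the browser does with direction: rtl and unicode-bidi: bidi-override makes it possible to check stored strings without opening a page. This is only a sketch: renderRtlOverride and the MIRROR table are my own names, and the table lists just the pairs mentioned here, not the full Unicode mirroring data.

```javascript
// Only the pairs discussed in this post; Unicode defines many more.
const MIRROR = { '{': '}', '}': '{', '(': ')', ')': '(' };

// Rough model of rtl + bidi-override rendering: characters are laid out
// right-to-left, and mirrored pairs are swapped for their counterparts.
function renderRtlOverride(stored) {
  return [...stored]
    .reverse()
    .map(ch => MIRROR[ch] ?? ch)
    .join('');
}

console.log(renderRtlOverride('moc{tod}rentla-nairda{ta}yeh'));
// hey{at}adrian-altner{dot}com
console.log(renderRtlOverride('}ta{'));
// }at{
```

The model is no substitute for checking in a real browser, but it catches the naive-reversal mistake early.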

Implementation: a small Node script generates the values for every address field:

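// Reverse the string, then swap in placeholders whose letters are already
// reversed. The braces stay as written because RTL rendering mirrors them.
const obfuscate = s =>
  Array.from(s).reverse().join('')   // Array.from keeps surrogate pairs intact
   .replace(/@/g, '{ta}')
   .replace(/\./g, '{tod}');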

The result is Base64-encoded and placed in the data-obf attribute.
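
The encoding step is not shown above, so here is a minimal Node sketch of the whole path from plain value to attribute value. toDataObf and its email option are hypothetical; the option exists only because I assume placeholder substitution is wanted for email addresses and not for names or street lines that happen to contain a dot.

```javascript
// Hypothetical build helper (Node.js): plain value in, data-obf value out.
function toDataObf(plain, { email = false } = {}) {
  let reversed = Array.from(plain).reverse().join('');
  if (email) {
    // Placeholders with reversed letters and unreversed braces.
    reversed = reversed.replace(/@/g, '{ta}').replace(/\./g, '{tod}');
  }
  // Encode as UTF-8 bytes, then Base64.
  return Buffer.from(reversed, 'utf8').toString('base64');
}

console.log(toDataObf('Adrian Altner')); // cmVudGxBIG5haXJkQQ==
console.log(toDataObf('hey@adrian-altner.com', { email: true }));
```

If a value contains non-ASCII characters, the browser-side decoder has to decode UTF-8 as well, since atob alone returns one character per byte.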

What this protects, and what it doesn’t

| Threat | Protected |
| --- | --- |
| Static HTML scrapers (most spam bots) | ✅ |
| Regex email harvesters | ✅ |
| Copy-paste by visitors | ✅ (reversed text in clipboard) |
| Headless browsers running JS (Puppeteer) | ⚠️ reversed text, no plain email |
| Someone inspecting the decoded DOM | ❌ |
| Manual reading by a human | ❌ |

The last two cases are unavoidable: if the data is readable by a human, a determined human will get it. The goal isn’t perfect protection; it’s raising the bar enough to stop the automated tools responsible for the vast majority of harvested-address spam. For a public imprint on a personal blog, that’s a reasonable trade-off.

What to take away

  • Don’t publish mailto: links in the static HTML source. They’re the first thing harvesters look for.
  • Two cheap layers beat one clever layer. Base64 defeats static scrapers; RTL reversal blunts copy-paste and JS-capable scrapers, which only ever see reversed text.
  • Bidi-mirroring is real. Brace placeholders must be stored with their braces already mirrored ({ta}, not }ta{), so test with an actual browser, not just a reverse function.
  • Base64 stays in the source, plaintext never does. The decoded address only exists in the DOM after the script runs.
  • Aim for the bar, not perfection. Anything readable by a human is eventually readable by a patient human — optimise for the bots that send the spam.