Mar 23, 2026

Obfuscating Contact Data on a Static Site

German law (§ 5 DDG, which replaced § 5 TMG in 2024) requires a publicly accessible imprint with a real name, postal address, and a working email address. That’s essentially a legal mandate to publish exactly the data spam harvesters are looking for. This post walks through the two-layer approach I settled on to protect that data on a static site without making the page unusable.

The setup

Site: Astro 6 static build — no server-side rendering, no API for on-demand obfuscation.
Legal baseline: the imprint page must show a real name, address, and email.
Threat model: automated harvesters (static scrapers, regex crawlers, JS-capable bots). Not a determined human.

The problem

A static site has no server-side rendering to help. The HTML is delivered as-is, and any email address in the source is one regex away. The most common markup — <a href="mailto:hey@example.com"> — is a harvester’s dream: no JavaScript required, no interaction needed.

First layer: Base64 + JavaScript

Problem: plain text in the HTML source is readable without executing anything.

Implementation: move the real data out of the source entirely. Leave the element empty, carry the content as Base64 in a data-obf attribute:

<span data-obf="cmVudGxBIG5haXJkQQ=="></span>

A small script decodes and injects at runtime:

document.querySelectorAll('[data-obf]').forEach(el => {
  el.textContent = atob(el.dataset.obf);
});

For links, data-obf-href carries the full mailto: URL — so the mailto: prefix is never in the HTML source either:

<a data-obf-href="bWFpbHRvOmhleUBleGFtcGxlLmNvbQ=="></a>
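The script above only fills text content, and atob returns one character per byte, so a UTF-8 encoded ß or ü in a German address would come out garbled. Here is a minimal sketch of a decoder that handles both attributes, assuming the Base64 values were generated from UTF-8 bytes. The names decodeB64 and reveal are illustrative, not taken from the actual site:

```javascript
// Decode Base64 that was produced from UTF-8 bytes (an assumption about
// how the attribute values were generated).
function decodeB64(b64) {
  const bytes = Uint8Array.from(atob(b64), c => c.charCodeAt(0));
  return new TextDecoder().decode(bytes);
}

// Fill text from data-obf and link targets from data-obf-href.
// `root` is anything with querySelectorAll, normally `document`.
function reveal(root) {
  root.querySelectorAll('[data-obf]').forEach(el => {
    el.textContent = decodeB64(el.dataset.obf);
  });
  root.querySelectorAll('[data-obf-href]').forEach(el => {
    el.setAttribute('href', decodeB64(el.dataset.obfHref));
  });
}

// Only run automatically when a DOM is available (i.e. in a browser).
if (typeof document !== 'undefined') reveal(document);
```

One caveat with this sketch: setting href at load time puts the plain mailto: URL into the DOM for anything that executes JavaScript, so for links this layer only hides the address from static scrapers.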

Solution: static HTML scrapers see only Base64. The decoded text exists only in the DOM after JS runs.

What it doesn’t protect: copy-paste. Once the script runs and the real text lands in the DOM, selecting and copying hands out the real address.

Second layer: CSS RTL reversal

Problem: a JS-capable scraper — or a simple copy-paste — defeats the first layer on its own.

Implementation: store the text reversed in the DOM, flip it back visually with CSS. Common trick among privacy-focused blogs:

.r {
  direction: rtl;
  unicode-bidi: bidi-override;
}

With unicode-bidi: bidi-override, the browser renders characters strictly right-to-left. So the DOM string rentlA nairdA displays as Adrian Altner. When a visitor copies the text, they get rentlA nairdA — not the real name.

Combining both

The two techniques compose cleanly. data-obf stores a Base64-encoded reversed string. JavaScript decodes it and writes the reversed text into the DOM. CSS renders it visually correct.

<span class="r" data-obf="cmVudGxBIG5haXJkQQ=="></span>

The flow:

Static HTML — empty element, opaque Base64 attribute.
After JS — rentlA nairdA in the DOM.
After CSS — displays as Adrian Altner.
On copy-paste — rentlA nairdA lands in the clipboard.

A static scraper sees only an opaque Base64 attribute. A JS-capable scraper sees reversed text. A human copying gets the reversed string.
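The stages above can be traced in a few lines of Node. This is only a sketch: plain reversal stands in for the CSS step, which is enough for text without brackets, and the trace function is a made-up helper rather than part of the site's code:

```javascript
// Trace one data-obf value through the stages described above.
function trace(b64) {
  const dom = atob(b64);                          // after JS: reversed text
  const displayed = [...dom].reverse().join('');  // after CSS: visual order only
  return { source: b64, dom, displayed, clipboard: dom };
}

console.log(trace('cmVudGxBIG5haXJkQQ=='));
// dom and clipboard: 'rentlA nairdA', displayed: 'Adrian Altner'
```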

Generating the reversed strings

Problem: plain text reverses cleanly, but in email addresses the @ and . are replaced with the placeholders {at} and {dot}, and those placeholders run into another browser behavior: bidi mirroring.

Bracket characters like {, }, (, ) are Unicode bidi-mirrored pairs. In an RTL run the browser swaps their glyphs: { renders as } and vice versa. So to display {at}, the stored string must be {ta}, with the letters reversed but the braces left as they are, because RTL rendering both reverses the character order and mirrors the braces:

Stored: {ta} → reversed char order: }at{ → bidi-mirrored: {at} ✓
Stored: }ta{ → reversed char order: {at} → bidi-mirrored: }at{ ✗

Same applies to {dot}:

Stored: {tod} → reversed: }dot{ → mirrored: {dot} ✓

So to display hey{at}adrian-altner{dot}com, the stored string is:

moc{tod}rentla-nairda{ta}yeh
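Stored strings can be sanity-checked outside the browser with a rough model of what bidi-override does: reverse the character order, then swap mirrored brackets. The sketch below covers only a handful of ASCII bracket pairs (real browsers use the full Unicode mirroring data), and renderRtl is a hypothetical helper name:

```javascript
// Rough model of `direction: rtl; unicode-bidi: bidi-override`:
// characters are laid out in reverse order and mirrored pairs swap glyphs.
// Only a few ASCII bracket pairs are handled here.
const MIRROR = {
  '{': '}', '}': '{',
  '(': ')', ')': '(',
  '[': ']', ']': '[',
  '<': '>', '>': '<',
};

function renderRtl(stored) {
  return [...stored]
    .reverse()
    .map(ch => MIRROR[ch] ?? ch)
    .join('');
}

console.log(renderRtl('{ta}'));  // {at}
console.log(renderRtl('}ta{'));  // }at{
console.log(renderRtl('moc{tod}rentla-nairda{ta}yeh'));
// hey{at}adrian-altner{dot}com
```

This does not replace a check in an actual browser, but it catches a wrongly oriented brace before the value is deployed.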

Implementation: a small Node script generates the values for every address field:

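// Reverse the string, then swap @ and . for placeholders. The braces are
// written unreversed because RTL rendering mirrors them.
const obfuscate = s =>
  [...s].reverse().join('')   // spread keeps surrogate pairs intact
   .replace(/@/g, '{ta}')
   .replace(/\./g, '{tod}');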

The result is Base64-encoded and placed in the data-obf attribute.
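The encoding step isn't shown, so here is a minimal sketch of the whole generator, assuming Node's Buffer and UTF-8 input. The helper names are illustrative. Applying the placeholder substitution only to email fields (so a dot in a postal address is not turned into {dot}) is an assumption on my part, not something the script above does:

```javascript
// Reverse by code point so surrogate pairs stay intact.
const reverse = s => [...s].reverse().join('');

// Email variant: swap @ and . for the placeholders described above.
const obfuscateEmail = s =>
  reverse(s).replace(/@/g, '{ta}').replace(/\./g, '{tod}');

// Base64 over UTF-8 bytes, ready for a data-obf attribute.
const toB64 = s => Buffer.from(s, 'utf8').toString('base64');

const fields = {
  name: toB64(reverse('Adrian Altner')),
  email: toB64(obfuscateEmail('hey@example.com')),
  href: toB64('mailto:hey@example.com'), // not reversed: it is never displayed
};

console.log(fields.name); // cmVudGxBIG5haXJkQQ==
console.log(fields.href); // bWFpbHRvOmhleUBleGFtcGxlLmNvbQ==
```

If the values are encoded from UTF-8 bytes like this, the browser-side decoder has to decode UTF-8 as well, since a bare atob only round-trips ASCII and Latin-1 text.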

What this protects, and what it doesn’t

Threat	Protected
Static HTML scrapers (most spam bots)	✅
Regex email harvesters	✅
Copy-paste by visitors	✅ (reversed text in clipboard)
Headless browsers running JS (Puppeteer)	⚠️ reversed text, no plain email
Someone inspecting the decoded DOM	❌
Manual reading by a human	❌

The last two cases are unavoidable — if the data is readable by a human, a determined human will get it. The goal isn’t perfect protection, it’s raising the bar enough to stop the automated tools responsible for the vast majority of harvested-address spam. For a public imprint on a personal blog, that’s a reasonable trade-off.

What to take away

Don’t publish plain mailto: links in the HTML source. They’re the first thing harvesters look for.
Two cheap layers beat one clever layer. Base64 defeats static scrapers; RTL reversal blunts copy-paste and JS-capable scrapers, which only get reversed text.
Bidi-mirroring is real. Bracket-like placeholders need to be stored mirrored too — test with an actual browser, not just a reverse function.
Base64 stays in the source, plaintext never does. The decoded address only exists in the DOM after the script runs.
Aim for the bar, not perfection. Anything readable by a human is eventually readable by a patient human — optimise for the bots that send the spam.
