Obfuscating Contact Data on a Static Site
How I protect email addresses and personal contact details on a static Astro site from harvesting bots, using a combination of Base64 obfuscation and a CSS RTL trick that also defends against copy-paste.
Category: Development
German law (§ 5 DDG, formerly § 5 TMG) requires a publicly accessible imprint with a real name, postal address, and a working email address. That’s essentially a legal mandate to publish exactly the data spam harvesters are looking for. This post walks through the two-layer approach I settled on to protect that data on a static site without making the page unusable.
The setup
- Site: Astro 6 static build — no server-side rendering, no API for on-demand obfuscation.
- Legal baseline: the imprint page must show a real name, address, and email.
- Threat model: automated harvesters (static scrapers, regex crawlers, JS-capable bots). Not a determined human.
The problem
A static site has no server-side rendering to help. The HTML is delivered as-is, and any email address in the source is one regex away. The most common markup — <a href="mailto:hey@example.com"> — is a harvester’s dream: no JavaScript required, no interaction needed.
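To make "one regex away" concrete, here is a minimal sketch of what a naive harvester might do. The regex and the `harvest` helper are my own illustration, not taken from any real bot; the second markup string is the Base64 variant introduced in the next section.

```javascript
// Hypothetical harvester: one regex over the raw HTML, no JavaScript executed.
const EMAIL_RE = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

// String.prototype.match returns null when nothing matches, hence the fallback.
const harvest = html => html.match(EMAIL_RE) ?? [];

const plainMarkup = '<a href="mailto:hey@example.com">Mail me</a>';
const obfuscatedMarkup = '<a data-obf-href="bWFpbHRvOmhleUBleGFtcGxlLmNvbQ=="></a>';

console.log(harvest(plainMarkup));      // [ 'hey@example.com' ]
console.log(harvest(obfuscatedMarkup)); // []
```

The point is only that the plain link needs no parsing and no script execution to leak the address.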
First layer: Base64 + JavaScript
Problem: plain text in the HTML source is readable without executing anything.
Implementation: move the readable text out of the source. Leave the element empty and carry the content as Base64 in a data-obf attribute:
<span data-obf="QWRyaWFuIEFsdG5lcg=="></span>
A small script decodes and injects at runtime:
document.querySelectorAll('[data-obf]').forEach(el => {
  el.textContent = atob(el.dataset.obf);
});
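One caveat with atob(): it returns one character per byte, so non-ASCII characters (umlauts and ß are common in German names and addresses) come out garbled if the value was encoded as UTF-8. A minimal sketch of a UTF-8-safe round trip follows; the helper names `encodeObf` and `decodeObf` and the sample address are made up for illustration.

```javascript
// Hypothetical helpers, not part of the original script.
// Encode: text -> UTF-8 bytes -> binary string -> Base64.
const encodeObf = text =>
  btoa(String.fromCharCode(...new TextEncoder().encode(text)));

// Decode: Base64 -> binary string -> UTF-8 bytes -> text.
const decodeObf = b64 =>
  new TextDecoder().decode(Uint8Array.from(atob(b64), c => c.charCodeAt(0)));

const sample = 'Musterstraße 1, 12345 Köln'; // made-up address
const encoded = encodeObf(sample);

console.log(decodeObf(encoded) === sample); // true
console.log(atob(encoded) === sample);      // false: plain atob() mangles ß and ö
```

In the browser, the injection line would then read el.textContent = decodeObf(el.dataset.obf). For pure ASCII content, plain atob() is enough.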
For links, data-obf-href carries the full mailto: URL — so the mailto: prefix is never in the HTML source either:
<a data-obf-href="bWFpbHRvOmhleUBleGFtcGxlLmNvbQ=="></a>
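The decoding script shown earlier only fills textContent, so data-obf-href needs its own pass. Here is a sketch of how that could look; the function name `revealLinks` and its `root` parameter are my own additions for illustration.

```javascript
// Hypothetical helper: decode data-obf-href into a real href at runtime.
// `root` is anything with querySelectorAll (the document, in a browser).
function revealLinks(root) {
  root.querySelectorAll('[data-obf-href]').forEach(link => {
    // dataset.obfHref is the camelCased form of the data-obf-href attribute.
    link.setAttribute('href', atob(link.dataset.obfHref));
  });
}

// Run only where a DOM exists; the guard keeps the file loadable elsewhere.
if (typeof document !== 'undefined') {
  revealLinks(document);
}
```

Keep in mind that once href is set, the plain mailto: URL sits in the DOM, where a JS-capable scraper can read it. Whether that is acceptable depends on the threat model.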
Solution: static HTML scrapers see only Base64. The decoded text exists only in the DOM after JS runs.
What it doesn’t protect: copy-paste. Once the script runs and the real text lands in the DOM, selecting and copying hands out the real address.
Second layer: CSS RTL reversal
Problem: a JS-capable scraper — or a simple copy-paste — defeats the first layer on its own.
Implementation: store the text reversed in the DOM and flip it back visually with CSS. It's a common trick among privacy-focused blogs:
.r {
  direction: rtl;
  unicode-bidi: bidi-override;
}
With unicode-bidi: bidi-override, the browser renders characters strictly right-to-left. So the DOM string rentlA nairdA displays as Adrian Altner. When a visitor copies the text, they get rentlA nairdA — not the real name.
Combining both
The two techniques compose cleanly. data-obf stores a Base64-encoded reversed string. JavaScript decodes it and writes the reversed text into the DOM. CSS renders it visually correct.
<span class="r" data-obf="cmVudGxBIG5haXJkQQ=="></span>
The flow:
- Static HTML: empty element, opaque Base64 attribute.
- After JS: rentlA nairdA in the DOM.
- After CSS: displays as Adrian Altner.
- On copy-paste: rentlA nairdA lands in the clipboard.
A static scraper sees only opaque Base64. A JS-capable scraper sees reversed text. A human copying gets the reversed string.
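The stages can be traced outside the browser with a few lines of JavaScript. This is a rough model: the reversal stands in for what bidi-override does visually, and the clipboard line simply restates the behaviour described above (copying yields DOM order, not visual order).

```javascript
// Trace of the stages for the sample attribute value.
const attr = 'cmVudGxBIG5haXJkQQ==';  // what a static scraper reads

const domText = atob(attr);           // what a JS-capable scraper reads
// Rough model of bidi-override rendering: characters shown in reverse order.
const displayed = [...domText].reverse().join('');
const clipboard = domText;            // copy takes DOM order, not visual order

console.log(domText);   // rentlA nairdA
console.log(displayed); // Adrian Altner
console.log(clipboard); // rentlA nairdA
```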
Generating the reversed strings
Problem: plain text reverses cleanly, but email addresses carry @ and . that need placeholder substitutions to survive legibly. The placeholders run into another browser behavior: bidi mirroring.
Bracket characters such as {, }, (, and ) are Unicode bidi-mirrored characters. In an RTL context the browser swaps their glyphs: { is drawn as } and vice versa. So to display {at}, the stored string must be {ta}. Only the letters are reversed and the braces stay as written, because RTL rendering both reverses their positions and mirrors their glyphs:
- Stored: {ta} → reversed char order: }at{ → bidi-mirrored: {at} ✓
- Stored: }ta{ → reversed char order: {at} → bidi-mirrored: }at{ ✗
Same applies to {dot}:
- Stored: {tod} → reversed: }dot{ → mirrored: {dot} ✓
So to display hey{at}adrian-altner{dot}com, the stored string is:
moc{tod}rentla-nairda{ta}yeh
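A quick way to sanity-check stored strings before opening a browser is to model the rendering in code. The sketch below is a simplification I'm assuming for illustration: it reverses the characters and swaps a handful of ASCII bracket pairs, whereas a real browser follows the full Unicode bidi data. The name `renderRtl` is hypothetical.

```javascript
// Simplified model of `direction: rtl; unicode-bidi: bidi-override`:
// reverse the character order, then swap each mirrored bracket for its partner.
const MIRROR = {
  '{': '}', '}': '{',
  '(': ')', ')': '(',
  '[': ']', ']': '[',
  '<': '>', '>': '<',
};

const renderRtl = stored =>
  [...stored].reverse().map(ch => MIRROR[ch] ?? ch).join('');

console.log(renderRtl('{ta}')); // {at}
console.log(renderRtl('}ta{')); // }at{
console.log(renderRtl('moc{tod}rentla-nairda{ta}yeh')); // hey{at}adrian-altner{dot}com
```

This does not replace a check in an actual browser, but it catches the most common mistake, which is reversing the braces along with the letters.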
Implementation: a small Node script generates the values for every address field:
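// Reverse by code point, then swap @ and . for placeholders. The braces are
// written unreversed so that bidi mirroring renders them correctly.
const obfuscate = s =>
  [...s].reverse().join('')
    .replace(/@/g, '{ta}')
    .replace(/\./g, '{tod}');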
The result is Base64-encoded and placed in the data-obf attribute.
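For completeness, here is a sketch of the whole build-time step in Node, including the Base64 encoding. The wrapper name `toObfAttr` is hypothetical, and using Buffer with UTF-8 is my assumption about how the encoding might be done.

```javascript
// Hypothetical build-time helper (Node): reverse, swap in placeholders, Base64-encode.
const obfuscate = s =>
  [...s].reverse().join('')
    .replace(/@/g, '{ta}')
    .replace(/\./g, '{tod}');

// Buffer encodes the string as UTF-8 bytes before converting to Base64.
const toObfAttr = s => Buffer.from(obfuscate(s), 'utf8').toString('base64');

const attr = toObfAttr('hey@adrian-altner.com');
console.log(Buffer.from(attr, 'base64').toString('utf8'));
// moc{tod}rentla-nairda{ta}yeh
console.log(toObfAttr('Adrian Altner')); // cmVudGxBIG5haXJkQQ==
```

If a field contains non-ASCII characters, the browser-side decoder has to decode UTF-8 as well; plain atob() only suffices for ASCII content.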
What this protects, and what it doesn’t
| Threat | Protected |
|---|---|
| Static HTML scrapers (most spam bots) | ✅ |
| Regex email harvesters | ✅ |
| Copy-paste by visitors | ✅ (reversed text in clipboard) |
| Headless browsers running JS (Puppeteer) | ⚠️ reversed text, no plain email |
| Someone inspecting the decoded DOM | ❌ |
| Manual reading by a human | ❌ |
The last two cases are unavoidable — if the data is readable by a human, a determined human will get it. The goal isn’t perfect protection, it’s raising the bar enough to stop the automated tools responsible for the vast majority of harvested-address spam. For a public imprint on a personal blog, that’s a reasonable trade-off.
What to take away
- Don’t publish mailto: links in HTML. They’re the first thing harvesters look for.
- Two cheap layers beat one clever layer. Base64 defeats static scrapers; RTL reversal defeats copy-paste and JS-capable scrapers.
- Bidi-mirroring is real. Bracket-like placeholders must be stored with their braces mirrored relative to a naive reversal. Test in an actual browser, not just with a reverse function.
- Base64 stays in the source, plaintext never does. The decoded address only exists in the DOM after the script runs.
- Aim for the bar, not perfection. Anything readable by a human is eventually readable by a patient human — optimise for the bots that send the spam.