
How to Sanitize PII Before Sending Text to ChatGPT or Claude

📅 April 8, 2026 • ⏱️ 7 min read • ✍️ By FunWithText Team

TL;DR: Before pasting a document, email thread, support ticket, or log file into an AI chatbot, remove personal data first. This post walks you through what counts as PII, why it matters (GDPR, breach risk, training data), and a concrete five-step workflow you can do in under a minute using our free PII Sanitizer.

Why bother sanitizing?

"I'll just paste the whole email and ask ChatGPT to summarise it" is how many accidental disclosures start. Three concrete reasons to sanitize first:

Compliance. Under the GDPR, sending personal data to a third-party processor without a lawful basis (Article 6) and a data processing agreement (Article 28) is an infringement. Many individuals using ChatGPT at work have neither in place.
Leakage risk. Provider terms evolve. The ChatGPT/Claude/Gemini account you use today may not have the same training opt-out defaults tomorrow. Data you paste in may be retained, logged, or accessible to support staff under various conditions.
Attack surface. If your conversation history is ever exposed — phishing, session hijack, or a provider incident — the blast radius is proportional to the personal data you typed in.

⚠️ A realistic framing

You usually don't need names, addresses, or account numbers for the AI to do its job. If you're asking it to rewrite an email, the phrase "[client name] is unhappy with the proposal" works as well as "John Hartmann at Acme Corp in Oslo is unhappy with the proposal." Strip what you can.

What actually counts as PII?

Under the GDPR, "personal data" is anything relating to an identified or identifiable person. That's broader than most people think. A partial list:

Category	Examples	Strip?
Identifiers	Full name, initials + context, user ID	Yes
Contact	Email, phone, postal address	Yes
Financial	IBAN, card number, account balance	Yes
Government	SSN, national ID, passport, VAT number (individuals)	Yes
Technical	IP address, device ID, MAC, precise GPS	Yes
Health	Diagnoses, medications, medical record numbers	Yes (special category)
Employment	Salary, performance rating tied to a person	Yes
Quasi-identifiers	Date of birth + postcode, rare job title at a small company	Yes — combinations re-identify
Content about third parties	Client names in a meeting note, CV text you received	Yes — it isn't yours to share
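Some of these categories have a predictable shape, which makes them easy to flag automatically. The following is a minimal sketch, not how our PII Sanitizer works internally: the patterns and the `find_pii` helper are illustrative assumptions, and they only cover emails, IPv4 addresses, and IBAN-like strings. Names, health details, and quasi-identifiers have no fixed format, so they still need a human review.

```python
import re

# Illustrative patterns only: they catch common formats, not every variant.
PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "IPV4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    # Country code, two check digits, then groups of four (optionally spaced).
    "IBAN": re.compile(r"\b[A-Z]{2}\d{2}(?: ?[A-Z0-9]{4}){2,7}(?: ?[A-Z0-9]{1,3})?\b"),
}


def find_pii(text):
    """Return (category, matched_text) pairs in order of appearance."""
    hits = []
    for category, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((match.start(), category, match.group()))
    return [(category, value) for _, category, value in sorted(hits)]


if __name__ == "__main__":
    sample = "Contact john@acme.com from 192.168.1.10, IBAN NO93 8601 1117 947"
    for category, value in find_pii(sample):
        print(category, value)
```

A pattern match is a hint rather than proof: the IPv4 pattern, for example, will also flag version strings that happen to look like an address.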

Mask, redact, or tokenize — which should you use?

Three common approaches, each with trade-offs:

Redact (remove)

Delete the value outright, or swap it for a generic marker such as [REDACTED] that carries no type information. Safest, and fine when the AI doesn't need to reason about the identifier.

"Send the contract to john@acme.com by Friday"
  → "Send the contract to [REDACTED] by Friday"
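For a single well-structured type such as email addresses, redaction can be sketched in a few lines. This is an illustration under assumptions: the pattern is deliberately simple and `redact_emails` is a hypothetical helper name, not part of any tool mentioned here.

```python
import re

# Simple email pattern; real-world addresses can be more exotic than this.
EMAIL = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")


def redact_emails(text, marker="[REDACTED]"):
    """Replace every email address with the same generic marker."""
    return EMAIL.sub(marker, text)


if __name__ == "__main__":
    print(redact_emails("Send the contract to john@acme.com by Friday"))
```

Because every value collapses to the same marker, the output cannot be mapped back, which is the point of redaction.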

Mask (placeholder)

Replace with a type-specific placeholder. Preserves structure so the AI can still produce a fluent draft — ideal for rewriting emails.

"Dear John, thanks for your email at john@acme.com"
  → "Dear [NAME_1], thanks for your email at [EMAIL_1]"
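Masking needs a little more bookkeeping than redaction, because the same value should get the same placeholder each time it appears. Here is one possible sketch. It assumes you supply the names yourself through a hypothetical `known_names` argument, since names cannot be found reliably with a pattern; the `mask` function is illustrative only.

```python
import re

EMAIL = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")


def mask(text, known_names=()):
    """Replace emails and caller-supplied names with numbered placeholders.

    Returns the masked text and a placeholder -> original value mapping.
    """
    mapping = {}   # "[EMAIL_1]" -> "john@acme.com"
    seen = {}      # (kind, value) -> placeholder, so repeats reuse one number
    counters = {}  # kind -> last number handed out

    def placeholder(kind, value):
        key = (kind, value)
        if key not in seen:
            counters[kind] = counters.get(kind, 0) + 1
            seen[key] = f"[{kind}_{counters[kind]}]"
            mapping[seen[key]] = value
        return seen[key]

    # Emails go first so a name inside an address is not masked separately.
    text = EMAIL.sub(lambda m: placeholder("EMAIL", m.group()), text)
    for name in known_names:
        pattern = rf"\b{re.escape(name)}\b"
        text = re.sub(pattern, lambda m: placeholder("NAME", m.group()), text)
    return text, mapping


if __name__ == "__main__":
    masked, mapping = mask(
        "Dear John, thanks for your email at john@acme.com",
        known_names=["John"],
    )
    print(masked)
    print(mapping)
```

Keep the returned mapping on your own machine: it is the only thing that links the placeholders back to real people.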

Tokenize (reversible)

Replace with a unique token you can map back later. Useful when the output will be post-processed and re-populated with the real values.

"John Hartmann at Acme Corp"
  → "PERSON_#a4f1 at ORG_#b92c"

After the AI reply, swap the tokens back locally.
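A round trip might look like the sketch below. It assumes you already know which values to replace and pass them in as a dictionary; the `tokenize` and `detokenize` helpers and the `vault` name are hypothetical, and the token format simply mimics the example above.

```python
import secrets


def tokenize(text, entities):
    """Swap each known value for a random token.

    entities maps original value -> type label, e.g. {"Acme Corp": "ORG"}.
    Returns the tokenized text and a token -> original value vault.
    """
    vault = {}
    # Longest values first so "John Hartmann" is replaced before "John".
    for value in sorted(entities, key=len, reverse=True):
        token = f"{entities[value]}_#{secrets.token_hex(2)}"
        while token in vault:  # extremely unlikely, but keep tokens unique
            token = f"{entities[value]}_#{secrets.token_hex(2)}"
        vault[token] = value
        text = text.replace(value, token)
    return text, vault


def detokenize(text, vault):
    """Put the original values back. Run this locally, never in the chat."""
    for token, value in vault.items():
        text = text.replace(token, value)
    return text


if __name__ == "__main__":
    original = "John Hartmann at Acme Corp"
    safe, vault = tokenize(original, {"John Hartmann": "PERSON", "Acme Corp": "ORG"})
    print(safe)
    print(detokenize(safe, vault) == original)
```

Random tokens reveal nothing about the value they replace, unlike a plain hash of the name, which could be guessed by hashing candidate names. Four hex characters are enough for one document but not for a large corpus.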

The PII Sanitizer on this site supports masking out of the box and runs entirely in your browser — nothing you paste into it leaves your device.

A 5-step workflow you can run in under a minute

Paste the raw text into the PII Sanitizer

Open the PII Sanitizer in a separate tab. Paste your draft email, support ticket, or document. The tool highlights detected personal data (names, emails, phones, IDs, addresses) locally — it never uploads your text.

Review the highlights

Detectors aren't perfect. Quickly scan the unhighlighted parts of the text for anything the detector missed — unusual name formats, project codenames that identify a client, or freeform addresses.

Pick a strategy and sanitize

Choose masking (for rewrites) or redaction (for summaries/classification). Apply the transformation and copy the sanitized output.

Send to ChatGPT / Claude / Gemini

Paste the sanitized text into your AI tool. Tell the model explicitly how to handle placeholders:

"The following is an anonymised draft. Placeholders
like [NAME_1], [EMAIL_1], and [ORG_1] represent
real values I've redacted locally. Keep the
placeholders in the output — I'll re-insert the
real values afterwards."
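If you send sanitized text often, you could generate that instruction from the placeholders that are actually present. A small sketch, assuming placeholders follow the `[TYPE_N]` shape used in this post; `build_prompt` is a hypothetical helper, not a feature of any AI provider.

```python
import re

# Matches placeholders such as [NAME_1] or [EMAIL_12].
PLACEHOLDER = re.compile(r"\[[A-Z]+_\d+\]")


def build_prompt(task, sanitized_text):
    """Prefix the sanitized text with the task and a placeholder notice."""
    found = sorted(set(PLACEHOLDER.findall(sanitized_text)))
    listing = ", ".join(found) if found else "none"
    return (
        f"{task}\n\n"
        "The following is an anonymised draft. Placeholders "
        f"({listing}) represent real values I've redacted locally. "
        "Keep the placeholders in the output exactly as written.\n\n"
        f"---\n{sanitized_text}\n---"
    )


if __name__ == "__main__":
    print(build_prompt("Rewrite this politely.", "Dear [NAME_1], we got [EMAIL_1]."))
```

Listing the exact placeholders gives the model less room to rename or drop them.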

Re-insert real values locally

Copy the AI's response back, replace the placeholders with the original values, and send. You kept the useful reasoning of the model without sharing the personal data.
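Re-insertion is worth a quick sanity check, because models sometimes drop a placeholder or invent a new one. The sketch below assumes you kept a placeholder-to-value mapping from the sanitizing step; `restore` and its return shape are illustrative assumptions.

```python
import re

PLACEHOLDER = re.compile(r"\[[A-Z]+_\d+\]")


def restore(reply, mapping):
    """Swap placeholders back and report ones the model dropped or invented.

    mapping is placeholder -> original value, e.g. {"[NAME_1]": "John"}.
    Returns (restored_text, missing, unknown).
    """
    present = set(PLACEHOLDER.findall(reply))
    missing = sorted(set(mapping) - present)   # in your mapping, not in the reply
    unknown = sorted(present - set(mapping))   # in the reply, not in your mapping
    # One pass over the reply, so restored values are never re-scanned.
    restored = PLACEHOLDER.sub(lambda m: mapping.get(m.group(), m.group()), reply)
    return restored, missing, unknown


if __name__ == "__main__":
    mapping = {"[NAME_1]": "John", "[EMAIL_1]": "john@acme.com"}
    text, missing, unknown = restore("Hi [NAME_1], see [ORG_1].", mapping)
    print(text)
    print("missing:", missing, "unknown:", unknown)
```

A non-empty `missing` or `unknown` list is a cue to reread the reply before sending it on.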

Common mistakes to avoid

🚩 Sanitizing only the "obvious" fields. A name can hide in a signature block, a quoted calendar invite, or a forwarded email chain. Review the whole text.
🚩 Trusting free-form context. "The VP of engineering at a 40-person Oslo fintech" uniquely identifies a small number of people.
🚩 Leaving log files untouched. IP addresses and user IDs in logs are personal data.
🚩 Pasting CVs and candidate notes. That data isn't yours — GDPR treats the candidate as the data subject.
🚩 Forgetting metadata. Document titles, file names, and email subjects often contain names or client identifiers.
🚩 Re-uploading the real version "to check the output." Once you have the sanitized answer, don't send the raw one "just to compare": doing so leaks everything you just protected.
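On the log-file point above, a short script can do most of the work before you paste anything. This is a sketch under assumptions: it presumes log lines carry IPv4 addresses and a `user_id=` field, which is a hypothetical format, and `scrub_log_line` is an illustrative helper. It zeroes the last octet of each address and replaces user IDs with a keyed hash, so the same user still lines up across log lines.

```python
import hashlib
import hmac
import re

IPV4 = re.compile(r"\b(\d{1,3}\.\d{1,3}\.\d{1,3})\.\d{1,3}\b")
USER_ID = re.compile(r"\b(user_id=)(\w+)")  # assumed field name


def scrub_log_line(line, secret):
    """Coarsen IPv4 addresses and pseudonymise user IDs in one log line.

    secret is a bytes key that stays on your machine.
    """
    line = IPV4.sub(r"\1.0", line)  # keep the network part, drop the host part

    def pseudonym(match):
        digest = hmac.new(secret, match.group(2).encode(), hashlib.sha256)
        return f"{match.group(1)}u_{digest.hexdigest()[:10]}"

    return USER_ID.sub(pseudonym, line)


if __name__ == "__main__":
    print(scrub_log_line("GET /cart 203.0.113.42 user_id=8841", b"local-secret"))
```

The result is pseudonymised rather than anonymous: anyone holding the key can recompute the hash for a known ID, so treat the key like any other secret.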

For teams and consultants

If you're doing this more than occasionally:

Write a one-page AI usage policy. What's allowed, what isn't, and which tools employees should run inputs through before pasting.
Use the enterprise tier of your AI provider (ChatGPT Enterprise, Claude for Work, Gemini Enterprise) where training on your inputs is disabled by default and you can sign a DPA.
Keep a short incident log. If personal data does leak into an AI chat, log it: what, when, whose data, steps taken. You'll need this for GDPR Article 33 assessments.
Default to anonymise-first. Make sanitizing the normal step, not the exception.
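The incident log mentioned above can be as simple as one JSON line per event. A minimal sketch with the four items from the list (what, when, whose data, steps taken) plus the tool involved; the `AIIncident` class and its field names are assumptions, not a prescribed GDPR format. Describe the data rather than copying it, so the log does not become a second leak.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class AIIncident:
    what: str          # categories of data involved, not the data itself
    whose_data: str    # a description of the data subjects
    steps_taken: list  # e.g. ["Deleted the conversation"]
    tool: str          # which AI tool received the text
    when: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def to_log_line(incident):
    """Serialise one incident as a single JSON line for an append-only file."""
    return json.dumps(asdict(incident), sort_keys=True)


if __name__ == "__main__":
    incident = AIIncident(
        what="Customer email address and order number",
        whose_data="One retail customer",
        steps_taken=["Deleted the conversation", "Informed the privacy lead"],
        tool="ChatGPT",
    )
    print(to_log_line(incident))
```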

🛡️ The one-line version

Before pasting, ask yourself: "Would I be comfortable if this conversation showed up in a future ChatGPT breach?" If the answer is no, sanitize first.

Conclusion

Sanitizing PII before sending text to an AI chatbot is one of the highest-leverage privacy habits you can build in 2026. It takes under a minute with the right tool, limits what a future breach can expose, and helps keep you on the right side of GDPR.

About FunWithText

We build free, privacy-focused text tools. The PII Sanitizer, Paste Detector, and password generator all run entirely in your browser — nothing you type into them leaves your device.
