What Is PDF Metadata?
Metadata is data about data. In a PDF, it is information that describes the document rather than forming part of its visible content. It is stored in the file's document information dictionary and, in most modern PDFs, in an embedded XMP (XML) packet, not in the file header. You don't see it when you read the PDF, but anyone who knows where to look can read it in seconds using free tools, including potential employers, clients, legal opponents, and journalists.
Standard PDF metadata fields include the document's Title, Author (usually your name or your account username), Creator (the application that created the source document, such as Word, Photoshop, or InDesign), Producer (the software that generated the PDF), Creation Date, and Modification Date. Many PDFs have most of these fields populated automatically without the author ever seeing them.
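To make the fields above concrete, here is a minimal sketch of how they can be read straight out of a file's bytes using only the Python standard library. The function names (`read_info_fields`, `decode_pdf_string`) are illustrative, not part of any library. It assumes an unencrypted PDF whose information dictionary is stored uncompressed and whose values are simple literal strings; a dedicated PDF library handles far more cases.

```python
import re

# Info-dictionary key names for the fields listed above.
INFO_KEYS = ("Title", "Author", "Creator", "Producer", "CreationDate", "ModDate")


def decode_pdf_string(raw: bytes) -> str:
    """Decode a PDF literal string (simplified: octal escapes are not handled)."""
    raw = re.sub(rb"\\([()\\])", rb"\1", raw)  # unescape \( \) and \\
    if raw.startswith(b"\xfe\xff"):            # UTF-16BE text with a byte-order mark
        return raw[2:].decode("utf-16-be", errors="replace")
    return raw.decode("latin-1")               # rough stand-in for PDFDocEncoding


def read_info_fields(pdf_bytes: bytes) -> dict:
    """Return the Info-dictionary fields found in an unencrypted PDF."""
    # The trailer points at the Info dictionary with "/Info <num> <gen> R".
    refs = re.findall(rb"/Info\s+(\d+)\s+(\d+)\s+R", pdf_bytes)
    if not refs:
        return {}
    num, gen = refs[-1]  # after incremental saves, the last trailer is the current one
    obj_pattern = rb"(?<!\d)" + num + rb"\s+" + gen + rb"\s+obj\s*<<(.*?)>>\s*endobj"
    bodies = re.findall(obj_pattern, pdf_bytes, re.DOTALL)
    if not bodies:
        return {}  # e.g. the dictionary sits inside a compressed object stream
    body = bodies[-1]
    fields = {}
    for key in INFO_KEYS:
        value_pattern = rb"/" + key.encode("ascii") + rb"\s*\(((?:\\.|[^\\()])*)\)"
        value = re.search(value_pattern, body, re.DOTALL)
        if value:
            fields[key] = decode_pdf_string(value.group(1))
    return fields
```

Calling `read_info_fields(open("report.pdf", "rb").read())` on a file of your own (the filename here is a placeholder) returns a dictionary of whichever fields are present. An empty result does not prove the file is clean, only that this simple scan found nothing.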
A journalist investigating a leaked government document can check the Author field to see who created it. A lawyer reviewing a contract sent by the opposing party can check the Modification Date to see if it was edited after a deadline. A client can see that you created a "new" proposal using a template originally made for a different client, because the original Author and Company fields are still present. These are not hypothetical — these situations occur regularly.
Types of Hidden Data in PDFs
Beyond the standard metadata fields, PDFs can contain several other types of hidden information:
- Author and company name: Pulled from Microsoft Office or system account settings at the time of creation. These can reveal your full name, username, or organisation even if it doesn't appear anywhere in the document itself.
- Revision history: Some PDF tools, Adobe's among them, save edits as incremental updates, which append the new version to the end of the file and leave the earlier objects in place. Comments or text you added and then deleted may therefore still be in the file.
- Embedded thumbnails: Some PDF creation tools embed a small preview image of a page, often the first. The preview is generated when the file is saved; if the document has changed since and the thumbnail was not regenerated, it may show an older version of page 1.
- GPS and location data: PDFs created from photos (particularly from mobile phone cameras) may carry EXIF data, including the GPS coordinates of where the photo was taken, if the conversion tool embeds the original image unchanged.
- Form field data: If a PDF form was filled in and the data was subsequently cleared, some PDF editors leave the data in the file structure even though it's no longer visible on screen.
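Several of the items in this list leave recognisable markers in the raw file. The sketch below is a rough heuristic, not a full parser: the function name `hidden_data_indicators` and the choice of markers are assumptions made for illustration, and markers inside compressed object streams will be missed.

```python
def hidden_data_indicators(pdf_bytes: bytes) -> dict:
    """Scan raw PDF bytes for markers that often accompany hidden data."""
    return {
        # Each incremental save appends a section ending in %%EOF, so a count
        # above 1 suggests earlier revisions may still be in the file.
        # Linearised ("fast web view") files can also contain two markers.
        "eof_markers": pdf_bytes.count(b"%%EOF"),
        # /Thumb is the page-dictionary key for an embedded thumbnail image.
        "has_thumbnail": b"/Thumb" in pdf_bytes,
        # /AcroForm marks an interactive form, which may hold field values.
        "has_form": b"/AcroForm" in pdf_bytes,
        # /Annots marks annotations such as comments and sticky notes.
        "has_annotations": b"/Annots" in pdf_bytes,
        # XMP metadata is wrapped in an <?xpacket ...?> processing instruction.
        "has_xmp_packet": b"<?xpacket" in pdf_bytes,
    }
```

A result such as `eof_markers` greater than 1 is a prompt to look closer, not proof of a leak; likewise, all-negative results only mean this scan saw nothing in the uncompressed parts of the file.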
How to Check What Metadata Your PDF Contains
To inspect a PDF's metadata in Adobe Acrobat Reader (free): open the file, go to File → Properties, and check the Description and Custom tabs. In Chrome's built-in PDF viewer, open the PDF and choose Document Properties from the viewer's menu to see basic properties.
For more detailed inspection, command-line tools such as pdfinfo (Linux/Mac) or online metadata viewers show more than Adobe Reader's interface displays by default. By default pdfinfo prints the document information fields; run it with the -meta option to print the XMP metadata packet.
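The XMP packet mentioned above is plain XML, so it can often be pulled out and parsed with the standard library. This is a sketch under assumptions: the packet is stored uncompressed, it uses the common `x:xmpmeta` wrapper, and only the handful of properties in the hypothetical `WANTED` table are of interest. The function name `read_xmp_fields` is illustrative.

```python
import re
import xml.etree.ElementTree as ET

DC = "{http://purl.org/dc/elements/1.1/}"
XMP = "{http://ns.adobe.com/xap/1.0/}"
PDF = "{http://ns.adobe.com/pdf/1.3/}"
RDF_LI = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}li"

# Namespaced property name -> label used in the result.
WANTED = {
    DC + "creator": "dc:creator",
    DC + "title": "dc:title",
    XMP + "CreatorTool": "xmp:CreatorTool",
    XMP + "CreateDate": "xmp:CreateDate",
    XMP + "ModifyDate": "xmp:ModifyDate",
    PDF + "Producer": "pdf:Producer",
}


def read_xmp_fields(pdf_bytes: bytes) -> dict:
    """Extract selected XMP properties from the first XMP packet found."""
    match = re.search(rb"<x:xmpmeta.*?</x:xmpmeta>", pdf_bytes, re.DOTALL)
    if not match:
        return {}  # no packet, or it is compressed or uses a different wrapper
    root = ET.fromstring(match.group(0))
    fields = {}
    for elem in root.iter():
        # Simple properties may be written as attributes on rdf:Description.
        for name, value in elem.attrib.items():
            if name in WANTED:
                fields[WANTED[name]] = value
        # Or as child elements, sometimes holding a list of rdf:li items.
        if elem.tag in WANTED:
            items = [li.text or "" for li in elem.iter(RDF_LI)]
            text = "; ".join(items) if items else (elem.text or "").strip()
            if text:
                fields[WANTED[elem.tag]] = text
    return fields
```

Comparing this output with the document information fields is worthwhile, because some tools update one set of metadata and leave the other untouched.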
Open any PDF you've recently created or edited. Check the Author field. If it contains your personal name, your work username, or your company name — and this is a document you're planning to share externally — consider whether that information is appropriate for the recipient to have.
How to Remove Metadata from a PDF Before Sharing
The most reliable methods for stripping metadata:
- Print to PDF: Open the PDF and print it using your operating system's "Print to PDF" feature (available on Windows, Mac, and most mobile systems). This creates a fresh PDF from the rendered pages and leaves the original metadata behind. The new file gets its own Producer and date fields, and some systems fill in a title or author from the print job or your account, so check the result. Simple and usually effective.
- Export → Optimise: In Adobe Acrobat Pro, use File → Save as Other → Optimised PDF and, under Discard User Data, tick the option to discard document information and metadata. For a broader clean-up, Acrobat's Remove Hidden Information and Sanitise tools also strip hidden layers, embedded content, comments, and revision history. Menu names vary between Acrobat versions.
- Re-compress the PDF: Running a PDF through Rifix Compress PDF re-renders the pages as images, which strips the original text layer and metadata. The result is clean of personal metadata, though the text is no longer selectable, so it suits documents you're sharing for viewing only, like a portfolio or a brochure.
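Whichever method you use, it is worth confirming that the cleaned file no longer contains the strings you wanted gone. The following is a minimal sketch of such a check; `find_leaks` is a hypothetical helper name, the search is case-sensitive, and it only looks at the raw bytes plus any streams that decompress with zlib (the usual FlateDecode filter). Hex-encoded strings, encrypted files, and text inside images are not covered.

```python
import re
import zlib


def find_leaks(pdf_bytes: bytes, terms: list) -> list:
    """Return the terms that still appear somewhere in the PDF bytes."""
    # Search the raw file and every stream that zlib can decompress.
    haystacks = [pdf_bytes]
    for match in re.finditer(rb"stream\r?\n(.*?)endstream", pdf_bytes, re.DOTALL):
        try:
            haystacks.append(zlib.decompressobj().decompress(match.group(1)))
        except zlib.error:
            pass  # not a zlib stream (image data, other filters, and so on)
    leaks = []
    for term in terms:
        # PDF text strings are commonly single-byte or UTF-16BE encoded.
        needles = [term.encode("latin-1", errors="ignore"), term.encode("utf-16-be")]
        if any(n and n in h for n in needles for h in haystacks):
            leaks.append(term)
    return leaks
```

A typical use is to pass your name, username, and organisation as `terms` and expect an empty list back. A non-empty list tells you which term survived the cleaning step.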
What Data Is NOT in a PDF (Common Misconceptions)
Your browsing history is not in the PDF. Your IP address is not embedded in a PDF you create (though it may be logged if you upload to a service). Text that was deleted before the PDF was exported is generally not recoverable from the PDF itself; it would only be in the source document's undo history. And a scanned PDF usually contains little personal metadata, typically just the scanner software's name and a timestamp, unless the software is configured to add more.
The risk is real, but it's targeted and specific. Checking and cleaning your PDFs before sharing high-stakes documents takes about 30 seconds and is a habit worth building.
What Is PDF Metadata?
Almost every PDF file contains metadata: information about the document stored separately from the visible content. Standard PDF metadata fields include: Author (the name of the person who created the document), Title (the document title, which may differ from the filename), Subject, Keywords, Creator (the application that created the document, such as "Microsoft Word 16.0"), Producer (the PDF library or tool that generated the PDF), and creation and modification dates. This metadata is invisible to casual readers viewing the document, but it is easily accessible to anyone who opens the document properties in a PDF reader or examines the file with a metadata tool.
Why Metadata Creates Privacy Risks
The privacy risks from PDF metadata are real and well-documented. The Author field usually holds the name of the person who created the document, which may be a personal name, a username, or an employee ID. This can reveal the identity of a whistleblower, the name of an employee who prepared a company document, or the personal name of someone who intended to share a document anonymously. The Creator field reveals what software was used (Microsoft Word 16.81, LibreOffice 7.5, Adobe InDesign 18), which tells a reader which applications and versions an organisation runs. Modification dates reveal when a document was last changed, which can contradict stated dates in legal documents.
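The point about modification dates can be checked mechanically. PDF stores dates as strings of the form `D:YYYYMMDDHHmmSS` followed by an optional time zone (`Z`, or an offset such as `+01'00'`). The sketch below parses that format with the standard library; `parse_pdf_date` is an illustrative name, and treating a missing time zone as UTC is an assumption made here for simplicity.

```python
import re
from datetime import datetime, timedelta, timezone

_PDF_DATE = re.compile(
    r"D:(\d{4})(\d{2})?(\d{2})?(\d{2})?(\d{2})?(\d{2})?"  # date and time parts
    r"(?:(Z)|([+-])(\d{2})'?(\d{2})?'?)?"                 # optional time zone
)


def parse_pdf_date(value: str) -> datetime:
    """Parse a PDF date string such as D:20240612171500+01'00'."""
    match = _PDF_DATE.match(value)
    if not match:
        raise ValueError(f"not a PDF date string: {value!r}")
    year = int(match.group(1))
    # Everything after the year is optional; fall back to the earliest value.
    month = int(match.group(2) or 1)
    day = int(match.group(3) or 1)
    hour = int(match.group(4) or 0)
    minute = int(match.group(5) or 0)
    second = int(match.group(6) or 0)
    tz = timezone.utc  # assumption: "Z" or no zone at all is treated as UTC
    if match.group(8):
        offset = timedelta(hours=int(match.group(9)), minutes=int(match.group(10) or 0))
        tz = timezone(-offset if match.group(8) == "-" else offset)
    return datetime(year, month, day, hour, minute, second, tzinfo=tz)


def days_between(creation_date: str, mod_date: str) -> int:
    """Whole days between the CreationDate and ModDate values."""
    return (parse_pdf_date(mod_date) - parse_pdf_date(creation_date)).days
```

For example, `days_between("D:20240105093000Z", "D:20240612171500+01'00'")` gives the gap between the two timestamps in days. A large gap on a document presented as unchanged is exactly the kind of contradiction described above.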
High-Profile Metadata Incidents
Metadata exposure has caused significant real-world consequences. In 2003 the UK government published a dossier on Iraq as a Word document with its revision log intact; a researcher's analysis of that metadata identified the people who had worked on the file, undermining official claims about its provenance. Law firms have inadvertently revealed client information through document metadata in disclosed PDFs during litigation. Journalists have been identified through metadata left in documents they intended to share anonymously. These are not technical edge cases: they represent a common oversight by people who understood the content of the document but were unaware of the metadata layer.
Checking Your PDF Metadata
In Adobe Reader: File → Properties → Description tab. In Preview on Mac: Tools → Show Inspector. In Chrome: open the PDF and choose Document Properties from the viewer's menu. You can also check metadata programmatically using free tools like ExifTool. Look at the Author, Title, Creator, and modification date fields. If these contain information you would not want associated with the document (your real name when sharing anonymously, your organisation's internal naming conventions, or timestamps that reveal when the document was actually written), clean the metadata before sharing.
Removing Metadata from PDFs at rifix.xyz
The privacy tools at rifix.xyz/edit include the ability to strip PDF metadata. Upload your document, use the metadata cleaning option to remove or replace the Author, Title, Subject, and Keywords fields, and download a clean version. The document content (text, images, layout) is unchanged; only the hidden metadata fields are removed or replaced with generic values. This process takes seconds and produces a PDF that no longer carries those identifying fields. Check the document properties of the result to confirm that the Creator, Producer, and date fields are also acceptable to share.
When to Clean Metadata
Clean the metadata in these situations:
- Before sharing any document externally that was created or edited by named individuals whose identity should remain private.
- Before publishing PDFs on a public website where the Author and Creator fields would be indexed.
- Before sharing documents in legal proceedings where metadata could be used to challenge the document's stated provenance.
- Before sending documents whose source you want to keep anonymous: whistleblower submissions, confidential tip-offs, anonymous feedback.
- Before sharing any PDF where a timestamp would be embarrassing or contradictory. A document dated one day but showing a modification timestamp from months later reveals that it was changed after the stated date.
Beyond Metadata — Full Privacy for PDF Documents
Metadata cleaning is one component of PDF privacy. Other considerations: document comments and tracked changes can contain author information and revision history, so flatten or remove these before sharing. Embedded thumbnails sometimes contain earlier versions of page content, and can be cleaned along with metadata. PDFs can also contain embedded scripts or actions that contact a remote server when the document is opened; this is rare, but possible. For maximum privacy, combine metadata removal with comment and annotation flattening, and process the document with a tool that does not log or store the content (such as rifix.xyz, which processes entirely locally). Together these cover the main ways a PDF can leak information beyond its visible content.