OCR Scan — Make Scanned PDFs Searchable

OCR Scan uses Tesseract.js to recognise text in scanned documents and images. It supports 12 languages and runs entirely in your browser, so your file never leaves your device.

Supported input: PDF, PNG, JPG, BMP, TIFF. Scanned documents work best.

What Is OCR and When Do You Need It?

OCR (Optical Character Recognition) converts scanned PDFs and images into searchable, selectable text. A scanned PDF is just a photograph of paper, so you cannot search, copy, or edit its text. After OCR, the document has a real text layer: you can search it with Ctrl+F, copy passages, and have its content indexed by document management systems.

Test if you need OCR: try selecting text on a page. If you cannot select it, OCR is required. If text highlights normally, the document already has a text layer.
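
The same check can be approximated in code. The sketch below is a hypothetical heuristic, not how OCR Scan itself decides: it assumes you have already pulled the text strings out of a page with a PDF library (for example, the `str` fields returned by pdf.js `getTextContent()`), and it treats the page as scanned when almost no visible characters come back. The function name `needsOcr` and the `minChars` threshold are illustrative assumptions.

```typescript
// Hypothetical heuristic: a page "needs OCR" when the text extracted from it
// contains fewer than `minChars` non-whitespace characters.
// `extractedStrings` is assumed to come from a PDF text-extraction library.
function needsOcr(extractedStrings: string[], minChars: number = 10): boolean {
  // Join all fragments and strip whitespace so blank items do not count.
  const visible = extractedStrings.join("").replace(/\s/g, "");
  return visible.length < minChars;
}

// A scanned page usually yields no text items at all.
console.log(needsOcr([]));                        // true
// A page with a real text layer yields readable strings.
console.log(needsOcr(["Invoice number 12345"]));  // false
```

A small non-zero threshold is used because some scanned PDFs carry a few stray characters, such as a page number stamped by the scanner software.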

How to Use OCR Scan

  1. Select your document language from the dropdown
  2. Drop your PDF or image onto the upload area (or click to browse)
  3. Click Run OCR — text appears in the panel on the right
  4. Copy the text or download as a .txt file

For Best OCR Results

Scan at 300 DPI or higher, with black text on a white background.

Frequently Asked Questions

Does OCR change how my document looks?

No. The original scan image is preserved exactly. OCR adds an invisible text layer under it, so the document looks identical but its text becomes selectable, searchable, and copyable.

What languages are supported?

English, Malay, Tamil, Chinese (Simplified and Traditional), Arabic, Hindi, Japanese, Korean, French, German, and Spanish.
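
For readers scripting Tesseract directly, the languages above correspond to Tesseract's standard traineddata codes. The sketch below shows one way to map display names to those codes and to build a combined language string. The constant `LANGUAGE_CODES` and the helper `toTesseractLangs` are hypothetical names for illustration, not part of OCR Scan or Tesseract.js.

```typescript
// Tesseract traineddata codes for the 12 supported languages.
// The codes are Tesseract's own identifiers; the names on the left are
// assumed display labels.
const LANGUAGE_CODES: Record<string, string> = {
  "English": "eng",
  "Malay": "msa",
  "Tamil": "tam",
  "Chinese (Simplified)": "chi_sim",
  "Chinese (Traditional)": "chi_tra",
  "Arabic": "ara",
  "Hindi": "hin",
  "Japanese": "jpn",
  "Korean": "kor",
  "French": "fra",
  "German": "deu",
  "Spanish": "spa",
};

// Builds the language string Tesseract expects. Several languages are
// joined with "+", e.g. "eng+chi_sim" for a mixed English/Chinese page.
function toTesseractLangs(names: string[]): string {
  return names
    .map((name) => {
      const code = LANGUAGE_CODES[name];
      if (code === undefined) {
        throw new Error(`Unsupported language: ${name}`);
      }
      return code;
    })
    .join("+");
}

console.log(toTesseractLangs(["English", "Chinese (Simplified)"])); // eng+chi_sim
```

Picking only the languages that actually appear in the document generally keeps recognition faster, since each extra language adds model data to load.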

Can it read handwriting?

Only partly. Printed text from a good-quality scan typically reaches 95%+ accuracy, but handwriting is harder: neat block capitals may work acceptably, while cursive handwriting usually needs manual correction after OCR.

Are my files safe?

Yes. All processing happens locally in your browser using Tesseract.js. Your file is never uploaded to any server.

How accurate is the OCR?

Printed text from a good-quality scan achieves 95%+ accuracy. Handwritten text is less reliable. For best results, scan at 300 DPI or higher, with black text on a white background.
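
To make the 300 DPI figure concrete, the sketch below works out what it means in pixels. It relies on two facts: PDF page units are 1/72 inch, and an inch is 25.4 mm. The function names `renderScale` and `pagePixels` are hypothetical, and the code illustrates the arithmetic only, not how OCR Scan renders pages internally.

```typescript
const POINTS_PER_INCH = 72;  // PDF user-space units per inch
const MM_PER_INCH = 25.4;

// Scale factor needed to render a PDF page (measured in points) at `dpi`.
function renderScale(dpi: number): number {
  return dpi / POINTS_PER_INCH;
}

// Pixel dimensions of a physical page scanned or rendered at `dpi`.
function pagePixels(
  widthMm: number,
  heightMm: number,
  dpi: number
): { width: number; height: number } {
  return {
    width: Math.round((widthMm / MM_PER_INCH) * dpi),
    height: Math.round((heightMm / MM_PER_INCH) * dpi),
  };
}

// An A4 page (210 mm x 297 mm) at 300 DPI.
console.log(pagePixels(210, 297, 300)); // { width: 2480, height: 3508 }
```

At 300 DPI a PDF page is rendered at roughly 4.17 times its nominal size, which is why high-DPI pages take noticeably longer to process than low-resolution ones.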

How long does OCR take?

A single page typically takes 10 to 30 seconds, depending on your device speed and the selected language. The Tesseract OCR engine runs in a background Web Worker, so your browser stays responsive while it works.
