yazanbaker94/OCRMYPDF

PDF OCR Converter

A simple desktop app that converts scanned/image PDFs to searchable text PDFs.

Features

  • 📄 Convert multiple PDFs at once
  • 🌍 Multi-language OCR support (English, Arabic, German, French, Spanish, Chinese, Japanese)
  • ⏭️ Skip pages that already have text
  • 🎨 Modern dark-themed interface

Download

📥 Download PDF_OCR_Converter.exe

Note: Requires Tesseract OCR to be installed. The app will prompt you if Tesseract is not detected.
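A minimal sketch of how such a detection check could work, assuming Tesseract is expected on the system `PATH`. The function name `find_tesseract` is hypothetical and is not taken from the app's source; the app may detect Tesseract differently.

```python
import shutil


def find_tesseract(executable="tesseract"):
    """Return the full path to the Tesseract binary, or None if it is not on PATH.

    Hypothetical helper for illustration; not the app's actual detection code.
    """
    return shutil.which(executable)


if __name__ == "__main__":
    path = find_tesseract()
    if path is None:
        print("Tesseract OCR was not found on PATH. Install it and try again.")
    else:
        print(f"Found Tesseract at {path}")
```

If Tesseract is installed but not found, adding its install folder to `PATH` is usually enough for a check like this to succeed.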

Installation (For Developers)

1. Install Tesseract OCR

Windows: run the Tesseract installer, or use Chocolatey:

choco install tesseract

2. Install Python Dependencies

pip install -r requirements.txt

3. Install Language Packs (Optional)

For Arabic support:

# The Windows installer includes language selection
# Or download language files from: https://github.com/tesseract-ocr/tessdata
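To confirm that a language pack such as Arabic (`ara`) is actually installed, you can ask Tesseract for its language list with `tesseract --list-langs`. The sketch below wraps that command; the helper names `parse_langs` and `installed_langs` are hypothetical, and the header format it skips is an assumption based on typical Tesseract output.

```python
import subprocess


def parse_langs(output):
    """Extract language codes from `tesseract --list-langs` output."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    # Skip the header line, which typically starts with "List of available languages".
    return [line for line in lines if not line.lower().startswith("list of")]


def installed_langs(executable="tesseract"):
    """Run Tesseract and return the language codes it reports as installed."""
    result = subprocess.run(
        [executable, "--list-langs"],
        capture_output=True,
        text=True,
        check=True,
    )
    # Some Tesseract builds write this list to stderr, so both streams are read.
    return parse_langs(result.stdout + "\n" + result.stderr)


if __name__ == "__main__":
    langs = installed_langs()
    print("Arabic installed:", "ara" in langs)
```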

Usage

  1. Run the app:

    python ocr_app.py
  2. Click "Select PDF Files" to choose your PDFs

  3. Select the OCR language

  4. Click "Convert to Searchable PDF"

  5. Find output files with _ocr suffix in the same folder
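For scripting the same conversion without the GUI, the steps above could be approximated on the command line. This is only a sketch under a loud assumption: that the OCRmyPDF command-line tool is installed and is an acceptable stand-in for whatever `ocr_app.py` does internally (the repository name suggests it, but the README does not confirm it). The `LANG_CODES` mapping and `build_ocr_command` are hypothetical names, and mapping "Chinese" to Simplified Chinese is a guess.

```python
# Tesseract language codes for the languages listed under Features.
# Assumption: "Chinese" means Simplified Chinese (chi_sim); the app may differ.
LANG_CODES = {
    "English": "eng",
    "Arabic": "ara",
    "German": "deu",
    "French": "fra",
    "Spanish": "spa",
    "Chinese": "chi_sim",
    "Japanese": "jpn",
}


def build_ocr_command(input_pdf, output_pdf, languages=("English",), skip_text=True):
    """Build an `ocrmypdf` argument list; does not run anything."""
    # Multiple languages are joined with "+", e.g. "eng+ara".
    codes = "+".join(LANG_CODES[name] for name in languages)
    cmd = ["ocrmypdf", "-l", codes]
    if skip_text:
        # Leave pages that already contain text untouched.
        cmd.append("--skip-text")
    cmd += [str(input_pdf), str(output_pdf)]
    return cmd


if __name__ == "__main__":
    print(" ".join(build_ocr_command("document.pdf", "document_ocr.pdf", ["English", "Arabic"])))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once per selected file to process several PDFs in a batch.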

Output

The converted files are saved in the same directory as the input with _ocr appended to the filename:

  • document.pdf → document_ocr.pdf
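The naming rule can be sketched with `pathlib`. The function name `ocr_output_path` is hypothetical; the app's own implementation may differ, but the result matches the rule described above.

```python
from pathlib import Path


def ocr_output_path(input_pdf):
    """Return the output path: same folder, with _ocr appended to the file stem."""
    src = Path(input_pdf)
    return src.with_name(f"{src.stem}_ocr{src.suffix}")


if __name__ == "__main__":
    print(ocr_output_path("document.pdf"))
```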

About

This will turn any PDF into text using OCR.
