A deep learning-based Optical Character Recognition (OCR) system for extracting text from handwritten documents, built on a Convolutional Neural Network (CNN) with recurrent layers and Connectionist Temporal Classification (CTC).
- Handwritten text recognition using deep learning
- Utilizes CTC (Connectionist Temporal Classification) for sequence prediction
- Implements a CNN + LSTM architecture for robust text recognition
- Supports variable-length text sequences
- Preprocessing pipeline for image normalization and frame splitting
- Batch processing capabilities for efficient training
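The normalization and frame-splitting step can be sketched as follows. This is a minimal illustration using only NumPy; the frame width of 8 is a placeholder, and the project's actual preprocessing lives in read_images.py with dimensions set in configuration.py:

```python
import numpy as np

def preprocess(image, frame_width=8):
    """Normalize a grayscale image to [0, 1] and split it into
    fixed-width vertical frames for sequential processing.

    frame_width=8 is illustrative, not the project's setting.
    """
    img = image.astype(np.float32) / 255.0  # scale pixels to [0, 1]
    height, width = img.shape
    # Pad the width so it divides evenly into frames.
    pad = (-width) % frame_width
    img = np.pad(img, ((0, 0), (0, pad)), constant_values=0.0)
    n_frames = img.shape[1] // frame_width
    # Each frame is a (height, frame_width) slice, ordered left to right.
    return img.reshape(height, n_frames, frame_width).transpose(1, 0, 2)

frames = preprocess(np.zeros((32, 100), dtype=np.uint8))
# 100 columns padded to 104 -> 13 frames of width 8
```

Splitting the image into frames is what turns a 2-D picture into the time-ordered sequence the recurrent layers and CTC loss operate on.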
The model architecture consists of:
- Time-distributed CNN layers for feature extraction
- Bidirectional LSTM layers for sequence modeling
- CTC loss function for sequence prediction
- Batch normalization for improved training stability
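The architecture above can be sketched in Keras roughly as follows. Layer counts, filter sizes, and the frame dimensions are illustrative assumptions, not the values used in CTCModel.py or configuration.py; training would additionally attach a CTC loss (e.g. `tf.nn.ctc_loss`) to the softmax output:

```python
from tensorflow.keras import layers, models

def build_model(frame_height=32, frame_width=8, n_classes=80):
    # Input: a variable-length sequence of grayscale frames (time steps).
    inputs = layers.Input(shape=(None, frame_height, frame_width, 1))
    # Time-distributed CNN: the same conv stack runs on every frame.
    x = layers.TimeDistributed(
        layers.Conv2D(32, 3, padding="same", activation="relu"))(inputs)
    x = layers.TimeDistributed(layers.BatchNormalization())(x)
    x = layers.TimeDistributed(layers.MaxPooling2D(2))(x)
    x = layers.TimeDistributed(layers.Flatten())(x)
    # Bidirectional LSTM models left and right context of each frame.
    x = layers.Bidirectional(layers.LSTM(128, return_sequences=True))(x)
    # Per-frame distribution over characters plus the CTC blank label.
    outputs = layers.Dense(n_classes + 1, activation="softmax")(x)
    return models.Model(inputs, outputs)

model = build_model()
```

Note the `+ 1` on the output layer: CTC reserves one extra class for the blank symbol it uses to separate repeated characters.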
- Python 3.x
- Keras
- OpenCV
- NumPy
- Pandas
- TensorFlow (backend for Keras)
Handwritten_Text_Extraction_OCR/
├── Data/                 # Dataset directory
│   ├── list.csv          # Dataset annotations
│   └── class.txt         # Character classes
├── CTCModel.py           # CTC model implementation
├── configuration.py      # Model configuration parameters
├── model.py              # Main model architecture
├── read_images.py        # Image preprocessing utilities
└── LICENSE               # MIT License
1. Prepare your dataset:
   - Place images in the Data directory
   - Update list.csv with image paths and annotations
   - Ensure class.txt contains all character classes
2. Configure parameters in configuration.py:
   - Set window dimensions
   - Adjust batch size and epochs
   - Configure model parameters
3. Train the model:
   python model.py
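The dataset-preparation step maps each transcription to an integer label sequence for CTC. A minimal sketch of that encoding, using an in-memory stand-in for Data/class.txt (the real file's exact format is defined by the repository, so the one-character-per-line layout here is an assumption):

```python
import io

def load_classes(fh):
    """Read one character class per line from a class.txt-style file.

    Returns a char -> integer label map; CTC reserves one extra index
    (len(classes)) for its blank symbol.
    """
    classes = [line.rstrip("\n") for line in fh]
    return {c: i for i, c in enumerate(classes)}

def encode(text, char_to_label):
    """Turn a transcription string into an integer label sequence."""
    return [char_to_label[c] for c in text]

# Example with an in-memory stand-in for Data/class.txt.
char_to_label = load_classes(io.StringIO("a\nb\nc\n"))
labels = encode("cab", char_to_label)  # -> [2, 0, 1]
```

In the actual pipeline, list.csv would supply the (image path, transcription) pairs that feed this encoding, e.g. loaded with pandas.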
The model will:
- Preprocess images
- Train on the dataset
- Save the trained model
- Evaluate performance
The training process includes:
- Image preprocessing and normalization
- Frame splitting for sequence processing
- CTC loss optimization
- Performance evaluation using:
  - Loss metrics
  - Label Error Rate (LER)
  - Sequence Error Rate (SER)
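The two error rates have standard definitions: LER is total edit distance between predicted and reference label sequences divided by total reference length, and SER is the fraction of sequences with at least one error. A self-contained sketch (the function names are illustrative, not the project's):

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two label sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution
        prev = cur
    return prev[-1]

def ler(refs, hyps):
    """Label Error Rate: total edit distance / total reference length."""
    return (sum(edit_distance(r, h) for r, h in zip(refs, hyps))
            / sum(len(r) for r in refs))

def ser(refs, hyps):
    """Sequence Error Rate: fraction of sequences with any error."""
    return sum(r != h for r, h in zip(refs, hyps)) / len(refs)

refs = [[1, 2, 3], [4, 5]]
hyps = [[1, 2, 3], [4, 6]]  # one substitution in the second sequence
print(ler(refs, hyps), ser(refs, hyps))  # 0.2 0.5
```

Note how one substitution yields a small LER (1 error over 5 labels) but a large SER (1 of 2 sequences wrong), which is why both metrics are reported.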
This project is licensed under the MIT License - see the LICENSE file for details.
Kunal Bhujbal