Playfair Cipher Decryption Project

This project provides several approaches to decrypting Playfair ciphertext: frequency analysis, statistical optimization, and constraint-based key recovery.

Overview

The Playfair cipher is a digraphic substitution cipher that encrypts pairs of letters. This project implements several methods to:

Decrypt ciphertext when the key is unknown (using frequency analysis)
Recover the key when both plaintext and ciphertext are known
Find the key whose decryption is statistically closest to a known plaintext

Project Structure

playfairgame/
├── Core Modules
│   ├── preprocessing.py          # Text cleaning and formatting
│   ├── frequency_analysis.py     # N-gram frequency analysis
│   ├── playfair_cipher.py        # Playfair encryption/decryption
│   ├── scoring.py                # English-likeness scoring
│   └── key_search.py             # Key search algorithms
│
├── Main Scripts
│   ├── main.py                   # Standard decryption (ciphertext only)
│   ├── main_improve.py           # Improved decryption with strategies
│   ├── restore_from_open.py      # Key recovery from plaintext-ciphertext pairs
│   └── find_key_statistical.py   # Statistical optimization approach
│
├── Data Files
│   ├── cyphertext.txt            # Encrypted message to decrypt
│   ├── microcypheropen.txt       # Ciphertext and plaintext pairs
│   ├── english_1grams.csv        # English letter frequencies
│   ├── english_2grams.csv        # English bigram frequencies
│   ├── english_2grams_norm.csv   # Normalized bigram frequencies
│   ├── english_3grams.csv        # English trigram frequencies
│   └── english_3grams_norm.csv   # Normalized trigram frequencies
│
├── Utilities
│   └── normalize/
│       └── normalize_ngrams.py   # Normalizes n-gram frequency files
│
└── Output Files
    ├── decrypted_output.txt      # Decryption results
    └── restored_key.txt          # Recovered key

Core Modules

`preprocessing.py`

Purpose: Handles loading and cleaning ciphertext/plaintext.

Functions:

load_ciphertext(filepath): Reads ciphertext from a file
preprocess_text(raw_text): Removes punctuation, converts to uppercase, treats J as I
restore_formatting(decrypted_text, non_alpha_positions): Restores original formatting

Usage: Used by all main scripts for text preprocessing.
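The cleaning and formatting round trip described above can be sketched as follows. This is a minimal illustration, not the project's code: the names `clean_text` and `reinsert_formatting` are hypothetical stand-ins for `preprocess_text` and `restore_formatting`, and the shape of the recorded positions (a list of `(index, character)` tuples) is an assumption.

```python
def clean_text(raw_text):
    """Return (cleaned_text, non_alpha_positions) for a raw message.

    Hypothetical stand-in for preprocess_text; the return shape is assumed.
    """
    cleaned = []
    non_alpha_positions = []  # (index in raw text, dropped character)
    for i, ch in enumerate(raw_text):
        if ch.isascii() and ch.isalpha():
            upper = ch.upper()
            cleaned.append("I" if upper == "J" else upper)  # Playfair merges J into I
        else:
            non_alpha_positions.append((i, ch))
    return "".join(cleaned), non_alpha_positions


def reinsert_formatting(decrypted_text, non_alpha_positions):
    """Put the dropped characters back at their original indices."""
    chars = list(decrypted_text)
    for i, ch in non_alpha_positions:  # positions are in ascending order
        chars.insert(i, ch)
    return "".join(chars)


cleaned, positions = clean_text("Hi, Joe!")
print(cleaned)                                  # HIIOE
print(reinsert_formatting(cleaned, positions))  # HI, IOE!
```

Note that the original letter case and the J/I distinction are not recoverable from this record; only the positions of non-letter characters are.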

`frequency_analysis.py`

Purpose: Analyzes letter and digram frequencies in text.

Functions:

count_monograms(text): Counts single letter frequencies
count_digrams(text): Counts letter pair frequencies
get_english_ngram_data(): Loads English frequency tables from CSV files
analyze_frequency_match(decrypted_text, ...): Compares decrypted text to English patterns

Usage: Provides frequency data for scoring and analysis.
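As an illustration of the counting step, here is a small sketch using `collections.Counter`. The names `count_letters` and `count_pairs` are hypothetical stand-ins for `count_monograms` and `count_digrams`. Whether the project counts overlapping or non-overlapping pairs is not stated; the sketch assumes non-overlapping pairs at even offsets, since that is how Playfair consumes text.

```python
from collections import Counter


def count_letters(text):
    """Count how often each single letter occurs."""
    return Counter(text)


def count_pairs(text):
    """Count non-overlapping letter pairs starting at even offsets.

    A trailing unpaired letter (odd-length text) is ignored.
    """
    return Counter(text[i:i + 2] for i in range(0, len(text) - 1, 2))


pairs = count_pairs("ABABCD")
print(pairs.most_common(1))  # [('AB', 2)]
```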

`playfair_cipher.py`

Purpose: Implements Playfair cipher encryption and decryption.

Functions:

create_key_matrix(key_string): Creates 5x5 key matrix from 25-letter string
decrypt_playfair(cipher_text, key_matrix, letter_to_pos): Decrypts ciphertext
encrypt_playfair(plain_text, key_matrix, letter_to_pos): Encrypts plaintext
find_position(letter, letter_to_pos): Finds letter position in key matrix

Playfair Rules:

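Same row: replace each letter with the one to its right (encrypt) or left (decrypt), wrapping around the row
Same column: replace each letter with the one below it (encrypt) or above it (decrypt), wrapping around the column
Rectangle: replace each letter with the letter in its own row and the other letter's column (the same operation for encryption and decryption)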
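The three rules can be expressed in a few lines. The sketch below is illustrative only: `build_square` and `transform_digram` are hypothetical names, not the signatures of `create_key_matrix` or `decrypt_playfair`, and the key is the common textbook example built from the phrase "playfair example".

```python
def build_square(key_string):
    """Lay a 25-letter key (no J) out as a 5x5 grid plus a reverse lookup."""
    if len(key_string) != 25 or len(set(key_string)) != 25:
        raise ValueError("key must contain 25 distinct letters")
    matrix = [list(key_string[r * 5:(r + 1) * 5]) for r in range(5)]
    letter_to_pos = {ch: (r, c)
                     for r, row in enumerate(matrix)
                     for c, ch in enumerate(row)}
    return matrix, letter_to_pos


def transform_digram(pair, matrix, letter_to_pos, decrypt=False):
    """Apply the row, column, or rectangle rule to one two-letter pair."""
    shift = -1 if decrypt else 1
    (ra, ca), (rb, cb) = letter_to_pos[pair[0]], letter_to_pos[pair[1]]
    if ra == rb:  # same row: move along the row, wrapping at the edge
        return matrix[ra][(ca + shift) % 5] + matrix[rb][(cb + shift) % 5]
    if ca == cb:  # same column: move along the column, wrapping at the edge
        return matrix[(ra + shift) % 5][ca] + matrix[(rb + shift) % 5][cb]
    # rectangle: keep each letter's row, take the other letter's column
    return matrix[ra][cb] + matrix[rb][ca]


matrix, pos = build_square("PLAYFIREXMBCDGHKNOQSTUVWZ")
print(transform_digram("HI", matrix, pos))                # BM (rectangle)
print(transform_digram("DE", matrix, pos))                # OD (same column)
print(transform_digram("BM", matrix, pos, decrypt=True))  # HI
```

Python's `%` operator returns a non-negative result for a negative left operand, so the same expression handles wrap-around in both directions.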

`scoring.py`

Purpose: Scores how "English-like" decrypted text is.

Functions:

load_ngram_tables(): Loads English frequency tables
score_text(text, mono_table, bi_table, tri_table): Scores text using n-gram probabilities

Scoring Method: Sums the log probabilities of the monograms, bigrams, and trigrams found in the text. A higher score means the text is more English-like.
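A log-probability score of this kind can be sketched as below. The function name `score_ngrams`, the `tables` dictionary keyed by n-gram length, the `floor` probability for unseen n-grams, and the toy probabilities are all assumptions made for illustration; the project's `score_text` may weight or combine the tables differently.

```python
import math


def score_ngrams(text, tables, floor=1e-10):
    """Sum log10 probabilities of every n-gram in text.

    tables maps n (1, 2, 3, ...) to a dict of n-gram -> probability.
    N-grams missing from a table get the small floor probability
    instead of zero, so the logarithm stays defined.
    """
    total = 0.0
    for n, table in tables.items():
        for i in range(len(text) - n + 1):
            total += math.log10(table.get(text[i:i + n], floor))
    return total


# Toy tables with made-up probabilities, for demonstration only.
toy_tables = {
    1: {"T": 0.09, "H": 0.06, "E": 0.12},
    2: {"TH": 0.035, "HE": 0.030},
    3: {"THE": 0.018},
}

print(score_ngrams("THE", toy_tables) > score_ngrams("QXZ", toy_tables))  # True
```

Scores are negative because every probability is below 1; "higher" therefore means closer to zero.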

`key_search.py`

Purpose: Searches for the correct Playfair key.

Functions:

mutate_key(key_str): Swaps two random letters in key
hill_climb_search(initial_key, cipher_text, score_func, ...): Hill climbing algorithm
simulated_annealing_search(initial_key, cipher_text, score_func, ...): Simulated annealing algorithm
search_for_key(cipher_text, english_ngram_tables, ...): Main search function combining both methods

Algorithms:

Hill Climbing: Accepts only keys that score better, so it can get stuck in local optima
Simulated Annealing: Sometimes accepts worse keys to escape local optima
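The difference between the two algorithms comes down to the acceptance test. The following sketch shows a generic simulated annealing loop over key permutations; the names `swap_two` and `anneal`, the temperature schedule, and the toy score function are assumptions for illustration and are not taken from `key_search.py`. With the temperature near zero the loop behaves like plain hill climbing.

```python
import math
import random


def swap_two(key_str, rng):
    """Return a copy of key_str with two randomly chosen letters swapped."""
    i, j = rng.sample(range(len(key_str)), 2)
    chars = list(key_str)
    chars[i], chars[j] = chars[j], chars[i]
    return "".join(chars)


def anneal(initial_key, score_func, steps=2000, start_temp=10.0,
           cooling=0.995, seed=0):
    """Maximize score_func over permutations of initial_key."""
    rng = random.Random(seed)
    current, current_score = initial_key, score_func(initial_key)
    best, best_score = current, current_score
    temp = start_temp
    for _ in range(steps):
        candidate = swap_two(current, rng)
        candidate_score = score_func(candidate)
        delta = candidate_score - current_score
        # Better keys are always accepted; worse keys are accepted with
        # probability exp(delta / temp), which shrinks as temp cools.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, current_score = candidate, candidate_score
            if current_score > best_score:
                best, best_score = current, current_score
        temp = max(temp * cooling, 1e-6)
    return best, best_score


# Toy objective: number of letters already in their target position.
TARGET = "ABCDEFGHIKLMNOPQRSTUVWXYZ"


def toy_score(key):
    return sum(a == b for a, b in zip(key, TARGET))


best_key, best_score = anneal(TARGET[::-1], toy_score, steps=500)
print(best_key, best_score)
```

In a real search the score function would decrypt the ciphertext with the candidate key and return its n-gram score.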

Main Scripts

1. `main.py` - Standard Decryption

Purpose: Decrypts ciphertext when key is unknown.

How to Run:

python main.py

What It Does:

Loads ciphertext from cyphertext.txt
Preprocesses text (removes punctuation, converts to uppercase)
Analyzes ciphertext frequencies
Loads English frequency tables
Searches for key using hill climbing and simulated annealing
Scores decrypted text using n-gram statistics
Saves results to decrypted_output.txt

Output:

Key matrix (5x5 grid)
Decrypted plaintext
Frequency match score
N-gram analysis

When to Use: You have ciphertext but don't know the key or plaintext.

2. `main_improve.py` - Improved Decryption

Purpose: Enhanced decryption with multiple mutation strategies.

How to Run:

python main_improve.py

Improvements:

Multiple mutation strategies (swap, rotate, shuffle)
Better key initialization
Strategy analysis and selection
Improved convergence

When to Use: The standard decryption in main.py isn't finding a good key.

3. `restore_from_open.py` - Key Recovery from Known Pairs

Purpose: Recovers the exact Playfair key when you have both plaintext and ciphertext.

How to Run:

python restore_from_open.py

Input Format: microcypheropen.txt with:

Line 1: Ciphertext
Line 2: Plaintext

What It Does:

Reads ciphertext and plaintext pairs
Extracts digram pairs (plaintext → ciphertext)
Uses constraint propagation to build key matrix
Processes pairs to determine letter positions
Uses backtracking to resolve conflicts
Verifies recovered key
Saves key to restored_key.txt

Algorithm: Constraint-based solving with recursive backtracking.

When to Use: You have both plaintext and ciphertext and want to recover the exact key.

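Note: Requires sufficient plaintext-ciphertext pairs (typically 20+ digrams). Because cyclically rotating the rows or columns of a Playfair square does not change the cipher, the recovered key may be a rotation of the original matrix.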

4. `find_key_statistical.py` - Statistical Optimization

Purpose: Finds the key whose decryption is statistically closest to the known plaintext.

How to Run:

python find_key_statistical.py

Input Format: microcypheropen.txt with:

Line 1: Ciphertext
Line 2: Plaintext

What It Does:

Loads ciphertext and plaintext
Uses optimization algorithms (hill climbing, simulated annealing)
Scores keys by statistical distance between decrypted text and plaintext
Compares:
- Monogram frequencies
- Bigram frequencies
- Character-level accuracy
Runs multiple trials to find best key
Saves best key to restored_key.txt

Statistical Metrics:

Chi-square distance
Total variation distance
Character accuracy
KL divergence

When to Use: You have plaintext and want to find a key that produces statistically similar decryption (even if not exact).

Advantages: Works even with partial or noisy plaintext.
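For reference, the four metrics listed above can be written compactly over letter distributions. This is a sketch using the standard definitions; the function names, the 25-letter alphabet constant, and the small `eps` smoothing term are assumptions, and the script itself may normalize or combine the metrics differently.

```python
import math
from collections import Counter

ALPHABET = "ABCDEFGHIKLMNOPQRSTUVWXYZ"  # 25 letters, J merged into I


def letter_distribution(text):
    """Relative frequency of each alphabet letter, in ALPHABET order."""
    counts = Counter(text)
    total = sum(counts[ch] for ch in ALPHABET) or 1
    return [counts[ch] / total for ch in ALPHABET]


def chi_square_distance(p, q, eps=1e-12):
    """Sum of squared differences, each scaled by the reference value q."""
    return sum((a - b) ** 2 / (b + eps) for a, b in zip(p, q))


def total_variation_distance(p, q):
    """Half the L1 distance between two distributions (0 to 1)."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q); eps keeps the ratio finite when q has zeros."""
    return sum(a * math.log((a + eps) / (b + eps))
               for a, b in zip(p, q) if a > 0)


def character_accuracy(candidate, reference):
    """Fraction of positions where the two texts agree."""
    n = min(len(candidate), len(reference))
    if n == 0:
        return 0.0
    return sum(candidate[i] == reference[i] for i in range(n)) / n


p = letter_distribution("AABB")
q = letter_distribution("AAAB")
print(total_variation_distance(p, q))      # 0.25
print(character_accuracy("ABCD", "ABXD"))  # 0.75
```

The three distribution distances ignore letter order, so a key can score well on them while still decrypting incorrectly; character accuracy is the only position-sensitive measure here.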

Utility Scripts

`normalize/normalize_ngrams.py`

Purpose: Normalizes n-gram frequency files by converting counts to probabilities.

How to Run:

python normalize/normalize_ngrams.py

What It Does:

Reads english_2grams.csv and english_3grams.csv
Calculates total frequency sum
Divides each frequency by total to get probability
Saves normalized versions to *_norm.csv files

When to Use: After updating frequency data files, or to create normalized versions for scoring.
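The normalization step amounts to dividing each count by the column total. The sketch below works on an in-memory CSV; the two-column `ngram,count` layout with no header row and the name `normalize_counts` are assumptions, since the actual file layout is not described here.

```python
import csv
import io


def normalize_counts(rows):
    """Convert (ngram, count) rows into (ngram, probability) rows."""
    rows = [(ngram, float(count)) for ngram, count in rows]
    total = sum(count for _, count in rows)
    if total <= 0:
        raise ValueError("frequency total must be positive")
    return [(ngram, count / total) for ngram, count in rows]


# Made-up counts standing in for a frequency file.
raw_csv = "TH,300\nHE,200\nIN,500\n"
rows = list(csv.reader(io.StringIO(raw_csv)))
normalized = normalize_counts(rows)

out = io.StringIO()
csv.writer(out).writerows(normalized)
print(out.getvalue())
```

To work on real files, the `io.StringIO` objects would be replaced with files opened using `newline=""`, as the `csv` module documentation recommends.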

Step-by-Step Usage Guide

Scenario 1: Decrypt Unknown Ciphertext

Place your ciphertext in cyphertext.txt
Run:
```
python main.py
```
Check decrypted_output.txt for results
If results are poor, try main_improve.py for better search

Scenario 2: Recover Key from Known Plaintext-Ciphertext

Option A: Exact Key Recovery

Create microcypheropen.txt:
- Line 1: ciphertext
- Line 2: plaintext
Run:
```
python restore_from_open.py
```
Check restored_key.txt for recovered key

Option B: Statistical Optimization

Create microcypheropen.txt (same format)
Run:
```
python find_key_statistical.py
```
Check restored_key.txt for statistically best key

Scenario 3: Update Frequency Data

Update CSV files with new frequency data
Run:
```
python normalize/normalize_ngrams.py
```
Normalized files will be used for scoring

How It Works

Standard Decryption Process

Preprocessing:
- Removes punctuation and spaces
- Converts to uppercase
- Treats J as I (Playfair convention)
- Records original formatting positions
Frequency Analysis:
- Counts digrams in ciphertext
- Compares to English bigram frequencies
- Identifies common patterns
Key Search:
- Starts with random key
- Mutates key (swaps letters)
- Decrypts with new key
- Scores decrypted text
- Accepts better keys (hill climbing)
- Sometimes accepts worse keys (simulated annealing)
Scoring:
- Uses English n-gram probabilities
- Monograms: single letter frequencies
- Bigrams: letter pair frequencies
- Trigrams: letter triple frequencies
- Higher score = more English-like
Output:
- Best key found
- Decrypted plaintext
- Frequency match analysis
- Formatted with original punctuation

Key Recovery Process

Pair Extraction:
- Processes plaintext-ciphertext pairs
- Extracts digram relationships
- Determines encryption rule (row/column/rectangle)
Constraint Building:
- For each pair, determines letter positions
- Builds constraints on key matrix
- Propagates constraints
Solving:
- Uses backtracking to resolve conflicts
- Tries all valid placements
- Verifies consistency
Verification:
- Tests recovered key
- Encrypts plaintext to verify ciphertext match

Requirements

Python 3.6+
CSV files with English frequency data:
- english_1grams.csv
- english_2grams.csv
- english_3grams.csv
- english_2grams_norm.csv (generated)
- english_3grams_norm.csv (generated)

Performance Notes

Standard Decryption: May take 5-15 minutes depending on ciphertext length
Key Recovery: Can be fast (< 1 minute) with sufficient pairs, or slow with few pairs
Statistical Optimization: Typically 5-10 minutes for thorough search

All scripts print progress updates during execution.

Troubleshooting

Problem: Decryption produces gibberish

Solution: Try main_improve.py or increase search iterations

Problem: Key recovery fails

Solution: Ensure you have enough plaintext-ciphertext pairs (20+ digrams recommended)

Problem: Statistical optimization has low accuracy

Solution: Increase number of trials or iterations in find_key_statistical.py

Problem: Missing CSV files

Solution: Ensure all frequency data files are present in the project directory

License

This project is for educational and research purposes.
