Fast JPEG Decoder

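Video Compression Final Project

Team members: 周哲瑋 (Wayne), 陳冠霖, 許詠約

A high-performance JPEG decoder with a C++ core exposed to Python through pybind11. The project includes both a C++ and a NumPy implementation so that the performance of the two approaches can be compared.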

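Features

⚡ High performance: the C++ core is roughly 3.6× to 4.4× faster than the NumPy version on the test images
🐍 Python-friendly: a concise Python API provided through pybind11
📚 Educational: a detailed, documented implementation of the JPEG decoding pipeline
🔧 Extensible: modular design that leaves room for later optimization (SIMD, multithreading, etc.)
📊 Complete benchmark: performance tests plus PSNR quality verification
✅ High quality: PSNR above 35 dB, with no visible distortion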

Performance

Results on the standard test images:

Image	C++ Decoder	NumPy Decoder	Speedup
Lena (512×512)	67.50 ms	295.99 ms	4.38×
Images (183×275)	7.50 ms	33.09 ms	4.41×
Sample (64×64)	0.56 ms	2.05 ms	3.63×

Quality verification (PSNR against the output of PIL):

C++ Decoder: 35.20 dB ✅ (good)
NumPy Decoder: 35.15 dB ✅ (good)
Both exceed the threshold used here for visually lossless output (> 30 dB)
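
For readers unfamiliar with the metric, the following is a minimal sketch of how a PSNR figure like the ones above can be computed with NumPy. The function name `psnr` and the 8-bit peak value of 255 are assumptions for illustration; the project's own benchmark script may compute it differently.

```python
import numpy as np


def psnr(reference: np.ndarray, test: np.ndarray, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two images of the same shape."""
    if reference.shape != test.shape:
        raise ValueError("images must have the same shape")
    # Convert to float64 first so that uint8 subtraction cannot wrap around.
    diff = reference.astype(np.float64) - test.astype(np.float64)
    mse = float(np.mean(diff ** 2))
    if mse == 0.0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(peak ** 2 / mse))
```

A typical use would be to decode the same file with this project and with PIL, then pass both arrays to `psnr`; a higher value means the two outputs are closer.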

For detailed benchmark results, see doc/BENCHMARK_RESULTS.md

Quick Start

Requirements

Python: 3.8 or later
NumPy: any version
pybind11: 2.6.0 or later
C++ compiler: C++17 support (GCC 7+, Clang 5+, MSVC 2017+)

Installation

Install from source (recommended)

# 1. Clone the project
git clone https://github.com/5000user5000/Fast-Jpeg-Decoder.git
cd Fast-Jpeg-Decoder

# 2. Install the Python dependencies
pip install numpy pybind11

# 3. Build and install the C++ module (development mode)
make develop

# Or install in editable mode with pip (uses setup.py)
pip install -e .

Verify the installation

import fast_jpeg_decoder as fjd
print(fjd.__version__)  # should print the version number

Usage

Basic usage

import fast_jpeg_decoder as fjd

# Option 1: load from a file path
image = fjd.load('photo.jpg')
print(image.shape)  # (height, width, 3)
print(image.dtype)  # uint8

# Option 2: load from bytes
with open('photo.jpg', 'rb') as f:
    data = f.read()
image = fjd.load_bytes(data)

Using the Decoder class

import fast_jpeg_decoder as fjd

decoder = fjd.JPEGDecoder()
decoder.decode_file('photo.jpg')

print(f"Width: {decoder.width}")
print(f"Height: {decoder.height}")
print(f"Channels: {decoder.channels}")

image = decoder.get_image_data()

Integration with other libraries

Displaying an image with Matplotlib

import fast_jpeg_decoder as fjd
import matplotlib.pyplot as plt

image = fjd.load('photo.jpg')
plt.imshow(image)
plt.axis('off')
plt.show()

Processing an image with OpenCV

import fast_jpeg_decoder as fjd
import cv2

# Decode the JPEG
image = fjd.load('photo.jpg')

# OpenCV uses BGR channel order, so convert from RGB first
image_bgr = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)

# Apply image processing
gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
cv2.imwrite('output_gray.jpg', gray)

Saving to another format

import fast_jpeg_decoder as fjd
from PIL import Image

# Decode with Fast JPEG Decoder
image = fjd.load('input.jpg')

# Save as PNG with PIL
Image.fromarray(image).save('output.png')

Project structure

Fast-Jpeg-Decoder/
├── src/
│   ├── cpp/                    # C++ core implementation
│   ├── bindings/               # pybind11 bindings
│   └── python/                 # Python wrapper
├── python_implementations/     # Pure-Python implementations
│   ├── __init__.py
│   └── numpy_decoder.py        # NumPy version of the decoder
├── tests/
│   ├── test_data/              # Test images
│   └── test_decoder.py         # Unit tests
├── benchmarks/
│   └── run_benchmark.py        # Performance tests and quality verification
├── doc/
│   └── BENCHMARK_RESULTS.md    # Benchmark results
├── output/                     # Decoded output (generated by the benchmark)
├── example.py                  # Usage example
├── Makefile                    # Build script
├── setup.py                    # Python install script
└── README.md                   # This file

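Development and testing

Building the project

# Development mode (editable install)
make develop

# Remove build artifacts
make clean

# Rebuild
make rebuild

Running the tests

# Using the Makefile
make test

# Or run pytest directly
pytest tests/ -v

# Run a specific test
pytest tests/test_decoder.py::test_load_jpeg -v

Running the benchmark

# From the project root
python benchmarks/run_benchmark.py

# Or from the benchmarks directory
cd benchmarks
python run_benchmark.py

The benchmark reports:

Performance data (decode time, speedup)
PSNR quality metrics (compared against PIL)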
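
As a rough illustration of how decode times and a speedup figure can be measured, here is a minimal timing sketch using only the standard library. The helper names `benchmark` and `speedup`, the warm-up count, and the use of the median are assumptions for illustration and are not taken from `run_benchmark.py`.

```python
import statistics
import time


def benchmark(func, *args, repeats: int = 10, warmup: int = 2) -> float:
    """Return the median wall-clock time of func(*args) in milliseconds."""
    for _ in range(warmup):
        func(*args)  # warm caches before timing
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


def speedup(baseline_ms: float, candidate_ms: float) -> float:
    """How many times faster the candidate is than the baseline."""
    return baseline_ms / candidate_ms
```

With the package installed, something like `benchmark(fjd.load, 'photo.jpg')` would time the C++ decoder; timing the NumPy decoder the same way and passing both results to `speedup` gives a ratio comparable to the table in the Performance section.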

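JPEG decoding pipeline

This project implements the complete JPEG Baseline DCT decoding pipeline:

JPEG file
    ↓
1. Parse the file structure (Parse Markers)
   - SOI (Start of Image)
   - DQT (Define Quantization Table)
   - SOF0 (Start of Frame - Baseline DCT)
   - DHT (Define Huffman Table)
   - SOS (Start of Scan)
    ↓
2. Huffman decoding
   - Decode the bitstream using the Huffman tables
   - DC coefficients: differential decoding
   - AC coefficients: run-length decoding (RLE)
    ↓
3. De-zigzag
   - Reorder the 1-D coefficient array into an 8×8 matrix
    ↓
4. Dequantization
   - Restore the DCT coefficients using the quantization table
    ↓
5. Inverse discrete cosine transform (IDCT)
   - Convert from the frequency domain back to the spatial domain
    ↓
6. Chroma upsampling
   - Handle 4:2:0 and 4:2:2 subsampling
    ↓
7. Color space conversion (YCbCr → RGB)
   - Convert to standard RGB
    ↓
Decoded image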
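
Steps 3 and 4 of the pipeline above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the project's code: the names `zigzag_order`, `dezigzag`, and `dequantize` are hypothetical. It relies on the fact that a JPEG file stores both the decoded coefficients and the DQT table entries in zigzag order, so both are reordered the same way before multiplying.

```python
import numpy as np


def zigzag_order(n: int = 8):
    """Return the (row, col) positions of an n×n block in zigzag scan order."""
    # Cells are visited one anti-diagonal at a time. Odd diagonals run
    # downwards (row ascending), even diagonals run upwards (col ascending).
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        key=lambda rc: (rc[0] + rc[1], rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )


ZIGZAG = zigzag_order()


def dezigzag(values_1d) -> np.ndarray:
    """Place 64 zigzag-ordered values into an 8×8 matrix."""
    block = np.zeros((8, 8), dtype=np.int32)
    for k, (r, c) in enumerate(ZIGZAG):
        block[r, c] = values_1d[k]
    return block


def dequantize(coeffs_1d, qtable_1d) -> np.ndarray:
    """Multiply coefficients by the quantization table, element by element."""
    # Both inputs are in zigzag order, so both must be reordered.
    return dezigzag(coeffs_1d) * dezigzag(qtable_1d)
```

Forgetting to reorder the quantization table multiplies most coefficients by the wrong step size, which is the kind of bug that lowers PSNR sharply while still producing a recognizable image.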

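Technical highlights

C++ implementation

BitStream handling: a 32-bit buffer that correctly handles byte stuffing (0xFF 0x00)
Huffman decoding: fast lookup using a hash map
IDCT: a standard 8×8 inverse discrete cosine transform
pybind11 bindings: zero-copy data transfer and an efficient Python/C++ interface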
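
To make the first two points concrete, here is a small Python sketch of a bit reader that removes stuffed bytes, plus a Huffman lookup keyed on `(code length, code)`. It illustrates the ideas only and is not the project's C++ code: the class and function names are hypothetical, Python's unbounded integers stand in for the 32-bit buffer, and a `dict` stands in for the hash map.

```python
class BitReader:
    """Reads bits MSB-first from JPEG entropy-coded data."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0      # index of the next unread byte
        self.buffer = 0   # unread bits, most significant first
        self.count = 0    # number of valid bits in the buffer

    def _fill(self) -> None:
        if self.pos >= len(self.data):
            raise EOFError("ran out of entropy-coded data")
        byte = self.data[self.pos]
        self.pos += 1
        if byte == 0xFF:
            # 0xFF 0x00 encodes a literal 0xFF data byte; the 0x00 is dropped.
            if self.pos < len(self.data) and self.data[self.pos] == 0x00:
                self.pos += 1
            else:
                raise ValueError("marker found inside entropy-coded data")
        self.buffer = (self.buffer << 8) | byte
        self.count += 8

    def read_bits(self, n: int) -> int:
        while self.count < n:
            self._fill()
        self.count -= n
        value = self.buffer >> self.count
        self.buffer &= (1 << self.count) - 1  # keep only the unread bits
        return value


def decode_symbol(reader: BitReader, table: dict) -> int:
    """Read one bit at a time until (length, code) matches a table entry."""
    code = 0
    for length in range(1, 17):  # JPEG Huffman codes are at most 16 bits
        code = (code << 1) | reader.read_bits(1)
        if (length, code) in table:
            return table[(length, code)]
    raise ValueError("invalid Huffman code")
```

The code length is part of the key because the same bit pattern can be a valid code at one length and only a prefix at another; for example, `0` as a 1-bit code differs from `00` as a 2-bit code.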

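NumPy implementation

Vectorized IDCT: computation accelerated with matrix operations
Broadcasting: batch processing using NumPy broadcasting
Key fixes: several serious implementation bugs have been resolved
- ✅ Quantization table zigzag ordering bug (PSNR improved from ~15 dB to 35+ dB)
- ✅ Crash during 4:2:0 chroma upsampling (all subsampling modes are now supported)
- ✅ Numerical precision issues (now meets the visually lossless threshold)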
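
The vectorized IDCT and broadcasting points can be illustrated with the sketch below. It assumes the standard orthonormal DCT-II basis matrix, which for 8×8 blocks matches the JPEG IDCT definition, and it uses nearest-neighbor replication for upsampling. The function names are hypothetical, and the project's `numpy_decoder.py` may be organized differently or use another upsampling filter.

```python
import numpy as np


def dct_matrix(n: int = 8) -> np.ndarray:
    """Orthonormal DCT-II basis: rows are frequencies, columns are samples."""
    k = np.arange(n).reshape(-1, 1)
    x = np.arange(n).reshape(1, -1)
    c = np.sqrt(2.0 / n) * np.cos((2 * x + 1) * k * np.pi / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c


C = dct_matrix()


def idct_blocks(coeffs: np.ndarray) -> np.ndarray:
    """Inverse 2-D DCT of an array of shape (..., 8, 8), all blocks at once."""
    # matmul broadcasts over the leading axes, so no Python loop is needed.
    spatial = C.T @ coeffs @ C
    # Undo the level shift of 128 and clamp to the 8-bit range.
    return np.clip(np.rint(spatial + 128.0), 0, 255).astype(np.uint8)


def upsample_nearest(plane: np.ndarray, fy: int, fx: int) -> np.ndarray:
    """Replicate each chroma sample fy times vertically and fx horizontally."""
    return np.repeat(np.repeat(plane, fy, axis=0), fx, axis=1)
```

For 4:2:0 the chroma planes would be upsampled with `fy=2, fx=2`, and for 4:2:2 with `fy=1, fx=2`; a real decoder also has to crop the result to the image size when the dimensions are not multiples of the block size.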

Known limitations

Supported JPEG formats

✅ Supported:

Baseline DCT (SOF0)
Chroma subsampling: 4:4:4, 4:2:0, 4:2:2 ✅ fixed
Huffman coding
Standard quantization tables

❌ Not supported:

Progressive JPEG
Lossless JPEG
Arithmetic coding
JPEG 2000
JPEG-LS
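
Before decoding, it can be useful to check whether a file is Baseline DCT at all. The sketch below walks the marker segments and returns the first SOFn marker it finds. The function names are hypothetical and not part of the `fast_jpeg_decoder` API; the marker values come from the JPEG standard.

```python
import struct
from typing import Optional

# SOFn markers are 0xC0-0xCF, except DHT (0xC4), JPG (0xC8) and DAC (0xCC).
SOF_MARKERS = set(range(0xC0, 0xD0)) - {0xC4, 0xC8, 0xCC}


def find_sof_marker(data: bytes) -> Optional[int]:
    """Return the second byte of the first SOFn marker, or None if not found."""
    if data[:2] != b"\xff\xd8":      # every JPEG stream starts with SOI
        return None
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            return None              # not positioned at a marker
        marker = data[pos + 1]
        if marker == 0xFF:           # fill byte before a marker
            pos += 1
            continue
        if marker in SOF_MARKERS:
            return marker
        if marker in (0xDA, 0xD9):   # SOS or EOI: no frame header seen
            return None
        # Segment length is big-endian and includes its own two bytes.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        pos += 2 + length
    return None


def is_baseline(data: bytes) -> bool:
    """True only when the frame header is SOF0 (Baseline DCT)."""
    return find_sof_marker(data) == 0xC0
```

A progressive file reports `0xC2` and a lossless one `0xC3`, so `is_baseline` returns `False` for both; a JPEG 2000 codestream does not start with the SOI marker and is rejected at the first check.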

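Performance gap to the industry standard

Implementation	Lena (512×512)	vs libjpeg-turbo
This project (C++)	67.50 ms	~13× slower
libjpeg-turbo	~5 ms	Baseline (industry standard)

Room for future optimization:

SIMD instructions (AVX2): expected 4-8× improvement
Integer arithmetic (fixed-point): expected 2-3× improvement
Multithreading (OpenMP): expected improvement close to the number of CPU cores
Lookup tables (LUT): expected 1.5-2× improvement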
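
To show what the fixed-point item means in practice, here is a sketch of the YCbCr → RGB step written with integer multiplies and shifts, next to a floating-point reference. It assumes the usual JFIF conversion coefficients and a 16-bit fractional scale; the names are hypothetical. It demonstrates the arithmetic only: the speed gain quoted above refers to a C++ implementation, and this NumPy version is not expected to reproduce it.

```python
import numpy as np

SHIFT = 16                 # number of fractional bits
HALF = 1 << (SHIFT - 1)    # added before shifting so the result is rounded


def fix(x: float) -> int:
    """Convert a real coefficient to a scaled integer constant."""
    return int(x * (1 << SHIFT) + 0.5)


def ycbcr_to_rgb_fixed(y, cb, cr) -> np.ndarray:
    """Integer-only conversion; inputs are uint8 arrays of equal shape."""
    y = y.astype(np.int32)
    cb = cb.astype(np.int32) - 128
    cr = cr.astype(np.int32) - 128
    r = y + ((fix(1.402) * cr + HALF) >> SHIFT)
    g = y - ((fix(0.344136) * cb + fix(0.714136) * cr + HALF) >> SHIFT)
    b = y + ((fix(1.772) * cb + HALF) >> SHIFT)
    return np.clip(np.stack([r, g, b], axis=-1), 0, 255).astype(np.uint8)


def ycbcr_to_rgb_float(y, cb, cr) -> np.ndarray:
    """Floating-point reference using the same coefficients."""
    y = y.astype(np.float64)
    cb = cb.astype(np.float64) - 128.0
    cr = cr.astype(np.float64) - 128.0
    r = y + 1.402 * cr
    g = y - 0.344136 * cb - 0.714136 * cr
    b = y + 1.772 * cb
    rgb = np.rint(np.stack([r, g, b], axis=-1))
    return np.clip(rgb, 0, 255).astype(np.uint8)
```

The two versions can differ by at most one intensity level per channel because they round at slightly different points, which is well within the precision of 8-bit output.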

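Usage recommendations

✅ Recommended use cases

Learning how JPEG works: clear code and complete documentation
Performance comparison studies: a concrete C++ vs Python case
Prototyping: quickly validating JPEG-related algorithms
Teaching: understanding image compression techniques

⚠️ Not recommended

Production environments: use a mature library instead (libjpeg-turbo, PIL/Pillow)
Full JPEG support: this project supports Baseline DCT only
Performance-critical applications: the NumPy implementation is slow (about 4.4× slower than the C++ version)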

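Documentation

README.md: project overview and quick start (this file)
BENCHMARK_RESULTS.md: detailed performance results and correctness verification

Development guidelines

Follow the C++17 standard
Follow PEP 8 for Python code
Add unit tests covering new features
Update the related documentation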

References

JPEG standard

ITU-T Recommendation T.81 - the JPEG standard
ISO/IEC 10918-1:1994 - the same standard, as published by ISO/IEC

License

This project is released under the MIT License. See the LICENSE file for details.
