🖼️ imgshape

The Data-Centric AI Toolkit for Vision Engineers

"Automatically analyze any image dataset and get model-ready preprocessing recommendations in one command."

🚀 Live Demo (Web) • 📖 Documentation • 💬 Report Bug / Discuss

⚡ 30-Second Start

Don't guess your dataset's health. Audit it immediately with the Atlas engine.

pip install imgshape

from imgshape import Atlas

# 1. Initialize the Atlas Orchestrator
atlas = Atlas()

# 2. Extract deterministic fingerprint
result = atlas.extract_fingerprint("./my_dataset")

# 3. View the verdict
print(result.summary())

System Output:

{
  "fingerprint_id": "fp_8a7d9f2",
  "total_images": 4502,
  "corrupt_files": 12,
  "metrics": {
    "avg_resolution": "1024x768",
    "diversity_score": 0.89,
    "channel_consistency": "FAIL"
  },
  "issues": ["Found 14 grayscale images in RGB dataset"]
}
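
A summary like this is easy to act on from a script. The sketch below is a minimal example that assumes the summary is available as a JSON string with exactly the fields shown above. The `audit_passes` helper and its `max_corrupt` threshold are hypothetical names introduced for illustration; they are not part of imgshape.

```python
import json

# Sample summary copied from the output above. In practice it would come from
# result.summary(), which is assumed here to be JSON-serializable.
SUMMARY_JSON = """
{
  "fingerprint_id": "fp_8a7d9f2",
  "total_images": 4502,
  "corrupt_files": 12,
  "metrics": {
    "avg_resolution": "1024x768",
    "diversity_score": 0.89,
    "channel_consistency": "FAIL"
  },
  "issues": ["Found 14 grayscale images in RGB dataset"]
}
"""


def audit_passes(summary: dict, max_corrupt: int = 0) -> bool:
    """Hypothetical gate: pass only if corrupt files stay within budget
    and the channel consistency check did not fail."""
    too_many_corrupt = summary.get("corrupt_files", 0) > max_corrupt
    channels_failed = summary.get("metrics", {}).get("channel_consistency") == "FAIL"
    return not (too_many_corrupt or channels_failed)


summary = json.loads(SUMMARY_JSON)
corrupt_ratio = summary["corrupt_files"] / summary["total_images"]

print(f"corrupt ratio: {corrupt_ratio:.2%}")
print("audit passed" if audit_passes(summary) else "audit failed")
for issue in summary["issues"]:
    print(" -", issue)
```

Keeping the gate as a pure function of the parsed summary makes it easy to unit test and to tune the threshold per project.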

🔍 The Visual Dashboard (Atlas UI)

Experience imgshape's capabilities visually. The dashboard provides a real-time interface for dataset fingerprinting, augmentation previews, and pipeline configuration.

Dashboard v4.1.0 showing GPU acceleration status and drift detection.

🚀 Why imgshape?

Many vision models fail because of bad data: corrupt files, mixed channels (RGBA vs. RGB), or unusual aspect ratios. imgshape catches these problems before you train, using a deterministic rule engine.

| Module | Technical Function |
| --- | --- |
| 🔍 Instant Audit | Multi-threaded + GPU-accelerated scan for entropy, blur, and variance using `PyTorch`. |
| 🧠 Decision Engine | Heuristic-based suggestion engine with Provenance IDs and Reproducibility Hashes. |
| 📊 Comparison Layer | NEW: Drift Analysis and Similarity Indexing between dataset versions. |
| 🛠️ Pipeline Export | Generates serialization-safe code for PyTorch, TensorFlow, and Albumentations. |
| 🎨 Visual Studio | Local Web Dashboard for interactive augmentation testing and hypothesis verification. |
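
To make the audit metrics concrete, here is a minimal pure-Python sketch of two of them, entropy and variance, computed over 8-bit grayscale intensities. It is an illustration of the quantities only, not imgshape's implementation (which the table describes as PyTorch-based); the function names are assumptions, and blur is not covered.

```python
import math
from collections import Counter
from typing import Sequence


def shannon_entropy(pixels: Sequence[int]) -> float:
    """Shannon entropy, in bits, of a sequence of intensity values."""
    if not pixels:
        return 0.0
    total = len(pixels)
    counts = Counter(pixels)
    # Sum p * log2(1/p) over the observed intensity levels.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def variance(pixels: Sequence[int]) -> float:
    """Population variance of intensity values."""
    if not pixels:
        return 0.0
    mean = sum(pixels) / len(pixels)
    return sum((p - mean) ** 2 for p in pixels) / len(pixels)


flat = [128] * 16        # uniform gray patch: no information, no contrast
checker = [0, 255] * 8   # two equally likely levels: 1 bit, high contrast

print(shannon_entropy(flat), variance(flat))
print(shannon_entropy(checker), variance(checker))
```

A near-zero entropy or variance typically flags blank or near-constant images, which is why such statistics are useful inputs for an audit.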

📦 Installation Matrix

Choose your deployment flavor.

| Command | Use Case | Size |
| --- | --- | --- |
| `pip install imgshape` | Core / CI/CD | ~12MB |
| `pip install "imgshape[full]"` | Research / Power User | ~45MB |
| `pip install "imgshape[ui]"` | Interactive / Dashboard | ~30MB |
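
After installing, you may want to confirm which version is present and which extras the package declares. The sketch below uses only the standard library's `importlib.metadata`; the helper names are hypothetical, and the list of extras it prints depends on the package metadata actually installed.

```python
from importlib import metadata
from typing import List, Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def declared_extras(dist_name: str) -> List[str]:
    """Return the extras a distribution declares (empty if not installed)."""
    try:
        meta = metadata.metadata(dist_name)
    except metadata.PackageNotFoundError:
        return []
    return meta.get_all("Provides-Extra") or []


version = installed_version("imgshape")
if version is None:
    print("imgshape is not installed")
else:
    print(f"imgshape {version}, extras declared: {declared_extras('imgshape')}")
```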

💡 Practical Use Cases

1. The "Sanity Check" (CI/CD Integration)

Block bad data from entering your training bucket. Ideal for GitHub Actions or Jenkins.

# Returns exit code 1 if corrupt files or schema violations are found
imgshape --check ./new_batch_v2 --strict-schema
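
If your pipeline is driven from Python rather than a shell step, the same gate can be wrapped with `subprocess`. This is a minimal sketch that relies only on the exit-code behavior described in the comment above; `run_gate` is a hypothetical helper, and the demonstration uses stand-in commands so it runs without imgshape installed.

```python
import subprocess
import sys
from typing import Sequence


def run_gate(command: Sequence[str]) -> bool:
    """Run a check command and report whether it exited with code 0."""
    completed = subprocess.run(command, capture_output=True, text=True)
    if completed.returncode != 0:
        # Surface the tool's own output so the CI log explains the failure.
        print(completed.stdout, end="")
        print(completed.stderr, end="", file=sys.stderr)
    return completed.returncode == 0


# Stand-in commands that simulate a failing and a passing check.
failing = [sys.executable, "-c", "import sys; sys.exit(1)"]
passing = [sys.executable, "-c", "print('ok')"]
print(run_gate(failing), run_gate(passing))

# In a real CI job, the command would mirror the CLI call shown above:
#   ok = run_gate(["imgshape", "--check", "./new_batch_v2", "--strict-schema"])
#   sys.exit(0 if ok else 1)
```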

2. The "Pipeline Builder"

Don't guess augmentation parameters. Let the entropy statistics decide.

# analyze -> recommend -> export PyTorch snippet
imgshape --path ./train_data --analyze --recommend --out transforms.py

3. The "Visual Explorer"

Verify RandomCrop or ColorJitter intensity manually before training.

# Launches local studio with auto-reload
imgshape --web --reload

🏗️ Architecture & Internal Mechanics

imgshape (Aurora Engine) runs a Fingerprint-Analyze-Decide loop and acts as middleware between raw storage and compute.

graph TD
    subgraph "Data Layer"
    A[Raw Images]
    end

    subgraph "imgshape Core (Atlas)"
    B[Fingerprint Extractor] -->|Hash & Meta| C{Decision Engine}
    C -->|Rules v4.0| D[Recommendation]
    end

    subgraph "Integration Layer"
    D --> E[PyTorch/TF Code]
    D --> F[JSON Artifacts]
    D --> G[HTML/PDF Reports]
    end

    A --> B

Core Components

Atlas Orchestrator: The central intent-driven API that manages the lifecycle of an analysis session.
Fingerprint Extractor: A stateless module that computes immutable signatures for datasets (distributions, channel counts, hashes).
Decision Engine: A rule-based system that maps dataset signatures and user intent (e.g., "Speed" vs. "Accuracy") to concrete preprocessing steps.
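
The components above can be sketched in a few lines. The example below is a simplified illustration under stated assumptions, not imgshape's actual code: `dataset_hash`, `Fingerprint`, `recommend`, the step names, and the 224/384 pixel targets are all hypothetical. It shows a stateless, deterministic hash over a dataset directory and a small rule function that maps a fingerprint plus an intent to preprocessing steps.

```python
import hashlib
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Dict, List


def dataset_hash(root: Path) -> str:
    """Deterministic SHA-256 over relative paths and file contents."""
    files = [p for p in root.rglob("*") if p.is_file()]
    digest = hashlib.sha256()
    # Sort by relative POSIX path so traversal order cannot change the result.
    for path in sorted(files, key=lambda p: p.relative_to(root).as_posix()):
        digest.update(path.relative_to(root).as_posix().encode("utf-8"))
        digest.update(b"\0")
        digest.update(hashlib.sha256(path.read_bytes()).digest())
    return digest.hexdigest()


@dataclass(frozen=True)
class Fingerprint:
    dataset_hash: str
    channel_counts: Dict[int, int]  # channels per image -> number of images
    median_side: int                # median longer side, in pixels


def recommend(fp: Fingerprint, intent: str) -> List[str]:
    """Hypothetical rules: same fingerprint and intent always give the same steps."""
    steps: List[str] = []
    if len(fp.channel_counts) > 1:
        steps.append("convert_all_to_rgb")
    target = 224 if intent == "speed" else 384  # illustrative thresholds
    if fp.median_side > target:
        steps.append(f"resize_longer_side_to_{target}")
    steps.append("normalize")
    return steps


# Demonstration on a throwaway directory with made-up contents.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.bin").write_bytes(b"\x00\x01")
    (root / "b.bin").write_bytes(b"\x02\x03")
    first = dataset_hash(root)
    second = dataset_hash(root)

fp = Fingerprint(first, {3: 100, 1: 5}, 1024)  # illustrative values
print(first == second)
print(recommend(fp, "speed"))
print(recommend(fp, "accuracy"))
```

Keeping the extractor stateless and the rules pure is what makes the output reproducible: the recommendation depends only on the fingerprint and the stated intent.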

🤝 Community & Support

Issues: Found a bug? Open an issue.
Discussions: Feature requests? Join the discussion.

Built by Stifler for the AI Engineering community.

Star on GitHub ⭐ to help more people find clean data.
