Apeiron

A high-performance, GPU-accelerated binary entropy and complexity visualizer using Hilbert curves and advanced signal analysis techniques.

Apeiron provides interactive visual analysis of binary files through seven visualization modes, helping identify patterns, encrypted regions, compressed data, and structural anomalies in executables, firmware, and other binary formats.


Features

Seven Visualization Modes: Comprehensive binary analysis from entropy to multi-scale complexity
GPU Acceleration: Hardware-accelerated rendering via wgpu compute shaders (WGSL)
Portable SIMD: High-performance algorithms using the wide crate (AVX2 on x86_64, NEON on ARM64)
Progressive Rendering: Large files (100MB+) show a coarse preview immediately, then refine in the background
Memory-Mapped I/O: Efficient handling of multi-gigabyte files without loading into RAM
Tiered Compression: Adaptive Kolmogorov complexity using XZ (LZMA2) and Zstd based on file size
Interactive Hex Inspector: Synchronized hex view with Hilbert curve region highlighting
Real-time Analysis: Hover over any region to see detailed byte-level analysis
Hilbert Curve Mapping: Space-filling curve preserves locality - nearby bytes appear as nearby pixels
Pan & Zoom: Explore large files with smooth navigation
Cross-Platform: macOS (Apple Silicon & Intel), Linux, and Windows

Visualization Modes

Hilbert Curve (HIL)

Maps file bytes to a Hilbert curve with forensic color coding based on byte characteristics:

Blue: Null bytes / padding / zeroes
Cyan: ASCII text regions
Green: Code / machine instructions
Red/Orange: High entropy (compressed/encrypted data)

The Hilbert curve preserves spatial locality, meaning bytes that are close together in the file appear close together in the visualization.

Similarity Matrix (SIM)

A recurrence plot from nonlinear dynamics theory. Each pixel (x,y) shows the similarity between the byte windows at positions x and y:

Diagonal lines: Repeating patterns or sequences
Vertical/horizontal lines: Laminar states (unchanged regions)
Checkerboard patterns: Periodic structures

Uses SIMD-accelerated chi-squared distance with branchless division for real-time computation.
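
As a rough illustration of the distance behind each pixel, here is a scalar sketch of a symmetric chi-squared distance between two byte-window histograms. The function names, the window handling, and the choice of the sum of (a - b)^2 / (a + b) variant are assumptions for illustration, not Apeiron's actual SIMD code. Clamping the denominator to at least 1 is one way to avoid a branch on empty bins.

```rust
/// Count how often each byte value occurs in a window.
fn byte_histogram(window: &[u8]) -> [u32; 256] {
    let mut counts = [0u32; 256];
    for &b in window {
        counts[b as usize] += 1;
    }
    counts
}

/// Symmetric chi-squared distance: sum over bins of (a - b)^2 / (a + b).
/// When a + b == 0 the numerator is also 0, so clamping the denominator
/// to 1 yields a zero term without branching on empty bins.
fn chi_squared_distance(a: &[u32; 256], b: &[u32; 256]) -> f64 {
    a.iter()
        .zip(b.iter())
        .map(|(&x, &y)| {
            let diff = x as f64 - y as f64;
            let denom = (x + y).max(1) as f64;
            diff * diff / denom
        })
        .sum()
}

/// One matrix pixel: distance between the windows starting at x and y.
/// Both offsets must be <= data.len(); windows are truncated at the end.
fn similarity_pixel(data: &[u8], x: usize, y: usize, window: usize) -> f64 {
    let wx = &data[x..(x + window).min(data.len())];
    let wy = &data[y..(y + window).min(data.len())];
    chi_squared_distance(&byte_histogram(wx), &byte_histogram(wy))
}
```

Identical windows give a distance of 0, so the main diagonal of the matrix is always at the "most similar" end of the scale.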

Byte Digraph (DIG)

A 256x256 heatmap showing byte transition frequencies. X-axis is the source byte value, Y-axis is the following byte value:

Bright regions: Frequently occurring byte pairs
Dark regions: Rare or absent transitions
Clusters: Reveal character set usage (ASCII, Unicode, binary patterns)

Computed with parallel thread-local histograms and SIMD merge operations.
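
The counting step can be sketched in a few lines. This is a single-threaded version for clarity; the function name and the row-major layout (row = following byte, column = source byte) are assumptions, and the thread-local histograms and SIMD merge described above are omitted.

```rust
/// Build a 256x256 table of byte-pair counts.
/// Index = following_byte * 256 + source_byte, so the source byte is the
/// X coordinate and the following byte is the Y coordinate.
fn byte_digraph(data: &[u8]) -> Vec<u32> {
    let mut counts = vec![0u32; 256 * 256];
    for pair in data.windows(2) {
        let source = pair[0] as usize;
        let following = pair[1] as usize;
        counts[following * 256 + source] += 1;
    }
    counts
}
```

A parallel version would typically give each thread its own table over a slice of the file and add the tables together at the end, taking care to count the pair that straddles each slice boundary.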

Byte Phase Space (PHS)

Plots byte[i] vs byte[i+1] for all sequential bytes, colored by file position:

Shows the file's "attractor" in phase space
Reveals underlying data structure and patterns
Position coloring shows how patterns evolve through the file
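
A minimal sketch of how the plotted points might be generated: each consecutive byte pair becomes an (x, y) coordinate, tagged with a normalized file position in [0, 1] that a renderer could map to a color. The function name and the normalization are illustrative assumptions.

```rust
/// Produce (byte[i], byte[i+1], t) for every consecutive pair, where t is
/// the pair's relative position in the file (0.0 = start, 1.0 = end).
fn phase_points(data: &[u8]) -> Vec<(u8, u8, f32)> {
    let pairs = data.len().saturating_sub(1);
    data.windows(2)
        .enumerate()
        .map(|(i, w)| {
            let t = if pairs > 1 {
                i as f32 / (pairs - 1) as f32
            } else {
                0.0
            };
            (w[0], w[1], t)
        })
        .collect()
}
```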

Kolmogorov Complexity (KOL)

Approximates algorithmic complexity using a tiered compression system that adapts to file size:

Purple/Blue: Low complexity - highly compressible (nulls, repetitive data)
Teal/Green: Medium complexity - structured data
Yellow/Orange: High complexity - compressed or complex data
Red/Pink: Maximum complexity - encrypted or truly random data

Uses XZ (LZMA2) for small/medium files (best compression ratio) and Zstd for large files (high throughput).

Jensen-Shannon Divergence (JSD)

Measures how much each region's byte distribution diverges from the file's overall distribution:

Blue/Green: Normal regions matching file's typical byte distribution
Yellow/Orange: Anomalous regions with unusual byte patterns
Red: Highly anomalous - encrypted, compressed, or foreign data

JSD is symmetric and, with base-2 logarithms, bounded to [0,1], which makes it well suited to detecting embedded or injected content.

Multi-Scale Entropy (MSE)

Refined Composite Multi-Scale Entropy (RCMSE) analysis revealing complexity across multiple time scales:

Blue: Low multi-scale complexity (simple, regular patterns)
Green/Yellow: Medium complexity (structured data)
Orange/Red: High complexity across scales (complex or random data)

MSE distinguishes between different types of complexity - truly random data vs. complex but structured data.

Uses an optimized histogram-based fast approximation with O(n) complexity instead of O(n^2).

Interactive Hex Inspector

The right panel provides a synchronized hex view that:

Shows bytes at the current cursor position
Highlights the visible hex region on the Hilbert curve visualization
Displays offset in both hex and decimal
Shows ASCII representation alongside hex values
Scrolls through the file with the visualization

Installation

Pre-built Binaries

Download the latest release for your platform from the Releases page:

apeiron-macos-arm64 - macOS Apple Silicon (M1/M2/M3)
apeiron-macos-x86_64 - macOS Intel
apeiron-linux-x86_64 - Linux x86_64
apeiron-windows-x86_64.exe - Windows x86_64

Building from Source

Requirements:

Rust 1.70+ (install via rustup)
On Linux: libxcb, libxkbcommon, libgtk-3

# Clone the repository
git clone https://github.com/anomalyco/apeiron.git
cd apeiron

# Build release version (with LTO optimization)
cargo build --release

# Run
./target/release/apeiron

Linux Dependencies

# Debian/Ubuntu
sudo apt-get install libxcb-render0-dev libxcb-shape0-dev libxcb-xfixes0-dev libxkbcommon-dev libgtk-3-dev

# Fedora
sudo dnf install libxcb-devel libxkbcommon-devel gtk3-devel

# Arch Linux
sudo pacman -S libxcb libxkbcommon gtk3

Usage

Open a file: Drag and drop any binary file onto the window, or click "Open File..."
Navigate:
- Scroll to zoom in/out
- Click and drag to pan
- Hover over regions to inspect bytes
Switch modes: Use the Mode dropdown in the toolbar
Reset view: Click "Reset View" to fit the visualization to the window
Analyze: Review the entropy and complexity metrics in the right panel

Controls

Action	Control
Zoom	Scroll wheel
Pan	Click and drag
Inspect	Hover over pixels
Open file	Drag & drop or "Open File..." button
Reset view	"Reset View" button
Help	"Help" button

Data Inspector Panel

The right panel shows detailed information about the currently hovered byte position:

File Information

File Type: Auto-detected via magic bytes (PE, ELF, Mach-O, ZIP, PDF, etc.)
File Size: Human-readable size

Cursor Location

Offset (Hex): Current byte position in hexadecimal
Offset (Dec): Current byte position in decimal

Entropy Analysis

Entropy: Shannon entropy (0-8 bits) with visual bar
Interpretation: Low / Medium / High entropy classification

Kolmogorov Complexity

Complexity: Compression ratio percentage
Interpretation: Simple / Structured / Complex / Random

Hex View

Interactive hex dump with ASCII representation
Scrollable through entire file
Current position highlighted
Region outline synced with visualization

Technical Details

Entropy Calculation

Shannon entropy is calculated over a sliding window using SIMD-accelerated histogram counting:

H = -Σ p(x) * log₂(p(x))

where p(x) is the probability of byte value x in the window. The result ranges from 0 bits (every byte in the window has the same value) to 8 bits (all 256 byte values equally likely).

Optimizations:

4-way parallel histogram counting to avoid cache contention
Cache-aligned (64-byte) histogram buffers
True SIMD log2 approximation using IEEE 754 bit manipulation
Dual accumulators for instruction-level parallelism
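
For reference, the entropy formula above can be written as a plain scalar function. This sketch leaves out every optimization listed (parallel histograms, SIMD log2, dual accumulators) and uses an assumed function name; it is meant only to pin down what is being computed.

```rust
/// Shannon entropy of a byte window, in bits per byte (0.0 to 8.0).
fn shannon_entropy(window: &[u8]) -> f64 {
    if window.is_empty() {
        return 0.0;
    }
    // Histogram of byte values in the window.
    let mut counts = [0u32; 256];
    for &b in window {
        counts[b as usize] += 1;
    }
    let n = window.len() as f64;
    // H = -sum p(x) * log2(p(x)), skipping values that never occur.
    counts
        .iter()
        .filter(|&&c| c > 0)
        .map(|&c| {
            let p = c as f64 / n;
            -p * p.log2()
        })
        .sum()
}
```

A window of identical bytes yields 0 bits, a window split evenly between two values yields 1 bit, and a window containing each of the 256 values equally often yields 8 bits.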

Kolmogorov Complexity Approximation

Complexity is approximated using a tiered compression system that adapts to file/chunk size:

Tier	Size Range	Algorithm	Throughput
Streaming	<4KB	Zstd -1	~300 MB/s
1	4KB-1MB	XZ -9	~0.9 MB/s
2	1MB-64MB	XZ -6	~1.1 MB/s
3	64MB-1GB	Zstd -19	~1.2 MB/s
4	1GB-16GB	Zstd -8	~36 MB/s
5	16GB-100GB	Zstd -1	~233 MB/s

Pre-computed on demand when switching to Kolmogorov mode (sampled every 64 bytes with 128-byte windows).

Optimizations:

XZ (LZMA2) for maximum compression ratio on small/medium data
Zstd for high throughput on large files
Background streaming computation with progress updates
Lazy computation: only computed when user switches to KOL mode
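
The tier table can be read as a simple size-to-codec lookup. The sketch below makes several assumptions that the table does not state: the "-N" entries are read as CLI-style level flags (so "Zstd -19" means level 19), units are binary (1KB = 1024 bytes), and each range includes its lower bound and excludes its upper bound, except the last, which includes 100GB. The type and function names are hypothetical, and sizes above 100GB return None because the table does not cover them.

```rust
/// Compression choice for one tier (names are illustrative only).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Codec {
    Zstd(i32), // zstd compression level
    Xz(u32),   // xz preset
}

/// Pick a codec from the input size in bytes, following the tier table.
fn select_codec(size: u64) -> Option<Codec> {
    const KB: u64 = 1024;
    const MB: u64 = 1024 * KB;
    const GB: u64 = 1024 * MB;
    match size {
        s if s < 4 * KB => Some(Codec::Zstd(1)),      // Streaming
        s if s < MB => Some(Codec::Xz(9)),            // Tier 1
        s if s < 64 * MB => Some(Codec::Xz(6)),       // Tier 2
        s if s < GB => Some(Codec::Zstd(19)),         // Tier 3
        s if s < 16 * GB => Some(Codec::Zstd(8)),     // Tier 4
        s if s <= 100 * GB => Some(Codec::Zstd(1)),   // Tier 5
        _ => None, // not covered by the table
    }
}
```

The shape of the table follows from the throughput column: the slowest, highest-ratio settings are reserved for inputs small enough that their cost stays tolerable, and the level drops as the input grows.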

Jensen-Shannon Divergence

JSD between window distribution P and file distribution Q:

JSD(P||Q) = ½ D_KL(P||M) + ½ D_KL(Q||M)

where M = ½(P + Q) and D_KL is Kullback-Leibler divergence.

Optimizations:

SIMD f64x4 for 256-element distribution operations
Fused mixture + KL computation reducing memory passes
Dual accumulators for better ILP
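
A scalar sketch of the JSD formula above, using base-2 logarithms so the result falls in [0, 1]. The function names are assumptions, and the loop computes the mixture M and both KL terms in a single pass, loosely mirroring the fused approach mentioned in the optimizations, without any SIMD.

```rust
/// Normalized byte-value distribution of a buffer (all zeros if empty).
fn byte_distribution(data: &[u8]) -> [f64; 256] {
    let mut dist = [0.0f64; 256];
    if data.is_empty() {
        return dist;
    }
    for &b in data {
        dist[b as usize] += 1.0;
    }
    let n = data.len() as f64;
    for p in dist.iter_mut() {
        *p /= n;
    }
    dist
}

/// JSD(P||Q) = 1/2 D_KL(P||M) + 1/2 D_KL(Q||M), with M = 1/2 (P + Q).
fn jensen_shannon_divergence(p: &[f64; 256], q: &[f64; 256]) -> f64 {
    let mut acc = 0.0;
    for i in 0..256 {
        let m = 0.5 * (p[i] + q[i]);
        // Terms with zero probability contribute nothing (0 * log 0 = 0).
        if p[i] > 0.0 {
            acc += 0.5 * p[i] * (p[i] / m).log2();
        }
        if q[i] > 0.0 {
            acc += 0.5 * q[i] * (q[i] / m).log2();
        }
    }
    acc
}
```

Two identical distributions give 0, and two distributions with no byte values in common give exactly 1.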

Multi-Scale Entropy (RCMSE)

Refined Composite Multi-Scale Entropy using a fast histogram-based approximation:

Compute byte histograms for pattern counting (O(n) instead of O(n²))
Sample sparse scales [1, 3, 6] instead of all scales 1-6
Aggregate into complexity score

Optimizations:

Histogram-based pattern matching (~50-100x speedup)
O(n × 3) complexity instead of O(n² × 36)
Lazy computation: only computed when user switches to MSE mode
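
The steps above do not spell out the exact approximation, so the following is only a stand-in that follows their general shape: coarse-grain the data at scales 1, 3, and 6 by averaging non-overlapping blocks, compute a histogram entropy at each scale in O(n), and average the results. The function names, the averaging-based coarse-graining, and the use of Shannon entropy in place of sample entropy are all assumptions, not the project's actual RCMSE code.

```rust
const SCALES: [usize; 3] = [1, 3, 6];

/// Shannon entropy (bits) of a stream of byte values, via one histogram pass.
fn histogram_entropy(values: impl Iterator<Item = u8>) -> f64 {
    let mut counts = [0u32; 256];
    let mut total = 0u32;
    for v in values {
        counts[v as usize] += 1;
        total += 1;
    }
    if total == 0 {
        return 0.0;
    }
    let n = total as f64;
    counts
        .iter()
        .filter(|&&c| c > 0)
        .map(|&c| {
            let p = c as f64 / n;
            -p * p.log2()
        })
        .sum()
}

/// Replace each non-overlapping block of `scale` bytes with its integer mean.
fn coarse_grain(data: &[u8], scale: usize) -> impl Iterator<Item = u8> + '_ {
    data.chunks_exact(scale).map(move |chunk| {
        let sum: usize = chunk.iter().map(|&b| b as usize).sum();
        (sum / scale) as u8
    })
}

/// Mean histogram entropy across the sparse scales.
fn multiscale_score(data: &[u8]) -> f64 {
    let total: f64 = SCALES
        .iter()
        .map(|&s| histogram_entropy(coarse_grain(data, s)))
        .sum();
    total / SCALES.len() as f64
}
```

Even this crude version shows why multiple scales help: data alternating between 0 and 255 looks varied at scale 1, but at scale 6 every block averages to the same value and the entropy collapses to zero, whereas random data stays varied at every scale.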

Hilbert Curve

The Hilbert curve dimension n (the side length of the grid in pixels) is chosen as the smallest power of 2 such that n² >= file_size. This ensures all bytes can be mapped while maintaining the locality-preserving property.

Optimizations:

Precomputed lookup tables for dimensions 64, 128, 256, 512 (O(1) access)
Lazy initialization with OnceLock
Batch conversion functions for SIMD-friendly processing
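
The dimension rule and the d2xy transform can be sketched as follows. The d2xy function here is the widely used iterative formulation of the Hilbert index-to-coordinate mapping; the function names are assumptions, and the lookup tables and batch conversion mentioned above are not shown.

```rust
/// Smallest power-of-two side length n such that n * n >= file_size.
fn hilbert_dimension(file_size: u64) -> u64 {
    let mut n: u64 = 1;
    // Compare in u128 so n * n cannot overflow for very large sizes.
    while (n as u128) * (n as u128) < file_size as u128 {
        n *= 2;
    }
    n
}

/// Map distance d along the curve to (x, y) on an n x n grid.
/// n must be a power of two and d must be less than n * n.
fn d2xy(n: u64, d: u64) -> (u64, u64) {
    let (mut x, mut y) = (0u64, 0u64);
    let mut t = d;
    let mut s = 1u64;
    while s < n {
        let rx = 1 & (t / 2);
        let ry = 1 & (t ^ rx);
        // Rotate or flip the current sub-square so quadrants join up.
        if ry == 0 {
            if rx == 1 {
                x = s - 1 - x;
                y = s - 1 - y;
            }
            std::mem::swap(&mut x, &mut y);
        }
        x += s * rx;
        y += s * ry;
        t /= 4;
        s *= 2;
    }
    (x, y)
}
```

On a 2x2 grid this visits (0,0), (0,1), (1,1), (1,0) in order. At any size, consecutive offsets land on adjacent pixels, which is the locality property the visualization relies on.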

GPU Acceleration

When available, visualization rendering uses wgpu compute shaders (WGSL) for parallel pixel generation:

Hilbert: Computes d2xy transform and byte analysis per pixel
Digraph: Parallel frequency counting with atomic operations
Phase Space: Trajectory accumulation with position coloring
Similarity Matrix: Chi-squared distance computation

Falls back to CPU (with rayon parallelization) for modes requiring CPU-side computation (KOL, JSD, MSE) or when GPU is unavailable.

Progressive Rendering

Files larger than 100MB use a two-phase rendering approach:

Coarse pass: ~10K hierarchical samples for instant preview
Fine pass: Full precision sequential computation in background

The main thread reads computed values lock-free while the background thread refines data progressively.
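
One way to get lock-free reads during refinement is to store each computed value as an atomic integer holding the bit pattern of an f32. The sketch below shows that idea with a coarse pass that fills the map from sparse samples and a background thread that overwrites every cell with its exact value. All names are hypothetical, the per-cell metric is a placeholder (mean byte value), and this is not necessarily how Apeiron structures its passes.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};

/// Per-cell values stored as f32 bit patterns so readers never take a lock.
struct ProgressiveMap {
    cells: Vec<AtomicU32>,
}

impl ProgressiveMap {
    fn new(len: usize) -> Self {
        Self { cells: (0..len).map(|_| AtomicU32::new(0)).collect() }
    }
    fn len(&self) -> usize {
        self.cells.len()
    }
    fn get(&self, i: usize) -> f32 {
        f32::from_bits(self.cells[i].load(Ordering::Relaxed))
    }
    fn set(&self, i: usize, v: f32) {
        self.cells[i].store(v.to_bits(), Ordering::Relaxed);
    }
}

/// Placeholder metric: mean byte value of one cell of the file.
/// Assumes cell * cell_size <= data.len().
fn cell_value(data: &[u8], cell: usize, cell_size: usize) -> f32 {
    let start = cell * cell_size;
    let end = (start + cell_size).min(data.len());
    let sum: u32 = data[start..end].iter().map(|&b| b as u32).sum();
    sum as f32 / (end - start).max(1) as f32
}

/// Coarse pass: compute every `stride`-th cell (stride >= 1) and copy the
/// result into the cells that follow it, giving an immediate rough preview.
fn coarse_pass(map: &ProgressiveMap, data: &[u8], cell_size: usize, stride: usize) {
    for cell in (0..map.len()).step_by(stride) {
        let v = cell_value(data, cell, cell_size);
        for j in cell..(cell + stride).min(map.len()) {
            map.set(j, v);
        }
    }
}

/// Fine pass: a background thread recomputes every cell sequentially.
fn spawn_fine_pass(
    map: Arc<ProgressiveMap>,
    data: Arc<Vec<u8>>,
    cell_size: usize,
) -> JoinHandle<()> {
    thread::spawn(move || {
        for cell in 0..map.len() {
            map.set(cell, cell_value(&data, cell, cell_size));
        }
    })
}
```

A reader calling get while the fine pass is running sees either the coarse estimate or the refined value for each cell, never a torn write, because each cell is a single atomic word.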

Performance

Large Files: 100MB+ files handled efficiently via viewport-aware rendering and memory-mapped I/O
Lazy Computation: Kolmogorov and RCMSE maps computed on-demand when switching to those modes
GPU Acceleration: Significant speedup for Hilbert, Digraph, Phase Space, and Similarity Matrix modes
Portable SIMD: AVX2 on x86_64, NEON on ARM64 via the wide crate
Texture Caching: Smart regeneration thresholds prevent excessive recomputation during navigation
Memory Efficient: Streaming hex view renders only visible rows; mmap for file access

File Type Detection

Apeiron automatically detects common file types via magic bytes:

Category	Formats
Executables	PE (EXE/DLL), ELF, Mach-O (all variants)
Archives	ZIP, RAR, GZIP, BZIP2, 7-Zip, XZ
Images	PNG, JPEG, GIF, BMP, TIFF
Documents	PDF
Media	MP4/MOV, WAV/AVI (RIFF), MP3
Databases	SQLite
Other	Java CLASS, WebAssembly (WASM)
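
Magic-byte detection amounts to comparing the start of the file against a table of known signatures. The sketch below covers a subset of the formats listed, using their well-known leading bytes. The function name, table layout, and returned labels are assumptions. Formats that need more than a prefix check are left out, for example Java CLASS and fat Mach-O binaries, which share the 0xCAFEBABE prefix, and RIFF containers, whose subtype sits at offset 8.

```rust
/// Return a format label if the data starts with a known signature.
fn detect_file_type(data: &[u8]) -> Option<&'static str> {
    // (signature at offset 0, label); longer signatures need no special order
    // here because none of these prefixes overlap.
    const SIGNATURES: &[(&[u8], &str)] = &[
        (b"MZ", "PE"),
        (b"\x7fELF", "ELF"),
        (b"PK\x03\x04", "ZIP"),
        (b"\x1f\x8b", "GZIP"),
        (b"BZh", "BZIP2"),
        (b"7z\xbc\xaf\x27\x1c", "7-Zip"),
        (b"\xfd7zXZ\x00", "XZ"),
        (b"\x89PNG\r\n\x1a\n", "PNG"),
        (b"GIF8", "GIF"),
        (b"%PDF", "PDF"),
        (b"SQLite format 3\x00", "SQLite"),
        (b"\x00asm", "WASM"),
    ];
    SIGNATURES
        .iter()
        .find(|(magic, _)| data.starts_with(magic))
        .map(|&(_, label)| label)
}
```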

Use Cases

Malware Analysis: Identify packed/encrypted sections, detect suspicious entropy patterns
Firmware Analysis: Find compressed regions, locate file systems, identify anomalies
Forensics: Detect hidden data, identify file fragments, find injected content
Reverse Engineering: Understand binary structure, locate interesting regions
Data Recovery: Locate file boundaries in raw disk images
Security Research: Analyze encryption patterns, study packing techniques
CTF Competitions: Quickly identify steganography, hidden data, or unusual structures

References

Lyda, R., & Hamrock, J. (2007). "Using Entropy Analysis to Find Encrypted and Packed Malware." IEEE Security & Privacy, 5(2), 40-45.
Costa, M., et al. (2002). "Multiscale entropy analysis of complex physiologic time series." Physical Review Letters, 89(6).
Hilbert, D. (1891). "Über die stetige Abbildung einer Linie auf ein Flächenstück." Mathematische Annalen, 38, 459-460.

License

MIT License - See LICENSE for details.

Acknowledgments

Inspired by binary visualization research and tools like binvis.io and Veles
Uses egui for the immediate mode GUI
GPU compute via wgpu
Portable SIMD via wide
Compression via xz2 (LZMA2) and zstd
File dialogs via rfd
Parallel processing via rayon
