A high-performance, GPU-accelerated binary entropy and complexity visualizer using Hilbert curves and advanced signal analysis techniques.
Apeiron provides interactive visual analysis of binary files through seven visualization modes, helping identify patterns, encrypted regions, compressed data, and structural anomalies in executables, firmware, and other binary formats.
- Seven Visualization Modes: Comprehensive binary analysis from entropy to multi-scale complexity
- GPU Acceleration: Hardware-accelerated rendering via wgpu compute shaders (WGSL)
- Portable SIMD: High-performance algorithms using the `wide` crate (AVX2 on x86_64, NEON on ARM64)
- Progressive Rendering: Large files (100MB+) render instantly with background refinement
- Memory-Mapped I/O: Efficient handling of multi-gigabyte files without loading into RAM
- Tiered Compression: Adaptive Kolmogorov complexity using XZ (LZMA2) and Zstd based on file size
- Interactive Hex Inspector: Synchronized hex view with Hilbert curve region highlighting
- Real-time Analysis: Hover over any region to see detailed byte-level analysis
- Hilbert Curve Mapping: Space-filling curve preserves locality - nearby bytes appear as nearby pixels
- Pan & Zoom: Explore large files with smooth navigation
- Cross-Platform: macOS (Apple Silicon & Intel), Linux, and Windows
Maps file bytes to a Hilbert curve with forensic color coding based on byte characteristics:
- Blue: Null bytes / padding / zeroes
- Cyan: ASCII text regions
- Green: Code / machine instructions
- Red/Orange: High entropy (compressed/encrypted data)
The Hilbert curve preserves spatial locality, meaning bytes that are close together in the file appear close together in the visualization.
A recurrence plot from nonlinear dynamics theory. Each pixel (x,y) shows the similarity between the byte window at position x and position y:
- Diagonal lines: Repeating patterns or sequences
- Vertical/horizontal lines: Laminar states (unchanged regions)
- Checkerboard patterns: Periodic structures
Uses SIMD-accelerated chi-squared distance with branchless division for real-time computation.
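
As a rough illustration of the distance behind each pixel, the sketch below compares the byte histograms of two windows with a scalar chi-squared distance. It is not Apeiron's SIMD code: the function names, the normalization to probabilities, and the use of a small epsilon in the denominator (one simple way to avoid a branch on empty bins) are assumptions made for this example.

```rust
/// Normalized byte-value histogram of a window (entries sum to 1 for a non-empty window).
fn histogram(window: &[u8]) -> [f32; 256] {
    let mut h = [0.0f32; 256];
    if window.is_empty() {
        return h;
    }
    let weight = 1.0 / window.len() as f32;
    for &b in window {
        h[b as usize] += weight;
    }
    h
}

/// Chi-squared distance between two histograms: 0.5 * sum((a - b)^2 / (a + b)).
fn chi_squared(a: &[f32; 256], b: &[f32; 256]) -> f32 {
    let mut sum = 0.0f32;
    for i in 0..256 {
        let diff = a[i] - b[i];
        // The epsilon keeps the denominator non-zero, so empty bins need no branch.
        sum += diff * diff / (a[i] + b[i] + f32::EPSILON);
    }
    0.5 * sum
}

/// Similarity of the windows starting at `x` and `y` (1.0 = identical histograms,
/// close to 0.0 = no byte values in common). Panics if a window runs past the data.
fn window_similarity(data: &[u8], x: usize, y: usize, window: usize) -> f32 {
    let hx = histogram(&data[x..x + window]);
    let hy = histogram(&data[y..y + window]);
    1.0 - chi_squared(&hx, &hy)
}
```

Evaluating `window_similarity` for every pair of window offsets yields the recurrence plot. Because the distance is symmetric, only half of the matrix needs to be computed.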
A 256x256 heatmap showing byte transition frequencies. X-axis is the source byte value, Y-axis is the following byte value:
- Bright regions: Frequently occurring byte pairs
- Dark regions: Rare or absent transitions
- Clusters: Reveal character set usage (ASCII, Unicode, binary patterns)
Computed with parallel thread-local histograms and SIMD merge operations.
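
The thread-local histogram idea can be sketched with the standard library alone. The example below is an assumption-laden illustration, not the project's code: `digraph_counts` and its `threads` parameter are hypothetical, it uses `std::thread::scope` rather than rayon, and the merge is a plain loop instead of SIMD.

```rust
use std::thread;

/// Count byte-pair transitions. Entry `src * 256 + dst` is how often `dst`
/// directly follows `src` in `data`.
fn digraph_counts(data: &[u8], threads: usize) -> Vec<u32> {
    let mut total = vec![0u32; 256 * 256];
    if data.len() < 2 {
        return total;
    }
    let threads = threads.max(1);
    let pairs = data.len() - 1; // number of adjacent byte pairs
    let per_thread = (pairs + threads - 1) / threads; // ceiling division

    // Each thread fills its own histogram, so no atomics or locks are needed.
    let partials: Vec<Vec<u32>> = thread::scope(|s| {
        let handles: Vec<_> = (0..threads)
            .map(|t| {
                let start = (t * per_thread).min(pairs);
                let end = ((t + 1) * per_thread).min(pairs);
                // Pairs start..end need bytes start..=end, so chunks overlap by one byte.
                let slice = &data[start..=end];
                s.spawn(move || {
                    let mut local = vec![0u32; 256 * 256];
                    for pair in slice.windows(2) {
                        local[pair[0] as usize * 256 + pair[1] as usize] += 1;
                    }
                    local
                })
            })
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    });

    // Merge the per-thread histograms into the final 256x256 table.
    for local in &partials {
        for (sum, count) in total.iter_mut().zip(local) {
            *sum += *count;
        }
    }
    total
}
```

The one-byte overlap between chunks matters: without it, the transition that straddles each chunk boundary would be dropped.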
Plots byte[i] vs byte[i+1] for all sequential bytes, colored by file position:
- Shows the file's "attractor" in phase space
- Reveals underlying data structure and patterns
- Position coloring shows how patterns evolve through the file
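
A minimal CPU sketch of the accumulation might look like the following. The `PhaseCell` type, the grid layout (x = byte[i], y = byte[i+1]), and the choice to store the mean file position per cell are assumptions for illustration. How a position is turned into a color is left out.

```rust
/// One cell of the 256x256 phase-space grid.
#[derive(Clone, Copy, Default)]
struct PhaseCell {
    /// Number of (byte[i], byte[i+1]) pairs that landed in this cell.
    hits: u32,
    /// Sum of the normalized file positions of those pairs.
    position_sum: f64,
}

impl PhaseCell {
    /// Mean normalized file position (0.0 = start, 1.0 = end), or None for an empty cell.
    fn mean_position(&self) -> Option<f64> {
        if self.hits == 0 {
            None
        } else {
            Some(self.position_sum / self.hits as f64)
        }
    }
}

/// Accumulate every sequential byte pair into a grid indexed as `y * 256 + x`,
/// where x = byte[i] and y = byte[i + 1].
fn phase_space(data: &[u8]) -> Vec<PhaseCell> {
    let mut grid = vec![PhaseCell::default(); 256 * 256];
    // Index of the last pair, clamped to 1 to avoid dividing by zero on tiny inputs.
    let last = data.len().saturating_sub(2).max(1) as f64;
    for (i, pair) in data.windows(2).enumerate() {
        let cell = &mut grid[pair[1] as usize * 256 + pair[0] as usize];
        cell.hits += 1;
        cell.position_sum += i as f64 / last;
    }
    grid
}
```

A renderer could map `hits` to brightness and `mean_position` to hue, so that pairs from the start and the end of the file are visually distinct.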
Approximates algorithmic complexity using a tiered compression system that adapts to file size:
- Purple/Blue: Low complexity - highly compressible (nulls, repetitive data)
- Teal/Green: Medium complexity - structured data
- Yellow/Orange: High complexity - compressed or complex data
- Red/Pink: Maximum complexity - encrypted or truly random data
Uses XZ (LZMA2) for small/medium files (best compression ratio) and Zstd for large files (high throughput).
Measures how much each region's byte distribution diverges from the file's overall distribution:
- Blue/Green: Normal regions matching file's typical byte distribution
- Yellow/Orange: Anomalous regions with unusual byte patterns
- Red: Highly anomalous - encrypted, compressed, or foreign data
Jensen-Shannon divergence (JSD) is symmetric and, with base-2 logarithms, bounded in [0, 1], which makes it well suited to detecting embedded or injected content.
Refined Composite Multi-Scale Entropy (RCMSE) analysis revealing complexity across multiple time scales:
- Blue: Low multi-scale complexity (simple, regular patterns)
- Green/Yellow: Medium complexity (structured data)
- Orange/Red: High complexity across scales (complex or random data)
MSE distinguishes between different types of complexity - truly random data vs. complex but structured data.
Uses an optimized histogram-based fast approximation with O(n) complexity instead of O(n^2).
The right panel provides a synchronized hex view that:
- Shows bytes at the current cursor position
- Highlights the visible hex region on the Hilbert curve visualization
- Displays offset in both hex and decimal
- Shows ASCII representation alongside hex values
- Scrolls through the file with the visualization
Download the latest release for your platform from the Releases page:
- `apeiron-macos-arm64`: macOS Apple Silicon (M1/M2/M3)
- `apeiron-macos-x86_64`: macOS Intel
- `apeiron-linux-x86_64`: Linux x86_64
- `apeiron-windows-x86_64.exe`: Windows x86_64
Requirements:
- Rust 1.70+ (install via rustup)
- On Linux: `libxcb`, `libxkbcommon`, `libgtk-3`
# Clone the repository
git clone https://github.com/anomalyco/apeiron.git
cd apeiron
# Build release version (with LTO optimization)
cargo build --release
# Run
./target/release/apeiron

Linux dependencies:
# Debian/Ubuntu
sudo apt-get install libxcb-render0-dev libxcb-shape0-dev libxcb-xfixes0-dev libxkbcommon-dev libgtk-3-dev
# Fedora
sudo dnf install libxcb-devel libxkbcommon-devel gtk3-devel
# Arch Linux
sudo pacman -S libxcb libxkbcommon gtk3

- Open a file: Drag and drop any binary file onto the window, or click "Open File..."
- Navigate:
- Scroll to zoom in/out
- Click and drag to pan
- Hover over regions to inspect bytes
- Switch modes: Use the Mode dropdown in the toolbar
- Reset view: Click "Reset View" to fit the visualization to the window
- Analyze: Review the entropy and complexity metrics in the right panel
| Action | Control |
|---|---|
| Zoom | Scroll wheel |
| Pan | Click and drag |
| Inspect | Hover over pixels |
| Open file | Drag & drop or "Open File..." button |
| Reset view | "Reset View" button |
| Help | "Help" button |
The right panel shows detailed information about the currently hovered byte position:
- File Type: Auto-detected via magic bytes (PE, ELF, Mach-O, ZIP, PDF, etc.)
- File Size: Human-readable size
- Offset (Hex): Current byte position in hexadecimal
- Offset (Dec): Current byte position in decimal
- Entropy: Shannon entropy (0-8 bits) with visual bar
- Interpretation: Low / Medium / High entropy classification
- Complexity: Compression ratio percentage
- Interpretation: Simple / Structured / Complex / Random
- Interactive hex dump with ASCII representation
- Scrollable through entire file
- Current position highlighted
- Region outline synced with visualization
Shannon entropy is calculated over a sliding window using SIMD-accelerated histogram counting:
H = -Σ p(x) * log₂(p(x))
where p(x) is the probability of byte value x in the window. The result ranges from 0 bits (every byte in the window has the same value) to 8 bits (all 256 byte values are equally likely).
Optimizations:
- 4-way parallel histogram counting to avoid cache contention
- Cache-aligned (64-byte) histogram buffers
- True SIMD log2 approximation using IEEE 754 bit manipulation
- Dual accumulators for instruction-level parallelism
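
For reference, a plain scalar version of the formula above fits in a few lines. This sketch leaves out every optimization listed (parallel histograms, aligned buffers, SIMD log2) and only shows what is being computed. The function name is chosen for this example.

```rust
/// Shannon entropy of a byte window in bits per byte, in the range 0.0..=8.0.
fn shannon_entropy(window: &[u8]) -> f64 {
    if window.is_empty() {
        return 0.0;
    }
    // Count how often each of the 256 possible byte values occurs.
    let mut counts = [0u32; 256];
    for &b in window {
        counts[b as usize] += 1;
    }
    let n = window.len() as f64;
    // H = -sum p(x) * log2(p(x)); values that never occur contribute nothing.
    counts
        .iter()
        .filter(|&&c| c > 0)
        .map(|&c| {
            let p = c as f64 / n;
            -p * p.log2()
        })
        .sum()
}
```

A window of identical bytes scores 0 bits, a window split evenly between two values scores 1 bit, and a window containing every byte value equally often scores 8 bits.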
Complexity is approximated using a tiered compression system that adapts to file/chunk size:
| Tier | Size Range | Algorithm | Throughput |
|---|---|---|---|
| Streaming | <4KB | Zstd -1 | ~300 MB/s |
| 1 | 4KB-1MB | XZ -9 | ~0.9 MB/s |
| 2 | 1-64MB | XZ -6 | ~1.1 MB/s |
| 3 | 64MB-1GB | Zstd -19 | ~1.2 MB/s |
| 4 | 1-16GB | Zstd -8 | ~36 MB/s |
| 5 | 16-100GB | Zstd -1 | ~233 MB/s |
Computed on demand when switching to Kolmogorov mode, sampled every 64 bytes with 128-byte windows.
Optimizations:
- XZ (LZMA2) for maximum compression ratio on small/medium data
- Zstd for high throughput on large files
- Background streaming computation with progress updates
- Lazy computation: only computed when user switches to KOL mode
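
The tier table can be read as a simple size-to-settings lookup, sketched below. Several details are assumptions rather than facts from the project: the `Tier` and `Codec` types and the `select_tier` function are hypothetical, sizes use binary units (1KB = 1024 bytes), each range excludes its upper bound except the last, and "-N" is read as compression level N in command-line flag style. The sketch does not call any compression library.

```rust
const KB: u64 = 1024;
const MB: u64 = 1024 * KB;
const GB: u64 = 1024 * MB;

/// Compression backend named in the tier table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Codec {
    Zstd,
    Xz,
}

/// Settings for one row of the tier table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Tier {
    name: &'static str,
    codec: Codec,
    level: u32,
}

/// Pick the tier for an input of `size` bytes, or None above the largest listed range.
fn select_tier(size: u64) -> Option<Tier> {
    let (name, codec, level) = match size {
        s if s < 4 * KB => ("Streaming", Codec::Zstd, 1),
        s if s < MB => ("1", Codec::Xz, 9),
        s if s < 64 * MB => ("2", Codec::Xz, 6),
        s if s < GB => ("3", Codec::Zstd, 19),
        s if s < 16 * GB => ("4", Codec::Zstd, 8),
        s if s <= 100 * GB => ("5", Codec::Zstd, 1),
        _ => return None,
    };
    Some(Tier { name, codec, level })
}
```

Whichever codec is chosen, the complexity estimate for a region is the ratio of compressed size to original size: values near 0 indicate highly compressible data, values near 1 indicate data the compressor could not shrink.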
JSD between window distribution P and file distribution Q:
JSD(P||Q) = ½ D_KL(P||M) + ½ D_KL(Q||M)
where M = ½(P + Q) and D_KL is Kullback-Leibler divergence.
Optimizations:
- SIMD f64x4 for 256-element distribution operations
- Fused mixture + KL computation reducing memory passes
- Dual accumulators for better ILP
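
The definition above translates directly into scalar code. The following sketch uses base-2 logarithms so the result lies in [0, 1]. It is a plain reference version with names chosen for this example, not the fused SIMD implementation described in the list.

```rust
/// Byte-value probability distribution of a slice (all zeros for empty input).
fn distribution(data: &[u8]) -> [f64; 256] {
    let mut p = [0.0f64; 256];
    if data.is_empty() {
        return p;
    }
    for &b in data {
        p[b as usize] += 1.0;
    }
    let n = data.len() as f64;
    for v in p.iter_mut() {
        *v /= n;
    }
    p
}

/// Jensen-Shannon divergence: 0.5 * KL(P||M) + 0.5 * KL(Q||M), with M = 0.5 * (P + Q).
fn jsd(p: &[f64; 256], q: &[f64; 256]) -> f64 {
    let mut sum = 0.0;
    for i in 0..256 {
        let m = 0.5 * (p[i] + q[i]);
        // Terms with zero probability contribute nothing (0 * log 0 is taken as 0).
        if p[i] > 0.0 {
            sum += 0.5 * p[i] * (p[i] / m).log2();
        }
        if q[i] > 0.0 {
            sum += 0.5 * q[i] * (q[i] / m).log2();
        }
    }
    sum
}
```

Because M is non-zero wherever P or Q is, the divisions are always safe, which is one practical advantage of JSD over plain KL divergence for sparse byte histograms.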
Refined Composite Multi-Scale Entropy using a fast histogram-based approximation:
- Compute byte histograms for pattern counting (O(n) instead of O(n²))
- Sample sparse scales [1, 3, 6] instead of all scales 1-6
- Aggregate into complexity score
Optimizations:
- Histogram-based pattern matching (~50-100x speedup)
- O(n × 3) complexity instead of O(n² × 36)
- Lazy computation: only computed when user switches to MSE mode
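
The exact approximation is not spelled out here, so the sketch below shows only one possible reading of it: coarse-grain the window at the sparse scales [1, 3, 6] and use histogram (Shannon) entropy of each coarse-grained series as an O(n) stand-in for sample entropy. The function names, the averaging rule, and the final mean over scales are all assumptions. Treat it as an illustration of the idea, not as Apeiron's RCMSE code.

```rust
/// Coarse-grain a byte series by averaging non-overlapping runs of `scale` bytes.
/// Trailing bytes that do not fill a whole run are dropped.
fn coarse_grain(data: &[u8], scale: usize) -> Vec<u8> {
    data.chunks_exact(scale.max(1))
        .map(|c| (c.iter().map(|&b| b as u32).sum::<u32>() / c.len() as u32) as u8)
        .collect()
}

/// Histogram entropy in bits (0.0..=8.0), computed in a single pass over the series.
fn histogram_entropy(series: &[u8]) -> f64 {
    if series.is_empty() {
        return 0.0;
    }
    let mut counts = [0u32; 256];
    for &b in series {
        counts[b as usize] += 1;
    }
    let n = series.len() as f64;
    counts
        .iter()
        .filter(|&&c| c > 0)
        .map(|&c| {
            let p = c as f64 / n;
            -p * p.log2()
        })
        .sum()
}

/// Mean normalized entropy over the sparse scales, in the range 0.0..=1.0.
fn multiscale_score(window: &[u8]) -> f64 {
    const SCALES: [usize; 3] = [1, 3, 6];
    let total: f64 = SCALES
        .iter()
        .map(|&s| histogram_entropy(&coarse_grain(window, s)) / 8.0)
        .sum();
    total / SCALES.len() as f64
}
```

Each scale costs one pass over the window, which is where the O(n × 3) figure in the list comes from under this reading.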
The Hilbert curve dimension is chosen as the smallest power of 2 where n² >= file_size. This ensures all bytes can be mapped while maintaining the locality-preserving property.
Optimizations:
- Precomputed lookup tables for dimensions 64, 128, 256, 512 (O(1) access)
- Lazy initialization with `OnceLock`
- Batch conversion functions for SIMD-friendly processing
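
To make the mapping concrete, here is a small sketch of the dimension rule and the classic iterative distance-to-coordinate (d2xy) conversion. It computes coordinates directly instead of using the lookup tables mentioned above, and the function names and integer types are choices made for this example.

```rust
/// Smallest power-of-two side length n such that n * n >= file_size.
fn hilbert_dimension(file_size: u64) -> u64 {
    let mut n: u64 = 1;
    // Widen to u128 so the square cannot overflow for very large sizes.
    while (n as u128) * (n as u128) < file_size as u128 {
        n *= 2;
    }
    n
}

/// Convert distance `d` along a Hilbert curve of side `n` (a power of two)
/// into (x, y) grid coordinates.
fn d2xy(n: u64, d: u64) -> (u64, u64) {
    let (mut x, mut y) = (0u64, 0u64);
    let mut t = d;
    let mut s = 1u64;
    while s < n {
        let rx = 1 & (t / 2);
        let ry = 1 & (t ^ rx);
        // Rotate or flip the quadrant so the sub-curve has the right orientation.
        if ry == 0 {
            if rx == 1 {
                x = s - 1 - x;
                y = s - 1 - y;
            }
            std::mem::swap(&mut x, &mut y);
        }
        x += s * rx;
        y += s * ry;
        t /= 4;
        s *= 2;
    }
    (x, y)
}
```

For a 65,536-byte file, `hilbert_dimension` returns 256, and byte offset `d` is drawn at pixel `d2xy(256, d)`. Consecutive offsets always land on horizontally or vertically adjacent pixels, which is the locality property the visualization relies on.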
When available, visualization rendering uses wgpu compute shaders (WGSL) for parallel pixel generation:
- Hilbert: Computes d2xy transform and byte analysis per pixel
- Digraph: Parallel frequency counting with atomic operations
- Phase Space: Trajectory accumulation with position coloring
- Similarity Matrix: Chi-squared distance computation
Falls back to CPU (with rayon parallelization) for modes requiring CPU-side computation (KOL, JSD, MSE) or when GPU is unavailable.
Files larger than 100MB use a two-phase rendering approach:
- Coarse pass: ~10K hierarchical samples for instant preview
- Fine pass: Full precision sequential computation in background
The main thread reads computed values lock-free while the background thread refines data progressively.
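One way to get lock-free reads during refinement is to store each value as the bit pattern of an `f32` inside an atomic integer. The sketch below illustrates that pattern with the standard library only. The `ProgressiveMap` type, the stride-based coarse pass, and the function names are assumptions for this example and may differ from how Apeiron structures its passes.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::Arc;
use std::thread;

/// Per-cell values stored as f32 bit patterns so readers never take a lock.
struct ProgressiveMap {
    cells: Vec<AtomicU32>,
}

impl ProgressiveMap {
    fn new(len: usize) -> Self {
        Self {
            cells: (0..len).map(|_| AtomicU32::new(0f32.to_bits())).collect(),
        }
    }

    fn set(&self, i: usize, value: f32) {
        self.cells[i].store(value.to_bits(), Ordering::Relaxed);
    }

    fn get(&self, i: usize) -> f32 {
        f32::from_bits(self.cells[i].load(Ordering::Relaxed))
    }
}

/// Coarse pass: evaluate every `stride`-th cell and reuse that value
/// for the cells up to the next sample.
fn coarse_pass(map: &ProgressiveMap, stride: usize, f: impl Fn(usize) -> f32) {
    let stride = stride.max(1);
    let len = map.cells.len();
    for start in (0..len).step_by(stride) {
        let value = f(start);
        for i in start..(start + stride).min(len) {
            map.set(i, value);
        }
    }
}

/// Fine pass: a background thread overwrites each cell with its exact value, in order.
fn spawn_fine_pass(
    map: Arc<ProgressiveMap>,
    f: impl Fn(usize) -> f32 + Send + 'static,
) -> thread::JoinHandle<()> {
    thread::spawn(move || {
        for i in 0..map.cells.len() {
            map.set(i, f(i));
        }
    })
}
```

The render loop can call `get` at any time: it sees either the coarse estimate or the refined value for a cell, never a torn write. Relaxed ordering is enough here because each cell is independent of the others.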
- Large Files: 100MB+ files handled efficiently via viewport-aware rendering and memory-mapped I/O
- Lazy Computation: Kolmogorov and RCMSE maps computed on-demand when switching to those modes
- GPU Acceleration: Significant speedup for Hilbert, Digraph, Phase Space, and Similarity Matrix modes
- Portable SIMD: AVX2 on x86_64, NEON on ARM64 via the `wide` crate
- Texture Caching: Smart regeneration thresholds prevent excessive recomputation during navigation
- Memory Efficient: Streaming hex view renders only visible rows; mmap for file access
Apeiron automatically detects common file types via magic bytes:
| Category | Formats |
|---|---|
| Executables | PE (EXE/DLL), ELF, Mach-O (all variants) |
| Archives | ZIP, RAR, GZIP, BZIP2, 7-Zip, XZ |
| Images | PNG, JPEG, GIF, BMP, TIFF |
| Documents | PDF |
| Media | MP4/MOV, WAV/AVI (RIFF), MP3 |
| Databases | SQLite |
| Other | Java CLASS, WebAssembly (WASM) |
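
Magic-byte detection amounts to a prefix match against a table of known signatures. The sketch below covers a subset of the formats in the table, using their widely documented leading bytes. The `Signature` type and `detect_file_type` function are hypothetical names, and the real detector may check more variants or look beyond the first bytes.

```rust
/// A file-type signature: the leading bytes and a display name.
struct Signature {
    magic: &'static [u8],
    name: &'static str,
}

/// A subset of well-known signatures. Only the first bytes of the file are checked.
const SIGNATURES: &[Signature] = &[
    // "MZ" is the DOS header that PE files start with; full PE validation would
    // also follow the header offset to the "PE\0\0" marker.
    Signature { magic: b"MZ", name: "PE" },
    Signature { magic: b"\x7fELF", name: "ELF" },
    // 64-bit little-endian Mach-O only; other Mach-O variants use different magics.
    Signature { magic: b"\xCF\xFA\xED\xFE", name: "Mach-O" },
    Signature { magic: b"PK\x03\x04", name: "ZIP" },
    Signature { magic: b"\x1F\x8B", name: "GZIP" },
    Signature { magic: b"\xFD7zXZ\0", name: "XZ" },
    Signature { magic: b"7z\xBC\xAF\x27\x1C", name: "7-Zip" },
    Signature { magic: b"\x89PNG\r\n\x1a\n", name: "PNG" },
    Signature { magic: b"%PDF", name: "PDF" },
    Signature { magic: b"SQLite format 3\0", name: "SQLite" },
    Signature { magic: b"\0asm", name: "WebAssembly" },
];

/// Return the name of the first signature that prefixes `data`, if any.
fn detect_file_type(data: &[u8]) -> Option<&'static str> {
    SIGNATURES
        .iter()
        .find(|sig| data.starts_with(sig.magic))
        .map(|sig| sig.name)
}
```

Some formats share a magic number (Java class files and universal Mach-O binaries both begin with `CA FE BA BE`), so a complete detector needs extra checks beyond a prefix table. That case is deliberately left out of this sketch.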
- Malware Analysis: Identify packed/encrypted sections, detect suspicious entropy patterns
- Firmware Analysis: Find compressed regions, locate file systems, identify anomalies
- Forensics: Detect hidden data, identify file fragments, find injected content
- Reverse Engineering: Understand binary structure, locate interesting regions
- Data Recovery: Locate file boundaries in raw disk images
- Security Research: Analyze encryption patterns, study packing techniques
- CTF Competitions: Quickly identify steganography, hidden data, or unusual structures
- Lyda, R., & Hamrock, J. (2007). "Using Entropy Analysis to Find Encrypted and Packed Malware." IEEE Security & Privacy, 5(2), 40-45.
- Costa, M., et al. (2002). "Multiscale entropy analysis of complex physiologic time series." Physical Review Letters, 89(6).
- Hilbert, D. (1891). "Über die stetige Abbildung einer Linie auf ein Flächenstück." Mathematische Annalen, 38, 459-460.
MIT License - See LICENSE for details.