Parallel Processing

Parallel Processing Guide

Overview

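The memoria.py script can process multiple export directories simultaneously when run with the --originals flag. Set the number of concurrent exports with --parallel-exports. On systems with enough CPU cores, RAM, and disk throughput this can cut total run time substantially (see Expected Performance Gains).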

New in this version: Automatic upload queuing ensures Immich uploads happen sequentially while processing continues in parallel. See Upload Queuing for details.

Usage

Basic Parallel Processing

# Process 2 exports in parallel
./memoria.py --originals /path/to/exports -o /path/to/output --parallel-exports 2

# Process 4 exports in parallel
./memoria.py --originals /path/to/exports -o /path/to/output --parallel-exports 4

With Worker Adjustment

# 2 exports × (7 workers + 1) = 16 total processes (good for a 16-core CPU)
./memoria.py --originals /path/to/exports -o /path/to/output --parallel-exports 2 --workers 7

# 4 exports × (3 workers + 1) = 16 total processes (balanced)
./memoria.py --originals /path/to/exports -o /path/to/output --parallel-exports 4 --workers 3

Performance Considerations

CPU Core Allocation

The script automatically calculates total process count and warns if you're over-subscribing:

Total Processes = parallel_exports × (workers_per_export + 1)

Example on 16-core system:

--parallel-exports 2 --workers 7 → 2 × 8 = 16 processes ✓ Good
--parallel-exports 4 --workers 3 → 4 × 4 = 16 processes ✓ Good
--parallel-exports 4 --workers 15 → 4 × 16 = 64 processes ✗ Over-subscribed
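The formula above can be checked with a few lines of Python. This is a minimal sketch of the arithmetic only; the function name `total_processes` is hypothetical and is not taken from memoria.py.

```python
def total_processes(parallel_exports: int, workers_per_export: int) -> int:
    """Total Processes = parallel_exports x (workers_per_export + 1)."""
    return parallel_exports * (workers_per_export + 1)


# The three example configurations from the 16-core table above.
for exports, workers in [(2, 7), (4, 3), (4, 15)]:
    total = total_processes(exports, workers)
    print(f"--parallel-exports {exports} --workers {workers} -> {total} processes")
```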

Over-Subscription Warnings

If the total process count exceeds the number of CPU cores by 50%, the script displays a warning like this:

WARNING: Running 4 exports with 15 workers each
         will create ~64 processes on 16 CPU cores
         This may cause performance degradation due to over-subscription.

SUGGESTION: Try --parallel-exports 4 --workers 3
            or reduce --parallel-exports to 1
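The check behind this warning might look roughly like the sketch below. It assumes the threshold is "more than 1.5 times the core count" and that the suggested worker count is the core count divided by the number of exports, minus one. Both rules are inferred from the examples in this guide, and the function names are hypothetical rather than memoria.py's own.

```python
import os


def is_oversubscribed(parallel_exports: int, workers: int, cpu_cores: int,
                      threshold: float = 1.5) -> bool:
    """Return True when the total process count passes the assumed 50% margin."""
    total = parallel_exports * (workers + 1)
    return total > cpu_cores * threshold


def suggest_workers(parallel_exports: int, cpu_cores: int) -> int:
    """Pick a worker count so that exports x (workers + 1) fits the core count."""
    return max(1, cpu_cores // parallel_exports - 1)


cores = os.cpu_count() or 1  # os.cpu_count() can return None
if is_oversubscribed(4, 15, cores):
    print(f"Try --parallel-exports 4 --workers {suggest_workers(4, cores)}")
```

On a 16-core machine, `suggest_workers(4, 16)` gives 3, which matches the suggestion in the sample warning.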

When to Use Parallel Processing

Good use cases:

Processing multiple large export directories
I/O-bound operations (file copying, EXIF reading/writing)
Systems with fast NVMe SSD storage
Plenty of available RAM

When sequential is better:

Single export directory (use --workers for internal parallelism)
Limited RAM (each process needs 100MB-1GB depending on export size)
HDD storage (parallel I/O can cause thrashing)
Network storage (bandwidth limitations)

Recommended Configurations

Conservative (safest)

# 2 exports in parallel, moderate workers per export
./memoria.py --originals /exports -o /output --parallel-exports 2 --workers 7

Balanced (recommended)

# 4 exports in parallel, fewer workers per export
./memoria.py --originals /exports -o /output --parallel-exports 4 --workers 3

Aggressive (I/O-heavy workloads)

# 4 exports × (4 workers + 1) = 20 processes (mild over-subscription on 16 cores is OK for I/O-bound work)
./memoria.py --originals /exports -o /output --parallel-exports 4 --workers 4

Expected Performance Gains

Example scenario: Processing 5 exports, each taking 10 minutes sequentially

Configuration	Total Time	Speedup
Sequential (default)	50 minutes	1.0x
--parallel-exports 2	~25-28 minutes	1.8-2.0x
--parallel-exports 4	~15-18 minutes	2.8-3.3x
--parallel-exports 5	~12-15 minutes	3.3-4.2x

Actual speedup depends on:

Disk I/O throughput
CPU availability
Memory bandwidth
Export sizes and types
Network speed (if uploading to Immich)
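The speedup column in the table is the sequential time divided by the parallel time. The short sketch below reproduces those ranges from the table's own time estimates. The helper name `speedup` is only illustrative.

```python
def speedup(sequential_minutes: float, parallel_minutes: float) -> float:
    """Speedup is sequential run time divided by parallel run time."""
    return sequential_minutes / parallel_minutes


sequential = 5 * 10  # 5 exports at 10 minutes each

# (fastest, slowest) total-time estimates in minutes, copied from the table.
estimates = {
    "--parallel-exports 2": (25, 28),
    "--parallel-exports 4": (15, 18),
    "--parallel-exports 5": (12, 15),
}

for flag, (fastest, slowest) in estimates.items():
    low = speedup(sequential, slowest)
    high = speedup(sequential, fastest)
    print(f"{flag}: {low:.1f}x to {high:.1f}x")
```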

Technical Details

Implementation

Uses ProcessPoolExecutor from concurrent.futures
Each export runs in a separate process
Processors are reloaded in each worker process
Output is organized per-export to avoid conflicts
Detection cache is disabled in parallel mode
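The general pattern described above, one process per export with results collected as each finishes, can be sketched as follows. This is not memoria.py's actual code: `process_export` and `run_exports` are hypothetical names, and the per-export work is reduced to computing the output subdirectory. The `executor_cls` parameter is an addition for illustration, so that the same function can be tried with a thread pool.

```python
from concurrent.futures import ProcessPoolExecutor, as_completed
from pathlib import Path


def process_export(export_dir: str, output_root: str) -> str:
    """Hypothetical stand-in for the per-export work; returns its output directory."""
    out_dir = Path(output_root) / Path(export_dir).name
    # Real processing (copying files, writing EXIF data) would happen here.
    return str(out_dir)


def run_exports(export_dirs, output_root, parallel_exports,
                executor_cls=ProcessPoolExecutor):
    """Run up to `parallel_exports` exports at once and gather results as they finish."""
    results = {}
    with executor_cls(max_workers=parallel_exports) as pool:
        # Map each future back to the export directory it was submitted for.
        futures = {
            pool.submit(process_export, export_dir, output_root): export_dir
            for export_dir in export_dirs
        }
        for future in as_completed(futures):
            export_dir = futures[future]
            results[export_dir] = future.result()
            print(f"finished {export_dir} -> {results[export_dir]}")
    return results
```

Because each export writes to its own subdirectory, the worker processes do not need to coordinate over output paths. When calling `run_exports` from a script, place the call under an `if __name__ == "__main__":` guard, which process pools require on platforms that spawn new interpreters.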

Memory Usage

Each parallel export process: ~100MB-1GB base + export data
Example: 4 parallel exports ≈ 0.4-4GB additional RAM
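The estimate above is a simple multiplication, shown here as a small sketch. The function name is hypothetical, 1 GB is treated as 1000 MB to match the rounded figures in this guide, and the result excludes the export data itself.

```python
def extra_ram_range_mb(parallel_exports: int, low_mb: int = 100, high_mb: int = 1000):
    """Return the (low, high) base RAM estimate in MB for the parallel export processes."""
    return parallel_exports * low_mb, parallel_exports * high_mb


low, high = extra_ram_range_mb(4)
print(f"4 parallel exports: about {low / 1000:.1f}-{high / 1000:.1f} GB")
```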

Output Handling

Each export gets its own subdirectory: output/export-name/
Progress is printed as exports complete
Final summary shows all results

Troubleshooting

Performance is worse with parallel processing

Reduce --parallel-exports (try 2 instead of 4)
Increase --workers per export
Check disk I/O utilization (might be saturated)
Check RAM usage (might be swapping)

"Too many open files" error

Raise the open-file limit for the current shell session before running the script:

ulimit -n 4096
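On Unix-like systems the same limit can be inspected, and the soft limit raised, from Python with the standard `resource` module. The sketch below is an illustration only. Nothing in this guide indicates that memoria.py does this itself, and the helper name `raise_soft_limit` is hypothetical.

```python
import resource


def raise_soft_limit(target: int = 4096) -> int:
    """Raise this process's soft open-file limit toward `target`; return the new soft limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    unlimited = resource.RLIM_INFINITY
    if soft == unlimited or soft >= target:
        return soft  # already high enough, nothing to change
    # The soft limit may not exceed the hard limit.
    new_soft = target if hard == unlimited else min(target, hard)
    resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
    return new_soft


print("soft/hard before:", resource.getrlimit(resource.RLIMIT_NOFILE))
print("soft limit now:", raise_soft_limit(4096))
```

The change applies only to the calling process and the child processes it starts afterwards, much as `ulimit -n` applies only to the current shell session.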

Interleaved output

This is normal with parallel processing. Each export's output is still grouped together, but multiple exports may print simultaneously.

Out of memory

Reduce --parallel-exports
Reduce --workers
Process exports sequentially

Additional Examples

Auto-calculate workers

# Script automatically uses (CPU_COUNT - 1) workers per export
./memoria.py --originals /exports -o /output --parallel-exports 2

With Immich upload

# Parallel processing with automatic Immich uploads
./memoria.py --originals /exports -o /output \
  --parallel-exports 2 --workers 7 \
  --immich-url http://localhost:2283 \
  --immich-key YOUR_API_KEY

Verbose logging

# Parallel processing with verbose logs
./memoria.py --originals /exports -o /output \
  --parallel-exports 2 --verbose
