[BUG] zstandard library is missing. CLAP analysis doesn't use fallback WAV files. #362

@sorcery88

Description

Some of my m4a files fail to load, and the fallback processing then fails as well.

Here's an example from the image as installed:

audiomuse-ai-worker  | [INFO]-[11-03-2026 22-19-26]-Downloaded 'Father, Guide Me, Teach Me; Can't Turn Me Around; It's A Shame;' to '/app/temp_audio/gLwaaiNqQGKTJRtXpjGmOa.m4a'
audiomuse-ai-worker  | [INFO]-[11-03-2026 22-19-26]-Starting analysis for: gLwaaiNqQGKTJRtXpjGmOa.m4a
audiomuse-ai-worker  | /app/tasks/analysis.py:212: UserWarning: PySoundFile failed. Trying audioread instead.
audiomuse-ai-worker  |   audio, sr = librosa.load(file_path, sr=target_sr, mono=True, duration=AUDIO_LOAD_TIMEOUT)
audiomuse-ai-worker  | /usr/local/lib/python3.12/dist-packages/librosa/core/audio.py:184: FutureWarning: librosa.core.audio.__audioread_load
audiomuse-ai-worker  |  Deprecated as of librosa version 0.10.0.
audiomuse-ai-worker  |  It will be removed in librosa version 1.0.
audiomuse-ai-worker  |   y, sr_native = __audioread_load(path, offset, duration, dtype)
audiomuse-ai-worker  | [WARNING]-[11-03-2026 22-19-26]-Direct librosa load failed for gLwaaiNqQGKTJRtXpjGmOa.m4a: float division by zero. Attempting fallback conversion.
audiomuse-ai-worker  | [ERROR]-[11-03-2026 22-19-26]-Fallback loading method also failed for gLwaaiNqQGKTJRtXpjGmOa.m4a: `zstandard` is required to use zstd compression
audiomuse-ai-worker  | [WARNING]-[11-03-2026 22-19-26]-Could not load a valid audio signal for gLwaaiNqQGKTJRtXpjGmOa.m4a after all attempts. Skipping track.
audiomuse-ai-worker  | [WARNING]-[11-03-2026 22-19-26]-Skipping track Father, Guide Me, Teach Me; Can't Turn Me Around; It's A Shame; by Dedicated Men Of Zion as analysis returned None.
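
The fallback error above only shows up once a track actually reaches the fallback path. A minimal sketch of a startup check that would report the gap earlier, assuming the worker can run a small Python check before it starts processing. The function name and the `REQUIRED_MODULES` list are hypothetical; only `zstandard` comes from the log.

```python
import importlib.util
from typing import Iterable, List

# Hypothetical list: only `zstandard` is taken from the log above.
REQUIRED_MODULES = ("zstandard",)


def find_missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be imported here."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules: " + ", ".join(missing))
    else:
        print("All required modules are importable")
```

Running something like this at container start would turn a per-track fallback failure into a single clear message.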

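I ran "pip install zstandard" to fix the most obvious problem. Processing now gets further: the fallback conversion runs and the MusicCNN analysis succeeds. However, CLAP does not seem to use the WAV file generated by the fallback. It loads the original .m4a again and fails with the same division-by-zero error.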

audiomuse-ai-worker  | [INFO]-[12-03-2026 00-14-49]-Downloaded 'Father, Guide Me, Teach Me; Can't Turn Me Around; It's A Shame;' to '/app/temp_audio/gLwaaiNqQGKTJRtXpjGmOa.m4a'
audiomuse-ai-worker  | [INFO]-[12-03-2026 00-14-50]-Starting analysis for: gLwaaiNqQGKTJRtXpjGmOa.m4a
audiomuse-ai-worker  | /app/tasks/analysis.py:212: UserWarning: PySoundFile failed. Trying audioread instead.
audiomuse-ai-worker  |   audio, sr = librosa.load(file_path, sr=target_sr, mono=True, duration=AUDIO_LOAD_TIMEOUT)
audiomuse-ai-worker  | /usr/local/lib/python3.12/dist-packages/librosa/core/audio.py:184: FutureWarning: librosa.core.audio.__audioread_load
audiomuse-ai-worker  |  Deprecated as of librosa version 0.10.0.
audiomuse-ai-worker  |  It will be removed in librosa version 1.0.
audiomuse-ai-worker  |   y, sr_native = __audioread_load(path, offset, duration, dtype)
audiomuse-ai-worker  | [WARNING]-[12-03-2026 00-14-50]-Direct librosa load failed for gLwaaiNqQGKTJRtXpjGmOa.m4a: float division by zero. Attempting fallback conversion.
audiomuse-ai-worker  | [INFO]-[12-03-2026 00-14-55]-Fallback: Pre-processing gLwaaiNqQGKTJRtXpjGmOa.m4a to a smaller WAV for safe loading...
audiomuse-ai-worker  | [INFO]-[12-03-2026 00-14-57]-Fallback: Converted gLwaaiNqQGKTJRtXpjGmOa.m4a to temporary WAV for robust loading.
audiomuse-ai-worker  | [INFO]-[12-03-2026 00-14-59]-CUDA provider not available - using CPU only
audiomuse-ai-worker  | [INFO]-[12-03-2026 00-15-07]-SUCCESSFULLY ANALYZED 'Father, Guide Me, Teach Me; Can't Turn Me Around; It's A Shame; by Dedicated Men Of Zion' (ID: gLwaaiNqQGKTJRtXpjGmOa):
audiomuse-ai-worker  | [INFO]-[12-03-2026 00-15-07]-  - Tempo: 125.00, Energy: 0.1007, Key: D minor
audiomuse-ai-worker  | [INFO]-[12-03-2026 00-15-07]-  - Top Moods: {'rock': 0.5587335228919983, 'blues': 0.5374235510826111, 'soul': 0.5321893692016602, 'Hip-Hop': 0.5240907669067383, 'classic rock': 0.5204477906227112}
audiomuse-ai-worker  | [INFO]-[12-03-2026 00-15-07]-  - Other Features: danceable:0.35,aggressive:0.11,happy:0.36,party:0.32,relaxed:0.73,sad:0.74
audiomuse-ai-worker  | [INFO]-[12-03-2026 00-15-07]-  - Starting CLAP analysis for Father, Guide Me, Teach Me; Can't Turn Me Around; It's A Shame; by Dedicated Men Of Zion...
audiomuse-ai-worker  | [INFO]-[12-03-2026 00-15-07]-Lazy-loading CLAP audio model on first use...
audiomuse-ai-worker  | [INFO]-[12-03-2026 00-15-07]-Loading CLAP audio model from /app/model/clap_audio_model.onnx...
audiomuse-ai-worker  | [INFO]-[12-03-2026 00-15-07]-CLAP Audio: Using ONNX Runtime automatic thread management
audiomuse-ai-worker  | [INFO]-[12-03-2026 00-15-07]-CUDA provider not available - using CPU only
audiomuse-ai-worker  | [INFO]-[12-03-2026 00-15-08]-✓ CLAP audio model loaded successfully (~268MB)
audiomuse-ai-worker  | [INFO]-[12-03-2026 00-15-08]-✓ CLAP audio model initialized successfully (for music analysis)
audiomuse-ai-worker  | /app/tasks/clap_analyzer.py:578: UserWarning: PySoundFile failed. Trying audioread instead.
audiomuse-ai-worker  |   audio_data, sr = librosa.load(audio_path, sr=SAMPLE_RATE, mono=True, duration=AUDIO_LOAD_TIMEOUT)
audiomuse-ai-worker  | /usr/local/lib/python3.12/dist-packages/librosa/core/audio.py:184: FutureWarning: librosa.core.audio.__audioread_load
audiomuse-ai-worker  |  Deprecated as of librosa version 0.10.0.
audiomuse-ai-worker  |  It will be removed in librosa version 1.0.
audiomuse-ai-worker  |   y, sr_native = __audioread_load(path, offset, duration, dtype)
audiomuse-ai-worker  | [ERROR]-[12-03-2026 00-15-08]-Failed to load audio for CLAP analysis (/app/temp_audio/gLwaaiNqQGKTJRtXpjGmOa.m4a) with AUDIO_LOAD_TIMEOUT=600: float division by zero
audiomuse-ai-worker  | Traceback (most recent call last):
audiomuse-ai-worker  |   File "/app/tasks/clap_analyzer.py", line 578, in analyze_audio_file
audiomuse-ai-worker  |     audio_data, sr = librosa.load(audio_path, sr=SAMPLE_RATE, mono=True, duration=AUDIO_LOAD_TIMEOUT)
audiomuse-ai-worker  |                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
audiomuse-ai-worker  |   File "/usr/local/lib/python3.12/dist-packages/librosa/core/audio.py", line 193, in load
audiomuse-ai-worker  |     y = resample(y, orig_sr=sr_native, target_sr=sr, res_type=res_type)
audiomuse-ai-worker  |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
audiomuse-ai-worker  |   File "/usr/local/lib/python3.12/dist-packages/librosa/core/audio.py", line 633, in resample
audiomuse-ai-worker  |     ratio = float(target_sr) / orig_sr
audiomuse-ai-worker  |             ~~~~~~~~~~~~~~~~~^~~~~~~~~
audiomuse-ai-worker  | ZeroDivisionError: float division by zero
audiomuse-ai-worker  | [INFO]-[12-03-2026 00-15-09]-✓ CLAP model(s) unloaded from memory (~268MB freed + GPU memory released)
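
The traceback ends at `ratio = float(target_sr) / orig_sr`, so the native sample rate reported for this file must have been 0. A minimal sketch of a guard that turns that case into a descriptive error a caller could catch before switching to the WAV fallback. The function name is hypothetical and this is not librosa's code.

```python
def safe_resample_ratio(target_sr: float, orig_sr: float) -> float:
    """Compute target_sr / orig_sr, rejecting a missing or zero native rate."""
    if orig_sr is None or orig_sr <= 0:
        # A rate of 0 means the decoder did not report a usable sample rate.
        raise ValueError(f"invalid native sample rate: {orig_sr!r}")
    return float(target_sr) / orig_sr
```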

Are my files very weird? Here's what ffmpeg -i says for the same file above:

ffmpeg version 4.1.8 Copyright (c) 2000-2021 the FFmpeg developers
  built with gcc 8.5.0 (GCC)
  configuration: --prefix=/usr --incdir='${prefix}/include/ffmpeg' --arch=i686 --target-os=linux --cross-prefix=/usr/local/i686-pc-linux-gnu/bin/i686-pc-linux-gnu- --enable-cross-compile --enable-optimizations --enable-pic --enable-gpl --enable-shared --disable-static --disable-stripping --enable-version3 --enable-encoders --enable-pthreads --disable-protocols --disable-protocol=rtp --enable-protocol=file --enable-protocol=pipe --disable-muxer=image2 --disable-muxer=image2pipe --disable-swscale-alpha --disable-ffplay --disable-ffprobe --disable-doc --disable-devices --disable-bzlib --disable-altivec --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libmp3lame --disable-vaapi --disable-cuvid --disable-nvenc --disable-decoder=aac --disable-decoder=aac_fixed --disable-encoder=aac --disable-decoder=amrnb --disable-decoder=ac3 --disable-decoder=ac3_fixed --disable-encoder=zmbv --disable-encoder=dca --disable-decoder=dca --disable-encoder=ac3 --disable-encoder=ac3_fixed --disable-encoder=eac3 --disable-decoder=eac3 --disable-encoder=truehd --disable-decoder=truehd --disable-encoder=hevc_vaapi --disable-decoder=hevc --disable-muxer=hevc --disable-demuxer=hevc --disable-parser=hevc --disable-bsf=hevc_mp4toannexb --x86asmexe=yasm --cc=/usr/local/i686-pc-linux-gnu/bin/i686-pc-linux-gnu-wrap-gcc --enable-yasm --enable-libx264 --enable-encoder=libx264
  libavutil      56. 22.100 / 56. 22.100
  libavcodec     58. 35.100 / 58. 35.100
  libavformat    58. 20.100 / 58. 20.100
  libavdevice    58.  5.100 / 58.  5.100
  libavfilter     7. 40.101 /  7. 40.101
  libswscale      5.  3.100 /  5.  3.100
  libswresample   3.  3.100 /  3.  3.100
  libpostproc    55.  3.100 / 55.  3.100
Guessed Channel Layout for Input Stream #0.0 : stereo
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '20210116-dedicated-men-of-zion-tiny-desk-home-concert.m4a':
  Metadata:
    major_brand     : M4A
    minor_version   : 512
    compatible_brands: isomiso2
    title           : Father, Guide Me, Teach Me; Can't Turn Me Around; It's A Shame;
    artist          : Dedicated Men Of Zion
    album_artist    : Various Artists
    album           : Tiny Desk Concerts 2021
    encoder         : Lavf58.20.100
    track           : 6
  Duration: 00:13:00.54, start: 0.000000, bitrate: 113 kb/s
    Stream #0:0(eng): Audio: aac (mp4a / 0x6134706D), 44100 Hz, stereo, 112 kb/s (default)
    Metadata:
      handler_name    : SoundHandler

Expected behavior

  • zstandard is installed so fallback processing works
  • CLAP analysis uses the fallback .wav so it can also succeed
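
One way the second point could look is a single loader shared by both analysis steps. This is a rough sketch, not the project's actual code: `load_with_fallback`, `load`, and `convert_to_wav` are hypothetical names. In practice `load` would presumably wrap the existing `librosa.load` call and `convert_to_wav` the existing conversion step.

```python
import logging
from typing import Callable, Tuple, TypeVar

logger = logging.getLogger(__name__)
Audio = TypeVar("Audio")


def load_with_fallback(
    path: str,
    load: Callable[[str], Audio],
    convert_to_wav: Callable[[str], str],
) -> Tuple[Audio, str]:
    """Load audio directly; on failure, convert to WAV and load that instead.

    Returns the decoded audio and the path that was actually loaded, so
    later steps can reuse that path instead of the original file.
    """
    try:
        return load(path), path
    except Exception as exc:
        logger.warning("Direct load failed for %s: %s. Trying WAV fallback.", path, exc)
    wav_path = convert_to_wav(path)
    return load(wav_path), wav_path
```

If the MusicCNN step kept the returned path (or the decoded samples) and passed it on to the CLAP step, CLAP would not need to open the original .m4a a second time.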

Environment

  • Docker image - ghcr.io/neptunehub/audiomuse-ai:latest
  • Ubuntu based docker compose install, WSL worker container also experiencing the same issue
