Getting the exact number of frames from a stream #62

@mre-ableton

Hi there!

We're investigating how to handle loading WAV files whose size, as declared in the RIFF header, is incorrect and much larger than the data actually contained in the file.

In such cases, we would like to crop the sound to the actual data and, ideally, determine the exact size without having to read the whole file upfront.

As expected, ifstream::info().num_frames() returns the value read from the header.

However, the documentation states:

num_frames may differ from the actual number of frames in the stream
as this information relies on the codec. The only way to obtain the exact
number of frames is by seeking to the end of stream and retrieving the frame position.

So we tried seeking to the end of the stream with ifstream::frame_seekg and reading the position with ifstream::frame_tellg, but that returns the same value as num_frames.

Here's some example code that evaluates the number of frames in three ways: using num_frames, using frame_seekg, and by reading the data.

  // Build an ifstream
  const auto filePath = makeTestFilePath("BB3_100_drum_break_paprika.wav");
  auto stream = audio::ifstream{filePath.string()};

  // Get the number of frames reported
  const auto reportedFrameNum = stream.info().num_frames();

  // Position the stream at the end
  stream.frame_seekg(0, std::ios_base::end);
  const auto seekedNumFrame = size_t(stream.frame_tellg());

  // Read the data until exhaustion to get the actual data contained in the file
  const auto dataFrameNum = [&]() {

    stream.frame_seekg(0, std::ios_base::beg);
    size_t frameCount{0};

    constexpr std::size_t kNiMediaReadChunkSize = 4096 / sizeof(float);
    const auto samplesPerChunk =
      std::min<std::size_t>(kNiMediaReadChunkSize, reportedFrameNum);
    std::vector<float> data(samplesPerChunk * stream.info().num_channels(), 0.0f);

    while (stream.read((char*)data.data(), std::streamsize(samplesPerChunk)))
    {
      frameCount += stream.frame_gcount();
    }

    return frameCount;
  }();

  std::cout << "Number of frames returning by num_frames " << reportedFrameNum
            << std::endl;
  std::cout << "Number of frames in the file " << dataFrameNum << std::endl;
  std::cout << "Number of frames from seeking " << seekedNumFrame << std::endl;

  CHECK(seekedNumFrame == dataFrameNum);

The output is:

Number of frames returning by num_frames 423360
Number of frames in the file 98090
Number of frames from seeking 423360

Our expectation was that seeking to the end would also return 98090.

Is this the API we're supposed to use? Is there another way?

Cheers.

PS: Here's the file this test was run with: BB3_100_drum_break_paprika.zip
