Skip to content

Http support#44

Merged
mbsantiago merged 11 commits intomacaodha:mainfrom
kaviecos:http_support
May 16, 2025
Merged

Http support#44
mbsantiago merged 11 commits intomacaodha:mainfrom
kaviecos:http_support

Conversation

@kaviecos
Copy link
Contributor

Suggestion for implementing support for more ways to load audio data.
This should be backwards compatible.

There is one additional change to the api.process_file and du.process_file from the once we already discussed.

I've had to add a "file_id" argument. It is optional and uses the basename approach as a fallback if file_id is not provided. The reason is that we cannot get the basename of a byte array.

Copy link
Collaborator

@mbsantiago mbsantiago left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @kaviecos. Thanks so much for the PR. It looks pretty good and I left a few comments.


_file_id = file_id
if _file_id is None:
_file_id = os.path.basename(path) if isinstance(path, str) else "unknown"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be better to provide a unique ID. One option is to use the uuid.uuid4 function of the standard library, another is to use the md5 or sha256 hash of the file contents. I would lean on using the md5 hash, as it uniquely depends on the file contents, meaning that the same id would be used if the same file is uploaded/processed again.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm looking at an implementation of _generate_id(path) like this:

def _generate_id(path:  Union[
        str, int, os.PathLike[Any], sf.SoundFile, audioread.AudioFile, BinaryIO
    ]) -> str:
    """ Generate an id based on the path.
    
    If the path is a str or PathLike it will parsed as the basename. 
    This should ensure backwards compatibility with previous versions.     
    """
    if isinstance(path, str) or isinstance(path, os.PathLike):
        return os.path.basename(path)
    elif isinstance(path, BinaryIO):
        hasher = hashlib.md5()
        path.seek(0)
        while chunk := path.read(8192):
            hasher.update(chunk)
        path.seek(0)
        return hasher.hexdigest()
    else:
        return str(uuid.uuid4())

But I don't really like the md5 part, as there is some inconsitency in the behaviour. I'd prefer either; raise an exception if path is not string or PathLike (as we cannot get the basename in this case). This would make it clear to users that they have to provide an id in this case.
or return a uuidv4 - this, as you mention, has the downside of returning different ids on different runs of the same data.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I could add a function in api.py that takes a URL and calls du.process_file like this (file_id defaults to URL):

def process_url(
    url: str,
    model: DetectionModel = MODEL,
    config: Optional[ProcessingConfiguration] = None,
    device: torch.device = DEVICE,
    file_id: str | None = None
) -> du.RunResults:
    if config is None:
        config = CONFIG

    if file_id is None:
        file_id = url

    response = requests.get(url)
    response.raise_for_status()
    raw_audio_data = response.content

    return du.process_file(
        io.BytesIO(raw_audio_data),
        model,
        config,
        device,
        file_id
    )   

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please add tests that :

  • processes a file loaded in a buffer using the modified process_file function, and ideally compares the output when using just the path?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did. I also found a bug in get_samplerate. When the BytesIO object is read by librosa the cursor is moved, so when we call get_samplerate later we have to "seek(0)" to reset the cursor. I don't really like this, but I'm not sure I see a way around it.

@kaviecos
Copy link
Contributor Author

@mbsantiago Hi again. I found a better way to solve the get_samplerate issue.

I moved the implementation of load_audio to load_audio_data. load_audio_data returns the file_sample_rate which means we don't need the extra call to get_samplerate if we use load_audio_data.
The load_audio function retains its signature and code duplication is avoided since load_audio simply calls load_audio_data and discards the file_sample_rate before returning the exact same data as it did before.

And thanks a lot for taking your time to help!

@kaviecos
Copy link
Contributor Author

I fixed the syntax errors for Python3.9.

@mbsantiago
Copy link
Collaborator

Hi @kaviecos, sorry for the delay in the review. Had a couple of things cooking on the side, so had to take some time off, but I'm back. Thanks again for all the changes. They are all looking pretty good. Review in detail and merge soon after.

@kaviecos
Copy link
Contributor Author

No worries.
If you agree I'll remove the md5sum used for generating the id - I still think it's a bit messy. Using a uuidv4 is a nice fallback, or alternatively make it a requirement to provide an id for none-filename arguments and raise an error in that case. In the end this a detail that I believe is subject to personal preference.

@kaviecos
Copy link
Contributor Author

@mbsantiago Any chance you'll have time to look at this? We'll soon(ish) be looking at running a lot of files through batdetect2 and it would be nice to be on the main branch :)

Copy link
Collaborator

@mbsantiago mbsantiago left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @kaviecos,
Sorry for the big delay in getting back to this — I got sidetracked with a bunch of things that came up. I think the changes you’ve made look great, and based on my tests and review, I’m confident they won’t break any existing code. I’m happy to go ahead and merge them.

I’ve left a couple of very minor comments that I think could be helpful down the line. If you have a bit of time to take a look, I’d really appreciate it — let me know. Otherwise, we can go ahead and merge as is.

Huge thanks again for your work — I hope it’s useful for your use case, and I’m sure it’ll benefit others as well!

Cheers,
Santiago

def process_file(
audio_file: str,
path: Union[
str, int, os.PathLike[Any], sf.SoundFile, audioread.AudioFile, BinaryIO
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe BinaryIO covers the sf.SoundFile and audioread.AudioFile cases as they comply with the BinaryIO interface, so no need to include them :).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've copied the definition from librosa.load(). I think it makes sense to use the same definition, though I'm not sure how int can be useful. But I'll remove SoundFile and AudioFile if you prefer.

path: Union[
       str, int, os.PathLike[Any], sf.SoundFile, audioread.AudioFile, BinaryIO
],

audio_file : str
Path to audio file.
path : Union[
str, int, os.PathLike[Any], sf.SoundFile, audioread.AudioFile, BinaryIO
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would need updating. On a similar note, why is int expected here?

def load_audio(
audio_file: str,
path: Union[
str, int, os.PathLike[Any], sf.SoundFile, audioread.AudioFile, BinaryIO
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be a good idea to make a Type alias for the Path type somewhere and reuse in all these functions for consistency and easy modification.

# convert results to a dictionary in the right format
results = convert_results(
file_id=os.path.basename(audio_file),
file_id=_file_id,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! I like the separation to a dedicated function


return results

def _generate_id(path: Union[
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice

return results

def _generate_id(path: Union[
str, int, os.PathLike[Any], sf.SoundFile, audioread.AudioFile, BinaryIO
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Type alias

assert len(results["pred_dict"]["annotation"]) == 0

def test_process_file_file_id_defaults_to_basename():
"""Test that no detections are made above the nyquist frequency."""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove or update the docstrings

assert id == basename

def test_bytesio_file_id_defaults_to_md5():
"""Test that no detections are made above the nyquist frequency."""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove or update the docstrings

Copy link
Collaborator

@mbsantiago mbsantiago left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @kaviecos

@mbsantiago mbsantiago merged commit c10903a into macaodha:main May 16, 2025
4 checks passed
@kaviecos
Copy link
Contributor Author

@mbsantiago Sorry to bother you again, but I was just thinking if we should add an example for process_url to the README?

from batdetect2 import api
import io
import requests

AUDIO_URL = "https://anon.erda.au.dk/share_redirect/AHLqawaq1Y/BIOBD01_20240626_231634.wav"

# Process a whole file
results = api.process_url(AUDIO_URL)

# Or, load audio and compute spectrograms
audio = api.load_audio(io.BytesIO(requests.get(AUDIO_URL).content))
spec = api.generate_spectrogram(audio)

# And process the audio or the spectrogram with the model
detections, features, spec = api.process_audio(audio)
detections, features = api.process_spectrogram(spec)

(the URL in my example is temporary so we should provide another in the example, or just have AUDIO_URL = "<insert your url here>")

Thanks a lot for taking your time to review and accept this feature :)

@mbsantiago
Copy link
Collaborator

Hi again @kaviecos! For sure, that's a great suggestion. Please feel free to make a new PR with changes. Would be nice to do it that way to keep track of your contributions, but if you don't have time I can add it later. Cheers!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants