This commit introduces a new Android library module, `ekascribe_sdk`, establishing the foundational structure for audio processing and analysis.
### Key Changes:
- **Module Scaffolding:**
  - Added the `ekascribe_sdk` Android library module.
  - Created the basic directory structure, including `build.gradle.kts`, `AndroidManifest.xml`, and Proguard rules.
  - Included the new module in `settings.gradle.kts`.
- **Build & Dependency Configuration:**
  - Configured `build.gradle.kts` for the new module, setting up `compileSdk`, `minSdk`, and Java 17 compatibility.
  - Added initial dependencies for `androidx.core.ktx`, `androidx.appcompat`, and `gson`.
- **Initial Audio & Manager Classes:**
  - Introduced core data models for audio processing: `AudioData`, `AudioSampleRate`, and `AudioFrameSize`.
  - Created placeholder classes for key components:
    - `VADAnalyserImpl` and its interface `VoiceActivityAnalyser` for voice activity detection.
    - `AudioDataManager` and `SessionManager` as singletons for future state management.
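
For reviewers who want a concrete picture of the data models and singletons listed above, here is a minimal sketch. The PR text names the types but not their members, so every field, enum value, and method below (`hz`, `samples`, `durationMs`, `currentSessionId`, `add`, `frameCount`, `clear`) is an assumption made for illustration and may not match the actual module.

```kotlin
// Hypothetical shapes: the PR names these types but does not show their fields.

/** Sample rates the SDK might accept, in Hz (values are assumptions). */
enum class AudioSampleRate(val hz: Int) {
    RATE_8K(8_000),
    RATE_16K(16_000),
}

/** Frame sizes in samples per frame (values are assumptions). */
enum class AudioFrameSize(val samples: Int) {
    SIZE_256(256),
    SIZE_512(512),
}

/** One frame of 16-bit PCM audio plus the format it was captured with. */
class AudioData(
    val samples: ShortArray,
    val sampleRate: AudioSampleRate = AudioSampleRate.RATE_16K,
    val frameSize: AudioFrameSize = AudioFrameSize.SIZE_512,
) {
    /** Duration of this frame in milliseconds, derived from the sample count. */
    val durationMs: Long
        get() = samples.size * 1_000L / sampleRate.hz
}

/** A Kotlin `object` is a singleton that is initialized on first access. */
object SessionManager {
    // Placeholder state; the PR describes this class as a stub for future use.
    var currentSessionId: String? = null
}

/** Hypothetical in-memory store for captured frames. */
object AudioDataManager {
    private val frames = mutableListOf<AudioData>()

    fun add(frame: AudioData) {
        frames += frame
    }

    fun frameCount(): Int = frames.size

    fun clear() = frames.clear()
}
```

Declaring the managers as `object` keeps a single shared instance without a hand-written `getInstance()`, which fits the description of them as singletons for future state management.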
This commit integrates the Silero VAD (Voice Activity Detection) library to provide more robust speech detection capabilities.
- **Dependency Integration:**
  - Added the `com.konovalov.vad:silero-vad` dependency (`libs.silero`) to the project.
- **VAD Implementation:**
  - `VADAnalyserImpl` is updated to use the Silero `Vad` engine.
  - The implementation now uses the `vad.isSpeech()` method to determine if incoming audio frames contain speech.
  - It is configurable for VAD mode, sample rate, and frame size, with sensible defaults provided.
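
The configurable-with-defaults design described above could look roughly like the sketch below. It does not call the Silero library: `SpeechDetector` is a hypothetical stand-in whose only method mirrors the `isSpeech()` call mentioned in this description, and `VadMode`, `VadConfig`, `containsSpeech`, `energyDetector`, and all default values are assumptions chosen so the example runs without Android.

```kotlin
import kotlin.math.abs

/** Stand-in for the Silero engine; only `isSpeech` is taken from the PR text. */
fun interface SpeechDetector {
    fun isSpeech(frame: ShortArray): Boolean
}

/** Hypothetical mode values; the real library's modes may differ. */
enum class VadMode { NORMAL, AGGRESSIVE, VERY_AGGRESSIVE }

/** Hypothetical configuration mirroring the three knobs the PR mentions. */
data class VadConfig(
    val mode: VadMode = VadMode.NORMAL,
    val sampleRateHz: Int = 16_000,
    val frameSizeSamples: Int = 512,
)

class VADAnalyserImpl(
    private val config: VadConfig = VadConfig(),
    // Factory so a real engine (or a fake in tests) can be plugged in.
    detectorFactory: (VadConfig) -> SpeechDetector,
) {
    private val detector: SpeechDetector = detectorFactory(config)

    /** Returns true when the detector judges the frame to contain speech. */
    fun containsSpeech(frame: ShortArray): Boolean {
        // Reject frames whose length does not match the configured size,
        // since frame-based detectors generally expect fixed-size input.
        require(frame.size == config.frameSizeSamples) {
            "Expected ${config.frameSizeSamples} samples, got ${frame.size}"
        }
        return detector.isSpeech(frame)
    }
}

/** Toy mean-amplitude detector used only to make the sketch runnable. */
fun energyDetector(threshold: Double = 500.0) = SpeechDetector { frame ->
    val meanAbs = frame.sumOf { abs(it.toInt()) }.toDouble() / frame.size
    meanAbs > threshold
}
```

Passing the detector through a factory keeps the analyser testable on the JVM; in the SDK itself the factory would presumably construct the Silero engine from the same configuration.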
This commit refactors the audio analysis components by introducing a common interface and preparing for the integration of new analysis models.
### Key Changes:
- **`VoiceActivityAnalyser` Interface:**
  - A new interface, `VoiceActivityAnalyser`, is introduced to abstract the audio analysis logic. It defines a single method, `analyseAudioData`, for processing `AudioData`.
- **Refactored VAD Implementation:**
  - `VADAnalyserImpl` is updated to implement the new `VoiceActivityAnalyser` interface.
  - The `VadSilero` instance is now lazily initialized on its first use, ensuring it is only created when needed.
- **New `AudioQualityAnalyserImpl`:**
  - A new placeholder class, `AudioQualityAnalyserImpl`, has been created.
  - It also implements the `VoiceActivityAnalyser` interface, paving the way for future audio quality analysis features.
- **Dependency Updates:**
  - The ONNX Runtime dependency (`onnxruntime-android`) has been added to support upcoming machine learning model integrations.
  - The Silero VAD dependency (`libs.silero`) has been changed from `api` to `implementation` to better encapsulate it within the SDK.
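
To illustrate the shared interface and the lazy initialization described above, here is a minimal sketch. The `Boolean` return type of `analyseAudioData`, the single-field `AudioData`, and the `FakeEngine` class standing in for `VadSilero` are all assumptions; the description does not show the real signatures.

```kotlin
/** Simplified stand-in for the SDK's audio frame model. */
class AudioData(val samples: ShortArray)

/** Single-method abstraction from the PR; the return type is an assumption. */
interface VoiceActivityAnalyser {
    fun analyseAudioData(audioData: AudioData): Boolean
}

/** Hypothetical engine used in place of `VadSilero` so the sketch runs anywhere. */
class FakeEngine {
    fun isSpeech(frame: ShortArray): Boolean = frame.any { it.toInt() != 0 }
}

class VADAnalyserImpl(engineFactory: () -> FakeEngine) : VoiceActivityAnalyser {
    // `by lazy` defers construction until first access and caches the result,
    // so the engine is never created if no audio is ever analysed.
    private val engine: FakeEngine by lazy(engineFactory)

    override fun analyseAudioData(audioData: AudioData): Boolean =
        engine.isSpeech(audioData.samples)
}

class AudioQualityAnalyserImpl : VoiceActivityAnalyser {
    // Placeholder result; the PR describes this class as a stub.
    override fun analyseAudioData(audioData: AudioData): Boolean = false
}
```

Because both classes share one interface, callers can hold a `List<VoiceActivityAnalyser>` and run each analyser over the same frame. The default `lazy` mode is synchronized, so the engine is built at most once even if several threads call in concurrently.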
feat: Integrate Silero VAD for voice activity detection