EkaScribe Architecture Changes by divyesh11 · Pull Request #60 · eka-care/Eka-Scribe-Android

divyesh11 · 2025-11-05T09:30:34Z

feat: Integrate Silero VAD for voice activity detection

This commit integrates the Silero VAD (Voice Activity Detection) library to provide more robust speech detection capabilities.

Dependency Integration:
- Added the com.konovalov.vad:silero-vad dependency (libs.silero) to the project.

VAD Implementation:
- VADAnalyserImpl is updated to use the Silero Vad engine.
- The implementation now uses the vad.isSpeech() method to determine if incoming audio frames contain speech.
- It is configurable for VAD mode, sample rate, and frame size, with sensible defaults provided.
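
A minimal sketch of the configurable frame check described above, so reviewers can see the intended shape without the model on the classpath. Everything here is an assumption made for illustration: `VadMode`, the enum constants, `EnergyVadStandIn`, and `VadAnalyserSketch` are hypothetical names, and the energy threshold is only a stand-in for the Silero engine's `isSpeech()` call, not the library's behaviour.

```kotlin
import kotlin.math.abs

// Hypothetical configuration types; the SDK's real enums may use other names and values.
enum class VadMode { NORMAL, AGGRESSIVE }
enum class AudioSampleRate(val hz: Int) { RATE_8K(8_000), RATE_16K(16_000) }
enum class AudioFrameSize(val samples: Int) { SIZE_512(512), SIZE_1024(1_024) }

// Stand-in for the VAD engine. It thresholds mean absolute amplitude so the
// sketch runs anywhere; the real engine runs the Silero model instead.
class EnergyVadStandIn(private val mode: VadMode, private val frameSize: AudioFrameSize) {
    fun isSpeech(frame: ShortArray): Boolean {
        require(frame.size == frameSize.samples) {
            "expected ${frameSize.samples} samples, got ${frame.size}"
        }
        var total = 0L
        for (sample in frame) total += abs(sample.toInt())
        val threshold = if (mode == VadMode.AGGRESSIVE) 2_000L else 500L
        return total / frame.size > threshold
    }
}

// Analyser with defaults for mode, sample rate, and frame size.
class VadAnalyserSketch(
    mode: VadMode = VadMode.NORMAL,
    sampleRate: AudioSampleRate = AudioSampleRate.RATE_16K,
    frameSize: AudioFrameSize = AudioFrameSize.SIZE_512,
) {
    private val vad = EnergyVadStandIn(mode, frameSize)

    // Duration of one frame, derived from the two audio settings.
    val frameDurationMs: Int = frameSize.samples * 1_000 / sampleRate.hz

    fun containsSpeech(frame: ShortArray): Boolean = vad.isSpeech(frame)
}

fun main() {
    val analyser = VadAnalyserSketch()
    val silence = ShortArray(512)                    // all zeros
    val loud = ShortArray(512) { 8_000.toShort() }   // constant high amplitude
    println(analyser.containsSpeech(silence))
    println(analyser.containsSpeech(loud))
    println(analyser.frameDurationMs)
}
```

Keeping the defaults in the constructor means callers can write `VadAnalyserSketch()` for the common case and override only the setting they care about.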

This commit introduces a new Android library module, `ekascribe_sdk`, establishing the foundational structure for audio processing and analysis.

### Key Changes:

- **Module Scaffolding:**
  - Added the `ekascribe_sdk` Android library module.
  - Created the basic directory structure, including `build.gradle.kts`, `AndroidManifest.xml`, and Proguard rules.
  - Included the new module in `settings.gradle.kts`.
- **Build & Dependency Configuration:**
  - Configured `build.gradle.kts` for the new module, setting up `compileSdk`, `minSdk`, and Java 17 compatibility.
  - Added initial dependencies for `androidx.core.ktx`, `androidx.appcompat`, and `gson`.
- **Initial Audio & Manager Classes:**
  - Introduced core data models for audio processing: `AudioData`, `AudioSampleRate`, and `AudioFrameSize`.
  - Created placeholder classes for key components:
    - `VADAnalyserImpl` and its interface `VoiceActivityAnalyser` for voice activity detection.
    - `AudioDataManager` and `SessionManager` as singletons for future state management.
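
For orientation, the module setup described above might look roughly like the following Gradle Kotlin DSL fragment. This is a sketch under assumptions, not the PR's actual files: the `namespace`, the `compileSdk` and `minSdk` numbers, the use of the Kotlin Android plugin, the consumer Proguard file name, and the version catalog aliases other than `libs.silero` are all placeholders.

```kotlin
// settings.gradle.kts: register the new module with the build.
include(":ekascribe_sdk")

// ekascribe_sdk/build.gradle.kts
plugins {
    id("com.android.library")
    id("org.jetbrains.kotlin.android") // assumption: the module is written in Kotlin
}

android {
    namespace = "com.example.ekascribe_sdk" // hypothetical namespace
    compileSdk = 35                          // placeholder value

    defaultConfig {
        minSdk = 24                          // placeholder value
        consumerProguardFiles("consumer-rules.pro") // assumed file name
    }

    // Java 17 compatibility, as stated in the commit message.
    compileOptions {
        sourceCompatibility = JavaVersion.VERSION_17
        targetCompatibility = JavaVersion.VERSION_17
    }
    kotlinOptions {
        jvmTarget = "17"
    }
}

dependencies {
    // Catalog alias names are assumptions; only the libraries are named in the commit.
    implementation(libs.androidx.core.ktx)
    implementation(libs.androidx.appcompat)
    implementation(libs.gson)
}
```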


This commit refactors the audio analysis components by introducing a common interface and preparing for the integration of new analysis models.

### Key Changes:

- **`VoiceActivityAnalyser` Interface:**
  - A new interface, `VoiceActivityAnalyser`, is introduced to abstract the audio analysis logic. It defines a single method, `analyseAudioData`, for processing `AudioData`.
- **Refactored VAD Implementation:**
  - `VADAnalyserImpl` is updated to implement the new `VoiceActivityAnalyser` interface.
  - The `VadSilero` instance is now lazily initialized on its first use, ensuring it is only created when needed.
- **New `AudioQualityAnalyserImpl`:**
  - A new placeholder class, `AudioQualityAnalyserImpl`, has been created.
  - It also implements the `VoiceActivityAnalyser` interface, paving the way for future audio quality analysis features.
- **Dependency Updates:**
  - The ONNX Runtime dependency (`onnxruntime-android`) has been added to support upcoming machine learning model integrations.
  - The Silero VAD dependency (`libs.silero`) has been changed from `api` to `implementation` to better encapsulate it within the SDK.
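
The interface and lazy initialization described above can be sketched as follows. The sketch makes several assumptions: the fields of `AudioData` and the `Boolean` return type of `analyseAudioData` are guesses, and `StandInEngine`, `LazyVadAnalyser`, and `PlaceholderQualityAnalyser` are hypothetical names used in place of `VadSilero`, `VADAnalyserImpl`, and `AudioQualityAnalyserImpl`, so that the example runs without the Silero dependency.

```kotlin
// Assumed shape of the audio model; the real AudioData may carry other fields.
class AudioData(val samples: ShortArray, val sampleRateHz: Int)

// Common abstraction with a single method. The Boolean return type is an assumption.
interface VoiceActivityAnalyser {
    fun analyseAudioData(audioData: AudioData): Boolean
}

// Stand-in for the engine; counts constructions so lazy creation is observable.
class StandInEngine {
    init { created++ }

    // Treats any non-zero sample as speech, purely for demonstration.
    fun isSpeech(frame: ShortArray): Boolean = frame.any { it.toInt() != 0 }

    companion object { var created = 0 }
}

class LazyVadAnalyser : VoiceActivityAnalyser {
    // `by lazy` defers construction until the first call and then reuses the instance.
    private val vad by lazy { StandInEngine() }

    override fun analyseAudioData(audioData: AudioData): Boolean =
        vad.isSpeech(audioData.samples)
}

// Placeholder implementation behind the same interface; always reports false.
class PlaceholderQualityAnalyser : VoiceActivityAnalyser {
    override fun analyseAudioData(audioData: AudioData): Boolean = false
}

fun main() {
    val analysers: List<VoiceActivityAnalyser> =
        listOf(LazyVadAnalyser(), PlaceholderQualityAnalyser())
    println("engines created before first use: ${StandInEngine.created}")

    val frame = AudioData(ShortArray(512) { 100.toShort() }, sampleRateHz = 16_000)
    for (analyser in analysers) {
        println("${analyser::class.simpleName}: ${analyser.analyseAudioData(frame)}")
    }
    println("engines created after first use: ${StandInEngine.created}")
}
```

Because both classes sit behind `VoiceActivityAnalyser`, calling code can hold a list of analysers and never reference the engine type directly, which fits with moving the Silero dependency from `api` to `implementation`.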

divyesh11 added 2 commits November 4, 2025 19:39

divyesh11 self-assigned this Nov 5, 2025

divyesh11 added the work-in-progress label Nov 5, 2025

divyesh11 wants to merge 3 commits into main from divyesh/ekascribe_architecture_changes