A Java library for voice-enabled conversational AI. Speak to an LLM through your browser and get intelligent spoken responses.
This project is in an early alpha stage. Features are functional but may contain bugs, and APIs may change.
This project was vibe coded and serves as a PoC for PhoneBlock -> "Enhance PhoneBlock-AB with Local AI for Intelligent Scam Call Conversations".
```shell
git clone https://github.com/SchulteDev/ConversationalAI4J.git
cd ConversationalAI4J

# Complete voice AI system.
# The first start needs a few minutes to download the models.
docker-compose up --build
# → http://localhost:8080
```

```java
// Text chat
ConversationalAI ai = ConversationalAI.builder()
    .withOllamaModel("llama3.2:3b")
    .build();
String response = ai.chat("Hello!");

// Voice chat
ConversationalAI voiceAI = ConversationalAI.builder()
    .withOllamaModel("llama3.2:3b")
    .withSpeech()
    .build();
byte[] audioResponse = voiceAI.voiceChat(audioBytes);
```

- Click the microphone button to start recording
- Speak your message
- Click microphone again to stop and send
- AI responds with both text and synthesized speech
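Outside the browser flow, the `audioBytes` passed to `voiceChat` are just the raw bytes of a recording. A minimal JDK-only sketch of obtaining them from a file on disk (the file here is a stand-in created for the example; in real use it would be an actual WAV/WebM recording, and the bytes would be handed to `voiceChat` as shown above):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class AudioBytesSketch {
    public static void main(String[] args) throws IOException {
        // Stand-in recording; a real one would come from the browser or a mic capture.
        Path recording = Files.createTempFile("recording", ".wav");
        Files.write(recording, new byte[]{'R', 'I', 'F', 'F'});

        byte[] audioBytes = Files.readAllBytes(recording);
        System.out.println(audioBytes.length); // → 4

        // byte[] audioResponse = voiceAI.voiceChat(audioBytes); // library call from the example above
        Files.deleteIfExists(recording);
    }
}
```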
Pipeline: Browser Audio → FFmpeg Decoding → Whisper.cpp → Ollama → Piper TTS
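whisper.cpp consumes 16 kHz mono 16-bit PCM, which is presumably what the FFmpeg decoding step normalizes browser audio to. Purely as an illustration of that target format (this is not the project's code), a minimal WAV wrapper around raw PCM:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Illustrative only: wraps raw PCM in a minimal 44-byte WAV header. */
public class WavSketch {
    static byte[] wrapPcm(byte[] pcm, int sampleRate, short channels, short bitsPerSample)
            throws IOException {
        int byteRate = sampleRate * channels * bitsPerSample / 8;
        short blockAlign = (short) (channels * bitsPerSample / 8);
        ByteBuffer header = ByteBuffer.allocate(44).order(ByteOrder.LITTLE_ENDIAN);
        header.put("RIFF".getBytes());
        header.putInt(36 + pcm.length);   // total chunk size after this field
        header.put("WAVE".getBytes());
        header.put("fmt ".getBytes());
        header.putInt(16);                // fmt chunk size for plain PCM
        header.putShort((short) 1);       // audio format 1 = uncompressed PCM
        header.putShort(channels);
        header.putInt(sampleRate);
        header.putInt(byteRate);
        header.putShort(blockAlign);
        header.putShort(bitsPerSample);
        header.put("data".getBytes());
        header.putInt(pcm.length);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(header.array());
        out.write(pcm);
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] silence = new byte[16000 * 2]; // one second of 16-bit silence at 16 kHz
        byte[] wav = wrapPcm(silence, 16000, (short) 1, (short) 16);
        System.out.println(wav.length);   // → 32044 (44-byte header + PCM payload)
    }
}
```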
Browser Compatibility: Works with modern browsers (Chrome, Firefox, Safari) using the MediaRecorder API. Supports WebM/Opus and WAV formats with server-side FFmpeg decoding.
| Variable | Default | Purpose |
|---|---|---|
| `OLLAMA_BASE_URL` | `http://localhost:11434` | Ollama server URL |
| `SPEECH_ENABLED` | `false` | Enable voice features |
| `WHISPER_MODEL_PATH` | `/app/models/whisper/ggml-base.en.bin` | Whisper STT model path |
| `PIPER_MODEL_PATH` | `/app/models/piper/en_US-amy-medium.onnx` | Piper TTS model path |
| `PIPER_CONFIG_PATH` | `/app/models/piper/en_US-amy-medium.onnx.json` | Piper TTS config path |
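These variables can be overridden in the environment before launching; a sketch (the hostname is a placeholder, not a recommendation):

```shell
# Example overrides: point at a remote Ollama host and enable voice features.
export OLLAMA_BASE_URL=http://ollama.internal:11434
export SPEECH_ENABLED=true
# then: docker-compose up --build
echo "$OLLAMA_BASE_URL $SPEECH_ENABLED"
```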
See CONTRIBUTING.md for development setup and ARCHITECTURE.md for technical details.