RemoAI-QH

A comprehensive AI personal assistant with voice input and 24/7 listening, built on AnythingLLM and Whisper, featuring human-like conversation and NPU acceleration.

Table of Contents

Purpose
Implementation
Setup
Usage
Troubleshooting
Contributing
Code of Conduct

Purpose

Remo is an extensible AI personal assistant platform designed for privacy-first, local AI interactions. The application combines friendly, human-like conversation with voice recognition, 24/7 listening, chat, text-to-speech, and NPU acceleration. It is built with AnythingLLM for LLM functionality and OpenAI Whisper for speech recognition.

Key features include:

AI Persona: A friendly, engaging Remo personality that behaves less like a typical chatbot or LLM. As a personal assistant, it aims to understand the user rather than act as a query-based model.
Voice Integration: Real-time speech-to-text and text-to-speech
NPU Acceleration: Optimized for Snapdragon X Elite and other NPU-enabled hardware
Privacy-First: Local processing
Cross-Platform: Electron-based desktop application

Implementation

This application was designed to be platform-agnostic with optimizations for NPU-enabled hardware. Performance may vary on different hardware configurations.

Hardware

Machine: Dell Latitude 7455
Chip: Snapdragon X Elite, Intel, AMD
OS: Windows 11
Memory: 32 GB

Software

Node.js Version: 16.0.0+
Python Version: 3.8+
AnythingLLM LLM Provider: AnythingLLM NPU (or Qualcomm QNN for older versions)
AnythingLLM Chat Model: Llama 3.2 8B Chat 8K
Frontend: Electron with modern web technologies
Backend: Python Flask API with unified endpoints

Architecture

┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   Electron UI   │◄──►│   Flask API      │◄──►│   AnythingLLM   │
│                 │    │                  │    │                 │
│ • Voice UI      │    │ • Chat Client    │    │ • LLM Provider  │
│ • Audio I/O     │    │ • Whisper API    │    │ • Workspace     │
│                 │    │ • TTS Service    │    │ • Memory        │
└─────────────────┘    └──────────────────┘    └─────────────────┘
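
The flow in the diagram can be sketched as one voice turn: audio goes to speech-to-text, the transcript goes to the LLM, and the reply goes to text-to-speech. This is only an illustrative sketch; `handle_voice_turn`, `VoiceTurn`, and the three callables are hypothetical names and are not taken from the project's code.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoiceTurn:
    """Result of one round trip through the pipeline."""
    transcript: str
    reply: str


def handle_voice_turn(
    audio: bytes,
    transcribe: Callable[[bytes], str],  # hypothetical Whisper wrapper
    chat: Callable[[str], str],          # hypothetical AnythingLLM client
    speak: Callable[[str], None],        # hypothetical eSpeak wrapper
) -> VoiceTurn:
    """Run speech-to-text, then the LLM, then text-to-speech."""
    transcript = transcribe(audio).strip()
    if not transcript:
        # Nothing intelligible was heard, so skip the LLM and TTS calls.
        return VoiceTurn(transcript="", reply="")
    reply = chat(transcript)
    speak(reply)
    return VoiceTurn(transcript=transcript, reply=reply)
```

Passing the three stages in as callables keeps the sketch independent of any particular speech or LLM library, so each stage can be replaced with a stub while testing.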

Setup

Prerequisites

System Requirements

Node.js 16+
Python 3.8+
Git
Audio System - Working microphone and speakers/headphones

Core Dependencies

AnythingLLM - Download and set up AnythingLLM
OpenAI Whisper - For speech-to-text functionality
eSpeak/eSpeak-ng - For text-to-speech functionality

Platform-Specific Dependencies

Windows:

Visual C++ Build Tools (for PyAudio compilation)
Chocolatey or winget (for eSpeak installation)

Step-by-Step Setup

Phase 1: System Dependencies Installation

Install Platform-Specific Dependencies

Windows:

# Install Visual C++ Build Tools (required for PyAudio)
# Download from: https://visualstudio.microsoft.com/visual-cpp-build-tools/

# Install Chocolatey (if not already installed)
Set-ExecutionPolicy Bypass -Scope Process -Force
[System.Net.ServicePointManager]::SecurityProtocol = [System.Net.ServicePointManager]::SecurityProtocol -bor 3072
iex ((New-Object System.Net.WebClient).DownloadString('https://community.chocolatey.org/install.ps1'))

# Install eSpeak via Chocolatey
choco install espeak -y

Phase 2: AnythingLLM Setup

Install and set up AnythingLLM
- Download and install AnythingLLM from https://anythingllm.com/
- When prompted for an LLM provider, choose AnythingLLM NPU to target the NPU
- Choose a model (recommended: Llama 3.2 8B Chat 8K)
- Create a workspace by clicking "+ New Workspace"
Generate an API key
- Click the settings button on the bottom of the left panel
- Open the "Tools" dropdown
- Click "Developer API"
- Click "Generate New API Key"
- Copy and save your API key

Phase 3: RemoAI-QH Installation

Clone and set up the repository

# Clone the repository
git clone https://github.com/RemoAI-LLC/RemoAI-QH.git
cd RemoAI-QH

Install Node.js dependencies

# Install frontend dependencies
npm install

Install Python dependencies

# Creates a virtual environment and installs all Python packages
npm run setup:python

# Alternative: Manual Python setup
cd llm
python -m venv npu-chatbot-env

# Activate virtual environment
# Windows:
npu-chatbot-env\Scripts\activate
# macOS/Linux:
source npu-chatbot-env/bin/activate

# Install Python dependencies
pip install -r requirements.txt

Install and verify eSpeak

# Run the automatic eSpeak installer
python tts/install_espeak.py

# Test eSpeak installation
espeak --version
# Should output: eSpeak text-to-speech: version 1.51 or similar
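
The same check can be run from Python, which is handy when the backend needs to confirm that a speech binary is on the PATH before starting. This is a minimal sketch using only the standard library; `find_espeak` and `espeak_version` are hypothetical helper names, and trying `espeak-ng` before `espeak` is an assumption about which binary is preferred.

```python
import shutil
import subprocess
from typing import Optional


def find_espeak() -> Optional[str]:
    """Return the path of the first eSpeak binary found on PATH, or None."""
    for name in ("espeak-ng", "espeak"):
        path = shutil.which(name)
        if path:
            return path
    return None


def espeak_version(binary: str, timeout: float = 5.0) -> str:
    """Run `<binary> --version` and return the first line it prints."""
    result = subprocess.run(
        [binary, "--version"],
        capture_output=True,
        text=True,
        timeout=timeout,
        check=False,
    )
    output = (result.stdout or result.stderr).strip()
    return output.splitlines()[0] if output else ""
```

If `find_espeak()` returns `None`, the binary is not on the PATH of the current shell, which usually means the installer has not run or the terminal needs to be reopened.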

Install and verify Whisper

# Whisper is installed via requirements.txt, but you can verify:
cd openai-whisper
python test_whisper.py

Phase 4: Configuration

Configure the application

Edit llm/config.yaml with your settings:

api_key: "your-anythingllm-api-key-here"
listen_api_key: "your-listen-api-key-here"
model_server_base_url: "http://localhost:3001/api/v1"
workspace_slug: "your-workspace-slug"
stream: true
stream_timeout: 60
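
A small validation step can catch the most common mistakes in this file before the backend starts. The sketch below is an assumption-laden example, not the project's loader: `validate_config` and `load_config` are hypothetical names, the choice of which keys are required is a guess based on the sample above, and `load_config` assumes PyYAML is installed.

```python
from urllib.parse import urlparse

# Keys from the sample config that this sketch treats as required (an assumption).
REQUIRED_KEYS = ("api_key", "model_server_base_url", "workspace_slug")


def validate_config(cfg: dict) -> list:
    """Return a list of problems; an empty list means the config looks usable."""
    problems = []
    for key in REQUIRED_KEYS:
        value = cfg.get(key)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"{key} is missing or empty")
        elif value.startswith("your-"):
            problems.append(f"{key} still holds the placeholder value")
    url = cfg.get("model_server_base_url")
    if isinstance(url, str) and url.strip():
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append("model_server_base_url is not an http(s) URL")
    timeout = cfg.get("stream_timeout", 60)
    if isinstance(timeout, bool) or not isinstance(timeout, (int, float)) or timeout <= 0:
        problems.append("stream_timeout must be a positive number")
    return problems


def load_config(path: str = "llm/config.yaml") -> dict:
    """Load the YAML file. Assumes PyYAML is available in the environment."""
    import yaml  # imported lazily so validate_config works without PyYAML

    with open(path, "r", encoding="utf-8") as handle:
        return yaml.safe_load(handle) or {}
```

Calling `validate_config(load_config())` from the repository root returns an empty list when every required value has been filled in.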

Get your workspace slug

# Run from the llm directory
cd llm
python src/workspaces.py
# Find your workspace and copy its slug from the output
# Add the slug to the workspace_slug variable in config.yaml
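
For readers who want to see what a slug lookup involves, here is a minimal standard-library sketch. It is not the contents of `src/workspaces.py`. The `/workspaces` path, the bearer-token header, and the `{"workspaces": [...]}` response shape are assumptions about the AnythingLLM developer API and should be checked against the API documentation served by your AnythingLLM instance; all function names are hypothetical.

```python
import json
import urllib.request


def build_workspaces_request(base_url: str, api_key: str) -> urllib.request.Request:
    """Build a GET request for the workspace list (endpoint path is an assumption)."""
    url = base_url.rstrip("/") + "/workspaces"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {api_key}", "Accept": "application/json"},
        method="GET",
    )


def extract_slugs(payload: dict) -> dict:
    """Map workspace name -> slug from a payload shaped like {"workspaces": [...]}."""
    return {
        ws.get("name", ""): ws["slug"]
        for ws in payload.get("workspaces", [])
        if "slug" in ws
    }


def list_workspace_slugs(base_url: str, api_key: str, timeout: float = 10.0) -> dict:
    """Call the server and return the name -> slug mapping."""
    request = build_workspaces_request(base_url, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return extract_slugs(json.load(response))
```

With AnythingLLM running, `list_workspace_slugs("http://localhost:3001/api/v1", "<your key>")` would return a dictionary from which the slug can be copied into `config.yaml`.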

Phase 5: Testing and Verification

Test the complete setup

# Test the model server authentication
python llm/src/auth.py

# Test persona system
npm run persona:test

# Test TTS functionality
npm run tts:test

# Test Whisper integration
cd openai-whisper
python test_whisper.py
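
Beyond the scripts above, a one-message round trip is a quick way to confirm that the API key, base URL, and workspace slug work together. The sketch below is a hedged example using only the standard library. The `/workspace/<slug>/chat` path, the `{"message", "mode"}` request body, and the `textResponse` and `error` response fields are assumptions about the AnythingLLM developer API, and the function names are hypothetical.

```python
import json
import urllib.request


def build_chat_request(base_url: str, slug: str, api_key: str, message: str) -> urllib.request.Request:
    """Build a POST request for one chat message (path and body shape are assumptions)."""
    url = f"{base_url.rstrip('/')}/workspace/{slug}/chat"
    body = json.dumps({"message": message, "mode": "chat"}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def reply_text(payload: dict) -> str:
    """Return the assistant's text, or raise if the server reported an error."""
    if payload.get("error"):
        raise RuntimeError(f"chat failed: {payload['error']}")
    return payload.get("textResponse") or ""


def smoke_test(base_url: str, slug: str, api_key: str, timeout: float = 60.0) -> str:
    """Send a short greeting and return the reply text."""
    request = build_chat_request(base_url, slug, api_key, "Hello, Remo!")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return reply_text(json.load(response))
```

A non-empty string from `smoke_test(...)` suggests the three settings in `config.yaml` are consistent; an HTTP error from `urlopen` points at the key or the URL.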

Start the application

# Start both backend and frontend
npm start

# Or start components individually:
# Backend only:
npm run start:backend-only

# Frontend only:
npm run start:frontend-only

Usage

Interact with the AI assistant through the desktop application:

Desktop Application (Recommended)

# Start the full application (backend + frontend)
npm start

Troubleshooting

Common Issues

Audio Issues:

Ensure microphone permissions are granted
Check that eSpeak is properly installed: espeak --version
Verify audio drivers are up to date

API Connection Issues:

Verify AnythingLLM is running on http://localhost:3001
Check API key configuration in llm/config.yaml
Ensure workspace slug is correct
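
Before digging into keys and slugs, it helps to confirm that something is listening on the AnythingLLM port at all. A minimal sketch, assuming the default address above; `is_reachable` is a hypothetical helper that only tests the TCP connection and does not validate the API key.

```python
import socket
from urllib.parse import urlparse


def is_reachable(url: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parsed = urlparse(url)
    host = parsed.hostname
    if host is None:
        return False
    # Fall back to the scheme's default port when the URL does not name one.
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

If `is_reachable("http://localhost:3001")` returns `False`, AnythingLLM is most likely not running or is bound to a different port.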

NPU Performance Issues:

Verify NPU drivers are installed
Check hardware compatibility
Monitor system resources during operation

Python Environment Issues:

Ensure virtual environment is activated
Reinstall dependencies: pip install -r requirements.txt
Check Python version compatibility (3.8+)
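
The first and third checks can be automated with the standard library. This is a small sketch with hypothetical helper names; comparing `sys.prefix` with `sys.base_prefix` detects environments created by `python -m venv`, as in the manual setup steps.

```python
import sys


def python_ok(minimum=(3, 8), version=None) -> bool:
    """Return True if the interpreter version is at least `minimum`."""
    current = tuple(version) if version is not None else tuple(sys.version_info[:2])
    return current >= tuple(minimum)


def in_virtualenv() -> bool:
    """Return True when running inside a venv-style virtual environment."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)
```

Running both functions with the interpreter that starts the backend shows whether it is the virtual environment's Python and whether it meets the 3.8+ requirement.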

Getting Help

If you encounter issues not covered here:

Check the Issues page
Create a new issue with detailed error information
Include system specifications and error logs

Contributing

We welcome contributions to RemoAI-QH! Here's how you can help:

How to Contribute

Fork the repository
Create a feature branch: git checkout -b feature/amazing-feature
Make your changes and test thoroughly
Commit your changes: git commit -m 'Add amazing feature'
Push to the branch: git push origin feature/amazing-feature
Open a Pull Request

Development Guidelines

Follow existing code style and conventions
Add tests for new features
Update documentation as needed
Ensure all tests pass before submitting

Areas for Contribution

Voice Recognition: Improve accuracy and performance
TTS Integration: Add more voice options and languages
UI/UX: Enhance the user interface
NPU Optimization: Improve hardware acceleration
Documentation: Help improve guides and examples

Code of Conduct

Our Pledge

We are committed to providing a welcoming and inclusive environment for everyone, regardless of:

Age, body size, disability, ethnicity
Gender identity and expression
Level of experience, education
Nationality, personal appearance
Race, religion, sexual orientation

Expected Behavior

Use welcoming and inclusive language
Be respectful of differing viewpoints
Accept constructive criticism gracefully
Focus on what's best for the community
Show empathy towards other community members

Unacceptable Behavior

Harassment, trolling, or inappropriate comments
Personal attacks or political discussions
Public or private harassment
Publishing private information without permission
Other conduct inappropriate in a professional setting

Enforcement

Project maintainers are responsible for clarifying standards and taking appropriate action for any behavior they deem inappropriate. This may include warnings, temporary bans, or permanent bans.

License

MIT License - see LICENSE file for details.

Acknowledgments

OpenAI Whisper for speech recognition
AnythingLLM for LLM integration and NPU acceleration
Electron for cross-platform desktop application framework
Flask for Python API server
Gradio for web interface components
