Using the OpenAI Realtime API can incur significant costs. I strongly advise against spending personal savings to try this tool, as the expenses are likely to outweigh the benefits. Proceed with caution and monitor your usage.
OpenAI Advanced Voice Python Assistant is a Python-based assistant that uses OpenAI's Realtime API for real-time audio interactions. Designed for personal experimentation, it supports live audio conversations, transcription of both user and assistant speech, and tool calling for extended functionality. The project is inspired by and based on the openai-realtime-py repository. It has been tested on Windows 11 and aims to be compatible with Linux.
- Real-time Voice Interaction: Live audio conversations with the assistant.
- Speech Transcription: Transcription of both user and assistant speech.
- Tool Calling: Invoke predefined tools during conversations.
- Interruptions Support: Interrupt the assistant while it is speaking.
- Debug Logging: Optionally log all incoming messages for debugging purposes.
- Python 3.7 or higher
- Virtual Environment (recommended)
- Clone the Repository
    git clone https://github.com/your-username/openai-advanced-voice-python.git
    cd openai-advanced-voice-python
- Create a Virtual Environment (optional)
  - On Windows:
      python -m venv venv
      venv\Scripts\activate
  - On Linux:
      python3 -m venv venv
      source venv/bin/activate
- Install Dependencies
    pip install -r requirements.txt
- Environment Variables
  - Copy .env.example to .env:
      cp .env.example .env
  - Open the .env file and configure the following variables:
      OPENAI_API_KEY=your-openai-key
    OPENAI_API_KEY: Your OpenAI API key.
- Configuration File
  - Copy config.yaml.example to config.yaml:
      cp config.yaml.example config.yaml
  - Open the config.yaml file and adjust the settings as you see fit.
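As an illustration of the environment step above, here is a minimal sketch of how a script could pick up OPENAI_API_KEY. It assumes the .env file contains plain KEY=value lines. The helper names read_env_file and get_api_key are hypothetical, and the project itself may load the file differently (for example through a dotenv library).

```python
import os
from pathlib import Path


def read_env_file(path=".env"):
    """Parse simple KEY=value lines (hypothetical helper, not the project's loader)."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_api_key(path=".env"):
    """Prefer the process environment, then fall back to the .env file."""
    key = os.environ.get("OPENAI_API_KEY") or read_env_file(path).get("OPENAI_API_KEY")
    # Reject the placeholder value shipped in .env.example.
    if not key or key == "your-openai-key":
        raise RuntimeError("OPENAI_API_KEY is not set; edit your .env file")
    return key
```

Failing early on a missing or placeholder key avoids opening a connection that would be rejected anyway.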
- Activate Virtual Environment (optional)
  - On Windows:
      venv\Scripts\activate
  - On Linux:
      source venv/bin/activate
- Run the Assistant
    python realtime_assistant.py
- Interact
  - Speak into your microphone to communicate with the assistant.
  - The assistant will respond in real time with audio and text transcriptions.
The assistant supports tool calling, allowing it to perform specific functions during the conversation. Available tools include:
- write_to_console
  - Description: Writes a specified message to the console.
  - Parameters:
    - message (string): The message to write.
- save_to_file
  - Description: Saves user-specified content to a file.
  - Parameters:
    - file_name (string): Name of the file without extension.
    - file_extension (string): File extension (e.g., txt, md).
    - file_content (string): Content to write into the file.
Additional tools can be added by extending the TOOLS list in the configuration.
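To make the tool list more concrete, the sketch below shows one way the two tools described above could be declared and dispatched. The exact shape of the TOOLS entries, the HANDLERS mapping, the dispatch_tool_call function, and the extra directory parameter are assumptions for illustration. The project's real definitions may differ.

```python
import json
from pathlib import Path

# Assumed shape of TOOLS entries: JSON-Schema style function definitions.
TOOLS = [
    {
        "type": "function",
        "name": "write_to_console",
        "description": "Writes a specified message to the console.",
        "parameters": {
            "type": "object",
            "properties": {"message": {"type": "string"}},
            "required": ["message"],
        },
    },
    {
        "type": "function",
        "name": "save_to_file",
        "description": "Saves user-specified content to a file.",
        "parameters": {
            "type": "object",
            "properties": {
                "file_name": {"type": "string"},
                "file_extension": {"type": "string"},
                "file_content": {"type": "string"},
            },
            "required": ["file_name", "file_extension", "file_content"],
        },
    },
]


def write_to_console(message):
    print(message)
    return "ok"


def save_to_file(file_name, file_extension, file_content, directory="."):
    # `directory` is a hypothetical extra parameter so callers can pick a target folder.
    path = Path(directory) / f"{file_name}.{file_extension.lstrip('.')}"
    path.write_text(file_content, encoding="utf-8")
    return str(path)


# Hypothetical mapping from tool name to the Python function that implements it.
HANDLERS = {"write_to_console": write_to_console, "save_to_file": save_to_file}


def dispatch_tool_call(name, arguments_json):
    """Look up the handler for `name` and call it with the decoded JSON arguments."""
    handler = HANDLERS.get(name)
    if handler is None:
        return f"unknown tool: {name}"
    return handler(**json.loads(arguments_json))
```

Under these assumptions, adding a tool means appending one schema entry to TOOLS and registering a matching function in HANDLERS.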
If debug_mode is enabled in the config.yaml file, all incoming messages (excluding certain binary types) will be logged to incoming_messages.log. This is useful for debugging and monitoring the assistant's interactions.
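The logging behaviour described above could look roughly like the following sketch. It assumes incoming messages are already decoded into dictionaries with a "type" field, and that the excluded binary types are audio delta events. The EXCLUDED_TYPES set and both function names are hypothetical.

```python
import json
import logging

# Assumption: event types that carry base64 audio and would flood the log.
EXCLUDED_TYPES = {"response.audio.delta"}


def make_message_logger(log_path="incoming_messages.log"):
    """Create a file logger for incoming messages (hypothetical helper)."""
    logger = logging.getLogger("incoming_messages")
    logger.setLevel(logging.DEBUG)
    logger.propagate = False  # keep these entries out of the root logger
    if not logger.handlers:
        handler = logging.FileHandler(log_path, encoding="utf-8")
        handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
        logger.addHandler(handler)
    return logger


def log_incoming(logger, message, debug_mode):
    """Log `message` as JSON unless debugging is off or its type is excluded."""
    if not debug_mode or message.get("type") in EXCLUDED_TYPES:
        return False
    logger.debug(json.dumps(message))
    return True
```

Returning a boolean makes it easy to check in a quick test which messages were actually written.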
When allow_interruptions is enabled in the config.yaml file, the microphone stays active while the assistant is speaking, so you can interrupt it mid-response.
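One possible way to handle an interruption is sketched below. It assumes the server signals detected speech with an input_audio_buffer.speech_started event and that sending a response.cancel event stops the current response. The function name and the playback queue are hypothetical, and the project may implement this differently.

```python
import queue


def handle_interruption(event, playback_queue, allow_interruptions):
    """Return the client events to send in reaction to `event` (illustrative only)."""
    if not allow_interruptions:
        return []
    if event.get("type") != "input_audio_buffer.speech_started":
        return []
    # Drop audio that has been received but not yet played.
    while True:
        try:
            playback_queue.get_nowait()
        except queue.Empty:
            break
    # Ask the server to stop generating the current response.
    return [{"type": "response.cancel"}]
```

Clearing the local queue matters as much as cancelling the response, because audio that is already buffered would otherwise keep playing over the user.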
OpenAI Advanced Voice Python Assistant is intended for personal use and experimentation. It is not production-ready and may not handle all edge cases correctly. Use at your own risk.
Additionally, using the OpenAI Realtime API can be costly. Be aware of rapid credit spending and monitor your usage to avoid unexpected charges.
This project is licensed under the MIT License.