Welcome to the Speech2Model project! This program converts spoken descriptions into detailed 3D models using the Meshy.ai API and a custom speech-to-text pipeline.
The application enables you to:
- Transcribe your speech using a microphone.
- Generate detailed 3D modeling prompts with the help of a language model.
- Create and download 3D models directly from Meshy.ai.
- View the generated models immediately.
- Live Transcription: Use your microphone to describe the 3D model.
- Intelligent Prompt Generation: Automatically enhance vague descriptions into detailed prompts for 3D modeling.
- Real-time Feedback: Generate and download your 3D models in real time.
Ensure you have the following installed:
- Ollama with a local LLM (used to sanitize and improve the 3D prompt)
- Python 3.8 or later
- Required Python libraries (install via pip):
pip install speechrecognition requests
git clone https://github.com/jsammarco/Speech2Model.git
cd Speech2Model
- Obtain your Meshy.ai API key.
- Replace YOUR MESHY.AI KEY in the code with your actual API key.
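If you would rather not keep the key in the source file, one option is to read it from an environment variable. This is a minimal sketch, not how speech2model.py currently works: the MESHY_API_KEY variable name and the load_api_key helper are hypothetical.

```python
import os

# Placeholder string that this README asks you to replace.
PLACEHOLDER = "YOUR MESHY.AI KEY"


def load_api_key(env=None, var_name="MESHY_API_KEY"):
    """Return the Meshy.ai key from the environment, or raise if it is missing.

    var_name is a hypothetical variable name chosen for this sketch.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key or key == PLACEHOLDER:
        raise RuntimeError(f"Set {var_name} to your Meshy.ai API key first.")
    return key
```

You could then assign the result to the key variable in the script, for example api_key = load_api_key().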
Run the script to start the live transcription:
python speech2model.py
- Describe Your Model: Speak into your microphone to describe the 3D model.
- Initiate Model Creation: Say "Create Model" when you're done describing.
- Prompt Generation: The program generates a detailed modeling prompt.
- Model Download: The 3D model is created and saved as a .glb file.
- View the Model: The model opens automatically with your system's default viewer.
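The prompt generation step can be pictured roughly as follows. This sketch assumes Ollama is running locally on its default port and uses its /api/generate endpoint. The model name "llama3", the instruction text, and both function names are assumptions for illustration, not taken from speech2model.py.

```python
import json
import urllib.request

# Default address of a locally running Ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_ollama_request(description, model="llama3"):
    """Build the JSON payload that asks the LLM to expand a spoken description."""
    instruction = (
        "Rewrite the following spoken description as one detailed prompt "
        "for a text-to-3D model generator. Reply with the prompt only.\n\n"
    )
    return {
        "model": model,  # assumed model name; use one you have pulled
        "prompt": instruction + description.strip(),
        "stream": False,  # ask for a single JSON reply instead of a stream
    }


def enhance_prompt(description, model="llama3", timeout=120):
    """Send the description to Ollama and return the generated prompt text."""
    data = json.dumps(build_ollama_request(description, model)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        reply = json.loads(response.read().decode("utf-8"))
    return reply["response"].strip()
```

Calling enhance_prompt("a futuristic car") requires the Ollama server to be running; build_ollama_request can be inspected without any network access.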
- Say: "A futuristic car with sleek design and neon lights."
- Say: "Create Model."
- The program processes your input and generates the model.
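The "Create Model" trigger in this example could be detected along these lines. It is a sketch that works on already-transcribed phrases; the function names and the punctuation handling are assumptions, and the actual script may do this differently.

```python
import re

TRIGGER = "create model"


def normalize(text):
    """Lowercase and drop punctuation so "Create Model." matches the trigger."""
    return re.sub(r"[^a-z0-9 ]+", "", text.lower()).strip()


def collect_description(phrases, trigger=TRIGGER):
    """Join the phrases heard before the trigger phrase.

    Returns None if the trigger never occurs. Words spoken in the same
    phrase as the trigger are ignored in this simplified version.
    """
    heard = []
    for phrase in phrases:
        if trigger in normalize(phrase):
            return " ".join(heard)
        heard.append(phrase.strip())
    return None
```

In a live loop, phrases would be the transcripts coming back from the speech recognizer, one per utterance.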
- speech2model.py: Main program file.
- .glb: Generated 3D model files are saved with this extension.
Modify the following parameters as needed:
- Meshy.ai API key: Replace in the api_key variable.
- Polling Interval: Adjust poll_interval for checking model status.
- Default Output File: Change output_file to specify where models are saved.
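To show what poll_interval controls, here is a rough sketch of a status polling loop. The fetch_status callable is hypothetical and stands in for whatever request the script makes to Meshy.ai; the status strings and the timeout parameter are also assumptions, so check them against the API responses you actually receive.

```python
import time


def wait_for_model(fetch_status, poll_interval=5.0, timeout=600.0, sleep=time.sleep):
    """Call fetch_status() every poll_interval seconds until the task ends.

    fetch_status is a hypothetical callable returning a dict such as
    {"status": "SUCCEEDED", "model_url": "..."}. The status names used
    here are assumptions, not confirmed values from the Meshy.ai API.
    """
    waited = 0.0
    while True:
        task = fetch_status()
        status = task.get("status")
        if status == "SUCCEEDED":
            return task
        if status == "FAILED":
            raise RuntimeError("Model generation failed.")
        if waited >= timeout:
            raise TimeoutError("Model was not ready before the timeout.")
        sleep(poll_interval)  # a shorter interval means more API requests
        waited += poll_interval
```

A smaller poll_interval notices a finished model sooner but sends more requests; a larger one is gentler on the API at the cost of a short delay.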
- Ensure your microphone is working and permissions are granted.
- Verify your Meshy.ai API key is valid.
- Check your internet connection for API requests.
Feel free to fork this repository and submit pull requests to contribute improvements.
This project is licensed under the MIT License.
For any questions or issues, please reach out via the GitHub repository: https://github.com/jsammarco/Speech2Model
