Getting Started
- Install missing packages:
  sudo apt-get install espeak
  sudo apt-get install python-espeak
  sudo apt-get install python-pyaudio
- You can test espeak by running this command:
  espeak -v en "Hello i am espeak"
- Install sphinxbase
  - Download the sphinxbase package from here: https://sourceforge.net/projects/cmusphinx/files/sphinxbase/5prealpha/
  - Set up sphinxbase by following the tutorial here.
    - Do not install from the GitHub source files.
- Install pocketsphinx
  - Download the pocketsphinx package from here: https://sourceforge.net/projects/cmusphinx/files/pocketsphinx/5prealpha/
  - Set up pocketsphinx by following the tutorial here.
- Set up the speech package
  - Add the following line to your ~/.bashrc file:
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib
  - "Install" the package by adding a link to the speech package from your workspace.
    - git clone the hlpr_speech repo into ~/Software
    - e.g. ln -s ~/Software/hlpr_speech/ ~/YOUR_WORKSPACE_NAME/src/
  - Run catkin_make from your workspace folder.
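If the libraries still cannot be found after editing ~/.bashrc, it may help to confirm that the new shell actually picked up the change. The following is a minimal sketch of such a check; the function name `on_library_path` is hypothetical and not part of hlpr_speech.

```python
import os


def on_library_path(lib_dir, env_value=None):
    """Return True if lib_dir is one of the entries in LD_LIBRARY_PATH.

    env_value lets a caller pass the variable's contents explicitly;
    by default the value is read from the current environment.
    """
    if env_value is None:
        env_value = os.environ.get("LD_LIBRARY_PATH", "")
    # Entries are colon-separated; skip empty entries left by a leading ':'
    entries = [os.path.normpath(p) for p in env_value.split(":") if p]
    return os.path.normpath(lib_dir) in entries


if __name__ == "__main__":
    # Run this from a freshly opened terminal so ~/.bashrc has been sourced
    print(on_library_path("/usr/local/lib"))
```

If this prints False in a new terminal, the export line was probably not saved or the terminal was opened before the edit.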
- Creating a dictionary for the speech commands
  - The speech_listener and speech_gui take a parameter that specifies a .yaml file containing the mapping between keywords and speech commands. The default is kps.yaml. If you make your own .yaml file, put it in the hlpr_speech_recognition/data/ directory.
  - The speech_recognizer uses pocketsphinx and needs three files in the hlpr_speech_recognition/data/ directory. The first is a text file listing each speech phrase to be recognized, expected to be named kps.txt. To add new commands, add them to this file and then follow the next step to create the other two files.
  - Go to the following link and generate the .dic and .lm files from your phrase list. Replace the previous files with the new ones, renaming them 6858.dic and 6858.lm respectively.
    http://www.speech.cs.cmu.edu/tools/lmtool-new.html
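After regenerating the files, a quick consistency check can catch phrases whose words never made it into the dictionary. The sketch below assumes that kps.txt holds one phrase per line made up of plain words, and that the .dic file uses the usual CMU dictionary layout (a word followed by its phonemes, with alternate pronunciations written as WORD(2)). Both function names are hypothetical helpers, not part of hlpr_speech or pocketsphinx.

```python
import re


def dictionary_words(dic_text):
    """Collect the words defined in a CMU-style .dic file."""
    words = set()
    for line in dic_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        # Alternate pronunciations appear as WORD(2); strip that suffix
        words.add(re.sub(r"\(\d+\)$", "", fields[0]).upper())
    return words


def missing_words(phrases_text, dic_text):
    """Return the words used in the phrase list that the dictionary lacks."""
    known = dictionary_words(dic_text)
    used = {w.upper() for line in phrases_text.splitlines() for w in line.split()}
    return sorted(used - known)


if __name__ == "__main__":
    # Inline sample data; in practice read kps.txt and 6858.dic from
    # the hlpr_speech_recognition/data/ directory instead
    sample_dic = "HELLO\tHH AH L OW\nROBOT\tR OW B AA T\n"
    sample_phrases = "hello robot\nstop robot\n"
    print(missing_words(sample_phrases, sample_dic))
```

An empty result suggests the phrase list and dictionary agree; any word it reports is one to add to the phrase list before regenerating the files.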
- Running speech with mic input:
  roslaunch hlpr_speech_recognition speech_rec_w_mic.launch
- Running speech with gui input:
  roslaunch hlpr_speech_recognition speech_rec_w_gui.launch
***** NOTE: Launch only one of speech_recognizer or speech_gui. Do not launch both at the same time, as they have the same functionality.
See the main hlpr_documentation repository for information on getting started with the Speech Testing GUI.
If either roslaunch command doesn't work, try the following:
- Check that your PYTHONPATH is set up correctly.
- Go to the hlpr_speech_recognition/bin/ folder and make the nodes executable by typing:
  chmod u+x speech_recognizer
  chmod u+x speech_listener
  chmod u+x speech_gui
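The same permission fix can be scripted, which is handy when setting up several machines. This is a minimal sketch that mirrors `chmod u+x` for the three node files named above; `make_user_executable` is a hypothetical helper, and the path passed to it is whatever your checkout of hlpr_speech_recognition/bin/ happens to be.

```python
import os
import stat

# Node scripts named in the troubleshooting step above
NODES = ("speech_recognizer", "speech_listener", "speech_gui")


def make_user_executable(bin_dir, names=NODES):
    """Add the owner-execute bit (like chmod u+x) and return the files changed."""
    changed = []
    for name in names:
        path = os.path.join(bin_dir, name)
        # Keep only the permission bits, dropping the file-type bits
        mode = stat.S_IMODE(os.stat(path).st_mode)
        if not mode & stat.S_IXUSR:
            os.chmod(path, mode | stat.S_IXUSR)
            changed.append(name)
    return changed
```

Calling `make_user_executable("hlpr_speech_recognition/bin")` from the package's parent directory returns the names it had to change, so an empty list means the nodes were already executable.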