Lesson 12: Voice system-sphinx

The robot voice system is divided into two parts: speech recognition and speech synthesis. For speech recognition ROS uses the sphinx package, and for speech synthesis it uses sound_play.

1. Speech Recognition

ROS voice recognition uses the sphinx package, but sphinx is not installed by default in ROS Kinetic, so you need to install it manually.

The installation process is as follows:

1.1 Installation

1.1.1 First install the following dependency packages

sudo apt-get install ros-kinetic-audio-common
sudo apt-get install libasound2
sudo apt-get install gstreamer0.10-*
sudo apt-get install python-gst0.10

1.1.2. Install libsphinxbase1

Since Diego uses the Raspberry Pi platform, download the armhf version. After the download completes, install it:

sudo dpkg -i libsphinxbase1_0.8-6_armhf.deb

1.1.3. Install libpocketsphinx1

Likewise download the armhf version and, after the download completes, install it:

sudo dpkg -i libpocketsphinx1_0.8-5_armhf.deb

1.1.4. Install gstreamer0.10-pocketsphinx

Likewise download the armhf version and, after the download completes, install it:

sudo dpkg -i gstreamer0.10-pocketsphinx_0.8-5_armhf.deb

1.1.5. Install pocketsphinx

Go to the workspace source directory and clone the git repository:

cd ~/catkin_ws/src
git clone https://github.com/mikeferguson/pocketsphinx

1.1.6. Download the English voice pack pocketsphinx-hmm-en-tidigits (0.8-5)


Create a model directory under the pocketsphinx package to store the voice model files:

cd ~/catkin_ws/src/pocketsphinx
mkdir model

After the download completes, decompress the voice pack and copy all the files under its model directory into ~/catkin_ws/src/pocketsphinx/model.
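The copy step above can also be scripted. A minimal sketch follows; the extraction path ~/Downloads/pocketsphinx-hmm-en-tidigits/model is an assumption, so adjust it to wherever you decompressed the voice pack:

```python
import os
import shutil

def copy_model_files(src_dir, dst_dir):
    """Copy every file and subdirectory from src_dir into dst_dir."""
    if not os.path.isdir(dst_dir):
        os.makedirs(dst_dir)
    for name in os.listdir(src_dir):
        src = os.path.join(src_dir, name)
        dst = os.path.join(dst_dir, name)
        if os.path.isdir(src):
            shutil.copytree(src, dst)   # copies lm/ and hmm/ trees whole
        else:
            shutil.copy2(src, dst)

if __name__ == "__main__":
    # Assumed paths: adjust to where the voice pack was decompressed
    copy_model_files(
        os.path.expanduser("~/Downloads/pocketsphinx-hmm-en-tidigits/model"),
        os.path.expanduser("~/catkin_ws/src/pocketsphinx/model"))
```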


1.2 Create the launch file

Create a new launch folder in the ~/catkin_ws/src/pocketsphinx directory and create the diego_voice_test.launch file:

cd ~/catkin_ws/src/pocketsphinx
mkdir launch
vi diego_voice_test.launch



<launch>
  <node name="recognizer" pkg="pocketsphinx" type="recognizer.py" output="screen">
    <param name="lm" value="$(find pocketsphinx)/model/lm/en/tidigits.DMP"/>
    <param name="dict" value="$(find pocketsphinx)/model/lm/en/tidigits.dic"/>
    <param name="hmm" value="$(find pocketsphinx)/model/hmm/en/tidigits"/>
  </node>
</launch>


1.3. Modify recognizer.py

In the def __init__(self) function, add reading of the hmm parameter:

def __init__(self):
        # Start node

        self._device_name_param = "~mic_name"  # Find the name of your microphone by typing pacmd list-sources in the terminal
        self._lm_param = "~lm"
        self._dic_param = "~dict"
        self._hmm_param = "~hmm" 

        # Configure mics with gstreamer launch config
        if rospy.has_param(self._device_name_param):
            self.device_name = rospy.get_param(self._device_name_param)
            self.device_index = self.pulse_index_from_name(self.device_name)
            self.launch_config = "pulsesrc device=" + str(self.device_index)
            rospy.loginfo("Using: pulsesrc device=%s name=%s", self.device_index, self.device_name)
        elif rospy.has_param('~source'):
            # common sources: 'alsasrc'
            self.launch_config = rospy.get_param('~source')
        else:
            self.launch_config = 'gconfaudiosrc'

        rospy.loginfo("Launch config: %s", self.launch_config)

        self.launch_config += " ! audioconvert ! audioresample " \
                            + '! vader name=vad auto-threshold=true ' \
                            + '! pocketsphinx name=asr ! fakesink'

        # Configure ROS settings
        self.started = False
        self.pub = rospy.Publisher('~output', String)
        rospy.Service("~start", Empty, self.start)
        rospy.Service("~stop", Empty, self.stop)

        if rospy.has_param(self._lm_param) and rospy.has_param(self._dic_param):
            self.start_recognizer()
        else:
            rospy.logwarn("lm and dic parameters need to be set to start recognizer.")

In the def start_recognizer(self) function, add the code that reads the hmm parameter, as follows:

    def start_recognizer(self):
        rospy.loginfo("Starting recognizer... ")

        self.pipeline = gst.parse_launch(self.launch_config)
        self.asr = self.pipeline.get_by_name('asr')
        self.asr.connect('partial_result', self.asr_partial_result)
        self.asr.connect('result', self.asr_result)
        self.asr.set_property('configured', True)
        self.asr.set_property('dsratio', 1)

        # Configure language model
        if rospy.has_param(self._lm_param):
            lm = rospy.get_param(self._lm_param)
        else:
            rospy.logerr('Recognizer not started. Please specify a language model file.')
            return

        if rospy.has_param(self._dic_param):
            dic = rospy.get_param(self._dic_param)
        else:
            rospy.logerr('Recognizer not started. Please specify a dictionary.')
            return

        if rospy.has_param(self._hmm_param):
            hmm = rospy.get_param(self._hmm_param)
        else:
            rospy.logerr('Recognizer not started. Please specify a hmm directory.')
            return

        self.asr.set_property('lm', lm)
        self.asr.set_property('dict', dic)
        self.asr.set_property('hmm', hmm)       

        self.bus = self.pipeline.get_bus()
        self.bus_id = self.bus.connect('message::application', self.application_message)
        self.started = True

1.4. Start sphinx

roslaunch pocketsphinx diego_voice_test.launch

Now you can talk to your robot; note that only words in the voice model dictionary can be recognized.
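The tidigits model only recognizes spoken digits, so the recognizer publishes strings such as "one two three" on its ~output topic (here /recognizer/output). A small helper to turn that output into a digit string might look like the sketch below; the word list is the standard TIDIGITS vocabulary, but verify it against tidigits.dic:

```python
# Map the words the tidigits model can emit to digit characters.
# "oh" and "zero" both mean 0 in the TIDIGITS corpus.
DIGIT_WORDS = {
    "zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3",
    "four": "4", "five": "5", "six": "6", "seven": "7",
    "eight": "8", "nine": "9",
}

def words_to_digits(utterance):
    """Convert a recognized utterance like 'one two three' to '123'.

    Unknown words are skipped rather than raising, since the
    recognizer occasionally emits partial hypotheses.
    """
    return "".join(DIGIT_WORDS[w] for w in utterance.lower().split()
                   if w in DIGIT_WORDS)
```

In a real node you would call this from a rospy.Subscriber callback on /recognizer/output (message type std_msgs/String).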

Sphinx recognizes well in the specific acoustic environment it was trained for, but once the environment changes, or different noise appears, the recognition rate drops significantly. This is a common problem facing today's speech recognition technology.

2. Speech synthesis

ROS already integrates a complete speech synthesis package, sound_play, which only supports English. Run the following commands to test it:

rosrun sound_play soundplay_node.py
rosrun sound_play say.py "hi, i am diego."
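For convenience, soundplay_node can also be started from a launch file, following the same pattern as the recognizer launch file above. This is only a sketch, and the file name diego_sound_play.launch is just an example:

```xml
<launch>
  <node name="soundplay_node" pkg="sound_play" type="soundplay_node.py" output="screen"/>
</launch>
```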

With this section, the voice system has been built; subsequent chapters will use it to build other applications.
