A pyaudio based audio_common with text to speech for ROS 2
MIT License
This repository provides a set of ROS 2 packages for audio. It provides a Python version to capture and play audio data using pyaudio.
$ cd ~/ros2_ws/src
$ git clone https://github.com/mgonzs13/audio_common.git
$ cd ~/ros2_ws
$ rosdep install --from-paths src --ignore-src -r -y
$ pip3 install -r src/audio_common/requirements.txt
$ colcon build
You can create a Docker image to test audio_common. Run the following command inside the audio_common directory.
$ docker build -t audio_common .
After the image is created, run a docker container with the following command.
$ docker run -it --device /dev/snd audio_common
As a shortcut, you can use the following command:
$ make docker_run
Node to obtain audio data from a microphone and publish it to the audio topic.
format: The audio format used for capturing, such as pyaudio.paInt16 (16-bit) or any other format supported by PyAudio. Default: pyaudio.paInt16
channels: The number of audio channels to capture. Typically, 1 for mono and 2 for stereo. Default: 1
rate: The sample rate, that is, how many samples per second are captured. Default: 16000
chunk: The size, in frames, of each audio chunk. Default: 4096
device: The ID of the audio input device. A value of -1 indicates that the default audio input device should be used. Default: -1
frame_id: An identifier for the audio frame. This can be useful for synchronizing audio data with other data streams. Default: ""
audio_common_msgs/msg/AudioStamped
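To get a feel for the default capture parameters, the following sketch works out how long one chunk lasts and how many bytes it occupies. It assumes that chunk counts frames per buffer and that pyaudio.paInt16 samples are 2 bytes wide. The helper names are hypothetical and are not part of audio_common.

```python
def chunk_duration_s(chunk: int, rate: int) -> float:
    """Seconds of audio held in one chunk of `chunk` frames."""
    return chunk / rate


def chunk_size_bytes(chunk: int, channels: int, sample_width: int = 2) -> int:
    """Bytes in one chunk; sample_width=2 assumes 16-bit samples (paInt16)."""
    return chunk * channels * sample_width


if __name__ == "__main__":
    # Defaults listed above: rate=16000, chunk=4096, channels=1
    print(chunk_duration_s(4096, 16000))  # 0.256
    print(chunk_size_bytes(4096, 1))      # 8192
```

Under these assumptions, the defaults yield roughly four messages per second, so a smaller chunk lowers latency at the cost of more messages.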
Node to play the audio data received on the audio topic.
channels: The number of audio channels to play. Typically, 1 for mono and 2 for stereo. Default: 1
device: The ID of the audio output device. A value of -1 indicates that the default audio output device should be used. Default: -1
audio_common_msgs/msg/AudioStamped
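As an illustration of what the channels parameter means for the raw data, the sketch below splits a stereo buffer into its two channels. It assumes interleaved 16-bit PCM (left, right, left, right, ...), which is how PyAudio lays out multi-channel buffers. Whether the message carries the data in exactly this form is an assumption, and split_stereo is a hypothetical helper, not part of audio_common.

```python
from array import array


def split_stereo(data: bytes):
    """Split interleaved 16-bit stereo PCM into (left, right) sample arrays."""
    samples = array("h")  # "h" = signed 16-bit integers, native byte order
    samples.frombytes(data)
    # Even positions hold the left channel, odd positions the right channel
    return samples[0::2], samples[1::2]


if __name__ == "__main__":
    frames = array("h", [1, -1, 2, -2, 3, -3]).tobytes()
    left, right = split_stereo(frames)
    print(list(left), list(right))  # [1, 2, 3] [-1, -2, -3]
```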
Node to play music from an audio file in WAV format.
chunk_time: Duration of each audio chunk, in milliseconds. Default: 50
frame_id: An identifier for the audio frame. This can be useful for synchronizing audio data with other data streams. Default: ""
audio_common_msgs/msg/AudioStamped
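The relation between chunk_time and the WAV sample rate can be sketched with Python's standard wave module. This is a minimal illustration of reading a file in fixed-duration chunks, not the node's actual implementation. The functions frames_per_chunk, read_chunks and make_silent_wav are hypothetical names introduced here.

```python
import io
import wave


def frames_per_chunk(rate: int, chunk_time_ms: int) -> int:
    """Number of frames that span chunk_time_ms at the given sample rate."""
    return rate * chunk_time_ms // 1000


def read_chunks(wav_file, chunk_time_ms: int = 50):
    """Yield raw audio bytes from a WAV file, one chunk at a time."""
    with wave.open(wav_file, "rb") as wf:
        n_frames = frames_per_chunk(wf.getframerate(), chunk_time_ms)
        while True:
            data = wf.readframes(n_frames)
            if not data:  # empty bytes means end of file
                break
            yield data


def make_silent_wav(rate: int = 16000, seconds: int = 1) -> io.BytesIO:
    """Build an in-memory mono 16-bit WAV of silence, for demonstration only."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)  # 2 bytes per sample = 16-bit
        wf.setframerate(rate)
        wf.writeframes(b"\x00\x00" * (rate * seconds))
    buf.seek(0)
    return buf


if __name__ == "__main__":
    chunks = list(read_chunks(make_silent_wav(), chunk_time_ms=50))
    print(len(chunks), len(chunks[0]))  # 20 1600
```

With a 16000 Hz file and the default chunk_time of 50, each chunk holds 800 frames, so one second of audio is split into 20 chunks.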
Node to generate audio from text (TTS).
chunk: The size, in frames, of each audio chunk. Default: 4096
frame_id: An identifier for the audio frame. This can be useful for synchronizing audio data with other data streams. Default: ""
audio: Topic publisher to send the audio data generated by the TTS. Type: audio_common_msgs/msg/AudioStamped
say: Action to generate audio data from text. Type: audio_common_msgs/action/TTS
$ ros2 run audio_common audio_capturer_node
$ ros2 run audio_common audio_player_node
$ ros2 run audio_common tts_node
$ ros2 run audio_common audio_player_node
$ ros2 action send_goal /say audio_common_msgs/action/TTS "{'text': 'Hello World'}"
$ ros2 run audio_common music_node
$ ros2 run audio_common audio_player_node
$ ros2 service call /music_play audio_common_msgs/srv/MusicPlay "{audio: 'elevator'}"