---
title: VoiceRecognition
---

# Objectives

* Voice Control of Media [Front Ends][1] and [Automation Hub][2].
* Resilient to noise interference
* Low bandwidth
* Low latency

# Ideas

## Simple Offline Control

* Pocketsphinx with very limited vocabulary
* Every command is keyword triggered
* Quick timeouts

## Hybrid Pocketsphinx Google API

* Recognise trigger using pocketsphinx
* Acknowledge with beeps
* Need to manage mixer controls
* Pass commands to online STT engine (<http://wit.ai>)
* Process and control Kodi and openHAB
* Fall-back to Simple Offline Control

# Hardware

* [PS3][3] Eye
* Integrated into [RaspBMC][4] / OSMC

# Software

* <http://blog.hekkers.net/2014/04/16/home-automation-and-voice-control/>

* * *

# Prerequisites

## Support packages

    sudo apt-get install alsa-utils python-pip python-yaml python-dateutil python-pyaudio
    sudo pip install apscheduler # need newer versions, apt versions are too old

## ALSA playback

    sudo modprobe snd_usb_audio # USB mic, loads as card1 on RPi (after snd-bcm2835)

THIS DOES NOT WORK:

    options snd-usb-audio index=0
    options snd-bcm2835 index=1

Don't even bother trying to force index=1 for `snd-bcm2835`; it doesn't support the index parameter:

    osmc@osmc:~$ /sbin/modinfo snd-bcm2835
    filename:       /lib/modules/4.3.3-3-osmc/kernel/sound/arm/snd-bcm2835.ko
    alias:          platform:bcm2835_alsa
    license:        GPL
    description:    Alsa driver for BCM2835 chip
    author:         Dom Cobley
    srcversion:     46AE410DEA6D239DB70D2C9
    alias:          of:N*T*Cbrcm,bcm2835-audio*
    depends:        snd-pcm,snd
    intree:         Y
    vermagic:       4.3.3-3-osmc preempt mod_unload modversions ARMv6
    parm:           force_bulk:Force use of vchiq bulk for audio (bool)

Let `snd-bcm2835` be card0 and load `snd-usb-audio` as card1:

    osmc@osmc:~$ cat /etc/modprobe.d/jasper.conf
    options snd-usb-audio index=1

Then configure defaults in `.asoundrc` accordingly.
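Since the card index decides what goes into `.asoundrc`, it can help to read it programmatically from `arecord -l` rather than by eye. A minimal sketch, assuming the capture device line contains a recognisable string such as `USB Audio`; the helper name `capture_card_index` is hypothetical and the pattern must not contain a `/`:

```shell
#!/bin/sh
# Print the ALSA card index of the first capture device whose "card N: ..."
# line matches the given pattern. Reads `arecord -l` style text on stdin.
# capture_card_index is a hypothetical helper, not part of alsa-utils.
capture_card_index() {
    # Only lines matching the pattern are considered; the substitution keeps
    # just the number that follows "card " at the start of the line.
    sed -n "/$1/s/^card \([0-9][0-9]*\):.*/\1/p" | head -n 1
}

# Usage on a live system (the pattern "USB Audio" is an assumption):
#   arecord -l | capture_card_index "USB Audio"
```

The function prints nothing when no line matches, so an empty result means the USB mic was not detected as a capture device.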

    osmc@osmc:~$ arecord -l
    **** List of CAPTURE Hardware Devices ****
    card 1: CameraB409241 [USB Camera-B4.09.24.1], device 0: USB Audio [USB Audio]
      Subdevices: 1/1
      Subdevice #0: subdevice #0

## Audio configuration for [PS3][3] Eye

The [PS3][3] Eye is a camera with a 4-channel array mic.

Local `~/.asoundrc`

    ## Suggested by http://julius.sourceforge.jp/forum/viewtopic.php?f=9&t=66
    pcm.array {
        type hw
        card 0
    }

    pcm.array_gain {
        type softvol
        slave {
            pcm "array"
        }
        control {
            name "Mic Gain"
            count 2
        }
        min_dB -10.0
        max_dB 5.0
    }

    pcm.cap {
        type plug
        slave {
            pcm "array_gain"
            channels 4
        }
        route_policy sum
    }

    pcm.!default {
        type asym
        playback.pcm {
            type plug
            slave.pcm {
                @func getenv
                vars [ ALSAPCM ]
                default "hw:0,0"
            }
        }
        capture.pcm {
            type plug
            slave.pcm "cap"
        }
    }

# Jasper

* Project : <http://jasperproject.github.io/>
* Passive STT : pocketsphinx
* Active STT : wit.ai
* TTS : Flite

Integrates STT and TTS systems. Python-based.

## Configuration

`~/.jasper/profile.yml`

    ...
    stt_passive_engine: sphinx
    stt_engine: witai
    witai-stt:
      access_token: A0VERY0LONG0ALPHA0NUMERIC0STRING
    tts_engine: flite-tts
    flite-tts:
      voice: slt
    ...

For split active and passive STT we need pocketsphinx and related packages.

## RPi2 Installation

For RPi2 (armv7) we can use packages from Debian experimental:

    sudo su -c "echo 'deb http://ftp.debian.org/debian experimental main contrib non-free' > /etc/apt/sources.list.d/experimental.list"
    sudo apt-get update
    sudo apt-get -t experimental install cmuclmtk phonetisaurus m2m-aligner mitlm libfst-tools libfst1-plugins-base libfst-dev

## RPi1 Installation

For RPi1 (armv6) we can't use packages from Debian experimental so must build from source or install from elsewhere.
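Which of the two installation routes applies can be decided from the machine name that `uname -m` reports. A minimal sketch, assuming the RPi2 reports a name starting with `armv7` and the RPi1 one starting with `armv6`; the function name `install_route` and the route labels are hypothetical:

```shell
#!/bin/sh
# Map a machine name (as printed by `uname -m`) to an installation route.
# install_route and its labels are hypothetical, for illustration only.
install_route() {
    case "$1" in
        armv7*) echo "debian-experimental" ;;  # RPi2: packages from Debian experimental
        armv6*) echo "cognomen-or-source" ;;   # RPi1: third-party repo or build from source
        *)      echo "unknown" ;;              # anything else is not covered here
    esac
}

# Report the route for the machine this runs on.
install_route "$(uname -m)"
```

Keeping the mapping in a function that takes the machine name as an argument makes it easy to check both branches on any host.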

### Install cognomen packages

    # add repo
    sudo su -c "echo 'deb http://cognomen.co.uk/apt/debian jessie main' > /etc/apt/sources.list.d/cognomen.list"
    # import pgp key
    gpg --keyserver keyserver.ubuntu.com --recv FC88E181D61C9391C4A49682CF36B219807AA92B && gpg --export --armor keymaster@cognomen.co.uk | sudo apt-key add -
    # update
    sudo apt-get update
    sudo apt-get install pocketsphinx pocketsphinx-hmm-en-hub4wsj python-pocketsphinx python-yaml phonetisaurus m2m-aligner mitlm libfst-tools libfst1-plugins-base libfst-dev cmuclmtk python-semantic

## Building RPi1 dependencies from source

### Trying to Cross Compile

`crosstool-ng` isn't needed; the prebuilt raspberrypi-tools x86-32 Linaro cross compiler can be used.

### Naïve openfst cross-compile

    export PATH=~/src/raspberrypi-tools/arm-bcm2708/gcc-linaro-arm-linux-gnueabihf-raspbian/bin:$PATH
    ./configure --host arm-linux-gnueabihf --enable-compact-fsts --enable-const-fsts --enable-far --enable-lookahead-fsts --enable-pdt
    make -j 8

Cross compilation works but [Debian RaspberryPi Packaging][12] doesn't.

### Build natively

    apt-get source phonetisaurus m2m-aligner mitlm openfst
    # for each
    dpkg-buildpackage -us -uc -rfakeroot

### Install to repo

On the system with the signing keys:

    sshfs yuggoth:/ yuggoth-ssh
    cd yuggoth-ssh/var/www/data/cognomen.co.uk/apt/debian
    for i in *.deb
    do
        reprepro includedeb jessie "$i"
    done

# Other methods

## wit.ai Standalone

Not used by jasper.

    sudo apt-get install libsox2
    wget https://github.com/wit-ai/witd/releases/download/v0.1/witd-armv6
    chmod a+x witd-armv6
    ./witd-armv6

## Voice Command for RPi

* <http://stevenhickson.blogspot.co.uk/2013/06/voice-command-v30-for-raspberry-pi.html>

## CMU Sphinx, PocketSphinx, KodiVC

* <http://cmusphinx.sourceforge.net/wiki/raspberrypi>

Build dependencies:

    sudo apt-get install build-essential sshfs automake libtool

[RaspBMC][4]/Kodi uses pulseaudio so use that for kodivc.

    sudo apt-get install bison libpulse-dev

KodiVC: [github][21]

## Google Voice API

V1 API probably doesn't work any more. V2 needs at least a new API key (limited to 50 calls per day).

### Old script

From <http://blog.oscarliang.net/raspberry-pi-voice-recognition-works-like-siri/> :

```shell
#!/bin/bash
echo "Recording... Press Ctrl+C to Stop."
arecord -D "plughw:1,0" -q -f cd -t wav | ffmpeg -loglevel panic -y -i - -ar 16000 -acodec flac file.flac > /dev/null 2>&1
echo "Processing..."
wget -q -U "Mozilla/5.0" --post-file file.flac --header "Content-Type: audio/x-flac; rate=16000" -O - "http://www.google.com/speech-api/v1/recognize?lang=en-us&client=chromium" | cut -d\" -f12 >stt.txt
echo -n "You Said: "
cat stt.txt
rm file.flac > /dev/null 2>&1
```

# Resources

* <http://www.rmnd.net/speech-recognition-on-raspberry-pi-with-sphinx-racket-and-arduino/>
* <http://blog.oscarliang.net/raspberry-pi-voice-recognition-works-like-siri/>

[1]: FrontEnd
[2]: /Network/AutomationHub
[3]: PS3
[4]: RaspBMC
[5]: VoiceRecognition?action=sourceblock&num=1
[6]: VoiceRecognition?action=sourceblock&num=2
[8]: VoiceRecognition?action=sourceblock&num=3
[10]: VoiceRecognition?action=sourceblock&num=4
[11]: VoiceRecognition?action=sourceblock&num=5
[12]: /Tech/DebianRaspberryPiPackaging
[13]: VoiceRecognition?action=sourceblock&num=6
[14]: VoiceRecognition?action=sourceblock&num=7
[15]: VoiceRecognition?action=sourceblock&num=8
[19]: VoiceRecognition?action=sourceblock&num=9
[20]: VoiceRecognition?action=sourceblock&num=10
[21]: https://github.com/kempniu/kodivc
[22]: VoiceRecognition?action=sourceblock&num=11

<!-- vim: filetype=markdown -->