---
title: VoiceRecognition
---

# Objectives

* Voice Control of Media [Front Ends][1] and [Automation Hub][2].
* Resilient to noise interference
* Low bandwidth
* Low latency

# Ideas

## Simple Offline Control

* Pocketsphinx with very limited vocabulary
* Every command is keyword triggered
* Quick timeouts

## Hybrid Pocketsphinx Google API

* Recognise trigger using pocketsphinx
* Acknowledge with beeps
* Need to manage mixer controls
* Pass commands to online STT engine (<http://wit.ai>)
* Process and control Kodi and openHAB
* Fall back to Simple Offline Control

# Hardware

* [PS3][3] Eye
* Integrated into [RaspBMC][4] / OSMC

# Software

* <http://blog.hekkers.net/2014/04/16/home-automation-and-voice-control/>

* * *

# Prerequisites

## Support packages

    sudo apt-get install alsa-utils python-pip python-yaml python-dateutil python-pyaudio
    sudo pip install apscheduler # need newer versions, apt versions are too old

## ALSA playback

    sudo modprobe snd_usb_audio # USB mic, loads as card1 on RPi (after snd-bcm2835)

THIS DOES NOT WORK:

    options snd-usb-audio index=0
    options snd-bcm2835 index=1

Don't even bother trying to force index=1 for `snd-bcm2835`; it doesn't support the index parameter:

    osmc@osmc:~$ /sbin/modinfo snd-bcm2835
    filename:       /lib/modules/4.3.3-3-osmc/kernel/sound/arm/snd-bcm2835.ko
    alias:          platform:bcm2835_alsa
    license:        GPL
    description:    Alsa driver for BCM2835 chip
    author:         Dom Cobley
    srcversion:     46AE410DEA6D239DB70D2C9
    alias:          of:N*T*Cbrcm,bcm2835-audio*
    depends:        snd-pcm,snd
    intree:         Y
    vermagic:       4.3.3-3-osmc preempt mod_unload modversions ARMv6
    parm:           force_bulk:Force use of vchiq bulk for audio (bool)

Let `snd-bcm2835` be card0 and load `snd-usb-audio` as card1:

    osmc@osmc:~$ cat /etc/modprobe.d/jasper.conf
    options snd-usb-audio index=1

Then configure defaults in `.asoundrc` accordingly.
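
Before writing `.asoundrc` it helps to confirm which card index the USB mic actually got. A minimal sketch, assuming `arecord -l` prints one `card N:` line per capture device and that the mic's line contains the string `USB`; the function name `first_usb_capture_card` is made up for illustration:

```shell
# Read `arecord -l` output on stdin and print the card number of the
# first capture device whose line mentions "USB". Prints nothing if
# no such device is listed.
first_usb_capture_card() {
    awk '/^card [0-9]+:/ && /USB/ { sub(":", "", $2); print $2; exit }'
}
```

Run it as `arecord -l | first_usb_capture_card`; against the capture listing that follows it would print `1`.
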

    osmc@osmc:~$ arecord -l
    **** List of CAPTURE Hardware Devices ****
    card 1: CameraB409241 [USB Camera-B4.09.24.1], device 0: USB Audio [USB Audio]
      Subdevices: 1/1
      Subdevice #0: subdevice #0

## Audio configuration for [PS3][3] Eye

The [PS3][3] Eye is a camera with a 4-channel array mic.

Local `~/.asoundrc`

    ## Suggested by http://julius.sourceforge.jp/forum/viewtopic.php?f=9&t=66
    pcm.array {
        type hw
        card 0
    }

    pcm.array_gain {
        type softvol
        slave {
            pcm "array"
        }
        control {
            name "Mic Gain"
            count 2
        }
        min_dB -10.0
        max_dB 5.0
    }

    pcm.cap {
        type plug
        slave {
            pcm "array_gain"
            channels 4
        }
        route_policy sum
    }

    pcm.!default {
        type asym
        playback.pcm {
            type plug
            slave.pcm {
                @func getenv
                vars [ ALSAPCM ]
                default "hw:0,0"
            }
        }
        capture.pcm {
            type plug
            slave.pcm "cap"
        }
    }

# Jasper

* Project: <http://jasperproject.github.io/>
* Passive STT: pocketsphinx
* Active STT: wit.ai
* TTS: Flite

Integrates STT and TTS systems. Python-based.

## Configuration

`~/.jasper/profile.yml`

    ...
    stt_passive_engine: sphinx
    stt_engine: witai
    witai-stt:
      access_token: A0VERY0LONG0ALPHA0NUMERIC0STRING
    tts_engine: flite-tts
    flite-tts:
      voice: slt
    ...

For split active and passive STT we need pocketsphinx and related packages.

## RPi2 Installation

For RPi2 (armv7) we can use packages from Debian experimental:

    sudo su -c "echo 'deb http://ftp.debian.org/debian experimental main contrib non-free' > /etc/apt/sources.list.d/experimental.list"
    sudo apt-get update
    sudo apt-get -t experimental install cmuclmtk phonetisaurus m2m-aligner mitlm libfst-tools libfst1-plugins-base libfst-dev

## RPi1 Installation

For RPi1 (armv6) we can't use packages from Debian experimental, so must build from source or install from elsewhere.
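
Which of the two installation routes applies can be read off the machine string. A minimal sketch, assuming a 32-bit OS where `uname -m` reports `armv6l` on the RPi1 and `armv7l` on the RPi2; the function name `pi_install_route` and its return tokens are invented for illustration:

```shell
# Map a machine string (as printed by `uname -m`) to the install route:
#   rpi1    -> armv6: cognomen repo or build from source
#   rpi2    -> armv7: Debian experimental packages
#   unknown -> anything else
pi_install_route() {
    case "$1" in
        armv6*) echo "rpi1" ;;
        armv7*) echo "rpi2" ;;
        *)      echo "unknown" ;;
    esac
}
```

Call it as `pi_install_route "$(uname -m)"`.
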

### Install cognomen packages

    # add repo
    sudo su -c "echo 'deb http://cognomen.co.uk/apt/debian jessie main' > /etc/apt/sources.list.d/cognomen.list"
    # import pgp key
    gpg --keyserver keyserver.ubuntu.com --recv FC88E181D61C9391C4A49682CF36B219807AA92B && gpg --export --armor keymaster@cognomen.co.uk | sudo apt-key add -
    # update
    sudo apt-get update
    sudo apt-get install pocketsphinx pocketsphinx-hmm-en-hub4wsj python-pocketsphinx python-yaml phonetisaurus m2m-aligner mitlm libfst-tools libfst1-plugins-base libfst-dev cmuclmtk python-semantic

## Building RPi1 dependencies from source

### Trying to Cross Compile

Don't need `crosstool-ng`; can use the prebuilt raspberrypi-tools x86-32 Linaro cross compiler.

### Naïve openfst cross-compile

    export PATH=~/src/raspberrypi-tools/arm-bcm2708/gcc-linaro-arm-linux-gnueabihf-raspbian/bin:$PATH
    ./configure --host arm-linux-gnueabihf --enable-compact-fsts --enable-const-fsts --enable-far --enable-lookahead-fsts --enable-pdt
    make -j 8

Cross compilation works but [Debian RaspberryPi Packaging][12] doesn't.

### Build natively

    apt-get source phonetisaurus m2m-aligner mitlm openfst
    # for each
    dpkg-buildpackage -us -uc -rfakeroot

### Install to repo

On the system with the signing keys:

    sshfs yuggoth:/ yuggoth-ssh
    cd yuggoth-ssh/var/www/data/cognomen.co.uk/apt/debian
    for i in *.deb
    do
        reprepro includedeb jessie "$i"
    done

# Other methods

## wit.ai Standalone

Not used by jasper.

    sudo apt-get install libsox2
    wget https://github.com/wit-ai/witd/releases/download/v0.1/witd-armv6
    chmod a+x witd-armv6
    ./witd-armv6

## Voice Command for RPi

* <http://stevenhickson.blogspot.co.uk/2013/06/voice-command-v30-for-raspberry-pi.html>

## CMU Sphinx, PocketSphinx, KodiVC

* <http://cmusphinx.sourceforge.net/wiki/raspberrypi>

    sudo apt-get install build-essential sshfs automake libtool

[RaspBMC][4]/Kodi uses pulseaudio, so use that for kodivc.

    sudo apt-get install bison libpulse-dev

KodiVC: [github][21]

## Google Voice API

V1 API probably doesn't work any more. V2 needs at least a new API key (limited to 50 calls per day).

### Old script

From <http://blog.oscarliang.net/raspberry-pi-voice-recognition-works-like-siri/>:

```shell
#!/bin/bash
echo "Recording... Press Ctrl+C to Stop."
arecord -D "plughw:1,0" -q -f cd -t wav | ffmpeg -loglevel panic -y -i - -ar 16000 -acodec flac file.flac > /dev/null 2>&1
echo "Processing..."
wget -q -U "Mozilla/5.0" --post-file file.flac --header "Content-Type: audio/x-flac; rate=16000" -O - "http://www.google.com/speech-api/v1/recognize?lang=en-us&client=chromium" | cut -d\" -f12 >stt.txt
echo -n "You Said: "
cat stt.txt
rm file.flac > /dev/null 2>&1
```

# Resources

* <http://www.rmnd.net/speech-recognition-on-raspberry-pi-with-sphinx-racket-and-arduino/>
* <http://blog.oscarliang.net/raspberry-pi-voice-recognition-works-like-siri/>

[1]: FrontEnd
[2]: /Network/AutomationHub
[3]: PS3
[4]: RaspBMC
[12]: /Tech/DebianRaspberryPiPackaging
[21]: https://github.com/kempniu/kodivc

<!-- vim: filetype=markdown -->