---
title: VoiceRecognition
---

# Objectives

*   Voice Control of Media [Front Ends][1] and [Automation Hub][2]. 
*   Resilient to noise interference 
*   Low bandwidth 
*   Low latency

# Ideas

## Simple Offline Control

*   Pocketsphinx with very limited vocabulary 
*   Every command is keyword triggered 
*   Quick timeouts
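
A minimal sketch of the keyword-triggered idea, assuming the recogniser hands back a plain-text transcript. The trigger word `kodi`, the vocabulary, and the function name `parse_command` are hypothetical and chosen only for illustration; timeouts are not modelled here.

```python
# Hypothetical trigger word and vocabulary; a real list would mirror
# whatever the Pocketsphinx language model was built with.
KEYWORD = "kodi"
VOCABULARY = {"play", "pause", "stop", "next", "previous"}


def parse_command(transcript):
    """Return the command following KEYWORD, or None if there is none."""
    words = transcript.lower().split()
    if KEYWORD not in words:
        return None  # no trigger word: ignore the utterance entirely
    following = words[words.index(KEYWORD) + 1:]
    # Accept only the word directly after the keyword, and only if it is
    # in the limited vocabulary; everything else is treated as noise.
    if following and following[0] in VOCABULARY:
        return following[0]
    return None


print(parse_command("kodi pause"))
print(parse_command("please pause"))
```

Rejecting anything outside the vocabulary keeps false positives down, which matters more than coverage for a noisy living-room microphone.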

## Hybrid Pocketsphinx Google API

*   Recognise trigger using pocketsphinx 
*   Acknowledge with beeps 
    *   Need to manage mixer controls 
*   Pass commands to online STT engine (<http://wit.ai>) 
*   Process and control Kodi and openHAB 
*   Fall-back to Simple Offline Control
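
The flow above could be sketched roughly as follows. All four callables are hypothetical stand-ins for the real Pocketsphinx, wit.ai and beep/mixer code, and treating any `OSError` (which includes timeouts) as "online engine unavailable" is an assumption of this sketch.

```python
def handle_utterance(heard_trigger, online_stt, offline_stt, acknowledge):
    """Run one trigger -> acknowledge -> recognise cycle.

    Returns the recognised text, or None if no trigger was heard.
    """
    if not heard_trigger():
        return None
    acknowledge()  # e.g. beep, after adjusting the mixer
    try:
        return online_stt()
    except OSError:
        # Network failure or timeout: fall back to the offline recogniser.
        return offline_stt()


def offline():
    return "pause"


def broken_online():
    raise TimeoutError("no network")


print(handle_utterance(lambda: True, broken_online, offline, lambda: None))
```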

# Hardware

*   [PS3][3] Eye 
*   Integrated into [RaspBMC][4] / OSMC

# Software

*   <http://blog.hekkers.net/2014/04/16/home-automation-and-voice-control/>

* * *

# Prerequisites

## Support packages

    sudo apt-get install alsa-utils python-pip python-yaml python-dateutil python-pyaudio
    sudo pip install apscheduler # need newer versions; the apt versions are too old

## ALSA playback

    sudo modprobe snd_usb_audio # USB mic, loads as card1 on RPi (after snd-bcm2835)

THIS DOES NOT WORK:

    options snd-usb-audio index=0
    options snd-bcm2835 index=1

Don't bother trying to force `index=1` for `snd-bcm2835`: the module doesn't support the `index` parameter, as its `modinfo` output shows:

    osmc@osmc:~$ /sbin/modinfo snd-bcm2835
    filename:       /lib/modules/4.3.3-3-osmc/kernel/sound/arm/snd-bcm2835.ko
    alias:          platform:bcm2835_alsa
    license:        GPL
    description:    Alsa driver for BCM2835 chip
    author:         Dom Cobley
    srcversion:     46AE410DEA6D239DB70D2C9
    alias:          of:N*T*Cbrcm,bcm2835-audio*
    depends:        snd-pcm,snd
    intree:         Y
    vermagic:       4.3.3-3-osmc preempt mod_unload modversions ARMv6 
    parm:           force_bulk:Force use of vchiq bulk for audio (bool)
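
The same check can be scripted. This is a small sketch that scans `modinfo` output for `parm:` lines; the helper name `module_params` is hypothetical and the sample text is abbreviated from the output above.

```python
import re


def module_params(modinfo_text):
    """Return the parameter names listed on `parm:` lines of modinfo output."""
    return re.findall(r"^\s*parm:\s*(\w+):", modinfo_text, flags=re.MULTILINE)


sample = """\
filename:       /lib/modules/4.3.3-3-osmc/kernel/sound/arm/snd-bcm2835.ko
description:    Alsa driver for BCM2835 chip
parm:           force_bulk:Force use of vchiq bulk for audio (bool)
"""

params = module_params(sample)
print(params)
print("index" in params)
```

On a live system the text would come from running `/sbin/modinfo snd-bcm2835`, for example via `subprocess.check_output`.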

Let `snd-bcm2835` be card0 and load `snd-usb-audio` as card1:

    osmc@osmc:~$ cat /etc/modprobe.d/jasper.conf
    options snd-usb-audio index=1

Then configure the defaults in `~/.asoundrc` accordingly. The USB mic should now be listed as card 1:

    osmc@osmc:~$ arecord -l
    **** List of CAPTURE Hardware Devices ****
    card 1: CameraB409241 [USB Camera-B4.09.24.1], device 0: USB Audio [USB Audio]
      Subdevices: 1/1
      Subdevice #0: subdevice #0
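
To pick the card number up programmatically rather than by eye, the `arecord -l` listing can be parsed. A minimal sketch, assuming the line layout shown above; `capture_devices` is a hypothetical helper name.

```python
import re

# Matches lines such as:
# card 1: CameraB409241 [USB Camera-B4.09.24.1], device 0: USB Audio [USB Audio]
CARD_LINE = re.compile(
    r"^card (\d+): (\S+) \[(.*?)\], device (\d+):", re.MULTILINE
)


def capture_devices(arecord_output):
    """Return (card, device, card_id, description) tuples from `arecord -l`."""
    return [
        (int(card), int(device), card_id, description)
        for card, card_id, description, device in CARD_LINE.findall(arecord_output)
    ]


sample = """\
**** List of CAPTURE Hardware Devices ****
card 1: CameraB409241 [USB Camera-B4.09.24.1], device 0: USB Audio [USB Audio]
  Subdevices: 1/1
  Subdevice #0: subdevice #0
"""

for card, device, card_id, description in capture_devices(sample):
    print("hw:%d,%d" % (card, device), card_id, description)
```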

## Audio configuration for [PS3][3] Eye

The [PS3][3] Eye is a camera with a 4-channel array mic.

Local `~/.asoundrc`

    ## Suggested by http://julius.sourceforge.jp/forum/viewtopic.php?f=9&t=66
    pcm.array {
      type hw
      card 0
    }
    
    pcm.array_gain {
      type softvol
      slave {
        pcm "array"
      }
      control {
        name "Mic Gain"
        count 2
      }
      min_dB -10.0
      max_dB 5.0
    }
    
    pcm.cap {
      type plug
      slave {
        pcm "array_gain"
        channels 4
      }
      route_policy sum
    }
    
    pcm.!default {
        type asym
    
        playback.pcm {
         type plug
          slave.pcm {
            @func getenv
            vars [ ALSAPCM ]
            default "hw:0,0"
          }
        }
        capture.pcm {
            type plug
            slave.pcm "cap"
        }
    }
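
As a quick worked calculation, the `min_dB` and `max_dB` limits of the `softvol` block can be converted to linear factors. This sketch assumes the usual amplitude convention, where the gain factor is 10^(dB/20).

```python
def db_to_amplitude(db):
    """Convert a gain in decibels to a linear amplitude factor."""
    return 10 ** (db / 20.0)


for label, db in (("min_dB", -10.0), ("max_dB", 5.0)):
    print("%s = %+.1f dB -> x%.3f" % (label, db, db_to_amplitude(db)))
```

Under that assumption the "Mic Gain" control spans roughly 0.32x to 1.78x of the raw signal amplitude.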

# Jasper

Project
:   <http://jasperproject.github.io/>

Passive STT
:   pocketsphinx

Active STT
:   wit.ai

TTS
:   Flite

Integrates STT and TTS systems. Python-based.

## Configuration

`~/.jasper/profile.yml`

    ...
    stt_passive_engine: sphinx
    stt_engine: witai
    witai-stt:
      access_token: A0VERY0LONG0ALPHA0NUMERIC0STRING
    tts_engine: flite-tts
    flite-tts:
      voice: slt
    ...
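
A small sanity check for the keys shown above, sketched as a plain function over an already-loaded profile. `check_profile` is hypothetical and not part of Jasper; in practice the dict would come from something like `yaml.safe_load` on `~/.jasper/profile.yml`, and the set of keys checked is simply the one used on this page.

```python
def check_profile(profile):
    """Return a list of problems found in a Jasper-style profile dict."""
    problems = []
    for key in ("stt_passive_engine", "stt_engine", "tts_engine"):
        if key not in profile:
            problems.append("missing key: %s" % key)
    # wit.ai needs an access token in its own section.
    if profile.get("stt_engine") == "witai":
        token = (profile.get("witai-stt") or {}).get("access_token")
        if not token:
            problems.append("witai-stt.access_token is not set")
    return problems


profile = {
    "stt_passive_engine": "sphinx",
    "stt_engine": "witai",
    "witai-stt": {"access_token": "A0VERY0LONG0ALPHA0NUMERIC0STRING"},
    "tts_engine": "flite-tts",
    "flite-tts": {"voice": "slt"},
}
print(check_profile(profile))
```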

For split active and passive STT we need pocketsphinx and related packages.

## RPi2 Installation

For RPi2 (armv7) we can use packages from Debian experimental:

    sudo su -c "echo 'deb http://ftp.debian.org/debian experimental main contrib non-free' > /etc/apt/sources.list.d/experimental.list"
    sudo apt-get update
    sudo apt-get -t experimental install cmuclmtk phonetisaurus m2m-aligner mitlm libfst-tools libfst1-plugins-base libfst-dev

## RPi1 Installation

For RPi1 (armv6) we can't use packages from Debian experimental so must build from source or install from elsewhere.

### Install cognomen packages

    # add repo
    sudo su -c "echo 'deb http://cognomen.co.uk/apt/debian jessie main' > /etc/apt/sources.list.d/cognomen.list"
    # import pgp key
    gpg --keyserver keyserver.ubuntu.com --recv  FC88E181D61C9391C4A49682CF36B219807AA92B && gpg --export --armor keymaster@cognomen.co.uk | sudo apt-key add -
    # update
    sudo apt-get update
    sudo apt-get install pocketsphinx pocketsphinx-hmm-en-hub4wsj python-pocketsphinx python-yaml phonetisaurus m2m-aligner mitlm libfst-tools libfst1-plugins-base libfst-dev cmuclmtk python-semantic

## Building RPi1 dependencies from source

### Trying to Cross Compile

`crosstool-ng` isn't needed: the prebuilt x86-32 Linaro cross compiler from raspberrypi-tools can be used instead.

### Naïve openfst cross-compile

    export PATH=~/src/raspberrypi-tools/arm-bcm2708/gcc-linaro-arm-linux-gnueabihf-raspbian/bin:$PATH
    ./configure --host arm-linux-gnueabihf --enable-compact-fsts --enable-const-fsts --enable-far --enable-lookahead-fsts --enable-pdt
    make -j 8

Cross compilation works but [Debian RaspberryPi Packaging][12] doesn't.

### Build natively

    apt-get source phonetisaurus m2m-aligner mitlm openfst
    # then, in each unpacked source directory:
    dpkg-buildpackage -us -uc -rfakeroot

### Install to repo

On the system with the signing keys:

    sshfs yuggoth:/ yuggoth-ssh
    cd yuggoth-ssh/var/www/data/cognomen.co.uk/apt/debian
    for i in *.deb
    do
        reprepro includedeb jessie "$i"
    done

# Other methods

## wit.ai Standalone

Not used by Jasper.

    sudo apt-get install libsox2
    wget https://github.com/wit-ai/witd/releases/download/v0.1/witd-armv6
    chmod a+x witd-armv6
    ./witd-armv6

## Voice Command for RPi

*   <http://stevenhickson.blogspot.co.uk/2013/06/voice-command-v30-for-raspberry-pi.html>

## CMU Sphinx, PocketSphinx, KodiVC

*   <http://cmusphinx.sourceforge.net/wiki/raspberrypi>

    sudo apt-get install build-essential sshfs automake libtool

[RaspBMC][4]/Kodi uses pulseaudio so use that for kodivc.

    sudo apt-get install bison libpulse-dev

KodiVC: [github][21]

## Google Voice API

V1 API probably doesn't work any more. V2 needs at least a new API key (limited to 50 calls per day).

### Old script

From <http://blog.oscarliang.net/raspberry-pi-voice-recognition-works-like-siri/> :

```shell
#!/bin/bash

echo "Recording... Press Ctrl+C to Stop."
arecord -D "plughw:1,0" -q -f cd -t wav | ffmpeg -loglevel panic -y -i - -ar 16000 -acodec flac file.flac  > /dev/null 2>&1

echo "Processing..."
wget -q -U "Mozilla/5.0" --post-file file.flac --header "Content-Type: audio/x-flac; rate=16000" -O - "http://www.google.com/speech-api/v1/recognize?lang=en-us&client=chromium" | cut -d\" -f12  >stt.txt

echo -n "You Said: "
cat stt.txt

rm file.flac  > /dev/null 2>&1

```
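
The `cut -d\" -f12` step in the script takes the twelfth field when the response is split on double quotes. The sketch below shows the Python equivalent next to a JSON-based alternative. The sample response body is hypothetical: its shape is only inferred from that field position, not verified against the (probably defunct) V1 API, and `first_utterance` is an illustrative name.

```python
import json

# Hypothetical response body; the real V1 format is assumed, not verified.
sample = (
    '{"status":0,"id":"abc","hypotheses":'
    '[{"utterance":"hello world","confidence":0.9}]}'
)


def first_utterance(body):
    """Return the first hypothesis text, or an empty string if there is none."""
    hypotheses = json.loads(body).get("hypotheses", [])
    return hypotheses[0]["utterance"] if hypotheses else ""


# Equivalent of `cut -d\" -f12`: the twelfth field when splitting on quotes.
print(sample.split('"')[11])

# Parsing the JSON instead does not depend on field order.
print(first_utterance(sample))
```

The positional approach breaks as soon as a field is added or reordered, so the JSON route is the safer choice for any API that still responds.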

# Resources

*   <http://www.rmnd.net/speech-recognition-on-raspberry-pi-with-sphinx-racket-and-arduino/> 
*   <http://blog.oscarliang.net/raspberry-pi-voice-recognition-works-like-siri/>

[1]: FrontEnd
 [2]: /Network/AutomationHub
 [3]: PS3
 [4]: RaspBMC
 [12]: /Tech/DebianRaspberryPiPackaging
 [21]: https://github.com/kempniu/kodivc