The Raspberry Pi Powered Speaking Doorbell – Part 3: Text to Speech

In Part 1, we built a simple input circuit to isolate our Raspberry Pi from our doorbell circuit, and in Part 2 we made a camera overlay appear in Kodi. Next, we’ll build the text to speech server.

Please note that this blog post uses code snippets from my GitHub project. You will need to clone or download the full source code to run the examples.

With my home setup, I have a dedicated media PC in the lounge which runs Kodi on Windows. It is connected to a Yamaha receiver which is permanently on. The doorbell circuit, however, is connected to a Raspberry Pi. In my case, it makes sense to have the audio for text to speech play over the media PC. But how do we trigger text to speech on the media PC from the Raspberry Pi when someone presses the doorbell?

To solve this problem, I built a simple text to speech handler using the Tornado web server, which runs on the media PC in the lounge. When the doorbell switch is pressed, the Raspberry Pi sends an HTTP POST request to the text to speech server, which then outputs the given text as speech over the Yamaha receiver.


from lib import handler
import pyttsx


class TextToSpeechHandler(handler.Handler):
    def post(self):
        # Text to speak, sent as a form-encoded POST argument.
        text = self.get_argument('text')
        config = self._registry['config']

        engine = pyttsx.init()
        engine.setProperty('rate', config.getint('text_to_speech.rate'))
        engine.setProperty('volume', config.getfloat('text_to_speech.volume'))

        # Select the first installed voice whose ID contains the configured
        # name (case insensitive).
        wanted = config.get('text_to_speech.voice').lower()
        for voice in engine.getProperty('voices'):
            if wanted in voice.id.lower():
                engine.setProperty('voice', voice.id)
                break

        # Queue the text and block until it has been spoken.
        engine.say(text)
        engine.runAndWait()

We define a handler, “TextToSpeechHandler”, which accepts HTTP POST requests and converts an argument called “text” to speech. It inherits from my base Handler class, which holds functionality common to all my handlers and in turn inherits from Tornado’s standard RequestHandler.

For text to speech, we’ll use the pyttsx package. I have made three parameters configurable here: the rate (how fast the text is spoken), the volume, and the voice to use (matched by a case-insensitive partial match on the voice ID). I have the following configuration set up:

#System finds first voice ID that contains the below text, case insensitive
text_to_speech.voice = Hazel

#Speed of speech in words per minute. The pyttsx default is 200
text_to_speech.rate = 130

#Volume. 1.0 is full volume, 0.0 is no volume.
text_to_speech.volume = 1.0
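
To make the “first voice ID that contains the text” rule concrete, here is a minimal sketch of the settings being read and the match being applied. It uses Python’s standard configparser module, which is an assumption on my part for illustration; the project has its own config class. The [default] section header, the pick_voice helper and the voice IDs are all made up for this example.

```python
import configparser

# Hypothetical settings file; the [default] header is added only because
# configparser requires a section.
SETTINGS = """
[default]
text_to_speech.voice = Hazel
text_to_speech.rate = 130
text_to_speech.volume = 1.0
"""


def pick_voice(voice_ids, wanted):
    """Return the first voice ID containing `wanted`, ignoring case, else None."""
    wanted = wanted.lower()
    for voice_id in voice_ids:
        if wanted in voice_id.lower():
            return voice_id
    return None


config = configparser.ConfigParser()
config.read_string(SETTINGS)
section = config["default"]

rate = section.getint("text_to_speech.rate")
volume = section.getfloat("text_to_speech.volume")

# Made-up voice IDs, shaped like the ones a Windows speech engine reports.
installed = [
    "TTS_MS_EN-US_ZIRA_11.0",
    "TTS_MS_EN-GB_HAZEL_11.0",
]
voice = pick_voice(installed, section["text_to_speech.voice"])
print(rate, volume, voice)
```

Returning None when nothing matches leaves the engine on its default voice, which is the same outcome as the handler’s loop finding no match.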

An example script that performs an HTTP post to the text to speech server:

from lib.bootstrap import Bootstrap
import requests

bootstrap = Bootstrap('default', ['config', 'log'])
registry = bootstrap.bootstrap()

text_to_speech_hosts = registry["config"].get('text_to_speech.hosts')

text = 'There is a visitor at the front door.'

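# Send the text to every configured text to speech server.
for host_and_port in text_to_speech_hosts:
    url = 'http://' + host_and_port + '/text_to_speech'
    # A dict passed as data is sent form-encoded, which get_argument reads.
    payload = {"text": text}
    requests.post(url, data=payload)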

We define the text we want to convert to speech in the “text” variable. Then, for each text to speech server that is configured, we perform an HTTP POST with the text as a form-encoded “text” field, which is what the handler’s get_argument call reads.
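
If you want to call the server without the requests package, the following sketch shows what goes over the wire. When requests is given a dict as the POST body it form-encodes it, and a form-encoded body is what Tornado’s get_argument reads. The sketch uses only the standard library; build_request and the host “mediapc:8888” are hypothetical names chosen for illustration.

```python
from urllib.parse import parse_qs, urlencode
from urllib.request import Request


def build_request(host_and_port, text):
    """Build (but do not send) a form-encoded POST for the text to speech endpoint."""
    body = urlencode({"text": text}).encode("utf-8")
    return Request(
        "http://" + host_and_port + "/text_to_speech",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# "mediapc:8888" is a placeholder host and port.
request = build_request("mediapc:8888", "There is a visitor at the front door.")
print(request.data)

# Decoding the body gives back the "text" argument the server will see.
decoded = parse_qs(request.data.decode("utf-8"))
print(decoded["text"][0])

# To actually send it: urllib.request.urlopen(request, timeout=5)
```

Setting a timeout when sending is worth considering, so that an unreachable media PC does not hold up the rest of the doorbell script.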

In the next part, we’ll look at some other utility libraries before digging into the actual doorbell code.

The Raspberry Pi Powered Speaking Doorbell – Part 2: Kodi Camera Overlay

In Part 1, we looked at building a simple input circuit to electrically isolate and protect our Raspberry Pi from damage when connecting it to our doorbell switch. Next, we’ll look at displaying a camera overlay directly in Kodi / XBMC and triggering the display from a Python script.

IP cameras have become really cheap and easy to come by, giving us many options to integrate video into our home automation systems. The front door is a great place to install a camera!

The Digital Lifestyle has a very good write up on how to install and configure an add-on to display the camera overlay (especially if you have more than one camera) directly in Kodi itself.

We can then use some Python code to trigger the add-on using Kodi’s API.

Prerequisites: In Kodi, make sure “Allow control of Kodi via HTTP” is set to ON in Settings -> Services -> Webserver.

I wrote a small XBMC / Kodi client to make it easier to make calls to the API. Here is the method for triggering an add-on:

    def execute_addon(self, addon_id):
        payload = self._get_payload("Addons.ExecuteAddon",
                                    {'addonid': addon_id})
        self._do_post_request(payload)

And here is the code for instantiating the client and executing the add-on:


xbmc_client = xbmc.XbmcClient(host, port)
xbmc_client.execute_addon("script.frontdoorcam")

On line 1, we instantiate the XBMC client by specifying the host and port. The port is configured in Kodi’s Settings -> Services -> Webserver -> Port.

The only parameter we are passing to the execute_addon method is the ID of the add-on we wish to run. Set it to the add-on ID that you specified in the addon.xml file.
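
The client’s _get_payload and _do_post_request helpers are not shown in this post. As a rough sketch of what such a call involves: Kodi’s API is JSON-RPC 2.0, served over HTTP at /jsonrpc. The function names, host and port below are assumptions for illustration and are not the code from my client.

```python
import json
from urllib.request import Request


def get_payload(method, params, request_id=1):
    """Wrap a Kodi API method and its parameters in a JSON-RPC 2.0 envelope."""
    return {
        "jsonrpc": "2.0",
        "method": method,
        "params": params,
        "id": request_id,
    }


def build_post_request(host, port, payload):
    """Build (but do not send) the HTTP POST for Kodi's JSON-RPC endpoint."""
    return Request(
        "http://{}:{}/jsonrpc".format(host, port),
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


payload = get_payload("Addons.ExecuteAddon", {"addonid": "script.frontdoorcam"})
# "localhost" and 8080 are placeholders; use the host and port set in Kodi.
request = build_post_request("localhost", 8080, payload)
print(request.full_url)
print(request.data.decode("utf-8"))

# To actually send it: urllib.request.urlopen(request, timeout=5)
```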

That’s it. Next, we’ll look at how to build a text to speech server using Tornado.

The Raspberry Pi Powered Speaking Doorbell

Doorbells are so last century. While watching an episode of the futuristic, sci-fi thriller Extant, I realized that the future of the doorbell is now! In the episode, a voice, presumably the product of some smart home automation system, announces that a visitor is at the front door. Easily achievable with a Raspberry Pi and some Python code!

Our future-is-now speaking doorbell uses a Raspberry Pi with a simple input circuit wired to our existing doorbell button. When a visitor presses the doorbell, the Raspberry Pi does a number of things. First, it pauses the currently playing video and displays an on-screen message on both of our media center PCs. Then, using a text-to-speech converter on the living room media PC, it announces that there is a visitor at the front door. Finally, an on-screen video from an IP camera mounted at the front door is displayed on both TVs.

>> Next post – Part 1: The Input Circuit