This article is the 12th in a series of home automation Instructables documenting how to create and integrate an IoT Retro Speech Synthesis Device into an existing home automation system, including all the software functionality necessary for successful deployment in a domestic environment.
Picture 1 shows the completed IoT speech synth device and Picture 2 shows all the component parts used in the prototype, which were form-factor reduced to fit into the final product.
The video shows the device in action (during testing).
Introduction
As mentioned above, this Instructable details how to make an IoT Retro Speech Synthesis Device based around the General Instrument SP0256-AL2.
Its primary purpose is to add 'old school' voice synthesis to an IoT network. Why 'old school' you may ask? Well, because I was around in the 80s when these things were first manufactured and I interfaced one to my BBC Micro, so for me there's some degree of nostalgia surrounding the SP0256-AL2.
I much prefer the challenge of trying to figure out what on earth is being said by this Dalek-sounding voice to listening to the dulcet tones of a hipster Amazon Echo or Siri. Where's the challenge in that, I ask you?
Oh, and not to mention I also have a 'bag load' of SP0256-AL2 ICs lying around.
The device is also capable of reading local temperature and humidity, so it further extends the ambient instrumenting of my existing IoT infrastructure, hooking into the MQTT/OpenHAB based IoT network detailed in this series on home automation (HA) and building on reused code taken from here.
At its heart is an ESP8266-07, which is responsible for MQTT communications and for controlling all system functionality (SD card access, LED control, temperature/humidity sensing, volume control and speech synthesis). The device is fully configurable via text files stored on a local SD card, though calibration and network security parameters can also be programmed via remote MQTT publications.
What parts do I need?
See the bill of materials here
What software do I need?
What tools do I need?
What skills do I need?
Topics Covered
Series Links
To Part 11 : 'IoT Desktop Console. Part 11 : IoT, Home Automation'
Picture 1 above shows the front of the Retro Speech Synthesiser and Picture 2 the rear.
Enclosure Front
Enclosure Rear
The Retro Speech Synth device comprises two PCBs;
Retro Speech Synth IoT Board
This board allows for either the direct soldering of an ESP8266-07/12/12E/13 or 0.1" pitch sockets accommodating an ESP8266 carrier PCB.
The board was designed to expand its I/O over an I2C connection and can support either 3v3 or 5v supply levels via Q1, Q2 and R8-13.
Connection to the board is achieved via one of two headers, J2 and J4: an 8-way DIL IDC ribbon or a 5-way JST/Molex.
U2 and U3 provide 3.3v and 5v on-board supply regulation. Alternatively, if greater current capacity is required, off-board shunt regulators may be attached via connectors J10 and J11 respectively.
Connectors J1 and J3 offer external SD card support over SPI. J1 has been designed for an 8-way Molex, and J3 is pin-for-pin compatible with an off-the-shelf SD card PCB supporting either 3v3 or 5v.
Retro Speech Synth Board
Control of this board is over a 5v-compliant I2C connection via J1, J5 or J6: a 4-way JST/Molex, 8-way DIL IDC or 8-way IDC ribbon connector.
U2, an MCP23017, provides the I2C-to-parallel interface to U3, the SP0256-AL2, and LEDs D1 (Green), D2 (Red) and D3 (Blue). The output of the Speech Synth is fed to audio amp CR1, a TBA820M, via either analogue pot RV1 or digital pot U1, an MCP4561.
Digital Pot U1 is also controlled via 5v compliant I2C.
Note : The ESP8266-07 device was chosen as it has an integral IPX RF connector, allowing an external WiFi antenna to be fitted to the aluminum enclosure.
Pictures 1 and 2 show the completed and wired PCB sub-assemblies located on the aluminum enclosure substrate.
The two PCBs were designed using KiCad v4.0.7, manufactured by JLCPCB and assembled by me, as shown above in Pics 3 to 13.
Picture 1 shows a Haynes Manual style layout of all the prefabricated parts before final assembly.
Pics 2 ... 5 show various shots during the fabrication of the enclosure with minimal clearances.
This IoT Retro Speech Synthesis Device contains six key software components as shown in pic 1 above.
SD Card
This is the external SD SPI Flash Filing System and is used to hold the following information (see pic 2 above);
mDNS Server
This functionality is invoked when the IoT device has failed to connect to your WiFi network as a WiFi station and has instead become a WiFi access point, somewhat akin to a domestic WiFi router. In the case of such a router you would typically connect to it by entering an IP address such as 192.168.1.1 (usually printed on a label affixed to the box) directly into your browser's URL bar, whereupon you would receive a login page asking for the username and password that allow you to configure the device. For the ESP8266-07 in AP (Access Point) mode the device defaults to the IP address 192.168.4.1; however, with the mDNS server running you only have to enter the human-friendly name 'SPEECHSVR.local' into the browser URL bar to see the 'Speech Synth Configuration Home Page'.
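For illustration, a minimal sketch of such a responder using the stock ESP8266 core libraries might look like the following. The hostname matches the 'SPEECHSVR.local' name above; everything else is generic boilerplate rather than the device's actual source.

```cpp
#include <ESP8266WiFi.h>
#include <ESP8266mDNS.h>
#include <ESP8266WebServer.h>

ESP8266WebServer server(80);            // serves the configuration home page

void setup() {
  // ... by this point the device has fallen back to AP mode ...
  if (MDNS.begin("SPEECHSVR")) {        // answer to 'SPEECHSVR.local'
    MDNS.addService("http", "tcp", 80); // advertise the web server
  }
  server.begin();
}

void loop() {
  MDNS.update();                        // keep the mDNS responder serviced
  server.handleClient();
}
```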
MQTT Client
The MQTT client provides all the necessary functionality to connect to your IoT network's MQTT broker, subscribe to the topics of your choice and publish payloads to a given topic. In short, it provisions core IoT functionality.
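A minimal sketch of this role using PubSubClient.h (listed in the references at the end of this article) is shown below; the broker IP and client ID are placeholders, and the subscribed topic is the one quoted later in this article.

```cpp
#include <ESP8266WiFi.h>
#include <PubSubClient.h>

WiFiClient wifiClient;
PubSubClient mqttClient(wifiClient);

// Called for every payload arriving on a subscribed topic
void mqttCallback(char* topic, byte* payload, unsigned int length) {
  // e.g. queue a phrase for the speech synth, adjust volume, etc.
}

void setup() {
  // ... WiFi station connection (see below) ...
  mqttClient.setServer("192.168.1.100", 1883);           // placeholder broker IP
  mqttClient.setCallback(mqttCallback);
  if (mqttClient.connect("SpeechSynth1")) {              // placeholder client ID
    mqttClient.subscribe("WFD/SpeechTH/1/Word/Command"); // topic quoted later on
  }
}

void loop() {
  mqttClient.loop();  // service subscriptions and keep-alives
}
```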
HTTP Web Server
This web server has two purposes;
WiFi Station
This functionality gives the IoT device the capability to connect to a domestic WiFi network using the parameters in the Security Information file; without this, your IoT device will not be able to subscribe/publish to the MQTT broker.
WiFi Access Point
The ability to become a WiFi Access Point is a means by which the IoT device allows you to connect to it and make configuration changes via a WiFi station and a browser (such as Safari on the Apple iPad). The access point broadcasts an SSID of "SPEECHSYN" plus the last 6 digits of the IoT device's MAC address. The password for this closed network is imaginatively named 'PASSWORD'. A sketch of the station-connect-with-AP-fallback behaviour is given below.
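The following is a minimal sketch of that fallback, assuming the SSID and password have already been read from secvals.txt (the real source wraps this in the state machine described in the next step):

```cpp
#include <ESP8266WiFi.h>

void startNetworking(const char* ssid, const char* password) {
  WiFi.mode(WIFI_STA);
  WiFi.begin(ssid, password);                 // values from secvals.txt
  for (int i = 0; i < 20 && WiFi.status() != WL_CONNECTED; i++)
    delay(500);                               // allow ~10s to associate

  if (WiFi.status() != WL_CONNECTED) {
    // Fall back to AP mode; SSID = "SPEECHSYN" + last 6 digits of the MAC
    String mac = WiFi.macAddress();           // "AA:BB:CC:DD:EE:FF"
    mac.replace(":", "");
    String apSsid = "SPEECHSYN" + mac.substring(6);  // last 6 hex digits
    WiFi.mode(WIFI_AP);
    WiFi.softAP(apSsid.c_str(), "PASSWORD");  // device sits at 192.168.4.1
  }
}
```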
Preamble
To successfully compile this source code you will need a local copy of the code and libraries outlined below in Step 12, References Used. If you are not sure how to install an Arduino library go here.
Overview
The software makes use of a state machine as shown in pic 1 above (a full copy of the source is in my GitHub repository here). There are 5 main states as outlined below;
The events controlling transitions between states are described in pic 1 above. Transitions between states are also governed by the following SecVals parameters;
As mentioned above, if the IoT device is unable to connect as a WiFi station to the WiFi network whose SSID and password are defined in secvals.txt held on the SD card, the IoT device will become an access point. Once connected to this access point it will serve up the 'Speech Synth Configuration Home Page' shown above in Pic 2 (reached by entering either 'SPEECHSVR.local' or 192.168.4.1 into your browser's URL bar). This home page allows the reconfiguration of the IoT Retro Speech Synthesis Device via an HTTP browser.
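Putting the above together, a hedged sketch of the state machine's main dispatch might look like this. The state names and the connectWiFi()/connectMqtt() helpers are illustrative stand-ins, not the actual identifiers from the source.

```cpp
bool connectWiFi();   // hypothetical helpers standing in for the real code
bool connectMqtt();

enum SysState { INIT, NO_CONFIG, PENDING_NW, PENDING_MQTT, ACTIVE };
SysState state = INIT;

void loop() {
  switch (state) {
    case INIT:          // read SD card config, then attempt the network
      state = PENDING_NW;
      break;
    case PENDING_NW:    // try to join the WiFi network as a station
      state = connectWiFi() ? PENDING_MQTT : NO_CONFIG;
      break;
    case NO_CONFIG:     // AP mode: serve the configuration home page
      break;
    case PENDING_MQTT:  // connect to the broker and subscribe
      if (connectMqtt()) state = ACTIVE;
      break;
    case ACTIVE:        // normal operation (described below)
      break;
  }
}
```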
Remote Access whilst in the ACTIVE state
Once connected to the MQTT broker it is also possible to both re-calibrate and reconfigure the device via MQTT topic publications. The file calvals.txt is exposed with R/W access and secvals.txt with write-only access.
Also as mentioned above, once in the active mode it is possible to access the Speech Synth via an HTTP interface by entering 'SPEECHSVR.local' or 192.168.4.1 into your browser's URL bar. This HTTP-based interface allows for basic control of the Speech Synth. Pics 3, 4 and 5 show the web pages available.
User debug
During the boot sequence the IoT device's green System LED at the rear of the enclosure gives the following debug feedback;
IoT Retro Speech Synthesis Device Functionality in ACTIVE State
Once in the ACTIVE state the ESP8266 enters a continual loop calling the following functions: timer_update(), checkTemperatureAndHumidity() and handleSpeech(). The net result, as sketched below, has been designed to present the user with an HTTP or MQTT interface, seamlessly service its on-board speech processor with phonemes on demand and publish local ambient parametric values over MQTT.
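The three function names above come from the article; their bodies, and the final two calls in this sketch, are my assumptions about how such a loop is typically serviced.

```cpp
void loop() {
  timer_update();                 // service software timers
  checkTemperatureAndHumidity();  // read the sensors, publish changes via MQTT
  handleSpeech();                 // feed the SP0256-AL2 with phonemes on demand
  mqttClient.loop();              // assumed: keep MQTT subscriptions serviced
  server.handleClient();          // assumed: service the HTTP interface
}
```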
A comprehensive list of all topic subscriptions and publications including payload values is included in the source code.
When the IoT device powers up, as part of the boot sequence two files named 'calvals1.txt' and 'calvals2.txt' are read from the SD card.
The contents of these files are calibration constants as indicated above in pic 1.
These calibration constants are used to adjust the readings acquired from the two sensors to bring them into line with a reference device. One further value defines a reporting strategy for each device; it is described below along with the procedure followed to calibrate the sensors.
Reporting Strategy
This parameter determines how the remote sensor reports any ambient parametric changes local to it. If a value of 0 is selected, the remote sensor will publish any change it sees in temperature or humidity each time the respective sensor is read (approximately every 10 seconds). Any other value will delay the publication of a change by 1...60 minutes. Modifying this parameter allows for optimisation of MQTT network traffic, as sketched below. It should be noted that temperature and humidity data from the DHT22 are read alternately due to limitations of the sensor.
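The following is an illustrative fragment of that logic for the temperature path; all names (including the topic) are hypothetical stand-ins, and reportingStrategy holds the 0...60 value read from the calibration file.

```cpp
void maybePublishTemperature(float newTemp) {
  static float lastTemp = -999.0f;
  if (newTemp == lastTemp) return;                    // no change, nothing to do
  if (reportingStrategy == 0 ||                       // 0: report immediately
      minutesSinceLastReport >= reportingStrategy) {  // else wait 1..60 minutes
    mqttClient.publish("WFD/SpeechTH/1/Temperature/1",
                       String(newTemp).c_str());
    lastTemp = newTemp;
    minutesSinceLastReport = 0;
  }
}
```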
Temperature calibration
To calibrate the temperature sensor I followed the same process as outlined here in step 4, again using a simple y=mx+c relationship. I used IoT Temperature, Humidity Sensor #1 as the reference device. Values from the sensor are in degrees Celsius.
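In code, the y=mx+c correction reduces to something like the sketch below (the same shape applies to the humidity reading in the next section); m and c are the calibration constants read from the SD card, and the numbers shown are illustrative only.

```cpp
float calibrateTemperature(float rawReading) {
  const float m = 0.98f;        // slope calibration constant
  const float c = -0.30f;       // offset calibration constant
  return (m * rawReading) + c;  // corrected value in degrees Celsius
}
```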
Humidity Calibration
As I possess no means to accurately record or even control local ambient humidity, to calibrate the sensor I used a similar approach to that above (here, step 4), again using Sensor #1 as the reference. That said, I have recently found an excellent article on the web describing how to calibrate humidity sensors, and I may well try this approach sometime in the future. Values from the sensor are in % relative humidity.
As mentioned in an earlier Instructable (here) I settled on the topic naming convention outlined in pic 1 above.
Namely, 'AccessMethod/DeviceType/WhichDevice/Action/SubDevice'. It's not perfect, but it does allow useful filters to be applied to see all sensor outputs for a given parametric topic, allowing for easy comparison as in pic 2 above with MQTTSpy.
This project is the first instance where a single device contains more than one originating source of the same type of publication, i.e. two temperature/humidity sensors, from internal and external sub-devices.
It also supports reasonably extensible logical groupings of functionality within a given IoT device.
In implementing these topics in software I used hard-coded topic strings with fixed, embedded numerical identifiers for each device, as opposed to dynamically generating the topics at run time, so as to save on RAM and keep performance high.
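In practice that means declarations along these lines; only the .../Word/Command topic appears in this article, the second is a hypothetical example of the same convention.

```cpp
const char* cmdTopic  = "WFD/SpeechTH/1/Word/Command";  // quoted later in this article
const char* tempTopic = "WFD/SpeechTH/1/Temperature/1"; // hypothetical example
```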
Note : If you're not sure how to use MQTTSpy, see here: 'Setting Up an MQTT Broker. Part 2 : IoT, Home Automation'
By and large, for my hobby projects, where possible I tend to build a representative hardware prototype against which the software is developed. I rarely have any issues when integrating the software into the final platform hardware.
However, on this occasion I came across a strange intermittent fault whereby some phonemes would sound out but others would not.
After some initial debugging of the Speech Synth PCB using an Arduino Uno to source phonemes and prove this board was working, I took a scope to the I2C lines between the IoT PCB and the Speech Synth PCB. See Pic 1 above.
You can clearly see the 'saw tooth'/exponential edge to the I2C signal on the traces.
This is usually an indication the I2C pull-up values are too high, preventing the line voltage from rising fast enough in an open-drain circuit (the rise time is set by the RC product of the pull-up resistance and the bus capacitance).
As a 'work around' I paralleled the two SMT pull-up resistors R12 and R13 with 10Ks to give 4K7, and sure enough the Speech Synth 'burst into life'.
This type of failure is the opposite of what can happen when debugging these types of projects. In general, most of the I2C-based modules purchased from eBay tend to come with 10K or 4K7 pull-ups already fitted. If you intend to use 5 or more I2C modules, each with 4K7 pull-ups, the overall load is 940R or less (five 4K7 resistors in parallel give 4700/5 = 940R), which will be too great for the output stage of the master. The fix would be to de-solder all but one set of pull-up resistors on each module, preferably the set physically furthest away from the master.
A useful tip and worth keeping in mind when designing electronics with I2C devices.
Testing was carried out using two methodologies: manual and automated.
The first, manual, approach was generally used during initial code development: using MQTT Spy to exercise all of the available subscribed topics and check the published responses (depicted in pic 2 above). As this is a manual process it can become time consuming and prone to errors as code development progresses, although manual execution does enable 100% coverage.
MQTTSpy was chosen for manual testing because it is an excellent tool for hand-formatting a given payload and publishing it to any topic with ease. It also displays a clear, time-stamped log which is very useful for debugging (pic 3 above).
The second, automated, approach was adopted as the source code became more complex (>3700 lines). Increased complexity means longer manual testing cycles and more complex tests. In order to improve the reliability, determinism and quality of the tests, automated testing was used via a Python test executive (pic 1). See Step #10 in this Instructable on how automated testing was introduced. A full copy of the automated tests used in this Instructable is available here.
A video of the automated test sequence in operation is shown above. The sequence executes the following steps;
Although it took a lot of effort with files and drills, etc., especially for the speaker grille, I think the outcome is aesthetically pleasing and packs into a nice, small enclosure. I could have made it smaller, but it would have needed to go onto one PCB, and I deliberately broke it into two so I could re-use the PCBs at a later date for other projects. So it's a happy compromise.
The software works well, the IoT device has been in stable operation for quite some time now without any issues.
I've been monitoring the temperature and humidity via Grafana and comparing them with a co-located device. The two ambient values have been correlating well, implying the calibration is reasonable (or at least that the two devices are similar).
I stopped short of implementing the word command ('WFD/SpeechTH/1/Word/Command') because I ran out of time and needed to move on. I may well revisit this if and when I set up a MySQL database. Right now I'm using InfluxDB.
The following sources were used to put this Instructable together;
Source code for the IoT Retro Speech Synthesis Device (this contains a copy of everything)
PubSubClient.h
DHT.h
Adafruit_AM2320.h/Adafruit_Sensor.h
MCP4561_DIGI_POT.h
Adafruit_MCP23017.h
For fun
PCB Manufacture
Installing Additional Arduino Libraries
How to Check and Calibrate a Humidity Sensor
SP0256-AL2 Datasheet
Speech Chips Shop