extrawurst's blog

Alexa meets german developer

• reading, programming, dlang, and alexa

About 2 weeks ago I suddenly got my inventation to order an echo dot. The echo device family is the entry point to using the Alexa natural language processing service from Amazon. That is like having Apple Siri at home, or Microsofts Cortana.

I have to admit I was not entirely sure about this tech befor I got it, simply because of the practical usefulness that I was not able to imagine befor starting to put my hands on it.

TL;DR

My quick summary: This kind of speech control interface to modern technology (or VUI - Voice User Interface - as Amazon calls it) will revulutionize the way we interact with devices in the near future. I did not expect it to be as far as it is already. There is still things to do like keeping context in the ‘conversations’ but for me the experience was mindblowing: control your home using Alexa, control your music streaming with alexa, have her wake you up in the morning and stop the playback of chillout music at night when I sleep, have her manage my todo and shopping lists … and lots more.

Developing for Alexa

Only after I had bootet up Alexa the first time a colleague of mine told me he developed stuff for extending Alexa himself. Amazon did a great job allowing nerds like me to extend Alexa with custom skills in virtually every language because interfacing with it you only have to access and provide http and json (docs). This allowed me to use my favorite hobby programming language: dlang

The first skill I developed so far is already on github: alexa-openwebif

I will cover details on the development of custom skills using D in future post.

Food for thought for the guys at the Amazon Alexa team

Despite my enthusiasm there is a lot of topics that I have critique to raise. It is nothing major but I am hoping they will be tackled before the roll out in Germany starts big (right now you only can subscribe to get invited).

no multi language support and foreign skills

Right now you have to switch your language in the alexa app, you cannot just talk english or tell alexa to switch to english. Even if you change the language of your device you cannot access the existing thousands of skills in the U.S., you have to switch your whole amazon account country to the U.S.. It is not clear to me why you want to make it that tough to use the original U.S. skills in germany, after all english is a very commonly spoken language here.

This makes it for developers even harder to find out what skills it is actually worth developing from scratch. U.S. skills will likely be translated at some point.

built in slot types not in german

As you can see in the slot-type-reference most of the predefined slots that Amazon provides are not supported in German yet. This makes it very hard to convert skills and for German devs it is harder to create the same user experience that U.S. users benefit from. This has to change befor the open rollout of Alexa happens in Germany. This maybe one of the reasons why there is almost no skill in German yet.

updating skill through api

As a developer I am a fan of automating as much as possible. There is already a lot of functionality I can automate when it comes to alexa skill development because AWS is so CLI friendly. The Amazon developer console on the other hand is just a web portal. It would be great if the whole update lifecycle of a skill could be automated:

lang tags in ssml

Defining the way Alexa is supposed to speak out text the System supports SSML as a markup language. What I would like to have is a language markup (like <language lang="english">) to give Alea a hint how to pronounce certain foreign words.

custom settings per skill

Alexa skills are plain web APIs in the cloud. As a skill developer you dont have a mean to save any state for the user accessing it on the device. The user is used to be able to configure Alexa in their Alexa App - this only applys to settings Amazon specifies (like language, volume and such). I would like to see a Skill-Settings-Panel that lets the user specify settings per skill that Amazon then provides to us developers when calling our API. This is for most of the skills more than enough compared to be forced to provide a custom user portal with OAuth account linking and the whole ceremony.

smaller issues

comments powered by Disqus