0
votes

I developed a smart home device cloud service based on Python, but when I tried to integrate it with Actions on Google, I found that their Python library says: "Google Assistant Library for Python is deprecated as of June 28th, 2019, Use the Google Assistant Service instead."

I went to the Google Assistant Service page, and for Python it says: "You can't launch commercial devices that integrate with the Google Assistant SDK. It's available for experimental and non-commercial uses only."

Why is this the case? Does it mean it is still in beta for them? Should I not bother with Python? I would really like to stick with Python, since I've spent a good amount of time developing in it.

1
It isn't clear what you're trying to do, exactly. Are you trying to build something like a Smart Speaker with the Assistant built in (so it has a microphone and people will talk to the Assistant through the device)? Or are you trying to build a Smart Home device and build your cloud back-end for this device in python (so you can control it through the Assistant)? - Prisoner
@Prisoner, the latter. I am trying to control it through the Assistant. I can already do it through Dialogflow, but I would like to have it work with Actions on Google. - Hyperian

1 Answer

4
votes

I think you're mixing up a few different, but related, things, some of which have similar or overlapping names. To try and clear things up:

Google Assistant SDK

The Google Assistant SDK and Google Assistant gRPC service enable you to build a device that works like a Google Home. So people would interact directly with your device and use it to control the Assistant.

  • There are python libraries for this because python is used by many hobbyists on their devices.
  • There used to be a more full-featured SDK (for python), but this appears to no longer be supported.
  • Even with the limited support, this is mostly for hobbyists. It sounds like most devices being made for consumers are using other platforms that require you to partner with Google directly.

Actions on Google

This is a broad term, describing ways that you can make something that people will use through the Google Assistant, via devices such as their Smart Speakers or phone.

It can very roughly be broken up into a couple of different approaches, some of which overlap:

Usually when people talk about Actions on Google, however, they're talking about one of the first two items, and often confusing the Action SDK and Dialogflow.

None of these specifically support, nor prohibit, python.

Smart Home Actions

Smart Home Actions are specifically built to work with the set of devices and traits that Google has built conversational experiences for.

There are a number of important distinctions about Smart Home Actions:

  • You don't need to figure out exactly what the user can say. That is determined by the devices and traits you support. Google has built the vocabulary and sends you narrowly defined, discrete commands rather than broad conversations.
  • Users don't need to specifically invoke your product by name. They configure the connection to your product through the Google Home setup app and then can address your devices more generically.

Your server can be written in any language you wish - Google will send JSON containing the commands to your registered HTTPS endpoint, and expects you to reply with JSON as well. There is no specific python library for this - but mostly there doesn't need to be. The most difficult part is that you will need to support OAuth for account linking, but that is a bigger issue than what language you're using and affects your entire platform.
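To illustrate why no dedicated library is needed, here is a minimal sketch of the JSON-in, JSON-out handling in plain python. The intent, type, trait, and command strings follow Google's published Smart Home format as I understand it, so check them against the current documentation. The `DEVICES` store, the `handle_smart_home` function, and the `user-123` id are made up for illustration, and OAuth, the HTTPS server, and the DISCONNECT intent are left out entirely.

```python
# Hypothetical in-memory device store; a real service would use its own database.
DEVICES = {"lamp-1": {"name": "Desk lamp", "on": False}}


def handle_smart_home(request: dict, agent_user_id: str = "user-123") -> dict:
    """Turn one parsed Smart Home request into a reply dict (sketch only)."""
    intent_input = request["inputs"][0]
    intent = intent_input["intent"]
    payload: dict = {}

    if intent == "action.devices.SYNC":
        # Tell Google which devices this user has and which traits they support.
        payload = {
            "agentUserId": agent_user_id,
            "devices": [
                {
                    "id": device_id,
                    "type": "action.devices.types.LIGHT",
                    "traits": ["action.devices.traits.OnOff"],
                    "name": {"name": device["name"]},
                    "willReportState": False,
                }
                for device_id, device in DEVICES.items()
            ],
        }
    elif intent == "action.devices.QUERY":
        # Report the current state of each requested device we know about.
        asked = [d["id"] for d in intent_input["payload"]["devices"]]
        payload = {
            "devices": {
                device_id: {
                    "online": True,
                    "on": DEVICES[device_id]["on"],
                    "status": "SUCCESS",
                }
                for device_id in asked
                if device_id in DEVICES
            }
        }
    elif intent == "action.devices.EXECUTE":
        # Apply each OnOff command to the known devices and report the new state.
        results = []
        for command in intent_input["payload"]["commands"]:
            ids = [d["id"] for d in command["devices"] if d["id"] in DEVICES]
            for execution in command["execution"]:
                if execution["command"] == "action.devices.commands.OnOff":
                    new_state = execution["params"]["on"]
                    for device_id in ids:
                        DEVICES[device_id]["on"] = new_state
                    results.append(
                        {
                            "ids": ids,
                            "status": "SUCCESS",
                            "states": {"online": True, "on": new_state},
                        }
                    )
        payload = {"commands": results}

    # Unknown intents fall through with an empty payload in this sketch.
    return {"requestId": request["requestId"], "payload": payload}
```

Any python web framework can wrap this: parse the POST body as JSON, check the OAuth bearer token, call the function, and return the result as JSON.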

Smart Home Actions can also support the Local Home SDK, which allows commands to be executed on many devices directly, without having to go to your server for processing. This must be written in either TypeScript or JavaScript, so it does not support python.

If you are building for a Smart Home device, you should be using Smart Home Actions rather than anything else. The only reason you might not want to is if you have a device type so different from the currently supported devices that you need to make a conversational Action with Dialogflow and/or the Action SDK.

Dialogflow

Dialogflow is a product from Google Cloud that provides a Natural Language Processing system for a number of different configurations and integrations.

One way to use it (and the only one I discuss here) is to process and fulfill conversations through the Google Assistant:

  • Users invoke your Action through the Assistant, usually by saying something like "Hey Google, talk to Shakespearean Insult"
  • This invocation, and every step in the conversation afterwards, is converted from speech into text by the Assistant and then sent to your configuration in Dialogflow
  • Dialogflow determines which Intent matches this user input
  • If the matched Intent is configured to do so, it will then forward the request and additional information to a Fulfillment webhook that you have written that is running on a server you control somewhere
  • This fulfillment can then process the input, determine a reply, and send this back to Dialogflow, which will send it back to the Assistant, which will send it to the user

This fulfillment can be written in nearly any programming language you want, including python. The only requirements are that

  1. It can run on a publicly accessible HTTPS server
  2. It can accept JSON in the Dialogflow fulfillment request format and return JSON in the Dialogflow+Action fulfillment response format.
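As a rough sketch of requirement 2, the function below takes a parsed Dialogflow (v2) fulfillment request and builds a reply. The field names (`queryResult`, `fulfillmentText`, `payload.google.richResponse`) are the ones I believe the v2 format uses, so verify them against the current reference. The `turn_on_device` intent, the `device` parameter, and the `handle_dialogflow` name are hypothetical.

```python
def handle_dialogflow(request: dict) -> dict:
    """Build a fulfillment reply for one parsed Dialogflow webhook request."""
    query_result = request.get("queryResult", {})
    intent_name = query_result.get("intent", {}).get("displayName", "")
    parameters = query_result.get("parameters", {})

    if intent_name == "turn_on_device":  # hypothetical intent name
        device = parameters.get("device") or "device"
        text = f"Okay, turning on the {device}."
    else:
        text = "Sorry, I can't help with that yet."

    return {
        # Generic reply text used by Dialogflow integrations.
        "fulfillmentText": text,
        # Assistant-specific part of the reply.
        "payload": {
            "google": {
                "expectUserResponse": True,
                "richResponse": {
                    "items": [{"simpleResponse": {"textToSpeech": text}}]
                },
            }
        },
    }
```

Requirement 1 is then a matter of exposing this through whatever HTTPS-capable python web framework you already use.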

There is no specific library from Google that supports these JSON formats, but they are fairly straightforward if you want to implement it yourself. There have also been python libraries worked on by the community, but I don't know enough about them to advise which are the best ones right now or which ones work with the current protocol.

Action SDK

Sometimes this is called the Conversation API or SDK in the documentation, although usually they call it the Action SDK these days.

This is similar to how Dialogflow works (in fact, Dialogflow uses it), but differs in that there is no NLP system that can determine the user's Intent from their speech:

  • Users still invoke the Action with a phrase such as "Hey Google, talk to Shakespearean Insult"
  • This invocation, and every step after, is converted from speech into text by the Assistant
  • The difference, however, is that this text is sent directly to your webhook, along with some other metadata in the conversation request format JSON
  • It is up to you to send it to an NLP/NLU system to work out what the user meant.
    • You may think you can do this with regexps. You can't. But there are many other good NLP/NLU libraries out there that work with python.
  • Your webhook will send a response using JSON which the Assistant will send to the user.
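The flow above might look something like this in python. Treat it as a sketch under assumptions: the request and response field names are my reading of the conversation webhook format and should be checked against the documentation, and `classify_intent` is only a placeholder showing where a real NLP/NLU call would go (its keyword check is not a workable substitute for one). All function names and reply texts are hypothetical.

```python
def classify_intent(text: str) -> str:
    """Placeholder for a real NLP/NLU system; naive keyword check for illustration."""
    words = text.lower().split()
    if "bye" in words or "goodbye" in words:
        return "quit"
    if "insult" in words:
        return "insult"
    return "unknown"


def ask(text: str) -> dict:
    """Reply and keep the conversation open for more user input."""
    return {
        "expectUserResponse": True,
        "expectedInputs": [
            {
                "possibleIntents": [{"intent": "actions.intent.TEXT"}],
                "inputPrompt": {
                    "richInitialPrompt": {
                        "items": [{"simpleResponse": {"textToSpeech": text}}]
                    }
                },
            }
        ],
    }


def tell(text: str) -> dict:
    """Reply and end the conversation."""
    return {
        "expectUserResponse": False,
        "finalResponse": {
            "richResponse": {
                "items": [{"simpleResponse": {"textToSpeech": text}}]
            }
        },
    }


def handle_conversation(request: dict) -> dict:
    """Handle one parsed conversation request sent directly by the Assistant."""
    user_input = request["inputs"][0]
    if user_input.get("intent") == "actions.intent.MAIN":
        return ask("Welcome. What would you like to hear?")

    # The Assistant only supplies the transcribed text; understanding it is our job.
    raw_inputs = user_input.get("rawInputs", [])
    query = raw_inputs[0].get("query", "") if raw_inputs else ""
    intent = classify_intent(query)

    if intent == "quit":
        return tell("Goodbye.")
    if intent == "insult":
        return ask("Here is your insult. Want another?")
    return ask("Sorry, I didn't understand that.")
```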

Again, there is no specific Google-supported python library to handle this, but there may be community-developed libraries that can do so.

Unless you have a very good reason for using the more raw Action SDK (such as existing components that already are using an existing NLP/NLU system), you should probably use Dialogflow.