The current way to authenticate a user through their Google account is to use Google Sign-In for Assistant. Once they log in to your Action, you'll get an ID token, which you can decode to get their Google ID. You can then use that ID to look up their account in your datastore and get their access and refresh tokens.
Since you need additional scopes, a user who logs in through the Assistant but has not yet granted those scopes will need to be redirected to a web-based login page, where they can log in using Google Sign-In with the scopes you need. When they log in and authorize access through the web, you will get an auth code, which you need to exchange for the auth token and refresh token, and then store both.
You do not need to create your own OAuth endpoints for this, although you will need to do a bit of additional work to make sure they get redirected to your website to do the authorization if necessary.
You will only get the auth code once, at the point where they log in and authorize you. You will need to exchange it for the auth and refresh tokens and then store these tokens.
Update, to explain things a little better:
Looking at the architecture, we see it has a few components. We’ll go into the details of each of these as we go through the process flow:
You have a data store of some sort, where you will store the Auth Token and Refresh Token for the user. I’m going to assume that you’re using Google’s User ID as the index for this data store.
By "Google User ID" in this case, I mean the unique numeric identifier that Google assigns to each account. This is often represented as a string, despite consisting only of digits, since it is usually too long to fit in most numeric types. In the ID Token, this is the "sub" claim.
In theory, you could use other identifiers that are available from the claims in the ID Token, such as their email address. Unfortunately, not all of these fields are guaranteed to be available - only the "sub" is guaranteed.
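To make the indexing concrete, here is a minimal sketch of a data store keyed by the "sub" claim. The names `TokenStore`, `StoredTokens`, `unverified_claims`, and `user_id_from_claims` are hypothetical, and the in-memory dictionary stands in for whatever database you actually use. Note that the decoding helper only reads the token payload and does not check the signature, so it is for illustration only; a real webhook must verify the ID Token (for example with Google's auth library) before trusting any claim.

```python
import base64
import json
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class StoredTokens:
    auth_token: str
    refresh_token: str


class TokenStore:
    """Hypothetical in-memory store, indexed by the Google User ID ("sub")."""

    def __init__(self) -> None:
        self._by_user_id: Dict[str, StoredTokens] = {}

    def save(self, user_id: str, tokens: StoredTokens) -> None:
        self._by_user_id[user_id] = tokens

    def load(self, user_id: str) -> Optional[StoredTokens]:
        # None means "authenticated, but not yet authorized".
        return self._by_user_id.get(user_id)


def unverified_claims(id_token: str) -> dict:
    """Read the payload of a JWT WITHOUT verifying its signature.

    Illustration only: production code must verify the token first.
    """
    payload_b64 = id_token.split(".")[1]
    # JWT segments are base64url without padding, so restore the padding.
    padding = "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64 + padding))


def user_id_from_claims(claims: dict) -> str:
    # Keep "sub" as a string: it is too long for most numeric types.
    return str(claims["sub"])
```

Keying on "sub" rather than the email address means the lookup keeps working even when the optional claims are missing.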
You have a web server that will have a few important URLs for our purposes:
- The webhook for your Action fulfillment.
- A login/auth page.
- An endpoint where the JavaScript on the login page will send you the Auth Code.
The Google Assistant, which may be running on a Google Home or on a mobile device. We also assume that the user will be able to get to a browser to review what they are authorizing.
The Google services that you will be using, including Google’s OAuth service.
Let’s start with the case where the user has previously logged in and authorized us to access the service on their behalf. We have their Auth Token and Refresh Token in our data store, indexed against their Google User ID. This is the simple case, but it helps us understand the more complicated case of how all that data gets in there.
The data flow looks something like this:
1. The Assistant sends the Action webhook an Intent and possible parameters with it. If this is the first message, it is a welcome intent, but it doesn’t matter. It includes an Identity Token, which we will need to decode and verify. As part of the data we get when we decode it, it includes the User ID for the user.
2. Using the User ID…
3. ...we get the Auth Token and Refresh Token from the data store.
4. With the Auth Token and Refresh Token, we can carry out some action against Google’s services, acting on behalf of the user.
5. We’ll get some results back from the service…
6. ...which we usually want to pass back to the user in some form.
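The simple flow above could be sketched roughly like this. Everything here is an assumption for illustration: `handle_known_user` is a hypothetical name, the token verifier and the Google service call are passed in as plain callables so the sketch stays self-contained, and the data store is just a dictionary keyed by User ID.

```python
from typing import Callable, Dict, Tuple

Tokens = Tuple[str, str]  # (auth_token, refresh_token)


def handle_known_user(
    identity_token: str,
    verify_identity_token: Callable[[str], dict],
    token_store: Dict[str, Tokens],
    call_google_service: Callable[[str, str], str],
) -> str:
    """Handle a request from a user who has already authorized us."""
    # Decode and verify the Identity Token to get the claims.
    claims = verify_identity_token(identity_token)
    # The User ID is the "sub" claim.
    user_id = claims["sub"]
    # Fetch the stored tokens, indexed by User ID.
    auth_token, refresh_token = token_store[user_id]
    # Act against the Google service on the user's behalf.
    result = call_google_service(auth_token, refresh_token)
    # Pass the result back to the user in some form.
    return f"Here is what I found: {result}"
```

In a real webhook the verifier would check the token's signature and audience, and the service call would refresh the Auth Token when it has expired.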
Easy, right? But what if the user has never used the Assistant to talk to our Action before? And has never authorized us to access their Google services, so we don’t have their tokens? That flow looks more like this:
1. The Assistant sends the Action webhook an Intent and possible parameters. This will be the first message, so our welcome intent is triggered. There is no Identity Token.
2. The webhook sees there is no Identity Token, so it sends back a message requesting the “Sign In” helper function. Since your project is configured to use Google Sign-In, the Assistant will ask the user whether they are willing to share their profile information with you.
3. If they say yes, you’ll get another response saying they have signed in and including the Identity Token, which we decode and verify to get their User ID. (If they say no, we’ll get a response saying it failed. How you handle this is another story. I’m going to assume they say yes.)
4. Using the User ID…
5. ...we try to get the Auth Token and Refresh Token from the data store. But they haven’t authorized us yet. We have authenticated them, but don’t have authorization...
6. ...so we send back a message saying they need to visit our website to authorize us to access their Google services. We may require them to switch to a mobile device to do this part and even include a link to the login page.
7. They will follow the link on a device that has a screen.
8. We’ll send back the login page, which includes a Login with Google button. We’ve configured this button to also ask for the additional scopes we need to access their services, as well as permission to access the services on their behalf when “offline”.
9. They will go through the Google Login dance, the OAuth scopes screen, and will hopefully grant all the permissions we want. (Again, I ignore what happens if they don’t.) I omit what that dance looks like since it doesn’t involve us. Assuming that all goes well, Google gives them an Auth Code, which the JavaScript on the login page sends to us.
10. We call Google’s OAuth servers to verify the Auth Code and use it to get the Auth Token and Refresh Token…
11. … which we then store in the data store…
12. … and then send back something so the JavaScript on the page can tell the user that they can use our Action normally from now on.
Which they can now do, and it behaves as it did in the earlier, simple, scenario.
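The part of that flow where we trade the Auth Code for tokens might look something like the sketch below. It uses only the standard library and posts to Google's OAuth 2.0 token endpoint. The function names are hypothetical, and the `"postmessage"` redirect value is an assumption based on the Auth Code arriving from the JavaScript popup flow; check it against how your own OAuth client is configured. What I call the Auth Token comes back in the response as `access_token`.

```python
import json
import urllib.parse
import urllib.request

GOOGLE_TOKEN_URL = "https://oauth2.googleapis.com/token"


def build_token_request(
    auth_code: str,
    client_id: str,
    client_secret: str,
    redirect_uri: str = "postmessage",  # assumption: code came from the JS popup flow
) -> urllib.request.Request:
    """Build the POST that exchanges a one-time Auth Code for tokens."""
    form = urllib.parse.urlencode({
        "code": auth_code,
        "client_id": client_id,
        "client_secret": client_secret,
        "redirect_uri": redirect_uri,
        "grant_type": "authorization_code",
    }).encode("ascii")
    return urllib.request.Request(
        GOOGLE_TOKEN_URL,
        data=form,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def exchange_auth_code(request: urllib.request.Request) -> dict:
    """Send the request to Google and return the parsed JSON body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def tokens_from_response(body: dict) -> tuple:
    """Pull out (auth_token, refresh_token) from the token response."""
    # The refresh token is only present when offline access was granted,
    # so it may be missing.
    return body["access_token"], body.get("refresh_token")
```

Since the Auth Code can only be used once, the endpoint that receives it should do the exchange and write the resulting tokens to the data store right away, keyed by the User ID.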
That looks complex, but it turns out that we can remove some steps in some cases. If you use the same Google Cloud Project for both your Action and the web-based Google Sign-In, then once they authorize the project on the web, all calls to your fulfillment will include the Identity Token. This lets us collapse steps 2-6 above into a single step, so things look more like this:
1. The Assistant sends the Action webhook an Intent and possible parameters. This will be the first message, so our welcome intent is triggered. There is no Identity Token.
2. The webhook sees there is no Identity Token, so we send back a message saying they need to visit our website to authorize us to access their Google services. We may require them to switch to a mobile device to do this part and even include a link to the login page. (This collapses steps 2 and 6 from above.)
3. They will follow the link on a device that has a screen.
4. We’ll send back the login page, which includes a Login with Google button. We’ve configured this button to also ask for the additional scopes we need to access their services, as well as permission to access the services on their behalf when “offline”.
5. They will go through the Google Login dance, the OAuth scopes screen, and will hopefully grant all the permissions we want. (Again, I ignore what happens if they don’t.) I omit what that dance looks like since it doesn’t involve us. Assuming that all goes well, Google gives them an Auth Code, which the JavaScript on the login page sends to us.
6. We call Google’s OAuth servers to verify the Auth Code and use it to get the Auth Token and Refresh Token…
7. … which we then store in the data store…
8. … and then send back something so the JavaScript on the page can tell the user that they can use our Action normally from now on.
It is also worth noting that they might visit the website and log in before even trying out the Assistant version (for example, because of a search result), in which case they start on step 8 from the second diagram or step 4 from the third. If they do, we will get their Identity Token the first time they visit us through the Assistant, and this will work just like the simple scenario.
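The decision the webhook makes in the shortened flow can be sketched as a small routing function. The names `NextStep` and `next_step` are hypothetical, and I'm assuming the Identity Token has already been decoded and verified by the time this is called, so the input is either the User ID or `None`.

```python
from enum import Enum
from typing import Optional, Set


class NextStep(Enum):
    SEND_TO_WEBSITE = "send_to_website"
    USE_STORED_TOKENS = "use_stored_tokens"


def next_step(user_id: Optional[str], users_with_tokens: Set[str]) -> NextStep:
    """Decide how the webhook should respond to an incoming request.

    user_id is the verified "sub" claim, or None when the request
    carried no Identity Token.
    """
    if user_id is None:
        # No Identity Token: they have not authorized the project yet.
        return NextStep.SEND_TO_WEBSITE
    if user_id not in users_with_tokens:
        # Authenticated, but we hold no tokens for them.
        return NextStep.SEND_TO_WEBSITE
    # Tokens are on file: behave as in the simple scenario.
    return NextStep.USE_STORED_TOKENS
```

A user who authorized on the website first lands directly in the last branch the first time they talk to the Action.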