Better way to get access to Google Sheet with Python

Question

How to automatize the process of getting access to Google spreadsheets?

Right now we use gspread and oauth2client.service_account to get an access to Google spreadsheet. It works fine, but using OAuth2 credentials makes us manually share every single spreadsheet to "client_email" from credentials json-file.

import gspread
from oauth2client.service_account import ServiceAccountCredentials
scope = [
            'https://www.googleapis.com/auth/spreadsheets',
        'https://www.googleapis.com/auth/drive'
    ]
credentials = 
ServiceAccountCredentials.from_json_keyfile_name('path.json', scope)
    gs = gspread.authorize(credentials)

That works, but how to modify?

So the desired outcome is: somebody shares a spreadsheet with me and I can start to work with it immediately in Python. Is it possible? Maybe we can use some triggers from incoming emails with the information about sharing or something similar?

I'm working on this, but to make sure I understood, do you use another email, or do you all use the same client_email from the credentials? — Jescanellas
Hallo, Jescanellas! Right now we use the same client_email from creds, but it's up to debate if there are other options. — Andrey Ostrovsky
Ok! What you could do then is put those sheets in a Drive folder and share the entire folder with the client_email. If what you want is to run the same python script with every new sheet without changing the code, you could do a script searching for certain tags or titles in emails and then running the same python script with the new info. I will post an answer with the code later. — Jescanellas
Thank you very much for your help! Method with folder is an option, but isn't suitable as people just share some files with our team's members by email adress:( Is it possible to write some script that search through email letters without revealing credentials of team members? Or maybe use another auth. option to connect to google spreadsheets? — Andrey Ostrovsky
The other alternatives to oauth are less secure and not recommended by google. Can't you create a folder only for sheets inside that folder? I'm working on a script that searches for new sheets names in the client_email's inbox. Would this fit your needs? — Jescanellas

Jescanellas Jescanellas · Accepted Answer · 2019-07-10T09:12:13

You can try this script. It has a few sections we can differentiate:

Requesting access to Drive and Gmail. As you see we use the full drive scope instead of drive.file. ~~This is because there is an existing bug that causes drive.file to crash~~(1), so meanwhile we have to use this.

     from __future__ import print_function
        import pickle
        import sys
        import os.path
        from googleapiclient.discovery import build
        from google_auth_oauthlib.flow import InstalledAppFlow
        from google.auth.transport.requests import Request

        # If modifying these scopes, delete the file token.pickle.
        SCOPES = ['https://www.googleapis.com/auth/drive', 
              'https://www.googleapis.com/auth/gmail.modify']

        creds = None
        # The file token.pickle stores the user's access and refresh tokens, and is
        # created automatically when the authorization flow completes for the first
        # time.
        if os.path.exists('token.pickle'):
            with open('token.pickle', 'rb') as token:
                creds = pickle.load(token)
        # If there are no (valid) credentials available, let the user log in.

        if not creds or not creds.valid:
            if creds and creds.expired and creds.refresh_token:
               creds.refresh(Request())
            else:
                flow = InstalledAppFlow.from_client_secrets_file(
                    'credentials.json', SCOPES)
                creds = flow.run_local_server()
            # Save the credentials for the next run
            with open('token.pickle', 'wb') as token:
                pickle.dump(creds, token)
        mail_service = build('gmail', 'v1', credentials=creds)
        drive_service = build('drive','v3', credentials=creds)

Declaring some variables. There is no issue here, just declaring the variables where we will keep the mail ids, the file name and the file name formatted for our needs.

     mail_ids = []
     file_name = []
     name_string = []

Get the emails. We will only take the unread emails from drive-shares-noreply. After this we will mark them as “read” so we won’t take them the next time we execute the script.

    def get_emails(mail_ids):
        user_id = 'me' #Or your email
        query = 'from:drive-shares-noreply@google.com, is:UNREAD' #Will search mails from drive shares and unread
        response = mail_service.users().messages().list(userId=user_id, q=query).execute()
        items = response.get('messages', [])
        if not items:
            print('No unread mails found')
            sys.exit()
        else:
            for items in items:
                mail_ids.append(items['id'])
            for mail_ids in mail_ids:
                mail_service.users().messages().modify(userId=user_id, id=mail_ids, body={"removeLabelIds":["UNREAD"]}).execute() #Marks the mails as read

Get the file names of the emails. The syntax of the Subject of the sharing sheets email is “Filename - Invitation to edit”, so we will take the subject of each email, and we will format the string later.

def get_filename(mail_ids, file_name):
    user_id = 'me'
    headers = []
    for mail_ids in mail_ids:
        response = mail_service.users().messages().get(userId=user_id, id=mail_ids, format="metadata", metadataHeaders="Subject").execute()
        items = response.get('payload', [])
        headers.append(items['headers'])
    length = len(headers)
    for i in range(length):
        file_name.append(headers[i][0]['value'])

def process_name(file_name, name_string):
    for file_name in file_name:
        name_string.append(str(file_name).replace(" - Invitation to edit", ""))

Give permissions to the client_email

def give_permissions(name_string):
    for name_string in name_string:
        body = "'{}'".format(name_string)
        results = drive_service.files().list(q="name = "+body).execute()
        items = results.get('files', [])
        if not items:
            print('No files found.')
            sys.exit()
        else:
            print('Files:')
            for item in items:
                print(u'{0} ({1})'.format(item['name'], item['id']))
                file_id = item['id']

    user_permission = {
        'type': 'user',
        'role': 'writer',
        'emailAddress': 'your_client_email'
    }
    drive_service.permissions().create(body=user_permission, fileId=file_id).execute()

And then we just have to call the functions

get_emails(mail_ids)
get_filename(mail_ids, file_name)
process_name(file_name, name_string)
give_permissions(name_string)

There is no way to trigger this script for each new email received, but you can trigger it with a timer or something like that and it will search for new emails.

(1) The drive.file scope only works with certain files, according to the last update of the documentation

Better way to get access to Google Sheet with Python

1 Answers