0
votes

I have a simple HTML page asking a user to type in and submit the name of an animal of their choosing. I then have a python function that converts the entered common name of animal to it's scientific name, and what level of taxonomy it was able to convert to.

For example: Panda -> Ailuropoda melanoleuca, species

I need help with the next step, which is taking the output of this first python function, and making a query into a pandas dataframe on Google's BigQuery. Below is a screenshot of the table's preview.

enter image description here

Below here is the snippet off my main.py that inlcudes the query-making function, and the function calls at the bottom:

def make_query(taxon, level):
    project_id = "sentinel-system"

    data_frame = pandas_gbq.read_gbq(
        "SELECT * FROM `animal_database.gbif_occurrence` WHERE species=%s LIMIT 10, (taxon)",
        project_id=project_id,
        index_col=level)

    number_of_images = len(data_frame.index)

    credentials = service_account.Credentials.from_service_account_file(
        'Sentinel System-a6746634aad2.json')
    pandas_gbq.context.credentials = credentials
    pandas_gbq.context.project = 'sentinel-system'

    if occurrences > 0:
        print('We found %d images of the animal you searched for!' %(number_of_images))
    else:
        print('Sorry, we couldn''t find any images of the animal you searched for.')
    return 0

taxonomy, level = (common_to_sci('Panda'))
name = taxonomy[-1, -1]
print(name)
print(level)

submission_check = (make_query(name, level))

The issue with this function is two-fold. First, the smaller issue is that running main.py currently shows the error

google.api_core.exceptions.BadRequest: 400 Syntax error: Illegal input character "%" at [1:63]

Where using a python tuple argument is a solution I learned from here, and I'm not sure what this SQL query should look like.

The second, more general issue is, I understand the SQL query needs to reference exactly the same string as is used in a column heading in my BigQuery table. But what if the taxonomy level was not 'species' and was instead 'genus/family/order'? Is there a way to make the SQL query more generalized for cases when 'level' != 'species'?

2

2 Answers

1
votes

After trial and error using @itroulli's answer, reading the documentation they linked, and general tweaking, this was the solution that worked for me.

project_id = "sentinel-system"

table = 'animal_database.gbif_occurrence'

query = 'SELECT * FROM {} WHERE {}=\'{}\''.format(table, level, taxon)

data_frame = pandas_gbq.read_gbq(query, project_id=project_id)

Working with the given table, the SQL query searches for the correct column to search through (given by 'level') and under that column, reads for an animal's scientific name given by 'taxon.'

0
votes

According to the BigQuery documentation regarding pandas_gbq, you should use a separate config variable for parameterized queries:

query = "SELECT * FROM `animal_database.gbif_occurrence` WHERE @species=@taxonomy LIMIT 10"

query_config = {
'query': {
    'parameterMode': 'NAMED',
    'queryParameters': [
        {
            'name': 'species',
            'parameterType': {'type': 'STRING'},
            'parameterValue': {'value': level}
        },
        {
            'name': 'taxonomy',
            'parameterType': {'type': 'STRING'},
            'parameterValue': {'value': taxon}
        }
    ]
}
}