I put data into a csv file (called "Essential Data_posts"). In my main, I extract a particular column from this file (called 'Post Texts') so that I can analyze the post texts for sentiment entity analysis using Google Cloud NLP. I then put this analysis in another csv file (called "SentimentAnalysis"). To do this, I put all of the information pertaining to sentiment entity analysis into an array (one for each piece of information).

The problem is that when I run my code, nothing shows up in the SentimentAnalysis file other than the headers, e.g. "Representative Name". When I checked the lengths of all the arrays, each one had a length of 0, so nothing was being appended to them.

I am using Ubuntu 21.04 and Google Cloud Natural Language. I am running all of this in a terminal, not on the Google Cloud Platform. I am also using Python 3 and the Emacs text editor.

from google.cloud import language_v1
import pandas as pd
import csv
import os

#lists we are appending to
representativeName = []
entity = []
salienceScore = []
entitySentimentScore = []
entitySentimentMagnitude = []
metadataNames = []
metadataValues = []
mentionText = []
mentionType = []

def sentiment_entity(postTexts):
    client = language_v1.LanguageServiceClient()
    type_ = language_v1.Document.Type.PLAIN_TEXT
    language = "en"
    document = {"content": post_texts, "type": type_, "language": language}
    encodingType = language_v1.EncodingType.UTF8
    response = client.analyze_entity_sentiment(request = {'document': document, 'encoding type': encodingType})

    #loop through entities returned from the API
    for entity in response.entities:
        representativeName.append(entity.name)
        entity.append(language_v1.Entity.Type(entity.type_).name)
        salienceScore.append(entity.salience)
        entitySentimentScore.append(sentiment.score)
        entitySentimentMagnitude.append(sentiment.magnitude)
    
    #loop over metadata associated with entity 
    for metadata_name, metadata_value in entity.metadata.items():
        metadataNames.append(metadata_name)
        metadataValues.append(metadata_value)

    #loop over the mentions of this entity in the input document
    for mention in entity.mentions:
        mentionText.append(mention.text.content)
        mentionType.append(mention.type_)

#put the lists into the csv file (using pandas)    
data = {
    "Representative Name": representativeName,
    "Entity": entity,
    "Salience Score": salienceScore,
    "Entity Sentiment Score": entitySentimentScore,
    "Entity Sentiment Magnitude": entitySentimentMagnitude,
    "Metadata Name": metadataNames,
    "Metadata Value": metadataValues,
    "Mention Text": mentionText,
    "Mention Type": mentionType  
}

df = pd.DataFrame(data)
df
df.to_csv("SentimentAnalysis.csv", encoding='utf-8', index=False)

def main():
    import argparse

    #read the csv file containing the post text we need to analyze
    filename = open('Essential Data_posts.csv', 'r')

    #create dictreader object
    file = csv.DictReader(filename)

    postTexts = []

    #iterate over each column and append values to list
    for col in file:
    postTexts.append(col['Post Text'])

    parser = arg.parse.ArgumentParser()
    parser.add_argument("--postTexts", type=str, default=postTexts)
    args = parser.parse_args()

    sentiment_entity(args.postTexts)

1 Answer

I tried running your code and I encountered the following errors:

  1. sentiment_entity() takes a parameter named postTexts, but the body refers to post_texts, which is undefined. This raises a NameError at document = {"content": post_texts, "type": type_, "language": language}.
  2. "content" must be a string, not a list. See the Document reference.
  3. In the request dictionary, the key 'encoding type' should be 'encoding_type'.
  4. The loop variable entity should not have the same name as the list entity = []. Inside the loop, entity refers to the current API entity, so entity.append(...) is called on an object that is not a list.
  5. Use entity.sentiment.score and entity.sentiment.magnitude instead of sentiment.score and sentiment.magnitude.
  6. The loops over metadata and mentions should be nested inside for entity in response.entities:.
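
Point 4 can be reproduced without calling the API. The following is only a minimal sketch: FakeEntity is a hypothetical stand-in for the objects in response.entities, and collect_shadowed and collect_renamed are illustrative names that do not appear in the original code.

```python
# Module-level list, mirroring `entity = []` in the question.
entity = []


class FakeEntity:
    """Hypothetical stand-in for an API entity: it has a name but no append()."""

    def __init__(self, name):
        self.name = name


def collect_shadowed(items):
    # The loop variable is also called `entity`, so inside this function the
    # name refers to the current item, not to the module-level list.
    for entity in items:
        entity.append(entity.name)  # fails: FakeEntity has no append()


def collect_renamed(items, out):
    # Giving the list its own name (like entity_arr) avoids the clash.
    for item in items:
        out.append(item.name)
    return out


items = [FakeEntity("Grapes"), FakeEntity("Bananas")]

try:
    collect_shadowed(items)
    shadow_error = None
except AttributeError as exc:
    shadow_error = exc

names = collect_renamed(items, [])
print(type(shadow_error).__name__, names)
```

The shadowed version fails on the first item, while the renamed version fills the list as intended.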

I edited your code to fix the errors above. In main(), I added a step that joins the list postTexts into a single string so it can be passed to sentiment_entity(). I also moved the DataFrame creation into sentiment_entity() and added an if __name__ == "__main__": guard so that main() actually runs. The metadataNames and metadataValues columns are commented out for now because I do not have an example that populates them.

from google.cloud import language_v1
import pandas as pd
import csv
import os

#lists we are appending to
representativeName = []
entity_arr = []
salienceScore = []
entitySentimentScore = []
entitySentimentMagnitude = []
metadataNames = []
metadataValues = []
mentionText = []
mentionType = []

def listToString(s):
    """ Transform list to string"""
    str1 = " "
    return (str1.join(s))
    
def sentiment_entity(postTexts):
    client = language_v1.LanguageServiceClient()
    type_ = language_v1.Document.Type.PLAIN_TEXT
    language = "en"
    document = {"content": postTexts, "type_": type_, "language": language}
    encodingType = language_v1.EncodingType.UTF8
    response = client.analyze_entity_sentiment(request = {'document': document, 'encoding_type': encodingType})

    #loop through entities returned from the API
    for entity in response.entities:
        representativeName.append(entity.name)
        entity_arr.append(language_v1.Entity.Type(entity.type_).name)
        salienceScore.append(entity.salience)
        entitySentimentScore.append(entity.sentiment.score)
        entitySentimentMagnitude.append(entity.sentiment.magnitude)
        #loop over the mentions of this entity in the input document
        for mention in entity.mentions:
            mentionText.append(mention.text.content)
            mentionType.append(mention.type_)
        #loop over metadata associated with entity
        for metadata_name, metadata_value in entity.metadata.items():
            metadataNames.append(metadata_name)
            metadataValues.append(metadata_value)

    data = {
        "Representative Name": representativeName,
        "Entity": entity_arr,
        "Salience Score": salienceScore,
        "Entity Sentiment Score": entitySentimentScore,
        "Entity Sentiment Magnitude": entitySentimentMagnitude,
        #"Metadata Name": metadataNames,
        #"Metadata Value": metadataValues,
        "Mention Text": mentionText,
        "Mention Type": mentionType
    }

    df = pd.DataFrame(data)
    df.to_csv("SentimentAnalysis.csv", encoding='utf-8', index=False)

def main():
    postTexts = []

    #read the csv file containing the post text we need to analyze
    with open('test.csv', 'r', newline='') as csv_file:
        #DictReader yields one dict per row, keyed by the header names
        reader = csv.DictReader(csv_file)

        #collect the 'Post Text' value from every row
        for row in reader:
            postTexts.append(row['Post Text'])

    content = listToString(postTexts) #convert list to string
    print(content)
    sentiment_entity(content)

if __name__ == "__main__":
    main()

test.csv:

col_1,Post Text
dummy,Grapes are good.
dummy,Bananas are bad.
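
For reference, this is roughly what the join step does with the two rows above. It is a small standalone sketch with the 'Post Text' values typed in by hand rather than read from the file. Note that str.join expects every item to be a string.

```python
def listToString(s):
    """Join a list of strings with single spaces."""
    return " ".join(s)


# The two 'Post Text' values from test.csv, entered by hand for illustration.
postTexts = ["Grapes are good.", "Bananas are bad."]

content = listToString(postTexts)
print(content)  # Grapes are good. Bananas are bad.
```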

When I run the code, it prints the joined string and writes SentimentAnalysis.csv.

SentimentAnalysis.csv:

Representative Name,Entity,Salience Score,Entity Sentiment Score,Entity Sentiment Magnitude,Mention Text,Mention Type
Grapes,OTHER,0.8335162997245789,0.800000011920929,0.800000011920929,Grapes,2
Bananas,OTHER,0.16648370027542114,-0.699999988079071,0.699999988079071,Bananas,2
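
In this output each entity has exactly one mention, so all columns have the same length. If an entity has several mentions, mentionText grows longer than representativeName, and pd.DataFrame(data) raises a ValueError because the lists are of unequal length. One possible way around this is to emit one row per mention. The sketch below assumes hypothetical stand-in dictionaries instead of real API objects, and it uses the standard csv module instead of pandas to stay self-contained.

```python
import csv
import io

# Hypothetical stand-in for response.entities: plain dicts, not API objects.
entities = [
    {"name": "Grapes", "score": 0.8, "mentions": ["Grapes", "grapes"]},
    {"name": "Bananas", "score": -0.7, "mentions": ["Bananas"]},
]

# Build one row per (entity, mention) pair so every column has the same length.
rows = []
for ent in entities:
    for mention in ent["mentions"]:
        rows.append({
            "Representative Name": ent["name"],
            "Entity Sentiment Score": ent["score"],
            "Mention Text": mention,
        })

# Write to an in-memory buffer here; a real script would open a file instead.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```

The entity-level values are repeated on each of that entity's rows, which keeps the table rectangular at the cost of some duplication.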