I have the following dataset:

                                                                      Text
    country file
    US      file_US      The Dish: Lidia Bastianich shares Italian recipes ... - CBS News
            file_US      Blog - Tasty Yummies
            file_US      Acne Alternative Remedies: Manuka Honey, Tea Tree Oil ...
            file_US      Looking back at 10 years of Downtown Arts | Times Leader
    IT      filename_IT  Tornando indietro a ...
            filename_IT  Questo locale è molto consigliato per le famiglie
    ...
            filename_IT  Ci si chiede dove poter andare a mangiare una pizza Melanzana Capriccia ...
            filename_IT  Ideale per chi ama mangiare vegano

with country and file as index levels. I want to apply a function that removes stopwords based on the value of the country index:

    def removing(sent):
        if df.loc['US','UK']:
            stop_words = stopwords.words('english')
        if df.loc['ES']:
            stop_words = stopwords.words('spanish')
        # (and so on)

        c_text = []
        for i in sent.lower().split():
            if i not in stop_words:
                c_text.append(i)
        return(' '.join(c_text))

    df['New_Column'] = df['Text'].astype(str)
    df['New_Column'] = df['New_Column'].apply(removing)

Unfortunately I am getting this error:

    ----> 6     if df.loc['US']:
          7         stop_words = stopwords.words('english')
          8     if df.loc['ES']:

    /anaconda3/lib/python3.7/site-packages/pandas/core/generic.py in __nonzero__(self)
       1477     def __nonzero__(self):
       1478         raise ValueError(
    -> 1479             f"The truth value of a {type(self).__name__} is ambiguous. "
       1480             "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
       1481         )

ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
and I still do not understand how to fix it. How can I run the code without getting this error?
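
In case it helps to reproduce the problem, here is a minimal sketch of a frame with the same shape. The rows are copied from the table above, and building it with `set_index` on two columns is only my assumption about how the index is laid out. It shows that `df.loc['US']` comes back as a DataFrame rather than a single True/False value, which seems to be what triggers the error inside the `if`.

```python
import pandas as pd

# Sample rows copied from the table above. The two-level index
# (country, file) is an assumption about how the real frame is built.
rows = [
    ("US", "file_US", "Blog - Tasty Yummies"),
    ("US", "file_US", "Looking back at 10 years of Downtown Arts | Times Leader"),
    ("IT", "filename_IT", "Questo locale è molto consigliato per le famiglie"),
    ("IT", "filename_IT", "Ideale per chi ama mangiare vegano"),
]
df = pd.DataFrame(rows, columns=["country", "file", "Text"])
df = df.set_index(["country", "file"])

# Selecting by the first index level returns all rows under 'US',
# so the result is a DataFrame, not a boolean.
us_rows = df.loc["US"]
print(type(us_rows).__name__)  # DataFrame

# Using that DataFrame as an `if` condition raises the same ValueError.
message = None
try:
    if df.loc["US"]:
        pass
except ValueError as err:
    message = str(err)
    print(message)
```

So the condition is never really testing which country the current sentence belongs to. It asks pandas for the truth value of a whole block of rows.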