import csv
import pandas as pd
db = input("Enter the dataset name:")
table = db+".csv"
df = pd.read_csv(table)
df = df.sample(frac=1).reset_index(drop=True)
with open(table,'rb') as f:
    data = csv.reader(f)
    for row in data:
        rows = row
        break
print(rows)

I am trying to read all the columns from the csv file.

ERROR: UnicodeDecodeError: 'utf-8' codec can't decode byte 0x96 in position 15: invalid start byte

1 Answer

You need to check the encoding of your CSV file.

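To see which encoding Python uses by default when it opens the file (this is the platform default, not one detected from the file's contents), print the file object: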

with open('file_name.csv') as f:
    print(f)

The output is something like this:

<_io.TextIOWrapper name='file_name.csv' mode='r' encoding='utf8'>
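
Because the encoding shown there is the one Python picked for opening the file, it does not prove the bytes in the file are valid in that encoding. A minimal sketch for testing a candidate encoding directly is shown here; the helper name `first_decode_error` and the sample file contents are hypothetical, made up for illustration.

```python
import os
import tempfile


def first_decode_error(path, encoding):
    """Return None if the whole file decodes cleanly, else the UnicodeDecodeError."""
    with open(path, "rb") as f:
        raw = f.read()
    try:
        raw.decode(encoding)
    except UnicodeDecodeError as exc:
        return exc
    return None


# Hypothetical sample file: byte 0x96 is an en dash in cp1252 but is not
# a valid start byte in UTF-8.
with tempfile.NamedTemporaryFile(suffix=".csv", delete=False) as tmp:
    tmp.write(b"name,years\nexample,2001\x962005\n")
    sample_path = tmp.name

utf8_error = first_decode_error(sample_path, "utf-8")
cp1252_error = first_decode_error(sample_path, "cp1252")
os.remove(sample_path)

print(utf8_error)    # a UnicodeDecodeError describing the offending byte
print(cp1252_error)  # None, since cp1252 can decode the file
```

If the first call returns an error and the second returns None, cp1252 is at least a workable choice for that file, although a clean decode alone does not guarantee it is the encoding the file was written in.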

To open the CSV with an explicit encoding, pass it to open() like this:

with open(fname, "rt", encoding="utf8") as f:
    ...

As mentioned in the comments, your file's encoding is cp1252 (the byte 0x96 from the error message is an en dash in that code page),

so,

with open(fname, "rt", encoding="cp1252") as f:
    ...

and for pd.read_csv,

df = pd.read_csv(table, encoding='cp1252')
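
The csv.reader part of the question needs the same treatment. In Python 3, csv.reader expects text, so a file opened with 'rb' will not work; open it in text mode with the encoding instead. A minimal sketch, assuming the file really is cp1252 (the function name `read_header` is hypothetical):

```python
import csv


def read_header(path, encoding="cp1252"):
    """Return the first row of a CSV file, or an empty list if the file is empty."""
    # newline="" is what the csv module documentation recommends for files
    # passed to csv.reader.
    with open(path, "r", encoding=encoding, newline="") as f:
        return next(csv.reader(f), [])
```

With the question's variable, print(read_header(table)) would then print the column names.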