40
votes

I have some difficulty in importing a JSON file with pandas.

import pandas as pd
map_index_to_word = pd.read_json('people_wiki_map_index_to_word.json')

This is the error that I get:

ValueError: If using all scalar values, you must pass an index

The file structure is simplified like this:

{"biennials": 522004, "lb915": 116290, "shatzky": 127647, "woode": 174106, "damfunk": 133206, "nualart": 153444, "hatefillot": 164111, "missionborn": 261765, "yeardescribed": 161075, "theoryhe": 521685}

It is from the machine learning course of University of Washington on Coursera. You can find the file here.

4
This is much more a pandas question than it it a JSON question -- you wouldn't have this specific error in any context that didn't involve pandas, but you can get this specific error without JSON being involved. - Charles Duffy
See for instance, stackoverflow.com/questions/17839973/… -- a question with the same error, but no JSON involved. - Charles Duffy
look like you are taking the ML courses from Emily :) - Bryan Fok
It is expecting a list. So if you do like this will work. pd.DataFrame([{"biennials": 522004, "lb915": 116290}]). - Aung

4 Answers

63
votes

Try

ser = pd.read_json('people_wiki_map_index_to_word.json', typ='series')

That file only contains key value pairs where values are scalars. You can convert it to a dataframe with ser.to_frame('count').

You can also do something like this:

import json
with open('people_wiki_map_index_to_word.json', 'r') as f:
    data = json.load(f)

Now data is a dictionary. You can pass it to a dataframe constructor like this:

df = pd.DataFrame({'count': data})
14
votes

You can do as @ayhan mention which will give you a column base format

Method 1

Or you can enclose the object in [ ] (source) as shown below to give you a row format that will be convenient if you are loading multiple values and planing on using matrix for your machine learning models.

df = pd.DataFrame([data])

Method 2

4
votes

I think what is happening is that the data in

map_index_to_word = pd.read_json('people_wiki_map_index_to_word.json')

is being read as a string instead of a json

{"biennials": 522004, "lb915": 116290, "shatzky": 127647, "woode": 174106, "damfunk": 133206, "nualart": 153444, "hatefillot": 164111, "missionborn": 261765, "yeardescribed": 161075, "theoryhe": 521685}

is actually

'{"biennials": 522004, "lb915": 116290, "shatzky": 127647, "woode": 174106, "damfunk": 133206, "nualart": 153444, "hatefillot": 164111, "missionborn": 261765, "yeardescribed": 161075, "theoryhe": 521685}'

Since a string is a scalar, it wants you to load it as a json, you have to convert it to a dict which is exactly what the other response is doing

The best way is to do a json loads on the string to convert it to a dict and load it into pandas

myfile=f.read()
jsonData=json.loads(myfile)
df=pd.DataFrame(data)
0
votes

For example cat values.json

{
name: "Snow",
age: "31"
}

df = pd.read_json('values.json')

Chances are you might end up with this Error: if using all scalar values, you must pass an index

Pandas looks up for a list or dictionary in the value. Something like cat values.json

{
name: ["Snow"],
age: ["31"]
}

So try doing this. Later on to convert to html tohtml()

df = pd.DataFrame([pd.read_json(report_file,  typ='series')])
result = df.to_html()