1
votes

I am getting this error while importing a JSON dataset from a website.

JSONDecodeError: Expecting value: line 1 column 2 (char 1)

I am working in Colaboratory and wanted to import the sarcasm dataset, but since I don't know JSON, I am stuck. I have tried different placements of the backslash (\) character and also changing the -o parameter, but nothing works correctly. My code (reprex):

!wget --no-check-certificate \ https://storage.googleapis.com/laurencemoroney-blog.appspot.com/sarcasm.json -o /tmp/sarcasm.json

import json
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

#importing the Sarcasm dataset from !wget --no-check-certificate \ https://storage.googleapis.com/laurencemoroney-blog.appspot.com/sarcasm.json \ 
#-o /tmp/sarcasm.json

with open("/tmp/sarcasm.json", 'r') as f:
  datastore = json.load(f)
  datastore = json.detect_encoding()
  print (datastore)
sentences = []
labels = []
urls = []

I think the problem might be that the data is being imported in HTML format, which has to be converted to JSON (or something compatible with it). Any help would be appreciated! :)


3 Answers

0
votes

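I suspect you are saving the log of the transaction (instead of the document itself) to /tmp/sarcasm.json: wget's -o option writes its log messages to the given file, whereas --output-document (short form -O) writes the downloaded content.

Try --output-document=/tmp/sarcasm.json instead, so the file ends up at the path your code opens:

!wget --no-check-certificate "https://storage.googleapis.com/laurencemoroney-blog.appspot.com/sarcasm.json" --output-document=/tmp/sarcasm.json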
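
To check whether the saved file really holds the log rather than the dataset, you can peek at how it starts before parsing it. This is only a rough sketch: the helper name looks_like_json and the peek size are my own choices, and the check is a heuristic (a JSON document normally begins with '{' or '['), not a validation.

```python
def looks_like_json(path, peek=200):
    """Heuristic: True if the first non-whitespace character is '{' or '['.

    Hypothetical helper for illustration; it does not validate the file.
    """
    # Read only the first few characters; that is enough to tell a
    # JSON document apart from a text log.
    with open(path, "r", encoding="utf-8", errors="replace") as f:
        head = f.read(peek)
    # An empty file yields "" here, which is not in the tuple, so False.
    return head.lstrip()[:1] in ("{", "[")


# Usage (path taken from the question):
# print(looks_like_json("/tmp/sarcasm.json"))
```

If this prints False, open the file in a text viewer; a wget log will be readable plain text describing the transfer.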
0
votes

There is no need to detect the encoding; the json library takes care of it. That call also overwrites the datastore you have just loaded.

Remove the line below and try again:

datastore = json.detect_encoding()
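
With that line removed, the loading step could look roughly like the sketch below. The function name load_datastore is just for illustration, and the explicit utf-8 encoding is an assumption about the file; it also assumes the file at that path really contains the JSON document.

```python
import json


def load_datastore(path):
    """Parse the JSON file at `path` and return the resulting Python object."""
    # Text mode with an explicit encoding (utf-8 is assumed here);
    # json.load then parses the decoded text directly.
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


# Usage (path taken from the question):
# datastore = load_datastore("/tmp/sarcasm.json")
# print(type(datastore), len(datastore))
```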
0
votes

Try using -O (capital letter) instead of -o:

!wget --no-check-certificate \
    https://storage.googleapis.com/laurencemoroney-blog.appspot.com/sarcasm.json -O /tmp/sarcasm.json