1 vote

I am trying to load data from a CSV file into Azure Table Storage row by row using Python. The string columns are inserted fine, but the date column (formatted in the source as 2018-02-18T11:29:12.000Z) is loaded as a string, which means I am unable to query the records by the date column.

Can someone tell me if there is a way to create an entity definition (data types for the columns) for the table and use it when loading the records, so that dates are not stored as strings?

Any progress? Does my answer help you? – Jay Gong
Thanks Jay... I am trying to read the csv using Pandas and iterating line by line to insert the data into the table. While doing so, the date is converted to a string. Yet to test your approach, will come back soon. – AngiSen
Sure, waiting for your reply. – Jay Gong
Jay, I am using your approach and loaded the value 2017-08-06T06:04:11.000Z. Still, it's loaded as a string, not a date. In your example dataset, both 100 and 'jay1' are loaded as strings, so I think the default type it accepts is string. Not sure how you got datetime. – AngiSen
Hi, please see my updated answer! – Jay Gong

2 Answers

1 vote

I tried to reproduce your issue but failed: I loaded my csv file into Azure Table Storage and the date column was loaded as DateTime type.

You could refer to my code as below:

my csv file:

'tasksSeattle','001','jay1',100,2018-02-18T11:29:12.000Z
'tasksSeattle','002','jay2',100,2018-02-18T11:29:12.000Z
'tasksSeattle','003','jay3',100,2018-02-18T11:29:12.000Z
'tasksSeattle','004','jay4',100,2018-02-18T11:29:12.000Z
'tasksSeattle','005','jay5',100,2018-02-18T11:29:12.000Z

my python code:

from azure.cosmosdb.table.tableservice import TableService
from azure.cosmosdb.table.models import Entity
import csv
import codecs

table_service = TableService(connection_string='***')

filename = "E:/jay.csv"

# Open the csv in text mode so csv.reader yields decoded strings.
with codecs.open(filename, 'r', encoding="utf-8") as f_input:
    csv_reader = csv.reader(f_input)
    for row in csv_reader:
        task = Entity()
        task.PartitionKey = row[0]
        task.RowKey = row[1]
        task.description = row[2]
        task.priority = row[3]
        task.logtime = row[4]
        table_service.insert_entity('tasktable', task)

load result:

[Screenshot: the imported entities in Azure Storage Explorer, including the Data Type column]

Hope it helps you.


Update Answer:

If you observe the Data Type options box in the screenshot above, you can see that the Table Service Data Model supports only these eight property types (a short sketch of the default mapping follows the list):

  • Edm.Binary
  • Edm.Boolean
  • Edm.DateTime
  • Edm.Double
  • Edm.Guid
  • Edm.Int32
  • Edm.Int64
  • Edm.String
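
For reference, here is a minimal sketch of how plain Python assignments map onto these EDM types, assuming the azure-cosmosdb-table SDK's default type inference; the entity values are illustrative, not from the csv above:

from datetime import datetime
from azure.cosmosdb.table.models import Entity

task = Entity()
task.PartitionKey = 'tasksSeattle'                # str      -> Edm.String
task.RowKey = '006'
task.done = False                                 # bool     -> Edm.Boolean
task.priority = 100                               # int      -> an EDM integer type
task.logtime = datetime(2018, 2, 18, 11, 29, 12)  # datetime -> Edm.DateTime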

You could use entity.x = EntityProperty(EdmType.STRING, 'y'), as described in the azure-cosmosdb-table documentation, to define the data types you want.

Please refer to my sample code as below:

from azure.cosmosdb.table.models import Entity, EntityProperty, EdmType
from datetime import datetime

with codecs.open(filename, 'r', encoding="utf-8") as f_input:
    csv_reader = csv.reader(f_input)
    for row in csv_reader:
        task = Entity()
        task.PartitionKey = row[0]
        task.RowKey = row[1]
        task.description = row[2]
        # EntityProperty lets you pick the EDM type explicitly.
        task.priority = EntityProperty(EdmType.INT32, int(row[3]))
        # The Table service needs a datetime object, not the raw string.
        task.logtime = EntityProperty(EdmType.DATETIME,
                                      datetime.strptime(row[4], '%Y-%m-%dT%H:%M:%S.%fZ'))

        table_service.insert_entity('tasktable', task)
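
Once logtime is stored as Edm.DateTime, you can filter on it. A minimal sketch, reusing the table_service and table name from above (the cutoff value is illustrative):

# Table service OData filters need the datetime'...' literal prefix.
tasks = table_service.query_entities(
    'tasktable',
    filter="logtime ge datetime'2018-02-18T00:00:00Z'")
for task in tasks:
    print(task.RowKey, task.logtime)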

Just for summary:

We could convert the string to a datetime object and rebuild the value from its date fragments as below:

startDateFrag = datetime.strptime(row[4], '%Y-%m-%dT%H:%M:%S.%fZ')
task.startDateTime = datetime(startDateFrag.year, startDateFrag.month, startDateFrag.day,
                              startDateFrag.hour, startDateFrag.minute, startDateFrag.second)

0 votes

Since I can't comment on Jay's answer due to lack of reputation, I am just adding the code for how to do the date conversion from a string.

from dateutil.parser import parse
from datetime import datetime
from azure.cosmosdb.table.models import EntityProperty, EdmType

try:
    # parse() accepts most common date formats, including ISO 8601.
    dt = parse(row[4])
    if isinstance(dt, datetime):
        task.logtime = EntityProperty(EdmType.DATETIME, dt)
except ValueError:
    # Not a date: keep the raw string (stored as Edm.String by default).
    task.logtime = row[4]

The try/except block is useful when you are dynamically reading JSON and do not know the data type of each field in advance.
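
As a sketch of that idea, a helper along these lines could pick an EDM type per field when the schema is unknown; the typed_value function is hypothetical, not part of the SDK:

from dateutil.parser import parse
from azure.cosmosdb.table.models import EntityProperty, EdmType

def typed_value(value):
    # Best-effort mapping of an unknown field to a typed EntityProperty:
    # try integers first, then dates, and fall back to the raw value
    # (stored as Edm.String by default).
    try:
        return EntityProperty(EdmType.INT32, int(value))
    except (TypeError, ValueError):
        pass
    try:
        return EntityProperty(EdmType.DATETIME, parse(value))
    except (TypeError, ValueError):
        return value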