3
votes

I'm new to Julia, so just wrapping my head around the basics.

I'm trying to read a CSV file into a DataFrame:

abc = CSV.File("ABC.csv")

The format is as follows:

2016-01-04T14:16:00Z,103.71,103.71,103.71,103.71,23300

I had expected Julia to recognise the ISO 8601 timestamp and parse it to a DateTime, but it doesn't seem to have done that. The element type of the resulting column is String.

My question is thus twofold:

  • How can I get CSV.jl to parse DateTime on import?
  • How can I instantiate a DateTime from a string?

2 Answers

2
votes

You need the Dates standard library. Suppose you have the following ABC.csv file:

datetime,x,y,z,w,n
2016-01-04T14:16:00Z,103.71,103.71,103.71,103.71,23300

You already know how to read it:

julia> using CSV

julia> csv = CSV.File("ABC.csv")
1-element CSV.File{false}:
 CSV.Row: (datetime = "2016-01-04T14:16:00Z", x = 103.71, y = 103.71, z = 103.71, w = 103.71, n = 23300)

Notice that you can access a column by name:

julia> csv.datetime
1-element PooledArrays.PooledVector{String, UInt32, Vector{UInt32}}:
 "2016-01-04T14:16:00Z"

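This is not exactly the ISO format that Dates supports by default: the default DateTime format is yyyy-mm-ddTHH:MM:SS.s, which has no trailing Z. You can convert the string to a DateTime by passing an explicit format: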

julia> using Dates

julia> DateTime(csv.datetime[1], "yyyy-mm-ddTHH:MM:SSZ")
2016-01-04T14:16:00

Now that we know how to convert a single entry of the column, we can use Julia's broadcast (dot) syntax to apply the conversion to every entry:

julia> DateTime.(csv.datetime, "yyyy-mm-ddTHH:MM:SSZ")
1-element Vector{DateTime}:
 2016-01-04T14:16:00
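When the column is long, it may be worth building the format object once instead of passing the format as a string for every element. The following is a minimal sketch, assuming only the Dates standard library is loaded (so the trailing Z is matched as a literal character); the names fmt, timestamps and parsed, and the second timestamp, are made up for illustration:

```julia
using Dates

# Build the DateFormat once; with only Dates loaded, the trailing Z
# is treated as a literal character, not as a time zone code.
fmt = dateformat"yyyy-mm-ddTHH:MM:SSZ"

# Illustrative sample data (the second value is not from the original file).
timestamps = ["2016-01-04T14:16:00Z", "2016-01-04T14:17:00Z"]

# Broadcast the constructor over the strings, reusing the same format object.
parsed = DateTime.(timestamps, fmt)
```

The result is a Vector{DateTime}, just as with the string version of the format.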

You can then store the resulting column in a new table. Note that a CSV.File is not a DataFrame, the table type from DataFrames.jl, which is the most widely used table package in Julia. You can easily convert to one before you start your processing pipeline:

julia> using DataFrames

julia> csv |> DataFrame
1×6 DataFrame
 Row │ datetime              x        y        z        w        n     
     │ String                Float64  Float64  Float64  Float64  Int64 
─────┼─────────────────────────────────────────────────────────────────
   1 │ 2016-01-04T14:16:00Z   103.71   103.71   103.71   103.71  23300

In summary, the following script can be used to convert the data:

using DataFrames
using Dates
using CSV

df = CSV.File("ABC.csv") |> DataFrame

df.datetime = DateTime.(df.datetime, "yyyy-mm-ddTHH:MM:SSZ")

You can find more information in the docstring ?DateTime.

An alternative may be to tell CSV.jl the correct column type up front using the types keyword option. Check the docstring ?CSV.File.

1
votes

You can also use the dateformat option of the CSV.File function:

julia> DataFrame(CSV.File("file.csv", dateformat="yyyy-mm-ddTHH:MM:SSZ"))
1×6 DataFrame
 Row │ datetime             x        y        z        w        n     
     │ DateTime…            Float64  Float64  Float64  Float64  Int64 
─────┼────────────────────────────────────────────────────────────────
   1 │ 2016-01-04T14:16:00   103.71   103.71   103.71   103.71  23300
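To try this without a file on disk, the same call can be run against an in-memory buffer. This is only a sketch, assuming CSV.jl and DataFrames.jl are installed; the variable names data and df are made up for illustration, and the header row is the one assumed in the other answer:

```julia
using CSV
using DataFrames
using Dates

# Sample data inlined as a string; the header row is an assumption,
# since the original file format shows only a data row.
data = """
datetime,x,y,z,w,n
2016-01-04T14:16:00Z,103.71,103.71,103.71,103.71,23300
"""

# Read from an IOBuffer instead of a file path, passing the date format
# so that matching columns are parsed as DateTime on import.
df = DataFrame(CSV.File(IOBuffer(data); dateformat="yyyy-mm-ddTHH:MM:SSZ"))

# Inspect the element type of the parsed column.
println(eltype(df.datetime))
```

If the format matches, the datetime column should come back as DateTime rather than String, as in the output above.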