4
votes

I am taking some data from a .csv file via the CSV type provider and putting it into a data frame to be used by R.

Here is the import:

#r @"..\packages\R.NET.1.5.5\lib\net40\RDotNet.dll"
#r @"..\packages\RDotNet.FSharp.0.1.2.1\lib\net40\RDotNet.FSharp.dll"
#r @"..\packages\RProvider.1.0.3\lib\RProvider.dll"
#r @"..\packages\FSharp.Data.1.1.10\lib\net40\FSharp.Data.dll"

open System
open RDotNet
open RProvider
open RProvider.``base``
open RProvider.graphics
open FSharp.Data.Csv
open System.IO

let data = CsvFile.Load(@"C:\TEST.csv")
let date = data.Data |> Seq.map(fun r -> r.Columns.[3])
let time = data.Data |> Seq.map(fun r -> r.Columns.[10])
let disposition = data.Data |> Seq.map(fun r -> r.Columns.[11]);

let month = data.Data |> Seq.map(fun r -> r.Columns.[4])
let time2 = data.Data |> Seq.map(fun r -> r.Columns.[7])

When I try to create the data frame, I get the following exception, regardless of the datatype that I am trying to load:

System.Exception: No converter registered for type Microsoft.FSharp.Collections.FSharpList`1[[System.Tuple`2[[System.String, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089],[System.Collections.Generic.IEnumerable`1[[System.String, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]], mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]], mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]] or any of its base types

Here is an example of the data frame load:

let namedParameters = [
    "month",month;
    "time",time2;]


namedParameters
|> R.data_frame
|> R.plot

The thing is, the columns contain int datatypes. Is it picking up the header? Do I need to cast? If so, isn't that what the type provider is for? What am I missing? Thanks in advance.


1 Answer

6
votes

The R.data_frame overload that you need is the one taking a dictionary of key-value pairs. You can either construct this dictionary yourself, or use the namedParams function that is included in the R type provider library. The exception occurs because a plain F# list of tuples is not a type that the R provider knows how to convert.

To avoid confusion, I renamed your list of pairs to df. Then you can create a frame like this:

let df = 
  [ "month", month
    "time", time2 ]

namedParams df
|> R.data_frame
|> R.plot
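If you would rather construct the dictionary yourself, a minimal sketch could look like the following. It assumes the overload accepts an IDictionary<string, obj>, which is my understanding of what namedParams produces by boxing each value. The month and time2 sequences here are hypothetical stand-ins for the CSV columns, so that the snippet runs without R or the file.

```fsharp
open System.Collections.Generic

// Hypothetical stand-ins for the columns read from the CSV in the question
let month = seq [ "1"; "2"; "3" ]
let time2 = seq [ "10"; "20"; "30" ]

// Build the dictionary by hand: each value is boxed to obj so that
// values of different types can share a single dictionary
let args : IDictionary<string, obj> =
    dict [ "month", box month
           "time", box time2 ]

// With RProvider referenced, the dictionary would then be piped on
// (assumption: same overload that namedParams feeds):
// args |> R.data_frame |> R.plot
```

In practice namedParams is shorter and saves you the explicit box calls, so the hand-built version is mostly useful for seeing what is being passed.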

PS: Is there any reason why you are using CsvFile to read the data dynamically, instead of using the CSV type provider, which should (assuming it works for your input) give you nicer, typed access to the CSV data? With CsvFile, every value in r.Columns is a string, so integer columns would still need converting.
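As a rough sketch of the typed approach, assuming FSharp.Data 1.1.x (where rows are exposed through .Data; later versions use .Rows) and assuming the file has headers named Month and Time. Both column names are hypothetical, since the actual headers are not shown in the question:

```fsharp
#r @"..\packages\FSharp.Data.1.1.10\lib\net40\FSharp.Data.dll"
open FSharp.Data

// The sample file gives the provider the column names and inferred types
type TestCsv = CsvProvider<"C:\\TEST.csv">

let typed = TestCsv.Load("C:\\TEST.csv")

// Month and Time are hypothetical column names; the provider generates
// one property per header, typed from the sample data (for example int)
let month = typed.Data |> Seq.map (fun r -> r.Month)
let time2 = typed.Data |> Seq.map (fun r -> r.Time)
```

Because the columns are already typed, there is no need to index into r.Columns or cast the values by hand.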