In a previous question (Working with heterogenous data in a statically typed language), I asked about how F# handles standard tasks in data analysis, such as manipulating an untyped CSV file. Dynamic langauges excel at basic tasks like
data = load('income.csv')
data.log_income = log(income)
In F#, the most elegant approach seems to be the question mark (?) operator. Unfortunately in the process we lose static typing and still need type annotations here and there.
One of the most exciting future feature of F# are Type Providers. With minimal loss of type safety, a CSV type provider could provide types by dynamically examining the file.
But data analysis typically doesn't stop there. We often transform the data and create new datasets, via a pipeline of operations. My question is, can Type Providers help if we mostly manipulate data? For instance:
open CSV // Type provider
let data = CSV(file='income.csv') // Type provider magic (syntax?)
let log_income = log(data.income) // works!
This works but pollutes the global namespace. It is often more natural to think about adding a column, rather than creating a new variable. Is there some way to do?
let data.logIncome = log(data.income) // won't work, sadly.
Do type providers provide an escape from using the (?) operator when the goal is creating new derivative or cleaned-up datasets?
Perhaps something like:
let newdata = colBind data {logIncome = log(data.income)} // ugly, does it work?
Other ideas?