Julia: confusion with error on datatype / DataFrame

Question

I am new to Julia and am following this blog post to build a neural network:

http://blog.yhathq.com/posts/julia-neural-networks.html

I am confused about data types and error messages in Julia. This is my code (again, following the blog post on neural networks):

# read in df to train
train_df = readtable("data/winequality-red.csv", separator=';')
# create train and test data splits
y = train_df[:quality]
x = train_df[:, 1:11] # matrix of all except quality
# vector() and matrix() from blog post

n = length(y)
is_train = shuffle([1:n] .> floor(n * .25))

x_train,x_test = x[is_train,:],x[!is_train,:]
y_train,y_test = y[is_train],y[!is_train]

type StandardScalar
  mean::Vector{Float64}
  std::Vector{Float64}
end

# initialize empty scalar
function StandardScalar()
  StandardScalar(Array(Float64, 0), Array(Float64, 0))
end

# compute mean and std of each col
function fit_std_scalar!(std_scalar::StandardScalar, x::Matrix{Float64})
  n_rows, n_cols = size(x)  # use the argument x, not the global x_test
  std_scalar.std = zeros(n_cols)
  std_scalar.mean = zeros(n_cols)

  for i = 1:n_cols
    std_scalar.mean[i] = mean(x[:,i])
    std_scalar.std[i] = std(x[:,i])
  end
end

# further vectorize the transformation
function transform(std_scalar::StandardScalar, x::Matrix{Float64})
  # element wise subtraction of mean and division of std
  (x .- std_scalar.mean') ./ std_scalar.std'
end

# fit and transform
function fit_transform!(std_scalar::StandardScalar, x::Matrix{Float64})
  fit_std_scalar!(std_scalar, x)
  transform(std_scalar, x)
end

# fit scalar on training data and then transform the test
std_scalar = StandardScalar()

n_rows, n_cols = size(x_test)

# cols before scaling
println("Col means before scaling: ")
for i = 1:n_cols
  # C printf function
  @printf("%0.3f ", (mean(x_test[:, i])))
end

I am getting the error:

'.-' has no method matching .-(::DataFrame, ::Array{Float64,2}) in fit_transform! ...

For this code:

x_train = fit_transform!(std_scalar, x_train)
x_test = transform(std_scalar, x_test)

# after transforming
println("\n Col means after scaling:")
for i = 1:n_cols
  @printf("%0.3f ", (mean(x_test[:,i])))
end

I am new to Julia and do not understand what the issue is. The vector() and matrix() calls from the blog post do not work; I assume they come from an older version of DataFrames.

What I think the issue is: these functions take a ::Matrix{Float64} and I am passing in a DataFrame. I assume the (deprecated?) matrix() would have fixed this, but I am not sure. How do I analyze this error and pass these functions the correct types (if that is the problem here)?

Thank you!

IainDunning · Accepted Answer · 2015-01-28T00:08:21

I believe vector(...) and matrix(...) were both replaced with just array(...), but I can't find an issue number to correspond with that change.
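
To read the error itself: the types listed in `.-(::DataFrame, ::Array{Float64,2})` are the types of the two operands, so the left-hand side of the subtraction is still a DataFrame and needs converting before it reaches the scaling code. On the version used in the question, that would presumably mean wrapping the arguments as `array(x_train)` and `array(x_test)`. The snippet below is only a minimal sketch of the same idea on a current Julia 1.x with a recent DataFrames.jl, where `Matrix(df)` does the conversion; the small `df` table and the `scale_columns` helper are hypothetical stand-ins, not code from the blog post.

```julia
using DataFrames   # assumption: a recent DataFrames.jl release is installed
using Statistics   # mean and std live here on Julia 1.x

# Hypothetical stand-in for the wine table: two all-numeric columns.
df = DataFrame(acidity = [7.4, 7.8, 11.2, 7.0],
               alcohol = [9.4, 9.8, 10.0, 9.5])

# Restricted to Matrix{Float64}, like transform in the question.
function scale_columns(x::Matrix{Float64})
    mu = mean(x, dims = 1)       # 1 x n_cols row of column means
    sigma = std(x, dims = 1)     # 1 x n_cols row of column standard deviations
    (x .- mu) ./ sigma           # broadcast the row vectors across all rows
end

# Step 1: inspect what is actually being passed.
println(typeof(df))               # a DataFrame type, not Matrix{Float64}
println(df isa Matrix{Float64})   # false, so scale_columns(df) would raise a MethodError

# Step 2: convert explicitly, then the typed method applies.
x = Matrix(df)                    # Matrix{Float64} here because every column is Float64
scaled = scale_columns(x)
println(mean(scaled, dims = 1))   # column means are approximately zero after scaling
```

Checking `typeof(...)` on each argument against the types named in the error message is usually the quickest way to see which value needs converting.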
