I wanted to have a look at the julia language, so I wrote a little script to import a dataset I'm working with. But when I run and profile the script it turns out that it is much slower than a similar script in R. When I do profiling it tells me that all the cat commands have a bad performance.
The files look like this:
#
#Metadata
#
Identifier1 data_string1
Identifier2 data_string2
Identifier3 data_string3
Identifier4 data_string4
//
I primarily want to get the data_strings and split them up into a matrix of single characters. This is a somehow minimal code example:
function loadfile()
f = open("/file1")
first=true
m = Array(Any, 1,0)
for ln in eachline(f)
if ln[1] != '#' && ln[1] != '\n' && ln[1] != '/'
s = split(ln[1:end-1])
s = split(s[2],"")
if first
m = reshape(s,1,length(s))
first = false
else
s = reshape(s,1,length(s))
println(size(m))
println(size(s))
m = vcat(m, s)
end
end
end
end
Any idea why julia might be slow with the cat command or how i can do it differently?
Thanks for any suggestions!