I'm trying to learn to write good Julia code. I would like to code up the following statistic.
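$$\kappa(k) = \frac{\sum_{j=1}^{m} \left( p_{jj}(k) - p_j^2 \right)}{\sum_{j=1}^{m} \left( \frac{1}{m} - p_j^2 \right)}$$
(note 1{A} = 1 if A true, 0 if A false)
where
$$p_{jj}(k) = \frac{1}{n-k} \sum_{t=1}^{n-k} 1\{x_t = s_j\} \, 1\{x_{t+k} = s_j\}$$
and
$$p_j = \frac{1}{n} \sum_{t=1}^{n} 1\{x_t = s_j\}$$
Here n is the length of x and s_1, ..., s_m are its distinct values, matching the code below.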
function cohens_kappa(x::Vector{Int}, k::Int)
    support = unique(x)   # distinct values in x
    m = length(support)
    n = length(x)
    # y[t, j] is true when x[t] equals the j-th support value
    y = falses(n, m)
    for j in eachindex(support)
        y[:, j] = (x .== support[j])
    end
    num = 0.0
    den = 0.0
    for j in eachindex(support)
        # share of positions t where x[t] and x[t + k] both equal support[j]
        pjjk = sum(y[(1 + k):n, j] .& y[1:(n - k), j]) / (n - k)
        pj = sum(y[:, j]) / n
        num += pjjk - pj ^ 2
        den += (1 / m) - pj ^ 2
    end
    return num / den
end
Is this the most efficient way to code this up?
EDIT: Thanks for all the suggestions guys. Can you explain why your code is more efficient? I'd like to learn how to continue to write good code in the future.
Testing against @user3580870's two examples, we have:
@time [cohens_kappa(X, k) for k in 1:15]
0.000507 seconds (1.58 k allocations: 269.016 KB)
@time [cohens_kappa2(X, k) for k in 1:15]
0.000336 seconds (166 allocations: 12.375 KB)
@time [cohens_kappa3(X, k) for k in 1:15]
0.000734 seconds (303 allocations: 84.109 KB)
It looks like your second suggestion is not as fast, but it makes fewer allocations than my original version, so it might be faster for very large vectors.
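
One plausible source of the allocations in the original version: it builds an n-by-m BitArray, and every slice such as `y[(1 + k):n, j]` and every elementwise `&` creates a temporary array. A minimal sketch of a single-pass alternative is below. It is not either of the suggested answers; the name `cohens_kappa_counts` and the use of `Dict` counters are assumptions made for illustration, and it has not been benchmarked against the versions timed above.

```julia
# Hypothetical single-pass variant (illustrative name, not from the thread).
# It keeps two small counters instead of an n-by-m indicator matrix.
function cohens_kappa_counts(x::Vector{Int}, k::Int)
    n = length(x)
    counts = Dict{Int,Int}()   # number of occurrences of each value in x
    agree  = Dict{Int,Int}()   # per value: number of t with x[t] == x[t + k]
    for t in 1:n
        counts[x[t]] = get(counts, x[t], 0) + 1
        if t <= n - k && x[t] == x[t + k]
            agree[x[t]] = get(agree, x[t], 0) + 1
        end
    end
    m = length(counts)         # number of distinct values
    num = 0.0
    den = 0.0
    for (v, c) in counts
        pj = c / n
        pjjk = get(agree, v, 0) / (n - k)
        num += pjjk - pj^2
        den += 1 / m - pj^2
    end
    return num / den
end
```

Because the sums run over the dictionary in a different order than `unique(x)`, the result may differ from the original in the last few floating-point digits.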



X = rand(1:100, N) and make N large, so that the run time is around 0.01 s, to make the timings better) - Dan Getz
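
Following that comment, timings could be taken on a larger input along these lines. This is only a sketch: `best_time` is a hypothetical helper, `N = 10^5` is an assumed starting size, and `stand_in` is a placeholder to be swapped for `cohens_kappa` or one of the suggested functions.

```julia
# Hypothetical helper (not from the thread): smallest wall-clock time, in seconds,
# of evaluating f(X, k) for every k in ks, over `reps` repetitions.
function best_time(f, X, ks; reps = 5)
    f(X, first(ks))                  # warm-up call, so compilation is not timed
    best = Inf
    for _ in 1:reps
        t = @elapsed [f(X, k) for k in ks]
        best = min(best, t)
    end
    return best
end

N = 10^5                             # assumed size; raise it until a run takes about 0.01 s
X = rand(1:100, N)
stand_in(x, k) = sum(x) + k          # placeholder; pass cohens_kappa etc. instead
println(best_time(stand_in, X, 1:15))
```

Taking the minimum over several runs reduces the influence of garbage collection pauses and other noise on very short measurements.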