I'm still in the early phase of learning OCaml and am curious to know what the best way to extract maximum performance out of generic code in OCaml is.
As a small experiment, I've written two polymorphic functions: one in C++ and the other in OCaml that find the largest element in a given array.
What I've observed is that while in C++ you pay no penalty for this sort of abstraction, the penalty in OCaml is a whopping one degree of magnitude drop in performance. And as an aside, the C++ solution I quickly concocted is more generic that the OCaml one, but I blame that primarily on my inexperience with the language.
My question is the following: how to write and use polymorphic functions in OCaml without paying the huge performance penalty that I've just observed?
Another thing I observed for this particular problem is that my functional solution in OCaml was slower than the imperative one, whereas the "functional" solution in C++ suffered no penalties compared to the analogue imperative approach.
C++ code was compiled with g++ -std="c++0x" -O3 -o max_cpp max.cpp
, using gcc-4.6.3
and OCaml code with: ocamlopt -o max_ml max.ml
using ocamlopt version 3.12.1.
Both of the generated executables were 32 bit and run on Ubuntu x64 11.04
I submit the code of both programs.
C++ code (not entirely safe, of course ;) )
#include <iostream>
#include <vector>
#include <numeric>
template <typename T> T max (T a, T b) {
return (a>b)? a : b;
}
template <typename I> typename I::value_type find_max (I begin, I end) {
auto max = *begin;
for (begin++;begin!=end;begin++)
if (*begin > max) max = *begin;
return max;
}
template <typename I> typename I::value_type find_max1(I begin, I end) {
return std::accumulate(begin, end, *begin, max< typename I::value_type> );
}
int main(int argc, char ** argv) {
const size_t nElem = atoi(argv[1]);
const size_t nIter = atoi(argv[2]);
std::vector<int> intA(nElem);
std::vector<double> dubA(nElem);
for (int i=0;i<nIter;i++) {
auto res1 = find_max(intA.begin(), intA.end());
auto res2 = find_max(dubA.begin(), dubA.end());
std::cout << "Max int: " << res1 << " " << "Max dub: " << res2 << std::endl;
}
}
OCaml code
let max a b = if a > b then a else b
(* functional version *)
let find_max vector =
Array.fold_right max vector vector.(0)
(* imperative version *)
let find_max1 vector =
let length = Array.length vector in
let max = ref (vector.(0)) in
for i=1 to length-1 do
if vector.(i) > !max then max := vector.(i)
done; max
let nElems = int_of_string Sys.argv.(1)
let nIter = int_of_string Sys.argv.(2)
let _ = let intA = Array.make nElems 0 in
let dubA = Array.make nElems 0.0 in
for i=1 to nIter do
let res1 = find_max intA in
let res2 = find_max dubA in
print_int !res1;
print_newline ();
print_float !res2;
done
However, if I "overload" the function to discriminate ints and floats then the performance of the program even exceeds that of my C++ code by 50%! I wonder why though.
let find_max_int (vector : int array) =
let length = Array.length vector in
let max = ref (vector.(0)) in
for i=1 to length-1 do
if vector.(i) > !max then max := vector.(i)
done; max
let find_max_float (vector : float array) =
let length = Array.length vector in
let max = ref (vector.(0)) in
for i=1 to length-1 do
if vector.(i) > !max then max := vector.(i)
done; max
The timings were performed rather crudely with:
time ./max_cpp 1000000 100
and time ./max_ml 1000000 100
find_max
when C++ already providesstd::max_element
? – Jerry Coffin-inline 1000
for instance). – didierc