5
votes

When I was learning OCaml essentials, I was told that every function in OCaml is actually a function with only one parameter. A multi-argument function is actually a function that takes one argument and returns a function that takes the next argumetn and returns ....

This is currying, I got that.

So my question is:

case 1

if I do

let plus x y = x + y

Inside OCaml when it compiles, will OCaml change it to let plus = fun x -> fun y -> x + y?


or the other way around that

case 2

If I do

let plus = fun x -> fun y -> x + y

OCaml will convert it to let plus x y = x + y?


Which case is true? What's the benifit or optimisation OCaml compiler has done in the correct case?

In addition, if case 2 is true, then what is the point to consider OCaml is doing currying? I mean it actually does the opposite way, right?

This question is actually related to Understand Core's `Fn.const`

4
let plus x y = x + y IS syntactic sugar for let plus = fun x -> fun y -> x + y. The second is how the first is defined in the language. There is no "change" because they are the same. As Jeffrey Scofield's answer says, a "non-curried" version would be let plus (x, y) = x + yuser102008
I just remembered those undocumented parameters of ocamlc (which also happen to work with ocaml!), which makes the compiler dumps representations of the code at different stages of compilations. Quite interesting to compare the way code is being translated throughout the compilation process, I wanted to mention them here (-dparsetree etc.), but it seems it's been mentioned in one of the answers below.didierc

4 Answers

7
votes

Both let plus x y = x + y and let plus = fun x -> fun y -> x + y will be compiled to the same code:

camlPlus__plus:
    leaq    -1(%rax, %rbx), %rax
    ret

Yes, exactly two assembler instructions, without any prologues and epilogues.

OCaml compiler performs several steps of optimizations, and actually "thinks" in a different categories. For example, both functions are represented with the same lambda code:

(function x y (+ x y))

I think, that according to the lambda above, you may think that OCaml compiler transforms to a non-curried version.

Update

I would also like to add a few words about the core's const function. Suppose we have two semantically equivalent representations of the const function:

let const_xxx c = (); fun _ -> c
let const_yyy c _ = c

in a lambda form they will be represented as:

(function c (seq 0a (function param c))) ; const_xxx
(function c param c)                     ; const_yyy

So, as you can see, const_xxx is indeed compiled in a curried form.

But the most interesting question, is why it is worth to write it in a such obscure code. Maybe there're some clues in assembly output (amd64):

camlPlus__const_xxx_1008:
    subq    $8, %rsp
.L101:
    movq    %rax, %rbx                    ; save c into %rbx (it was in %rax)
.L102:  
    subq    $32, %r15                     ; allocate memory for a closure
    movq    caml_young_limit(%rip), %rax  ; check
    cmpq    (%rax), %r15                  ; that we have memory, if not
    jb      .L103                         ; then free heap and go back
    leaq    8(%r15), %rax                 ; load closure address to %rax
    movq    $3319, -8(%rax)
    movq    camlPlus__fun_1027(%rip), %rdi
    movq    %rdi, (%rax)
    movq    $3, 8(%rax)
    movq    %rbx, 16(%rax)                ; store parameter c in the closure
    addq    $8, %rsp             
    ret                                   ; return the closure
.L103:  call    caml_call_gc@PLT
.L104:  jmp .L102

What about const_yyy? It is compiled simply as:

camlPlus__const_yyy_1010:
    ret

Just return the argument. So, it is assumed that the actual point of optimization, is that in const_xxx the closure creation is compiled inside the function and should be fast. On the other hand, const_yyy doesn't expect to be called in a curried way, so if you will call it without all the needed parameters, then compiler needs to add the code that creates a closure in the point of const_yyy partial application (i.e., to perform all the operations in the const_xxx every time you call const_xxx x).

To conclude, const optimization creates a function that is optimized for partial application. Although, it comes with cost. A non-optimized const function will outperform the optimized if they are called with all parameters. (Actually my parameter even droped a call to const_yyy when I applied it with two args.

5
votes

As far as the semantics of the OCaml language is concerned both of those definitions are completely equivalent definitions of a curried function. There's no such thing as a multi-argument function in the semantics of the OCaml language.

However the implementation is a different matter. Specifically the current implementation of the OCaml language supports multi-argument functions in its internal representation. When a curried function is defined a certain way (i.e. as let f x y = ... or let f = fn x -> fn y -> ...), this will be compiled to a multi-argument function internally. However if it is defined differently (like let f x = (); fn y -> ... in the linked question), it will be compiled to a curried function. This is only an optimization and does not affect the semantics of the language in any way. All three ways of defining a curried function are semantically equivalent.

Regarding your specific question about what gets turned into what: Since the transformation isn't from one piece of OCaml code into another piece of OCaml code, but rather from OCaml code to an internal representation, I think the most accurate way to describe it would be to say that the OCaml compiler turns both let plus x y = x + y and let plus = fn x -> fn y -> x + y into the same thing internally, not that it turns one into the other.

4
votes

Both case 1 and case 2 are curried functions. Here is the non-curried version:

let plus (x, y) = x + y
3
votes

Okay, I learned that the native compiler will optimize your code, what I expect it to do. But here is the bytecode compiler:

let plus1 x y = x + y
let plus2 = fun x y -> x + y
let plus3 = function x -> function y -> x + y

treated with ocamlc -c -dinstr temp.ml gives me:

       branch L4
        restart
L1:     grab 1
        acc 1
        push
        acc 1
        addint
        return 2
        restart
L2:     grab 1
        acc 1
        push
        acc 1
        addint
        return 2
        restart
L3:     grab 1
        acc 1
        push
        acc 1
        addint
        return 2

which means the result is exactly the same, it is only a syntax difference. And the arguments are taken one by one.

Btw, one more syntax point: fun can be written with n arguments, function only with one.

From the conceptual point of view I would largely favor function x -> function y -> over the others.