R: Enriched debugging for linear code chains

Question

I am trying to figure out if it is possible, with a sane amount of programming, to create a certain debugging function by using R's metaprogramming features.

Suppose I have a block of code in which each line uses, as all or part of its input, the output from the line before -- the sort of code you might build with pipes (though no pipe is used here).

{
f1(args1)                  -> out1
f2(out1, args2)            -> out2
f3(out2, args3)            -> out3
...
fn(out<n-1>, args<n>)      -> out<n>
}

Where for example it might be that:

f1 <- function(first_arg, second_arg, ...){my_body_code},

and you call f1 in the block as:

f1(second_arg = 1:5, list(a1 ="A", a2 =1), abc = letters[1:3], fav = foo_foo)

where foo_foo is an object defined in the calling environment of f1.
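To make the matching concrete, here is a minimal sketch of how base R's match.call() pairs the actual arguments of that call with the formals of f1. The body of f1 is a stand-in (an assumption, since my_body_code is not shown), and nothing is evaluated here, so foo_foo does not even need to exist.

```r
# Stand-in definition: only the formals matter for matching.
f1 <- function(first_arg, second_arg, ...) NULL

# The call from above, captured unevaluated.
cl <- quote(f1(second_arg = 1:5, list(a1 = "A", a2 = 1),
               abc = letters[1:3], fav = foo_foo))

# expand.dots = FALSE keeps everything absorbed by `...` in one element.
m <- match.call(f1, cl, expand.dots = FALSE)
matched <- as.list(m)[-1]          # drop the function name

names(matched)                     # "first_arg" "second_arg" "..."
names(matched[["..."]])            # "abc" "fav"
deparse(matched$first_arg)         # the unnamed list(...) call lands in first_arg
```

The unnamed list ends up in first_arg because second_arg was matched by name first, which is exactly the kind of matching detail the record should preserve.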

I would like a function I could wrap around my block that would, for each line of code, create an entry in a list. Each entry would be named (line1, line2, and so on) and would have a sub-entry for each argument and one for the function output. Each argument entry would consist of, first, the name of the formal to which the actual argument is matched; second, the expression or name supplied to that argument, if there is one (and a placeholder if the argument is just a constant); and third, the value of that expression as if it were forced immediately on entry into the function. (I'd rather have the value as of the moment the promise is first forced, but that seems to me like a much harder problem, and the two values will most often be the same.)

All the arguments assigned to the ... (if any) would go in a dots = list() sublist, with entries named if they have names and appropriately labeled (..1, ..2, etc.) if they are assigned positionally. The last element of each line sublist would be the name of the output and its value.

The point of this is to create a fairly complete record of the operation of the block of code. I think of this as analogous to an elaborated version of purrr::safely that is not confined to iteration and keeps a more detailed record of each step. Indeed, if a function exits with an error, you would want the error message in the list entry, along with as many of the matched arguments as could be captured before the error was produced.

It seems to me like this would be very useful in debugging linear code like this. This lets you do things that are difficult using just the RStudio debugger. For instance, it lets you trace code backwards. I may not know that the value in out2 is incorrect until after I have seen some later output. Single-stepping does not keep intermediate values unless you insert a bunch of extra code to do so. In addition, this keeps the information you need to track down matching errors that occur before promises are even created. By the time you see output that results from such errors via single-stepping, the matching information has likely evaporated.

I have actually written code that takes a piped function and eliminates the pipes to put it in this format, just using text manipulation. (Indeed, it was John Mount's "Bizarro pipe" that got me thinking of this.) And if I, or we, or you, can figure out how to do this, I would hope to make a serious run at a second version where each function calls the next, supplying it with arguments internally rather than externally -- like a traceback where you get the passed argument values as well as the function name and formals. Other languages have debugging environments like that (e.g. GDB), and I've been wishing for one for R for at least five years, maybe 10, and this seems like a step toward it.

Can you explain why this is tagged with rstudio? Are you intending that this path would be specific to the IDE? While some things are meant for the IDE rather than for the language in the absence of the IDE, I'd think this is solely "R". -- r2evans

G. Grothendieck · Accepted Answer · 2020-01-26T04:40:55

Just issue the trace call shown below for each function that you want to trace.

f <- function(x, y) {
  z <- x + y
  z
}
trace(f, exit = quote(print(returnValue())))
f(1,2)

giving the following, which shows the call (the function name and its inputs) and the output. (The last 3 is the return value of f(1,2) itself, auto-printed at top level.)

Tracing f(1, 2) on exit 
[1] 3
[1] 3
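To get closer to the list-style record the question describes, one possible direction is sketched below. It is only a sketch under several assumptions: record_block and its describe helper are hypothetical names, not functions from base R or any package; every line of the block is assumed to be a call to an ordinary closure, optionally assigned to a plain name; and each argument is evaluated once for the record and once again by the real call, so arguments with side effects would be run twice.

```r
# Hypothetical helper: run each call in a braced block and record, per line,
# the matched arguments (expression and value), the dots, and the output.
record_block <- function(block, env = parent.frame()) {
  force(env)
  exprs <- as.list(substitute(block))[-1]      # the lines inside `{ ... }`
  describe <- function(ex) {
    list(expr  = if (is.language(ex)) ex else "<constant>",
         value = eval(ex, env))                # forced as if on entry
  }
  log <- list()
  for (i in seq_along(exprs)) {
    e <- exprs[[i]]
    # `call -> out` is parsed as `out <- call`
    assigned <- is.call(e) && identical(e[[1]], as.name("<-"))
    cl <- if (assigned) e[[3]] else e
    entry <- list(call = cl)
    step <- tryCatch({
      m <- as.list(match.call(eval(cl[[1]], env), cl, expand.dots = FALSE))[-1]
      dots <- m[["..."]]
      m[["..."]] <- NULL
      entry$args <- lapply(m, describe)
      if (length(dots)) {
        nm <- names(dots)
        if (is.null(nm)) nm <- rep("", length(dots))
        unnamed <- nm == ""
        nm[unnamed] <- paste0("..", seq_along(dots))[unnamed]
        entry$dots <- setNames(lapply(dots, describe), nm)
      }
      value <- eval(e, env)                    # also performs the assignment
      entry$output <- list(
        name  = if (assigned) as.character(e[[2]]) else NA_character_,
        value = value)
      NULL
    }, error = function(err) err)
    if (inherits(step, "error")) entry$error <- conditionMessage(step)
    log[[paste0("line", i)]] <- entry
    if (inherits(step, "error")) break         # later lines depend on this one
  }
  log
}

# Illustrative functions; the bodies are made up for the example.
f1 <- function(first_arg, second_arg, ...) sum(second_arg) + length(list(...))
f2 <- function(x, scale) x * scale
foo_foo <- "bar"

rec <- record_block({
  f1(second_arg = 1:5, list(a1 = "A", a2 = 1), abc = letters[1:3], fav = foo_foo) -> out1
  f2(out1, 2) -> out2
})
str(rec$line2, max.level = 2)
```

Because the entry is filled in step by step inside tryCatch, whatever was matched and evaluated before a failure stays in the record next to the error message. Unlike trace, this records values as of entry to each call rather than at the moment a promise is actually forced inside the function.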
