Calling printf with a templated functor segfaults (64-bit only, valgrind clean in 32-bit)

Question

I am presently debugging some C++ code written in the late 90's that parses scripts to load data, perform simple operations, and print results etc.

The people who wrote the code used functors to map string keywords in the file it is parsing to actual function calls, and they are templated (with a maximum number of 8 arguments) to handle the myriad of function interfaces that the user might request in their script.

For the most part this all works fine, except that in recent years it started to segfault on some of our 64-bit build systems. Running things through valgrind, to my surprise, I found that the errors appear to be happening inside "printf", which is one of said functors. Here are some code snippets to show how this works.

First, the script that it is parsing contains the following line:

printf( "%5.7f %5.7f %5.7f %5.7f\n", cos( j / 10 ), tan( j / 10 ), sin( j / 10 ), sqrt( j / 10 ) );

where cos, tan, sin, and sqrt are also functors corresponding to libm (this detail is unimportant, if I replace those with fixed numerical values I get the same result).

When it comes to calling printf, it is done in the following way. First, the templated functor:

template<class R, class T1, class T2, class T3, class T4, class T5, class T6, class T7, class T8>
class FType
{
    public :
        FType( const void * f ) { _f = (R (*)(T1,T2,T3,T4,T5,T6,T7,T8))f;  }
        R operator()( T1 a1,T2 a2,T3 a3,T4 a4,T5 a5,T6 a6,T7 a7,T8 a8 )
        { return _f( a1,a2,a3,a4,a5,a6,a7,a8); }

    private :
        R (*_f)(T1,T2,T3,T4,T5,T6,T7,T8);

};

And then the code which calls it is inside another template class - I show the prototype and the relevant piece of code which uses FType (as well as some extra code I put in for debugging):

template<class T1, class T2, class T3, class T4, class T5, class T6, class T7, class T8>
static Token
evalF(
    const void *            f,
    unsigned int            nrargs,
    T1              a1,
    T2              a2,
    T3              a3,
    T4              a4,
    T5              a5,
    T6              a6,
    T7              a7,
    T8              a8,
    vtok &              args,
    const Token &           returnType )
{
  Token     result;

  printf("Count: %i\n",++_count);

  if( _count == 2 ) {
    const char *fmt = *((const char **) &a1);

    result = printf(fmt,a2,a3,a4,a5,a6,a7,a8);

    FType<int, const void*,T2,T3,T4,T5,T6,T7,T8>    f1(f);
    result = f1("Hello, world.\n",a2,a3,a4,a5,a6,a7,a8);
    result = f1("Hello, world2 %5.7f\n",a2,a3,a4,a5,a6,a7,a8);
    result = f1(fmt,a2,a3,a4,a5,a6,a7,a8);
  } else {
    FType<int, T1,T2,T3,T4,T5,T6,T7,T8> f1(f);
    result = f1(a1,a2,a3,a4,a5,a6,a7,a8);
  }
}

I inserted the if(_count == 2) bit (since this function gets called a number of times). Under normal circumstances, it only performs the operations in the else clause; it calls the FType constructor (which templates the return type as int) with "f" which is a functor for printf (verified in the debugger). Once f1 is constructed, it calls the overloaded call operator with all of the templated arguments, and valgrind starts to complain:

==29358== Conditional jump or move depends on uninitialised value(s)
==29358==    at 0x92E3683: __printf_fp (printf_fp.c:406)
==29358==    by 0x92E05B7: vfprintf (vfprintf.c:1629)
==29358==    by 0x92E88D8: printf (printf.c:35)
==29358==    by 0x5348C45: FType<int, void const*, double, double, double, double, void const*, void const*, void const*>::operator()(void const*, double, double, double, double, void const*, void const*, void const*) (Interpreter.cc:321)
==29358==    by 0x51BAB6D: Token evalF<void const*, double, double, double, double, void const*, void const*, void const*>(void const*, unsigned int, void const*, double, double, double, double, void const*, void const*, void const*, std::vector<Token, std::allocator<Token> >&, Token const&) (Interpreter.cc:542)

So, this led to the experiments inside the if() clause. First, I tried calling printf directly with the same arguments (note the typecasting trick with parameter a1 -- the format -- in order to get it to compile; otherwise it complains for many instances of the template where T1 isn't (char *) as printf expects). This works fine.

Next, I tried calling f1 with a replacement format string that has no variables in it (Hello, world). This also works fine.

Then I add in one of the variables (Hello, World2 %5.7f), and then I start to see valgrind errors as above.

If I run this code on a 32-bit system, it is valgrind clean (otherwise same versions of glibc, gcc).

Running on several different Linux systems (all 64-bit), sometimes I get a segfault (e.g. RHEL5.8/libc2.5 and openSUSE11.2/libc-2.10.1), and sometimes I don't (e.g. libc2.15 with Fedora 17 and Ubuntu 12.04), but valgrind always complains in a similar way on all systems, making me think it is a fluke whether it crashes or not.

This all leads me to suspect some sort of bug with glibc in 64-bit, although I would be much happier if someone can find something wrong with this code!

One hunch I had is that it is related, somehow, to the parsing of variable argument lists. How exactly do these play with templates? It's not actually clear to me how this works, because it doesn't know the format string until runtime, so how does it know which particular instances of the template to make when compiling? However, this doesn't explain why everything seems fine in 32-bit.

Update in response to comments

Thank you everyone for this helpful discussion. I think that the answer from awn regarding the %al register is probably the correct explanation, although I have not yet verified it. Regardless, for the benefit of the discussion, here is a full, minimal program that reproduces the error on my 64-bit system that others can play with. If you #define _VOID_PTR at the top, it uses void * pointers to pass around the function pointers as in the original code (and triggers the valgrind errors). If you comment out the #define _VOID_PTR, it will instead use properly prototyped function pointers as WhozCraig suggested. The problem with this case is that I couldn't simply put int (*f)(const char *, double, double) = &printf; since the compiler complains about the prototypes mismatching (maybe I'm just thick and there is a way to do this? - I'm guessing that this is the problem the original author was trying to get around with the void * pointers). To deal with this specific case, I created this wrap_printf() function with the correct explicit argument list. When I execute this version of the code it is valgrind clean. Unfortunately this doesn't tell us whether it is a void * vs. function pointer storage problem, or something related to the %al register; I think that most evidence points to the latter case, and I suspect that wrapping printf() with a fixed argument list has forced the compiler to do "the right thing":

#include <cstdio>

#define _VOID_PTR  // set if using void pointers to pass around function pointers

template<class R, class T1, class T2, class T3>
class FType
{
public :
#ifdef _VOID_PTR
  FType( const void * f ) { _f = (R (*)(T1,T2,T3))f; }
#else
  typedef R (*FP)(T1,T2,T3);
  FType( R (*f)(T1,T2,T3 )) { _f = f; }
#endif

  R operator()( T1 a1,T2 a2,T3 a3)
  { return _f( a1,a2,a3); }

private :
  R (*_f)(T1,T2,T3);

};

template <class T1, class T2, class T3> int wrap_printf( T1 a1, T2 a2, T3 a3 ) {
  const char *fmt = *((const char **) &a1);
  return printf(fmt, a2, a3);
}

int main( void ) {

#ifdef _VOID_PTR
  void *f = (void *)printf;
#else
  // this doesn't work because function pointer arguments don't match printf prototype:
  // int (*f)(const char *, double, double) = &printf;

  // Use this wrapper instead:
  int (*f)(const char *, double, double) = &wrap_printf;
#endif

  char a1[]="%5.7f %5.7f\n";
  double a2=1.;
  double a3=0;

  FType<int, const char *, double, double> f1(f);

  printf(a1,a2,a3);
  f1(a1,a2,a3);

  return 0;
}

Can you please indicate which line of Interpreter.cc is line number 542? And the same for line 321? — Some programmer dude
You might also want to see if your compiler support variadic templates, as well as std::function. — Some programmer dude
Line 542 is "Hello, world2". I get identical valgrind output if instead it executes the code in the else clause calling f1(). — echapin
For the moment we're trying to stay away from newer standards as much as possible because this is a cross-platform project (various flavours of linux and osx at the moment). However, we'll go down that road if needed... — echapin
You should be aware: Ultimately this code is passing a code pointer through a const void*, then casting it back to a code pointer. The former is allowed by the standard; the latter is not. i.e. func-ptr-to-void-ptr is ok, but void-ptr-to-func-ptr is not. This is ultimately for platforms where code pointers and data pointers have different bit representations, and likely explains why this works on 32-bit, but fails on 64-bit for you. — WhozCraig

awn · Accepted Answer · 2013-03-11T16:10:59

Under the System V amd64 ABI, which is used by 64-bit Linux (and many other Unixes), functions with a fixed number of arguments and functions with a variable number of arguments have slightly different calling conventions.

Quoting from the "System V Application Binary Interface AMD64 Architecture Processor Supplement", Draft 0.99.5 [2], chapter 3.2.3 "Parameter Passing":

For calls that may call functions that use varargs or stdargs (prototype-less calls or calls to functions containing ellipsis (...) in the declaration) %al is used as hidden argument to specify the number of vector registers used.
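
To make the quoted rule concrete, here is a minimal sketch of a function declared with an ellipsis. The names sum_doubles and demo_sum_doubles are hypothetical and not part of the original code. Because the ellipsis declaration is visible at the call site, the compiler knows the call is variadic and emits the variadic calling sequence; on x86-64 System V that includes setting %al as described in the quote.

```cpp
#include <cassert>
#include <cstdarg>

// Hypothetical variadic function: adds up `count` arguments of type double.
double sum_doubles(int count, ...)
{
    va_list ap;
    va_start(ap, count);
    double total = 0.0;
    for (int i = 0; i < count; ++i) {
        // On x86-64 System V the first few doubles arrive in vector (XMM) registers.
        total += va_arg(ap, double);
    }
    va_end(ap);
    return total;
}

// Hypothetical demo: the ellipsis prototype is in scope here,
// so the compiler treats this as a variadic call.
double demo_sum_doubles()
{
    return sum_doubles(3, 1.0, 2.0, 0.5);
}
```

Calling demo_sum_doubles() should return 3.5. The same function called through a pointer whose type lacks the ellipsis would not be guaranteed to work.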

Now, the three-step sequence:

1. printf(3) is such a variable-argument function. Therefore, it expects the %al register to be filled properly.
2. Your FType::_f is declared as a pointer to a function with a fixed number of arguments. Therefore, the compiler doesn't bother to set %al when it calls something through it.
3. When printf() is called through FType::_f, it expects a properly filled %al (because of 1), but the compiler didn't fill it (because of 2), and as a consequence printf() finds "garbage" in %al.

Using "garbage" instead of a properly initialized value may easily lead to a variety of unwanted results, including the segfaults you observed.
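
One way to avoid the mismatch in the three steps above is to keep the ellipsis in the pointer type, so the compiler knows it is making a variadic call. The following is only a minimal sketch under that assumption, not the original project's code; VariadicFn, call_variadic and demo_call_variadic are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstdio>

// Pointer type that keeps the ellipsis. Calls through it are compiled as
// variadic calls, so on x86-64 System V the compiler sets %al before the call.
typedef int (*VariadicFn)(const char *, ...);

// Hypothetical helper: calls a printf-like function through a correctly
// typed variadic pointer and returns whatever the callee returns.
int call_variadic(VariadicFn fn, const char *fmt, double a, double b)
{
    return fn(fmt, a, b);
}

// Hypothetical demo: printf converts to VariadicFn without any cast,
// because the pointer type matches printf's real prototype.
int demo_call_variadic()
{
    VariadicFn fn = std::printf;
    return call_variadic(fn, "%5.7f %5.7f\n", 1.0, 0.0);
}
```

This also explains why int (*f)(const char *, double, double) = &printf; is rejected: that pointer type drops the ellipsis, whereas the typedef above matches printf's declaration. A fixed-argument wrapper such as wrap_printf() works for the same reason, since the call to printf inside it is made with the real prototype in scope.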

For further information, see:
[1] http://en.wikipedia.org/wiki/X86_calling_conventions#x86-64_calling_conventions
[2] http://x86-64.org/documentation/abi.pdf
