3
votes

We have some legacy code that at one point in time long data types were refactored to int data types. During this refactor a number of printf / sprintf format statements were left incorrect as %ld instead of changed to %d. For example:

int iExample = 32;
char buf[200];

sprintf(buf, "Example: %ld", iExample);

This code is compiled on both GCC and VS2012 compilers. We use Coverity for static code analysis and code like in the example was flagged as a 'Printf arg type mismatch' with a Medium level of severity, CWE-686: Function Call With Incorrect Argument Type I can see this would be definitely a problem had the format string been that of an signed (%d) with an unsigned int type or something along these lines.

I am aware that the '_s' versions of sprintf etc are more secure, and that the above code can also be refactored to use std::stringstream etc. It is legacy code however...

I agree that the above code really should be using %d at the very least or refactored to use something like std::stringstream instead.

Out of curiosity is there any situation where the above code will generate incorrect results? As this legacy code has been around for quite some time and appears to be working fine.

UPDATED

  • Removed the usage of the word STL and just changed it to be std::stringstream.
2
It's undefined behaviour, so lots of situations. And string streams aren't part of the STL.chris
@chris agree with the undefined behavior part. But: "string streams aren't part of the STL" uh.... what are you talking about? std::stringstream is absolutely a thing. Unless you meant to argue the semantics of STL vs. Standard Library, but the author never mentioned STL so...aruisdante
In general, see related question: understanding the dangers of sprintfaruisdante
@aruisdante, Yes, it's a bit less pedantic to argue it when referring to parts of the standard library that weren't even in the STL. Anyway, the question was edited.chris
@chris thanks i removed any reference to STL. What sort of undefined behavior are we talking as this stuff appears to be working fine...kmcnamee

2 Answers

4
votes

As far as the standard is concerned, the behavior is undefined, meaning that the standard says exactly nothing about what will happen.

In practice, if int and long have the same size and representation, it will very likely "work", i.e., behave as if the correct format string has been used. (It's common for both int and long to be 32 bits on 32-bit systems).

If long is wider than int, it could still work "correctly". For example, the calling convention might be such that both types are passed in the same registers, or that both are pushed onto the stack as machine "words" of the same size.

Or it could fail in arbitrarily bad ways. If int is 32 bits and long is 64 bits, the code in printf that tries to read a long object might get a 64-bit object consisting of the 32 bits of the actual int that was passed combined with 32 bits of garbage. Or the extra 32 bits might consistently be zero, but with the 32 significant bits at the wrong end of the 64-bit object. It's also conceivable that fetching 64 bits when only 32 were passed could cause problems with other arguments; you might get the correct value for iExample, but following arguments might be fetched from the wrong stack offset.

My advice: The code should be fixed to use the correct format strings (and you have the tools to detect the problematic calls), but also do some testing (on all the C implementations you care about) to see whether it causes any visible symptoms in practice. The results of the testing should be used only to determine the priority of fixing the problems, not to decide whether to fix them or not. If the code visibly fails now, you should fix it now. If it doesn't, you can get away with waiting until later (presumably you have other things to work on).

1
votes

It's undefined and depends on the implementation. On implementations where int and long have the same size, it will likely work as expected. But just try it on any system with 32-bit int and 64-bit long, especially if your integer is not the last format argument, and you're likely to get problems where printf reads 64 bits where only 32 were provided, the rest quite possibly garbage, and possibly, depending on alignment, the following arguments also cannot get accessed correctly.