1
votes

Sometimes, when using macros to generate code, it is necessary to create identifiers that have global scope but which aren't really useful for anything outside the immediate context where they are created. For example, suppose it's necessary to compile-time-allocate an array or other indexed resource into chunks of various sizes.

/* Produce an enumeration of some story-book characters, and allocate some
   arbitrary index resource to them.  Both enumeration and resource indices
   will start at zero.

   For each name, defines HPID_xxxx to be the enumeration of that name.
   Also defines HP_ID_COUNT to be the total number of names, and
   HP_TOTAL_SIZE to be the total resource requirement, and creates an
   array hp_starts[HP_ID_COUNT+1].  Each character n is allocated resources
   from hp_starts[n] through (but not including) hp_starts[n+1].
*/

/* Give the names and their respective lengths */
#define HP_LIST \
  HP_ITEM(FRED, 4) \
  HP_ITEM(GEORGE, 6) \
  HP_ITEM(HARRY, 5) \
  HP_ITEM(RON, 3) \
  HP_ITEM(HERMIONE, 8) \
  /* BLANK LINE REQUIRED TO ABSORB LAST BACKSLASH */

#define HP_ITEM(name, length) HPID_##name,
typedef enum { HP_LIST HP_ID_COUNT} HP_ID;
#undef HP_ITEM

#define HP_ITEM(name, length) ZZQ_##name}; enum {ZZQX_##name=ZZQ_##name+(length)-1,
enum { HP_LIST HP_TOTAL_SIZE};
#undef HP_ITEM

#define HP_ITEM(name, length) ZZQ_##name,
const unsigned char hp_starts[] = { HP_LIST HP_TOTAL_SIZE};
#undef HP_ITEM

#include <stdio.h>

int main(void)
{
  int i;

  printf("ID count=%d Total size=%d\n",HP_ID_COUNT,HP_TOTAL_SIZE);
  for (i=0; HP_ID_COUNT > i; i++) /* Reverse conditional to avoid lt sign */
    printf("  %2d=%3d/%3d\n", i, hp_starts[i], hp_starts[i+1]-hp_starts[i]);
  printf("IDs are: \n");
#define HP_ITEM(name, length) printf("  %2d=%s\n",HPID_##name, #name);
  HP_LIST
#undef HP_ITEM

  return 0;
}

Is there any normal convention for naming such identifiers to minimize the likelihood of conflicts, and also to minimize any confusion they might generate? In the above scenario, identifiers ZZQ_xxx will be the same as hp_starts[HPID_xxx], and might in some contexts be useful, though their primary purpose is to build the array and serve as placeholders in computing other ZZQ values and HP_TOTAL_SIZE. Identifiers ZZQX_xxx are useless, however; their sole purpose is to serve as placeholders when setting the enumeration values for the succeeding items. Is there any good way to name such things?

Incidentally, I develop for small microcontrollers where RAM is at a greater premium than code space. Code is simulated by compiling on Microsoft VC++, but for production it is compiled using a cross-compiler in straight C; the code must thus compile as both C and C++.

Are there any other preprocessor tricks people can recommend for similar tasks?


2 Answers

2
votes

Is there any normal convention for naming such identifiers to minimize the likelihood of conflicts, and also to minimize any confusion they might generate?

It all boils down to the prefix you want to use. Ideally, one would want all the symbols to be easily associated with the list (HP_LIST) they are related to.

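So why not put the symbols under the same HP_ prefix? E.g. use the prefix HP__ZZQX_ to differentiate the useless symbols from the useful ones. (Since your code must also compile as C++, note that C++ reserves identifiers containing a double underscore, so a single-underscore form such as HP_ZZQX_ is safer.)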

N.B. I have worked on a project where one of the shared libraries already used a zzqx_ prefix internally; it always showed up at the end of the application's symbol table. In the race for unlikely-to-be-used names, many people apparently take the same route (the end of the Latin alphabet) and end up with precisely the same names, which is the opposite of the desired result. That is why I think namespaces (or, in C, symbol prefixes) should not be hidden or buried in the defines, but defined explicitly, so that they are easy to find and extract.

And as something concrete, here is your source enhanced with the hack around ## to generate the names using the prefix given as a preprocessor define:


/* the hack is needed to force the LIST_NAME to be expanded. 
   automatically adds underscores. yes, it's ugly */
#define LIST_SYMBOL_1(n1,n2,n3) n1##_##n2##_##n3
#define LIST_SYMBOL_0(n1,n2,n3) LIST_SYMBOL_1(n1,n2,n3)
#define LIST_SYMBOL(pref,name)  LIST_SYMBOL_0(LIST_NAME,pref,name)

/* give the name to the list. used by the LIST_SYMBOL(). */
#define LIST_NAME   HP

/* Give the names and their respective lengths */
#define HP_LIST \
  HP_ITEM(FRED, 4) \
  HP_ITEM(GEORGE, 6) \
  HP_ITEM(HARRY, 5) \
  HP_ITEM(RON, 3) \
  HP_ITEM(HERMIONE, 8) \
  /* BLANK LINE REQUIRED TO ABSORB LAST BACKSLASH */

#define HP_ITEM(name, length) HPID_##name,
typedef enum { HP_LIST HP_ID_COUNT} HP_ID;
#undef HP_ITEM

#define HP_ITEM(name, length)   LIST_SYMBOL(ZZQ,name)}; \
    enum {LIST_SYMBOL(ZZQX,name)=LIST_SYMBOL(ZZQ,name)+(length)-1,
enum { HP_LIST HP_TOTAL_SIZE};
#undef HP_ITEM

#define HP_ITEM(name, length) LIST_SYMBOL(ZZQ,name),
const unsigned char hp_starts[] = { HP_LIST HP_TOTAL_SIZE};
#undef HP_ITEM

#include <stdio.h>

int main(void)
{
  int i;
  printf("ID count=%d Total size=%d\n",HP_ID_COUNT,HP_TOTAL_SIZE);
  for (i=0; i<HP_ID_COUNT ; i++) /* bring the < back, SO is smart enough */
    printf("  %2d=%3d/%3d\n", i, hp_starts[i], hp_starts[i+1]-hp_starts[i]);
  printf("IDs are: \n");
#define HP_ITEM(name, length) printf("  %2d=%s\n",HPID_##name, #name);
  HP_LIST
#undef HP_ITEM
  return 0;
}
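
To see why the extra level of indirection is needed, here is a minimal sketch. The PASTE_* and STR names and the paste_demo() function are made up for this illustration and are not part of the code above. The operands of ## are pasted exactly as written, without macro expansion, so LIST_NAME only turns into HP if it first travels through one more macro call as an ordinary argument.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define LIST_NAME HP

/* Direct pasting: operands of ## are NOT macro-expanded first,
   so the literal token LIST_NAME ends up in the identifier. */
#define PASTE_DIRECT(pref, name)   LIST_NAME##_##pref##_##name

/* Two-level pasting: PASTE_0 does not use ## on its arguments, so
   LIST_NAME is expanded to HP before PASTE_1 does the pasting. */
#define PASTE_1(n1, n2, n3)        n1##_##n2##_##n3
#define PASTE_0(n1, n2, n3)        PASTE_1(n1, n2, n3)
#define PASTE_EXPANDED(pref, name) PASTE_0(LIST_NAME, pref, name)

/* Stringify after expansion, so the generated names can be inspected. */
#define STR_1(x) #x
#define STR(x)   STR_1(x)

static const char paste_direct_result[]   = STR(PASTE_DIRECT(ZZQ, FRED));
static const char paste_expanded_result[] = STR(PASTE_EXPANDED(ZZQ, FRED));

/* Prints both generated names and checks them (hypothetical helper). */
void paste_demo(void)
{
    printf("direct:   %s\n", paste_direct_result);
    printf("expanded: %s\n", paste_expanded_result);
    assert(strcmp(paste_direct_result, "LIST_NAME_ZZQ_FRED") == 0);
    assert(strcmp(paste_expanded_result, "HP_ZZQ_FRED") == 0);
}
```

Calling paste_demo() from main shows the difference: the direct version produces LIST_NAME_ZZQ_FRED, the two-level version produces HP_ZZQ_FRED.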

Edit 1. My preferred approach is to put the data into a proper text file, e.g.:

FRED
GEORGE
HARRY
RON
HERMIONE

(note that if each length is simply the length of the name, as in this example, you no longer need to list it, since the script can compute it) and write a script (or even a trivial C program) to generate source code from the text file, creating the necessary header (with the enum and the declaration of the data) and source file (with the data). Modify the Makefile to run the script before compiling any sources, and add the generated source files to the list of compiled sources.

That has the HUGE advantage that the generated code is plain code and can be indexed as such (unless you love the fun of "where did that darn id come from?"). The internal constants simply do not appear in the source code anymore, since the script handles them. And no more fugly preprocessor magic.
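
As a rough idea of what such a "trivial C program" could look like, here is a sketch of the generating function. The name hp_generate, the 64-name and 31-character limits, and the single-file output layout are all assumptions made for illustration. It also assumes, as in the example data, that each item's length is just the length of its name; with truly arbitrary lengths the text file would need a second column.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical generator: reads one name per line from `in` and writes
   C source to `out`. Assumes each item's resource length equals the
   length of its name. Returns the number of names processed. */
int hp_generate(FILE *in, FILE *out)
{
    char names[64][32];   /* assumed limits: 64 names, 31 chars each */
    int count = 0, i;
    unsigned total = 0;

    while (count < 64 && fscanf(in, "%31s", names[count]) == 1)
        count++;

    fputs("/* GENERATED FILE - DO NOT EDIT */\n", out);

    /* The public enumeration, one HPID_xxx per name. */
    fputs("typedef enum {\n", out);
    for (i = 0; i < count; i++)
        fprintf(out, "  HPID_%s,\n", names[i]);
    fputs("  HP_ID_COUNT\n} HP_ID;\n\n", out);

    /* The start offsets are emitted as plain numbers, so no
       placeholder identifiers are needed at all. */
    fputs("const unsigned char hp_starts[] = {\n", out);
    for (i = 0; i < count; i++) {
        fprintf(out, "  %u, /* %s */\n", total, names[i]);
        total += (unsigned)strlen(names[i]);
    }
    fprintf(out, "  %u\n};\n\n", total);
    fprintf(out, "#define HP_TOTAL_SIZE %u\n", total);
    return count;
}
```

A small main would open the text file and the output file and call hp_generate(in, out). A real setup would probably split the output into a header and a source file, as described above.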

0
votes

You might look at the preprocessor macros in the Boost project (the Boost.Preprocessor library). They include clever preprocessor counters. Combined with __LINE__, these can be used to generate unique identifiers that depend only on the line where they are expanded. This could help you avoid redefining your HP_ITEM macro.
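
A minimal sketch of the __LINE__ part, without Boost. The UNIQ_* macro names and the HP_PAD_ prefix are invented for this example. The same two-level trick is used so that __LINE__ is expanded to a number before it is pasted.

```c
/* Hypothetical helper names; two levels so __LINE__ expands before pasting. */
#define UNIQ_CAT_1(a, b)  a##b
#define UNIQ_CAT(a, b)    UNIQ_CAT_1(a, b)
#define UNIQ_NAME(prefix) UNIQ_CAT(prefix, __LINE__)

/* Each line gets its own placeholder name (HP_PAD_<line number>), so the
   two enums do not clash even though they use the same macro call. */
enum { UNIQ_NAME(HP_PAD_) = 4,  HP_AFTER_FIRST  };  /* HP_AFTER_FIRST  == 5  */
enum { UNIQ_NAME(HP_PAD_) = 10, HP_AFTER_SECOND };  /* HP_AFTER_SECOND == 11 */
```

One caveat, as far as I can tell: two expansions on the same source line get the same name. Since HP_LIST is expanded on a single line, every HP_ITEM inside it would see the same __LINE__, so this alone does not produce a distinct name per list item.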