5
votes

I've researched code of some library and noticed that calls to calloc there are followed by memset for block allocated by calloc. I've found this question with quite comprehensive answer on differences between calloc and malloc + memset and calling memset for just before allocated storage:

Why malloc+memset is slower than calloc?

What i still can't understand is why one would want to do so. What are benefits of this operations?

The code example from mentioned above library:

light_pcapng_file_info *light_create_default_file_info()
{
    light_pcapng_file_info *default_file_info = calloc(1, sizeof(light_pcapng_file_info));
    memset(default_file_info, 0, sizeof(light_pcapng_file_info));
    default_file_info->major_version = 1;
    return default_file_info;
}

The code of allocated structure(each array contains 32 elements):

typedef struct _light_pcapng_file_info {
    uint16_t major_version;
    uint16_t minor_version;
    char *file_comment;
    size_t file_comment_size;
    char *hardware_desc;
    size_t hardware_desc_size;
    char *os_desc;
    size_t os_desc_size;
    char *user_app_desc;
    size_t user_app_desc_size;
    size_t interface_block_count;
    uint16_t link_types[MAX_SUPPORTED_INTERFACE_BLOCKS];
    double timestamp_resolution[MAX_SUPPORTED_INTERFACE_BLOCKS];

} light_pcapng_file_info;

EDIT:

In addition to accepted answer i'd like to provide some info my colleague pointed me to. There was a bug in glibc which, sometimes, prevented calloc from zeroing out memory. Here's the link: https://bugzilla.redhat.com/show_bug.cgi?id=1293976

Actual bug report text in case link gets moved:

glibc: calloc() returns non-zero'ed memory

Description of problem:

At Facebook we had an app that started hanging and crashing weirdly when going from glibc-2.12-1.149.el6.x86_64 to glibc-2.12-1.163.el6.x86_64. Turns out this patch

glibc-rh1066724.patch

Introduced the problem.

You added the following bit to _int_malloc()

  /* There are no usable arenas.  Fall back to sysmalloc to get a chunk from
     mmap.  */
  if (__glibc_unlikely (av == NULL))
    {
      void *p = sYSMALLOc (nb, av);
      if (p != NULL)
       alloc_perturb (p, bytes);
      return p;
    }

But this isn't ok, alloc_perturb unconditionally memset's the front byte to 0xf, unlike upstream where it checks to see if perturb_byte is set. This needs to be changed to

if (p != NULL && && __builtin_expect(perturb_byte, 0))
   alloc_perturb (p, bytes);
return p;

The patch I've attached fixes the problem for me.

This problem is exacerbated by the fact that any sort of lock contention on the arena's results in us falling back on mmap()'ing a new chunk. This is because we check to see if the uncontended arena we check is corrupt, and if it is we loop through, and if we loop to the beginning we know we didn't find anything. Except if our initial arena isn't actually corrupt we'll still return NULL, so we fall back on this mmap() thing more often, which really makes things unstable.

Please get this fixed as soon as possible, I'd even go so far as to call it a possible security issue.

3
calloc already 0-initializes the memory, so using memset to do it again is pointless (unless you're dealing with a buggy calloc implementation).Sander De Dycker
Perhaps the library author didn't use calloc at first, but malloc? Perhaps the author of the library doesn't really know what calloc is supposed to do? Perhaps there are a myriad of other reasons? The only one who can answer is the original author. All we can do is guess.Some programmer dude
The code was written by some rookie. And that's it... nothing to ponder over.Lundin
Reading other code, to learn from it, is important, but it does have this ugly downside: there's a lot of staggeringly badly written code out there, and this is one little example. There is absolutely no point in calling memset after calloc.Steve Summit

3 Answers

7
votes

Calling memset() ensures that the OS actually does the virtual memory mapping. As noted in the answers on the question you linked, calloc() can be optimized so that the actual memory mapping is deferred.

The application might have reasons to not defer the actual creation of the virtual memory mapping - such as using the buffer to read from a very high-speed device, although in the case of using memset() to zero out the memory, using calloc() instead of malloc() does seem redundant.

3
votes

Nobody is perfect, that's all. Yes, memset to zero following a calloc is extravagant. If you want an explicit memset in order to guarantee you are in possession of the memory that you've asked for, then it should follow a malloc instead.

New code has about 20 bugs per 1,000 lines and the laws of probability imply that not all of them are weeded out. Plus this is not really a bug, as there's no ill-behaviour to be observed.

1
votes

I would call it a bug, since it's doing pointless work, calloc() is specified to return already-cleared memory so why clear it again? Perhaps a failed refactoring, when someone switched from malloc().

If the code is open source and/or in a repository you have access to, I would check the commit history for those lines and see what has been going on. With a bit of luck, there's a commit message that tells the motivation ...