4
votes

I have the following test program:

#include <string.h>
int q(int *p) {
    int x;
    memcpy(&x,p,sizeof(int));
    x+=12;
    memcpy(p,&x,sizeof(int));
    return p[0];
}

When I compile this using GCC 4.7.2 for arm-linux-gnueabihf, the compiler suspects that the pointer accesses may be unaligned, annotating the loads and stores in the assembly output, for example:

    ldr     r0, [r0, #0]    @ unaligned

If I compile with -mno-unaligned-access, the compiler does not emit direct loads and stores at all but calls the library memcpy instead. But in fact, the pointers in this case should never be unaligned. Is this an oversight in gcc, or am I mistaken?

4
Please post the full disassembly of the compiled code. - old_timer
It does a better job with int q(int *restrict p) { p[0]+=12; return p[0];}. mov r3,r0, ldr r0,[r0],add r0,r0,#12,str r0,[r3], blx. Why do you want to code this? It is true that it should be aligned. As others have noted, you are confusing it with memcpy(). At the very least, x = p[0]; seems semi-sane. Why are you using memcpy() as an assignment? The next person will come along and proclaim that *p is a byte-swapped version. Maybe the larger problem will help? - artless noise
The point of using memcpy is to be compatible with strict-aliasing rules. The given int * pointer may be derived from a pointer to another type. - Juho Östman
@Juho Well, that is exactly the reason that gcc is generating memcpy, because if int*p is a char*p in disguise then it may not be aligned at all. So I'm not even sure the compiler could decide at all (apart from whole program analysis) that int*p is aligned. And considering your motivation, gcc is right. - Bryan Olivier
If a pointer is cast to int*, it is a promise to the compiler that the result is properly aligned. The compiler is not required to support unaligned access through int*; gcc, for example, has packed structs for that. - Juho Östman

4 Answers

2
votes

There are few things better optimized in C compilers than loads and stores of int values, which are by design a natural size for the machine.

Write the function as

int q(int *p) {
    return *p += 12;
}

which avoids two calls to a library routine which you are otherwise counting on the optimizer to inline and reduce to a simple load and store, and expresses the intent of modifying the integer-value parameter in-place and returning the result.

Using memcpy to assign integers obfuscates the intent.

If this question is the result of reducing a larger problem to a minimal example, then my implementation may not help directly. But even if the type of p is some_complex_struct * rather than int *, the advice would still apply. The assignment operator works. Use it in preference to memcpy where it makes sense.

2
votes

I think gcc is indeed confused by the int* being cast to a void* in the call to memcpy and assumes the worst for such a pointer. It could have tried to see if the underlying pointer is properly aligned. Did you try higher optimization levels? It may be that at higher levels gcc becomes smarter.

It is also possible that gcc doesn't guarantee alignment of int pointers in all its code, but that would be unwise and is unlikely.

The compiler is allowed to assume correct alignment of int *p because of clause 6.3.2.3, paragraph 7, of the C standard:

A pointer to an object type may be converted to a pointer to a different object type. If the resulting pointer is not correctly aligned 68) for the referenced type, the behavior is undefined.

Note 68) is about transitivity of being correctly aligned.
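
If the goal is to keep memcpy for strict-aliasing reasons while still telling gcc that the pointer is aligned, one possible sketch is to pass the pointer through __builtin_assume_aligned, a GCC builtin available since GCC 4.7. The function name q_aligned is made up for illustration, and whether GCC 4.7.2 for ARM then drops the "@ unaligned" annotation is an assumption I have not verified:

```c
#include <string.h>

/* Same logic as q() from the question, with the alignment promise
   stated explicitly. q_aligned is a hypothetical name. */
int q_aligned(int *p) {
    int x;
    /* __builtin_assume_aligned returns its first argument and lets the
       compiler assume it is aligned to at least the given byte count. */
    int *ap = __builtin_assume_aligned(p, __alignof__(int));
    memcpy(&x, ap, sizeof x);   /* read the int without aliasing concerns */
    x += 12;
    memcpy(ap, &x, sizeof x);   /* write the updated value back */
    return x;
}
```

Passing a pointer that is not actually aligned to this function is undefined behavior, so it only makes sense where the caller can keep the promise.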

1
votes

If your Linux kernel version is earlier than 2.6.28, GCC will emit this warning. -munaligned-access allows memory accesses at unaligned addresses. This requires the kernel of those systems to enable such accesses. Alternatively, if unaligned accesses are not supported, all code has to be compiled with -mno-unaligned-access. Upstream Linux kernel releases have automatically and unconditionally supported unaligned accesses, as emitted by GCC with this option active, since version 2.6.28.

1
votes

This is the solution I came up with, implementing a couple of alternatives for data field access:

// #define USE_MEMCPY
// #define USE_PACKED
#ifdef __cplusplus
template <typename T> void SET(T *__attribute__((may_alias)) p, T val) {
    *p=val;
}
template <typename T> T GET(T *__attribute__((may_alias)) p) {
    return *p;
}
#else
#ifdef USE_MEMCPY
#include <string.h>
#define _SET(p,val,line) \
  ({ typeof(val) _temp_##line = (val); \
       memcpy((void*)(p),(void*)&_temp_##line,sizeof(_temp_##line)); }) 
#define _GET(p,line) \
  ({ typeof(*(p)) _temp_##line; \
       memcpy((void*)&_temp_##line,(void*)(p),sizeof(_temp_##line)); \
       _temp_##line; })

#define SET(p,val) _SET(p,val,__LINE__)
#define GET(p) _GET(p,__LINE__)
#else /* no memcpy */
#ifdef USE_PACKED
#define SET(p,val) (((struct { typeof(val) x __attribute__((packed)); } __attribute__((may_alias))*)p)->x=(val))
#define GET(p) (((struct { typeof(*p) x __attribute__((packed)); } __attribute__((may_alias))*)p)->x)
#else
#define SET(p,val) (*((typeof(val) __attribute__((may_alias))*)p)=(val))
#define GET(p) (*((typeof(*p) __attribute__((may_alias))*)p))
#endif
#endif
#endif

Then I can write the function like this:

int q(int *p) {
    SET(p,GET(p)+12);
    return p[0];
}
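
For readers who find the macros hard to follow, here is a minimal sketch of what the USE_PACKED variant amounts to for a plain int, written as ordinary functions. The names unaligned_int, load_int, store_int and q_packed are made up for illustration; the packed and may_alias attributes are GCC extensions:

```c
/* Hypothetical wrapper type: packed removes the alignment requirement,
   may_alias exempts accesses through it from strict-aliasing rules. */
struct unaligned_int {
    int x;
} __attribute__((packed, may_alias));

/* Read an int from an address that may be unaligned. */
static int load_int(const void *p) {
    return ((const struct unaligned_int *)p)->x;
}

/* Write an int to an address that may be unaligned. */
static void store_int(void *p, int val) {
    ((struct unaligned_int *)p)->x = val;
}

/* Equivalent of q() from the question, built on the helpers above. */
int q_packed(void *p) {
    store_int(p, load_int(p) + 12);
    return load_int(p);
}
```

Because the struct is packed, the compiler has to assume the address may be unaligned, so this form suits data that really can be misaligned. The plain may_alias variant keeps the alignment assumption.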