In this disassembly from VC++ a function call is being made. The compiler MOVs the local pointers to a register before pushing them:
memcpy( nodeNewLocation, pNode, sizeCurrentNode );
0041A5DA 8B 45 F8 mov eax,dword ptr [ebp-8]
0041A5DD 50 push eax
0041A5DE 8B 4D 0C mov ecx,dword ptr [ebp+0Ch]
0041A5E1 51 push ecx
0041A5E2 8B 55 D4 mov edx,dword ptr [ebp-2Ch]
0041A5E5 52 push edx
0041A5E6 E8 67 92 FF FF call 00413852
0041A5EB 83 C4 0C add esp,0Ch
Why not just push them directly? ie
push dword ptr [ebp-8]
Also, if you are going to do a separate push, why not do it manually. In other words, instead of doing "push eax" above, do
mov [esp], eax
Etc. the advantage of this is that after doing the 3 movs you can do a single subtract to set the new stack pointer, instead of implicitly subtracting three times with the pushes.
UPDATE---Release version
This is the same code compiled for release:
; 741 : memcpy( nodeNewLocation, pNode, sizeCurrentNode );
00087 8b 45 f8 mov eax, DWORD PTR _sizeCurrentNode$[ebp]
0008a 8b 7b 04 mov edi, DWORD PTR [ebx+4]
0008d 50 push eax
0008e 56 push esi
0008f 57 push edi
00090 e8 00 00 00 00 call _memcpy
00095 83 c4 0c add esp, 12 ; 0000000cH
Definitely more efficient than the debug version, but it is still doing a MOV/PUSH combo.