TLDR
There may be other factors, but certainly a big contributor is likely to be Delphi's default memory manager. It's designed to be a little wasteful of space in order to reduce how often memory is reallocated.
Considering memory manager overhead
When you have a straight-forward memory manager (you might even call it 'naive'), your loop concatenating strings would actually be more like:
//pseudo-code
for I := 1 to 50000 do
begin
if CanReallocInPlace(Str) then
//Great when True; but this might not always be possible.
ReallocMem(Str, Length(Str) + Length(Txt))
else
begin
AllocMem(NewStr, Length(Str) + Length(Txt))
Copy(Str, NewStr, Length(Str))
FreeMem(Str)
end;
Copy(Txt, NewStr[Length(NewStr)-Length(Txt)], Length(Txt))
end;
Notice that on every iteration you increase the allocation. And if you're unlucky, you very often have to:
- Allocate memory in a new location
- Copy the existing 'string so far'
- Finally release the old string
Delphi (and FastMM)
However, Delphi has switched from the default memory manager used in it's early days to a previously 3rd party one (FastMM) that's designed run faster primarily by:
- (1) Using a sub-allocator i.e. getting memory from the OS a 'large' page at a time.
- Then performing allocations from the page until it runs out.
- And only then getting another page from the OS.
- (2) Aggressively allocating more memory than requested (anticipating small growth).
- Then it becomes more likely the a slightly larger request can be reallocated in-place.
- These techniques can thought it's not guaranteed increase performance.
- But it definitely does waste space. (And with unlucky fragmentation, the wastage can be quite severe.)
Conclusion
Certainly the simple app you wrote to demonstrate the performance greatly benefits from the new memory manager. You run through a loop that incrementally reallocates the string on every iteration. Hopefully with as many in-place allocations as possible.
You could attempt to circumvent some of FastMM's performance improvements by forcing additional allocations in the loop. (Though sub-allocation of pages would still be in effect.)
So simplest would be to try an older Delphi compiler (such as D5) to demonstrate the point.
FWIW: String Builders
You said you "don't want to use the String Builder". However, I'd like to point out that a string builder obtains similar benefits. Specifically (if implemented as intended): a string builder wouldn't need to reallocate the substrings all the time. When it comes time to finally build the string; the correct amount of memory can be allocated in a single step, and all portions of the 'built string' copied to where they belong.