I've figured that calling realloc() too many times won't be so smart.
How have you figured that out? The only way to really know is to measure the performance.
Your strategy may need to differ based on how you are reading the data from the user. If you are using getchar(), you probably don't want to call realloc() to grow the buffer by one char each time you read a character. However, a good realloc() will be far less inefficient than you might expect, even in these circumstances. The minimum block size that glibc will actually give you in response to a malloc() is, I think, 16 bytes, so going from 0 to 16 characters and reallocing each time doesn't involve any copying. Similarly, for larger reallocations a new block might not need to be allocated at all; it may be possible to grow the existing block in place. And don't forget that even at its slowest, realloc() will still be faster than a person can type.
Most people don't go for that strategy, though. What can be typed can also be piped, so the argument that people don't type very fast doesn't necessarily hold. Normally, you introduce the concept of capacity: you allocate a buffer with a certain capacity, and when it gets full you increase that capacity (with realloc()) by adding a new chunk of a certain size. The initial size and the increment can be tuned in various ways. If you are reading user input, you might go for small values, e.g. 256 bytes; if you are reading files off disk or across the network, you might go for larger values, e.g. 4 KB or bigger.
The increment doesn't even need to be constant; you could choose to double the capacity on each reallocation. This is the strategy used by some programming libraries: I believe Java's hash table implementation uses it, and possibly so does Cocoa's array implementation.
It's impossible to know beforehand what the best strategy is in any particular situation. I would pick something that feels right and then, if the application turns out to have performance issues, test and tune it. Your code doesn't have to be the fastest possible, only fast enough.
However, one thing I absolutely would not do is overlay a home-rolled memory allocator on top of the built-in one. If you find yourself maintaining a list of blocks you are not using instead of freeing them, you are doing it wrong. This is what got OpenSSL into trouble.
…realloc at all. Might be the fastest. - Bo Persson