Tracking Memory Usage in PHP

Question

I'm trying to track the memory usage of a script that processes URLs. The basic idea is to check that there's a reasonable buffer before adding another URL to a cURL multi handler. I'm using a 'rolling cURL' concept that processes a URLs data as the multi handler is running. This means I can keep N connections active by adding a new URL from a pool each time an existing URL processes and is removed.

I've used memory_get_usage() with some positive results. Adding the real_usage flag helped (not really clear on the difference between 'system' memory and 'emalloc' memory, but system shows larger numbers). memory_get_usage() does ramp up as URLs are added then down as the URL set is depleted. However, I just exceeded the 32M limit with my last memory check being ~18M.

I poll the memory usage each time cURL multi signals a request has returned. Since multiple requests may return at the same time, there's a chance a bunch of URLs returned data at the same time and actually jumped the memory usage that 14M. However, if memory_get_usage() is accurate, I guess that's what's happening.

[Update: Should have run more tests before asking I guess, increased php's memory limit (but left the 'safe' amount the same in the script) and the memory usage as reported did jump from below my self imposed limit of 25M to over 32M. Then, as expected slowly ramped down as URLs where not added. But I'll leave the question up: Is this the right way to do this?]

Can I trust memory_get_usage() in this way? Are there better alternative methods for getting memory usage (I've seen some scripts parse the output of shell commands)?

StasM StasM · Accepted Answer · 2010-02-18T18:07:14

real_usage works this way:

Zend's memory manager does not use system malloc for every block it needs. Instead, it allocates a big block of system memory (in increments of 256K, can be changed by setting environment variable ZEND_MM_SEG_SIZE) and manages it internally. So, there are two kinds of memory usage:

How much memory the engine took from the OS ("real usage")
How much of this memory was actually used by the application ("internal usage")

Either one of these can be returned by memory_get_usage(). Which one is more useful for you depends on what you are looking into. If you're looking into optimizing your code in specific parts, "internal" might be more useful for you. If you're tracking memory usage globally, "real" would be of more use. memory_limit limits the "real" number, so as soon as all blocks that are permitted by the limit are taken from the system and the memory manager can't allocate a requested block, there the allocation fails. Note that "internal" usage in this case might be less than the limit, but the allocation still could fail because of fragmentation.

Also, if you are using some external memory tracking tool, you can set this environment variable USE_ZEND_ALLOC=0 which would disable the above mechanism and make the engine always use malloc(). This would have much worse performance but allows you to use malloc-tracking tools.

See also an article about this memory manager, it has some code examples too.

Tracking Memory Usage in PHP

4 Answers