I'm trying to track the memory usage of a script that processes URLs. The basic idea is to check that there's a reasonable buffer before adding another URL to a cURL multi handler. I'm using a 'rolling cURL' concept that processes a URL's data as the multi handler is running. This means I can keep N connections active by adding a new URL from a pool each time an existing URL completes and is removed.
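In rough outline the loop looks something like this (a cut-down sketch, not my actual code; names like $urlPool, $maxConnections and MEMORY_SAFE_LIMIT are just placeholders):

```php
<?php
// Placeholder self-imposed ceiling (~25M), not the real script's value.
define('MEMORY_SAFE_LIMIT', 25 * 1024 * 1024);

$urlPool = array('http://example.com/a', 'http://example.com/b' /* ... */);
$maxConnections = 10;
$mh = curl_multi_init();

$addHandle = function ($url) use ($mh) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_multi_add_handle($mh, $ch);
};

// Prime the multi handle with the first N URLs from the pool.
while ($maxConnections-- > 0 && $urlPool) {
    $addHandle(array_shift($urlPool));
}

do {
    curl_multi_exec($mh, $running);
    curl_multi_select($mh);

    // Handle every request that completed since the last pass.
    while ($info = curl_multi_info_read($mh)) {
        $ch   = $info['handle'];
        $data = curl_multi_getcontent($ch);
        // ... process $data, then let it go out of scope ...

        curl_multi_remove_handle($mh, $ch);
        curl_close($ch);

        // Only add a replacement URL if there's still headroom
        // under the self-imposed memory ceiling.
        if ($urlPool && memory_get_usage(true) < MEMORY_SAFE_LIMIT) {
            $addHandle(array_shift($urlPool));
        }
    }

    // If every transfer finished while we were over the ceiling, nothing
    // else will free memory, so top up anyway rather than stall.
    if (!$running && $urlPool) {
        $addHandle(array_shift($urlPool));
        $running = 1;
    }
} while ($running);

curl_multi_close($mh);
```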
I've used memory_get_usage() with some positive results. Adding the real_usage flag helped (I'm not really clear on the difference between 'system' memory and 'emalloc' memory, but system shows larger numbers). memory_get_usage() does ramp up as URLs are added and then back down as the URL set is depleted. However, I just exceeded the 32M limit with my last memory check being ~18M.
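For reference, the check itself is just logging both values, roughly like this (the comment reflects my current understanding of the flag):

```php
<?php
// As I understand it: the default (false) reports what PHP's emalloc has
// handed out to the script, while real_usage = true reports the larger
// chunks the engine has actually reserved from the system.
printf(
    "emalloc: %.2f MB, system: %.2f MB, peak: %.2f MB\n",
    memory_get_usage(false) / 1048576,
    memory_get_usage(true) / 1048576,
    memory_get_peak_usage(true) / 1048576
);
```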
I poll the memory usage each time cURL multi signals that a request has returned. Since multiple requests may return at the same time, there's a chance a bunch of URLs returned data simultaneously and actually jumped the memory usage by that ~14M. However, if memory_get_usage() is accurate, I guess that's what's happening.
[Update: I should have run more tests before asking, I guess. I increased PHP's memory limit (but left the 'safe' amount the same in the script), and the reported memory usage did jump from below my self-imposed limit of 25M to over 32M. Then, as expected, it slowly ramped back down as URLs were not added. But I'll leave the question up: is this the right way to do this?]
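One thing I'm considering, sketch only, is deriving the script's 'safe' amount from memory_limit itself rather than hardcoding it, so the two can't drift apart like they did here (parseBytes() is just an illustrative helper, not a built-in):

```php
<?php
// Derive the self-imposed ceiling from php.ini instead of hardcoding it,
// so raising memory_limit automatically raises the safety buffer too.
function parseBytes($limit) {
    if ((int) $limit === -1) {
        return PHP_INT_MAX; // no limit configured
    }
    $value = (int) $limit;
    switch (strtoupper(substr($limit, -1))) {
        case 'G': $value *= 1024; // fall through
        case 'M': $value *= 1024;
        case 'K': $value *= 1024;
    }
    return $value;
}

// e.g. with memory_limit = 32M this gives a ceiling of roughly 25.6M.
$safeLimit = (int) (parseBytes(ini_get('memory_limit')) * 0.8);
```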
Can I trust memory_get_usage() in this way? Are there better alternative methods for getting memory usage (I've seen some scripts parse the output of shell commands)?
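For completeness, the shell-parsing approach I've seen looks roughly like this. It reports the whole process's resident memory (RSS), which includes everything PHP and cURL hold, so the numbers won't line up with memory_get_usage(), and the /proc branch is Linux-only:

```php
<?php
// Resident set size of the current process, in kilobytes.
function processMemoryKb() {
    $status = @file_get_contents('/proc/self/status');
    if ($status && preg_match('/^VmRSS:\s+(\d+)\s+kB/m', $status, $m)) {
        return (int) $m[1];
    }
    // Fallback: ask ps for the resident set size in kilobytes.
    return (int) shell_exec('ps -o rss= -p ' . getmypid());
}
```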