I'm running a query to split Splunk results into buckets: I want to group and count files based on the size they occupy on disk. This can be achieved using rangemap or eval case. As I read here, using eval is faster than rangemap, but I'm getting different results from the two approaches.
This is the query I'm running with eval:
<source>
| eval size_group = case(
    SizeInMB < 150, "0-150 MB",
    SizeInMB >= 150 AND SizeInMB < 200, "150-200 MB",
    SizeInMB >= 200 AND SizeInMB < 300, "200-300 MB",
    SizeInMB >= 300 AND SizeInMB < 500, "300-500 MB",
    SizeInMB >= 500 AND SizeInMB < 1000, "500-1000 MB",
    SizeInMB > 1000, ">1000 MB")
| stats count by size_group
</source>
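To make sure I'm reading my own case() correctly, here is a rough Python sketch of the bucketing I think it performs (my approximation, not Splunk internals; the first matching branch wins, like case()):

```python
from collections import Counter

def size_group(size_mb):
    """Mimic the eval case() buckets: first matching condition wins."""
    if size_mb < 150:
        return "0-150 MB"
    if 150 <= size_mb < 200:
        return "150-200 MB"
    if 200 <= size_mb < 300:
        return "200-300 MB"
    if 300 <= size_mb < 500:
        return "300-500 MB"
    if 500 <= size_mb < 1000:
        return "500-1000 MB"
    if size_mb > 1000:
        return ">1000 MB"
    # A size of exactly 1000 matches no branch, just like the case() above,
    # so it would get a null size_group and drop out of "stats count by".
    return None

sizes = [120, 150, 200, 450, 1000, 2048]
print(Counter(size_group(s) for s in sizes))
```

If this reading is right, one thing I already notice is that a file of exactly 1000 MB would be silently dropped by the eval version.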
and this is the result I'm getting (see first screenshot).

Whereas with rangemap, this is the query:
<source>
| rangemap field=SizeInMB "0-150MB"=0-150 "151-200MB"=150-200 "201-300MB"=200-300 "301-500MB"=300-500 "501-999MB"=500-1000 default="1000MB+"
| stats count by range
</source>
I tried these ranges too:

<source>
| rangemap field=SizeInMB "0-150MB"=0-150 "150-200MB"=150-200 "200-300MB"=200-300 "300-500MB"=300-500 "500-1000MB"=500-1000 default="1000MB+"
</source>

and I get the same result (see second screenshot).
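My understanding (which may be wrong, and is part of what I'm asking) is that rangemap ranges are inclusive on both ends and the first matching range wins. If so, a boundary value like exactly 200 would land in a different bucket than in the case() version. A Python sketch of that reading, for comparison:

```python
def rangemap_bucket(size_mb):
    """Approximate rangemap as I understand it: each range is inclusive
    on BOTH ends, the first matching range wins, otherwise the default.
    (This inclusiveness is my assumption, not confirmed from Splunk docs.)"""
    ranges = [
        ("0-150MB", 0, 150),
        ("150-200MB", 150, 200),
        ("200-300MB", 200, 300),
        ("300-500MB", 300, 500),
        ("500-1000MB", 500, 1000),
    ]
    for label, lo, hi in ranges:
        if lo <= size_mb <= hi:
            return label
    return "1000MB+"  # default

# Under this reading, exactly 200 falls into "150-200MB" here,
# whereas the eval case() query puts it into "200-300 MB".
print(rangemap_bucket(200))
```

If that assumption holds, it would explain why the boundary buckets differ by a handful of events, but I'd like confirmation.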
There is not a huge difference between the results in the two screenshots, and we could probably live with it, but for the range 150-200 MB it is 445958 vs 445961, for 200-300 MB it is 3676 vs 3677, and for 300-500 MB it is 3346 vs 3348. I want to understand why there is a difference, and which one I should trust more. Speed-wise eval seems better, but data-wise is it less correct?