run subquery for each row of csv file passing the field in search string

Question

I want to run a splunk query for all the values in the csv file and replace the value with the field in the csv file. I've imported the file into splunk as input loookup table and able to view the fields using inputlookup query but I want to run that with all the sub queries where I'm fetching maximum count per hour, per day, per week and per month basis

input file is ids.csv which has around 800 rows and its just one column, liek below:

query that im using:

| inputlookup ids.csv | fields ids as id |  [search index="abc" id "search string here" |bin _time span="1hour" | stats count as maxHour by _time | sort - count | head 1]  |appendcols[search
 index="abc" id "search string here" |bin _time span="1day" | stats count as maxDay by _time | sort - count |head 1 ]|appendcols[search
 index="abc" id "search string here" |bin _time span="1week" | stats count as maxWeek by _time | sort - count | head 1 ]|appendcols[search
 index="abc" id "search string here" |bin _time span="1month" | stats count as maxMonth by _time | sort - count | head 1]

Im not getting the expected results for this, Im expecting a tabular format where i get the count for each time range with the specific id by passing id field in the search subquery.

How can I solve this?

Thanks

Charlie Charlie · Accepted Answer · 2019-01-27T19:29:32

Without an active instance in front of me to play with I feel like a much more efficient approach might look like:

index="abc" [inputlookup ids.csv | fields ids as id]
| bin _time span=1h | stats count by id _time
| bin _time span=1d | stats max(count) as countHour sum(count) as countDay by id _time
| multireport 
   [ noop ]
   [ bin _time span=1week | stats sum(countDay) as countWeek by id _time ]
   [ bin _time span=1mon | stats sum(countDay) as countMonth by id _time ]
| stats max(count*) as max* by id

First we fetch all data for each id in the lookup, and using stats we get to the maximum count per hour in every day, and the count per day using bin and stats.

Next we use the (undocumented) multireport command to transform our result set from daily statistics, to add on have weekly and monthly statistics as well. Finally we take the max of each counts to wind up with the max count of an hour, day, week and month by each id in the inputlookup.

Now since multireport is undocumented, and prior uses of it have proven somewhat problematic at times... I would actually propose using eventstats instead like so:

index="abc" [inputlookup ids.csv | fields ids as id]
| bin _time span=1h | stats count by id _time
| bin _time span=1d | stats max(count) as countHour sum(count) as countDay by id _time
| bin _time as week span=1week | eventstats sum(countDay) as countWeek by id week
| bin _time as month span=1month | eventstats sum(countDay) as countMonth by id month
| stats max(count*) as max* by id

Similar to before, we get the daily statistics, but then we can use the as syntax of the bin command to calculate the week and month for each day, and tack on a column for the weekly and monthly counts for every day. Then it's the same max function as previously.

run subquery for each row of csv file passing the field in search string

1 Answers