1
votes

Consider the following two Splunk searches:

index=a | join type=inner MyKey [search index=b]

and:

index=a | join type=inner MyKey [search index=b | where MyVal > 0]

Remarkably, the latter of the searches - the search whose subsearch has a constraint - has three times as many result rows as the former.

The Splunk documentation page for the join command suggests semantics that are close enough for the sake of argument to SQL's join:

A join is used to combine the results of a search and subsearch if specified fields are common to each. You can also join a table to itself using the selfjoin command.

This snippet is relevant to the type=inner argument:

A join is used to combine the results of a search and subsearch if specified fields are common to each. You can also join a table to itself using the selfjoin command.

Based on this information, I assume the two Splunk searches above should be equivalent to the following SQL, respectively:

SELECT *
FROM a
INNER JOIN b ON a.MyKey = b.MyKey

and:

SELECT *
FROM a
INNER JOIN b ON a.MyKey = b.MyKey
WHERE b.MyVal > 0

How is it possible that adding a constraint increases the number of result rows?

Interestingly, the following Splunk search produces a third result - one that matches what I got when I put the same data in an SQL database:

index=a | join type=outer MyKey [search index=b | eval hasmatch=1]
| where hasmatch=1

Some more notes:

  • the MyVal field has no duplicates in either table / index
  • I have verified that the raw events in Splunk's indexes match the raw source data in event counts and values for MyVal
  • the only search-time operations configured for the relevant sourcetypes in props.conf is a report to extract the fields based on a stanza in transforms.conf (the source data is in a CSV dialect)

Can anyone give me some clues here? As far as I'm concerned this behaviour is nonsensical.

1

1 Answers

0
votes

I would suggest that you re-post this question over at the official Splunk forums (splunk-base.splunk.com). There are some very useful and experience "Splunkers" there (Employees, customers, partners, etc.) and it may be a better place for your question.

Cheers,

MHibbin