I have two different systems pushing data into a kdb table. These are driven by a feed of prices. I'd like to compare the values produced so I can ultimately flag large differences. I'm very new to kdb though and finding it hard to even work out a starting point query.
Ultimately I'd like to take a time period (likely a minute,) find a row for each system within this time period where the driving price is the same and compare the derived values.
A good starting point for me though would be to understand how to grab the first row within a timeperiod for each system and compare/join..
Thanks.
Simplifed example data
Example: -
System | Time | driver | result1 | result2
systemA.instrument1| 11:59:59| 101.4 | 3.4 | 4.6
systemA.instrument1| 12:00:01| 101.5 | 3.8 | 4.8
systemA.instrument1| 12:00:02| 101.6 | 3.3 | 2.3
systemA.instrument2| 12:00:02| 106.6 | 11.1 | 11.3
systemA.instrument1| 12:00:05| 101.7 | 3.9 | 5.6
systemB.instrument1| 12:00:09| 101.1 | 3.2 | 7.8
systemB.instrument1| 12:00:14| 101.2 | 3.9 | 3.4
systemB.instrument1| 12:00:17| 101.3 | 3.1 | 8.9
systemB.instrument2| 12:00:19| 106.5 | 11.2 | 11.4
systemB.instrument1| 12:00:58| 101.7 | 3.9 | 9.3
systemB.instrument1| 12:00:59| 101.7 | 3.3 | 3.4
systemB.instrument1| 12:01:03| 101.4 | 3.1 | 5.6
I only want data from 12:00:00 - 12:00:59
The only matching driver between SystemA and SystemB instrument1 is 101.7. I'd like either to be used and the diff between the results shown. For instrument2 the driver never matches so I want to use the driver prices that are closest between the systems.
results | driver | driver diff | result1diff | result2diff
instrument1 | 101.7 | 0 | 0 | 3.7
instrument2 | | 0.1 | 0.1 | 0.1