1
votes

I want to grab data from a KDB data base for a list of roughly 200 days within the last two years. The 200 days are in no particular pattern.

I only need the data from 09:29:00.000 to 09:31:00.000 everyday.

My first approach was to query all of the last two years data that have time stamp between 09:29:00.000 and 09:31:00.000, because I didn't see a way to just query the particular 200 days that I need. However this proved to be too much for my server to handle.

Then I tried to summarize the 2 minute data for each date into an average and just print out the average, so now I will only have 200 rows of data as output. But somehow this still turns out to be too much. I'm not sure if this is because I'm not selecting the data correctly.

My other suspicion is that the query is garbing all the data first then averaging each date, which means averaging is not making it easier to handle. Here's the code that I have:

    select maxPriceB:max(price),  minPriceB:min(price), avgPriceB:avg(price), avgSizeB:avg(qty) by date from dms where date within(2015.01.01, 2016.06,10), time within(09:29:00.000, 09:31:00.000), sym = `ZF6

poms is the table that the data is in ZFU6 is the symbol that im looking for I tried adding the key word distinct after select. I want to know if there's anyway to break up the query, or make the query lighter for the server to handle. Thank you!

1
What are the indexes? If this was a different RDBMS I would have used clustered index in (date, time) tuple. - wilx
@wilx Sorry I'm quite new to KDB and database in general. What exactly are indexes? - casualprogrammer
Most likely the "index" (attribute) is on the sym column. So the order of your where clause should be where date within (), sym=`ZFU6, time within () - terrylynch

1 Answers

1
votes

If you use 32-bit kdb+ and get infamous 'wsfull error then you may try processing one day at a time like this:

raze{select maxPriceB:max(price), minPriceB:min(price), avgPriceB:avg(price), avgSizeB:avg(qty)
     from dms where date=x,sym=`ZF6,time within 09:29:00.000 09:31:00.000}each 2015.01.01+1+til 2016.06.10-2015.01.01