0
votes

I have a query similar to this, where I need to find the number of transactions a specific customer had within a time frame:

select customer_id, count(transactions)
from transactions
where customer_id = 'FKJ90838485'
and purchase_date between '01-JAN-13' and '31-AUG-13'
group by customer_id

The table transactions is not indexed on customer_id but rather another field called transaction_id. Customer_ID is character type while transaction_id is numeric.

'accounting_month' field is also indexed.. this field just stores the month that transactions occured... ie, purchase_date = '03-MAR-13' would have accounting_month = '01-MAR-13'

The transactions table has about 20 million records in the time frame from '01-JAN-13' and '31-AUG-13'

When I run the above query, it has taken more than 40 minutes to come back, any ideas or tips?

3
Are you able to add a covering index? - Andrew Lewis
This query - as it is - would give error. Add the actual query if you want a real answer. Although I'd say any answer would involve adding an index - which you can't do. - ypercubeᵀᴹ
Best index to add would probably be a composite on (customer_id, purchase_date). But as ypercube said, your query gives an error...you have an aggregate function, but no 'group by' clause...and why 'count(transactions)' instead of 'count(*)'? Is there even a 'transactions' column? - runrig
Contact the Database administrator. Request that he add an index on CustomerId. - Charles Bretana
Please do not rely on implicit data type conversion. '01-JAN-13' is a string literal and not a date. This will fail if run from a computer with different NLS settings. - a_horse_with_no_name

3 Answers

4
votes

As others have already commented, the best is to add an index that will cover the query, So:

  • Contact the Database administrator and request that they add an index on (customer_id, purchase_date) because the query is doing a table scan otherwise.

Sidenotes:

  • Use date and not string literals (you may know that and do it already, still noted here for future readers)
  • You don't have to put the customer_id in the SELECT list and if you remove it from there, it can be removed from the GROUP BY as well so the query becomes:

    select count(*) as number_of_transactions
    from transactions
    where customer_id = 'FKJ90838485'
      and purchase_date between DATE '2013-01-01' and DATE '2013-08-31' ;
    
  • If you don't have a WHERE condition on customer_id, you can have it in the GROUP BY and the SELECT list to write a query that will count number of transactions for every customer. And the above suggested index will help this, too:

    select customer_id, count(*) as number_of_transactions
    from transactions
    where purchase_date between DATE '2013-01-01' and DATE '2013-08-31' 
    group by customer_id  ;
    
0
votes

This is just an idea that came up to me. It might work, try running it and see if it is an improvement over what you currently have.

I'm trying to use the transaction_id, which you've said is indexed, as much as possible.

WITH min_transaction (tran_id)
AS (
   SELECT MIN(transaction_ID)
   FROM TRANSACTIONS
   WHERE
      CUSTOMER_ID = 'FKJ90838485'
      AND purchase_date >= '01-JAN-13'
   ), max_transaction (tran_id)
AS (
   SELECT MAX(transaction_ID)
   FROM TRANSACTIONS
   WHERE 
      CUSTOMER_ID = 'FKJ90838485'
      AND purchase_date <= '31-AUG-13'
   )
SELECT customer_id, count(transaction_id)
FROM transactions
WHERE
   transaction_id BETWEEN min_transaction.tran_id AND max_transaction.tran_id
GROUP BY customer_ID
0
votes

May be this will run faster since it look at the transaction_id for the range instead of the purchase_date. I also take in consideration that accounting_month is indexed :

select customer_id, count(*)
from transactions
where customer_id = 'FKJ90838485'
and transaction_id between (select min(transaction_id)
                            from transactions
                           where accounting_month = '01-JAN-13' 
                           )  and
                           (select max(transaction_id)
                            from transactions
                           where accounting_month = '01-AUG-13' 
                           ) 
group by customer_id 

May be you can also try :

select customer_id, count(*)
from transactions
where customer_id = 'FKJ90838485'
and accounting_month between '01-JAN-13' and '01-AUG-13'
group by customer_id