4
votes

I have two tables.

An orders table with customer, and date. A date dimension table from a data warehouse.

The orders table does not contain activity for every date in a given month, but I need to return a result set that fills in the gaps with date and customer.

For Example, I need this:

Customer   Date
===============================
Cust1       1/15/2012   
Cust1       1/18/2012
Cust2       1/5/2012
Cust2       1/8/2012

To look like this:

Customer   Date
============================
Cust1       1/15/2012   
Cust1       1/16/2012   
Cust1       1/17/2012       
Cust1       1/18/2012
Cust2       1/5/2012
Cust2       1/6/2012
Cust2       1/7/2012
Cust2       1/8/2012

This seems like a left outer join, but it is not returning the expected results. Here is what I am using, but this is not returning every date from the date table as expected.

SELECT o.customer, 
       d.fulldate
FROM   datetable d 
       LEFT OUTER JOIN orders o 
                    ON d.fulldate = o.orderdate 
WHERE  d.calendaryear IN ( 2012 ); 
4
Is the gap filling per customer? so if Cust1 and Cust2 have a gap on the same days, do we fill per customer? Your sample data does not show this - gbn
The output has the customer, but the date that is showing up is only the dates found in the orders table. Where i expect to see the all dates from the date table - user1134307
Dates in the date table that have no order date should show up customer null in that query. - paparazzo

4 Answers

6
votes

The problem is that you need all customers for all dates. When you do the left outer join, you are getting NULL for the customer field.

The following sets up a driver table by cross joining the customer names and dates:

SELECT driver.customer, driver.fulldate, o.amount 
FROM   (select d.fulldate, customer
        from datetable d cross join
             (select customer
              from orders
              where year(orderdate) in (2012)
             ) o
        where d.calendaryear IN ( 2012 )
       ) driver LEFT OUTER JOIN
       orders o 
       ON driver.fulldate = o.orderdate and
          driver.customer = o.customer;

Note that this version assumes that calendaryear is the same as year(orderdate).

6
votes

You can use recursive CTE to get all dates between two dates without need for datetable:

;WITH CTE_MinMax AS
(
    SELECT Customer, MIN(DATE) AS MinDate, MAX(DATE) AS MaxDate
    FROM dbo.orders
    GROUP BY Customer
)
,CTE_Dates AS
(
    SELECT Customer, MinDate AS Date
    FROM CTE_MinMax
    UNION ALL
    SELECT c.Customer, DATEADD(DD,1,Date) FROM CTE_Dates c
    INNER JOIN CTE_MinMax mm ON c.Customer = mm.Customer
    WHERE DATEADD(DD,1,Date) <= mm.MaxDate
)
SELECT c.* , COALESCE(o.Amount, 0)
FROM CTE_Dates c
LEFT JOIN Orders o ON c.Customer = o.Customer AND c.Date = o.Date
ORDER BY Customer, Date
OPTION (MAXRECURSION 0)

SQLFiddle DEMO

1
votes

Here is a simple way to do it:

SELECT  A.Customer,
        B.fulldate [Date],
        ISNULL(C.Amount,0) Amount
FROM (  SELECT  Customer, 
                MIN([Date]) MinDate,
                MAX([Date]) MaxDate
        FROM Orders
        GROUP BY Customer) A
LEFT JOIN DateTable B
    ON B.fulldate BETWEEN A.MinDate AND A.MaxDate
LEFT JOIN Orders C
    ON A.Customer = C.Customer 
    AND B.fulldate = C.[Date]
1
votes

Assuming that datetable includes every date of the year you can do with one simple CTE

WITH OrdersCustomerDateBorders AS
(
    SELECT CustomerID, MIN(fulldate) AS FirstOrderDate, MAX(fulldate) AS LastOrderDate
    FROM orders
    GROUP BY customer
)
select o.customer, d.fulldate, ISNULL(o.amount, 0) AS Amount
from orders o
INNER JOIN OrdersCustomerDateBorders OCDB ON OCDB.CustomerID = o.CustomerID
INNER JOIN datetable d ON  ON d.fulldate between OCDB.FirstOrderDate AND OCDB.LastOrderDate
WHERE d.calendaryear in (2012);