6
votes
SiteVisitID     siteName        visitDate
------------------------------------------------------    
   1            site1           01/03/2014
   2            Site2           01/03/2014
   3            site1           02/03/2014
   4            site1           03/03/2014
   5            site2           03/03/2014
   6            site1           04/03/2014
   7            site2           04/03/2014
   8            site2           05/03/2014
   9            site1           06/03/2014
  10            site2           06/03/2014
  11            site1           08/03/2014
  12            site2           08/03/2014
  13            site1           09/03/2014
  14            site2           10/03/2014

There are two sites and each need to have a visit entry for everyday of the month, so considering that today is 11/03/2014 we are expecting 22 entries but there are only 14 entries so missing 8, is there any way in sql we could pull out missing date entries

Up to the current day of the month against sites

siteName    missingDate
-----------------------    
site2       02/03/2014
site1       05/03/2014
site1       07/03/2014
site2       07/03/2014
site2       09/03/2014
site1       10/03/2014
site1       11/03/2014
site2       11/03/2014

Here is my unsuccessful attempt I believe is wrong both logically and syntactically

select 
    siteName, visitDate  
from  
    SiteVisit not in (SELECT siteName, visitDate
                      FROM SiteVisit 
                      WHERE Day(visitDate) != Day(CURRENT_TIMESTAMP) 
                        AND Month(visitDate) = Month(CURRENT_TIMESTAMP))

Note: the above data and columns are simplified version of the actual table

4
If you had a Numbers table (a table with just a single column with values from 0 to a large-ish number), you could use it to generate a list of dates from 1900 to now, and use an "anti-join" to find the missing entries for each site. - Cᴏʀʏ
we only need to generate dates for the current month where no visitdate is provided for each site - GROVER_SYAAN
@GROVER_SYAAN - The point is that the suggested table is the generically useful, and doesn't need rebuilding next month. Instead you just filter which part of the table you need in a WHERE clause... - MatBailie

4 Answers

6
votes

I would recommend you use a table valued function to get you all days in between 2 selected dates as a table (Try it out in this fiddle):

CREATE FUNCTION dbo.GetAllDaysInBetween(@FirstDay DATETIME, @LastDay DATETIME)
RETURNS @retDays TABLE 
(
    DayInBetween DATETIME
)
AS 
BEGIN
    DECLARE @currentDay DATETIME
    SELECT @currentDay = @FirstDay

    WHILE @currentDay <= @LastDay
    BEGIN

        INSERT @retDays (DayInBetween)
            SELECT @currentDay

        SELECT @currentDay = DATEADD(DAY, 1, @currentDay)
    END 

    RETURN
END

(I include a simple table setup for easy copypaste-tests)

CREATE TABLE SiteVisit (ID INT PRIMARY KEY IDENTITY(1,1), visitDate DATETIME, visitSite NVARCHAR(512))

INSERT INTO SiteVisit (visitDate, visitSite)
    SELECT '2014-03-11', 'site1'
    UNION
    SELECT '2014-03-12', 'site1'
    UNION
    SELECT '2014-03-15', 'site1'
    UNION
    SELECT '2014-03-18', 'site1'
    UNION
    SELECT '2014-03-18', 'site2'

now you can simply check what days no visit occured when you know the "boundary days" such as this:

SELECT
        DayInBetween AS missingDate,
        'site1' AS visitSite
    FROM dbo.GetAllDaysInBetween('2014-03-11', '2014-03-18') AS AllDaysInBetween
    WHERE NOT EXISTS 
        (SELECT ID FROM SiteVisit WHERE visitDate = AllDaysInBetween.DayInBetween AND visitSite = 'site1')

Or if you like to know all days where any site was not visited you could use this query:

SELECT
        DayInBetween AS missingDate,
        Sites.visitSite
    FROM dbo.GetAllDaysInBetween('2014-03-11', '2014-03-18') AS AllDaysInBetween
    CROSS JOIN (SELECT DISTINCT visitSite FROM SiteVisit) AS Sites
    WHERE NOT EXISTS
        (SELECT ID FROM SiteVisit WHERE visitDate = AllDaysInBetween.DayInBetween AND visitSite = Sites.visitSite)
    ORDER BY visitSite

Just on a side note: it seems you have some duplication in your table (not normalized) siteName should really go into a separate table and only be referenced from SiteVisit

4
votes

Maybe you can use this as a starting point:

-- Recursive CTE to generate a list of dates for the month
-- You'll probably want to play with making this dynamic
WITH Dates AS
(
    SELECT 
        CAST('2014-03-01' AS datetime) visitDate
    UNION ALL
    SELECT 
        visitDate + 1
    FROM 
        Dates
    WHERE
        visitDate + 1 < '2014-04-01'
)
-- Build a list of siteNames with each possible date in the month
, SiteDates AS 
(
    SELECT 
        s.siteName, d.visitDate 
    FROM 
        SiteVisitEntry s
    CROSS APPLY 
        Dates d
    GROUP BY 
        s.siteName, d.visitDate
)
-- Use a left anti-semi join to find the missing dates
SELECT
    d.siteName, d.visitDate AS missingDate
FROM
    SiteDates d
LEFT JOIN
    SiteVisitEntry e /* Plug your actual table name here */
    ON
    d.visitDate = e.visitDate
    AND
    d.siteName = e.siteName
WHERE
    e.visitDate IS NULL
OPTION 
    (MAXRECURSION 0)
2
votes

This becomes much simpler if you have a 'calendar table'. That is, a table that holds every date of possible interest to you.

The reason for this is that SQL can't report on what's not there. You can write SQL to create the dates that should belong in the gaps (such as in another answer here), then report on the dates created. Or you could just have them in a persisted 'calendar table' and just use that, making the SQL simpler and more often then not, significantly quicker.

In a similar vain, I assume that you have a dimension table that contains all of the sites which can have visits? If so, then you get something like this...

SELECT
  *
FROM
  site
CROSS JOIN
  calendar
LEFT JOIN
  visit
    ON  visit.site_id   = site.id
    AND visit.visitDate = calendar.date
WHERE
      calendar.date   >= DATEADD(month, DATEDIFF(month, 0, GETDATE()), 0) 
  AND calendar.date   <                                    GETDATE()
  AND visit.visitDate IS NULL
0
votes

Here I am implementing a user-defined function genrate.

CREATE FUNCTION genrate(@FROM_date date,@to_date date) 
RETURNS @tab table(missing date) 
AS 
BEGIN 
    ;WITH cte AS
      ( 
        SELECT @FROM_date missing
        UNION ALL 
        SELECT DATEADD(DAY,1,missing)
        FROM cte
        WHERE missing <@to_date 
      )

    INSERT INTO @tab
    SELECT *
    FROM cte
    WHERE missing NOT IN (@FROM_date, @to_date) 
    RETURN 
END

Using the function with the data set we are going to do the cross apply to generate the missing along with date set.

SELECT user_No,
     product_name,
     amount,
     b.missing AS Missing_date
FROM date_tab t 
CROSS APPLY DBO.genrate(T.ORDER_DATE,T.DELIVERY) b