3
votes

I'm having a heck of a time putting together a query that I thought would be quite simple. I have a table that records total hours spent on a task and the user that reported those hours. I need to put together a query that returns how many hours a given user charged to each week of the year (including weeks where no hours were charged).

Expected Output:

    |USER_ID | START_DATE | END_DATE  | HOURS |
    -------------------------------------------
    |'JIM'   | 4/28/2019  | 5/4/2019  |   6   |    
    |'JIM'   | 5/5/2019   | 5/11/2019 |   0   |
    |'JIM'   | 5/12/2019  | 5/18/2019 |   16  |

I have a function that returns the start and end date of the week for each day, so I used that and joined it to the task table by date and summed up the hours. This gets me very close, but since I'm joining on date I obviously end up with NULL for the USER_ID on all zero hour rows.

Current Output:

|USER_ID | START_DATE | END_DATE  | HOURS |
-------------------------------------------
|'JIM'   | 4/28/2019  | 5/4/2019  |   6   |    
| NULL   | 5/5/2019   | 5/11/2019 |   0   |
|'JIM'   | 5/12/2019  | 5/18/2019 |   16  |

I've tried a few other approaches, but each time I end up hitting the same problem. Any ideas?

Schema:

---------------------------------
|          TASK_LOG             |
---------------------------------    
|USER_ID | DATE_ENTERED | HOURS |
-------------------------------
|'JIM'   | 4/28/2019    |   6   |    
|'JIM'   | 5/12/2019    |   6   |
|'JIM'   | 5/13/2019    |   10  |

------------------------------------
|       DATE_HELPER_TABLE           |
|(This is actually a function, but I| 
| put it in a table to simplify)    | 
-------------------------------------    
|DATE | START_OF_WEEK | END_OF_WEEK |
-------------------------------------
|5/3/2019 | 4/28/2019  |  5/4/2019  |    
|5/4/2019 | 4/28/2019  |  5/4/2019  |
|5/5/2019 | 5/5/2019   |  5/11/2019 |
| ETC ...                           |

Query:

 SELECT HRS.USER_ID
        ,DHT.START_OF_WEEK
        ,DHT.END_OF_WEEK
        ,SUM(HOURS)
    FROM DATE_HELPER_TABLE DHT
    LEFT JOIN (
        SELECT TL.USER_ID
            ,TL.HOURS
            ,DHT2.START_OF_WEEK
            ,DHT2.END_OF_WEEK
        FROM TASK_LOG TL
        JOIN DATE_HELPER_TABLE DHT2 ON DHT2.DATE_VALUE = TL.DATE_ENTERED
        WHERE TL.USER_ID = 'JIM1'
        ) HRS ON HRS.START_OF_WEEK = DHT.START_OF_WEEK
    GROUP BY USER_ID
        ,DHT.START_OF_WEEK
        ,DHT.END_OF_WEEK
    ORDER BY DHT.START_OF_WEEK

http://sqlfiddle.com/#!18/02d43/3 (note: for this sql fiddle, I converted my date helper function into a table to simplify)

3

3 Answers

3
votes

Cross join the users (in question) and include them in the join condition. Use coalesce() to get 0 instead of NULL for the hours of weeks where no work was done.

SELECT u.user_id,
       dht.start_of_week,
       dht.end_of_week,
       coalesce(sum(hrs.hours), 0)
       FROM date_helper_table dht
            CROSS JOIN (VALUES ('JIM1')) u (user_id)
            LEFT JOIN (SELECT tl.user_id,
                              dht2.start_of_week,
                              tl.hours
                              FROM task_log tl
                                   INNER JOIN date_helper_table dht2
                                              ON dht2.date_value = tl.date_entered) hrs
                      ON hrs.user_id = u.user_id
                         AND hrs.start_of_week = dht.start_of_week
       GROUP BY u.user_id,
                dht.start_of_week,
                dht.end_of_week
       ORDER BY dht.start_of_week;

I used a VALUES clause here to list the users. If you only want to get the times for particular users you can do so too (or use any other subquery, or ...). Otherwise you can use your user table (which you didn't post, so I had to use that substitute).

However the figures that are produced by this (and your original query) look strange to me. In the fiddle your user has worked for a total of 23 hours in the task_log table. Yet your sums in the result are 24 and 80, that is way to much on its own and even worse taking into account, that 1 hour in task_log isn't even on a date listed in date_helper_table.

I suspect you get more accurate figures if you just join task_log, not that weird derived table.

SELECT u.user_id,
       dht.start_of_week,
       dht.end_of_week,
       coalesce(sum(tl.hours), 0)
       FROM date_helper_table dht
            CROSS JOIN (VALUES ('JIM1')) u (user_id)
            LEFT JOIN task_log tl
                      ON tl.user_id = u.user_id
                         AND tl.date_entered = dht.date_value
       GROUP BY u.user_id,
                dht.start_of_week,
                dht.end_of_week
       ORDER BY dht.start_of_week;

But maybe that's just me.

SQL Fiddle

1
votes

http://sqlfiddle.com/#!18/02d43/65

Using your SQL fiddle, I simply updated the select statement to account for and convert null values. As far as I can tell, there is nothing in your post that makes this option not viable. Please let me know if this is not the case and I will update. (This is not intended to detract from sticky bit's answer, but to offer an alternative)

SELECT ISNULL(HRS.USER_ID, '') as [USER_ID]
    ,DHT.START_OF_WEEK
    ,DHT.END_OF_WEEK
    ,SUM(ISNULL(HOURS,0)) as [SUM]
FROM DATE_HELPER_TABLE DHT
LEFT JOIN (
    SELECT TL.USER_ID
        ,TL.HOURS
        ,DHT2.START_OF_WEEK
        ,DHT2.END_OF_WEEK
    FROM TASK_LOG TL
    JOIN DATE_HELPER_TABLE DHT2 ON DHT2.DATE_VALUE = TL.DATE_ENTERED
    WHERE TL.USER_ID = 'JIM1'
    ) HRS ON HRS.START_OF_WEEK = DHT.START_OF_WEEK
GROUP BY USER_ID
    ,DHT.START_OF_WEEK
    ,DHT.END_OF_WEEK
ORDER BY DHT.START_OF_WEEK
0
votes

Create a dates table that includes all dates for the next 100 years in the first column, the week of the year, day of the month etc in the next.

Then select from that dates table and left join everything else. Do isnull function to replace nulls with zeros.