1
votes

I have the data below:

DECLARE @tbl TABLE (
ID VARCHAR(8)
,WeekDayName VARCHAR(15)
,StartDate VARCHAR(15)
,EndDate VARCHAR(15)
,A_Type VARCHAR(3)
,A_Days VARCHAR(10)
,A_Hours VARCHAR(10)
)
INSERT INTO @tbl (ID, WeekDayName, StartDate, EndDate, A_Type, A_Days, A_Hours)
VALUES
('150017', 'Monday', '2019-12-23', '2019-12-23', '430', 1.00, 8.20)
,('150017', 'Tuesday', '2019-12-24', '2019-12-24', '430', 1.00, 4.10)
,('150017', 'Friday', '2019-12-27', '2019-12-27', '430', 1.00, 8.20)
,('150017', 'Monday', '2019-12-30', '2019-12-30', '430', 1.00, 8.20)
,('150017', 'Tuesday', '2019-12-31', '2019-12-31', '430', 1.00, 4.10)
,('150035', 'Tuesday', '2019-03-12', '2019-03-12', '430', 0.66, 5.45)
,('150041', 'Thursday', '2019-01-17', '2019-01-17', '430', 1.00, 8.20)
,('150041', 'Tuesday', '2019-08-20', '2019-08-20', '430', 1.00, 8.20)
,('150041', 'Friday', '2019-08-21', '2019-08-21', '430', 1.00, 8.20)
,('150045', 'Monday', '2019-05-13', '2019-05-13', '430', 1.00, 8.20)
,('150045', 'Tuesday', '2019-05-14', '2019-05-14', '430', 1.00, 8.20)
,('150045', 'Wednesday', '2019-05-15', '2019-05-15', '430', 1.00, 8.20)
,('150045', 'Monday', '2019-11-25', '2019-11-25', '430', 1.00, 8.20)
,('150045', 'Tuesday', '2019-11-26', '2019-11-26', '430', 1.00, 8.20)
,('150045', 'Wednesday', '2019-11-27', '2019-11-27', '430', 1.00, 8.20)
,('150045', 'Thursday', '2019-11-28', '2019-11-28', '430', 1.00, 8.20)
,('150045', 'Friday', '2019-11-29', '2019-11-29', '430', 1.00, 8.20)
,('150046', 'Monday', '2019-03-11', '2019-03-11', '430', 1.00, 8.20)
,('150048', 'Tuesday', '2019-10-08', '2019-10-08', '430', 0.30, 2.50)
,('150048', 'Monday', '2019-10-28', '2019-10-28', '430', 1.00, 8.20)

The StartDate and EndDate are always identical and they are only working days (Monday to Friday). The WeekDayName is the name of the StartDate. ID is an integer. A_Type is (here) always 430. A_Days (always lower than or equal to 1), A_Hours (always lower than or equal to 8.20).

The desired output is for each ID a StartDate, EndDate, Sum of A_Days, Sum of A_Hours, SumDays. SumDays is the number of working days between StartDate and EndDate. EndDate is the day that which is the last date of a "block" of following dates, grouped by the ID.

E.g.

ID       StartDate    EndDate     A_Days    A_Hours    SumDays
150017   2019-12-23   2019-12-31  5.00      32.80      5
150035   2019-03-12   2019-03-12  0.66      5.45       1
150041   2019-01-17   2019-01-17  1.00      8.20       1
150041   2019-08-20   2019-08-21  2.00      16.40      2
...

150041 has two records, since there are two "blocks" of following dates. The first, 2019-01-17, and the seconds from 2019-08-20 to 2019-08-21.

Could someone please help me with that? (As a bonus, it would be great if it is possible to be able to group by the column A_Type as well). I could not get it working.

Any help is appreciated.

Thank you in advance,

Best

1
What is the logic for SumDays?Suraj Kumar
SumDays is the number of working days between StartDate and EndDate. If it is the same day, SumDays shall be 1.Json
Shouldn't there be two rows in the resultset for id 150017, that has rows on Monday, Tuesday, Friday, Monday, Tuesday?GMB
@GMB Wednesday and Thursday are December 25th and 26th, which are no working days. I did not specify this in my question. Without this specification, you are absolutely right.Json

1 Answers

1
votes

I understand that you want to group togethers records that have the same id and adjacent days (ignoring week-ends). This is a gaps and islands problem.

Here is one approach that uses lag() to retrieve the previous endDate and compare it to the current startDate; everytime dates are not "adjacent", a new group starts (a little additional logic is needed to handle the Friday > Monday gap).

select id, min(startDate) startDate, min(endDate) endDate, sum(a_days) a_days, sum(a_hours) a_hours, count(*) sumDays
from (
    select 
        t.*,
        sum(
            case when
                startDate = dateadd(d, 1, lagEndDate) 
                or (weekDayName = 'Monday' and startDate = dateadd(d, 3, lagEndDate))
            then 0 else 1
            end
        ) over (partition by id order by endDate) grp
    from (
        select t.*, lag(endDate) over(partition by id order by endDate) lagEndDate
        from @tbl t
    ) t
) t
group by id, grp

For your sample data, this produces:

id     | startDate               | endDate                 | a_days | a_hours | sumDays
:----- | :---------------------- | :---------------------- | :----- | :------ | ------:
150017 | 2019-12-23 00:00:00.000 | 2019-12-24 00:00:00.000 | 2.00   | 12.30   |       2
150017 | 2019-12-27 00:00:00.000 | 2019-12-31 00:00:00.000 | 3.00   | 20.50   |       3
150035 | 2019-03-12 00:00:00.000 | 2019-03-12 00:00:00.000 | 0.66   | 5.45    |       1
150041 | 2019-01-17 00:00:00.000 | 2019-01-17 00:00:00.000 | 1.00   | 8.20    |       1
150041 | 2019-08-20 00:00:00.000 | 2019-08-21 00:00:00.000 | 2.00   | 16.40   |       2
150045 | 2019-05-13 00:00:00.000 | 2019-05-15 00:00:00.000 | 3.00   | 24.60   |       3
150045 | 2019-11-25 00:00:00.000 | 2019-11-29 00:00:00.000 | 5.00   | 41.00   |       5
150046 | 2019-03-11 00:00:00.000 | 2019-03-11 00:00:00.000 | 1.00   | 8.20    |       1
150048 | 2019-10-08 00:00:00.000 | 2019-10-08 00:00:00.000 | 0.30   | 2.50    |       1
150048 | 2019-10-28 00:00:00.000 | 2019-10-28 00:00:00.000 | 1.00   | 8.20    |       1