1
votes

I am manually assigning a "Session ID" to my results set. I did this by ordering all events and if the time difference between the current and next event is greater than 2 minutes, set the "session" field to "New Session".

My results set now looks like this.

Table name : tbl_sessions

╔════════════╦═════╦══════════════╗
║    date    ║ ID  ║   session    ║
╠════════════╬═════╬══════════════╣
║ 01/01/2018 ║ 100 ║ Same Session ║
║ 01/01/2018 ║ 100 ║ Same Session ║
║ 01/01/2018 ║ 100 ║ Same Session ║
║ 01/01/2018 ║ 100 ║ Same Session ║
║ 01/01/2018 ║ 100 ║ Same Session ║
║ 01/01/2018 ║ 100 ║ New Session  ║
║ 01/01/2018 ║ 100 ║ Same Session ║
║ 01/01/2018 ║ 100 ║ Same Session ║
║ 01/01/2018 ║ 100 ║ New Session  ║
║ 01/01/2018 ║ 110 ║ New Session  ║
║ 01/01/2018 ║ 110 ║ Same Session ║
║ 01/01/2018 ║ 110 ║ New Session  ║
║ 01/01/2018 ║ 110 ║ New Session  ║
║ 02/01/2018 ║ 200 ║ Same Session ║
║ 02/01/2018 ║ 200 ║ New Session  ║
║ 02/01/2018 ║ 100 ║ Same Session ║
║ 02/01/2018 ║ 100 ║ Same Session ║
║ 02/01/2018 ║ 100 ║ New Session  ║
║ 02/01/2018 ║ 100 ║ Same Session ║
║ 02/01/2018 ║ 100 ║ Same Session ║
╚════════════╩═════╩══════════════╝

I just need to now add a session ID to this table. The first ID needs to be 1. If the field "session" is "Same Session", it should STAY 1. If the field "session" = "New Session" then the ID should be 2. All subsequent "Same Sessions" should then stay 2. and the next time we have "New Session" it should increment to 3. A new ID obviously has a new session too.

Essentially I need my results to look as follows;

╔════════════╦═════╦══════════════╦════════════╗
║    date    ║ ID  ║   session    ║ session ID ║
╠════════════╬═════╬══════════════╬════════════╣
║ 01/01/2018 ║ 100 ║ Same Session ║          1 ║
║ 01/01/2018 ║ 100 ║ Same Session ║          1 ║
║ 01/01/2018 ║ 100 ║ Same Session ║          1 ║
║ 01/01/2018 ║ 100 ║ Same Session ║          1 ║
║ 01/01/2018 ║ 100 ║ Same Session ║          1 ║
║ 01/01/2018 ║ 100 ║ New Session  ║          2 ║
║ 01/01/2018 ║ 100 ║ Same Session ║          2 ║
║ 01/01/2018 ║ 100 ║ Same Session ║          2 ║
║ 01/01/2018 ║ 100 ║ New Session  ║          3 ║
║ 01/01/2018 ║ 110 ║ New Session  ║          4 ║
║ 01/01/2018 ║ 110 ║ Same Session ║          4 ║
║ 01/01/2018 ║ 110 ║ New Session  ║          5 ║
║ 01/01/2018 ║ 110 ║ New Session  ║          6 ║
║ 02/01/2018 ║ 200 ║ New Session  ║          7 ║
║ 02/01/2018 ║ 200 ║ New Session  ║          8 ║
║ 02/01/2018 ║ 100 ║ New Session  ║          9 ║
║ 02/01/2018 ║ 100 ║ Same Session ║          9 ║
║ 02/01/2018 ║ 100 ║ New Session  ║         10 ║
║ 02/01/2018 ║ 100 ║ Same Session ║         10 ║
║ 02/01/2018 ║ 100 ║ Same Session ║         10 ║
╚════════════╩═════╩══════════════╩════════════╝

I know I can use LAG but when I do, the session ID does not stay stick for new sessions.

I am using amazon redshift.

Thanks

The common way is: Use LAG to check iif there's more than two minutes since the last row, then return 1 else 0 as flag. Nest it in another Select doing a Cumulative Sum over the flag. In your case you already did the LAG, so it's a SUM(CASE WHEN session = 'New Session' then 1 else 0 end) OVER (partition by ?? order by date rows unbounded preceding)dnoeth
@dnoeth I already have my flag, as you can see from above. The flag is "New Session" and "Same Session". Could you elaborate on how I would use a cumululative sum on the above table to get my session ID?Steve
Just edited my comment (ok, you need to add +1 to start with 1 instead of 0)dnoeth
Thank you @dnoeth. I used the following code SUM(CASE WHEN session = 'New Session' then 1 else 0 end) OVER (Partition by id order by date rows unbounded preceding) This works only within the same ID. So if I move onto the next ID, the session resets to 0. I need a new ID to also flag up as a New Session.Steve
Then remove the PARTITION BY ID.dnoeth