I am manually assigning a "Session ID" to my results set. I did this by ordering all events and if the time difference between the current and next event is greater than 2 minutes, set the "session" field to "New Session".
My results set now looks like this.
Table name : tbl_sessions
╔════════════╦═════╦══════════════╗
║ date ║ ID ║ session ║
╠════════════╬═════╬══════════════╣
║ 01/01/2018 ║ 100 ║ Same Session ║
║ 01/01/2018 ║ 100 ║ Same Session ║
║ 01/01/2018 ║ 100 ║ Same Session ║
║ 01/01/2018 ║ 100 ║ Same Session ║
║ 01/01/2018 ║ 100 ║ Same Session ║
║ 01/01/2018 ║ 100 ║ New Session ║
║ 01/01/2018 ║ 100 ║ Same Session ║
║ 01/01/2018 ║ 100 ║ Same Session ║
║ 01/01/2018 ║ 100 ║ New Session ║
║ 01/01/2018 ║ 110 ║ New Session ║
║ 01/01/2018 ║ 110 ║ Same Session ║
║ 01/01/2018 ║ 110 ║ New Session ║
║ 01/01/2018 ║ 110 ║ New Session ║
║ 02/01/2018 ║ 200 ║ Same Session ║
║ 02/01/2018 ║ 200 ║ New Session ║
║ 02/01/2018 ║ 100 ║ Same Session ║
║ 02/01/2018 ║ 100 ║ Same Session ║
║ 02/01/2018 ║ 100 ║ New Session ║
║ 02/01/2018 ║ 100 ║ Same Session ║
║ 02/01/2018 ║ 100 ║ Same Session ║
╚════════════╩═════╩══════════════╝
I just need to now add a session ID to this table. The first ID needs to be 1. If the field "session" is "Same Session", it should STAY 1. If the field "session" = "New Session" then the ID should be 2. All subsequent "Same Sessions" should then stay 2. and the next time we have "New Session" it should increment to 3. A new ID obviously has a new session too.
Essentially I need my results to look as follows;
╔════════════╦═════╦══════════════╦════════════╗
║ date ║ ID ║ session ║ session ID ║
╠════════════╬═════╬══════════════╬════════════╣
║ 01/01/2018 ║ 100 ║ Same Session ║ 1 ║
║ 01/01/2018 ║ 100 ║ Same Session ║ 1 ║
║ 01/01/2018 ║ 100 ║ Same Session ║ 1 ║
║ 01/01/2018 ║ 100 ║ Same Session ║ 1 ║
║ 01/01/2018 ║ 100 ║ Same Session ║ 1 ║
║ 01/01/2018 ║ 100 ║ New Session ║ 2 ║
║ 01/01/2018 ║ 100 ║ Same Session ║ 2 ║
║ 01/01/2018 ║ 100 ║ Same Session ║ 2 ║
║ 01/01/2018 ║ 100 ║ New Session ║ 3 ║
║ 01/01/2018 ║ 110 ║ New Session ║ 4 ║
║ 01/01/2018 ║ 110 ║ Same Session ║ 4 ║
║ 01/01/2018 ║ 110 ║ New Session ║ 5 ║
║ 01/01/2018 ║ 110 ║ New Session ║ 6 ║
║ 02/01/2018 ║ 200 ║ New Session ║ 7 ║
║ 02/01/2018 ║ 200 ║ New Session ║ 8 ║
║ 02/01/2018 ║ 100 ║ New Session ║ 9 ║
║ 02/01/2018 ║ 100 ║ Same Session ║ 9 ║
║ 02/01/2018 ║ 100 ║ New Session ║ 10 ║
║ 02/01/2018 ║ 100 ║ Same Session ║ 10 ║
║ 02/01/2018 ║ 100 ║ Same Session ║ 10 ║
╚════════════╩═════╩══════════════╩════════════╝
I know I can use LAG but when I do, the session ID does not stay stick for new sessions.
I am using amazon redshift.
Thanks
SUM(CASE WHEN session = 'New Session' then 1 else 0 end) OVER (partition by ?? order by date rows unbounded preceding)
– dnoeth