I'm analysing a large dataset containing a variable number of observations per subject (ranging from 1 to 26 occurrences). Since I need to analyse the time between events, subjects with only one occurrence are non-informative.
Previously, while working in Stata, I would create a variable (called e.g. total) using this Stata code:
by idnummer, sort: gen total=_N
This way every row of a subject carries a variable 'total', and I could drop all subjects with total == 1.
I have been trying to do the same in pandas with agg functions and with size, but I end up with 'NaN'...
PS: using the "similar questions" on the side I have found the answer to my own question....
df['total'] = df.groupby('idnummer')['sequence'].transform('max')
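For anyone landing here later, a minimal sketch of a closer equivalent to Stata's _N, assuming a pandas DataFrame with the columns named above. The toy data and the name `informative` are made up for illustration. A likely cause of the NaN is that groupby(...).size() returns one value per group, indexed by idnummer, so assigning it straight to a column aligns on those labels instead of on the rows; transform broadcasts the result back to every row.

```python
import pandas as pd

# Hypothetical toy data; only the column names come from the question.
df = pd.DataFrame({
    "idnummer": [1, 1, 1, 2, 3, 3],
    "sequence": [1, 2, 3, 1, 1, 2],
})

# Number of rows per subject, repeated on every row of that subject
# (the same idea as `by idnummer, sort: gen total=_N` in Stata).
df["total"] = df.groupby("idnummer")["idnummer"].transform("size")

# Keep only subjects with more than one occurrence.
informative = df[df["total"] > 1]
```

Note that transform('max') on sequence gives the same totals only if sequence runs 1, 2, ..., N without gaps within each subject; counting rows with 'size' does not depend on that.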