I am quite new to data science, so this may be quite easy for more experienced coders. I want to run a repeated measures ANOVA on pre and post measurements of a test in two groups (experimental group vs. control group). Every subject participated in only one group.
In my pandas DataFrame I have the following columns: "Subject ID" (unique), "Condition" (experimental or control), "Pre-Measure Value", "Post-Measure Value" ...
import pandas as pd

subject_id = [1, 2, 3, 4]
condition = [1, 2, 1, 2]
pre = [1.1, 2.1, 3.1, 4.1]
post = [1.2, 2.2, 3.2, 4.2]
sample_df = pd.DataFrame({"Subject ID": subject_id, "Condition": condition, "Pre": pre, "Post": post})
sample_df
How can I analyze this using ANOVA? The packages I've seen expect a dataframe where the dependent variable is in a single column, whereas in my dataframe the dependent measures I want to evaluate are spread across two columns. Would I need to add another column specifying whether each value is pre or post, for every subject and condition? Is there a "handy" function to do something like this?
Specifically, the output would need to look like this:
subject_id_new = [1,1,2,2,3,3,4,4]
condition_new = [1,1,2,2,1,1,2,2]
measurement = ["pre", "post","pre", "post","pre", "post","pre", "post"]
value = [1.1, 1.2,2.1,2.2,3.1,3.2,4.1,4.2]
new_df = pd.DataFrame({"Subject ID":subject_id_new, "Condition": condition_new, "Measurement": measurement, "Value": value})
Thanks a lot.
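The wide-to-long reshape described above can be sketched with pandas' `DataFrame.melt`. This is only a minimal sketch using the sample data from the question; the name `long_df` is made up for illustration, and the lowercasing and sorting steps are there just to match the layout of `new_df`.

```python
import pandas as pd

# Sample data from the question (wide format: one row per subject).
sample_df = pd.DataFrame({
    "Subject ID": [1, 2, 3, 4],
    "Condition": [1, 2, 1, 2],
    "Pre": [1.1, 2.1, 3.1, 4.1],
    "Post": [1.2, 2.2, 3.2, 4.2],
})

# melt keeps the id columns and stacks "Pre"/"Post" into two new columns:
# one holding the former column name, one holding the value.
long_df = sample_df.melt(
    id_vars=["Subject ID", "Condition"],
    value_vars=["Pre", "Post"],
    var_name="Measurement",
    value_name="Value",
)

# Match the labels used in the desired output ("pre"/"post").
long_df["Measurement"] = long_df["Measurement"].str.lower()

# melt returns all "pre" rows first, then all "post" rows; a stable sort on
# the subject keeps pre before post within each subject.
long_df = long_df.sort_values("Subject ID", kind="stable").reset_index(drop=True)

print(long_df)
```

The resulting frame has one row per subject and measurement, with the dependent variable in a single "Value" column, which is the long layout the question describes. As far as I know, statsmodels' `AnovaRM` takes this kind of layout (`depvar`, `subject`, `within`), but it only handles within-subject factors, so the between-group "Condition" factor would likely need a mixed-design routine instead.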
statsmodels library – Code Different