I performed a PCA on my data and obtained 4 principal components. However, I find the results very difficult to interpret in terms of principal components. I was therefore wondering whether I can do a post hoc analysis: take the variable with the highest loading on PC1 (say X1) and the variable with the highest loading on PC2 (say X2), and run a regression against an outcome variable Y to test their association, i.e. lm(Y ~ X1 + X2).
Here's an example. I have 4 independent variables: a memory test (X1), a cognition test (X2), an attention test (X3), and a processing speed test (X4). I have 1 dependent variable, brain connectivity. When I perform a PCA on the independent variables, I get something like this:
PC1: 0.7X1 + 0.2X3
PC2: 0.8X2
PC3: 0.8X3 + 0.4X4
PC4: 0.1X4
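For concreteness, loadings like these can be read off a PCA fit in R. The following is only a minimal sketch on simulated data: the data frame `tests` and its column names are hypothetical stand-ins for my four test scores, and the numbers it prints will not match the loadings above.

```r
# Simulated stand-in data: 100 subjects, 4 test scores (hypothetical names).
set.seed(1)
n <- 100
tests <- data.frame(
  memory     = rnorm(n),
  cognition  = rnorm(n),
  attention  = rnorm(n),
  processing = rnorm(n)
)

# PCA on centered, standardized variables.
pca <- prcomp(tests, center = TRUE, scale. = TRUE)

# Loadings: one row per original variable, one column per component.
loadings <- pca$rotation
print(round(loadings, 2))

# Cumulative proportion of variance explained by the components.
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
print(round(cumsum(var_explained), 2))

# Variable with the largest absolute loading on each component.
top_var <- rownames(loadings)[apply(abs(loadings), 2, which.max)]
names(top_var) <- colnames(loadings)
print(top_var)
```

Here `top_var` is what I mean by "the variable with the highest loading" on each component.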
PC1 and PC2 together explain 82% of the variance in the data. However, I'm not sure what to make of this. How can I interpret it in terms of my original variables? I was thinking of running a regression on the variables that load most heavily on the principal components, to analyze further which variables may be driving the effect: lm(connectivity ~ memory + cognition)
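The regression I have in mind would look roughly like the sketch below. It uses simulated data with hypothetical column names, and the coefficients used to generate the outcome are arbitrary. For comparison it also fits a regression on the first two component scores; I am not sure which of the two is the appropriate analysis.

```r
# Simulated stand-in data (hypothetical column names, arbitrary effects).
set.seed(2)
n <- 100
dat <- data.frame(
  memory     = rnorm(n),
  cognition  = rnorm(n),
  attention  = rnorm(n),
  processing = rnorm(n)
)
dat$connectivity <- 0.5 * dat$memory + 0.3 * dat$cognition + rnorm(n)

# Proposed post hoc model on the two original variables.
fit_vars <- lm(connectivity ~ memory + cognition, data = dat)
print(summary(fit_vars)$coefficients)

# For comparison: regression on the first two principal component scores.
predictors <- dat[, c("memory", "cognition", "attention", "processing")]
pca <- prcomp(predictors, center = TRUE, scale. = TRUE)
scores <- as.data.frame(pca$x[, 1:2])
scores$connectivity <- dat$connectivity
fit_pcs <- lm(connectivity ~ PC1 + PC2, data = scores)
print(summary(fit_pcs)$coefficients)
```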
Does that make sense? How can I go about this?