I'm very new to R and this might be a very silly question to ask but I'm quite stuck right now.

I'm currently trying to do a Canonical Correspondence Analysis on my data to see which environmental factors have more weight on community distribution. I'm using the vegan package. My data consists of a table for the environmental factors (dataset EFamoA) and another for an abundance matrix (dataset AmoA). I have 41 soils, with 39 environmental factors and 334 species. After cleaning my data of any variables which are not numerical, I try to perform the cca analysis using the formula notation:

CCA.amoA <- cca (AmoA ~ EFamoA$PH + EFamoA$LOI, data = EFamoA, 
scale = TRUE, na.action = na.omit)

But then I get this error:

Error in weighted.mean.default(newX[, i], ...) : 
'x' and 'w' must have the same length

I don't really know where to go from here and haven't found much about this problem anywhere (which leads me to think that it must be some very basic mistake I'm making). My environmental factor data is not standardized, as I read in the cca help file that the algorithm does it, but maybe I should standardize it beforehand? (I've also read that scale = TRUE is only for species.) Should I convert the data into matrices?

I hope I made my point clear enough as I've been struggling with this for a while now.

Edit: My environmental data has NA values

I guess it is not the right way to call cca. In EFamoA, you need to have the columns AmoA, PH and LOI together. And you can avoid the use of $. - user3710546

The thing is that AmoA is not just a column but a full table by itself, where rows are species and columns are soils, giving a matrix of observation counts of species per soil, whereas EFamoA is a table of the environmental factors (EF) and their values in each soil. I could try to create new variables for each EF so I can avoid the use of $, but I'm not sure if this would work. Thank you very much for the comment anyway. - Edu VO

1 Answer

Alright, so I was able to figure it out by myself, and it was indeed a silly thing: my abundance data had soils as columns and species as rows, while the environmental factor (EF) data had soils as rows and EFs as columns.

Using t(), I transposed my abundance data.frame (which also converts it into a matrix), and cca() worked, presumably because both tables now have the same number of rows (one per soil). Transposing the data beforehand and loading it already transposed works too. The t() approach saves creating a whole new file when your data is organized with different rows, as in my case, but it converts the data into a matrix, which might not be desired in some cases. Either way, this turned out to be a very simple and obvious thing to solve (it took me a while, though).
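
For anyone hitting the same error, here is a minimal sketch of the check and the fix. The toy values below are made up purely for illustration (they are not my real data), and the final cca() call is left commented out and assumes the vegan package is installed:

```r
# Toy stand-ins for the real tables (hypothetical values).
# AmoA: species as rows, soils as columns (the problematic layout).
AmoA <- data.frame(
  soil1 = c(3, 0, 5, 1, 2),
  soil2 = c(1, 4, 0, 2, 6),
  soil3 = c(0, 2, 7, 3, 1),
  soil4 = c(5, 1, 1, 0, 4),
  row.names = paste0("sp", 1:5)
)
# EFamoA: soils as rows, environmental factors as columns.
EFamoA <- data.frame(
  PH  = c(5.1, 6.3, 7.0, 6.8),
  LOI = c(12, 8, 15, 10),
  row.names = paste0("soil", 1:4)
)

dim(AmoA)    # 5 x 4: species x soils
dim(EFamoA)  # 4 x 2: soils x factors, so the row counts do not match

# t() on a data.frame returns a matrix; convert back only if a data.frame is needed.
AmoA_t <- as.data.frame(t(AmoA))

# Both tables must now describe the same soils, in the same order.
stopifnot(nrow(AmoA_t) == nrow(EFamoA),
          identical(rownames(AmoA_t), rownames(EFamoA)))

# With vegan installed, the call can then use plain column names instead of $:
# library(vegan)
# CCA.amoA <- cca(AmoA_t ~ PH + LOI, data = EFamoA, na.action = na.omit)
```

Comparing dim() and rownames() of the two tables before calling cca() makes this kind of orientation mismatch visible immediately.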