I'm an utter beginner in R, fumbling my way through it for my degree :)
I need to summarize a very large data set by site, as there are currently multiple rows per site and around 70 columns of variables, both numeric and categorical. I'm looking at seedling regeneration at each site.
I have 45 study sites and am trying to summarize all my variables per site. Currently, each study site has between 5 and 30+ plant species, so I can have up to 30 rows for a site: each species at a site has its own row, with #trees, #saplings, #seedlings and other variables as columns.
I've tried this code:
i <- sapply(data.df, is.factor) ### convert "factor" variables to "character" for dplyr analysis
data.df[i] <- lapply(data.df[i], as.character)
select(data.df,site,total_seedlings_m2,age,age_category,landuse_history, exotic_landcover_types,native_landcover_types,prcnt_light_transmittance,avg_canopy_height,prcnt_total_herb_cover,annual_rainfall_mm,annual_sunshine_hours,annual_temp_mean,annual_ground_frost_days,annual_rel_humidity,daily_air_rh_range,daily_air_temp_range,daily_soil_temp_range,total_trees_m2,total_basal_area_m2)
group_by_(site)
summarise_all(data.df)
I want to summarise all columns (although I need a mixture of sum and mean for different variables).
I'm just trialling this method. When I try to group the data by site, which should give me 45 rows, I get an error:
Error in UseMethod("group_by_") : no applicable method for 'group_by_' applied to an object of class "character"
It says I'm using "group_by_" when I'm actually using "group_by".
Is there an easy fix? And is there a way to summarise all columns while either summing or averaging each column depending on the variable? (I would sum the seedling counts and take the mean of the micro-climate data.)
First time asking for help online, so hopefully this makes a little bit of sense :)
group_by_. Please make this question reproducible by including some or all of data.df. – neilfws
ddply function in the plyr package. I find it a lot nicer. – morgan121
%>%. The specific error is because you're trying to group some character vector called "site", not data.df as expected. – divibisan
dat <- select(dat, ... – divibisan
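Building on the comments about `%>%`, here is a minimal sketch of one way this could look. In the code in the question each verb is called on its own, so the result of `select()` is never stored and `group_by` receives `site` as its data argument instead of `data.df`. Chaining the steps with the pipe passes the data frame along. The toy `data.df` below is hypothetical (only the column names come from the question), the sketch assumes dplyr 1.0 or later for `across()`, and it assumes that site-level categorical columns hold the same value on every row of a site, so taking the first value is safe.

```r
library(dplyr)

# Hypothetical toy data standing in for the real data.df:
# two sites, one row per species at each site.
data.df <- data.frame(
  site = c("A", "A", "B", "B", "B"),
  total_seedlings_m2 = c(1, 2, 3, 4, 5),
  prcnt_light_transmittance = c(10, 20, 30, 30, 60),
  annual_rainfall_mm = c(1000, 1000, 1200, 1200, 1200),
  landuse_history = c("pasture", "pasture", "forestry", "forestry", "forestry"),
  stringsAsFactors = FALSE
)

site_summary <- data.df %>%
  # keep only the columns of interest (shortened list for the sketch)
  select(site, total_seedlings_m2, prcnt_light_transmittance,
         annual_rainfall_mm, landuse_history) %>%
  # one group per site, so summarise() returns one row per site
  group_by(site) %>%
  summarise(
    # counts: add up across species rows
    across(total_seedlings_m2, ~ sum(.x, na.rm = TRUE)),
    # micro-climate measurements: average across rows
    across(c(prcnt_light_transmittance, annual_rainfall_mm),
           ~ mean(.x, na.rm = TRUE)),
    # categorical site-level columns: assumed constant within a site
    across(landuse_history, first)
  )

print(site_summary)
```

With the real data, the column lists inside each `across()` call would be extended to cover the remaining variables, and the result should have one row per site (45 here). On older dplyr versions, `summarise_at(vars(...), sum)` is a similar option for applying one function to a chosen set of columns.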