I know there is a plethora of packages/functions, such as tabyl (janitor) and stat.desc (pastecs), for getting descriptive values of variables, but I don't know how to apply them to only certain columns.
For example:
library(pastecs)
stat.desc(iris)
will return the mean, SD, etc. for all the variables, but I want to apply it only to the numeric variables. I don't want to subset by hand, because my data set has over 20 columns and the numeric columns are interspersed in varying orders.
Something else I tried is:
library(janitor)
lapply(iris,tabyl)
This is great, except that I don't want tabyl applied to all the columns (a column with 14,000 IDs makes for ugly output), and my ultimate aim is to put the results into a neat-looking Excel file.
Any ideas for how I can apply these functions to numeric columns and to character/factor columns separately? Or to specific columns named in a vector?
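One way to avoid picking columns by hand is to build the selection from the column types. The sketch below uses base R for the selection, and calls pastecs and janitor only if they are installed. The names is_num, num_data, cat_data, wanted and picked are made up for illustration and are not part of either package.

```r
# Logical index of numeric columns; vapply guarantees a logical vector.
is_num <- vapply(iris, is.numeric, logical(1))

num_data <- iris[is_num]    # the four measurement columns
cat_data <- iris[!is_num]   # Species

# Columns picked by name from a vector (hypothetical choice for illustration).
wanted <- c("Sepal.Length", "Species")
picked <- iris[wanted]

# Apply each package's function only where it fits, if the package is installed.
if (requireNamespace("pastecs", quietly = TRUE)) {
  num_stats <- pastecs::stat.desc(num_data)
  print(num_stats)
}
if (requireNamespace("janitor", quietly = TRUE)) {
  cat_tables <- lapply(cat_data, janitor::tabyl)
  print(cat_tables)
}
```

Because the index is computed from the data, it does not matter where the numeric columns sit or in what order they appear.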
Subset the columns you apply lapply to. Something like
nums = sapply(iris, is.numeric); lapply(iris[nums], tabyl)
Or write yourself a wrapper function that looks at the column type and picks the right function to use. – Gregor Thomas
DescTools has a Desc function that produces different summary stats for different variable types. If you have Microsoft Word, it will pass tables and plots to an open Word document. – dcarlson
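The wrapper idea from the first comment could look roughly like the following. This is only a sketch in base R: describe_column is a hypothetical helper, and the statistics it returns are an arbitrary choice that could be swapped for stat.desc or tabyl.

```r
# Hypothetical helper: choose a summary based on the column's type.
describe_column <- function(x) {
  if (is.numeric(x)) {
    # Numeric columns: one row of basic descriptive statistics.
    data.frame(
      n    = sum(!is.na(x)),
      mean = mean(x, na.rm = TRUE),
      sd   = sd(x, na.rm = TRUE),
      min  = min(x, na.rm = TRUE),
      max  = max(x, na.rm = TRUE)
    )
  } else {
    # Character/factor columns: one row per distinct value with counts.
    counts <- table(x, useNA = "ifany")
    data.frame(
      value   = names(counts),
      n       = as.vector(counts),
      percent = as.vector(counts) / length(x)
    )
  }
}

# One data frame per column, named after the column.
summaries <- lapply(iris, describe_column)
summaries$Species
```

To leave out an ID column, drop it before the call, for example with iris[setdiff(names(iris), "id")] where "id" stands in for the real column name. Since every element of the result is a data frame, the named list should be straightforward to hand to an Excel writer that takes one data frame per sheet, such as writexl::write_xlsx, assuming that package is installed.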