32
votes

I have this admission_table containing ADMIT, GRE, GPA and RANK.

> head(admission_table)
  ADMIT GRE  GPA RANK
1     0 380 3.61    3
2     1 660 3.67    3
3     1 800 4.00    1
4     1 640 3.19    4
5     0 520 2.93    4
6     1 760 3.00    2

I'm trying to convert the summary of this table into a data.frame. I want to have ADMIT, GRE, GPA and RANK as my column headers.

> summary(admission_table)
     ADMIT             GRE             GPA             RANK      
 Min.   :0.0000   Min.   :220.0   Min.   :2.260   Min.   :1.000  
 1st Qu.:0.0000   1st Qu.:520.0   1st Qu.:3.130   1st Qu.:2.000  
 Median :0.0000   Median :580.0   Median :3.395   Median :2.000  
 Mean   :0.3175   Mean   :587.7   Mean   :3.390   Mean   :2.485  
 3rd Qu.:1.0000   3rd Qu.:660.0   3rd Qu.:3.670   3rd Qu.:3.000  
 Max.   :1.0000   Max.   :800.0   Max.   :4.000   Max.   :4.000  

> as.data.frame(summary(admission_table))
   Var1      Var2             Freq
1           ADMIT Min.   :0.0000  
2           ADMIT 1st Qu.:0.0000  
3           ADMIT Median :0.0000  
4           ADMIT Mean   :0.3175  
5           ADMIT 3rd Qu.:1.0000  
6           ADMIT Max.   :1.0000  
7             GRE  Min.   :220.0  
8             GRE  1st Qu.:520.0  
9             GRE  Median :580.0  
10            GRE  Mean   :587.7  
11            GRE  3rd Qu.:660.0  
12            GRE  Max.   :800.0  
13            GPA  Min.   :2.260  
14            GPA  1st Qu.:3.130  
15            GPA  Median :3.395  
16            GPA  Mean   :3.390  
17            GPA  3rd Qu.:3.670  
18            GPA  Max.   :4.000  
19           RANK  Min.   :1.000  
20           RANK  1st Qu.:2.000    
21           RANK  Median :2.000  
22           RANK  Mean   :2.485  
23           RANK  3rd Qu.:3.000  
24           RANK  Max.   :4.000  

When I try to convert it into a data.frame, this is the only result I get. I want the data frame to have exactly the same layout as the summary table, because afterwards I want to insert it into an Oracle database using this line of code:

dbWriteTable(connection,name="SUM_ADMISSION_TABLE",value=as.data.frame(summary(admission_table)),row.names = FALSE, overwrite = TRUE ,append = FALSE)

Is there any way to do so?

3
Do you really want that exact output, with the Min. :0.0000 type of structure? Or would one column that indicates the stat and one column that indicates the value be sufficient? - A5C1D2H2I1M1N2O1R2T1
How can the OP's result be achieved in R now? None of the answers work nowadays. - robertspierre

3 Answers

60
votes

You can consider unclass, I suppose:

data.frame(unclass(summary(mydf)), check.names = FALSE, stringsAsFactors = FALSE)
#              ADMIT             GRE             GPA            RANK
# 1 Min.   :0.0000   Min.   :380.0   Min.   :2.930   Min.   :1.000  
# 2 1st Qu.:0.2500   1st Qu.:550.0   1st Qu.:3.047   1st Qu.:2.250  
# 3 Median :1.0000   Median :650.0   Median :3.400   Median :3.000  
# 4 Mean   :0.6667   Mean   :626.7   Mean   :3.400   Mean   :2.833  
# 5 3rd Qu.:1.0000   3rd Qu.:735.0   3rd Qu.:3.655   3rd Qu.:3.750  
# 6 Max.   :1.0000   Max.   :800.0   Max.   :4.000   Max.   :4.000  
str(.Last.value)
# 'data.frame': 6 obs. of  4 variables:
#  $     ADMIT: chr  "Min.   :0.0000  " "1st Qu.:0.2500  " "Median :1.0000  " "Mean   :0.6667  " ...
#  $      GRE : chr  "Min.   :380.0  " "1st Qu.:550.0  " "Median :650.0  " "Mean   :626.7  " ...
#  $      GPA : chr  "Min.   :2.930  " "1st Qu.:3.047  " "Median :3.400  " "Mean   :3.400  " ...
#  $      RANK: chr  "Min.   :1.000  " "1st Qu.:2.250  " "Median :3.000  " "Mean   :2.833  " ...

Note that there is a lot of excessive whitespace there, in both the names and the values.
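If the padding is a problem, it can be stripped afterwards. The following is only a sketch: it assumes mydf holds the six sample rows from the question, and it resets the row names because their default form may differ between R versions.

```r
# Assumption: mydf is the six-row sample shown in the question.
mydf <- data.frame(
  ADMIT = c(0, 1, 1, 1, 0, 1),
  GRE   = c(380, 660, 800, 640, 520, 760),
  GPA   = c(3.61, 3.67, 4.00, 3.19, 2.93, 3.00),
  RANK  = c(3, 3, 1, 4, 4, 2)
)

out <- data.frame(unclass(summary(mydf)), check.names = FALSE,
                  stringsAsFactors = FALSE)

# Strip the padding from the column names and from every cell.
names(out) <- trimws(names(out))
out[] <- lapply(out, trimws)

# Use plain 1..n row names regardless of what data.frame() generated.
rownames(out) <- NULL
```

The cells are still character strings such as "Min.   :0.0000"; only the leading and trailing whitespace is removed.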

However, it might be sufficient to do something like:

do.call(cbind, lapply(mydf, summary))
#          ADMIT   GRE   GPA  RANK
# Min.    0.0000 380.0 2.930 1.000
# 1st Qu. 0.2500 550.0 3.048 2.250
# Median  1.0000 650.0 3.400 3.000
# Mean    0.6667 626.7 3.400 2.833
# 3rd Qu. 1.0000 735.0 3.655 3.750
# Max.    1.0000 800.0 4.000 4.000
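Since the goal in the question is a dbWriteTable() call with row.names = FALSE, the statistic names in the row names would be lost on writing. A possible sketch, assuming mydf is the six-row sample from the question and using STAT as a made-up column name, moves them into a regular column first:

```r
# Assumption: mydf is the six-row sample shown in the question.
mydf <- data.frame(
  ADMIT = c(0, 1, 1, 1, 0, 1),
  GRE   = c(380, 660, 800, 640, 520, 760),
  GPA   = c(3.61, 3.67, 4.00, 3.19, 2.93, 3.00),
  RANK  = c(3, 3, 1, 4, 4, 2)
)

# Numeric matrix: one row per statistic, one column per variable.
stats <- do.call(cbind, lapply(mydf, summary))

# Keep the statistic names as a real column ("STAT" is a hypothetical name),
# then drop the row names so nothing is lost with row.names = FALSE.
sum_df <- data.frame(STAT = rownames(stats), stats, check.names = FALSE)
rownames(sum_df) <- NULL

# The write itself would then look like the call in the question, e.g.:
# dbWriteTable(connection, name = "SUM_ADMISSION_TABLE", value = sum_df,
#              row.names = FALSE, overwrite = TRUE, append = FALSE)
```

The value columns stay numeric here, which is usually easier to query in the database than the formatted strings.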
3
votes

Another way to output a dataframe is:

as.data.frame(apply(mydf, 2, summary))

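This works only if all of the selected columns are numeric.

It may also throw an Error in dimnames(x) if some columns contain NAs, because summary() then adds an NA's entry for those columns and the per-column results no longer have the same length. It's worth checking the result of apply() on its own, without the as.data.frame() call, first.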
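To illustrate the NA issue and one way around it, here is a small sketch. The data frame df_na and the helper six_stats() are made up for this example; the idea is to compute a fixed set of statistics so that every column yields a result of the same length.

```r
# Hypothetical data: only GRE contains a missing value.
df_na <- data.frame(
  GRE = c(380, NA, 800, 640),
  GPA = c(3.61, 3.67, 4.00, 3.19)
)

# summary() adds an "NA's" entry only where NAs occur,
# so the per-column results differ in length.
lens <- lengths(lapply(df_na, summary))

# Hypothetical helper: always returns the same seven values.
six_stats <- function(x) {
  q <- quantile(x, probs = c(0, 0.25, 0.5, 0.75, 1),
                na.rm = TRUE, names = FALSE)
  c(Min = q[1], Q1 = q[2], Median = q[3],
    Mean = mean(x, na.rm = TRUE),
    Q3 = q[4], Max = q[5],
    NAs = sum(is.na(x)))
}

# Equal-length results combine cleanly into a matrix, then a data frame.
res <- as.data.frame(sapply(df_na, six_stats))
```

Here lens shows the mismatch between the two columns, while res has one row per statistic for every column, with the NA count reported explicitly.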

0
votes

None of these solutions actually captures the printed output of the summary function. The tidy() function (e.g. from the broom package) extracts the elements from a summary object and makes a plain data.frame, so it does not preserve other features or formatting.

If you want the exact output of the summary function in a data frame, you can do:

output <- capture.output(summary(thisModel), file = NULL, append = FALSE)
output_df <- as.data.frame(output)

This retains all of the new lines and is suitable for writing to XLSX, etc., which will result in the output appropriately spaced across rows.

If you want this output collapsed into a single cell, you can do:

# paste0() has no sep argument, so only collapse is needed here
output_collapsed <- paste0(output, collapse = "\n")
output_df <- as.data.frame(output_collapsed)