
I have a data frame with column names and row names assigned. I tried to save it with write.csv and write.table, but when I open the CSV file in Excel, the row of column names is spread over several rows. That is not a problem for me as long as reloading the data frame in R causes no other issues. I reloaded the data with read.csv: with header=FALSE strange things happen; with header=TRUE the data frame looks fine in View(). But if I try to extract rows from it, it always gives me the first row (the one with the column names). I tried to save the column names in a vector, delete that row from the dataset, and reassign the names with colnames(df)=.., but it does not let me rename them. What can I do? First I loaded a CSV file and performed some operations on it:

library(dplyr)   # provides distinct()
library(igraph)  # provides graph.incidence()
data = read.csv("movie.csv",header=TRUE)
#head(data)
attach(data)
bipartite_network=data[,c("director_name","actor_1_name","actor_2_name","actor_3_name")]
d1=data[,c("director_name","actor_1_name")]
d2=data[,c("director_name","actor_2_name")] 
d3=data[,c("director_name","actor_3_name")]
names(d1)=names(d2)
names(d3)=names(d2)
network=rbind(d1,d2)
network=rbind(network,d3)
network=distinct(network)
actors=as.data.frame(network[,2])
actors=distinct(actors)
dim_col=dim(actors)[1][1]
directors=as.data.frame(network[,1])
directors=distinct(directors)
dim_row=dim(directors)[1][1]

adj_matrix=matrix(0,dim_row,dim_col)
colnames(adj_matrix)=actors[,1]
rownames(adj_matrix)=directors[,1]

for(i in 1:(dim(network)[1])){
  adj_matrix[network[i,1],network[i,2]]=1

}


# count=0
# for(i in 1:dim(adj_matrix)[1]){
#   for(j in 1: dim(adj_matrix)[2]){
#     if(adj_matrix[i,j]==1){
#       count=count+1
#     }
#     
#   }}

g=graph.incidence(adj_matrix,weighted = NULL)
csv_graph=as.data.frame(adj_matrix)

col_sums=colSums(csv_graph)
row_sums=rowSums(csv_graph)
reduced_graph=adj_matrix[!(row_sums<10),]
reduced_graph2=adj_matrix[,!(col_sums<5)]
reduced_graph2=t(reduced_graph2)

#eliminate missing value
reduced_graph=reduced_graph[,!(colnames(reduced_graph)=="")]
reduced_graph2=reduced_graph2[,!(colnames(reduced_graph2)=="")]


write.csv(reduced_graph2,"actor_director_reduced_network.csv",col.names = TRUE,row.names = TRUE)
write.csv(reduced_graph,"director_actor_reduced_network.csv",col.names = TRUE,row.names = TRUE)

But after saving, when I open the file I get what you can see in the following screenshot:

[screenshot]

Then:

network=read.csv("director_actor_reduced_network.csv", header=TRUE)
head(network)


> head(network)
                X CCH.Pounder Johnny.Depp Christoph.Waltz Tom.Hardy Doug.Walker Daryl.Sabara J.K..Simmons Brad.Garrett Chris.Hemsworth
  Alan.Rickman Henry.Cavill Kevin.Spacey Giancarlo.Giannini Peter.Dinklage Will.Smith Aidan.Turner Emma.Stone Mark.Addy Christopher.Lee
  Naomi.Watts Leonardo.DiCaprio Robert.Downey.Jr. Liam.Neeson Bryce.Dallas.Howard Albert.Finney Hugh.Jackman Steve.Buscemi
  Glenn.Morshower Bingbing.Li Tim.Holmes Jeff.Bridges Joe.Mantegna Ryan.Reynolds Tom.Hanks Christian.Bale Jason.Statham Peter.Capaldi
  Jennifer.Lawrence Benedict.Cumberbatch Eddie.Marsan Jake.Gyllenhaal Charlie.Hunnam Harrison.Ford A.J..Buckley Kelly.Macdonald
  Sofia.Boutella John.Ratzenberger Tzi.Ma Oliver.Platt Robin.Wright Channing.Tatum Jim.Broadbent Amy.Poehler ChloÃ..Grace.Moretz Jet.Li
  Jimmy.Bennett Tom.Cruise Jeanne.Tripplehorn Joseph.Gordon.Levitt Scarlett.Johansson Angelina.Jolie.Pitt Gary.Oldman Tamsin.Egerton
  Keanu.Reeves Jon.Hamm Judy.Greer Damon.Wayans.Jr. Jack.McBrayer Vivica.A..Fox Gerard.Butler Nick.Stahl Bradley.Cooper
  Matthew.McConaughey Mark.Chinnery Paul.Walker Brad.Pitt Nicolas.Cage Justin.Timberlake Dominic.Cooper Bruce.Spence Jennifer.Garner
  Zack.Ward Anthony.Hopkins Robert.Pattinson Janeane.Garofalo Bernie.Mac Robin.Williams Essie.Davis Josh.Gad Steve.Bastoni Kelli.Garner
  Matthew.Broderick Seychelle.Gabriel Philip.Seymour.Hoffman Elisabeth.Harnois Ty.Burrell Jada.Pinkett.Smith Toby.Stephens
  Ed.Begley.Jr. Bruce.Willis John.Michael.Higgins Sam.Shepard Matt.Frewer Kevin.Rankin Chris.Evans Colin.Salmon James.D.Arcy
  Don.Johnson Mark.Rylance Matt.Damon Jim.Parsons Salma.Hayek Toby.Jones Daniel.Radcliffe Alfre.Woodard Rupert.Grint Miguel.Ferrer
  Ronny.Cox Tony.Curran Jeremy.Renner Michael.Gough Clint.Howard Karen.Allen Suraj.Sharma Demi.Moo

I have to cut this off because it gets too long.

> str(network)
'data.frame':   333 obs. of  6256 variables:
 $ X                             : Factor w/ 333 levels "Étienne Faure",..: 120 238 51 22 150 193 271 79 122 108 ...
 $ CCH.Pounder                   : int  1 0 0 0 0 0 0 0 0 0 ...
 $ Johnny.Depp                   : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Christoph.Waltz               : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Tom.Hardy                     : int  1 0 0 0 0 0 0 0 0 0 ...
 $ Doug.Walker                   : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Daryl.Sabara                  : int  0 0 0 0 0 0 0 0 0 0 ...
 $ J.K..Simmons                  : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Brad.Garrett                  : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Chris.Hemsworth               : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Alan.Rickman                  : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Henry.Cavill                  : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Kevin.Spacey                  : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Giancarlo.Giannini            : int  1 0 0 0 0 0 0 0 0 0 ...
 $ Peter.Dinklage                : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Will.Smith                    : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Aidan.Turner                  : int  1 0 0 0 0 0 0 0 0 0 ...
 $ Emma.Stone                    : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Mark.Addy                     : int  0 0 1 0 0 0 0 0 0 0 ...
 $ Christopher.Lee               : int  0 1 0 0 0 0 0 0 1 0 ...
 $ Naomi.Watts                   : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Leonardo.DiCaprio             : int  0 0 0 0 0 1 0 0 0 0 ...
 $ Robert.Downey.Jr.             : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Liam.Neeson                   : int  1 0 0 0 0 0 0 0 0 0 ...
 $ Bryce.Dallas.Howard           : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Albert.Finney                 : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Hugh.Jackman                  : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Steve.Buscemi                 : int  1 0 0 0 0 0 0 0 0 0 ...
 $ Glenn.Morshower               : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Bingbing.Li                   : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Tim.Holmes                    : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Jeff.Bridges                  : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Joe.Mantegna                  : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Ryan.Reynolds                 : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Tom.Hanks                     : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Christian.Bale                : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Jason.Statham                 : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Peter.Capaldi                 : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Jennifer.Lawrence             : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Benedict.Cumberbatch          : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Eddie.Marsan                  : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Jake.Gyllenhaal               : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Charlie.Hunnam                : int  1 0 0 0 0 0 0 0 0 0 ...
 $ Harrison.Ford                 : int  0 0 0 0 0 0 0 0 0 0 ...
 $ A.J..Buckley                  : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Kelly.Macdonald               : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Sofia.Boutella                : int  0 0 0 0 0 0 0 0 0 0 ...
 $ John.Ratzenberger             : int  0 0 1 0 0 0 0 0 0 0 ...
 $ Tzi.Ma                        : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Oliver.Platt                  : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Robin.Wright                  : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Channing.Tatum                : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Jim.Broadbent                 : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Amy.Poehler                   : int  0 0 0 0 0 0 0 0 0 0 ...
 $ ChloÃ..Grace.Moretz           : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Jet.Li                        : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Jimmy.Bennett                 : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Tom.Cruise                    : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Jeanne.Tripplehorn            : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Joseph.Gordon.Levitt          : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Scarlett.Johansson            : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Angelina.Jolie.Pitt           : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Gary.Oldman                   : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Tamsin.Egerton                : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Keanu.Reeves                  : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Jon.Hamm                      : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Judy.Greer                    : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Damon.Wayans.Jr.              : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Jack.McBrayer                 : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Vivica.A..Fox                 : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Gerard.Butler                 : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Nick.Stahl                    : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Bradley.Cooper                : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Matthew.McConaughey           : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Mark.Chinnery                 : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Paul.Walker                   : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Brad.Pitt                     : int  1 0 0 0 0 0 0 0 0 0 ...
 $ Nicolas.Cage                  : int  0 0 1 0 0 0 0 0 0 0 ...
 $ Justin.Timberlake             : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Dominic.Cooper                : int  1 0 0 0 0 0 0 0 0 0 ...
 $ Bruce.Spence                  : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Jennifer.Garner               : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Zack.Ward                     : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Anthony.Hopkins               : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Robert.Pattinson              : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Janeane.Garofalo              : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Bernie.Mac                    : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Robin.Williams                : int  1 0 0 0 0 0 0 0 0 0 ...
 $ Essie.Davis                   : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Josh.Gad                      : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Steve.Bastoni                 : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Kelli.Garner                  : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Matthew.Broderick             : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Seychelle.Gabriel             : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Philip.Seymour.Hoffman        : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Elisabeth.Harnois             : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Ty.Burrell                    : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Jada.Pinkett.Smith            : int  0 0 0 0 0 0 0 0 0 0 ...
 $ Toby.Stephens                 : int  0 0 0 0 0 0 0 0 0 0 ...
  [list output truncated]
You need to show some code and some sample data. Please show at least some of the commands you are trying, verbatim in code formatting. Also show the structure of your CSV. At a minimum, after reading it into R, post the result of head(df) and str(df) so we can see what is there. If things don't seem right, perhaps also post the first 2-3 rows of raw data. - Gregor Thomas
If I use head(), it tries to show the whole first row, the one that contains the column names. I could upload the CSV file itself to show it, but I didn't write any code except: data=read.csv("filename",header=TRUE), head(data), data[1,], data[2,]. All these commands show the same thing, i.e. what the column names of the data frame were before I saved it with write.csv. - Peanojr
(a) Please edit your question to show the code you used. (b) Please type str(data) (or df, whatever you're calling it) into R and edit the result into your question. (c) Open the original CSV file in a text editor (not Excel), something like RStudio, Notepad, TextEdit, etc., and show us a snippet of what's there, by editing it into your question as text. (d) Please clarify what the picture you uploaded shows: is it the CSV file before you read it into R, or the result when you "tried to save it"? If you can do these very specific things, it will help us help you. - Gregor Thomas

1 Answer


You need to pass col.names=NA.

From the 'write.table' help file:

By default there is no column name for a column of row names. If ‘col.names = NA’ and ‘row.names = TRUE’ a blank column name is added, which is the convention used for CSV files to be read by spreadsheets.

Just one of those quirks of R you need to learn/remember. Or look up every time (like I seem to have to).
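
A minimal sketch of the round trip, assuming a tiny made-up matrix in place of the real adjacency matrix (the director and actor names here are hypothetical). As far as I know, write.csv already sets col.names to NA when row.names = TRUE and ignores an explicit col.names with a warning, so the argument matters mainly for write.table. The sketch also reads the file back with row.names = 1 and check.names = FALSE, which is one way to get the row names back and keep names like "Actor One" from turning into "Actor.One".

```r
# Small stand-in for the adjacency matrix in the question (names are made up).
m <- matrix(c(1, 0, 0, 1), nrow = 2,
            dimnames = list(c("Director A", "Director B"),
                            c("Actor One", "Actor Two")))

f_table <- tempfile(fileext = ".csv")
f_csv   <- tempfile(fileext = ".csv")

# write.table(): col.names = NA adds the blank header cell above the row names.
write.table(m, f_table, sep = ",", col.names = NA, row.names = TRUE,
            qmethod = "double")

# write.csv(): no col.names argument is passed; it handles the header itself.
write.csv(m, f_csv, row.names = TRUE)

# Compare the header lines of the two files.
header_table <- readLines(f_table, n = 1)
header_csv   <- readLines(f_csv, n = 1)

# Read back: the first column becomes the row names, and the column
# names are kept as written instead of being converted to syntactic names.
back <- read.csv(f_csv, row.names = 1, check.names = FALSE)
```

With this, back["Director A", ] selects a row by name, and the first column is no longer a separate variable called X.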

======== Edit ========

Your updated description suggests that your spreadsheet program may not be parsing the file correctly. Make sure the files are named with the ".csv" extension and that your spreadsheet program is using commas to delimit cells.
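
One way to check the file without a spreadsheet is to look at the raw text from R. The sketch below writes a small throwaway file (an assumption for the sake of a self-contained example; in practice path would be the real file, e.g. "director_actor_reduced_network.csv") and then prints the first lines and counts the comma-separated fields per line.

```r
# Throwaway example file; replace `path` with the real CSV in practice.
path <- tempfile(fileext = ".csv")
write.csv(data.frame(a = 1:2, b = 3:4, row.names = c("r1", "r2")), path)

# Raw text of the first few lines, exactly as stored on disk.
raw_lines <- readLines(path, n = 3)
print(raw_lines)

# Number of comma-separated fields on each line; only double quotes are
# treated as quoting characters, so apostrophes in names are harmless.
fields <- count.fields(path, sep = ",", quote = "\"")
print(unique(fields))
```

If every line reports the same number of fields, the file is most likely a well-formed CSV and the odd display comes from the program used to view it.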

If that doesn't work, we are going to need a better description of the problem.
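
One possible reading of the str() output in the question (333 obs. of 6256 variables): the data frame may be fine, and head(network) or network[1, ] simply prints thousands of column names, wrapped over many console lines, before any values appear. The sketch below uses a made-up wide data frame (an assumption, not the real data) to show how to look at a small corner instead.

```r
# Made-up wide data frame: 3 rows, 500 columns of 0/1 values.
set.seed(1)
wide <- as.data.frame(matrix(rbinom(3 * 500, 1, 0.1), nrow = 3))

# Check the shape first instead of printing everything.
print(dim(wide))

# Print only a small corner: first 3 rows, first 5 columns.
corner <- wide[1:3, 1:5]
print(corner)

# Extracting a row works; it is just very wide when printed in full.
second_row <- wide[2, ]
print(ncol(second_row))
```

If network[1:5, 1:5] shows sensible values on the real data, then row extraction is working and only the printed width is misleading.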