264
votes

I need an easy way to take a tar file and convert it into a string (and vice versa). Is there a way to do this in Ruby? My best attempt was this:

file = File.open("path-to-file.tar.gz")
contents = ""
file.each {|line|
  contents << line
}

I thought that would be enough to convert it to a string, but then when I try to write it back out like this...

newFile = File.open("test.tar.gz", "w")
newFile.write(contents)

It isn't the same file. Doing ls -l shows the files are of different sizes, although they are pretty close (and opening the file reveals most of the contents intact). Is there a small mistake I'm making or an entirely different (but workable) way to accomplish this?

9
That's a gzipped tar file (I hope). There are no "lines". Please clarify what you're trying to achieve. - Brent.Longborough
are you trying to look at the compressed data or uncompressed content? - David Nehme
Bytes in a compressed data stream have roughly a 1 in 256 chance of being "\n", which marks the end of a line. That's fine as long as nothing expects "\r" too; see my answer below. - Purfideas
This question should be re-titled as "Convert binary file to string", since IO.read would be the preferred answer otherwise. - Ian

9 Answers

398
votes

First, you should open the file as a binary file. Then you can read the entire file in, in one command.

file = File.open("path-to-file.tar.gz", "rb")
contents = file.read

That will get you the entire file in a string.

After that, you probably want to call file.close. If you don't, the file won't be closed until it is garbage-collected, which is a slight waste of system resources while it stays open.
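
Here is a minimal round-trip sketch of the idea above, assuming both the read and the write use binary mode ("rb" and "wb"). The helper names read_binary and write_binary and the sample bytes are made up for illustration; they are not part of Ruby or of this answer.

```ruby
require 'tmpdir'

# Hypothetical helper: read a whole file as raw bytes.
def read_binary(path)
  File.open(path, "rb") { |f| f.read }
end

# Hypothetical helper: write raw bytes without newline translation.
def write_binary(path, data)
  File.open(path, "wb") { |f| f.write(data) }
end

Dir.mktmpdir do |dir|
  src = File.join(dir, "sample.bin")
  dst = File.join(dir, "copy.bin")

  # Sample bytes include LF, CR and 0x1A, which text mode may alter on Windows.
  original = [0, 10, 13, 26, 255].pack("C*") * 100
  write_binary(src, original)

  contents = read_binary(src)   # the whole file as one String
  write_binary(dst, contents)   # and back out again

  puts File.size(src) == File.size(dst)
  puts read_binary(dst) == original
end
```

Passing a block to File.open closes the file automatically, so no explicit file.close is needed in this version.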

244
votes

If you need binary mode, you'll need to do it the hard way:

s = File.open(filename, 'rb') { |f| f.read }

If not, shorter and sweeter is:

s = IO.read(filename)

114
votes

To avoid leaving the file open, it is best to pass a block to File.open. This way, the file will be closed after the block executes.

contents = File.open('path-to-file.tar.gz', 'rb') { |f| f.read }

17
votes

How about some open/close safety?

string = File.open('file.txt', 'rb') { |file| file.read }

16
votes

On OS X these are the same for me... could this maybe be an extra "\r" on Windows?

In any case you may be better off with:

# Read and write in binary mode so no newline translation happens on Windows.
contents = File.binread("e.tgz")
File.open("ee.tgz", "wb") { |new_file| new_file.write(contents) }

13
votes

Ruby has binary reading:

data = IO.binread("path/filename")

or, on Ruby versions older than 1.9.2:

data = IO.read("path/filename")

6
votes

You can probably encode the tar file in Base64. Base64 gives you a pure ASCII representation of the file that you can store in a plain text file. You can then recover the tar file by decoding the text.

You do something like:

require 'base64'

file_contents = Base64.encode64(tar_file_data)

Have a look at the Base64 Rubydocs to get a better idea.
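
For what it's worth, a full round trip might look like the sketch below. The tar_file_data value is stand-in sample bytes rather than a real tar file, and the sketch assumes the base64 library is available to require (in recent Ruby versions it ships as a bundled gem).

```ruby
require 'base64'

# Stand-in for the raw bytes of a tar file (hypothetical sample data).
tar_file_data = [0, 10, 13, 26, 255].pack("C*") * 20

# Encode to ASCII text; encode64 inserts a newline after every 60 encoded characters.
encoded = Base64.encode64(tar_file_data)

# Decode the text back into the original bytes.
decoded = Base64.decode64(encoded)

puts encoded.ascii_only?
puts decoded == tar_file_data
```

If the line breaks get in the way, Base64.strict_encode64 produces the same encoding on a single line.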

0
votes

Ruby 1.9+ has IO.binread (see @bardzo's answer) and also supports passing the encoding as an option to IO.read:

  • Ruby 1.9

    data = File.read(name, {:encoding => 'BINARY'})
    
  • Ruby 2+

    data = File.read(name, encoding: 'BINARY')
    

(Note in both cases that 'BINARY' is an alias for 'ASCII-8BIT'.)
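
A quick way to see that the two calls hand back the same thing is to compare them on a small temporary file. This is only a sketch: the file contents and variable names are invented for illustration, and it checks the bytes and the encoding tag on whatever platform it runs on, not Windows newline handling.

```ruby
require 'tempfile'

# Write three arbitrary bytes to a temporary file in binary mode.
tmp = Tempfile.new("sample")
tmp.binmode
tmp.write([0, 200, 255].pack("C*"))
tmp.close

via_binread  = File.binread(tmp.path)
via_encoding = File.read(tmp.path, encoding: 'BINARY')

puts via_binread == via_encoding
puts via_binread.encoding == via_encoding.encoding
puts Encoding::BINARY.equal?(Encoding::ASCII_8BIT)

# Remove the temporary file once we are done with it.
tmp.unlink
```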

-1
votes

If you can encode the tar file as Base64 (and store it in a plain text file), you can use

File.open("my_tar.txt").each {|line| puts line}

or

File.new("name_file.txt", "r").each {|line| puts line}

to print each line of text to the console.