
I'm adapting the RubyZip recursive zipping example (found here) to work with write_buffer instead of open, and I'm running into a host of issues. I'm doing this because the zip archive I'm producing contains Word documents, and I get errors when opening them. So I'm trying the workaround that RubyZip suggests, which is to use write_buffer instead of open (example found here).

The first problem is that I'm getting an error because I'm using an absolute path, and I'm not sure how to get around it. The error is: "#//', name must not start with />"

Second, I'm not sure how to fix the issue with Word documents. My original code worked and created an actual zip file, but any Word document in that zip file showed the following error on opening: "Word found unreadable content in Do you want to recover the contents of this document? If you trust the source of this document, click Yes." This unreadable-content error is why I started trying write_buffer.

Any help would be appreciated.

Here is the code that I'm currently using:

require 'zip'
require 'zip/zipfilesystem'

module AdvisoryBoard
  class ZipService
    def initialize(input_dir, output_file)
      @input_dir = input_dir
      @output_file = output_file
    end

    # Zip the input directory.
    def write
      entries = Dir.entries(@input_dir) - %w[. ..]
      path = ""

      buffer = Zip::ZipOutputStream.write_buffer do |zipfile|
        entries.each do |e|
          zipfile_path = path == '' ? e : File.join(path, e)
          disk_file_path = File.join(@input_dir, zipfile_path)

          @file = nil
          @data = nil

          if !File.directory?(disk_file_path)
            @file = File.open(disk_file_path, "r+b")
            @data = @file.read

            unless [@output_file, @input_dir].include?(e)
              zipfile.put_next_entry(e)
              zipfile.write @data
            end

            @file.close
          end
        end

        zipfile.put_next_entry(@output_file)

        zipfile.put_next_entry(@input_dir)
      end

      File.open(@output_file, "wb") { |f| f.write(buffer.string) }
    end
  end
end
You mention two different error scenarios, but it's not clear what those errors are. Can you expand your question to include the particular errors you see? - Stephen Crosby
Sure thing, @StephenCrosby! I expanded the question to include the errors I'm experiencing. - Tina Bell Vance

1 Answer


I was able to get Word documents to open without any warnings or corruption. Here's what I ended up doing:

require 'nokogiri'
require 'zip'
require 'zip/zipfilesystem'

  class ZipService
    # Initialize with the directory to zip and the location of the output archive.
    def initialize(input_dir, output_file)
      @input_dir = input_dir
      @output_file = output_file
    end

    # Zip the input directory.
    def write
      entries = Dir.entries(@input_dir) - %w[. ..]

      ::Zip::File.open(@output_file, ::Zip::File::CREATE) do |zipfile|
        write_entries entries, '', zipfile
      end
    end

    private

    # A helper method to make the recursion work.
    def write_entries(entries, path, zipfile)
      entries.each do |e|
        zipfile_path = path == '' ? e : File.join(path, e)
        disk_file_path = File.join(@input_dir, zipfile_path)

        if File.directory? disk_file_path
          recursively_deflate_directory(disk_file_path, zipfile, zipfile_path)
        else
          put_into_archive(disk_file_path, zipfile, zipfile_path, e)
        end
      end
    end

    def recursively_deflate_directory(disk_file_path, zipfile, zipfile_path)
      zipfile.mkdir zipfile_path
      subdir = Dir.entries(disk_file_path) - %w[. ..]
      write_entries subdir, zipfile_path, zipfile
    end

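    # Add one file to the archive. A .docx file is itself a zip archive, so its
    # word/document.xml is re-parsed and re-serialized with Nokogiri first.
    # Note that this rewrites the .docx on disk before it is added.
    def put_into_archive(disk_file_path, zipfile, zipfile_path, entry)
      if File.extname(zipfile_path) == ".docx"
        Zip::File.open(disk_file_path) do |zip|
          doc = zip.read("word/document.xml")
          xml = Nokogiri::XML.parse(doc)
          zip.get_output_stream("word/document.xml") { |f| f.write(xml.to_s) }
        end
      end

      # Both branches of the original if/else added the file the same way.
      zipfile.add(zipfile_path, disk_file_path)
    end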
  end
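
On the absolute-path error from the question: the message says entry names must not start with "/", and the question's code passes @output_file and @input_dir straight to put_next_entry, so those two calls are the likely trigger if either is an absolute path. Here is a minimal sketch, using only Ruby's standard library, of how to derive relative entry names from an absolute input directory. The helper name relative_entries is hypothetical and not part of RubyZip.

```ruby
require 'pathname'

# Hypothetical helper (not part of RubyZip): map every regular file under
# input_dir to a pair of [entry name relative to input_dir, path on disk].
def relative_entries(input_dir)
  root = Pathname.new(input_dir).expand_path

  # Pathname#find walks the tree recursively; keep regular files only.
  root.find.select(&:file?).map do |file|
    # relative_path_from drops the input_dir prefix, so the resulting
    # name never starts with "/".
    [file.relative_path_from(root).to_s, file.to_s]
  end.sort
end
```

Each relative name could then be passed to zipfile.put_next_entry in place of an absolute path, with the second element of the pair used to read the file's contents from disk. This only addresses the entry-name error, not the Word "unreadable content" warning.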