4
votes

I am trying to read an Excel file from a string using Apache POI 3.9, without success. I am not too familiar with Java.

Just to clarify: in my program I already have the Excel file as a string, and I am mocking that behaviour with the readFile function.

Program:

import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

import org.apache.poi.openxml4j.exceptions.InvalidFormatException;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.ss.usermodel.WorkbookFactory;

public class Test {

    static String readFile(String path, Charset encoding) throws IOException 
    {
        byte[] encoded = Files.readAllBytes(Paths.get(path));
        return encoding.decode(ByteBuffer.wrap(encoded)).toString();
    }

    public static void main(String[] args) throws IOException, InvalidFormatException {
        String result = readFile("data.xlsx", StandardCharsets.UTF_8);

        InputStream is = new ByteArrayInputStream(result.getBytes("UTF-8"));

        Workbook book = WorkbookFactory.create(is);
    }

}

The error I am getting is:

Exception in thread "main" java.util.zip.ZipException: invalid block type
    at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:164)
    at java.util.zip.ZipInputStream.read(ZipInputStream.java:193)
    at java.io.FilterInputStream.read(FilterInputStream.java:107)
    at org.apache.poi.openxml4j.util.ZipInputStreamZipEntrySource$FakeZipEntry.<init>(ZipInputStreamZipEntrySource.java:127)
    at org.apache.poi.openxml4j.util.ZipInputStreamZipEntrySource.<init>(ZipInputStreamZipEntrySource.java:55)
    at org.apache.poi.openxml4j.opc.ZipPackage.<init>(ZipPackage.java:83)
    at org.apache.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:267)
    at org.apache.poi.ss.usermodel.WorkbookFactory.create(WorkbookFactory.java:73)
    at Test.main(Test.java:28)

Any help would be appreciated.

cheers

4
You may find this library useful: github.com/eaorak/excelreaorak

4 Answers

6
votes

So the fix for my problem was to skip the String conversion entirely and hand the raw bytes to POI:

import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;

import org.apache.poi.openxml4j.exceptions.InvalidFormatException;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.ss.usermodel.WorkbookFactory;

public class Test {

    public static void main(String[] args) throws IOException, InvalidFormatException {
        byte[] result = Files.readAllBytes(Paths.get("data.xlsx"));     
        InputStream is = new ByteArrayInputStream(result);
        Workbook book = WorkbookFactory.create(is);
    }

}
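
If the workbook really has to travel as a String (as the question says it does), one option is to make that String a Base64 encoding of the bytes rather than a charset decode. This is only a sketch under assumptions: it needs Java 8+ for java.util.Base64, it assumes whatever produces the String can be changed to encode it this way, and the class and method names (Base64Transport, toText, toBytes, toStream) are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.Arrays;
import java.util.Base64;

public class Base64Transport {

    // Hypothetical helper: turn raw file bytes into a String that is safe to pass around.
    static String toText(byte[] raw) {
        return Base64.getEncoder().encodeToString(raw);
    }

    // Hypothetical helper: recover the exact original bytes from that String.
    static byte[] toBytes(String text) {
        return Base64.getDecoder().decode(text);
    }

    // Hypothetical helper: wrap the recovered bytes in a stream,
    // which is the kind of input WorkbookFactory.create(InputStream) accepts.
    static InputStream toStream(String text) {
        return new ByteArrayInputStream(toBytes(text));
    }

    public static void main(String[] args) {
        // Stand-in for file content: zip signature plus bytes that are not valid UTF-8.
        byte[] raw = {0x50, 0x4B, 0x03, 0x04, (byte) 0xFF, (byte) 0xC0, (byte) 0x80};
        String text = toText(raw);
        System.out.println("identical after round trip: " + Arrays.equals(raw, toBytes(text)));
    }
}
```

Base64 maps every byte value to plain ASCII characters, so nothing is replaced or dropped on the way back, at the cost of a String roughly a third larger than the file.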
3
votes

It looks like you're making this way too complicated. Just follow the Apache POI Quick Guide, which suggests reading the file with a FileInputStream. There's no need to read the bytes into a byte array and wrap them in a ByteArrayInputStream.

Use one of the following, copied from the guide:

// Use a file
Workbook wb = WorkbookFactory.create(new File("MyExcel.xls"));

// Use an InputStream, needs more memory
Workbook wb = WorkbookFactory.create(new FileInputStream("MyExcel.xlsx"));
0
votes

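What are you doing? You're reading a binary file into a byte[] and converting it to a String using UTF-8. Later you're converting it back to a byte stream using UTF-8 again. What for? An .xlsx file is a zip archive, not text, so the decode step replaces every byte sequence that is not valid UTF-8 and the data comes back corrupted. Skip all the steps in between: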

public static void main(String[] args) throws IOException, InvalidFormatException {
    InputStream is = new FileInputStream("data.xlsx");
    Workbook book = WorkbookFactory.create(is);
}
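
To see why the round trip breaks the file, here is a small stand-alone sketch that does not need POI. It pushes a few zip-like bytes through the same decode-as-UTF-8 then encode-as-UTF-8 path as the question's code. The class name RoundTripDemo and the sample bytes are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class RoundTripDemo {

    // Decodes the bytes as UTF-8 text and encodes that text back to bytes,
    // mirroring the readFile(...) + getBytes("UTF-8") pair in the question.
    static byte[] utf8RoundTrip(byte[] original) {
        String asText = new String(original, StandardCharsets.UTF_8);
        return asText.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "PK\003\004" zip signature followed by bytes that are not valid UTF-8.
        byte[] zipLike = {0x50, 0x4B, 0x03, 0x04, (byte) 0xFF, (byte) 0xC0, (byte) 0x80};
        byte[] after = utf8RoundTrip(zipLike);

        // Malformed sequences are swapped for the replacement character U+FFFD,
        // so the bytes that come back differ from the bytes that went in.
        System.out.println("unchanged: " + Arrays.equals(zipLike, after));
        System.out.println("length before=" + zipLike.length + ", after=" + after.length);
    }
}
```

The first line printed is "unchanged: false". Compressed data is full of sequences like these, which is why the inflater ends up complaining about an invalid block type.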
0
votes

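This bugged me for a while. None of the suggested fixes worked for me. What did resolve the issue was to add a nonFilteredFileExtensions block to the maven-resources-plugin configuration, so that Maven's resource filtering leaves the binary files untouched, thus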

        <plugin>
            <artifactId>maven-resources-plugin</artifactId>
            <version>2.5</version>
            <configuration>
              <encoding>UTF-8</encoding>
              <nonFilteredFileExtensions>
                <nonFilteredFileExtension>docx</nonFilteredFileExtension>
                <nonFilteredFileExtension>xls</nonFilteredFileExtension>
                <nonFilteredFileExtension>xlsx</nonFilteredFileExtension>
              </nonFilteredFileExtensions>
            </configuration>
        </plugin>