1
votes

I am using Java 8, excel and apache poi for my project. There are certain cell values that I am interested in extracting from excel using java. I am trying to detect text which is strikeout in the excel cells, but the format of text is little different that is why I am facing some problems.

Below is how data laid out in my excel sheet:

enter image description here enter image description here

After extacting this data from excel, I always save it in string arraylist format like this a = [text 1, text 2, text 3]. code is mentioned below if you want to see how I am storing data in this arraylist.

What I want:

I want to ignore all those texts which are strikeout, so in above case I expect to have output like this [text 2, text 3] for first picture and second picture.

What I tried:

For the sake of just detecting strikeout values, I tried below code first:

XSSFRichTextString text = new XSSFRichTextString(a.get(0));                             
XSSFFont font = text.getFontAtIndex(0);
Boolean font_striked = font.getStrikeout();

but above code is not working as font_striked returns null, it must return true or false

The code which partially works in my case on single line cell values is:

boolean striked_out = sheet.getRow(row_index).getCell(column_index).getCellStyle(). getFont().getStrikeout();

This code only works if there is single line value in the cell and not with bullet list as shown above. It fails as it is not made for such kind of text.

P.S I believe that if somehow I am able to detect even a single strikeout string in bullet points from arraylist, I can make it work for all the data.

As per the answer below I have updated my question adding following code to show how I make my string arraylist

How I convert data in excel into Arraylist:

String value_header = cell.getStringCellValue();
String[] newline_split = value_header.split("-");

for (int i = 0; i < newline_split.length; i++){
            final_values = newline_split[i].
                  replace("\n"," ").replaceAll("\\s{2,}", " ").trim();
            XSSFRichTextString text = new XSSFRichTextString(final_values);                     
            XSSFFont font = text.getFontAtIndex(0);
            Boolean font_striked = font.getStrikeout();
} // for ends here
3
"I always save it in an arraylist format " How? Show code please. And what kind of ArrayList? ArrayList<String>? If so, then all formats will be lost since String does not have some formats. So new XSSFRichTextString(a.get(0)) will be a XSSFRichTextString without formats.Axel Richter
@AxelRichter as you asked, I have update my code in P.S section, let me know what you think.Dhruvify

3 Answers

2
votes

You will need to get the RichTextString first, then go through all FormattingRuns, check whether it is stroked out and only if not, then get the appropriated substring and put it into the List:

import org.apache.poi.ss.usermodel.*;
import org.apache.poi.ss.usermodel.CellType.*;
import org.apache.poi.xssf.usermodel.*;

import java.io.FileInputStream;

import java.util.List;
import java.util.ArrayList;

class ReadExcelRichTextCells {

 public static void main(String[] args) throws Exception {

  Workbook wb  = WorkbookFactory.create(new FileInputStream("ExcelRichTextCells.xlsx"));

  Sheet sheet = wb.getSheetAt(0);
  for (Row row : sheet) {
   for (Cell cell : row) {

    switch (cell.getCellTypeEnum()) {
     case STRING:
      XSSFRichTextString richtextstring = (XSSFRichTextString)cell.getRichStringCellValue();
      String textstring = richtextstring.getString();

      List<String> textparts = new ArrayList<String>();

      if (richtextstring.hasFormatting()) {
       for (int i = 0; i < richtextstring.numFormattingRuns(); i++) {

        if (richtextstring.getFontOfFormattingRun(i)==null || !richtextstring.getFontOfFormattingRun(i).getStrikeout()) {

         int indexofformattingrun = richtextstring.getIndexOfFormattingRun(i);
         String textpart = textstring.substring(indexofformattingrun, 
                                                indexofformattingrun + richtextstring.getLengthOfFormattingRun(i));
         String[] textpart_split = textpart.split("-");
         for (int j = 0; j < textpart_split.length; j++){
          String text = textpart_split[j].replace("\n", "").trim();       
          if (!"".equals(text)) textparts.add(text);
         }
        }
       } 
      } else {
       textparts.add(textstring);
      }

      System.out.println(textparts);
      break;

     //...
     default:
      System.out.println("default cell"); //should never occur
    }
   }
  }

  wb.close();

 }
}
1
votes

This is how to get the strikethrough in Excel with VBA:

Public Sub IsMyActivecellStriked()
    Debug.Print ActiveCell.Font.Strikethrough
End Sub

If you have something like this:

enter image description here

Then the you should find a way to access the characters and check for them. Like this:

Option Explicit
Public Sub TestMe()

    Dim strRange    As String
    Dim varArr      As Variant
    Dim varStr      As Variant
    Dim lngStart    As Long
    Dim lngEnd      As Long

    strRange = [a1]
    varArr = Split(strRange, Chr(10))

    For Each varStr In varArr
       lngStart = InStr(1, strRange, varStr)
       Debug.Print [a1].Characters(Start:=lngStart, Length:=Len(varStr)).Font.Strikethrough
       Debug.Print [a1].Characters(Start:=lngStart, Length:=Len(varStr)).Text
    Next varStr

End Sub

This will give you the following in the immediate window:

False
aaa
True
bbb
True
ccc
False
ddd

This should be possible to be translated into Java with the POI library.

1
votes

As per I understanding above question Question (plz Correct me if I am wrong..!) It should show whether your text in cell is strikethrough or not. ( TRUE or FALSE)

Below I have created a demo with that :

public class ApachePOI {    
    public static void main(String[] args) {

        //Using workbook
        XSSFWorkbook workbook;

        try {
            //Access excel file as workbook
            workbook = new XSSFWorkbook(new FileInputStream(new File("/testExcelfile.xlsx")));

            // first sheet of excel file
            XSSFSheet xssfFirstSheet = workbook.getSheetAt(0);

            //Check for A1  cell that strikethrough or not
            boolean strikedOutTextStatus = xssfFirstSheet.getRow(0).getCell(0).getCellStyle().getFont().getStrikeout();

            //print status of A1 cell text
            System.out.println(strikedOutTextStatus); 

// UPDATED CODE

      if(strikedOutTextStatus){

            String cellStringValue = xssfFirstSheet.getRow(0).getCell(0).getStringCellValue();

            System.out.println("cell Value  : "+cellStringValue.replace("-", "").replace(" ", ""));
        }


        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }           
    }    
}