The problem only occurs if the Excel
used is not a English one. Then the formula is not really =TEXT(A2,"yyyy-MM-dd")
for example but =TEXT(A2,"JJJJ-MM-TT")
in my German Excel
for example.
As you see, the format part within the TEXT
function will always be locale dependant although all other formula parts will be en_US locale always. This is because that format part is in a string within the formula which will not be changed. So in German it is =TEXT(A2,"JJJJ-MM-TT")
(Year = Jahr, Day = Tag) and in French it is =TEXT(A2,"AAAA-MM-JJ")
(Year = Année, Day = Jour) .
And because apache poi
's FormulaEvaluator
does not have locale settings until now, that formula cannot be evaluated properly.
Then we have two possibilities.
First we could hope the stored cell value should be the needed string. So if cell formula starts with "TEXT" and contains "JJJJ-MM-TT" then do not evaluating because this will not be properly. Instead take the string cell value from Excel
's last evaluation.
Second we could replacing the locale dependent format part with the en_US one in the formula and then let apache poi
evaluate. At least if we only wants reading and not rewriting the Excel
file this will not destroing something in the Excel
file.
Code first approach:
import org.apache.poi.ss.usermodel.*;
import org.apache.poi.ss.usermodel.CellType;
import org.apache.poi.ss.util.*;
import org.apache.poi.ss.formula.eval.ErrorEval;
import java.io.FileInputStream;
class ReadExcelExample {
private static String getString(Cell cell, FormulaEvaluator evaluator) {
if (cell == null) return "null";
String text = "";
switch (cell.getCellType()) {
//switch (cell.getCellTypeEnum()) {
case STRING:
text = cell.getRichStringCellValue().getString();
break;
case NUMERIC:
if (DateUtil.isCellDateFormatted(cell)) {
text = String.valueOf(cell.getDateCellValue());
} else {
text = String.valueOf(cell.getNumericCellValue());
}
break;
case BOOLEAN:
text = String.valueOf(cell.getBooleanCellValue());
break;
case FORMULA:
text = cell.getCellFormula();
//if formula is TEXT(...,"JJJJ-MM-TT") then do not evaluating:
if (cell.getCellFormula().startsWith("TEXT") && cell.getCellFormula().contains("JJJJ-MM-TT")) {
text = text + ": value got from cell = " + cell.getRichStringCellValue().getString();
} else {
CellValue cellValue = evaluator.evaluate(cell);
switch (cellValue.getCellType()) {
//switch (cellValue.getCellTypeEnum()) {
case STRING:
text = text + ": " + cellValue.getStringValue();
break;
case NUMERIC:
if (DateUtil.isCellDateFormatted(cell)) {
text = text + ": " + String.valueOf(DateUtil.getJavaDate(cellValue.getNumberValue()));
} else {
text = text + ": " + String.valueOf(cellValue.getNumberValue());
}
break;
case BOOLEAN:
text = text + ": " + String.valueOf(cellValue.getBooleanValue());
break;
case ERROR:
throw new RuntimeException("from CellValue: " + ErrorEval.getText(cellValue.getErrorValue()));
default:
throw new RuntimeException("unexpected cellValue type " + cellValue.getCellType());
}
}
break;
case ERROR:
throw new RuntimeException("from Cell: " + ErrorEval.getText(cell.getErrorCellValue()));
case BLANK:
text = "";
break;
default:
throw new RuntimeException("unexpected cell type " + cell.getCellType());
}
return text;
}
public static void main(String[] args) throws Exception {
//Workbook wb = WorkbookFactory.create(new FileInputStream("SAMPLE.xls"));
Workbook wb = WorkbookFactory.create(new FileInputStream("SAMPLE.xlsx"));
DataFormatter formatter = new DataFormatter(new java.util.Locale("en", "US"));
FormulaEvaluator evaluator = wb.getCreationHelper().createFormulaEvaluator();
Sheet sheet = wb.getSheetAt(0);
for (Row row : sheet) {
for (Cell cell : row) {
CellReference cellRef = new CellReference(row.getRowNum(), cell.getColumnIndex());
System.out.print(cellRef.formatAsString());
System.out.print(" - ");
String text = "";
try {
text = getString(cell, evaluator);
} catch (Exception ex) {
text = ex.toString();
}
System.out.println(text);
}
}
wb.close();
}
}
German Excel:
Result:
A1 - Value
B1 - Formula
A2 - Fri Jan 11 00:00:00 CET 2019
B2 - TEXT(A2,"JJJJ-MM-TT"): value got from cell = 2019-01-11
A3 - 123.45
B3 - A3*2: 246.9
B4 - java.lang.RuntimeException: from CellValue: #DIV/0!
B5 - TODAY(): Fri Jan 11 00:00:00 CET 2019
B6 - B5=A2: true
A7 - java.lang.RuntimeException: from CellValue: #N/A
B8 - TEXT(TODAY(),"JJJJ-MM-TT"): value got from cell = 2019-01-11
English Calc:
Result:
A1 - Value
B1 - Formula
A2 - Fri Jan 11 00:00:00 CET 2019
B2 - TEXT(A2,"yyyy-MM-dd"): 2019-01-11
A3 - 123.45
B3 - A3*2: 246.9
B4 - java.lang.RuntimeException: from CellValue: #DIV/0!
B5 - TODAY(): Fri Jan 11 00:00:00 CET 2019
B6 - B5=A2: true
A7 - java.lang.RuntimeException: from CellValue: #N/A
B8 - TEXT(TODAY(),"yyyy-MM-dd"): 2019-01-11
Code second approach (replacing the locale dependent format part with the en_US one):
import org.apache.poi.ss.usermodel.*;
import org.apache.poi.ss.usermodel.CellType;
import org.apache.poi.ss.util.*;
import java.io.FileInputStream;
import java.util.Locale;
class ExcelEvaluateTEXTDiffLocales {
private static String getString(Cell cell, DataFormatter formatter, FormulaEvaluator evaluator, Locale locale) {
String text = "";
if (cell.getCellType() == CellType.FORMULA) {
String cellFormula = cell.getCellFormula();
text += cellFormula + ":= ";
if (cellFormula.startsWith("TEXT")) {
int startFormatPart = cellFormula.indexOf('"');
int endFormatPart = cellFormula.lastIndexOf('"') + 1;
String formatPartOld = cellFormula.substring(startFormatPart, endFormatPart);
String formatPartNew = formatPartOld;
if ("de".equals(locale.getLanguage())) {
formatPartNew = formatPartNew.replace("T", "D"); // Tag = Day
// Monat = Month
formatPartNew = formatPartNew.replace("J", "Y"); // Jahr = Year
//...
} else if ("fr".equals(locale.getLanguage())) {
formatPartNew = formatPartNew.replace("J", "D"); // Jour = Day
// Mois = Month
formatPartNew = formatPartNew.replace("A", "Y"); // Année = Year
//...
} //...
cellFormula = cellFormula.replace(formatPartOld, formatPartNew);
cell.setCellFormula(cellFormula);
}
}
try {
text += formatter.formatCellValue(cell, evaluator);
} catch (org.apache.poi.ss.formula.eval.NotImplementedException ex) {
text += ex.toString();
}
return text;
}
public static void main(String[] args) throws Exception {
//Workbook wb = WorkbookFactory.create(new FileInputStream("SAMPLE.xls"));
Workbook wb = WorkbookFactory.create(new FileInputStream("SAMPLE.xlsx"));
Locale locale = new Locale("fr", "CH");
DataFormatter formatter = new DataFormatter(locale);
FormulaEvaluator evaluator = wb.getCreationHelper().createFormulaEvaluator();
Sheet sheet = wb.getSheetAt(0);
for (Row row : sheet) {
for (Cell cell : row) {
CellReference cellRef = new CellReference(row.getRowNum(), cell.getColumnIndex());
System.out.print(cellRef.formatAsString());
System.out.print(" - ");
String text = "";
text = getString(cell, formatter, evaluator, locale);
System.out.println(text);
}
}
wb.close();
}
}
French Calc:
Result:
A1 - Value
B1 - Formula
A2 - 1/11/2019
B2 - TEXT(A2,"AAAA-MM-JJ"):= 2019-01-11
A3 - 123.45
B3 - A3*2:= 246.9
B4 - 1/A4:= #DIV/0!
B5 - TODAY():= 1/12/2019
B6 - B5=A2:= FALSE
A7 - NA():= #N/A
B8 - TEXT(TODAY(),"AAAA-MM-JJ"):= 2019-01-12
Hint: Used apache poi
version here is 4.0.1
. Maybe lower versions might have further evaluation issues.
Excel
used is not a English one. Then the formula is not really=TEXT(A2,"yyyy-MM-dd")
for example but=TEXT(A2,"JJJJ-MM-TT")
in my GermanExcel
for example. And becauseapache poi
'sFormulaEvaluator
does not have locale settings until now, that formula cannot be evaluated properly. Then one could only hope the stored cell value which should be the needed string. So if cell formula starts with "TEXT" then do not evaluating but do taking the string cell value fromExcel
's last evaluation. – Axel Richter