diff --git a/src/documentation/content/xdocs/hssf/eval-devguide.xml b/src/documentation/content/xdocs/hssf/eval-devguide.xml index 994ca7a54a..bd58291fc5 100644 --- a/src/documentation/content/xdocs/hssf/eval-devguide.xml +++ b/src/documentation/content/xdocs/hssf/eval-devguide.xml @@ -24,7 +24,7 @@ into RPN tokens using the FormulaParser class in POI-HSSF main. (If you dont know what RPN tokens are, now is a good time to read - this.) + this.)
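For readers who have not met RPN before, here is a minimal sketch of how a list of RPN tokens is evaluated with a stack. It is not POI code: the class name `RpnSketch` and the space-separated token format are assumptions made purely for illustration, and the real evaluator works on Ptg tokens rather than strings.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical illustration only; not part of POI.
public class RpnSketch {

    /** Evaluates space-separated RPN tokens, e.g. "1 2 + 3 *". */
    static double evaluate(String rpn) {
        Deque<Double> stack = new ArrayDeque<>();
        for (String token : rpn.trim().split("\\s+")) {
            switch (token) {
                case "+": case "-": case "*": case "/": {
                    // Operators pop their two operands; the right one is on top.
                    double right = stack.pop();
                    double left = stack.pop();
                    stack.push(apply(token, left, right));
                    break;
                }
                default:
                    // Anything else is treated as a numeric operand.
                    stack.push(Double.parseDouble(token));
            }
        }
        return stack.pop();
    }

    private static double apply(String op, double left, double right) {
        switch (op) {
            case "+": return left + right;
            case "-": return left - right;
            case "*": return left * right;
            default:  return left / right;
        }
    }
}
```

With this sketch, the infix formula (1+2)*3 becomes the token list "1 2 + 3 *", and the operands are consumed in the order the operators appear.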

The big picture

RPN tokens are mapped to Eval classes. (Class hierarchy for the Evals @@ -55,8 +55,7 @@ ValueEval objects which are set into the AreaEval and RefEval (ok, since AreaEval and RefEval are interfaces, the implementations of AreaEval and RefEval - but you'll figure all that out from the code)

-

OperationEvals for the standard operators have been implemented and - basic testing has been done

+

OperationEvals for the standard operators have been implemented and tested.

FunctionEval and FuncVarEval

FunctionEval is an abstract super class of FuncVarEval. The reason for this is that in the FormulaParser Ptg classes, there are two Ptgs, FuncPtg and FuncVarPtg. In my tests, I did not see FuncPtg being used so there is no corresponding FuncEval right now. But in case the need arises for a FuncEval class, FuncEval and FuncVarEval need to be isolated with a common interface/abstract class, hence FunctionEval.

@@ -65,120 +64,124 @@
Walkthrough of an "evaluate()" implementation.

So here is the fun part - lets walk through the implementation of the excel - function... AVERAGE()

+ function... SQRT()

The Code -public Eval evaluate(Eval[] operands) { - double d = 0; - int count = 0; - ValueEval retval = null; - for (int i = 0, iSize = operands.length; i < iSize; i++) { - if (operands[i] == null) continue; - if (operands[i] instanceof AreaEval) { - AreaEval ap = (AreaEval) operands[i]; - Object[] values = ap.getValues(); - for (int j = 0, jSize = values.length; j < jSize; j++) { - if (values[j] == null) continue; - if (values[j] instanceof NumberEval) { - //inside areas, ignore bools - d += ((NumberEval) values[j]).getNumberValue(); - count++; - } - else if (values[j] instanceof RefEval) { - RefEval re = (RefEval) values[j]; - ValueEval ve = re.getInnerValueEval(); - if (ve != null && ve instanceof NumberEval) { - d += ((NumberEval) ve).getNumberValue(); - count++; - } - } - } - } - else if (operands[i] instanceof NumericValueEval) { - // for direct operands evaluate bools - NumericValueEval np = (NumericValueEval) operands[i]; - d += np.getNumberValue(); - count++; - } - else if (operands[i] instanceof RefEval) { - RefEval re = (RefEval) operands[i]; - ValueEval ve = re.getInnerValueEval(); - if (ve instanceof NumberEval) { - //if it is a reference, ignore bools - NumberEval ne = (NumberEval) ve; - d += ne.getNumberValue(); - count++; - } - } +public class Sqrt extends NumericFunction { + + private static final ValueEvalToNumericXlator NUM_XLATOR = + new ValueEvalToNumericXlator((short) + ( ValueEvalToNumericXlator.BOOL_IS_PARSED + | ValueEvalToNumericXlator.EVALUATED_REF_BOOL_IS_PARSED + | ValueEvalToNumericXlator.EVALUATED_REF_STRING_IS_PARSED + | ValueEvalToNumericXlator.REF_BOOL_IS_PARSED + | ValueEvalToNumericXlator.STRING_IS_PARSED + )); + + protected ValueEvalToNumericXlator getXlator() { + return NUM_XLATOR; } - if (retval == null) { - retval = (Double.isNaN(d)) ? 
- (ValueEval) ErrorEval.ERROR_503 : new NumberEval(d/count); + public Eval evaluate(Eval[] operands, int srcRow, short srcCol) { + double d = 0; + ValueEval retval = null; + + switch (operands.length) { + default: + retval = ErrorEval.VALUE_INVALID; + break; + case 1: + ValueEval ve = singleOperandEvaluate(operands[0], srcRow, srcCol); + if (ve instanceof NumericValueEval) { + NumericValueEval ne = (NumericValueEval) ve; + d = ne.getNumberValue(); + } + else if (ve instanceof BlankEval) { + // do nothing + } + else { + retval = ErrorEval.NUM_ERROR; + } + } + + if (retval == null) { + d = Math.sqrt(d); + retval = (Double.isNaN(d)) ? (ValueEval) ErrorEval.VALUE_INVALID : new NumberEval(d); + } + return retval; } - return retval; + }
Implementation Details
Modelling Excel Semantics -

Strings are ignored. Booleans are ignored!!! (damn Oo.o! I was almost misled here - nevermind). Actually here's the info on Bools: +

Strings are ignored. Booleans are ignored!!! Actually here's the info on Bools: if you have the formula: "=TRUE+1", it evaluates to 2. So also, when you use TRUE like this: "=SUM(1,TRUE)", you see the result is: 2. So TRUE means 1 when doing numeric calculations, right? Wrong! Because when you use TRUE in referenced cells with arithmetic functions, it evaluates to blank - meaning it is not evaluated - as if it were a string or a blank cell. e.g. "=SUM(1,A1)" when A1 is TRUE evaluates to 1. - So you have to do this kind of check for every possible data type as a function argument for any function before you understand the behaviour of the function. The operands can be entered in excel as comma separated or as a region specified like: A2:D4. Regions are treated as a single token by the parser hence we have AreaEval which stores the ValueEval at each cell in a region in a 1D array. So in our function if the operand is of type AreaEval we need to get the array of ValueEvals in the region of the AreaEval and iterate over each of them as if each of them were individual operands to the AVERAGE function. + This behaviour changes depending on which function you are using. e.g. SQRT(..), which was + described earlier, treats a TRUE as 1 in all cases. This is why the configurable ValueEvalToNumericXlator + class had to be written.
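The idea of making the boolean handling configurable can be sketched in a few lines. This is not the ValueEvalToNumericXlator source: the class `BoolCoercionSketch`, its `Operand` type and its two flags are hypothetical names, loosely modelled on the flag names in the code above, and the sketch only covers numbers and booleans.

```java
import java.util.List;

// Hypothetical illustration only; not the POI ValueEvalToNumericXlator.
public class BoolCoercionSketch {

    // Assumed flags: which kinds of boolean operand count as 1/0.
    static final int BOOL_IS_PARSED = 1;      // TRUE/FALSE typed directly as an argument
    static final int REF_BOOL_IS_PARSED = 2;  // TRUE/FALSE read through a cell reference

    /** A value plus a note on whether it arrived via a cell reference. */
    static final class Operand {
        final Object value;
        final boolean viaRef;
        Operand(Object value, boolean viaRef) {
            this.value = value;
            this.viaRef = viaRef;
        }
    }

    /** Sums the operands, converting booleans only where the flags allow it. */
    static double sum(int flags, List<Operand> operands) {
        double total = 0;
        for (Operand op : operands) {
            if (op.value instanceof Number) {
                total += ((Number) op.value).doubleValue();
            } else if (op.value instanceof Boolean) {
                int needed = op.viaRef ? REF_BOOL_IS_PARSED : BOOL_IS_PARSED;
                if ((flags & needed) != 0) {
                    total += ((Boolean) op.value) ? 1 : 0;
                }
                // Otherwise the boolean is skipped, as if the cell were blank.
            }
        }
        return total;
    }
}
```

A SUM-like function would pass only BOOL_IS_PARSED, so "=SUM(1,TRUE)" gives 2 while "=SUM(1,A1)" with A1 set to TRUE gives 1. A SQRT-like function would pass both flags, so TRUE counts as 1 in either position.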

-

Thus, since sometimes, Excel treats - Booleans as the numbers 0 and 1 (for F and T respectively). - Hence BoolEval and NumberEval both implement a common interface: - NumericValueEval (since numbers and bools are also valid string - values, they also implement StringValueEval interface which is - also implemented by StringEval).

-

- The ValueEval inside an AreaEval can be one of: - NumberEval, BoolEval, StringEval, ErrorEval, BlankEval. - So you must handle each of these cases. - Similarly, RefEvals have a property: innerValueEval that returns the ValueEval at the referenced cell. The ValueEval inside a RefEval can be one of: NumberEval, BoolEval, StringEval, ErrorEval, BlankEval. So you must handle each of these cases - see how excel treats each one of them. -

- +

Note that when you are extending from an abstract function class like + NumericFunction (rather than implementing the interface o.a.p.hssf.record.formula.eval.Function directly) + you can use the utility methods in the super class - singleOperandEvaluate(..) - to quickly + reduce the different ValueEval subtypes to a small set of possible types. However when + implementing the Function interface directly, you will have to handle the possibility + of all different ValueEval subtypes being sent in as 'operands'. (Hard to put this in + words, please have a look at the code for NumericFunction for an example of + how/why different ValueEvals need to be handled) +

Testing Framework - TODO! FormulaEval comes with a testing framework, where you add - formula's and their expected values to an Excel sheet, and the test code - automatically validates them. Since this is still in flux, the docs - will be put online once the system is stable +

Automated testing of the implemented Function is easy. + The source code for this is in the file: o.a.p.h.record.formula.GenericFormulaTestCase.java. + This class has a reference to the test xls file (not /a/ test xls, /the/ test xls :) + which may need to be changed for your environment. Once you do that, in the test xls, + locate the entry for the function that you have implemented and enter different tests + in a cell in the FORMULA row. Then copy the "value of" the formula that you entered in the + cell just below it (this is easily done in excel as: + [copy the formula cell] > [go to cell below] > Edit > Paste Special > Values > "ok"). + You can enter multiple such formulas and paste their values in the cell below and the + test framework will automatically test if the formula evaluation matches the expected + value (Again, hard to put in words, so if you will, please take time to quickly look + at the code and the currently entered tests in the patch attachment "FormulaEvalTestData.xls" + file). +

\ No newline at end of file diff --git a/src/documentation/content/xdocs/hssf/eval.xml b/src/documentation/content/xdocs/hssf/eval.xml index acf88b421a..30785274c2 100644 --- a/src/documentation/content/xdocs/hssf/eval.xml +++ b/src/documentation/content/xdocs/hssf/eval.xml @@ -23,7 +23,7 @@
Status

The code currently provides implementations for all the arithmetic operators. - It also provides implementations for about 30 built in + It also provides implementations for approx. 20 built in functions in Excel. The framework however makes it easy to add implementation of new functions. See the Formula evaluation development guide for details.

@@ -47,19 +47,17 @@ HSSFFormulaEvaluator evaluator = new HSSFFormulaEvaluator(sheet, wb); CellReference cellReference = new CellReference("B3"); HSSFRow row = sheet.getRow(cellReference.getRow()); HSSFCell cell = row.getCell(cellReference.getCol()); -String formulaString = c.getCellFormula(); -HSSFFormulaEvaluator.CellValue cellValue = - evaluator.evaluate(formulaString); +HSSFFormulaEvaluator.CellValue cellValue = evaluator.evaluate(cell); switch (cellValue.getCellType()) { case HSSFCell.CELL_TYPE_BOOLEAN: - System.out.println(cellValue.getBooleanCellValue()); + System.out.println(cellValue.getBooleanValue()); break; case HSSFCell.CELL_TYPE_NUMERIC: - System.out.println(cellValue.getNumberCellValue()); + System.out.println(cellValue.getNumberValue()); break; case HSSFCell.CELL_TYPE_STRING: - System.out.println(cellValue.getStringCellValue()); + System.out.println(cellValue.getStringValue()); break; case HSSFCell.CELL_TYPE_BLANK: break; @@ -83,7 +81,7 @@ switch (cellValue.getCellType()) {
Using HSSFFormulaEvaluator.<strong>evaluateInCell</strong>(HSSFCell cell) -FileInputStream fis = new FileInputStream("c:/temp/test.xls"); +FileInputStream fis = new FileInputStream("/somepath/test.xls"); HSSFWorkbook wb = new HSSFWorkbook(fis); HSSFSheet sheet = wb.getSheetAt(0); HSSFFormulaEvaluator evaluator = new HSSFFormulaEvaluator(sheet, wb); @@ -92,7 +90,7 @@ HSSFFormulaEvaluator evaluator = new HSSFFormulaEvaluator(sheet, wb); CellReference cellReference = new CellReference("B3"); HSSFRow row = sheet.getRow(cellReference.getRow()); HSSFCell cell = row.getCell(cellReference.getCol()); -String formulaString = c.getCellFormula(); + if (cell!=null) { switch (evaluator.evaluateInCell(cell).getCellType()) { @@ -121,10 +119,6 @@ if (cell!=null) {
-
- -
-
Performance Notes