Overview

There are two good choices for reading and writing Microsoft Excel spreadsheet files from Java in a platform-independent way: JExcelAPI and Jakarta POI (HSSF). Both provide a nice interface for accessing Excel data structures and even for generating new spreadsheets. I have tested both extensively for a high-profile project at a Fortune 500 company, and I had previously used HSSF successfully for another high-profile client. In the paragraphs below I present my conclusions and sample code for reading an Excel spreadsheet from Java with each library.

Comparison of JExcelAPI with Jakarta-POI (HSSF)

1. JExcelAPI is clearly not suitable for important data. In our tests it failed to read several files, and even when it could read a file it failed on some cells for unknown reasons. In short, JExcelAPI isn't suitable for enterprise use.

2. HSSF is the POI Project's pure Java implementation of the Excel '97(-2002) file format. It is a mature product, and it correctly and effortlessly read Excel data generated from various sources, including non-Microsoft products such as OpenOffice, and by various versions of Excel. It is very robust and full-featured. Highly recommended.

3. Performance was never a consideration in our tests because a) data integrity is the single most important factor and b) there didn't appear to be any significant performance difference while running the tests; both of them were very fast. We didn't bother to time it for the above reasons.

How to read an Excel Spreadsheet from Java using Jakarta POI (HSSF)


try {
    POIFSFileSystem fs = new POIFSFileSystem(new FileInputStream(file));
    HSSFWorkbook wb = new HSSFWorkbook(fs);
    HSSFSheet sheet = wb.getSheetAt(0);
    HSSFRow row;
    HSSFCell cell;

    // Number of physically defined rows
    int rows = sheet.getPhysicalNumberOfRows();

    int cols = 0; // Number of columns
    int tmp = 0;

    // Scan at least the first 10 rows so that the column count is found
    // even if the data doesn't start in the first few rows
    for(int i = 0; i < 10 || i < rows; i++) {
        row = sheet.getRow(i);
        if(row != null) {
            tmp = row.getPhysicalNumberOfCells();
            if(tmp > cols) cols = tmp;
        }
    }

    for(int r = 0; r < rows; r++) {
        row = sheet.getRow(r);
        if(row != null) {
            for(int c = 0; c < cols; c++) {
                cell = row.getCell((short)c);
                if(cell != null) {
                    // Your code here
                }
            }
        }
    }
} catch(Exception ioe) {
    ioe.printStackTrace();
}

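This sample should get you started. Don't forget the imports: java.io.FileInputStream, org.apache.poi.poifs.filesystem.POIFSFileSystem, and the HSSFWorkbook, HSSFSheet, HSSFRow and HSSFCell classes from org.apache.poi.hssf.usermodel.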

Gotchas while using Jakarta POI (HSSF)

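  • getPhysicalNumberOfRows() returns the number of rows physically defined in the file, which may not match the actual (logical) number of rows: it can be higher when empty rows are still defined, and it is not the index of the last row when there are gaps. The same goes for getPhysicalNumberOfCells().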
  • You should check for nulls when fetching the HSSFRow and HSSFCell objects as shown.
  • Remember that Excel tables are often sparsely populated, so choose your data structures accordingly.
  • POI accesses the data hierarchically: sheet, then row, then cell. In JExcelAPI you can directly access the cell at any column and row of a sheet.
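
To make the points about physical counts and sparse data concrete, here is a minimal sketch of a sparse structure for holding cell data. SparseSheet and its methods are hypothetical names of my own, not part of POI; the sketch only models the idea that the number of defined rows and the index of the last row are two different things.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sparse grid (not a POI class): stores only cells that hold a value. */
class SparseSheet {
    // row index -> (column index -> cell text); TreeMap keeps the indices sorted
    private final TreeMap<Integer, TreeMap<Integer, String>> rows = new TreeMap<>();

    /** Stores a value, creating the row on first use. */
    public void put(int row, int col, String value) {
        rows.computeIfAbsent(row, r -> new TreeMap<>()).put(col, value);
    }

    /** Returns the stored value, or null for a cell that was never set. */
    public String get(int row, int col) {
        Map<Integer, String> cells = rows.get(row);
        return cells == null ? null : cells.get(col);
    }

    /** Number of rows that hold at least one value (a "physical" style count). */
    public int definedRowCount() {
        return rows.size();
    }

    /** Index of the last populated row, or -1 if nothing has been stored. */
    public int lastRowIndex() {
        return rows.isEmpty() ? -1 : rows.lastKey();
    }

    public static void main(String[] args) {
        SparseSheet sheet = new SparseSheet();
        sheet.put(0, 0, "Name");
        sheet.put(5, 2, "42");
        sheet.put(10, 1, "total");

        System.out.println(sheet.definedRowCount()); // prints 3
        System.out.println(sheet.lastRowIndex());    // prints 10
        System.out.println(sheet.get(3, 0));         // prints null
    }
}
```

With rows 0, 5 and 10 populated, a loop bounded by the defined-row count (3) would stop long before reaching row 10, which is why both the loop bounds and the null checks matter on sparse sheets.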

How to access an Excel Spreadsheet using JExcelAPI


// Requires: java.io.File, jxl.Workbook, jxl.Sheet
File fp = new File(file);
try {
    Workbook wb = Workbook.getWorkbook(fp);
    Sheet sheet = wb.getSheet(0);
    int columns = sheet.getColumns();
    int rows = sheet.getRows();

    String data;
    // Note the argument order: getCell takes the column first, then the row
    for(int col = 0; col < columns; col++) {
        for(int row = 0; row < rows; row++) {
            data = sheet.getCell(col, row).getContents();
            // Your code here
        }
    }
    wb.close(); // Release the memory held by the workbook
} catch(Exception e) {
    System.out.println("Error: " + e);
}

Gotchas while using JExcelAPI

  • JExcelAPI often fails to fetch the data from certain cells or even the whole sheet. Unfortunately, it only gives a warning instead of an error to indicate the problem.
  • JExcelAPI doesn't expose the full metadata of the spreadsheet like POI does.
  • JExcelAPI doesn't properly recognize the data type in cells. In our tests it reported String data in every case, even for numeric or date fields.
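
If all you get back for each cell is its text, one possible workaround is to guess the type from the contents yourself. The following is a rough sketch under that assumption and is not part of JExcelAPI: CellTextParser is a hypothetical helper of my own, and it deliberately leaves dates as plain strings because date formats vary by locale and spreadsheet.

```java
import java.math.BigDecimal;

/** Hypothetical helper (not a JExcelAPI class): guesses a typed value from cell text. */
class CellTextParser {
    /**
     * Returns null for null or blank text, a Boolean for "true"/"false"
     * (case-insensitive), a BigDecimal for plain numeric text, and otherwise
     * the trimmed String unchanged.
     */
    public static Object parse(String text) {
        if (text == null) {
            return null;
        }
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return null;
        }
        if (trimmed.equalsIgnoreCase("true") || trimmed.equalsIgnoreCase("false")) {
            return Boolean.valueOf(trimmed);
        }
        try {
            // BigDecimal keeps the digits exactly as written, e.g. "12.50"
            return new BigDecimal(trimmed);
        } catch (NumberFormatException e) {
            // Not a plain number (text, dates, "1,234", ...): keep it as a String
            return trimmed;
        }
    }
}
```

In the JExcelAPI loop above, this could be applied as CellTextParser.parse(data). Numbers formatted with thousands separators or currency symbols will stay strings, so treat the result as a best guess and not as a substitute for proper type information from the library.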

Concluding thoughts on accessing Excel spreadsheets from Java

Both JExcelAPI and Jakarta POI (HSSF) are open source libraries for reading and writing Excel spreadsheets, even on non-Microsoft platforms. In my tests HSSF came out as the clear leader, and it is my recommended solution because of its robustness and features.