Floating Point numbers with Microsoft Excel Open XML Spreadsheet (.XLSX) files

Sean Leoszko · August 2019

When using .XLSX files in Aperture, be aware that floating point numbers will be treated differently with this file type compared to other excel files (.xls & .csv). XLSX stores values internally in scientific notation. If you open the attached file in Excel you’ll see:

However, if you save the XLSX file as a xml file and open up file in Notepad the you will see that some values will be stored with a longer floating point value that was defined in the excel spreadsheet :

Column A2’s data is stored as 7.0000000000000007E-2. Due to the limited resolution of floating point, the number 7.00000000007E-2 is the nearest possible representation of 0.07 that can be stored in a float.

When loading into Data Studio with “Auto-parse scientific numbers” set to true, you get:

When “Auto-parse scientific numbers” is false you see:

In other words, the data is stored in Excel as 0.0700000000000007. Data Studio picks up the raw value and loads that.

Why does Excel store the decimal in that way?

The answer is complicated and related to floating point inaccuracy. There’s a similar question here: http://stackoverflow.com/questions/12862524/why-does-excel-xlsx-save-some-values-in-xml-with-additional-values, and some more useful info here: http://stackoverflow.com/questions/2100490/floating-point-inaccuracy-examples. Any application that reads XLSX files will behave the same way.

Excel .xls files do not suffer from this problem. Or rather, the number is stored in the same way but only a truncated, displayed version can be read by applications like Data Studio (or Pandora), so it’s not visible (note that this can lead to other issues particularly in the calculations – there’s a bit more on that here: https://en.wikipedia.org/wiki/Numeric_precision_in_Microsoft_Excel)

Workaround

You can choose to use XLS to get around this issue or if that is not possible you can add a rounding with set scale (Decimal Places) transformation in your column.

You can see below that the 372.7988874444 value on row 3 in excel is actually a 10 decimal point value and the transformation has kept this extended value:

Floating Point numbers with Microsoft Excel Open XML Spreadsheet (.XLSX) files

Categories