Opened 9 years ago

Closed 9 years ago

#1007 closed defect (fixed)

precision in floating point operations

Reported by: Vlad Merticariu
Owned by: Dimitar Misev
Priority: major
Milestone: 9.1.x
Component: qlparser
Version: development
Keywords:
Cc: Alex Dumitru, Peter Baumann
Complexity: Medium

Description

The following files have a variable called chlor_a, of type Float32:

https://rsg.pml.ac.uk/shared_files/olcl

Ingesting the data into rasdaman works correctly: I checked the min value, the max value, and random data points, using numpy to double-check.

Computing the average of the cells after excluding the nulls (which are MAX_FLOAT in this case) works correctly for small subsets (up to 1000 by 1000). Going larger than that, the average becomes increasingly different from the one computed with numpy.

This seems to be a precision issue with the floating point ops. To test this I tried:

select encode(condense + over x in [1:100000] using 0.3330f,"csv")

Result object 1: {33251.2}

which should be 33300.0

Change History (12)

comment:1 by Dimitar Misev, 9 years ago

Ok so our problem is with using float rather than double in condense. Test code:

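  #include <stdio.h>

  int main(void)
  {
    float add = 0.3330f;
    float res_float = 0;   /* single-precision accumulator */
    double res_double = 0; /* double-precision accumulator */
    for (int i = 1; i <= 100000; i++)
    {
      res_float += add;
      res_double += add;
    }
    printf("res_float %.5f\n", res_float);
    printf("res_double %.5f\n", res_double);
    return 0;
  }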

Output:

res_float 33251.16406
res_double 33300.00043

comment:2 by Vlad Merticariu, 9 years ago

This happens in add_cells as well, so maybe we should use larger types in some operations to avoid overflows?

comment:3 by Dimitar Misev, 9 years ago

Yes, I'm just hoping there won't be any issues with the protocol, let's see.

comment:4 by Vlad Merticariu, 9 years ago

It appears that the largest C float is 3.4E+38, so it probably isn't an overflow issue. Strange that this happens in the test code you provided.

comment:5 by Dimitar Misev, 9 years ago

Not an overflow, but it probably has to do with the representation of 0.3330 as a float, which has a small error (though a bigger one than double's). Even res_double has an error of 0.00043.
See e.g. https://en.wikipedia.org/wiki/Kahan_summation_algorithm; we should look into using some math library at some point.
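As a rough illustration of the Kahan summation idea linked above, here is a minimal sketch in C. The function name kahan_sum_repeated is hypothetical and not part of rasdaman; it simply repeats the experiment from comment:1 while keeping the accumulator in single precision and carrying a running compensation term.

```c
/* Hypothetical helper, not rasdaman code: adds `value` to a float
 * accumulator `count` times using Kahan (compensated) summation. */
float kahan_sum_repeated(float value, int count)
{
    float sum = 0.0f;  /* running total */
    float comp = 0.0f; /* low-order bits lost by previous additions */

    for (int i = 0; i < count; i++)
    {
        float y = value - comp; /* re-inject the previously lost part */
        float t = sum + y;      /* low-order bits of y may be lost here */
        comp = (t - sum) - y;   /* recover what was actually lost */
        sum = t;
    }
    return sum;
}
```

Calling kahan_sum_repeated(0.3330f, 100000) should land very close to 33300, whereas the plain float loop in comment:1 gives 33251.16406. Note that aggressive floating-point optimizations such as -ffast-math may let the compiler remove the compensation term, so this only helps with standard-conforming arithmetic.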

comment:6 by Alex Dumitru, 9 years ago

The largest float doesn't matter; the distribution of precision across floating-point values is not linear.
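To make the non-linear precision concrete, the sketch below measures the gap between a float and the next representable float above it. The helper float_spacing is a hypothetical name, and the code assumes IEEE 754 single precision and a positive finite argument.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: distance from x to the next larger float.
 * Assumes IEEE 754 binary32 and a positive, finite x. */
float float_spacing(float x)
{
    uint32_t bits;
    float next;

    memcpy(&bits, &x, sizeof bits);    /* reinterpret the float as raw bits */
    bits += 1;                         /* next representable value upwards */
    memcpy(&next, &bits, sizeof next);
    return next - x;
}
```

Under these assumptions float_spacing(0.333f) is about 3e-8, while float_spacing(33000.0f) is 0.00390625. Once the running sum is in the tens of thousands, each added 0.333 is rounded to a multiple of that much coarser step, which would be consistent with the drift seen in the float result.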

comment:7 by Dimitar Misev, 9 years ago

Component: rasserver → qlparser
Milestone: 9.1.x

A workaround for the moment is to use the d type specifier instead of f:

rasql -q 'select condense + over x in [1:100000] using 0.3330d' --out string

comment:8 by Dimitar Misev, 9 years ago

When no type specifier is used, however, it seems that float is assumed; I think it's best to stick to double by default. But even if f is specified on the constant, we should still have a double sum. It's done a bit weirdly now in the condenser implementation, which uses the type of the operands to perform the operation.


comment:9 by Dimitar Misev, 9 years ago

Patch submitted.

comment:10 by Dimitar Misev, 9 years ago

Cc: Peter Baumann added

My patch fixes it for floats, but the problem actually affects other types as well, e.g.

$ rasql -q 'select condense + over x in [1:1000] using 1c' --out string

  Result element 1: 232
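The result 232 equals 1000 modulo 256, which is what one would expect if the sum is accumulated in the operand's own 8-bit type. The sketch below reproduces that effect in plain C, assuming that the c suffix denotes an unsigned 8-bit char; both function names are hypothetical and are not taken from the rasdaman sources.

```c
#include <stdint.h>

/* Hypothetical: accumulate in the operand's own 8-bit type,
 * so the total wraps around modulo 256. */
uint8_t condense_add_u8(uint8_t value, int count)
{
    uint8_t acc = 0;
    for (int i = 0; i < count; i++)
        acc = (uint8_t)(acc + value); /* truncated back to 8 bits */
    return acc;
}

/* Hypothetical: accumulate the same operands in a wider type. */
uint64_t condense_add_wide(uint8_t value, int count)
{
    uint64_t acc = 0;
    for (int i = 0; i < count; i++)
        acc += value; /* no wraparound for realistic counts */
    return acc;
}
```

Here condense_add_u8(1, 1000) returns 232, matching the query output above, while condense_add_wide(1, 1000) returns 1000. This mirrors the float case: the accumulator needs a wider type than the operands.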

comment:11 by Dimitar Misev, 9 years ago

Patch submitted.

comment:12 by Dimitar Misev, 9 years ago

Resolution: fixed
Status: new → closed