#1007 closed defect (fixed)

precision in floating point operations

Reported by: vmerticariu
Owned by: dmisev
Priority: major
Milestone: 9.1.x
Component: qlparser
Version: development
Keywords:
Cc: mdumitru, pbaumann
Complexity: Medium


The following files have a variable called chlor_a, Float32:


Ingesting the data in rasdaman works correctly (have checked min value, max value, and random data points).

numpy has been used for double checking.

Computing the average of the cells after excluding the nulls (which are MAX_FLOAT in this case) works correctly for small subsets (up to 1000 by 1000). Going larger than that, the average becomes increasingly different from the one computed with numpy.

This seems to be a precision issue with the floating point ops. To test this I tried:

select encode(condense + over x in [1:100000] using 0.3330f,"csv")

Result object 1: {33251.2}

which should be 33300.0

Change History (12)

comment:1 Changed 22 months ago by dmisev

Ok so our problem is with using float rather than double in condense. Test code:

  #include <stdio.h>

  int main(void) {
    float add = 0.3330f;
    float res_float = 0;
    double res_double = 0;
    /* both sums are updated on every iteration */
    for (int i = 1; i <= 100000; i++) {
      res_float += add;
      res_double += add;
    }
    printf("res_float %.5f\n", res_float);
    printf("res_double %.5f\n", res_double);
    return 0;
  }


res_float 33251.16406
res_double 33300.00043

comment:2 Changed 22 months ago by vmerticariu

This happens in add_cells as well, so maybe we should use larger types in some operations to avoid overflows?

comment:3 Changed 22 months ago by dmisev

Yes, I'm just hoping there won't be some issues with the protocol, let's see.

comment:4 Changed 22 months ago by vmerticariu

It appears that the largest C float is 3.4E+38, so it probably isn't an overflow issue. Strange that this happens in the test code you provided.

comment:5 Changed 22 months ago by dmisev

Not an overflow, but probably it has to do with the representation of 0.3330 as float which has a small error (but bigger than double's). Even res_double has an error of .00043
See e.g. https://en.wikipedia.org/wiki/Kahan_summation_algorithm, we should look into using some math library at some point.

comment:6 Changed 22 months ago by mdumitru

The largest float doesn't matter; the distribution of precision in floating point values is not linear.

comment:7 Changed 22 months ago by dmisev

  • Component changed from rasserver to qlparser
  • Milestone set to 9.1.x

A workaround for the moment is to use the d type specifier instead of f:

rasql -q 'select condense + over x in [1:100000] using 0.3330d' --out string

comment:8 Changed 22 months ago by dmisev

When no type specifier is used, however, it seems that float is assumed; I think it's best to stick to double by default. But even if f is specified on the constant, we should still have a double sum. It's done a bit weirdly now in the condenser implementation, which uses the type of the operands to perform the operation.

Last edited 22 months ago by dmisev

comment:9 Changed 22 months ago by dmisev

Patch submitted.

comment:10 Changed 22 months ago by dmisev

  • Cc pbaumann added

My patch fixes it for floats, but it actually affects other types as well, e.g.

$ rasql -q 'select condense + over x in [1:1000] using 1c' --out string

  Result element 1: 232

comment:11 Changed 20 months ago by dmisev

Patch submitted.

comment:12 Changed 19 months ago by dmisev

  • Resolution set to fixed
  • Status changed from new to closed