Opened 8 years ago

Last modified 7 years ago

#1557 closed enhancement

stddev / var condensers — at Version 3

Reported by: Dimitar Misev Owned by: mtarigradschi
Priority: major Milestone: 9.5
Component: qlparser Version: development
Keywords: Cc: Peter Baumann, bbell, Vlad Merticariu
Complexity: Medium

Description (last modified by Dimitar Misev)

SQL and various array DBMS support standard deviation and variance aggregations. Quoting from the SQL standard:

  • If VAR_POP is specified, then the population variance of <value expression> evaluated for each row that qualifies, defined as the sum of squares of the difference of <value expression> from the mean of <value expression>, divided by the number of rows that qualify.
  • If VAR_SAMP is specified, then the sample variance of <value expression> evaluated for each row that qualifies, defined as the sum of squares of the difference of <value expression> from the mean of <value expression>, divided by the number of rows that qualify minus 1 (one).
  • If STDDEV_POP is specified, then the population standard deviation of <value expression> evaluated for each row that qualifies, defined as the square root of the population variance.
  • If STDDEV_SAMP is specified, then the sample standard deviation of <value expression> evaluated for each row that qualifies, defined as the square root of the sample variance.

So we have these functions:

  • VAR_POP(generalExp)
  • VAR_SAMP(generalExp)
  • STDDEV_POP(generalExp)
  • STDDEV_SAMP(generalExp)

Change History (3)

comment:1 by Dimitar Misev, 8 years ago

Description: modified (diff)

comment:2 by bbell, 7 years ago

found a 1-pass algo for computing std. dev.

https://www.strchr.com/standard_deviation_in_one_pass

anyone know of a better algo than this?

comment:3 by Dimitar Misev, 7 years ago

Description: modified (diff)
Owner: set to mtarigradschi
Status: newassigned

I'd say let's go ahead and implement support for these functions? They are really standard stuff and straightforward to implement, so it's pretty low-hanging fruit.

You could start by cloning QtAddCells for example, and adapt it to compute stddev/var (not sure if it could be done with the usual condenser implementation in ops.cc though). Brennan is quite familiar with the tile-based evaluation in qlparser and can help with questions on this.

Note: See TracTickets for help on using tickets.