Opened 8 years ago
Last modified 7 years ago
#1557 closed enhancement
stddev / var condensers — at Version 3
Reported by: | Dimitar Misev | Owned by: | mtarigradschi |
---|---|---|---|
Priority: | major | Milestone: | 9.5 |
Component: | qlparser | Version: | development |
Keywords: | Cc: | Peter Baumann, bbell, Vlad Merticariu | |
Complexity: | Medium |
Description (last modified by )
SQL and various array DBMS support standard deviation and variance aggregations. Quoting from the SQL standard:
- If VAR_POP is specified, then the population variance of <value expression> evaluated for each row that qualifies, defined as the sum of squares of the difference of <value expression> from the mean of <value expression>, divided by the number of rows that qualify.
- If VAR_SAMP is specified, then the sample variance of <value expression> evaluated for each row that qualifies, defined as the sum of squares of the difference of <value expression> from the mean of <value expression>, divided by the number of rows that qualify minus 1 (one).
- If STDDEV_POP is specified, then the population standard deviation of <value expression> evaluated for each row that qualifies, defined as the square root of the population variance.
- If STDDEV_SAMP is specified, then the sample standard deviation of <value expression> evaluated for each row that qualifies, defined as the square root of the sample variance.
So we have these functions:
- VAR_POP(generalExp)
- VAR_SAMP(generalExp)
- STDDEV_POP(generalExp)
- STDDEV_SAMP(generalExp)
Change History (3)
comment:1 by , 8 years ago
Description: | modified (diff) |
---|
comment:2 by , 7 years ago
comment:3 by , 7 years ago
Description: | modified (diff) |
---|---|
Owner: | set to |
Status: | new → assigned |
I'd say let's go ahead and implement support for these functions? They are really standard stuff and straightforward to implement, so it's pretty low-hanging fruit.
You could start by cloning QtAddCells for example, and adapt it to compute stddev/var (not sure if it could be done with the usual condenser implementation in ops.cc though). Brennan is quite familiar with the tile-based evaluation in qlparser and can help with questions on this.
found a 1-pass algo for computing std. dev.
https://www.strchr.com/standard_deviation_in_one_pass
anyone know of a better algo than this?