Opened 11 years ago
Closed 11 years ago
#700 closed defect (fixed)
WCS tupleList is in column-major whereas it should be in row-major
Reported by: | Dimitar Misev | Owned by: | Veranika Liaukevich |
---|---|---|---|
Priority: | major | Milestone: | 9.0.x |
Component: | petascope | Version: | development |
Keywords: | range domain function | Cc: | Peter Baumann, adumitru, Jelmer Oosthoek, abeccati |
Complexity: | Medium |
Description
The output of csv in rasdaman is in column-major order, and is verbatim copied to the WCS GML output as far as I know, which expects row-major order.
Here's a script that converts the csv of mr to png. When the csv is read as column-major the output is as expected; when it is read as row-major the output is messed up.
#!/usr/bin/python
import re
import os

ncols = 256
nrows = 211
header = '''ncols %s
nrows %s
xllcorner %s
yllcorner %s
cellsize %s
''' % (ncols,nrows,0,0,1)

output = open("/tmp/data_rasql.asc","w")
output.write(header)
os.system("cd /tmp && rasql -q 'select csv(c) from mr as c' --out file")
tupleList = open("/tmp/rasql_1.csv","r").readline().strip().replace("{","").replace("}","").split(",")
for i in range(nrows):
    for j in range(ncols):
        output.write(tupleList[i + (j * nrows)] + " ")
    output.write("\n")
output.close()
os.system("gdal_translate -of PNG -ot Byte /tmp/data_rasql.asc /tmp/out.png")
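The index arithmetic in the script's inner loop can be isolated as a small helper. This is only a sketch, assuming the flat list holds the first axis (x, ncols values) outermost with the second axis (y, nrows values) varying fastest, which is how the script reads it. The name to_row_major is hypothetical and not part of rasdaman.

```python
def to_row_major(values, ncols, nrows):
    """Reorder a flat list in which y varies fastest into scanlines.

    Assumed input layout (as in the script): index = y + x * nrows.
    Output: a list of nrows rows, each holding ncols values.
    """
    if len(values) != ncols * nrows:
        raise ValueError("expected %d values, got %d" % (ncols * nrows, len(values)))
    return [[values[y + x * nrows] for x in range(ncols)] for y in range(nrows)]


# A 3-column, 2-row grid whose cells are labelled "x,y".
flat = ["0,0", "0,1", "1,0", "1,1", "2,0", "2,1"]
rows = to_row_major(flat, ncols=3, nrows=2)
for row in rows:
    print(" ".join(row))
```

Each printed line is one scanline, which is the layout the ASCII grid file in the script needs.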
Change History (18)
comment:1 by , 11 years ago
comment:2 by , 11 years ago
A third option is to use
rasql -q 'select encode(c, "AAIGrid") from mr as c' --out file
however this only works for 2D.
comment:3 by , 11 years ago
1 is suicidal, I prefer the second option.
Another alternative is that Petascope specifies the gml:coverageFunction
to follow the sequence rule returned by the CSV output, hence (e.g. for 3D):
<gmlrgrid:sequenceRule axisOrder="+3 +2 +1">Linear</gmlrgrid:sequenceRule>
See "Linear" sequence rule in:
http://rasdaman.org/wiki/PetascopeSubsets
I would stick to the default sequence rule and add the rowmajor
option in CSV anyway.
comment:4 by , 11 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
Ok reassigning to Veranika, as she has done quite some changes in the CSV converter last.
To recap, csv() should allow a parameter format=rowmajor:
select csv(c, "format=rowmajor") from mr as c
The default output should still be column-major however.
Later on perhaps we could also add format=gml, so that instead of braces a space is printed.
comment:5 by , 11 years ago
Cc: | added |
---|---|
Keywords: | range domain function added |
I would call the parameter sequencerule or gridfunction instead of format.
As possible values I would avoid the terms row and column, and rather use inner_outer (+1 +2 __ +N-1 +N) and outer_inner ('+N +N-1 __ +2 +1', what is currently done).
To complete the picture, a startpoint parameter could also be added, but I would leave this until it is really needed.
Regarding default values, I would make them the GML way: starting point is sdom.lo for every dimension (current implementation), and inner_outer as the default listing order.
But this might break back-compatibility, so I guess outer_inner should be kept as default.
Again, in order to keep back-compatibility I believe Petascope should also not change the order of coordinates in the tuple list (range values), is that right?
So we could add the gml:coverageFunction in our templates:
<gml:coverageFunction>
  <gml:GridFunction>
    <gml:sequenceRule axisOrder="+N +N-1 __ +2 +1">Linear</gml:sequenceRule>
    <gml:startPoint>0 0 __ 0 0</gml:startPoint>
  </gml:GridFunction>
</gml:coverageFunction>
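To make the placeholders in the template concrete, here is a minimal sketch that fills in axisOrder and startPoint for a given number of dimensions. The function name coverage_function and its parameters are hypothetical, and this is not Petascope's template code. It assumes the element names and gml prefix of the template above, with namespace declarations left to the enclosing document.

```python
def coverage_function(start_point, outer_inner=True):
    """Build a gml:coverageFunction fragment for an N-D grid.

    start_point: one lower bound per axis (sdom.lo in the discussion).
    outer_inner=True  gives axisOrder "+N ... +2 +1" (current CSV order).
    outer_inner=False gives axisOrder "+1 +2 ... +N".
    """
    ndims = len(start_point)
    if ndims < 1:
        raise ValueError("at least one axis is required")
    axes = range(ndims, 0, -1) if outer_inner else range(1, ndims + 1)
    axis_order = " ".join("+%d" % axis for axis in axes)
    start = " ".join(str(lo) for lo in start_point)
    return (
        "<gml:coverageFunction>\n"
        "  <gml:GridFunction>\n"
        '    <gml:sequenceRule axisOrder="%s">Linear</gml:sequenceRule>\n'
        "    <gml:startPoint>%s</gml:startPoint>\n"
        "  </gml:GridFunction>\n"
        "</gml:coverageFunction>" % (axis_order, start)
    )


fragment_3d = coverage_function([0, 0, 0])
print(fragment_3d)
```

For three axes this yields the same axisOrder="+3 +2 +1" that was suggested earlier in the thread.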
This can be implemented relatively quickly, but I would like to have the go-ahead before tackling it.
comment:6 by , 11 years ago
Refs: GML 3.2.1 (OGC 07-036), Sections 19.3.11, 19.3.12, 19.3.13, 19.3.14.
comment:7 by , 11 years ago
I also prefer the terms "inner_outer" and "outer_inner" over "row_major" and "column_major", as the latter depend on which dimension is taken to describe rows and which columns. I am used to the first dimension describing rows, so [0:2, 0:1] is a table with 3 rows and 2 columns, and under that notion the current CSV encoder already returns the array in row-major order.
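The two orders can be illustrated by listing the cells of [0:2, 0:1]. This is a sketch in plain Python, not the converter's code. It assumes, following the axis orders given earlier in the thread, that outer_inner lets the last axis vary fastest (+N ... +1) and inner_outer lets the first axis vary fastest (+1 ... +N). The function name cells is hypothetical.

```python
from itertools import product


def cells(sdom, order="outer_inner"):
    """List the cell coordinates of a spatial domain in a given order.

    sdom: list of inclusive (lo, hi) bounds, one pair per axis.
    """
    ranges = [range(lo, hi + 1) for lo, hi in sdom]
    if order == "outer_inner":
        # itertools.product varies its last argument fastest.
        return list(product(*ranges))
    if order == "inner_outer":
        # Iterate with the axes reversed, then restore the axis order.
        return [tuple(reversed(p)) for p in product(*reversed(ranges))]
    raise ValueError("unknown order: %r" % order)


sdom = [(0, 2), (0, 1)]
outer_inner = cells(sdom, "outer_inner")
inner_outer = cells(sdom, "inner_outer")
print(outer_inner)
print(inner_outer)
```

With the first axis read as rows, the outer_inner listing walks each row in turn, which matches the row-major reading described above.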
comment:8 by , 11 years ago
The patch was committed, now you can just add option "order=inner_outer" to the csv converter. The corresponding changes in QL Guide from Peter are coming.
comment:9 by , 11 years ago
Caveat: for 9.0.1, this is only available with csv() and inv_csv(), not yet with encode(c,"csv").
follow-up: 11 comment:10 by , 11 years ago
While this new feature should not be used by Petascope, for back-compatibility (define the coverage function instead, and keep the outer_inner order), I suggest exploiting it somehow when encoding to some binary format.
$ grep row-major qlparser/qtencode.cc
// for all bands, convert data from column-major form (from Rasdaman) to row-major form (GDAL)
comment:11 by , 11 years ago
Replying to pcampalani:
While this new feature should not be used by Petascope for backcompatibility (define coverage function instead, and keep outer_inner order), I suggest to exploit it somehow when encoding to some binary format.
$ grep row-major qlparser/qtencode.cc
// for all bands, convert data from column-major form (from Rasdaman) to row-major form (GDAL)
That's something else, I don't see how this ticket is related.
comment:12 by , 11 years ago
I thought this option was rooted in the way data is extracted from rasdaman (not CSV only), so that the column-major/row-major transformations could be avoided, which might mean a good performance gain. I may well misunderstand the internal mechanics here, so spin this off into another ticket if that makes sense; otherwise just disregard my comments.
comment:13 by , 11 years ago
IMHO it is good to have control on both levels - rasql clients requesting CSV may need this as well.
comment:14 by , 11 years ago
Storage is done by rasdaman in a way that the arrays arrive natively in the C++ engine. They also need to be delivered this way to the client to keep the promise that "you can readily iterate over the array with your C++ client code". So we cannot change that, but we can indeed change how data is provided for various purposes → use encode() parameters, as we do.
comment:15 by , 11 years ago
Yes, the outer_inner (default) order uses the native internal array layout, and is therefore preferred: it avoids random memory reads and gives maximum performance.
comment:16 by , 11 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
comment:17 by , 11 years ago
Resolution: | fixed |
---|---|
Status: | closed → reopened |
Re-opening: WCS output is not fixed, see my pending patch. A gml:coverageFunction must be specified to fix the WCS output.
Veranika, I believe you patched the RasQL CSV encoder, right?
PS: always refer to the associated changeset when you resolve a ticket, thanks.
comment:18 by , 11 years ago
Resolution: | → fixed |
---|---|
Status: | reopened → closed |
Coverage function added for gridded coverages in changeset:82f7b71.
Now the declared mapping from domain points to rangeset values is correct (outer-inner).
We have two options for fixing this
I'm more in favor of 2, it will be faster and we may need it outside of petascope as well.