Opened 12 years ago
Closed 11 years ago
#387 closed feature (fixed)
Query on coverages with different domains is unusable
Reported by: | Jeroen Dries | Owned by: | Dimitar Misev |
---|---|---|---|
Priority: | critical | Milestone: | 9.0 |
Component: | petascope | Version: | 8.4 |
Keywords: | wcps scale rounding | Cc: | Piero Campalani, Dimitar Misev, ungarj, Peter Baumann |
Complexity: | Hard |
Description
Our rasdaman setup contains 3D coverages with different spatial domains and also a number of coverages that are used as masks. We can't create a new mask for every 3D coverage to match the spatial domain, so it seems that the use of the 'scale' expression in WCPS is the solution.
This is an example query that uses the scale function:
for c in (TAMSAT_RFE), m in (GAUL_Africa_1), l in (GLC2000_Africa) return encode( coverage averagesOverTime over $T t(0:1060) values ((1/(count(((((c[x(26.8033:29.0793),y(-13.9166:-12.2299)])[t($T)] >=0) * ( scale(((m[x(26.8033:29.0793),y(-13.9166:-12.2299)]) =35535),{x:"CRS:1"(0:61), y:"CRS:1"(0:45)}) ) * scale(((l[x(26.8033:29.0793),y(-13.9166:-12.2299)]) = 13 ) ,{x:"CRS:1"(0:61), y:"CRS:1"(0:45)}) )) > 0)) * add( ( (c[x(26.8033:29.0793),y(-13.9166:-12.2299)])[t($T)] * ((c[x(26.8033:29.0793),y(-13.9166:-12.2299)])[t($T)] >=0) * ( scale(((m[x(26.8033:29.0793),y(-13.9166:-12.2299)]) =35535),{x:"CRS:1"(0:61), y:"CRS:1"(0:45)}) ) * scale(((l[x(26.8033:29.0793),y(-13.9166:-12.2299)]) = 13 ) ,{x:"CRS:1"(0:61), y:"CRS:1"(0:45)}) )) )), "csv")
While this query works, there are major issues:
- Computing the scaling parameters requires a calculation that takes coverage metadata and the bounding box as input. Someone writing the query by hand cannot do this easily.
- The computation of the scaling parameters x:"CRS:1"(0:61), y:"CRS:1"(0:45) is subject to rounding errors.
The problem is that the input parameters are usually decimals such as the coordinates of the bounding box, and the output are integers. This means that using different hardware or a different programming language may result in an integer that is wrong by one.
When sending such 'incorrectly' rounded parameters to rasdaman, it responds with an exception about spatial domains that do not match. This check is a bit strange because it implies that Rasdaman is able to compute these parameters itself, so why does the user have to compute them in the first place?
- If the bounding box crosses the border of the coverage, the scaling parameters have to be adapted accordingly. This is again something that is very hard for a user to do.
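To make the points above concrete, here is a minimal Python sketch of the kind of calculation a query author currently has to do by hand. It assumes a rectified grid with a fixed resolution and a linear mapping from geographic coordinates to pixel indices. The function name `pixel_range`, the axis extent and the resolution are hypothetical values chosen for illustration; they are not the metadata of the coverages in this ticket, and this is not necessarily how petascope computes the bounds.

```python
import math


def pixel_range(lo, hi, axis_min, axis_max, resolution):
    """Map a geographic interval [lo, hi] to an inclusive pixel index range.

    Assumes index = (coord - axis_min) / resolution (linear, rectified grid).
    The interval is clipped to the coverage extent first, which is the
    adjustment needed when the bounding box crosses the coverage border.
    """
    lo = max(lo, axis_min)
    hi = min(hi, axis_max)
    if lo >= hi:
        raise ValueError("interval does not intersect the coverage")
    first = math.floor((lo - axis_min) / resolution)
    last = math.ceil((hi - axis_min) / resolution) - 1
    return first, last


# Hypothetical axis: extent 20.0 .. 40.0 with 0.5 units per pixel.
print(pixel_range(26.25, 29.25, 20.0, 40.0, 0.5))  # (12, 18)
# A bounding box that starts outside the coverage is clipped to index 0.
print(pixel_range(15.0, 21.0, 20.0, 40.0, 0.5))    # (0, 1)
```

The choice between floor and ceil at each end is itself an assumption; a different but equally plausible convention shifts the result by one pixel.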
Preferably, Rasdaman should provide a solution so that the user does not need to provide this kind of parameter at all, as the bounding boxes already provide sufficient information.
Finally, I'm also uncertain about the code that computes the scaling parameters: it seems to assume that there is a linear relation between the lon/lat coordinates of the bounding box, and the pixel coordinates of the coverage, while in most coordinate reference systems, this is not the case at all.
This is a critical issue because it makes it very hard for end-users to write their own Rasdaman/WCPS queries.
Change History (18)
comment:1 by , 12 years ago
Cc: | added |
---|---|
Component: | undecided → petascope |
Owner: | changed from | to
Status: | new → assigned |
comment:2 by , 11 years ago
Cc: | added |
---|---|
Keywords: | wcps scale rounding added |
comment:3 by , 11 years ago
Hi,
you indeed understood the point about the linearity assumption, but this was just a remark on the side of the actual issue. The actual issue is the fact that our end users would have to supply these scaling parameters in the first place. It would be a lot more usable if Rasdaman applies the scaling automatically, which should be possible as it already computes the necessary parameters to do the error checking.
The query example without the scaling would be shorter, and does not require us to do a computation that is subject to rounding error:
{{
for c in (TAMSAT_RFE), m in (GAUL_Africa_1), l in (GLC2000_Africa)
return encode(
coverage averagesOverTime
over $T t(0:1060)
values ((1/(count(((((c[x(26.8033:29.0793),y(-13.9166:-12.2299)])[t($T)] >=0) * ( ((m[x(26.8033:29.0793),y(-13.9166:-12.2299)]) =35535) ) * ((l[x(26.8033:29.0793),y(-13.9166:-12.2299)]) = 13 ) )) > 0))
* add(
(
(c[x(26.8033:29.0793),y(-13.9166:-12.2299)])[t($T)]
* ((c[x(26.8033:29.0793),y(-13.9166:-12.2299)])[t($T)] >=0)
* ( ((m[x(26.8033:29.0793),y(-13.9166:-12.2299)]) =35535) )
* ((l[x(26.8033:29.0793),y(-13.9166:-12.2299)]) = 13 ) ))
)),
"csv")
}}
About the linearity: for now this is fine, as we indeed work with rectified coverages at the moment, thanks for confirming that this is known behaviour.
comment:4 by , 11 years ago
Complexity: | Medium → Hard |
---|---|
Priority: | critical → major |
Type: | defect → feature |
So it seems this is more a feature request than a defect, indeed a valuable one that should be put in the list of "looking for sponsorship" ones. Also lowering it from "Critical" as it has viable workarounds (indeed still not user friendly but we have to keep priority mission oriented).
Not sure about your statement on the automatic computation relating to the error checking; I guess it is more due to the fact that you scale them to result in different sizes but this is probably better answered by mrusu.
comment:5 by , 11 years ago
Hi,
the reason that I created it as a defect lies with the fact that it is almost impossible for a user who is writing a query to compute the correct scaling parameters. In the example the parameters are 61 and 45, but when the user computes them, he ends up with a decimal. Depending on the decimal inputs, hardware and programming language, this decimal can be 60.99999, or 61.0001, or 61.5555, so how does the user determine whether the correct value is 60, 61 or 62? In 2 out of 3 cases he will get an exception! This seems like a defect to me.
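A small Python sketch of the ambiguity described here: the three decimals from the comment are passed through three common rounding rules, and the rules do not agree. The helper name `integer_candidates` is made up for this illustration.

```python
import math


def integer_candidates(value):
    """Return the integers produced by three common rounding rules."""
    return {
        "floor": math.floor(value),
        "round": round(value),
        "ceil": math.ceil(value),
    }


for value in (60.99999, 61.0001, 61.5555):
    print(value, integer_candidates(value))

# Floating-point division can land just below a whole number:
# 0.3 / 0.1 evaluates to 2.9999999999999996 in IEEE 754 double precision,
# so floor() gives 2 while round() gives 3.
print(integer_candidates(0.3 / 0.1))
```

Across the three inputs the rules yield 60, 61 or 62, so without knowing which rule the server applies, the user cannot tell which integer will be accepted.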
comment:6 by , 11 years ago
Sounds reasonable but I assume you know the definition of defect (http://en.wikipedia.org/wiki/Software_bug) and there is no specification for automatic scaling yet that the software does not respect.
Again, acknowledged that user friendliness should improve and I hope you'll help us on that (Feel free to contribute actively) but please put any further discussion (if needed) on the dev mailing list where you can do a detailed technical discussion, not on this tracker.
Thanks for your understanding.
comment:7 by , 11 years ago
The issue is that the implementation tries to conform to a standard (WCPS 1.0), which does not mention anything about this topic as far as I know. But we could extend the implementation with our specific functions, and it would still conform to WCPS 1.0?
jdries, how would you suggest to solve this problem?
There has to be some function to explicitly trigger this auto scaling, otherwise there's an issue (which coverage of the many should be the master, according to which the others will be scaled up or down?). So something like this maybe
autoscale(coverage, toMatchThisCoverage)
comment:8 by , 11 years ago
Indeed, we need to propose an extension or modification to support this behavior. In our use cases, we always have one 'master' coverage, and multiple masks that need to be rescaled to match the master. This raises the question of whether that is sufficient. Is it for instance possible that there are 2 masters in the same query? (I guess not…)
Another question is whether the autoscale requires an explicit function, like in your example, or could be done implicitly. You could for instance implicitly assume that the first coverage that is listed is always the master.
The benefit of an implicit approach is that it doesn't make queries longer, and that it ensures that queries always work. The drawback may be that a user is not always aware of the auto scaling, so it may give unexpected results.
For the explicit function, could you for instance embed it in the sample query above? This would give a better idea of what the query would look like.
Thanks!
comment:9 by , 11 years ago
The imageCrsDomain() function could be used to let Petascope calculate the pixel/image domain that needs to be specified in the WCPS scale() operation. However, it looks like this has to be done in a couple of steps: (1) retrieve the pixel domain using the imageCrsDomain() function, (2) parse the results and retrieve the lower (lo) and upper (hi) bounds, (3) use these bounds in the scaling operation.
It would be easier if the imageCrsDomain() function can be used inline within the scale() operation so the query can be expressed as a single WCPS query. However, the output syntax of the imageCrsDomain function (lo,hi) is different from the input syntax of the scale operation (lo:hi). Isn't this strange?
A workaround could be that WCPS provides the means to retrieve the lo and hi values (just like RASQL using .lo and .hi) but I don't think this is the case at the moment …
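The three-step client-side workaround described above could look roughly like the following Python sketch. It assumes that the imageCrsDomain() result for one axis arrives as a plain string of the form "(lo,hi)"; the actual response format may differ. The helper names `parse_bounds` and `scale_domain` are hypothetical, and the request that fetches the bounds in step (1) is left out.

```python
import re


def parse_bounds(text):
    """Step 2: extract integer lo/hi from an assumed "(lo,hi)" string."""
    match = re.fullmatch(r"\s*\(\s*(-?\d+)\s*,\s*(-?\d+)\s*\)\s*", text)
    if match is None:
        raise ValueError(f"unexpected imageCrsDomain output: {text!r}")
    return int(match.group(1)), int(match.group(2))


def scale_domain(bounds_by_axis):
    """Step 3: build the lo:hi domain argument expected by scale()."""
    parts = []
    for axis, text in bounds_by_axis.items():
        lo, hi = parse_bounds(text)
        parts.append(f'{axis}:"CRS:1"({lo}:{hi})')
    return "{" + ", ".join(parts) + "}"


# Bounds as they might come back from step 1 (values taken from the ticket).
print(scale_domain({"x": "(0,61)", "y": "(0,45)"}))
# {x:"CRS:1"(0:61), y:"CRS:1"(0:45)}
```

This still needs an extra round trip per axis, which is why using imageCrsDomain() inline would be preferable.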
comment:10 by , 11 years ago
Cc: | added |
---|
comment:11 by , 11 years ago
Cc: | added; removed |
---|---|
Milestone: | → 9.0 |
Priority: | major → critical |
I am raising the priority of this ticket; it's critical functionality for many users. All we need is to think through the details of how the auto-scaling should be carried out and controlled.
I think it could work well in the way implicit type casting works, as long as we cover clearly all the possible situations that can arise.
covA op covB will fail with error when they don't match in domain I think, and what we here want is to match their domains in some sensible way (most likely upscaling the smaller dimensions?).
comment:12 by , 11 years ago
Automatic up/down scaling is a lot of hidden trouble. How about using
scale(covA, imageCrsDomain/domain(covB))
This is much better than computing the scale domain by hand. Only thing we need is to make the domain metadata functions usable within other functions.
comment:13 by , 11 years ago
definitely seconding Dimitar. The operations tentatively have been built so that there is an atomic meaning without hidden agenda. That said, metadata handling in WCPS 1.0 is insufficient. Currently, WCPS 2.0 is in draft which adapts it to OGC's current coverage model, GMLCOV, and greatly adds capabilities in this respect. So what Dimitar describes will be possible with WCPS 2. The full spec is expected to be published in Spring 2014.
comment:14 by , 11 years ago
I submitted a patch. The WCPS test for this is
for c in ( mr ), c2 in (rgb) return avg( scale( c, imageCrsDomain(c2) ) )
scale( c, imageCrsDomain(c2) ) means that c will be scaled to the domain of c2.
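For clients that assemble queries programmatically, the new form removes the need to compute pixel bounds on the client. A minimal Python sketch, with hypothetical helper names `subset` and `scale_to_match`, that only builds the query text and does not contact a server:

```python
def subset(var, x, y):
    """Build a spatial subset expression such as m[x(lo:hi),y(lo:hi)]."""
    return f"{var}[x({x[0]}:{x[1]}),y({y[0]}:{y[1]})]"


def scale_to_match(mask_expr, master_expr):
    """Scale mask_expr to the pixel domain of master_expr."""
    return f"scale({mask_expr}, imageCrsDomain({master_expr}))"


# Bounding box values taken from the ticket description.
x = (26.8033, 29.0793)
y = (-13.9166, -12.2299)
print(scale_to_match(subset("m", x, y), subset("c", x, y)))
```

Whether imageCrsDomain() accepts a subsetted or sliced expression as its argument, rather than a bare coverage variable, is an assumption of this sketch and depends on the petascope version.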
comment:15 by , 11 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
comment:16 by , 11 years ago
Resolution: | fixed |
---|---|
Status: | closed → reopened |
This doesn't seem to work when combining a 3D coverage (Biopar_Africa_LAI used in query below) with a 2D coverage (GAUL_Africa_0 in query below) as explained in the ticket description.
Query:
for c in (Biopar_Africa_LAI), m in (GAUL_Africa_0)
return encode(
scale ( m[x(20:21),y(-27:-26)], imageCrsDomain(c[t(0),x(20:21),y(-27:-26)]) )
,"csv"
)
Stack trace:
[28 Oct 2013 11:09:15] ERROR PetascopeInterface@443: Error stack trace: InternalComponentError: Domain name not found: t
    at petascope.PetascopeInterface.handleProcessCoverages(PetascopeInterface.java:691)
    at petascope.PetascopeInterface.doGet(PetascopeInterface.java:362)
    at petascope.PetascopeInterface.doPost(PetascopeInterface.java:214)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:637)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:857)
    at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588)
    at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
    at java.lang.Thread.run(Thread.java:679)
Caused by: WcpsError: Domain name not found: t
    at petascope.wcps.server.core.CoverageInfo.getDomainIndexByName(CoverageInfo.java:183)
    at petascope.wcps.server.core.ScaleCoverageExpr.<init>(ScaleCoverageExpr.java:127)
    at petascope.wcps.server.core.CoverageExpr.<init>(CoverageExpr.java:92)
    at petascope.wcps.server.core.EncodeDataExpr.<init>(EncodeDataExpr.java:72)
    at petascope.wcps.server.core.XmlQuery.startParsing(XmlQuery.java:110)
    at petascope.wcps.server.core.ProcessCoveragesRequest.<init>(ProcessCoveragesRequest.java:95)
    at petascope.wcps.server.core.Wcps.pcPrepare(Wcps.java:120)
    at petascope.wcps.server.core.Wcps.pcPrepare(Wcps.java:114)
    at petascope.PetascopeInterface.handleProcessCoverages(PetascopeInterface.java:642)
    ... 16 more
comment:17 by , 11 years ago
Owner: | changed from | to
---|---|
Status: | reopened → assigned |
comment:18 by , 11 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
Fixed now with changeset:975fa62934a54bca81528bb111c87e9c21841258
Hi jdries,
thanks for the thorough report, this is precious.
By now, although I did not develop the scaling extension, I can personally reply on your comment about the scaling parameter computation: the same linearity assumption is taken when converting any subset to pixel coordinates, whether they are meters or degrees. As you say, many unrectified remote sensing products do not have fixed-resolution pixels, so that there's no linear relationship between geo-coords and pixel-indices, but rasdaman can only accept gml:RectifiedGridCoverages, which imply a fixed resolution and apply to rectified images.
Was this the point, or did I get it wrong?