Opened 9 years ago
Closed 5 years ago
#1049 closed defect (fixed)
subsetting in rasdaman is inconsistent
Reported by: | Dimitar Misev | Owned by: | apercov |
---|---|---|---|
Priority: | major | Milestone: | 10.0 |
Component: | qlparser | Version: | development |
Keywords: | Cc: | Peter Baumann, Vlad Merticariu | |
Complexity: | Medium |
Description (last modified by )
Subsetting does not behave very consistently, depending on the actual tiles I'd say (discovered in ticket #1036).
Subsetting needs to conform with SQL/MDA, i.e. the subset must be within the array sdom.
Anything else, e.g. a subset that intersects or is completely outside of the array sdom is an exception: "subset S not within array domain D".
Currently we have dedicated systemtest in source:systemtest/testcases_mandatory/test_subsetting the queries should be verified that they produce the correct outputs or errors.
Attachments (1)
Change History (23)
follow-up: 3 comment:1 by , 9 years ago
comment:2 by , 9 years ago
As my last test with Ricardo (ticket 1036) I can select outside of coverage and it returns for me an array with all is NULL value (I think it is not valid). Subsetting should work like a 'crop image':
+ If you crop outside of an image, you got nothing (a message is nice for user to let they know they have queried outside of coverage).
+ If you crop just a piece of image and the rest is outside you will get only a piece of image (with no NULL pixels of outside image).
This is very natural understanding and I thought Rasdaman has followed from very first time.
comment:3 by , 9 years ago
Replying to dmisev:
The correct behavior should be:
- if any subset index falls out of the sdom an error is thrown
- otherwise data is returned, where data is
- actual cell values at positions where tiles are materialized
- null where no tiles are materialized
@Peter can you confirm?
confirmed, this reflects exactly the expected behavior.
comment:4 by , 9 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
comment:6 by , 9 years ago
I've fixed the exception throwing some time ago, but there were still some corner cases segfaulting. In the meantime it seems I managed to lose my patch.. so I'll have to redo that.
comment:7 by , 9 years ago
I need to confirm: the only valid subset is the subset that intersects the sdom?
comment:8 by , 9 years ago
- Consider collection
test_subsetting_single
:- 2D RGB array
- sdom:
[0:399,0:343]
- tiles:
[0:399,0:343]
select sdom(c[-400:150,-200:350]) from test_subsetting_single as c
What should the query above print?
[-400:150,-200:350]
[0:150,0:343]
- this is what it currently prints- error?
comment:10 by , 9 years ago
I'd expect either 1 or 3.
- is certainly incorrect: the query asks for a certain subset and gets something else.
comment:11 by , 9 years ago
I see that the QL Guide p 50 is inexact: "The generalized trim operator allows restriction, extension, and a combination of both operations in a shorthand syntax. This operator does not check for proper subsetting or supersetting of the domain modifier."
As to choices above:
- is a possible option, but I do not like it because addressing errors are not captured.
- is awkward to me: hard to anticipate, error prone.
- is what I would prefer myself, but I know there are people with other opinions.
If you want to achieve (1) I'd suggest
select extend(c, [-400:399,-200:343]) [-400:150,-200:350] from...
Agreed, this is awkward, but it clearly expresses intended user behavior.
PS: What does SQL/MDA specify? We should go along its line.
comment:12 by , 9 years ago
I would vote for 3 as well, it's better to throw an error than to take some more or less random decision for the user.
We need to be careful though, I'm assuming there are plenty of tools(perhaps ours as well) that expect the current behavior. Maybe it should be left for 10.0?
comment:13 by , 9 years ago
Ok so subsetting should be restricted to the sdom, i.e. in this case it would result in 3. This is the case in SQL/MDA as well.
In the implementation of rasdaman it's restricted up to intersecting with the sdom (although it's buggy so instead of 1. as I would expect, it prints 2.). So if we are to change the ql-guide then this should definitely be left for v10.
comment:15 by , 8 years ago
Cc: | removed |
---|---|
Owner: | changed from | to
I'm reassigning this ticket to Brennan. The first step is to specify a comprehensive test that will cover all (or most at least) theoretically possible variations, including erroneous cases.
Attached is a patch I've started working on sometime ago, that also contains a bunch of tests. It could be a good starting point.
by , 8 years ago
Attachment: | 0001-ticket-1049-test-subsetting.patch added |
---|
incomplete patch addressing this ticket
comment:16 by , 8 years ago
Milestone: | 9.1.x → 10.0 |
---|---|
Priority: | critical → major |
comment:17 by , 8 years ago
FYI, the original idea was:
- have a convenient shorthand notation which is safe against domain mismatch; these are the brackets.
- have a general, different function which permits freedom of partial overlaps etc; we called it extend().
comment:18 by , 8 years ago
I added a handful of tests in a recent patch which illustrate the sdom inconsistencies.
Current known_fails: requests where the sought-after sdom coordinates are not a subset of the existing spatial coordinates SHOULD throw an exception, but at the moment, they simply return the intersection. In case the intersection would be empty, the completely-out-of-bounds sdom is returned.
comment:19 by , 5 years ago
To summarize: subsetting needs to conform with SQL/MDA, i.e. the subset must be within the array sdom.
Anything else, e.g. intersection or outside of the array sdom is an exception: "subset S not within array domain D".
This is also in line with the *
: if subsetting would be allowed to extend out of the sdom, then suddenly the *
becomes generally not well-defined.
comment:20 by , 5 years ago
Description: | modified (diff) |
---|---|
Owner: | changed from | to
comment:21 by , 5 years ago
Status: | assigned → accepted |
---|
comment:22 by , 5 years ago
Resolution: | → fixed |
---|---|
Status: | accepted → closed |
The correct behavior should be:
@Peter can you confirm?