Opened 21 months ago

Last modified 5 months ago

#1049 assigned defect

subsetting in rasdaman is inconsistent

Reported by: dmisev Owned by: bbell
Priority: major Milestone: 10.0
Component: qlparser Version: development
Keywords: Cc: pbaumann, vmerticariu
Complexity: Medium

Description

Subsetting does not behave very consistently, depending on the actual tiles I'd say (discovered in ticket #1036).

We need to define exactly what the behavior of subsetting should be and make sure it's enforced.

rasql -q 'select sdom(c) from rgb AS c' --out string --quiet
[0:399,0:343]

rasql -q 'select sdom(c[10:20,*:*]) from rgb AS c' --out string --quiet
[10:20,0:343]

rasql -q 'select sdom(c[-10:10,*:*]) from rgb AS c' --out string --quiet
[0:10,0:343]

rasql -q 'select sdom(c[-10:-5,*:*]) from rgb AS c' --out string --quiet
rasdaman error 0: Exception: Transfer Failed

rasql -q 'select sdom(c[-10:-5,0:100]) from rgb AS c' --out string --quiet
[-10:-5,0:100]

rasql -q 'select sdom(c[-10:-5,-10:100]) from rgb AS c' --out string --quiet
[-10:-5,-10:100]

rasql -q 'select sdom(c[-10:5,-10:100]) from rgb AS c' --out string --quiet
[0:5,0:100]

rasql -q 'select sdom(c[-10:-5,200:500]) from rgb AS c' --out string --quiet
[-10:-5,200:500]

rasql -q 'select sdom(c[50:100,200:500]) from rgb AS c' --out string --quiet
[50:100,200:343]

Attachments (1)

0001-ticket-1049-test-subsetting.patch (63.6 KB) - added by dmisev 6 months ago.
incomplete patch addressing this ticket

Download all attachments as: .zip

Change History (19)

comment:1 follow-up: Changed 21 months ago by dmisev

The correct behavior should be:

  • if any subset index falls out of the sdom an error is thrown
  • otherwise data is returned, where data is
    • actual cell values at positions where tiles are materialized
    • null where no tiles are materialized

@Peter can you confirm?

comment:2 Changed 21 months ago by bphamhuu

As my last test with Ricardo (ticket 1036) I can select outside of coverage and it returns for me an array with all is NULL value (I think it is not valid). Subsetting should work like a 'crop image':

+ If you crop outside of an image, you got nothing (a message is nice for user to let they know they have queried outside of coverage).
+ If you crop just a piece of image and the rest is outside you will get only a piece of image (with no NULL pixels of outside image).

This is very natural understanding and I thought Rasdaman has followed from very first time.

comment:3 in reply to: ↑ 1 Changed 21 months ago by pbaumann

Replying to dmisev:

The correct behavior should be:

  • if any subset index falls out of the sdom an error is thrown
  • otherwise data is returned, where data is
    • actual cell values at positions where tiles are materialized
    • null where no tiles are materialized

@Peter can you confirm?

confirmed, this reflects exactly the expected behavior.

comment:4 Changed 21 months ago by dmisev

  • Owner changed from bphamhuu to dmisev
  • Status changed from new to assigned

comment:5 Changed 18 months ago by pbaumann

where do we stand on this?

comment:6 Changed 18 months ago by dmisev

I've fixed the exception throwing some time ago, but there were still some corner cases segfaulting. In the meantime it seems I managed to lose my patch.. so I'll have to redo that.

comment:7 Changed 18 months ago by dmisev

I need to confirm: the only valid subset is the subset that intersects the sdom?

comment:8 Changed 18 months ago by dmisev

  • Consider collection test_subsetting_single:
    • 2D RGB array
    • sdom: [0:399,0:343]
    • tiles: [0:399,0:343]
select sdom(c[-400:150,-200:350]) from test_subsetting_single as c

What should the query above print?

  1. [-400:150,-200:350]
  2. [0:150,0:343] - this is what it currently prints
  3. error?
Last edited 18 months ago by dmisev (previous) (diff)

comment:9 Changed 18 months ago by bphamhuu

I expect the 2 is correct way as it has data in this region.

comment:10 Changed 18 months ago by dmisev

I'd expect either 1 or 3.

  1. is certainly incorrect: the query asks for a certain subset and gets something else.
Last edited 18 months ago by dmisev (previous) (diff)

comment:11 Changed 18 months ago by pbaumann

I see that the QL Guide p 50 is inexact: "The generalized trim operator allows restriction, extension, and a combination of both operations in a shorthand syntax. This operator does not check for proper subsetting or supersetting of the domain modifier."

As to choices above:

  1. is a possible option, but I do not like it because addressing errors are not captured.
  2. is awkward to me: hard to anticipate, error prone.
  3. is what I would prefer myself, but I know there are people with other opinions.

If you want to achieve (1) I'd suggest

select extend(c, [-400:399,-200:343]) [-400:150,-200:350] from...

Agreed, this is awkward, but it clearly expresses intended user behavior.

PS: What does SQL/MDA specify? We should go along its line.

comment:12 Changed 18 months ago by mdumitru

I would vote for 3 as well, it's better to throw an error than to take some more or less random decision for the user.
We need to be careful though, I'm assuming there are plenty of tools(perhaps ours as well) that expect the current behavior. Maybe it should be left for 10.0?

comment:13 Changed 18 months ago by dmisev

Ok so subsetting should be restricted to the sdom, i.e. in this case it would result in 3. This is the case in SQL/MDA as well.

In the implementation of rasdaman it's restricted up to intersecting with the sdom (although it's buggy so instead of 1. as I would expect, it prints 2.). So if we are to change the ql-guide then this should definitely be left for v10.

comment:14 Changed 18 months ago by pbaumann

yes, makes sense to me. Let's do this.

comment:15 Changed 6 months ago by dmisev

  • Cc mdumitru removed
  • Owner changed from dmisev to bbell

I'm reassigning this ticket to Brennan. The first step is to specify a comprehensive test that will cover all (or most at least) theoretically possible variations, including erroneous cases.

Attached is a patch I've started working on sometime ago, that also contains a bunch of tests. It could be a good starting point.

Changed 6 months ago by dmisev

incomplete patch addressing this ticket

comment:16 Changed 5 months ago by dmisev

  • Milestone changed from 9.1.x to 10.0
  • Priority changed from critical to major

comment:17 Changed 5 months ago by pbaumann

FYI, the original idea was:

  • have a convenient shorthand notation which is safe against domain mismatch; these are the brackets.
  • have a general, different function which permits freedom of partial overlaps etc; we called it extend().

comment:18 Changed 5 months ago by bbell

I added a handful of tests in a recent patch which illustrate the sdom inconsistencies.

Current known_fails: requests where the sought-after sdom coordinates are not a subset of the existing spatial coordinates SHOULD throw an exception, but at the moment, they simply return the intersection. In case the intersection would be empty, the completely-out-of-bounds sdom is returned.

Last edited 5 months ago by bbell (previous) (diff)
Note: See TracTickets for help on using tickets.