Opened 11 years ago

Closed 11 years ago

#267 closed task (fixed)

Tiling with rasimport

Reported by: Dimitar Misev Owned by: Alexander Herzig
Priority: critical Milestone: 9.0
Component: rasgeo Version: 8.3
Keywords: Cc: a.beccati@…, Peter Baumann, joachim.ungar@…, HerzigA@…, Piero Campalani
Complexity: Trivial

Description (last modified by abeccati)

Check what is the tiling strategy that rasimport uses. Is it fixed to a certain tile size and configuration, or is it perhaps adaptable to the input?

Check whether it's easily possible to make it flexible (i.e. allow using the rasql storage layout sublanguage).

Update the documentation to cover the tiling parameter that was implemented as the flexible way to specify a tiling strategy.

Attachments (4)

a.png (201 bytes) - added by Dimitar Misev 11 years ago.
b.png (190 bytes) - added by Dimitar Misev 11 years ago.
tilingtest_2.txt (3.2 KB) - added by Alexander Herzig 11 years ago.
0001-provisional-patch-adding-tiling-support-to-rasimport.patch (7.5 KB) - added by Alexander Herzig 11 years ago.


Change History (35)

comment:1 by Dimitar Misev, 11 years ago

Cc: HerzigA@… Piero Campalani added
Description: modified (diff)

comment:2 by abeccati, 11 years ago

Probably the most flexible solution would be a command-line option carrying the tiling substring, which rasgeo then passes to the insert statement.

in reply to:  description ; comment:3 by herziga@…, 11 years ago

Replying to dmisev:

Check what is the tiling strategy that rasimport uses.

rasimport uses rasql's 'insert into COLLNAME values … ' statement without specifying any tiling scheme at all. BTW, is there a default tiling scheme for this case?

Check whether it's easily possible to make it flexible (i.e. allow using the rasql storage layout sublanguage).

As Alan has suggested, easiest would be to have an option like
--tiling '<here comes the tiling spec as string parameter>'
and if it's specified, it just goes at the end of the 'insert into COLLNAME values …' statement.
Would that work?
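
The idea proposed in this comment can be sketched as follows. This is only an illustration in Python, not rasimport's actual C++ code: the function `build_insert_statement` and its parameters are hypothetical names, and `<MDD>` stands in for whatever rasimport already places after `values`.

```python
from typing import Optional


def build_insert_statement(collection: str, values_expr: str,
                           tiling: Optional[str] = None) -> str:
    """Build the initial insert statement, optionally ending in a tiling spec.

    Hypothetical helper: the names and the string-based approach are
    assumptions made for this sketch, not rasimport's real implementation.
    """
    statement = f"insert into {collection} values {values_expr}"
    if tiling:
        # The user-supplied spec is appended verbatim, without any parsing.
        statement += " " + tiling.strip()
    return statement


# Without the option the statement is unchanged.
print(build_insert_statement("t1", "<MDD>"))
# With the option the spec simply goes at the end.
print(build_insert_statement(
    "t1", "<MDD>", "tiling aligned [0:499,0:499] tile size 250000"))
```

Because the spec is not parsed on the client side, any mistake in it would only surface as an error from the server when the statement is executed.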

comment:4 by Peter Baumann, 11 years ago

Alex, that should work. Only inconvenience is that the string has to be enclosed in quotes properly to make it one shell word, such as:

$ rasimport … --tiling "area of interest [blabla]"

…which seems acceptable. So Alan's suggestion is favored by me, too.
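
The quoting point can be checked with Python's standard `shlex` module, which splits a command line by POSIX shell rules. The file name, collection name, and tiling string below are illustrative values only; what matters is how many words the program would receive.

```python
import shlex

# Illustrative command line; the tiling spec is wrapped in double quotes.
command = ('rasimport -f t1.img -coll t1 '
           '-tiling "aligned [0:499,0:499] tile size 250000"')

# With quotes, the whole spec arrives as a single argument.
quoted = shlex.split(command)

# Without quotes, the shell splits the spec at every space.
unquoted = shlex.split(command.replace('"', ''))

print(len(quoted), quoted[-1])
print(len(unquoted), unquoted[6:])
```

In the unquoted case the option would see only the first word of the spec as its value, and the remaining words would be treated as separate arguments.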

in reply to:  3 ; comment:5 by Dimitar Misev, 11 years ago

Replying to herziga@…:

rasimport uses rasql's 'insert into COLLNAME values … ' statement without specifying any tiling scheme at all. BTW, is there a default tiling scheme for this case?

If I understood correctly, rasimport imports data partitioning it manually into chunks of a variable size (the chunk size is computed depending on some parameters)? So the tiles in the object are equivalent to the chunks that rasimport commits.

By default there's no tiling; I still need to change this to the most meaningful generic tiling spec.

in reply to:  5 comment:6 by herziga@…, 11 years ago

Replying to dmisev:

If I understood correctly, rasimport imports data partitioning it manually into chunks of a variable size (the chunk size is computed depending on some parameters)? So the tiles in the object are equivalent to the chunks that rasimport commits.

Yes, that's correct. rasimport first creates (insert into COLLNAME …) an initial one-pixel image (e.g. [0:0,0:0] for 2D) and then subsequently updates it by chunks of rows (e.g. update <COLLNAME> as m set m assign shift(<MDD>, <r_Point>) where oid(m) = <OID>). If I now were to specify a tiling scheme with the initial insert statement, would those incoming chunks (update statement) be adjusted to that scheme automatically, or would they have to be inserted in appropriate chunks (tiles) according to the scheme?

by Dimitar Misev, 11 years ago

Attachment: a.png added

by Dimitar Misev, 11 years ago

Attachment: b.png added

comment:7 by Dimitar Misev, 11 years ago

Yes it will automatically partition the update chunks according to the tiling scheme, but the problem is that it won't automatically accommodate the existing tiles.

To give you an example, suppose the tiling is regular 512x512 tiles, but rasimport commits 750x512 chunks. Then the resulting tiles in rasdaman after inserting two chunks with rasimport will be as in attachment a.png,

but they should be as in attachment b.png.

But this is a general issue of partial updates, not of rasimport I'd say. So as long as we use some fixed larger chunk size in rasimport (e.g. 100MB) I think issues like this will be minimized.
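
The effect described in this comment can be reproduced in one dimension with a small Python sketch. It rests on an assumption that the ticket does not confirm: that each committed chunk is cut at multiples of the tile extent and that pieces from different chunks are never merged. The attached images remain the authoritative picture; `split_at_grid` is a hypothetical helper.

```python
from typing import List, Tuple


def split_at_grid(start: int, stop: int, tile: int) -> List[Tuple[int, int]]:
    """Split the half-open range [start, stop) at multiples of `tile`."""
    pieces = []
    low = start
    while low < stop:
        # Next grid line above `low`, capped at the end of the range.
        high = min(stop, (low // tile + 1) * tile)
        pieces.append((low, high))
        low = high
    return pieces


TILE = 512    # tile extent along the chunked dimension
CHUNK = 750   # extent committed per partial update

# Two chunks committed one after the other, each split on its own
# (assumed behaviour of partial updates).
per_chunk = split_at_grid(0, CHUNK, TILE) + split_at_grid(CHUNK, 2 * CHUNK, TILE)

# The same 1500 cells split in a single pass, i.e. the desired layout.
whole = split_at_grid(0, 2 * CHUNK, TILE)

print(per_chunk)  # [(0, 512), (512, 750), (750, 1024), (1024, 1500)]
print(whole)      # [(0, 512), (512, 1024), (1024, 1500)]
```

Under this assumption, every chunk boundary that does not fall on a tile boundary leaves an extra, undersized tile behind, which is why larger chunks reduce the problem.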

comment:8 by Dimitar Misev, 11 years ago

So in conclusion: I think the --tiling "tiling_spec" option, which will be passed verbatim at the end of the first insert statement as "tiling tiling_spec", is a pretty good solution.

in reply to:  8 comment:9 by herziga@…, 11 years ago

Replying to dmisev:

So in conclusion: I think the --tiling "tiling_spec" option, which will be passed verbatim at the end of the first insert statement as "tiling tiling_spec", is a pretty good solution.

Sweet! BTW, rasimport uses 128MiB as chunk size.

comment:10 by Dimitar Misev, 11 years ago

To me it seemed like it's variable, because rasimport in certain cases was creating a huge number of small tiles in my experience.

in reply to:  10 comment:11 by herziga@…, 11 years ago

Replying to dmisev:

To me it seemed like it's variable, because rasimport in certain cases was creating a huge number of small tiles in my experience.

Mmmh, that's interesting. rasimport uses a fixed number of rows (nrows = maxMem_bytes / (numColumns * pixelsize_bytes)) for each iteration step, only the last chunk may be smaller. Perhaps I'm missing something in my own code?? BTW it's in rasimport's importImage(…) function.
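
The formula quoted in this comment can be written out as a short Python sketch. It is not the code from importImage(…): the function name, the use of integer division, and the guard that keeps at least one row per chunk are assumptions made for illustration. The 128 MiB default is the figure mentioned earlier in this thread.

```python
from typing import List, Tuple


def chunk_row_ranges(num_rows: int, num_columns: int, pixel_size_bytes: int,
                     max_mem_bytes: int = 128 * 1024 ** 2) -> List[Tuple[int, int]]:
    """Return inclusive (first_row, last_row) ranges, one per import chunk."""
    # nrows = maxMem_bytes / (numColumns * pixelsize_bytes), rounded down.
    # The max(1, ...) guard is an assumption for rows wider than the limit.
    rows_per_chunk = max(1, max_mem_bytes // (num_columns * pixel_size_bytes))
    return [(first, min(first + rows_per_chunk, num_rows) - 1)
            for first in range(0, num_rows, rows_per_chunk)]


# A 20000 x 15000 image with 1-byte pixels: 6710 rows fit into 128 MiB.
print(chunk_row_ranges(num_rows=15000, num_columns=20000, pixel_size_bytes=1))
# [(0, 6709), (6710, 13419), (13420, 14999)]
```

The number of rows per chunk therefore depends on the image width and pixel size; only the upper bound on the chunk size in bytes is constant.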

comment:12 by Dimitar Misev, 11 years ago

Yes, this formula is pretty much what I ended up with when I investigated, but I didn't have time to try to understand the reason for it. So apparently the chunk size is not fixed, but perhaps what you mean is that it's limited to 128MB?

comment:13 by herziga@…, 11 years ago

Sorry, you're right, that's what I meant. The reason is to be able to process images which don't fit into RAM; 128MiB is just an arbitrary choice. We could turn it into a parameter though?

comment:14 by abeccati, 11 years ago

Milestone: 8.4

in reply to:  7 comment:15 by Alexander Herzig, 11 years ago

Replying to dmisev:
I just made an initial test with rasgeo and the new tiling option. Unfortunately, it doesn't seem to work with the current rasgeo workflow logic of importing an image as chunks of rows by partial updates:

rasimport -f t1.img -coll t1tiled1 -tiling "tiling regular [0:499,0:499] index rc_index"
ERROR - rimport::main, l. 1371: Exception: The tile configuration is incompatible to the marray domain.

I assume 'compatible' means that the chunk size must not be smaller than the tile size (for any or all dimensions?). If that's the case, we would have to change rasgeo so that the chunk size is made compatible with the tile size. This would involve revising the whole logic to partition the data as well as adding the capability to parse the tiling specification in the first place. Since you mentioned earlier that partial updates and accommodating existing tiles is more of a server problem than a client problem, I was wondering whether you've got any ideas how to proceed in this case? Is this something you're going to address in the future, or do we have to implement 'tiling upon import' for large data on the client side?

comment:16 by Dimitar Misev, 11 years ago

The regular tiling is a bit constrained: it has to divide the image domain evenly, but even then there may still be a problem with the chunks, I'm not sure.

Can you try maybe with aligned tiling and leave out the index? E.g.

tiling aligned [0:499,0:499] tile size 250000

(multiply the tile size by the type size)
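
The hint about the type size can be illustrated with a small Python helper. It assumes that the number after `tile size` is meant in bytes, so that the 250000 above corresponds to 500 x 500 cells of a 1-byte type; `aligned_tiling_spec` is a hypothetical name and not part of rasimport or rasql.

```python
from typing import Sequence


def aligned_tiling_spec(tile_shape: Sequence[int], type_size_bytes: int) -> str:
    """Build an aligned tiling spec for tiles of `tile_shape` cells.

    Assumption: `tile size` is given in bytes, i.e. the number of cells
    per tile multiplied by the size of the cell type.
    """
    cells = 1
    for extent in tile_shape:
        cells *= extent
    # Tile domain starting at 0 in every dimension, e.g. [0:499,0:499].
    domain = ",".join(f"0:{extent - 1}" for extent in tile_shape)
    return f"tiling aligned [{domain}] tile size {cells * type_size_bytes}"


print(aligned_tiling_spec((500, 500), 1))  # 1-byte cells
print(aligned_tiling_spec((500, 500), 4))  # 4-byte cells
```

For a 4-byte cell type the same 500 x 500 tile would thus be requested with a tile size of 1000000.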

in reply to:  16 comment:17 by Alexander Herzig, 11 years ago

Replying to dmisev:
Aligned tiling seems to work with rasimport; at least it doesn't throw any exceptions and the image is imported correctly. However, I don't know how to check whether the tiling is correct.
Strangely enough, I couldn't import an image using partial updates and aligned tiling on the command line (see tilingtest_2.txt). I also tried directional tiling on the command line, but it didn't work either using the 'partial update' workflow (and hence failed with rasimport). So, it seems only aligned tiling is working with partial updates and therefore with rasimport. See tilingtest_2.txt for the few tests I did.

by Alexander Herzig, 11 years ago

Attachment: tilingtest_2.txt added

comment:18 by Dimitar Misev, 11 years ago

Yes, with directional tiling it won't work, because it expects that the limits you give when you insert the array match the domain of the inserted array, unless the dimension is marked as *

Maybe you can attach a patch here and I'll check if the aligned tiling worked well.

comment:19 by Dimitar Misev, 11 years ago

Owner: changed from Dimitar Misev to Alexander Herzig
Status: changed from new to assigned

comment:20 by Peter Baumann, 11 years ago

Dimitar, did you have a chance to check the patch?

comment:21 by Dimitar Misev, 11 years ago

Oh I didn't notice a patch was uploaded, trac doesn't seem to send notifications for attachment uploads.

The patch is fine; it just doesn't update the README with the new parameter. It can be applied, and we can fix the README later.

comment:22 by Dimitar Misev, 11 years ago

Alex can you please upload the patch to the patchmanager?

comment:23 by Dimitar Misev, 11 years ago

Resolution: fixed
Status: changed from assigned to closed

comment:24 by ungarj, 11 years ago

Complexity: Medium

Alex, thanks a lot for patching! However, I agree with Dimitar that a README file or help entry would be useful for us. Should we reopen the ticket?

in reply to:  24 comment:25 by Alexander Herzig, 11 years ago

Cc: a.beccati@… added

Replying to ungarj:
Very good point, Joachim, we shouldn't forget about that. Not quite sure how to handle this,
shall we re-open this one or open a new one?

comment:26 by abeccati, 11 years ago

Complexity: changed from Medium to Trivial
Description: modified (diff)
Priority: changed from major to minor
Resolution: fixed
Status: changed from closed to reopened

Reopened and updated accordingly.

comment:27 by Dimitar Misev, 11 years ago

Milestone: changed from 8.4 to 8.5

comment:28 by abeccati, 11 years ago

Description: modified (diff)
Priority: changed from minor to critical

We got some feedback from users about the missing documentation, so I'm raising the priority.

comment:29 by abeccati, 11 years ago

Status: changed from reopened to assigned

comment:30 by Dimitar Misev, 11 years ago

Milestone: changed from 8.5 to 9.0

comment:31 by Dimitar Misev, 11 years ago

Resolution: fixed
Status: changed from assigned to closed

Documentation fixed.
