Changes between Initial Version and Version 1 of Ticket #1073


Timestamp:
Nov 18, 2015, 4:46:22 PM (8 years ago)
Author:
Dimitar Misev
Comment:

  • Ticket #1073 – Description

    initial v1  
    1 The flat directory organization of the tile files in $RASDATA is not scalable as we reach filesystem limits. Therefore tiles should be divided into subdirectories.
     1The flat directory organization of the tile files in $RASDATA is not scalable as we reach filesystem limits. Therefore tiles should be distributed into subdirectories.
    22
    3 This can be implemented in a straightforward way, so that dir_id = tile_id / files_no_per_dir. We should check what could be the best parameters here to fit various common filesystems.
     3Currently all data is stored in a single directory $RASDATA, i.e. we have
     4
     5{{{
     6$RASDATA
     7 |_ RASBASE
     8 |_ 1
     9 |_ 2
     10 |_ 3
     11 |_ ..
     12}}}
     13 
     14= Proposed restructuring =
     15
     16{{{
     17$RASDATA
     18 |_ RASBASE
     19 |_ TILES
     20      |_ ..
     21}}}
     22
     23How should TILES be structured? Maximum number of subdirectories across the most common filesystems:
     24* ext3 : 32,000
     25* ext4 : unlimited in theory, but may be set to 64,000 by default
     26* xfs  : tested to millions and performance is not impacted
     27* btrfs: similar to xfs
     28* ntfs : 2^32-1 theoretically (same limit as number of files in a directory)
     29
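     30Between 10,000 and 100,000 files per directory seems like a reasonable range that is well supported across filesystems. With 100,000 files per directory, the ext3 limit of 32,000 subdirectories still allows about 3.2 billion tiles (32,000 x 100,000), which is the lowest capacity among the filesystems listed above.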
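
The capacity estimate above can be checked with a small sketch. The function and constant names are illustrative only; the subdirectory limit is the ext3 figure quoted in the list above.

```python
# Rough capacity estimate for a single level of subdirectories:
# (maximum subdirectories) x (files per directory).
EXT3_MAX_SUBDIRS = 32_000   # ext3 limit as quoted in the ticket
FILES_PER_DIR = 100_000     # proposed number of tiles per directory


def max_tiles(max_subdirs: int, files_per_dir: int) -> int:
    """Upper bound on the number of tiles for the given limits."""
    return max_subdirs * files_per_dir


print(max_tiles(EXT3_MAX_SUBDIRS, FILES_PER_DIR))  # 3200000000
```

With a power-of-two directory size such as 2^16 the same bound on ext3 would be 32,000 x 65,536, roughly 2.1 billion tiles.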
     31
     32Based on this, my proposal is to store 100,000 tiles per directory, which gives this organization:
     33
     34{{{
     35$RASDATA
     36 |_ RASBASE
     37 |_ TILES
     38      |_ 0
     39      |  |_ 1
     40      |  |_ 2
     41      |  |_ 3
     42      |  |_ ...
     43      |   
     44      |_ 1
     45      |  |_ 100,000
     46      |  |_ 100,001
     47      |  |_ 100,002
     48      |  |_ ...
     49      | 
     50      |_ ...
     51}}}
     52
     53The subdirectory index in TILES is dir_index = tile_index / 100,000 (integer division). The value 100,000 can be a compile-time constant that is adjusted as necessary. A power of two such as 2^16 (65,536) or 2^17 (131,072) may be a better default, so that dir_index can be computed with a fast bit shift.
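
A minimal sketch of the index computation, assuming the power-of-two variant with 2^16 tiles per directory. The names `TILES_PER_DIR_SHIFT`, `dir_index` and `tile_path` are hypothetical and not taken from the rasdaman code base; with the decimal variant the shift would simply be replaced by `tile_index // 100_000`.

```python
import os

# Hypothetical default: 2^16 = 65,536 tiles per directory, so the
# directory index is a right shift instead of a division.
TILES_PER_DIR_SHIFT = 16
TILES_PER_DIR = 1 << TILES_PER_DIR_SHIFT


def dir_index(tile_index: int) -> int:
    """Subdirectory of TILES that holds the given tile."""
    return tile_index >> TILES_PER_DIR_SHIFT


def tile_path(rasdata: str, tile_index: int) -> str:
    """Path of a tile file under <rasdata>/TILES/<dir_index>/<tile_index>."""
    return os.path.join(rasdata, "TILES", str(dir_index(tile_index)), str(tile_index))
```

For non-negative indexes the shift gives the same result as integer division by `TILES_PER_DIR`, so tiles 0 to 65,535 land in directory 0 and tile 65,536 is the first one in directory 1.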
     54
     55I would like to stay away from complicated tree-like schemes that nest multiple subdirectories. It is the job of the filesystem to handle this load; if we ever reach a limit with this scheme on a particular filesystem, it seems very unlikely that we could work around it ourselves without adapting the underlying filesystem.
     56
     57Rasdaman could support both structures (old and new) with a simple check at startup; in v10.0 we can enforce the new structure. update_db.sh can be executed to migrate to the new directory structure.
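
The startup check and the migration could look roughly like the sketch below. This is not the actual update_db.sh logic; it assumes that tile files are plain files with purely numeric names directly under $RASDATA, that the presence of a TILES directory marks the new layout, and that 100,000 tiles go into each directory. All function names are hypothetical.

```python
import os
import shutil

TILES_PER_DIR = 100_000  # assumed value, as proposed above


def uses_new_layout(rasdata: str) -> bool:
    """Assumed startup check: the new layout has a TILES directory."""
    return os.path.isdir(os.path.join(rasdata, "TILES"))


def migrate(rasdata: str) -> int:
    """Move flat tile files into TILES/<dir_index>/ and return how many moved."""
    tiles_root = os.path.join(rasdata, "TILES")
    os.makedirs(tiles_root, exist_ok=True)
    moved = 0
    for name in os.listdir(rasdata):
        src = os.path.join(rasdata, name)
        # Only plain files with ASCII-numeric names are treated as tiles;
        # RASBASE, TILES and anything else stay where they are.
        if not (name.isascii() and name.isdigit() and os.path.isfile(src)):
            continue
        target_dir = os.path.join(tiles_root, str(int(name) // TILES_PER_DIR))
        os.makedirs(target_dir, exist_ok=True)
        shutil.move(src, os.path.join(target_dir, name))
        moved += 1
    return moved
```

Running `migrate` a second time moves nothing, since the tiles are no longer directly under $RASDATA.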