Opened 9 years ago
Last modified 9 years ago
#1073 closed defect
Filestorage - divide tile files into subdirectories — at Version 1
Reported by: | Dimitar Misev | Owned by: | Dimitar Misev |
---|---|---|---|
Priority: | critical | Milestone: | 9.1.x |
Component: | relblobif | Version: | development |
Keywords: | | Cc: | Peter Baumann, Vlad Merticariu, Alex Dumitru |
Complexity: | Medium | | |
Description
The flat directory organization of the tile files in $RASDATA is not scalable as we reach filesystem limits. Therefore tiles should be distributed into subdirectories.
Currently all data is stored in a single directory $RASDATA, i.e. we have
    $RASDATA
    |_ RASBASE
    |_ 1
    |_ 2
    |_ 3
    |_ ..
Proposed restructuring
    $RASDATA
    |_ RASBASE
    |_ TILES
       |_ ..
How should TILES be structured? Maximum number of subdirectories across the most common filesystems:
- ext3 : 32,000
- ext4 : unlimited in theory, but may be set to 64,000 by default
- xfs : tested to millions and performance is not impacted
- btrfs: similar to xfs
- ntfs : 2^32 - 1 theoretically (same limit as number of files in a directory)
Between 10,000 and 100,000 files per directory seems like a good range that is well supported across filesystems. Taking 100,000 files per directory together with the ext3 limit of 32,000 subdirectories gives a capacity of at least 32,000 x 100,000 = 3.2 billion tiles.
Based on this, my proposal is to store 100,000 tiles per directory, which gives this organization:
    $RASDATA
    |_ RASBASE
    |_ TILES
       |_ 0
       |  |_ 1
       |  |_ 2
       |  |_ 3
       |  |_ ...
       |_ 1
       |  |_ 100,000
       |  |_ 100,001
       |  |_ 100,002
       |  |_ ...
       |_ ...
The subdirectory index in TILES is dir_index = tile_index / 100,000 (integer division). The 100,000 number can be a compile-time constant that is adjusted as necessary. A power of two such as 2^16 or 2^17 may be a better default, so that dir_index can be computed with a fast bit shift.
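As a rough illustration of the index computation described above, here is a minimal C++ sketch assuming the power-of-two variant with 2^16 tiles per directory. The names `kTilesPerDirShift`, `kTilesPerDir`, `dirIndex` and `tilePath` are made up for this example and are not taken from the rasdaman code base; the path layout follows the directory tree shown in this ticket.

```cpp
#include <cstdint>
#include <string>

// Hypothetical compile-time constant: 2^16 tiles per subdirectory.
constexpr unsigned kTilesPerDirShift = 16;
constexpr std::uint64_t kTilesPerDir = std::uint64_t{1} << kTilesPerDirShift;

// dir_index = tile_index / kTilesPerDir, computed as a right shift.
constexpr std::uint64_t dirIndex(std::uint64_t tileIndex) {
    return tileIndex >> kTilesPerDirShift;
}

// Builds "<rasdata>/TILES/<dir_index>/<tile_index>".
// `rasdata` is assumed to be the already expanded value of $RASDATA.
std::string tilePath(const std::string& rasdata, std::uint64_t tileIndex) {
    return rasdata + "/TILES/" + std::to_string(dirIndex(tileIndex)) +
           "/" + std::to_string(tileIndex);
}
```

With these assumptions, tiles 0 to 65,535 land in `TILES/0` and tile 65,536 is the first one in `TILES/1`. Using a decimal constant such as 100,000 instead would only change `dirIndex` to a plain integer division.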
I would like to stay away from complicated tree-like schemes that nest multiple levels of subdirectories. It is the job of the filesystem to handle this load. If we ever reach the limits of this scheme on a particular filesystem, it seems very unlikely that we will be able to work around them ourselves without adapting the filesystem underneath.
Rasdaman could support both structures (old and new) with a simple check at startup; in v10.0 we can enforce this structure. update_db.sh can be executed to migrate to the new directory structure.
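The startup check could look roughly like the following C++17 sketch. It assumes, as a guess rather than a confirmed design, that the new structure is recognised by the presence of a TILES directory under $RASDATA; `usesTilesSubdirs`, `resolveTilePath` and `kTilesPerDirectory` are hypothetical names used only for illustration.

```cpp
#include <cstdint>
#include <filesystem>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical constant mirroring the 100,000 tiles per directory proposed here.
constexpr std::uint64_t kTilesPerDirectory = 100000;

// Assumption: the new layout is in use if $RASDATA/TILES exists as a directory.
// The error_code overload keeps the check from throwing on filesystem errors.
inline bool usesTilesSubdirs(const fs::path& rasdata) {
    std::error_code ec;
    return fs::is_directory(rasdata / "TILES", ec);
}

// Resolves the file path of a tile for either layout:
//   old: <rasdata>/<tile_index>
//   new: <rasdata>/TILES/<tile_index / kTilesPerDirectory>/<tile_index>
inline fs::path resolveTilePath(const fs::path& rasdata,
                                std::uint64_t tileIndex,
                                bool newLayout) {
    const std::string name = std::to_string(tileIndex);
    if (!newLayout) {
        return rasdata / name;
    }
    return rasdata / "TILES" / std::to_string(tileIndex / kTilesPerDirectory) / name;
}
```

The idea is to call `usesTilesSubdirs` once at startup and pass the result to every path lookup, so that the rest of the code does not need to know which layout is active.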