Opened 9 years ago

Last modified 9 years ago

#1073 closed defect

Filestorage - divide tile files into subdirectories — at Version 2

Reported by: Dimitar Misev Owned by: Dimitar Misev
Priority: critical Milestone: 9.1.x
Component: relblobif Version: development
Keywords: Cc: Peter Baumann, Vlad Merticariu, Alex Dumitru
Complexity: Medium

Description (last modified by Dimitar Misev)

The flat directory organization of the tile files in $RASDATA does not scale: with enough tiles we run into filesystem limits. Tiles should therefore be distributed into subdirectories.

Currently all data is stored in a single directory $RASDATA, i.e. we have

$RASDATA
 |_ RASBASE
 |_ 1
 |_ 2
 |_ 3
 |_ ..


Proposed restructuring

$RASDATA
 |_ RASBASE
 |_ TILES
      |_ ..

How should TILES be structured? Maximum number of subdirectories across the most common filesystems:

  • ext3 : 32,000
  • ext4 : unlimited in theory, but may be set to 64,000 by default
  • xfs : tested to millions and performance is not impacted
  • btrfs: similar to xfs
  • ntfs : 2^32-1 theoretically (same limit as number of files in a directory)

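Between 10,000 and 100,000 files per directory seems like a reasonable range that is well supported across filesystems. With 100,000 files per directory, even the ext3 limit of 32,000 subdirectories allows 32,000 × 100,000 = 3.2 billion tiles, which is the lowest capacity among the filesystems listed above.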
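The capacity estimate above (roughly 3 billion tiles on ext3) can be checked with a few lines of arithmetic. This is only an illustrative sketch: the names MAX_SUBDIRS, TILES_PER_DIR and max_tiles are made up for this example, and the subdirectory limits are simply the figures quoted in the list above.

```python
# Subdirectory limits as quoted in this ticket (the ext4 value is the
# possible default mentioned above, not a hard limit).
MAX_SUBDIRS = {"ext3": 32_000, "ext4": 64_000}

# Proposed number of tile files per subdirectory.
TILES_PER_DIR = 100_000


def max_tiles(max_subdirs: int, tiles_per_dir: int = TILES_PER_DIR) -> int:
    """Upper bound on tile files with a single level of subdirectories."""
    return max_subdirs * tiles_per_dir


# Capacity per filesystem, e.g. capacity["ext3"] is the ext3 bound.
capacity = {fs: max_tiles(limit) for fs, limit in MAX_SUBDIRS.items()}
```

Lowering TILES_PER_DIR to 10,000 reduces the ext3 bound by a factor of ten, which is one argument for staying near the upper end of the range.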

Based on this, my proposal is to distribute tiles 100,000 per directory, so that we have this organization:

$RASDATA
 |_ RASBASE
 |_ TILES
      |_ 0
      |  |_ 1
      |  |_ 2
      |  |_ 3
      |  |_ ...
      |   
      |_ 1
      |  |_ 100,000
      |  |_ 100,001
      |  |_ 100,002
      |  |_ ...
      |  
      |_ ...

The subdirectory index in TILES is dir_index = tile_index / 100,000 (integer division). The 100,000 can be a compile-time constant that is adjusted as necessary. As a default, 2^16 or 2^17 may be a better choice, so that dir_index can be computed with a fast bit shift.
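A minimal sketch of the index computation, in Python for brevity (the real implementation would live in the C++ relblobif code). The names TILES_PER_DIR_SHIFT, dir_index_div, dir_index_shift and tile_path are hypothetical, and the shift value of 16 is just one of the two defaults suggested above.

```python
from pathlib import PurePosixPath

# Assumed default: 2^16 = 65,536 tiles per directory.
TILES_PER_DIR_SHIFT = 16
TILES_PER_DIR = 1 << TILES_PER_DIR_SHIFT


def dir_index_div(tile_index: int, tiles_per_dir: int = 100_000) -> int:
    """Subdirectory index by integer division (works for any constant)."""
    return tile_index // tiles_per_dir


def dir_index_shift(tile_index: int) -> int:
    """Subdirectory index by bit shift (requires a power-of-two constant)."""
    return tile_index >> TILES_PER_DIR_SHIFT


def tile_path(rasdata: str, tile_index: int) -> PurePosixPath:
    """Path of a tile file under $RASDATA/TILES/<dir_index>/<tile_index>."""
    return PurePosixPath(rasdata) / "TILES" / str(dir_index_shift(tile_index)) / str(tile_index)
```

With the decimal constant, tile 99,999 lands in directory 0 and tile 100,000 in directory 1, matching the tree above. For non-negative indexes the shift variant gives the same result as integer division by 65,536.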

I would like to stay away from complicated tree-like schemes that nest multiple levels of subdirectories. It is the job of the filesystem to handle this load. If we ever reach the limits of this scheme on a particular filesystem, it seems very unlikely that we could work around them ourselves without actually adapting the filesystem underneath.

Rasdaman could support both structures (old and new) with a simple check at startup; in v10.0 we can enforce the new structure. update_db.sh can be executed to migrate to the new directory structure.
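The startup check and the migration could look roughly like the sketch below. This is not the actual update_db.sh logic; it assumes that tile files sit directly in $RASDATA and are named by their integer tile index, as in the trees above. The functions uses_new_layout and migrate are hypothetical names.

```python
import os
import re
import shutil

TILES_DIR = "TILES"
TILES_PER_DIR = 100_000  # assumed constant, see the proposal above


def uses_new_layout(rasdata: str) -> bool:
    """Startup check: the new layout is assumed whenever TILES exists."""
    return os.path.isdir(os.path.join(rasdata, TILES_DIR))


def migrate(rasdata: str) -> int:
    """Move numerically named tile files into TILES/<dir_index>/.

    Returns the number of files moved.
    """
    moved = 0
    for name in os.listdir(rasdata):
        src = os.path.join(rasdata, name)
        # Skip RASBASE, TILES and anything else that is not a tile file.
        if not (re.fullmatch(r"[0-9]+", name) and os.path.isfile(src)):
            continue
        dst_dir = os.path.join(rasdata, TILES_DIR, str(int(name) // TILES_PER_DIR))
        os.makedirs(dst_dir, exist_ok=True)
        shutil.move(src, os.path.join(dst_dir, name))
        moved += 1
    return moved
```

Running migrate a second time moves nothing, since only numerically named files directly in $RASDATA are touched. A real migration would also need to handle a running server and partially migrated directories, which this sketch ignores.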

Change History (2)

comment:1 by Dimitar Misev, 9 years ago

Description: modified (diff)

comment:2 by Dimitar Misev, 9 years ago

Description: modified (diff)