[[PageOutline]] = Guidelines = == OS optimizations == * ''filesystem'' - it's recommended to use the '''xfs''' filesystem, over ''ext4'' or similar. Mount with '''noatime''', e.g. in `/etc/fstab`: {{{ # / was on /dev/sda1 during installation UUID=212a6de3-515b-4e12-aa17-3e466f6966aa / ext4 noatime,errors=remount-ro 0 1 }}} * ''scheduler'' - '''deadline''' for spinning disk, '''noop''' for ssd or high caching raid, e.g. {{{ echo deadline > /sys/block/DEVICE/queue/scheduler }}} * ''read ahead'' - enable with below command, which can be made permanent by putting in `/etc/rc.local` for example: {{{ /sbin/blockdev --setra 4096 /dev/DEVICE }}} * ''write cache'' - put the below in `/etc/sysctl.conf` and load with `sysctl -p`: {{{ vm.dirty_background_ratio = 5 vm.dirty_background_ratio = 10 }}} * ''shared memory'' - run attachment:calc_kernel_shm.sh and substituted the outputs into `/etc/sysctl.conf`: {{{ kernel.shmmax=VALUE kernel.shmall=VALUE }}} == PostgreSQL == The default PostgreSQL configuration can be tuned for better performance with rasdaman. The parameters below should be adapted in the `postgresql.conf`, typically found under `/etc/postgresql`. * ''max_connections'' - as long as a single user is using the database, this can be decreased to about 40 * ''shared_buffers'' - should be 25%-33% of your total RAM * ''work_mem'' - `(total RAM / connections) / 4..16`, but not lower then 128 MB. When you have 4 GB RAM and you are alone then 256 MB is fine. * ''maintenance_work_mem'' - total RAM / 16 * ''synchronous_commit'' - set to ''off'' when doing bulk ingestion * ''wal_buffer'' - should be 16 MB; by default 1/32 from the shared buffer value is taken as long as it is lower then 16 MB otherwise it is 16 MB * ''checkpoint_segments'' - 256 is a good start value when doing bulk ingestion * ''checkpoint_completion_target'' - 0.0 (especially important for bulk ingestion) * ''random_page_cost'' - 2.0 for spinning disk and 1.01 for ssd, high caching raid and when DB fits into RAM * ''effective_cache_write'' - total RAM - shared_buffers value * ''logging_collector'' - set to on if it isn't set already * ''log_line_prefix'' - set to '%t ' to get timestamps in the logs = Benchmarks = == Ingestion == Below are the results of ingesting one dataset using various tilings and postgres parameters. === Regular tiling === The dataset is 150 png rgb images of size 1024x608, so that each image is 1867776 bytes uncompressed, or 280 MB in total. * Default postgres parameters: || '''Tiling scheme''' || '''Tile size''' || '''Import time''' || '''DB size before''' || '''DB size after''' || '''BLOBs size''' || '''BLOBs size2''' || || 0:31,0:31,0:149 || 460800 || 84 minutes || 184M || '''19416M''' || 267 MB || 18 GB || * Changed parameters in postgresql.conf: shared buffers = 1024 MB, temp_buffers = 8MB, fsync = off, synchronous_commit = off, wal_sync_method = fsync, full_page_writes = off, wal_buffers = 1MB, wal_writer_delay = 2000ms, checkpoint_segments = 32 || '''Tiling scheme''' || '''Tile size''' || '''Import time''' || '''DB size before''' || '''DB size after''' || '''BLOBs size''' || '''BLOBs size2''' || || 0:31,0:31,0:149 || 460800 || 35 minutes || 184M || '''19416M''' || 267 MB || 18 GB || * Using [#Tilecache tile cache] || '''Tiling scheme''' || '''Tile size''' || '''Import time''' || '''DB size before''' || '''DB size after''' || '''BLOBs size''' || '''BLOBs size2''' || || 0:31,0:31,0:149 || 460800 || '''48 seconds''' || 37M || 329M || 267 MB || 267 MB || === Regular tiling + VACUUM === The dataset is 50 png rgb images of size 1024x608, so that each image is 1867776 bytes uncompressed, or 89 MB in total. * Changed parameters in postgresql.conf: shared buffers = 1024 MB, temp_buffers = 8MB, fsync = off, synchronous_commit = off, wal_sync_method = fsync, full_page_writes = off, ''wal_buffers = 8MB, wal_writer_delay = 10000ms, checkpoint_segments = 16'' || '''Tiling scheme''' || '''Tile size''' || '''Import time''' || '''DB size before''' || '''DB size after''' || '''BLOBs size''' || '''BLOBs size2''' || '''BLOBs after [#VACUUMtables VACUUM]''' || 0:31,0:31,0:149 || 460800 || 7m 34s || 47M || '''2464M''' || 267 MB || 1776 MB || 96MB || = PostgreSQL tips = == Total size of BLOBs == BLOBs are stored in the `pg_largeobject` in postgres. Each BLOB is divided into rows (pages) of 2048 bytes typically. More info [http://www.postgresql.org/docs/8.4/static/catalog-pg-largeobject.html here] Query below computes the space that all BLOBs take. {{{ SELECT pg_size_pretty(count(loid) * 2048) FROM pg_largeobject; }}} == Total size of BLOBs 2 == {{{ SELECT tablename, pg_size_pretty(size) AS size_pretty, pg_size_pretty(total_size) AS total_size_pretty FROM (SELECT *, pg_relation_size(schemaname||'.'||tablename) AS size, pg_total_relation_size(schemaname||'.'||tablename) AS total_size FROM pg_tables) AS TABLES WHERE TABLES.tablename = 'pg_largeobject' ORDER BY total_size DESC; }}} == Size of individual BLOBs == {{{ SELECT loid, pg_size_pretty(count(*) * 2048) FROM pg_catalog.pg_largeobject GROUP BY loid ORDER BY count(*) DESC; }}} == Size of RASBASE tables == {{{ SELECT tablename, pg_size_pretty(size) AS size_pretty, pg_size_pretty(total_size) AS total_size_pretty FROM (SELECT *, pg_relation_size(schemaname||'.'||tablename) AS size, pg_total_relation_size(schemaname||'.'||tablename) AS total_size FROM pg_tables) AS TABLES WHERE TABLES.schemaname = 'public' ORDER BY total_size DESC; }}} == Reset WAL == [http://www.postgresql.org/docs/9.1/static/wal-intro.html Write-ahead logging] is used to ensure data integrity. In postgres these logs are stored in pg_xlog and it's [http://www.postgresql.org/docs/9.1/static/wal-internals.html recommended] to put them in a separate disk. To reset the logs: 1. `service stop postgresql` 1. `sudo -u postgres pg_resetxlog $PGDATA` == Remove orphaned BLOBs == Orphaned BLOBs are BLOBs that are stored but they are not referenced by oid from any table in the database. To clear them up use [http://www.postgresql.org/docs/8.4/static/vacuumlo.html vacuumlo] * `vacuumlo RASBASE` Note, vacuumlo is an additional module made available through postgresql-contrib == VACUUM tables == Tuples that have been deleted or obsoleted are normally not physically deleted, and thus the space continues to be still used, until a `VACUUM` is performed on the database or specific tables. `VACUUM` alone doesn't free up the disk space for further use by the OS, `VACUUM FULL` is necessary for this. In rasdaman it's most important to do periodic VACUUM on the BLOBs table: * psql -d RASBASE -c "VACUUM FULL pg_largeobject" or * vacuumdb -d RASBASE -t "pg_largeobject" -f = Tile cache = A tile cache can be enabled in rasdaman during bulk ingestion, in order to work around inefficiencies of BLOB handling in postgres and slow performance for certain tiling schemes. == How to use == The cache can be enabled in rasmgr.conf by specifying a `--cachelimit` parameter after the `-xp` parameter: {{{ --cachelimit (default: 0) specifies upper limit in bytes on using memory for caching }}} An example `rasmgr.conf` configuration that allows the cache to use up to 1GB memory: {{{ define dbh rasdaman_host -connect RASBASE define db RASBASE -dbh rasdaman_host define srv N1 -host HOST -type n -port 9000 -dbh rasdaman_host change srv N1 -countdown 20000000 -autorestart on -xp --timeout 30000000 --cachelimit 1073741824 }}} When enabled, `INSERT` and `UPDATE` rasql statements can be issued in the normal way. At the end of the ingestion a `COMMIT` command should be executed, to make sure that the cache is cleared and everything is flushed to disk. E.g. {{{ rasql -q 'insert into coll ...' rasql -q 'update coll ...' rasql -q 'update coll ...' rasql -q 'update coll ...' # done with ingestion rasql -q 'commit' }}} == Important limitation == At the moment the tile cache should be used during bulk ingestion '''_ONLY_''', and disabled afterwards. Leaving the tile cache on will cause selection queries to fail randomly. Furthermore, it is best to have only one server defined in rasmgr.conf, and `-countdown` and `--timeout` parameters set to very large values.